The first test tests that the notmuch-show-refresh-view function
produces the exact same output for an unmodified show buffer. This
test should pass since the relevant functionality has already been
applied.
The second test tests show refresh for a show buffer that has been
modified by navigation and message visibility toggling. Ideally
refresh-view should preserve this state of the notmuch-show buffer.
Unfortunately it currently does not, so this test is know to be broken
and is marked as such.
There's no reason to output "Non-text part:" lines for parts that are
not leaf nodes, eg. multipart/* and message/rfc822. We fix the text
here to test for their absence. The next patch will fix reply
accordingly.
The setup is useless if gdb is not present, so it doesn't hurt to skip
it. The diff here is huge, but the commit is really just moving most
of the script inside the initial if, and adding an else block to print
a warning.
This addresses atomicity of tag synchronization, the last atomicity
problems in notmuch new. Each message add or remove is wrapped in its
own atomic section, so interrupting notmuch new doesn't lose progress.
This tests notmuch new's ability to recover from arbitrary stopping
failures. It interrupts notmuch new after every database commit and,
on every resulting database snapshot, re-runs notmuch new to
completion and checks that the final database state is invariant.
This means that test_subtest_known_broken needs to be called before
every known broken subtest, which is no different than what is
documented for the test_begin_subtest case.
The assumption is that every test ends up calling either skipping,
calling test_ok_ or test_failure_ and and the latter in turn delegate
to the known_broken versions in the case where
test_subtest_known_broken_ is set.
Human-friendly scenario:
* open a thread where a message which ends with an HTML part is
followed by another message
* make the first message visible
* goto the beginning of the second message (first line, first colon)
* hit "RET"
Result: nothing happens except for "No URL at point" message
Expected result: the second message is shown/hidden
The root cause is that the HTML part has `keymap' text property set.
In particular, "RET" is bound to visit a URL at point. The problem is
that `keymap' property affects the next character following the region
it is set to (see elisp manual [1]). Hence, the first character of
the next message has a keymap inherited from the HTML part.
[1] http://www.gnu.org/software/emacs/elisp/html_node/Special-Properties.html
There is existing support for broken tests. But it is not convenient
to use. The primary issue is that we have to maintain a set of
test_expect_*_failure functions which are equivalent to the normal
test_expect_* counterparts except for what functions are called for
result reporting. The patch adds test_subtest_known_broken function
which marks a subset as broken, making the normal test_expect_*
functions behave as test_expect_*_failure. All test_expect_*_failure
functions are removed. Test_known_broken_failure_ is changed to
format details the same way as test_failure_ does.
Another benefit of this change is that the diff when a broken test is
fixed would be small and nice.
Documentation is updated accordingly.
Update test_emacs documentation in test/README according to the latest
changes in emacs tests. Move the note regarding setting variables
from test/emacs to test/README.
The main goal of this overhaul is to define how message/rfc822 parts
should be handled. message/rfc822 parts should be output in a similar
fashion to the outer message, including some subset of the rfc822
headers. The following decisions about formatting of message/rfc822
parts were made:
The format and content of message/rfc822 parts shall be as similar as
possible to that of full messages. In particular, for formatted
outputs, the "content" of rfc822 part output should include "headers"
and "body" fields).
The "body" field shall include the body of the message.
The "headers" field shall include the following headers, since these
are the ones available from the GMimeMessage:
"From"
"To"
"Cc"
"Subject"
"Date"
However, for the case of --format=raw the raw rfc822 should be output,
including all headers.
A subset of relevant headers shall be output in reply.
The test embedded rfc822 message is also modified to be itself
multipart, so we can more fully test how all sub parts of the message
part are output.
Note added by Committer:
Currently, expect one test (--format=raw --part=3, rfc822 part) to fail.
The test message date, "Tue, 05 Jan 2001 15:43:57 -0000", is not
actually a real date. 05 Jan 2001 was in fact a Friday, not a
Tuesday. Date parsers (such as "date" in coreutils) will return "Fri"
as the day for this string, even if "Tue" is specified.
Also, the time zone "-0000" is actually always returned as "+0000", so
we change that here was well.
This will be relevant for later patches when we begin parsing rfc822
part headers, where gmime returns a parsed date string.
If we do want to test date parsing, we should do that in a separate
test.
There were two "--format=text --part=0" tests. One of them was
supposed to be a test for "--format=text --part=1".
There were also two errant "test_expect_equal_file OUTPUT EXPECTED"
lines, that are removed here.
The feature to show subject changes in the collapsed thread view was
originally added (8ab433607) with an option
(notmuch-show-always-show-subject) to display the subject
for all messages, even when there was no change.
The subsequent commit (4f04d273) changed the sense of the test (or to
and) and the name of the controlling variable
(notmuch-show-elide-same-subject).
But this commit is broken in a few ways:
1. The original definition of notmuch-show-always-show-subject was
left around
But the variable isn't actually used in the code at all, so it
just adds clutter and confusion to the customization interface.
2. The name and description of the controlling variable doesn't
match the implementation
The name suggests that setting the variable to t will cause
repeated subjects to be elided, (suggesting that when it is nil
all subjects will be shown).
However, when the variable is nil, no subjects are shown. So a
correct name for the variable in this sense would be
notmuch-show-subject-changes.
Showing subject changes is a useful feature, and should be on by
default. (We don't want to bury generally useful features behind
customizations that users have to find).
Rather than fixing the name of the variable and changing its default
value, here we remove the condition entirely, such that the feature is
enabled unconditionally.
So both the currently-used variable and the stale definition of the
formerly-used are removed.
Also, the one relevant test-suite result is updated, (showing the
intial subject of a collapsed thread, and no subject display for later
messages that do not change the subject).
eb4cf465 introduces changes which weren't part of the submitted
patch (id:"87liwlip2j.fsf@gmail.com"), presumably made during
resolving merge conflicts.
The first one causes the title of a test to be printed a second time,
in place of the correct title:
PASS Message with .. in Message-Id:
PASS Message with .. in Message-Id:
instead of:
PASS Message with .. in Message-Id:
PASS Sending a message via (fake) SMTP
The second one is simply the insertion of a line break, so no harm there.
This commit reverts both changes, as they were clearly accidental.
Signed-off-by: Pieter Praet <pieter@praet.org>
The sleep was to force the directory's mtime to advance between the
previous notmuch new and the subsequent rm;notmuch new.
The current convention is to use the existing increment_mtime function
for this purpose, (which avoids the test suite being slowed down by
calls to sleep).
Thanks to Austin Clements for noticing this undesired sleep.
Test for bug. Current stemming support for notmuch adds extra terms
to the DB which aren't removed when the file renames are detected.
When folder tags are added to a message, Xapian terms for both XFOLDER
and ZXFOLDER are generated. When one of the filenames are
renamed/removed, only the XFOLDER tags are removed, leaving it possible
for a match on a folder: tag that was previously but is no longer a
match in the maildir.
Before the change, every Emacs test ran in a separate Emacs
instance. Starting Emacs many times wastes considerable time and
it gets worse as the test suite grows. The patch solves this by
using a single Emacs server and emacsclient(1) to run multiple
tests. Emacs server is started on the first test_emacs call and
stopped when test_done is called. We take care not to leave
orphan Emacs processes behind when test is terminated by whatever
reason: Emacs server runs a watchdog that periodically checks
that the test is still running.
Some tests need to provide user input. Before the change, this
was done using echo(1) to Emacs stdin. This no longer works and
instead `standard-input' variable is set accordingly to make
`read' return the appropriate string.
Without this, mail messages delivered by emacs_deliver_message might
not be seen by the next invocation of "notmuch new", (which can lead
to test-suite failures if emacs_deliver_message is fast enough).
Change add_email_corpus, emacs_deliver_message and tests to use
$TEST_DIRECTORY instead of '..'.
This improves the behavior of the usage of --root=<dir>, as the
assumption of what '..' means will usually be incorrect.
Document -root option in README and update valgrind to work with
-root.
Instead of generating auxiliary run_emacs script every time
test_emacs is run, do it once in the beginning of the test.
Also, use absolute paths in the script to make it more robust.
Using `setq' for setting variables in Emacs tests affect other
tests that may run in the same Emacs environment. Currently it
works because each test is run in a separate Emacs instance. But
in the future multiple tests will run in a single Emacs instance.
The patch changes all variables to use `let', so the scope of the
change is limited to a single test.
Few Emacs tests used sed(1) to remove unexpected output in the
beginning to avoid getting confused by messages such as "Parsing
/home/cworth/.mailrc... done". This is no longer needed since
tests are run in a temporary home directory instead of the user's
one. So remove these sed(1) calls.
Before the change, the common Emacs test scheme was to print
buffer content to stdout and redirect it to a file or capture it
in a shell variable. This does not work if we switch to using
emacsclient(1) for running the tests, because you can not print
to the stdout in this case. (Actually, you can print to stdout
from Emacs server, but you can not capture the output on
emacsclient(1)).
The patch introduces new Emacs test auxiliary functions:
`test-output' and `test-visible-output'. These functions are
used to save buffer content to a file directly from Emacs. For
most tests the changes are trivial, because Emacs stdout output
was redirected to a file anyway. But some tests captured the
output in a shell variable and compare it with the expected
output using test_expect_equal. These tests are changed to use
files and test_expect_equal_file instead.
Note: even if we do not switch Emacs tests to emacsclient(1), the
patch makes tests cleaner and is an improvement.
Most test_emacs calls have long arguments that consist of many
expressions. Putting them on a single line makes it hard to read
and produces poor diff when they are changed. The patch puts
every expression in test_emacs calls on a separate line.
Few Emacs tests had test_expect_equal_file arguments in the wrong
order: the first argument should be the test output and the
second one should be the expected.
Update the test mail corpus to have two files with the same content to
expose the bug where a single message with multiple filenames only
reports a single filename.
Update expected results for search --output=files to match new
behavior for multiple files corresponding to a single message
Signed-off-by: Mark Anderson <ma.skies@gmail.com>
Various typo fixes in some test-case data.
Signed-off-by: Pieter Praet <pieter@praet.org>
Edited-by: Carl Worth <cworth@cworth.org> Restricted to just
test-case data.
Various typo fixes in documentation within the code that can be made
available to the user, (emacs function help strings, "notmuch help"
output, notmuch man page, etc.).
Signed-off-by: Pieter Praet <pieter@praet.org>
Edited-by: Carl Worth <cworth@cworth.org> Restricted to just
documentation and fixed fix of "comman" to "common" rather than
"command".
Various typo fixes in comments within the source code.
Signed-off-by: Pieter Praet <pieter@praet.org>
Edited-by: Carl Worth <cworth@cworth.org> Restricted to just
source-code comments, (and fixed fix of "descriptios" to "descriptors"
rather than "descriptions").
Various typo fixes in comments within the Makefile and other build scripts.
Signed-off-by: Pieter Praet <pieter@praet.org>
Edited-by: Carl Worth <cworth@cworth.org> Restricted to just build files.
Various typo fixes in auxiliary text files included with the source,
(README, TODO, etc.).
Signed-off-by: Pieter Praet <pieter@praet.org>
Edited-by: Carl Worth <cworth@cworth.org> Restricted to just text files.
We exercise each of the documented values (nil, a string, and a
list). For the list, we test matching a specific entry, matching a
catch-all regular expression, and no match at all (in which case there
is no FCC set).
The worry here is that a binary linking with libnotmuch might lose
access to Xapian::Error symbols because libnotmuch hides them.
We are careful here to create ./fakedb/.notmuch in order to trigger a
Xapian exception, and not just a missing file check.
Thanks to jrollins and amddragon for suggestions.
(cherry picked from commit 66f37f5f6864a988f94ddb893e3a176af57f6c8e)
The main() function should be written as just another function with a
return value. This allows for more reliable code reuse. Imagine that
main() grows too large and needs to be factored into multiple
functions. At that point, exit() is probably the wrong thing, yet can
also be hard to notice as it's in less-frequently-tested exceptional
cases.
Each top level test (basic, corpus, etc...) is run with a fixed
timeout of 2 minutes.
The goal here is to treat a hung test as a failure. The emacs test for
sending mail is known to be problematic on the Debian
autobuilders. This is both a bandaid fix for that, and a sensible long
term feature.
(cherry picked from commit 5f99c80e02736c90495558d9b88008a768876b29)
The intent was always to make these Received headers span multiple
lines. But the escapes were causing the shell to ignore the newlines,
so that the result instead was long Received headers on a single line
each.
Fixing the intent here doesn't actually change the test-suite results
at all.
This is much more realistic, as most messages in the wild will have multiple
Received headers. Also, this demonstrates a current bug in the Received
header parsing, (multiple Received headers are not properly concatenated
depending on the order in which headers are parsed in a message).
This tests the recently-added detection/hiding of top-posted quotations and
in the testing, it takes advantage of the less-recently-added
visible-buffer-string function for emitting only the visible text
from a buffer.
Again, this is a much cleaner and more thorough test, and in fact
exposes a bug in the format=text output, that will be fixed the next
commit. Because of this, some of the multipart tests currently fail.
This is a much cleaner way to do the emacs tests, since we're actually
comparing output against existing files with expected output. We also
won't miss any trailing newlines this way.
And speaking of which, one of the expected output files was actually
missing a trailing blank line that was actually in one of the original
messages, so this was fixed.
This feature was recently added, so it of course needs a test now.
Signed-off-by: Jameson Graef Rollins <jrollins@finestructure.net>
Edited-by: Carl Worth <cworth@cworth.org> Fixed test to use
notmuch_search_sanitize in order to be robust against unpredictable
thread ID numbers, (due to unpredictable order in which the filesystem
presents files).
In the master branch in test/emacs two tests access the build users home
directory, so does emacs_deliver_message in the crypto branch.
The tests should not touch the build user's home directory. The patch
creates a directory in the temporary test directory and sets home
accordingly.
In case of a non-existent home directory, the tests are failing without
this patch.
Signed-off-by: Jameson Graef Rollins <jrollins@finestructure.net>
This test doesn't have anything to do with json, and has everything to
do with testing search capability, so I'm not sure why it was in the
wrong place.
The "Search for non-existent message prints nothing" test fits better
with the existing tests in search-output, so move it there. Also add a
similar test for the --format=json case.
These tests also use the new test_expect_equal_file function, (to ensure
that the presence of a trailing newline is correctly tested).
These test now properly test for the presence of a newline at the end
of all output. Right now some of these test will fail because the
search output is currently broken to *not* produce proper newlines in
some cases.
Since commit 2f8871df6e notmuch has been
using a function (show_part_content) originally written only for text
parts to save all MIME parts. The problem with this is that this
function converts CRLF pairs to LF only and optionally converts to
UTF-8 encoding. These two conversions have the potential to corrupt
binary data when passed through the function.
This test demonstrates that corruption, and so fails currently, until
we fix the bug.
Not that it affects the correctness of the test, but it's nice to use
proper spelling. This kind of change could invalidate a signature on the
test message, but I think that would have happened previously when the
HTML part was added in the first place.
Use .gz filenames for saved attachments in the tests to check
that Emacs does not re-compress the file.
Use test_expect_equal_file instead of test_expect_equal to avoid
binary output on the console.
Before the change, test_expect_equal_file moved files it compared
in case of failure. The patch changes it to copy the files
instead. This allows testing non-temporary files which are
stored in git.
Note: the change should not result in new temporary files left
after the tests. Test_expect_equal_file used to move files only
on failure, so callers had to cleanup them anyway.
The primary goal here is to keep the decrypted output as similarly
structured as undecrypted output as possible. Now, when decrypting
parts, only the original encrypted part is replaced by the it's
decrypted content. If this part isn't itself a multipart, then all
part numbering should remain consistent during decryption.
The only draw back here is that the useless application/pgp-encrypted
sub-part of the multipart/encrypted part is also emitted. But this
part can be easily ignored by clients.
Some folks have complained about the part renumbering that occurs when
the entire multipart/signed part is replaced with the part contents
after verification. This is primarily because it incurs an additional
computational cost to retrieve individual parts, since verification
has to be performed again to ensure that part numbering is consistent.
This patch simply leaves the full multipart/signed part as is.
The emacs crypto test is also updated to reflect this change.
This patch adds the tag "signed" to messages with any multipart/signed
parts, and the tag "encrypted" to messages with any
multipart/encrypted parts. This only occurs when messages are indexed
during notmuch new, so a database rebuild is required to have old
messages tagged.
This adds a new "crypto" test script to the test suite to test
PGP/MIME signature verification and message decryption. Included here
is a test GNUPGHOME with a test secret key (passwordless), and test
for:
* signing/verification
* signing/verification with full owner trust
* verification with signer key unavailable
* encryption/decryption
* decryption failure with missing key
* encryption/decryption + signing/verfifying
* reply to encrypted message
* verification of signature from revoked key
These tests are not expected to pass now, but will as crypto
functionality is included.
We need to be able to test for the presence of a newline at the end of
output. There's no good way to capture trailing newlines in bash, so
redirecting output to a file is the next best thing. This new
function should be used when testing for output that is expected to
have trailing newlines.
The next commit will demonstrate the use of this.
The patch replaces all (message (buffer-string)) calls in emacs
tests with (princ (buffer-string)). This avoids accidentally
interpreting '%' as format specifiers and makes code simpler
because we do not need to capture stderr.
Also, the patch works around an Emacs (23.3+1-1 on current Debian
Unstable) segfault in "Ensure that emacs doesn't drop results"
test. Note: the segfault does not happen on every test run.
Though, it seems to be consistently reproducible if the test uses
300 messages instead of 30. Hopefully, it is the crash described
in Emacs bug #8545 [1] which is already fixed.
[1] http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8545
Change #!/bin/bash at start of tests to "#!/usr/bin/env bash". That way
systems running on bash < 4 can prepend bash >= 4 to path before
running the tests.
The patch adds test-lib.el file for Emacs tests auxiliary stuff.
Currently, it implements two functions: `visible-buffer-string'
and `visible-buffer-substring'. These are similar to standard
counterparts without "visible-" prefix but exclude invisible
text. The functions are not used anywhere at the moment but
should be useful for testing hiding/showing in the Emacs
interface.
Edited-by: Carl Worth <cworth@cworth.org> Fixed "basic" test to ignore
new test-lib.el file.
The example multipart message is made a bit more complicated by adding
a message/rfc822 message, and the all parts are output and tested in
all output formats.
Remove double quotes and flatten "foo@bar.com <foo@bar.com>" to
"foo@bar.com".
Edited-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net> (clean up
expected output for emacs tests).
Signed-off-by: Jameson Rollins <jrollins@finestructure.net>
The compilation of the smtp-dummy program would fail if a build was
attempted on a system without getline. Fix this by simply including
the existing notmuch_compat_srcs variable when constructing the list
of source files for compiling smtp-dummy.
Previously, notmuch show flattened all output, losing information
about the nesting of the MIME hierarchy. Now, the output is properly
nested, (both in the --format=text and --format=json output), so that
clients can analyze the original MIME structure.
Internally, this required splitting the final closing delimiter out of
the various show_part functions and putting it into a new
show_part_end function instead. Also, the show_part function now
accepts a new "first" argument that is set not only for the first MIME
part of a message, but also for each first MIME part within a series
of multipart parts. This "first" argument controls the omission of a
preceding comma when printing a part (for json).
Many thanks to David Edmondson <dme@dme.org> for originally
identifying the lack of nesting in the json output and submitting an
early implementation of this feature. Thanks as well to Jameson Graef
Rollins <jrollins@finestructure.net> for carefully shepherding David's
patches through a remarkably long review process, patiently explaining
them, and providing a cleaned up series that led to this final
implementation. Jameson also provided the new emacs code here.
Previously, the outer multipart part of any multipart/mixed,
multipart/signed, etc. MIME message was silently omitted from the
"notmuch show" output. This prevented any client from correctly
determining to which parts a signature applies, for example.
Now, we actually emit these parts as their own parts. The output is
still flattened---the contained parts are not yet included "within"
the multipart part---so it's still not possible to determine to which
parts a signature applies, but this is one step along the path.
The test suite is updated to reflect this change, (though we'll
eventually want to fix the emacs interface to not display buttons for
the multipart enclosure parts as there's nothing useful for the user
to actually do with them).
This tests "notmuch show" with both --format=text and --format=json on
a message with some non-trivial MIME multipart nesting, (multiple parts
within a multipart/mixed part which is within a multipart/signed part).
The test captures the current behavior (where only the leaf nodes of
the MIME structure are emitted as a flat list---the multipart parts
are effectively ignored). We plan to soon change the json output at
least to emit an actual hierarchy matching the MIME structure, (at
which point we will update this test).
Theses were expected failures only due to a bug in GMime (with
versions of GMime before 2.4.18). As of GMime version 2.4.18 this bug
is fixed and these tests now pass.
With the previous commit, unexpected output before or between search results
would be displayed. However, trailing junk from the "notmuch search" output
would still be silently swallowed.
The most common case for an error message from "notmuch search" would be
an invalid command-line, and in that case, there would be no search results
and the trailing error message would get swallowed.
We fix the process sentinel to check for leftover data and add it to the
final buffer. We also add a test case to ensure this works.
Rather than silently swallowing unexpected output, the emacs interface will now
display it. This will allow error messages to actually arrive at the emacs
interface (though not in an especially pretty way). This also allows for easier
investigation of the inadvertent swallowing of search results that span page
boundaries (as demonstrated by the recent added emacs-large-search-buffer test).
The page-boundary bug has been present since a commit from 2009-11-24:
93af7b5745
Many thanks to Thomas Schwinge for tracking that bug down and
contributing the test for it.
The new name is more descriptive of the bug being tested. Also, the test
is rewritten slightly so that it's much more plain to see how the bug
manifests itself, (that messages are droped from the emacs result at
regular intervals). Primarily, this is by collapsing the large blobs
used to inflate the message subjects.
In the original json code, search matching nothing would return a
valid, empty json array (that is, "[]"). I broke this in commit
6dcb7592e3 when adding support for
--output=threads|messages|tags. This time, while fixing the bug also
add a test to the test suite to help avoid future regressions.
Now that we understand the bug here, we rename this test to
search-insufficient-from-quoting to clarify the bug being exercised,
(which occurs when the From: line contains an unquoted '.' character).
We also mark these tests as expected failures until the bug gets fixed.
Currently, there are two places in the test framework that contain very
long list on a single line. Whenever a test is added (or changed) in
several branches and these branches are merged, it results in conflict
which is hard to resolve because one has to go through the whole long
line to find where the conflict is.
This patch splits these long lists to several lines so that the
conflicts are easier to resolve.
Currently, whenever we call index_terms multiple times for a single
field, the term generator is being reset to position 0 each time. This
means that with text such as:
To: a@b.c, x@y.z
one can get a bogus match by searching for:
To: a@y.c
Thanks to Mark Anderson for reporting the bug, (and providing a nice,
minimal test case that inspired what is used here).
This is a new feature which is not implemente yet, so these tests mostly
fail currently. A subsequent commit will add the feature and cause these
tests to start passing.
These tests verify that we can search for containing folders of mail files
by word or by phrase and that the search terms are updated correctly when
directories are renamed.
This reverts commit f22a7ec1e2.
Interrupting the test suite due to an actual bug in a test script
would be just fine, but interrupting the run of the entire test suite
at the first test failure is unacceptable.
Previously, this directory was only preserved for failing tests. But
it's important to be able to easily debug known-broken tests, so
preserve the actual vs. expected output for those as well.
Use varying dates in the test messages to test the order authors are
listed in. Add tests with repeated author names and unusual date
ordering. Most of these are broken at the moment, but will be fixed
shortly.
Edited-by: Carl Worth <cworth@cworth.org>: Also update the expected
results for existing emacs tests that currently codify the incorrect
author ordering, (and similarly note them as broken in the current
test suite).
Incremental search does not match strings that span a
visible/invisible boundary. This results in failure to correctly
isearch for authors in `notmuch-search' mode if the name of the author
is split between the visible and invisible components of the authors
string. To avoid this, attempt to truncate the visible component of
the authors string on a boundary between authors, such that the
entirety of an author's name is either visible or invisible.
When a test fails, a tmp.<testname> file is left behind. These files
are useful for the person debugging the test failure, but are never
anything we want to commit.
Edited-by: Carl Worth <cworth@cworth.org>: Changed from tmp.emacs to
tmp.* and added explanation in the commit message.
This testing *does* capture the bug of missing '[' and ']' characters
int "notmuch search --output=tags" case. This is another manifestation
of the same bug causing the missing final newline (as mentioned in the
previous commit).
This code simply wasn't being exercised by the test suite before, so
this will be useful.
Meanwhile, there's currently a bug in "notmuch search --output=tags"
in that it doesn't print a final newline. But the current test suite
isn't able to catch this bug since the $() construct of the shell
doesn't preserve the distinction of whether the final newline is
present or not.
This test script does some initial test setup (generating a few
messages), which is all well and good, but we don't need to print that
as a test result---particularly since the test result was effectively
hard-coded to always pass.
When test_begin_subtest is not followed by corresponding test_expect_equal,
the output of the rest of the test script is errornously suppressed. Add
code to detect these bugs in test scripts.
Break notmuch-test whenever a test script returns non-zero status.
This happens either when some test from the script fails or when there
is an error in the script.
This is especially useful in the latter case since the error may not
appear in the final aggregated results.
The newline was removed from say_color in commit 222926ab to allow
printing test status in the beginning of the line. Error messages are
never followed by other text so we add the newline to error function.
Git-style tests (test_expect_success etc.) suppress stdout and stderr
unless -v is given. Notmuch-style tests (created by test_begin_subtest
and test_expect_equal) do not have this behavior so implement it the
same.
Additionally, for both test styles, the test-lib.sh is changed so that
the content of suppressed stdout and stderr is shown in case of failed
test.
Finally a test for this functionality is added to basic tests.
This is to prevent notmuch from destroying any information the user
has encoded as flags in the maildir filename. Tests are also added to
the test suite to verify the documented behavior.
Some people use notmuch with non-maildir files, (for example, email
messages in MH format, or else cool things like using sluk[*] to suck
down feeds into a format that notmuch can index).
To better support uses like that, don't do any renaming for files that
are not in a directory named either "new" or "cur".
[*] https://github.com/krl/sluk/
The FCC code saves a message in maildir format, and sets the S flag by
default, so now, automatically, FCC messages will not show up as
"unread", (which seems natural enough).
There's nothing in the current API documentation that would suggest
the behavior being tested here. Attempt to implement this could have
some nasty side effects, (such as notmuch_message_maildir_flags_to_tags
implicitly calling notmuch_message_tags_to_maildir_flags and maybe
even opening up some bad looping possibilities).
Much better to stick with what we have documented, which we believe will
actually be useful, (and easy enough to comprehend).
These needed to be changed to be brought up to the current state of
the maildir-sync tests. This includes style changes, but also the
elimination of any assumption about pre-existing message filenames,
(such as msg-003) which actually don't exist anymore.
Also, the known broken tests are changed to emit FAIL rather than
BROKEN simply to make them easier to fix, (so that they print the
current problems rather than hiding them).
Finally, an additional test is added to ensure that when a duplicate
file is added without flags, it doesn't invalidate flags from other
duplicates, (instead the flags are effectively merged).
Add maildir synchronization tests for multiple messages with the same
message-id. As this is not yet implemented in notmuch, some of these
teste are marked as BROKEN.
I use $(< ) operator to avoid fiddling with stripped trailing newlines
from test results which happens when output+=$(command) is used.
This change reworks these tests in several ways:
1. Bring tests into "new" test style preferring test_expect_equal over
test_expect_success in almost all cases.
2. Don't emit test results for intermediate items not actually being
tested, (things like "no new messages", "search for message",
etc.). Those things are already covered by existing tests such as
"basic" or "search" and only serve to obscure what's actually being
tested.
3. Change sense of the test showing failure to rename a file from
"new" to "cur" when "cur" doesn't exist.
In this case, notmuch should detect that this is not a maildir and
should not attempt to do any renaming of the file.
4. Extend dump/restore test to also exercise addition of tag, not just
removal.
Both items #3 and #4 above show shortcomings in the current
implementation. These are currently resulting in test results of FAIL
and indicate bugs that need to be fixed.
We have test names like maildir-sync now, so it's cleaner if the
temporary files created are named things like maildir-sync-10.out
rather than maildir-10.out. Presumably the extra stripping here came
from naming conventions in git's test suite.
This is part of an effort to avoid proliferation of excessive
top-level notmuch commands. Also, "raw" better captures the
functionality here, (as opposed to "cat" which is a fairly oblique
reference to a bad Unix abbreviation whose metaphor doesn't work here
since "notmuch cat" operates only on a single message and hence cannot
"con'cat'enate" anything).
This command outputs a raw message matched by search term to the
standard output. It allows MUAs to access the messages for piping,
attachment manipulation, etc. by running notmuch cat rather then
directly access the file. This will simplify the MUAs when they need
to operate on a remote database.
Edited-by: Carl Worth <cworth@cworth.org>: Remove trailing whitespace,
add missing "test_done" to new test script to avoid "Unexpected exit"
error.
This was too rude of a thing to do and could easily introduce
problems, (as reported by Rob Browning whose environment required some
HOME-specific things for shell startup).
Instead, implement more focused changes to ensure that particular file
in $HOME don't cause problems. Specifically, we fix known problems
with ~/.signature and ~/.mailrc here.
The original mails used to pupulate the mail corpus had had their
attachments (obnoxiously) scrubbed by the pipermail mail archiver.
Since we actually want to test the handling of attachments, this is
less than useful. Restore these files from my own collection, (with
some Received and similar headers pruned).
I still don't know everything about how I want search order to be
customizable, but I do like the current defaults, (namely, performing
a new search gives results newest first, but performing a saved search
like "tag:inbox" gives results as oldest first).
Until we come up with a better plan for people to select what *they*
want, (rather than just getting what I want), let's codify the current
results in the test suite.
After any emacs test failure, the tmp.emacs directory will have this
run_emacs script in it which the user can use to run emacs within the
test suite environment, (pointing at the test suite's notmuch
database, using the local notmuch command-line program, and the local
notmuch emacs lisp code).
My scripts expect that empty search result is actually empty. Since
commit 6dcb7592, even empty search prints a newline character and this
breaks my scripts.
This patch adds a test for this bug. In the test I cannot use
test_expect_equal function as $() operator suppresses the final
newline and this kind of difference is not detected.
test/search | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
The bash code in the test suite is using associative arrays which were
only added to bash as of release 4.0.
If the test suite is run with an older bash, we now immediately error
out and explain the situation, (instead of emitting confusing error
messages and failing dozens of tests, which is what happened before
this change).
Some recently-added tests used hard-coded thread ID values in search
specifications. This is unreliable since the thread IDs depend on the
order in which "notmuch new" encounters new files, (which in turn can
depend on inode ordering within the filesystem).
Fix these by using the new "notmuch search --output=threads" to find the
correct thread IDs given a hard-coded (but reliable) message ID.
The reply is primarily taken care of by "notmuch reply" which is already
thoroughly tested. But a recent bug is inserting a duplicate From header
in the emacs-based reply. So exercise that bug here.
Update the tests so that they no longer expect the Bcc header in the
output of "notmuch reply" now that it has been removed.
Edited-by Carl Worth: Simply applying the change to our newly
modularized test suite.
We test that the message we sent via (fake) SMTP is included in the mail
index after a "notmuch new". This verifies that the FCC setting indeed
successfully saved the sent message within the notmuch mail store.
Simply setting an explicit date is cleaner than letting the current,
(arbitrary), date get generated for the email message and then constantly
filtering that date out of search results.
Now that the FCC code is fixed to use the notmuch database path, we can
actually enable this by default, which should be highly useful for all
new users of notmuch.
Rather than *reall* sending mail here, we instead have a new test
program, smtp-dummy which implements (a small piece of) the
server-side SMTP protocol and saves a mail message to the filename
provided. This gives us reasonable test coverage of a large chunk of
the notmuch+emacs code base (down to talking to an SMTP server with
the final mail contents).
We set the HOME environment variable to the test directory to avoid
the tests relying on any configuration files from the test author's
own home directory, (such as ${HOME}/.emacs or similar).
If Xapian sees unquoted ".." as in id:123..456 then it thinks that's a
range specification. We avoid this problem by instead passing
id:"123..456" to Xapian.
We simulate the act of selecting the "inbox" saved search from
notmuch-hello and the act of selecting a desired thread from the
notmuch-search results.
The test for the navigation of notmuch-hello is currently marked as
BROKEN since its output is in the opposite order compared to the
'(notmuch-search "tag:inbox")' test. This question of ordering is a
currently open issue on the notmuch mailing list, so we'll let the
test suite reflect that for now.
Finally, this commit also abstracts some common emacs lisp code,
(waiting for the current buffer's process to complete), into a new
notmuch-test-wait function that is made available to anything calling
test_emacs.
This should be quite handy for doing automated testing of the
emacs-based functionality in notmuch. This function invokes emacs with
the necessary command-line arguments, (to run in batch mode with no
local initialization, to load the notmuch code from the source
directory, and to ensure an 80-column width).
While adding the documentation here for add_email_corpus I noticed
that the other email-adding functions in test-lib.sh were not yet
documented here, so add all of that documentation.
When the NOTMUCH variable was originally invented it was used as an
explicit path to the notmuch binary being tested. Today, the test
suite sets the PATH variable instead, so the NOTMUCH variable always
has a value of simply "notmuch".
We simplifying that by using the constant value rather than the
continual variable reference.
A bug in the results-aggregation code was causing the test suite to report
"all tests passed" even when there were failures, (as long as there were
also no "broken" tests). Fix this.
Now that we can usefully pass section names via the NOTMUCH_SKIP_TESTS
environment variable, it's useful to actually print those names out
for the user. Then, since we're now printing these names, let's use
nicer names, (not excessively long but also not using abbreviations
like "msg").
In order for --valgrind to be useful, we drop noisy additional output of
all of the commands being executed in verbose mode. This makes --verbose
alone quite useless, so we don't document it any more.
Also, add a zlib valgrind suppression that was showing up frequently in the
test suite.
This file was obviously describing the git test suite previously, and
would have been very hard to understand in the context of the notmuch
test suite. HOpefully it's easier to follow now.
By scanning test-lib.sh for occurrences of "git" or "GIT", I found
that most of those are internal things, (like the GIT_TEST_TEE_STARTED
variable). But GIT_SKIP_TESTS is part of the user-interface to the
test suite, so we rename it to reference notmuch rather than git.
Also, the GIT_TRACE warning is git-specific, so we drop that as well.
Since we are now using an explicit list of tests to run in
notmuch-test we need to be careful that we don't add a new file of
tests and then forget to add it to the list.
The numbers were meaningless, and they made it hard to find a file of interest.
Instead, we get the ordering we want by adding an explicit list of
tests to run to the notmuch-test script.
These were interfering with the aggregate statistics reported at the
end of the test-suite run. (Always reporting 1 broken, 1 fixed, and 1
skipped). The correct way to test the test-suite itself would be to
run the test suite externally for these cases, capture the expected
result, and then report that as a PASS test.
But, really, there's almost no value in these tests anyway. It's
almost to the level of testing that 'if false; exit 1; fi' returns
1. That is, there are so many ways that the test suite could be broken
internally, that these minor tests don't really help.
The original git test suite works by concatenating many commands into
a very long string (each separated by &&). This is painful to work
with since it prevents the editor from helping by parsing the shell
script, indenting, colorizing, etc.
Instead, we switch this back to something like the original notmuch
test suite, and add two new functions to test-lib.sh
(test_begin_subtest and test_expect_equal) to support these.
This also fixes the test suite to once again display the diff when a
test fails to generate the expected input.
This makes the new, git-derived test suite report results in a manner
similar to the original notmuch test suite.
Notable changes include:
* No more initial '*' on every line
* Only colorize a single word
* Don't print useless test numbers
* Use "PASS" in place of "ok"
* Begin sentences with a capital letter
* Print test descriptions for each block
* Separate each block of tests with a blank line
* Don't summarize counts between each block
This avoids "make test" emitting messages from three (3!) recursive
invocations of make. We change the invocations of the tests themselves
to occur directly from the shell script rather than having the shell
script invoke make again and using wildcards in the Makefile.
In order to have repeatable test suite, all times in messages are set
to UTC time zone to match the time zone (TZ variable) set in
test-lib.sh.
Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
The changes are:
- The notmuch-test was split into several files (t000?-*.sh).
- Removed helper functions which were moved to test-lib.sh
- Replaced every printf with test_expect_success.
- Test commands chained with && (test-lib.sh doesn't use "set -e" in
order to complete the test suite even if something fails)
- Many variables such as ${MAIL_DIR} were properly quoted as they
contain spaces.
- Changed quoting patterns in add_message and generate_message (single
quotes are already used by the test framework).
- ${TEST_DIR} replaced by ${PWD}
QUICK HOWTO:
To run the whole test suite
make
To run only a single test
./t0001-new.sh
To stop on the first error
./t0001-new.sh -i
then mail store and database can be inspected in
"trash directory.t0001-new"
To see the output of tests
./t0001-new.sh -v
To not remove trash directory at the end:
./t0001-new.sh -d
To run all tests verbosely:
make GIT_TEST_OPTS="-v"
Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
Modify the helper functions to work with git-based test suite i.e.
1) Quote arguments where it is necessary.
2) Do not use $NOTMUCH. It is equal to "notmuch" since $PATH is set to
the build tree.
3) Modify pass_if_equal to fit into the git-based test suite.
Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
This removes Git specific things from the test-lib.sh and adds helper
functions for notmuch taken from Carl's notmuch-test script. README is
also slightly modified to reflect the current state.
Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
Git uses a simple and yet powerful test framework, written in shell.
The framework is easy to use for both users and developers so I think
it would help if it is used in notmuch as well.
This is a copy of Git's test framework from commit
b6b0afdc30e066788592ca07c9a6c6936c68cc11 in git repository.
Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
Micah Anderson reported an issue where a message failed to display in
the emacs interface, (it instead gave an error, "json-read-string: Bad
string format").
Micah tracked this down to the json output from "notmuch show" being
interrupted by a GMime error message:
gmime-CRITICAL **: g_mime_stream_filter_add: assertion
`GMIME_IS_FILTER (filter)
I tracked this down further to notmuch passing a NULL value to
g_mime_stream_filter_add. And this was due to calling
g_mime_filter_charset_new with a value of "unknown-8bit".
So we add a test message withe a Conten-Type of "text/plain;
charset=unknown-8bit" from Micah's message. Then we fix "notmuch show"
to test for NULL before calling g_mime_stream_filter_add. Bug fixed.
Scott Henson reported an internal error that occurred when he tried to
add a message that referenced another message with a message ID well
over 300 characters in length. The bug here was running into a Xapian
limit for the length of metadata key names, (which is even more
restrictive than the Xapian limit for the length of terms).
We fix this by noticing long message ID values and instead using a
message ID of the form "notmuch-sha1-<sha1_sum_of_message_id>". That
is, we use SHA1 to generate a compressed, (but still unique), version
of the message ID.
We add support to the test suite to exercise this fix. The tests add a
message referencing the long message ID, then add the message with the
long message ID, then finally add another message referencing the long
ID. Each of these tests exercise different code paths where the
special handling is implemented.
A final test ensures that all three messages are stitched together
into a single thread---guaranteeing that the three code paths all act
consistently.
We're about to add a test with an excessively long message-id, (512
characters or so). This exceeds filename length limits, so just always
the simple counter to generate the filenames, (which we were doing for
messages with non-custom IDs anyway).
The commit said it fixed a problem with headers >200 characters
long. But examination of the code suggests that it was a header of
exactly 200 characters long that caused the problem. So we add a test
case for that here.
Before the fix in the previous commit, valgrind would detect many
errors when replying to the message created with this test case. After
that commit, those errors are gone.
Immediately after releasing 0.3 we learned that the magic-from-guessing
code could hang in an infinite loop in some cases. The bug occurred
only when the user had configured only a primary email addresss and no
other email addresses.
The test suite wasn't previously covering this case, so address this
shortcoming.
this test actually tests behavior that I consider as broken.
The Bcc should be to the same address as used in the From line,
otherwise we are creating a potential information leak as email
that is related to one email account (say, work) is copied to
a different account
Signed-off-by: Dirk Hohndel <hohndel@infradead.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
These tests don't actually pass yet, since the feature being tested
has not been merged. But gettting these tests in first will let us
more easily test that the feature actually works, (and will help us
ensure we don't forget the feature before the next release).
right now these are not trying to be overly fancy
simply one test per strategy that we apply to figure out the best
from address - including the fallback if there's nothing to go on
Signed-off-by: Dirk Hohndel <hohndel@infradead.org>
When the test suite is run in a different time zone that where Carl
lives, some tests may fail depending on the time when the test suite is
run. For example, just now I get:
Search for all messages ("*"):... FAIL
--- test-031.expected 2010-04-23 09:33:47.898634822 +0200
+++ test-031.output 2010-04-23 09:33:47.898634822 +0200
@@ -1,5 +1,5 @@
-thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; Test message #6 (inbox unread)
-thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; Test message #14 (inbox unread)
+thread:XXX 2001-01-06 [1/1] Notmuch Test Suite; Test message #6 (inbox unread)
+thread:XXX 2001-01-06 [1/1] Notmuch Test Suite; Test message #14 (inbox unread)
thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; body search (inbox unread)
thread:XXX 2000-01-01 [1/1] searchbyfrom; search by from (inbox unread)
thread:XXX 2000-01-01 [1/1] Notmuch Test Suite; search by to (inbox unread)
By setting a fixed time zone in the test script, these problems should
be eliminated.
Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
The test suite doesn't yet cover --format=json output nor UTF-8 in
subject or body.
This patch starts with test cases for 'search --format=json' and
'show --format=json'.
Furthermore, it has test cases for a search for a UTF-8 string in a mail
body for a UTF-8 string in a mail subject.
Finally, it has a test case for --format=json with UTF-8 messages,
demonstrating the fix in 1267697893-sup-4538@sam.mediasupervision.de.
Reviewed-by: Carl Worth <cworth@cworth.org>
Updated tests to current implementation of the test suite.
These tests demonstrate a bug in the current implementation
of "notmuch show --format=json", (timestamp output is changed
depending on current timezone).
If future updates to the test suite add more messages to the database
before this "notmuch show" test, then the message-ID numbers in the
expected output will all change. But we can at least compute the
numbers so that this test will continue to pass.
In the recent change to rename threads based on changing subject
lines, I broke message ordering within "notmuch show" output. But our
test suite didn't catch that regressions, because we didn't have any
tests of "notmuch show".
This adds one "notmuch show" test along with the thread-naming
tests. It's not a whole suite of "notmuch show" testing, but it does
catch this regression at least.
We're starting to get test output that's fairly long, so it's much
kinder to just show a diff rather than displaying the complete
expected and actual output. To allow the user to investigate things
after the fact, we save the expected and actual output to files named
test-${test_number}.expected and test-${test_number}.output .
We recently added a feature to name threads based on the messages that
actually matched the search, (as opposed to simply the oldest or
newest message in the thread whether it matched or not). So add tests
for that, and (surprise, surprise!) the feature does not entirely
work.
Hurrah---no more manual verification of that PASS column.
This means that "make test" can actually be a useful part of the
release process now, (since it will exit with non-zero status if there
are any failures).
Previously some tests (dump/restore) were doing ad-hoc verification of
values and their own printing of PASS/FAIL, etc. This made it
impossible to count test pass/fail rates in a single place.
The only reason these tests were written that way was because the old
execute_expecting function only worked if one could directly test the
stdout output of a notmuch command. The recent switch to pass_if_equal
means that all tests can use it.
This feature was added recently and should have gotten a new test at
the time.
As this test demonstrates, the code is broken, ("notmuch search '*'
returns bogus dates of the Unix epoch for any threads where the
term "and" does not appear in any messages).
Using a date in the current year makes the test suite fragile since
the search output will include a date of "January 05" for now, but
will start doing "2010-01-05" in the future.
The filenames aren't predictable (including the current directory) nor
stable from one run to the next (including the PID). This makes it
hard to predict the output from a search command that returns such a
message (such as "*").
The original goal was simply to ensure that each generated message was
distinguishable somehow. So just use the message counter instead.
The old execute_expecting function was doing far too much for its own
good. One of the worst aspects of this was that it introduced
shell-quoting challengers where the caller could not easily control
the precise invocation of the command to be executed.
I personally couldn't find a way to test "notmuch search '*'" without
the shell expanding * against files in the current directory, or
having bogus quotation marks appearing in the search string,
(defeating the recognition of "*" as a special search term).
Hopefully this aspect of the test suite will be much easier to maintain now.
The recent fix to properly decode encoded headers made the expected
output of "notmuch reply" differ by a single space, (previously, there
were two spaces before the References: value and now there is just
one).
Fix the test suite so that these are all noted as correct results
again.
These new tests demonstrate a bug as follows:
Multiple messages are added to the database
All of these message references a common parent
The parent message does not exist in the databas
In this scenario, the messages will not be recognized as belonging to
the same thread. We consider this a bug, and the new tests treat this
as a failure.
Edited by Carl Worth <cworth@cworth.org>: Split these tests into their
own commit (before the fix of the bug). This lets me see the actual
failure in the test suite, before the fix is applied. Also fix the
alignment of new messages from test suite, (so that the PASS portions
all line up---which is important while we're still manually verifying
test-suite results).
These results have all the same terms as the target phrase, but
not in the expected order. They are designed to ensure that we
actually test phrase searches.
And as it turns out, we're not currently quoting the search terms
properly, so the phrase-search tests now fail with this commit.
The sequential identifiers have the advantage of being guaranteed to
be unique (until we overflow a 64-bit unsigned integer), and also take
up half as much space in the "notmuch search" output (16 columns
rather than 32).
This change also has the side effect of fixing a bug where notmuch
could block on /dev/random at startup (waiting for some entropy to
appear). This bug was hit hard by the test suite, (which could easily
exhaust the available entropy on common systems---resulting in large
delays of the test suite).
These tests were surprisingly simple to write---not much code at all
and most of them worked the first time even with hand-prepared
versions of the expected output.
The previous generate_message function is what's needed when testing
"notmuch new". But after that, we never want to generate a message
without also adding it to the index. So create a new add_message
function with this convenience.
This is a test for the recently added feature where we detect that the
reply-to address already exists in the To: or Cc: header so will
already be replied to. In this case we want to include the From:
address in our reply, (where, otherwise we would use the Reply-To
address *instead* of the address in the From header).
The feature tested here is that we reply to both the sender and to
others addresses on the To: line of the original message, but that we
don't reply to our own address.
This is the standard support of reply-to, (replying to that address
rather than the from address). It has nothing to do with the proposed
feature for extra-clever handling of a mail from a mailing-list that
has munged the reply-to header.
When reply to a message addresses to an address configured in the
other_email setting in the configuration file, the reply should use
that address in the From header. Test this.
We were sleeping merely to ensure that our updates to the mail store
would result in the mtime of the appropriate directories being
updated. We make the test suite run faster by not sleeping, but
instead explicitly updating the mtime of the directory to a future
time with touch.
We're careful to ensure that the time is not merely in the future
compared to the current time, but also later than any previous update
to the same directory mtime.
This makes the test suite bash-specific, but that's not much of
an issue for me, (if somebody else would prefer some other language
then they can rewrite the test suite and maintain it).
The advantage here is that we'll now be able to easily generate
custom messages for testing operations that depend on the message
content, (such as "notmuch reply", etc.).
This notmuch-test script simply runs a few different notmuch operations,
(things that I found were useful while testing the rename-support code).
It's not useful as a test suite yet, since it doesn't actually check
the results of any operation, (the user of the suite has to know what
the results should be and must manually verify them. So there's no
integration with the build system yet, (no "make test" target).
But I didn't want to lose what I had so far, so here it is.