This switches the new batch-tag format away from using a home-grown
hex-encoding scheme for message IDs in the dump to simply using Xapian
queries with Xapian quoting syntax.
This has a variety of advantages beyond presenting a cleaner and more
consistent interface. Foremost is that it will dramatically simplify
the quoting for batch tagging, which shares the same input format.
While the hex-encoding is no better or worse for the simple ID queries
used by dump/restore, it becomes onerous for general-purpose queries
used in batch tagging. It also better handles strange cases like
"id:foo and bar", since this is no longer syntactically valid.
When we switch to using regular Xapian queries in the dump format, \n
will cause problems, so we disallow it. Specially, while Xapian can
quote and parse queries containing \n without difficultly, quoted
queries containing \n still span multiple lines, which breaks the
line-orientedness of the dump format. Strictly speaking, we could
still round-trip these, but it would significantly complicate restore
as well as scripts that deal with tag dumps. This complexity would
come at absolutely no benefit: because of the RFC 2822 unfolding
rules, no amount of standards negligence can produce a message with a
message ID containing a line break (not even Outlook can do it!).
Hence, we simply disallow it.
This patch corrects several undesirable behaviours:
1) Empty files were not detected, leading to buffer read overrun.
2) An initial blank line cause restore to silently abort
3) Initial comment line caused format detection to fail
notmuch_json_show_sanitize replaced "filename" field values even in part
structures, where the value is predictable. Make it only normalize the
filename value if it is an absolute path (begins with slash), which is
true of the Maildir filenames that were intended to be normalized away.
This slightly changes the output of an existing test since we now
report non-zero exits with a pop-up buffer instead of at the end of
the search results.
* test/emacs:
- Rename subtests "{Add,Remove} tag from notmuch-show view" to
"notmuch-show: {add,remove} single tag {to,from} single message"
to be consistent with the following tests.
- New subtest "notmuch-show: add multiple tags to single message":
`notmuch-show-add-tag' ("+") can add multiple tags to a message.
- New subtest "notmuch-show: remove multiple tags from single message":
`notmuch-show-remove-tag' ("-") can remove multiple tags from a message.
We want to test both that error/warning messages are generated when
they should be, and not generated when they should not be. This varies
between restore and batch tagging.
These one need the completed functionality in notmuch-restore. Fairly
exotic tags are tested, but no weird message id's.
We test each possible input to autodetection, both explicit (with
--format=auto) and implicit (without --format).
The quoting for ${SEARCH} is broken when it's supposed to be '*', and
it seems tricky to get it right. Just drop the variable and use '*'
directly. Before this, none of the messages ever matched, and the test
was comparing zeros.
test_expect_equal_json uses json.tool from the system Python. While
Python 2 wasn't picky about the encoding of stdin, Python 3 decodes
stdin strictly according to the environment. Since we set LC_ALL=C
for the tests, Python 3's json.tool was assuming stdin would be in
ASCII and aborting when it couldn't decode the UTF-8 characters from
some of the JSON tests. This patch sets the PYTHONIOENCODING
environment variable to utf-8 when invoking json.tool to override
Python's default encoding choice.
Without this change, GCC complains as follows:
gcc test/random-corpus.o test/database-test.o notmuch-config.o command-line-arguments.o lib/libnotmuch.a util/libutil.a parse-time-string/libparse-time-string.a -o test/random-corpus -lgmime-2.6 -lgio-2.0 -lgobject-2.0 -lglib-2.0 -Wl,-rpath,/usr/lib -ltalloc -lxapian
/usr/bin/ld: lib/libnotmuch.a(database.o): undefined reference to symbol '_ZNSs4_Rep10_M_destroyERKSaIcE@@GLIBCXX_3.4'
/usr/bin/ld: note: '_ZNSs4_Rep10_M_destroyERKSaIcE@@GLIBCXX_3.4' is defined in DSO /usr/lib/libstdc++.so.6 so try adding it to the linker command line
/usr/lib/libstdc++.so.6: could not read symbols: Invalid operation
collect2: error: ld returned 1 exit status
make: *** [test/random-corpus] Error 1
We demonstrate the current notmuch restore parser being confused by
message-id's and tags containing non alpha numeric characters
(particularly space and parentheses are problematic because they are
not escaped by notmuch dump).
We save the files as hex escaped on disk so that terminal emulators
will not get confused if the test fails (as we mostly expect it to do).
Initial use case is testing dump and restore, so we only have
message-ids and tags.
The message ID's are nothing like RFC compliant, but it doesn't seem
any harder to roundtrip random UTF-8 strings than RFC-compliant ones.
Tags are UTF-8, even though notmuch is in principle more generous than
that.
updated for id:m2wr04ocro.fsf@guru.guru-group.fi
- talk about Unicode value rather some specific encoding
- call talloc_realloc less times
Initially, provide a way to create "stub" messages in the notmuch
database without corresponding files. This is essentially cut and
paste from lib/database.cc. This is a seperate file since we don't
want to export these symbols from libnotmuch or bloat the library with
non-exported code.
Previously, the test framework generated a variable name for each
external prereq as a poor man's associative array. Unfortunately,
prereqs names may not be legal variable names, leading to
unintelligible bash errors like
test_missing_external_prereq_emacsclient.emacs24_=t: command not found
Using proper associative arrays to track prereqs, in addition to being
much cleaner than generating variable names and using grep to
carefully construct unique string lists, removes restrictions on
prereq names.
Previously, if a test script aborted (e.g., because it passed too few
arguments to a test function), the test driver loop would simply
continue on to the next test script and the final results would
declare that everything passed (except that the test count would look
suspiciously low, but maybe you just misremembered how many tests
there were).
Now, if a test script exits with a non-zero status and did not produce
a final results file, we propagate that failure out of the driver loop
immediately.
To keep this simple, this patch removes the PID from the test-results
file name. This PID was inherited from the git test system and seems
unnecessary, since the file name already includes the name of the test
script and the test-results directory is created anew for each run.
And require that if TEST_EMACS is specified, so is TEST_EMACSCLIENT.
Previously, the test framework always used "emacsclient", even if the
Emacs in use was overridden by TEST_EMACS. This causes problems if
both Emacs 23 and Emacs 24 are installed, the Emacs 23 emacsclient is
the system default, but TEST_EMACS is set to emacs24. Specifically,
with an Emacs 24 server and an Emacs 23 client, emacs tests that run
very quickly may produce no output from emacsclient, causing the test
to fail.
The Emacs server uses a very simple line-oriented protocol in which
the client sends a request to evaluate an expression and the server
sends a request to print the result of evaluation. Prior to Emacs bzr
commit 107565 on March 11th, 2012 (released in Emacs 24.1), if
multiple commands were sent to the emacsclient between when it sent
the evaluation command and when it entered its receive loop, it would
only process the first response command, ignoring the rest of the
received buffer. This wasn't a problem with the Emacs 23 server
because it sent only the command to print the evaluation result.
However, the Emacs 24 server first sends an unprompted command
specifying the PID of the Emacs server, then processes the evaluation
request, then sends the command to print the result. If the
evaluation is fast enough, it can send both of these commands before
emacsclient enters the receive loop. Hence, if an Emacs 24 server is
used with an Emacs 23 emacsclient, it may miss the response printing
command, ultimately causing intermittent notmuch test failures.
Previously, many tests in emacs-subject-to-filename didn't quote the
$output argument to test_expect_equal. As a result, if $output was
empty, test_expect_equal would be passed only one argument and would
abort the entire test script. By quoting the argument, we ensure
test_expect_equal will always receive two arguments.
We now test for user ignore patterns before attempting to determine if
a directory entry is itself a directory. As a result, we no longer
abort for broken symlinks if the user has explicitly ignored them.
This fixes the test added in the previous patch. It also slightly
changes the debug output checked by another test of ignores.
The macro with-current-notmuch-show-message executes command
`notmuch show --format=raw id:...` which just outputs the contents
of the mail file verbatim (into temporary buffer). In case e.g. utf-8
locale is used the temporary buffer has buffer-file-coding-system as
utf-8. In this case Emacs converts the data to multibyte format, guessing
that input is in utf-8.
However, the "raw" (MIME) message may contain octet data in any other
8bit format, and as no (MIME-)content spesific handling to the message
is done at this point, conversion to other formats may lose information.
By setting coding-system-for-read 'no-conversion drops the conversion part
and makes this handle input as notmuch-get-bodypart-internal() does.
This marks the broken test in previous change fixed.
Previously, we would treat multi-message mboxes as one giant email,
which, besides the obvious incorrect indexing, often led to
out-of-memory errors for archival mboxes. Now we explicitly reject
multi-message mboxes. For historical reasons, we retain support for
single-message mboxes, but official deprecate this behavior.
This test is currently broken. Note that its brokenness cascades and
causes the next test to fail as well (because notmuch incorrectly
indexes the mbox file).
There are currently 45 TESTS scripts. 36 of those load
test-lib.sh using '. ./test-lib.sh' and 9 '. test-lib.sh'.
In latter case test-lib.sh is first searched from directories
in PATH (posix) and then from current directory (bash feature).
Changed the 9 files to execute '. ./test-lib.sh'. The test-lib.sh
should never be loaded from directory in PATH.
Previously, this would simply indicate that the grep failed without
any indication of the Emacs output it failed on. Now we take
advantage of the test framework's handling of stdout to display the
incorrect Emacs output if the test fails.
It seems we have never tested the case that restore --accumulate
actually adds tags. I noticed this when I started optimizing and no
tests failed.
The bracketing with "restore --input=dump.expected" are to make sure
we start in a known state, and we leave the database in a known state
for the next test.
The test designed to exercise Emacs' rendering of HTML emails
containing images inadvertently assumed w3m was available under Emacs
23. The real point of this test was to check that Emacs 24's shr
renderer didn't crash when given img tags, so use shr if it's
available, html2text otherwise (which is built in), and do only a
simple sanity check of the result.
This regexp agrees with Xapian query syntax much more closely, though
we specifically disallow various cases that would be confusing in the
context of an email body (e.g., punctuation at the end of an id: link
is not considered part of the id: link because it's probably part of
the surrounding text).
In particular, this handles id: links that are not surrounded by
quotes much better, which stash is much more likely to generate now
that we don't quote id's that don't need to be quoted. It also
handles quoted id: links better.
We update the buttonization test to reflect the new pattern.
This matches the current behavior of the buttonizer, so it passes, but
many of these cases are not what you'd want (and some of them aren't
even valid Xapian queries). The next patch will fix the handling of
these cases and update the test.
Over time, maintaining this very long regex has become irritating,
especially when resolving conflicts.
This patch replaces the call to sed with multiple extra arguments to
find. Since each test binary is now on it's own line, this should
make resolving conflicts easier.
Test the date/time parser module directly, independent of notmuch,
using the parse-time test tool.
Credits to Michal Sojka <sojkam1@fel.cvut.cz> for writing most of the
tests.
Add a smoke testing tool to support testing the date/time parser
module directly and independent of the rest of notmuch.
Credits to Michal Sojka <sojkam1@fel.cvut.cz> for the stdin parsing
idea and consequent massive improvement in testability.
Currently, we only properly escape stashed id queries, but there are
other places where the Emacs UI constructs queries for boolean terms.
Since this escaping function is meant to be used in other places, it
avoids escaping strings that don't need escaping.
This disallows adding empty tags, since nothing but confusion follows
in their wake, and disallows adding tags that begin with "-" because
they are also confusing, the tag "-" is impossible to remove using the
CLI, and because the syntax for removing such tags conflicts with long
argument syntax.
This does not place any restrictions on what tags can be removed, as
that would make it difficult for people who have the misfortune of
already having malformed tags to remove these tags.
Although messages are created in a particular order, it seems that
when they are created on a tmpfs, they do not always come back in the
same order, leading to the same files being ignored but being output
in a different order. This causes the test to fail because the outputs
being compared are the same.
Fix the failures by sorting the output of notmuch --debug and
comparing this to a hand-sorted version of its output.
Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
The use of --background option (instead of shell '&') ensures that
smtp-dummy is listening its server socket until execution of shell
script can continue, thus the client will always have socket where
to connect.
smtp-dummy outputs smtp_dummy_pid variable in shell assignment format;
eval'ing that output makes that variable available for the shell.
As the smtp-dummy instance is no longer child process of the script
the SIGKILL signal sent to it will ensure it is going away in case
the mail sender fails to connect to smtp-dummy.
When shell executes background process using '&' the scheduling of
that new process is arbitrary. It could be that smtp-dummy doesn't
get execution time to listen() it's server socket until some other
process attempts to connect() to it. The --background option in
smtp-dummy makes it to go background *after* it started to listen
its server socket.
When --background option is used, the line "smtp_dummy_pid='<pid>'"
is printed to stdout from where shell can eval it.
Demonstrates that *every* file/directory which matches one of the values
in 'new.ignore' will be ignored, independent of its depth/location in
the mail store.
Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
Obviates the need to create a 'NOTMUCH_NEW' clone which runs
'notmuch new --debug'. This will be used in a later patch.
Doesn't cause any issues for other tests.
* test/emacs:
- New subtest "notmuch-show: collapse all messages in thread":
`notmuch-show-open-or-close-all' with prefix arg ("C-u M-RET")
collapses all messages in thread.
- New subtest "notmuch-show: uncollapse all messages in thread":
`notmuch-show-open-or-close-all' without prefix arg ("M-RET")
uncollapses all messages in thread.
* test/emacs:
- New subtest "notmuch-show: show message headers":
Setting `notmuch-message-headers-visible' to t causes all headers
defined in `notmuch-message-headers' to be shown.
- New subtest "notmuch-show: hide message headers":
Setting `notmuch-message-headers-visible' to nil causes all headers
defined in `notmuch-message-headers' to be hidden.
("Subject:" may be an exception; See the use of `headers-start' in
`notmuch-show-insert-msg')
- New subtest "notmuch-show: hide message headers (w/ notmuch-show-toggle-headers)":
Setting `notmuch-message-headers-visible' to t causes all headers
defined in `notmuch-message-headers' to be shown, but they can be
hidden for the current message by running `notmuch-show-toggle-headers'.
This requires changing the contents of the crypto tests, as one thread
that was marked read by the earlier tests in test/emacs is no longer
marked read.
This moves tests for:
- 09d19ac "test: emacs: toggle eliding of non-matching messages in
`notmuch-show'", which should have actually read: "test: emacs:
toggle processing of cryptographic MIME parts in `notmuch-show'".
See commit 19ec74c5.
- 5ea1dbe "test: emacs: toggle eliding of non-matching messages in
`notmuch-show'"
- 345faab "test: emacs: toggle thread content indentation in
`notmuch-show'"
Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
Since $TEST_DIRECTORY is an absolute path, any filenames generated
with it will be complete paths. Only use the basename to generate
suffixes for filenames.
Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
Most Emacs tests end with a call to (test-output), which saves the
buffer to a filed called OUTPUT. Previously, if the test code failed
with an exception before this call, the test framework would then
compare against the OUTPUT file from the last Emacs test, resulting in
confusing diffs.
This requires one tweak to an emacs test that made two calls to
test_emacs and expected an OUTPUT file from the first call. We simply
reverse the order of the test_emacs calls.
On FreeBSD, and probably anywhere else someone installed xapian to
some other prefix, we need to use XAPIAN_LDFLAGS to make the linker can
actually find libxapian.
Before the change, test_expect_equal_file() function treated the first
argument as "actual output file" and the second argument as "expected
output file". When the test fails, the files are copied for later
inspection. The first files was copied to "$testname.output" and the
second file to "$testname.expected". The argument order for
test_expect_equal_file() is often wrong which results in confusing
diff output and incorrectly named files.
The patch solves the issue by changing test_expect_equal_file() to
treat arguments just as two files, without any special properties
(like "actual" and "expected"). The file names for copying is now
based on the given file name: "$testname.$file1" and
"$testname.$file2". E.g. if test_expect_equal_file() is called with
"OUTPUT" and "EXPECTED", the copied files can be named
"emacs.1.OUTPUT" and "emacs.1.EXPECTED".
The down side of this approach is that diff argument order depends on
test_expect_equal_file() argument order. So sometimes we get diff
from expected to actual results, and sometimes the other way around.
But the files are always named correctly.
The behaviour of "emacsclient --eval nil" changed from emacs23 to
emacs24, and in emacs24 it prints 'nil' rather than an empty string.
(format "%S" foo) produces a sexpr form of foo, and is consistent
between the two versions.
The version of message.el in emacs24 omits the charset=us-ascii,
causing the current version of this test to fail. With this patch, we
accept either option. According to RFC 2046, they are semantically
equivalent.
When running emacs tests using emacs 23.1.1 the tests block (until timeout)
when emacs function (notmuch-test-wait) is called.
There is an emacs bug #2930 titled:
23.0.92; `accept-process-output' and `sleep-for' do not run sentinel
It seems this is present in emacs 23.1.
Calling list-processes after accept-process-output seems work around
this problem; in case Emacs version is 23.1 a defadvice is activated
to do just that.
notmuch-test-wait called sleep-for in a loop to wait unconditionally 0.1
seconds while waiting for process to exit.
accept-process-output returns as soon as there is any data available
from process, so using it avoids unnecessary fixed delays.
Both of these functions run process sentinels.
The string function in a sprinter may be called with a NULL string
pointer (eg if a header is absent). This causes a segfault. We fix
this by checking for a null pointer in the string functions and update
the sprinter documentation.
At the moment some output when format=text is done directly rather than
via an sprinter: in that case a null pointer is passed to printf or
similar and a "(null)" appears in the output. That behaviour is not
changed in this patch.
The syntax --output=filename is a smaller change than deleting the
output argument completely, and conceivably useful e.g. when running
notmuch under a debugger.
Format canonicalization of JSON output is no longer necessary, so
remove it. Value canonicalization (e.g., normalizing thread IDs) is
still necessary, so all of the sanitization functions remain.
Previously, we used a variety of ad-hoc canonicalizations for JSON
output in the test suite, but were ultimately very sensitive to JSON
irrelevancies such as whitespace. This introduces a new test
comparison function, test_expect_equal_json, that first pretty-prints
*both* the actual and expected JSON and the compares the result.
The current implementation of this simply uses Python's json.tool to
perform pretty-printing (with a fallback to the identity function if
parsing fails). However, since the interface it introduces is
semantically high-level, we could swap in other mechanisms in the
future, such as another pretty-printer or something that does not
re-order object keys (if we decide that we care about that).
In general, this patch does not remove the existing ad-hoc
canonicalization because it does no harm. We do have to remove the
newline-after-comma rule from notmuch_json_show_sanitize and
filter_show_json because it results in invalid JSON that cannot be
pretty-printed.
Most of this patch simply replaces test_expect_equal and
test_expect_equal_file with test_expect_equal_json. It changes the
expected JSON in a few places where sanitizers had placed newlines
after commas inside strings.
These extra directories cause problems for building on Debian
twice in a row.
In order to remove directories, we need to us "rm -rf" instead of
"rm -f". So now we should be extra careful what we add to the
variable CLEAN.
This patch switches from the current ad-hoc printer to the structured
formatters in sprinter.h, sprinter-text.c and sprinter-json.c.
The JSON tests are changed slightly in order to make them PASS for the
new structured output formatter.
The text tests pass without adaptation.
The JSON format eliminates the complex escaping issues that have
plagued the text search format. This uses the incremental JSON parser
so that, like the text parser, it can output search results
incrementally.
This slows down the parser by about ~4X, but puts us in a good
position to optimize either by improving the JSON parser (evidence
suggests this can reduce the overhead to ~40% over the text format) or
by switching to S-expressions (evidence suggests this will more than
double performance over the text parser). [1]
This also fixes the incremental search parsing test.
This has one minor side-effect on search result formatting.
Previously, the date field was always padded to a fixed width of 12
characters because of how the text parser's regexp was written. The
JSON format doesn't do this. We could pad it out in Emacs before
formatting it, but, since all of the other fields are variable width,
we instead fix notmuch-search-result-format to take the variable-width
field and pad it out. For users who have customized this variable,
we'll mention in the NEWS how to fix this slight format change.
[1] id:"20110720205007.GB21316@mit.edu"
This advises the search process filter to make it process one
character at a time in order to test the pessimal case for incremental
search output parsing.
The text parser fails this test because it gets tricked into thinking
a parenthetical remark in a subject is the tag list.
There didn't seem to be these basic tests for --format=text,
as there are for --format=json. These are just the tests from
the `json' script, with adjusted expected outputs.
Previously, notmuch new only synchronized maildir flags to tags for
files with a maildir "info" part. Since messages in new/ don't have
an info part, notmuch would ignore them for flag-to-tag
synchronization.
This patch makes notmuch consider messages in new/ to be legitimate
maildir messages that simply have no maildir flags set. The most
visible effect of this is that such messages now automatically get the
unread tag.
Currently, notmuch new only synchronizes maildir flags to tags for
files that have an "info" part. However, in maildir, new mail doesn't
gain the info part until it moves from new/ to cur/. Hence, even
though mail in new/ doesn't have an info part, it is still a maildir
message and thus has maildir flags (though none of them set).
The most visible effect of not synchronizing maildir flags for
messages in new/ is that newly delivered messages don't get the unread
tag (unless it is assigned by some other mechanism, like new.tags).
This patch does *not* modify the test for messages in cur/ that do not
have an "info" part. Unlike a message in new/, a message in cur/
without an info part is no longer a maildir message, and thus
shouldn't be subject to maildir flag synchronization.