This adds a 100 termpos gap between all phrases indexed by
_notmuch_message_gen_terms. This fixes a bug where terms from the end
of one header and the beginning of another header could match together
in a single phrase and a separate bug where term positions of
un-prefixed terms overlapped.
This fix only affects newly indexed messages. Messages that are
already indexed won't benefit from this fix without re-indexing, but
the fix won't make things any worse for existing messages.
This adds two known-broken tests and one working test related to the
term positions assigned to terms from different headers or MIME parts.
The first test fails because we don't create a termpos gap between
different headers. The second test fails because we don't adjust
termpos at all when indexing multiple parts.
Previously, we indexed the name and address parts of from/to headers
with two calls to _notmuch_message_gen_terms. In general, this
indicates that these parts are separate phrases. However, because of
an implementation quirk, the two calls to _notmuch_message_gen_terms
generated adjacent term positions for the prefixed terms, which
happens to be the right thing to do in this case, but the wrong thing
to do for all other calls. Furthermore, _notmuch_message_gen_terms
produced potentially overlapping term positions for the un-prefixed
copies of the terms, which is simply wrong.
This change indexes both the name and address in a single call to
_notmuch_message_gen_terms, indicating that they should be part of a
single phrase. This masks the problem with the un-prefixed terms
(fixing the two known-broken tests) and puts us in a position to fix
the unintentionally phrases generated by other calls to
_notmuch_message_gen_terms.
Two of these are currently known-broken. We index the name and
address parts in two separate calls to _notmuch_message_gen_terms.
Currently this has the effect of placing the term positions of the
prefixed terms from the second call right after those of the first
call, but screws up the term positions of the non-prefixed terms.
Two of the search tests for "from" and "to" queries were clearly
trying to search for prefixed phrases, but forgot to shell quote the
phrases. Fix this by quoting them correctly.
This is effectively a revert of
commit 6812136bf5
Author: Jani Nikula <jani@nikula.org>
Date: Mon Mar 31 00:21:48 2014 +0300
lib: drop support for single-message mbox files
The intention was to drop support for indexing new single-message mbox
files (and whether that was a good idea in the first place is
arguable). However this inadvertently broke support for reading
headers from previously indexed single-message mbox files, which is
far worse.
Distinguishing between the two cases would require more code than
simply bringing back support for single-message mbox files.
At least in emacs24, this removes the "site-lisp" directories from the
load path in addition to enforcing --no-site-lisp --no-init-file.
This works around a slightly mysterious bug on Debian that causes
test-lib.el not to load when there is cl-lib.el(c) in some site-lisp
directory. It should be harmless in general since we really don't
want to load any files from addon packages to emacs.
It turns out to be inconvenient to delete the downloaded datafiles with
distclean, so I propose a new target which does that instead.
The closest conventional target is 'maintainer-clean'; the difference
here is that having the original source tarball is not enough to
reconstruct these files.
The linking to talloc is hard-coded in the testing Makefile. This patch
causes the linking to talloc to be done according to how TALLOC_LDFLAGS
was configured.
Signed-off-by: Charles Celerier <cceleri@cs.stanford.edu>
- The old test was quite impossible to debug; the new one shows the difference
between the two directories, if any.
- "repository" doesn't make sense for out of tree builds. Or tarball
builds, for that matter.
All we do here is calculate the backup filename, and call the existing
dump routine.
Also take the opportunity to add a message about being safe to
interrupt.
The main goal is to support gzipped output for future internal
calls (e.g. from notmuch-new) to notmuch_database_dump.
The additional dependency is not very heavy since xapian already pulls
in zlib.
We want the dump to be "atomic", in the sense that after running the
dump file is either present and complete, or not present. This avoids
certain classes of mishaps involving overwriting a good backup with a
bad or partial one.
We've supported mbox files containing a single message for historical
reasons, but the support has been deprecated, with a warning message
while indexing, since Notmuch 0.15. Finally drop the support, and
consider all mbox files non-email.
Previously, the term escaper used a blacklist of characters that
needed escaping. This blacklist turned out to be somewhat incomplete;
for example, it did not contain non-whitespace ASCII control
characters or Unicode "fancy quotes", both of which do require the
term to be escaped.
Switch to a whitelist of characters that are definitely safe to leave
unquoted. This fixes the broken test introduced by the previous
patch.
The current term escaper gets most of these right, but fails to escape
things containing Unicode "fancy quotes" or things containing
non-whitespace control characters.
I still have one machine with old enough Xapian to not have compaction
support. Make the tests check for unsupported compact operation when
compact is not available.
This allows (and requires) the original-tags to be passed along with
the current-tags to be passed to notmuch-tag-format-tags. This allows
the tag formatting to show added and deleted tags.By default a removed
tag is displayed with strike-through in red (if strike-through is not
available, eg on a terminal, inverse video is used instead) and an
added tag is displayed underlined in green.
If the caller does not wish to use the new feature it can pass
current-tags for both arguments and, at this point, we do exactly that
in the three callers of this function.
Note, we cannot tidily allow original-tags to be optional because we would
need to distinguish nil meaning "we are not specifying original-tags"
from nil meaning there were no original-tags (an empty list).
We use this in subsequent patches to make it clear when a message was
unread when you first loaded a show buffer (previously the unread tag
could be removed before a user realised that it had been unread).
The code adds into the existing tag formatting code. The user can
specify exactly how a tag should be displayed normally, when deleted,
or when added.
Since the formatting code matches regexps a user can match all deleted
tags with a ".*" in notmuch-tag-deleted-formats. For example setting
notmuch-tag-deleted-formats to '((".*" nil)) tells notmuch not to show
deleted tags at all.
All the variables are customizable; however, more complicated cases
like changing the face depending on the type of display will require
custom lisp.
Currently this overrides notmuch-tag-deleted-formats for the tests
setting it to '((".*" nil)) so that they get removed from the display
and, thus, all tests still pass.
It turns out there was a reason the old man pages were stored in a man
compatible hierarchy, namely so that we could run man on them before
installing.
Hardcode doc build location into test suite. This isn't ideal, but
let's unbreak the test suite for now.
The checksum file is used by the test infrastructure to verify the downloaded
test database is the one we had in mind. Note that this test is
rather strict, and the the checksum file needs to be recommitted when
the database is regenerated.
add a pattern .gitignore to ignore the actual databases
Test the upgrade from probabilistic to boolean folder: terms, and
addition of path: terms.
The test depends on the pre-built test corpus and database tarball and
checksum file being in place. If it's not, the test is skipped. The
mechanism to fetch the test database will be added later.
At the time of writing, a working test database and checksum file is
available at
http://notmuchmail.org/releases/test-databases/
It has been noted that some non-GNU environments make lack
sha256sum. We leave this portability issue for a followup patch.
In xapian terms, convert folder: prefix from probabilistic to boolean
prefix, matching the paths, relative from the maildir root, of the
message files, ignoring the maildir new and cur leaf directories.
folder:foo matches all message files in foo, foo/new, and foo/cur.
folder:foo/new does *not* match message files in foo/new.
folder:"" matches all message files in the top level maildir and its
new and cur subdirectories.
This change constitutes a database change: bump the database version
and add database upgrade support for folder: terms. The upgrade also
adds path: terms.
Finally, fix the folder search test for literal folder: search, as
some of the folder: matching capabilities are lost in the
probabilistic to boolean prefix change.
We will need this for improved folder search tests, but having some
folders should exercise our code paths better anyway.
Modify the relevant test accordingly to make it pass.
This reorganization triggers a bug in the test suite, namely that it
expects the output of --output=files to be in a certain order. So we
add the fix for that into the same commit.
This mainly involves sorting, although the case --duplicate=$n
requires more subtlety.
Sanitize tabs and newlines to spaces rather than question marks in
--output=summary --format=text output.
This will also hide any difference in unfolding a header that has been
folded with a tab. Our own header parser replaces tabs with spaces,
while gmime would retain the tab.
The printf builtin "%(fmt)T" specifier (which allows time values
to use strftime-like formatting) is introduced in bash 4.2.
Trying to execute this in pre-4.2 bash will fail -- and if this
happens execute the fallback piece of perl code to do the same thing.
The test names assigned to NOTMUCH_SKIP_TESTS variable can now be given
with or without the Tddd- prefix for tester convenience:
The test name without Tddd -prefix stays constant even when test filenames
are renumbered.
The test name with Tddd -prefix is printed out when tests run.
According the semantics of make, the expansion of $(dir) in recipes
uses dynamic scope, i.e. the value at the time the recipe is run. This
means if test/Makefile.local is not the last sub-makefile included,
all heck breaks loose.
Previously, we stripped the "Tnnn-" part from the test name when
printing its description at the beginning of each test. However, this
makes it difficult to find the source script for a test (e.g., when a
test fails). Put this prefix back.
Searching by Message-Id no longer works via the old mail-archive.com
API, though I have contacted them in hopes that they restore it to
prevent dead links. Anyway, the new API is cleaner.
Acked-by: Austin Clements <amdragon@MIT.EDU>
Script `notmuch-test` expects the results file have T\d\d\d- part
intact so the results files (and some test output files) are now
name as such.
Without this change `notmuch-test` will exit in case the test
script it was executing exited with nonzero value.
The T\d\d\d- part is dropped in new variable $this_test_bare which is
used in progress informational messages and when loading .el files in
emacs tests (whenever $this_test_bare.el exists).
If there is a syntax error in the emacs test library, it causes other
tests to hang or crash without a useful error message.
This test could be eliminated if the error reporting for emacs tests
was somehow improved.
All test scripts to be executed are now named as T\d\d\d-name.sh,
numers in increments of 10.
This eases adding new tests and developers to see which are test scripts
that are executed by test suite and in which order.
When naming test scripts in format 'T\d\d\d-name.sh' the list of
tests to run are created dynamically. This makes test
'Ensure that all available tests will be run by notmuch-test'
in test/basic obsolete.
Previously the tags on each line in tree view were separarted by ", "
not just " ". This is different from show and search views.
This patch removes this comma. This is a large patch as essentially
every line of each of the expected outputs in the tree tests needs
updating.
Apart from aesthetic reasons this simplifies the switch to
notmuch-tag-format-tags in the next patch.
In some environments (at least Hurd), process-attributes is
unimplimented and always returns nil. This ends up causing test
failures (see e.g. id:87a9ffofsc.fsf@zancas.localnet).
Historically and according to POSIX 1003.1-2001, a signal of 0 can be
used to check the validity of a pid. This seems less heinous than
parsing the output of ps(1).
This fixes a non-deterministic failure in "Ignore files and
directories specified in new.ignore (multiple occurrences)". The test
assumed that all directories would be scanned, even though nothing
updated the mtime of ${MAIL_DIR}. It *usually* worked nevertheless
because the tests run quickly enough that the directory mtime is
usually the same as the current time, so notmuch new does not update
the mtime in the database (because more changes could occur in the
same second). However, when it occasionally did update the mtime in
the database, the notmuch new call in this test would (correctly) skip
"pass 2" of scanning ${MAIL_DIR}, causing it to skip the following
expected lines:
(D) add_files_recursive, pass 2: explicitly ignoring ${MAIL_DIR}/.git
(D) add_files_recursive, pass 2: explicitly ignoring ${MAIL_DIR}/.ignored_hidden_file
(D) add_files_recursive, pass 2: explicitly ignoring ${MAIL_DIR}/ignored_file
This patch fixes this problem by touching ${MAIL_DIR} to ensure it
gets scanned and by rearranging the test to ensure the directories are
touched immediately before the main notmuch new call in the test.
There is an obscure bug in notmuch-hello that very occasionally causes
emacs_deliver_message to fail. Since it it doesn't serve any actual
purpose in the function we delete it, and leave tracking down the the
bug for another day.
Most of the tests previously using emacs_deliver_message do not use
the actual transmitted message, so we replace it with a simpler (and
presumably more reliable function) that only saves (and indexes) an
fcc copy of the message.
When NOTMUCH_TEST_QUIET environment variable is set to non-null value
messages when new test script starts and when test PASSes are disabled.
This eases picking the cases when tests FAIL (as those are still printed).
In preparation for quiet mode print empty line before writing the
test description. This is done now in function designed for it --
it will also be called when test fails.
test-lib.sh sometimes did equivalent of `basename "$0" .sh`, sometimes
skipping the basename part and sometimes .sh part. This worked as
we never had path components in $0 (more than ./) nor .sh ending.
Now the equivalent of `basename "$0" .sh` is done once and used
everywhere. In the future we may have .sh suffix in test names
-- removing those is a good idea.
The choice of decreasing timestamps is a hack which reduces the number
of existing tests which fail. This can be changed to increasing
if/when somebody wants update another 47 tests.
add a new function notmuch_date_sanitize for rfc822-ish things. Add
date sanitization to notmuch_show_sanitize_all and use it more places.
This is all in aid of a transition to unique timestamps on messages.
Eventually we want test messages to have distinct dates to avoid
reproducability problems. This sanitization will prevent some test
failures when that change is made.
Replace the use of a local function in maildir-sync with
notmuch_json_show_sanitize
This was causing test failures because version strings varied in
length between GNU/Linux and GNU/KFreeBSD. One can also imagine
different versions of gnupg causing the same failure.
When executed command line is written to *Notmuch errors* buffer,
shell-quote-argument will backslash-escape any char that is not in
"POSIX filename characters" (i.e. matching "[^-0-9a-zA-Z_./\n]").
Currently in two emacs tests shell has expanded $PWD as part of
emacs variable, which will later be fed to #'shell-quote-argument
and finally written to ERROR file. If $PWD contained non-POSIX
filename characters, data in ERROR file will not match $PWD when
later comparing in shell. Therefore, in these two particular cases
the escaped $PWD is replaced with YYY in ERROR file and expected
content is adjusted accordingly.
This fixes races in thread-local and global tagging in notmuch-search
(e.g., "+", "-", "a", "*", etc.). Previously, these would modify tags
of new messages that arrived after the search. Now they only operate
on the messages that were in the threads when the search was
performed. This prevents surprises like archiving messages that
arrived in a thread after the search results were shown.
This eliminates `notmuch-search-find-thread-id-region(-search)'
because these functions strongly encouraged racy usage.
This fixes the two broken tests added by the previous patch.
These tests check that both thread-local and global search tagging
operations are race-free. They are currently known-broken because
they aren't race-free.
These queries will match exactly the set of messages currently in the
thread, even if more messages later arrive. Two queries are provided:
one for matched messages and one for unmatched messages.
This can be used to fix race conditions with tagging threads from
search results. While tagging based on a thread: query can affect
messages that arrived after the search, tagging based on stable
queries affects only the messages the user was shown in the search UI.
Since we want clients to be able to depend on the presence of these
queries, this ushers in schema version 2.
(Unfortunately, it's difficult to first demonstrate this problem with
a known-broken test because modern Linux kernels have argument length
limits in the megabytes, which makes Emacs really slow!)
It was looking in completely the wrong place for the backup and the
(test) xapian database. Unfortunately test_begin_subtest hides the
relevant errors.
Currently notmuch-show looks at the prefix-arg directly via
current-prefix-arg. This changes it to use the interactive
specification.
One test (for elide-toggle functionality) set the prefix arg
directly. Update this test to set the new argument directly.
One test (reply to encrypted message in the crypto test) recently
started failing on some systems. The failure I saw were two extra
lines of the form
<87d2nbc5xg.fsf@host.i-did-not-set--mail-host-address--so-tickle-me>
The test pipes the output through
grep -v -e '^In-Reply-To:' -e '^References:'
which would normally these two ids but it does not, in this case,
because they are so long they get put on a separate line in the output.
To fix this we set mail-host-address for emacs deliver. example.com
seems a sensible address to use. This is short enough that we don't
get the line breaks above and the tests then all pass.
As explained by Jeffrey Stedfast, the author of GMime, quoted in [1]:
> Passing the GMIME_ENABLE_RFC2047_WORKAROUNDS flag to g_mime_init()
> *should* solve the decoding problem mentioned in the thread. This
> flag should be safe to pass into g_mime_init() without any bad side
> effects and my unit tests do test that code-path.
The thread being referred to is [2].
[1] id:87bo56viyo.fsf@nikula.org
[2] id:08cb1dcd-c5db-4e33-8b09-7730cb3d59a2@gmail.com
Some common broken RFC 2047 encodings that we currently let gmime
parse strictly. We could tell gmime to be forgiving in what it accepts
as RFC 2047 encoding, making these tests pass.
When 'xpg_echo' bash shell option is unset (usually the default)
echo builtin does not expand backslash-escape sequences by default
(i.e. '\n' is echoed as '\n' instead of newline). Not all bash
installations have this feature we depend on activated by default.
Note that the feature is bash (and GNU /bin/echo) specific. It is used
as it is convenient. If portability is needed (elsewhere) use printf(1)
(also often available as a shell builtin).
If any of the tests in our test system is not passing the execution
of the test suite completes with nonzero exit value.
It is better to rely on the exit value of the test system instead
of some arbitrary strings in test output (or use both).
notmuch_message_tags_to_maildir_flags() unconditionally moves messages from
maildir directory "new/" to maildir directory "cur/", which makes messages lose
their "new" status in the MUA. However some users want to keep this "new"
status after, for instance, an auto-tagging of new messages.
However, as Austin mentioned and according to the maildir specification,
messages living in "new/" are not allowed to have flags, even if mutt allows it
to happen. For this reason, this patch prevents moving messages from "new/" to
"cur/", only if no flags have to be changed. It's hopefully enough to satisfy
mutt (and maybe other MUAs showing the "new" status) users checking the "new"
status.
Changelog:
* v2: Fix bool type as well as NULL returned despite having no errors (Austin
Clements)
* v4: Tag the related test (contributed by Michal Sojka) as working
Signed-off-by: Louis Rilling <l.rilling@av7.net>
[Condition for keeping messages in new/ was extended to satisfy all
tests from the previous patch. -Michal Sojka]
[Added by David Bremner, to keep the tests passing at each commit]
update insert tests for new maildir synchronization rules
As of id:1355952747-27350-4-git-send-email-sojkam1@fel.cvut.cz
we are more conservative about moving messages from ./new to ./cur.
This updates the insert tests to match
As mentioned by Jani Nikula in id:87vcccp4y3.fsf@nikula.org, some cases
of maildir synchronization are not covered by our tests. Let's add the
missing tests.
Some MUA's like mutt show the difference between "new" emails living in maildir
directory new/, and "old" emails living in maildir directory cur/. However
notmuch tag unconditionally moves selected messages from new/ to cur/, even if
no maildir synchronized tag is changed.
While maildir specification forbids messages with tags living in new/, there is
no need to move messages to cur/ when no maildir synchronized tag is changed.
Thus notmuch can remain transparent with respect to other MUA's.
[ Edited commit log to better describe the intended changes, and tag the
test as broken until the actual changes are implemented -- Louis Rilling ]
Signed-off-by: Louis Rilling <l.rilling@av7.net>
[ Converted to use test_subtest_known_broken, David Bremner ]
RFC 2047 states that the encoding and charset in an encoded word are
case-insensitive, so force them to lower case in the reply test. This
fixes an issue caused by GMime versions (somewhere between 2.6.10 and
2.6.16), which changed the capitalization of the encoding.
Previously, reply's default text format used an odd mix of RFC 2045
MIME encoding for the reply template's body and some made-up RFC
2822-like UTF-8 format for the headers. The intent was to present the
headers to the user in a nice, un-encoded format, but this assumed
that whatever ultimately sent the email would RFC 2047-encode the
headers, while at the same time the body was already RFC 2045 encoded,
so it assumed that whatever sent the email would *not* re-encode the
body.
This can be fixed by either producing a fully decoded UTF-8 reply
template, or a fully encoded MIME-compliant RFC 2822 message. This
patch does the latter because it is
a) Well-defined by RFC 2822 and MIME (while any UTF-8 format would be
ad hoc).
b) Ready to be piped to sendmail. The point of the text format is to
be minimal, so a user should be able to pop up the template in
whatever editor they want, edit it, and push it to sendmail.
c) Consistent with frontend capabilities. If a frontend has the
smarts to RFC 2047 encode the headers before sending the mail, it
probably has the smarts to RFC 2047 decode them before presenting
the template to a user for editing.
Also, as far as I know, nothing automated consumes the reply text
format, so changing this should not cause serious problems. (And if
anything does still consume this format, it probably gets these
encoding issues wrong anyway.)
Previously, the References header code seemed to assume
notmuch_message_get_header would return NULL if the header was not
present, but it actually returns "". As a result of this, it was
inserting an unnecessary space when concatenating an empty or missing
original references header with the new reference.
This shows up in only two tests because the text reply format later
passes the whole reply template through g_mime_filter_headers, which
has the side effect of stripping out this extra space.