On ppc64el, races are detected by TSan:
WARNING: ThreadSanitizer: data race (pid=4520)
Read of size 8 at 0x7ffff20016c0 by thread T1:
#0 strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:386 (libtsan.so.2+0x77c0c)
#1 strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:378 (libtsan.so.2+0x77c0c)
#2 g_strdup ../../../glib/gstrfuncs.c:362 (libglib-2.0.so.0+0xa4ac4)
Previous write of size 8 at 0x7ffff20016c0 by thread T2:
#0 malloc ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:647 (libtsan.so.2+0x471f0)
#1 g_malloc ../../../glib/gmem.c:130 (libglib-2.0.so.0+0x7bb68)
Location is heap block of size 20 at 0x7ffff20016c0 allocated by thread T2:
#0 malloc ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:647 (libtsan.so.2+0x471f0)
#1 g_malloc ../../../glib/gmem.c:130 (libglib-2.0.so.0+0x7bb68)
This appears to be a false positive in GLib, as explained at
https://gitlab.gnome.org/GNOME/glib/-/issues/1672#note_1831968
In short, a call to fstat fails under TSan and GLib's g_sterror will
intern the error message, which will be reused by other threads.
Since upstream appears to be aware that GLib doesn't play nicely with
TSan, suppress everything coming from the library instead of
maintaining a fine grained list.
Reported at
https://buildd.debian.org/status/fetch.php?pkg=notmuch&arch=ppc64el&ver=0.38%7Erc0-1&stamp=1692959868&raw=0
Depending on compiler (gcc, g++, clang) and standard options (c99, c11),
string.h may or may not include strings.h, leading to possibly missing
or conflicting declarations of strcasestr.
Include both so that both detection and compilation phases use the same
(possibly optimised) implementations.
Suggested-by: Thomas Schneider <qsx@chaotikum.eu>
Suggested-by: Florian Weimer <fweimer@redhat.com>
Suggested-by: Tomi Ollila <tomi.ollila@iki.fi>
Native compilation is kindof useless in the test suite because we
throw away the cache after every subtest. The test suite could in
principle share an eln cache within a given test file; for now try to
minimize the amount of native-compilation. There is an intermittent
bug where emacs loses track of its default-directory; I suspect (but
have no proof) that bug is related to native compilation and/or race
conditions. This patch seems to prevent that bug (or at least reduce
its frequency).
According to discussion on
https://gitlab.gnome.org/GNOME/glib/-/issues/3078
it looks like upstream will stop supporting top of file comments.
It is questionable whether we really need this feature, but for now
update notmuch-config to simulate it.
As of Emacs 29.1, In-Reply-To is in the default value for
message-hidden-headers. We actually want to see that in the test
suite, so remove it again. To future proof the tests, fix a default
value for message-hidden-headers specifically for the test suite.
It is wasteful to remove a filename term when the whole message
document is about to be removed from the database. Profiling with perf
shows this takes a significant portion of the time when cleaning up
removed files in the database.
The logic of n_d_remove_message becomes a bit more convoluted here in
order to make the change minimal.
It is possible that this function can be further optimized, since the
expansion of filename terms into filenames is probably not needed
here.
It isn't really clear how this worked before. Traversing the terms of
a document after deleting it from the database seems likely to be
undefined behaviour at best
We put some effort into testing the built copy rather than some
installed copy. On the other hand for people like packagers, testing
the installed copy is also of interest.
When NOTMUCH_TEST_INSTALLED is set to a nonempty value, tests do not
require a built notmuch tree or running configure.
Some of the tests marked as broken when running against installed
notmuch are probably fixable.
When running the test suite without building first, it is desirable to
have the tests consider these variables being undefined as equivalent
to the feature not being present, and in particular for the tests not
to generate errors.
In the decade (!) since this corpus was last updated, the keyserver
network is essentially dead, and I have migrated gpg keys. Bump the
version number as a clean way of switching signatures. Also update the
instructions to suggest using "--locate-external-key" to download the
public key. By default this uses WKD, which is now supported for my
UID.
According to my reading of RFC5322, there is an obsolete syntax for
Message-Id which permits folding whitespace (i.e. to be removed /
ignored by parsers). In [1] Paul Wise observed that notmuch removed
whitespace on indexing, but does not do any corresponding
normalization of queries. Mark the latter as a bug by adding a failing
test.
[1]: id:20230409044143.4041560-1-pabs3@bonedaddy.net
Py 3.12 finally pulled the plug on the `SafeConfigParser` class which
has been deprecated since py 3.2.
We use it in the legacy bindings only, so take the easy route of
importing `ConfigParser` as `SafeConfigParser` and monkey-patching so
that the class has the expected interface.
This prevents data loss when users configure the search cache Maildir to be a
real Maildir containing their real mail data, since the search cache Maildir
is expected to contain only symlinks to the real mail data.
Prevents: <ZCsQBNmbzwkvbpHA@localhost.localdomain>
This should be be slightly faster since it avoids forking a shell
and is less code in and less dependencies for the script.
Since String::ShellQuote isn't used elsewhere, drop mention of it.
Spaces need to be stripped when querying the Message-Id,
because notmuch stores them in Xapian with spaces stripped.
All double-quote characters need to be doubled to escape them,
otherwise they will be added as extra query terms outside the id.
We pre-parse into a list of compiled regular expressions to avoid
calling regexc on the hot (indexing) path. As explained in the code
comment, this cannot be done lazily with reasonable error reporting,
at least not without touching a lot of the code in index.cc.
This replaces two instances of Xapian::Query::MatchAll with the
equivalent but thread-safe alternative Xapian::Query(std::string()).
Xapian::Query::MatchAll maintains an internal pointer to a refcounted
Xapian::Internal::QueryTerm.
None of this is thread-safe but that wouldn't be an issue if
Xapian::Query::MatchAll wasn't static. Because it's static, the
refcounting goes awry when Notmuch is called from multiple threads.
This is actually documented by Xapian:
4715de3a9f/xapian-core/include/xapian/query.h (L65)
While static, Xapian::Query::MatchNothing is safe because it doesn't
maintain an internal object and as such, doesn't use references.
Two best-effort tests making use of TSan were added to showcase the
issue (I couldn't figure out a way to deterministically reproduce it
without making an unmaintainable mess).
First, when two databases are created in parallel, a query that uses
Xapian::Query::MatchAll is made (lib/query.cc), resulting in the
following backtrace on a segfault:
#0 0x00007ffff76822af in Xapian::Query::get_terms_begin (this=0x7fffe80137f0) at api/query.cc:141
#1 0x00007ffff7f933f5 in _notmuch_query_cache_terms (query=0x7fffe80137c0) at lib/query.cc:176
#2 0x00007ffff7f93784 in _notmuch_query_ensure_parsed_xapian (query=0x7fffe80137c0) at lib/query.cc:225
#3 0x00007ffff7f9381a in _notmuch_query_ensure_parsed (query=0x7fffe80137c0) at lib/query.cc:260
#4 0x00007ffff7f93bfe in _notmuch_query_search_documents (query=0x7fffe80137c0, type=0x7ffff7fa9b1e "mail", out=0x7ffff666da18) at lib/query.cc:361
#5 0x00007ffff7f93ba4 in notmuch_query_search_messages (query=0x7fffe80137c0, out=0x7ffff666da18) at lib/query.cc:349
#6 0x00007ffff7f83d98 in notmuch_database_upgrade (notmuch=0x7fffe8000bd0, progress_notify=0x0, closure=0x0) at lib/database.cc:934
#7 0x00007ffff7fa110f in notmuch_database_create_with_config (database_path=0x7ffff666dcb0 "/tmp/notmuch.MZ2AGr", config_path=0x7ffff7faab3c "", profile=0x0, database=0x0, status_string=0x7ffff666dc90) at lib/open.cc:754
#8 0x00007ffff7fa0d6f in notmuch_database_create_verbose (path=0x7ffff666dcb0 "/tmp/notmuch.MZ2AGr", database=0x0, status_string=0x7ffff666dc90) at lib/open.cc:653
#9 0x00007ffff7fa0ceb in notmuch_database_create (path=0x7ffff666dcb0 "/tmp/notmuch.MZ2AGr", database=0x0) at lib/open.cc:637
...
Second, some queries would make use of Xapian::Query::MatchAll
(lib/regexp-fields.cc), resulting in the following backtrace on a
segfault:
#0 0x00007f629828b690 in Xapian::Internal::QueryBranch::gather_terms (this=0x7f628800def0, void_terms=0x7f629726d5a0) at api/queryinternal.cc:1245
#1 0x00007f629828c260 in Xapian::Internal::QueryScaleWeight::gather_terms (this=0x7f628800df70, void_terms=0x7f629726d5a0) at api/queryinternal.cc:1434
#2 0x00007f629828b69f in Xapian::Internal::QueryBranch::gather_terms (this=0x7f628800dd90, void_terms=0x7f629726d5a0) at api/queryinternal.cc:1245
#3 0x00007f6298282571 in Xapian::Query::get_unique_terms_begin (this=0x7f628800dcd8) at api/query.cc:166
#4 0x00007f629841a59b in Xapian::Weight::Internal::accumulate_stats (this=0x7f628800dca0, subdb=..., rset=...) at weight/weightinternal.cc:86
#5 0x00007f62983c15ba in LocalSubMatch::prepare_match (this=0x7f628800df20, nowait=true, total_stats=...) at matcher/localsubmatch.cc:172
#6 0x00007f62983c8fcc in prepare_sub_matches (leaves=std::vector of length 1, capacity 1 = {...}, stats=...) at matcher/multimatch.cc:237
#7 0x00007f62983c98a3 in MultiMatch::MultiMatch (this=0x7f629726d9a0, db_=..., query_=..., qlen=3, omrset=0x0, collapse_max_=0, collapse_key_=4294967295, percent_cutoff_=0, weight_cutoff_=0, order_=Xapian::Enquire::ASCENDING, sort_key_=0, sort_by_=Xapian::Enquire::Internal::VAL, sort_value_forward_=true, time_limit_=0, stats=..., weight_=0x7f6288008d50, matchspies_=std::vector of length 0, capacity 0, have_sorter=false, have_mdecider=false) at matcher/multimatch.cc:353
#8 0x00007f629826fcba in Xapian::Enquire::Internal::get_mset (this=0x7f628800e0b0, first=0, maxitems=0, check_at_least=0, rset=0x0, mdecider=0x0) at api/omenquire.cc:569
#9 0x00007f629827181c in Xapian::Enquire::get_mset (this=0x7f629726db80, first=0, maxitems=0, check_at_least=0, rset=0x0, mdecider=0x0) at api/omenquire.cc:937
#10 0x00007f6298be529a in _notmuch_query_search_documents (query=0x7f6288009750, type=0x7f6298bfaafe "mail", out=0x7f629726dcc0) at lib/query.cc:447
#11 0x00007f6298be4ae8 in notmuch_query_search_messages (query=0x7f6288009750, out=0x7f629726dcc0) at lib/query.cc:349
...
Printing Xapian::Query::MatchAll->internal.px->_refs in these
circumstances can help quickly identifying this scenario.
This is motivated by some test frameworks (like Rust's Cargo) that
runs unit tests in parallel and would easily encounter this issue,
unless client code gates every call to Notmuch behind a lock.
This is what can be expected from the tests when they fail:
== stderr ==
+==================
+WARNING: ThreadSanitizer: data race (pid=207931)
+ Read of size 1 at 0x7b10000001a0 by thread T2:
+ #0 memcpy <null> (libtsan.so.2+0x62506)
+ #1 void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .isra.0] <null> (libxapian.so.30+0x872b3)
+
+ Previous write of size 8 at 0x7b10000001a0 by thread T1:
+ #0 operator new(unsigned long) <null> (libtsan.so.2+0x8ba83)
+ #1 Xapian::Query::Query(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, unsigned int) <null> (libxapian.so.30+0x855cd)
...