Commit graph

894 commits

Author SHA1 Message Date
David Bremner
c6733a45c8 lib: add config key INDEX_AS_TEXT
Higher level processing as a list of regular expressions and
documentation will follow.
2023-04-02 19:21:37 -03:00
Kevin Boulain
6273966d0b lib: replace some uses of Query::MatchAll with a thread-safe alternative
This replaces two instances of Xapian::Query::MatchAll with the
equivalent but thread-safe alternative Xapian::Query(std::string()).
Xapian::Query::MatchAll maintains an internal pointer to a refcounted
Xapian::Internal::QueryTerm.

None of this is thread-safe but that wouldn't be an issue if
Xapian::Query::MatchAll wasn't static. Because it's static, the
refcounting goes awry when Notmuch is called from multiple threads.
This is actually documented by Xapian:
4715de3a9f/xapian-core/include/xapian/query.h (L65)

While static, Xapian::Query::MatchNothing is safe because it doesn't
maintain an internal object and as such, doesn't use references.

Two best-effort tests making use of TSan were added to showcase the
issue (I couldn't figure out a way to deterministically reproduce it
without making an unmaintainable mess).

First, when two databases are created in parallel, a query that uses
Xapian::Query::MatchAll is made (lib/query.cc), resulting in the
following backtrace on a segfault:
  #0  0x00007ffff76822af in Xapian::Query::get_terms_begin (this=0x7fffe80137f0) at api/query.cc:141
  #1  0x00007ffff7f933f5 in _notmuch_query_cache_terms (query=0x7fffe80137c0) at lib/query.cc:176
  #2  0x00007ffff7f93784 in _notmuch_query_ensure_parsed_xapian (query=0x7fffe80137c0) at lib/query.cc:225
  #3  0x00007ffff7f9381a in _notmuch_query_ensure_parsed (query=0x7fffe80137c0) at lib/query.cc:260
  #4  0x00007ffff7f93bfe in _notmuch_query_search_documents (query=0x7fffe80137c0, type=0x7ffff7fa9b1e "mail", out=0x7ffff666da18) at lib/query.cc:361
  #5  0x00007ffff7f93ba4 in notmuch_query_search_messages (query=0x7fffe80137c0, out=0x7ffff666da18) at lib/query.cc:349
  #6  0x00007ffff7f83d98 in notmuch_database_upgrade (notmuch=0x7fffe8000bd0, progress_notify=0x0, closure=0x0) at lib/database.cc:934
  #7  0x00007ffff7fa110f in notmuch_database_create_with_config (database_path=0x7ffff666dcb0 "/tmp/notmuch.MZ2AGr", config_path=0x7ffff7faab3c "", profile=0x0, database=0x0, status_string=0x7ffff666dc90) at lib/open.cc:754
  #8  0x00007ffff7fa0d6f in notmuch_database_create_verbose (path=0x7ffff666dcb0 "/tmp/notmuch.MZ2AGr", database=0x0, status_string=0x7ffff666dc90) at lib/open.cc:653
  #9  0x00007ffff7fa0ceb in notmuch_database_create (path=0x7ffff666dcb0 "/tmp/notmuch.MZ2AGr", database=0x0) at lib/open.cc:637
  ...

Second, some queries would make use of Xapian::Query::MatchAll
(lib/regexp-fields.cc), resulting in the following backtrace on a
segfault:
  #0  0x00007f629828b690 in Xapian::Internal::QueryBranch::gather_terms (this=0x7f628800def0, void_terms=0x7f629726d5a0) at api/queryinternal.cc:1245
  #1  0x00007f629828c260 in Xapian::Internal::QueryScaleWeight::gather_terms (this=0x7f628800df70, void_terms=0x7f629726d5a0) at api/queryinternal.cc:1434
  #2  0x00007f629828b69f in Xapian::Internal::QueryBranch::gather_terms (this=0x7f628800dd90, void_terms=0x7f629726d5a0) at api/queryinternal.cc:1245
  #3  0x00007f6298282571 in Xapian::Query::get_unique_terms_begin (this=0x7f628800dcd8) at api/query.cc:166
  #4  0x00007f629841a59b in Xapian::Weight::Internal::accumulate_stats (this=0x7f628800dca0, subdb=..., rset=...) at weight/weightinternal.cc:86
  #5  0x00007f62983c15ba in LocalSubMatch::prepare_match (this=0x7f628800df20, nowait=true, total_stats=...) at matcher/localsubmatch.cc:172
  #6  0x00007f62983c8fcc in prepare_sub_matches (leaves=std::vector of length 1, capacity 1 = {...}, stats=...) at matcher/multimatch.cc:237
  #7  0x00007f62983c98a3 in MultiMatch::MultiMatch (this=0x7f629726d9a0, db_=..., query_=..., qlen=3, omrset=0x0, collapse_max_=0, collapse_key_=4294967295, percent_cutoff_=0, weight_cutoff_=0, order_=Xapian::Enquire::ASCENDING, sort_key_=0, sort_by_=Xapian::Enquire::Internal::VAL, sort_value_forward_=true, time_limit_=0, stats=..., weight_=0x7f6288008d50, matchspies_=std::vector of length 0, capacity 0, have_sorter=false, have_mdecider=false) at matcher/multimatch.cc:353
  #8  0x00007f629826fcba in Xapian::Enquire::Internal::get_mset (this=0x7f628800e0b0, first=0, maxitems=0, check_at_least=0, rset=0x0, mdecider=0x0) at api/omenquire.cc:569
  #9  0x00007f629827181c in Xapian::Enquire::get_mset (this=0x7f629726db80, first=0, maxitems=0, check_at_least=0, rset=0x0, mdecider=0x0) at api/omenquire.cc:937
  #10 0x00007f6298be529a in _notmuch_query_search_documents (query=0x7f6288009750, type=0x7f6298bfaafe "mail", out=0x7f629726dcc0) at lib/query.cc:447
  #11 0x00007f6298be4ae8 in notmuch_query_search_messages (query=0x7f6288009750, out=0x7f629726dcc0) at lib/query.cc:349
  ...

Printing Xapian::Query::MatchAll->internal.px->_refs in these
circumstances can help identify this scenario quickly.

This is motivated by some test frameworks (like Rust's Cargo) that
run unit tests in parallel and would easily encounter this issue,
unless client code gates every call to Notmuch behind a lock.

This is what can be expected from the tests when they fail:
   == stderr ==
  +==================
  +WARNING: ThreadSanitizer: data race (pid=207931)
  +  Read of size 1 at 0x7b10000001a0 by thread T2:
  +    #0 memcpy <null> (libtsan.so.2+0x62506)
  +    #1 void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .isra.0] <null> (libxapian.so.30+0x872b3)
  +
  +  Previous write of size 8 at 0x7b10000001a0 by thread T1:
  +    #0 operator new(unsigned long) <null> (libtsan.so.2+0x8ba83)
  +    #1 Xapian::Query::Query(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, unsigned int, unsigned int) <null> (libxapian.so.30+0x855cd)
  ...
2023-03-31 08:11:39 -03:00
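The hazard described in 6273966d0b can be modelled without Xapian. The sketch below is a simplified, hypothetical model: the names `Node`, `Handle`, `shared_match_all` and `fresh_match_all` are illustrative and are not Xapian's API. A handle copies a pointer to a refcounted node and bumps a plain, non-atomic counter. Copying a process-wide static handle updates one counter from every caller, whereas constructing a fresh handle per call keeps each counter private to its owner.

```cpp
#include <cassert>

// Hypothetical stand-in for a refcounted query node. The counter is a
// plain int, so concurrent updates from several threads would race.
struct Node {
    int refs = 0;
};

// Hypothetical handle with intrusive refcounting, loosely modelled on the
// behaviour the commit message describes (not Xapian's real code).
class Handle {
public:
    Handle () : node_ (new Node) { ++node_->refs; }
    Handle (const Handle &other) : node_ (other.node_) { ++node_->refs; }
    Handle &operator= (const Handle &) = delete;
    ~Handle ()
    {
        assert (node_->refs > 0);
        if (--node_->refs == 0)
            delete node_;
    }
    int refs () const { return node_->refs; }

private:
    Node *node_;
};

// One process-wide instance: every copy, from any thread, updates the
// same counter. This plays the role of a static MatchAll.
inline const Handle &shared_match_all ()
{
    static const Handle instance;
    return instance;
}

// A fresh instance per call: the counter is only touched by the caller
// that owns the handle, so nothing is shared across threads.
inline Handle fresh_match_all ()
{
    return Handle ();
}
```

In this model, copying the result of `shared_match_all()` changes a counter visible to all callers, while copies of a `fresh_match_all()` result leave the shared counter untouched.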
Kevin Boulain
fb55ff28a2 lib/message-property: sync removed properties to the database
_notmuch_message_remove_all_properties wasn't syncing the message back
to the database but was still invalidating the metadata, giving the
impression the properties had actually been removed.

Also move the metadata invalidation to _notmuch_message_remove_terms
to be closer to what's done in _notmuch_message_modify_property and
_notmuch_message_remove_term.
2023-03-30 08:01:09 -03:00
Kevin Boulain
568f6bc3c2 lib/message-property: catch xapian exceptions
Since libnotmuch exposes a C interface there's no way for clients to
catch this.
Inspired by what's done for tags (see notmuch_message_remove_tag).
2023-03-30 07:08:47 -03:00
Kevin Boulain
d86e03c786 lib/notmuch: update example
Likely missed in 86cbd215e, when notmuch_query_search_messages_st was
renamed to notmuch_query_search_messages.
2023-02-27 08:34:38 -04:00
David Bremner
09f2ad8e85 lib: add better diagnostics for over long filenames.
Previously we just crashed with an internal error. With this change,
the caller can handle it better. Update notmuch-new so that it doesn't
crash with "unknown error code" because of this change.
2023-02-20 09:22:32 -04:00
David Bremner
1d5d0ae686 lib/message: move xapian call inside try/catch block in _n_m_delete
The call to delete_document can throw exceptions (and this does happen
in practice [1]), so catch the exception and extract the error
message. As a side effect, also move the call to _n_m_has_term inside
the try/catch. This should not change anything as that function
already traps any Xapian exceptions.

[1]: id:wwuk039sk2p.fsf@chaotikum.eu
2022-12-27 11:59:46 -04:00
David Bremner
16d92abf9f lib/database: propagate status code from _notmuch_message_delete
_notmuch_message_delete can return (at least)
NOTMUCH_STATUS_XAPIAN_EXCEPTION, which we should not ignore.
2022-12-27 11:59:29 -04:00
David Bremner
2e5ef69fbf lib: add field processor for lastmod: prefix
By sharing the existing logic used by the sexp query parser, this
allows negative lastmod revisions to be interpreted as relative to the
most recent revision.
2022-09-03 08:43:33 -03:00
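The relative interpretation mentioned in 2e5ef69fbf can be sketched as follows. This is a minimal illustration, not the library's code: the helper name `resolve_lastmod` is hypothetical, and the exact arithmetic is an assumption (a value of -N is taken to mean "N revisions before the current one", clamped at 0).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: turn a lastmod endpoint string into an absolute
// revision number, given the most recent revision of the database.
// Assumption: negative values count back from `current`, clamped at 0.
inline unsigned long resolve_lastmod (const std::string &text, unsigned long current)
{
    assert (! text.empty ());
    long value = std::stol (text); // C++11; throws on non-numeric input
    if (value >= 0)
        return static_cast<unsigned long> (value);
    // Negate in unsigned arithmetic so that LONG_MIN is handled safely.
    unsigned long back = 0UL - static_cast<unsigned long> (value);
    return back >= current ? 0UL : current - back;
}
```

With a current revision of 100, the endpoint "-1" resolves to 99 in this sketch, while "5" is returned unchanged.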
David Bremner
93c602a82f lib: factor out lastmod range handling from sexp parser.
This will permit the re-use of the same logic in the infix query
parser. The location of the shared code in the infix side is for
consistency with the other shared parsing logic. It will make more
sense when a Xapian field processor is added for the lastmod prefix.
2022-09-03 08:36:53 -03:00
David Bremner
606d9b02e4 lib/sexp: provide relative lastmod queries
Test the relatively trivial logic changes for the sexp query parser
first before refactoring that logic to share with the infix query
parser.
2022-09-03 08:36:53 -03:00
David Bremner
84e4e130e2 lib/open: create database path in some cases
There is some duplication of code here, but not all of the locations
valid to find a database make sense to create. Furthermore we need two
passes, so the control flow in _choose_database_path would get a bit
convoluted.
2022-09-03 08:24:43 -03:00
David Bremner
8ba3057d01 lib/open: return non-SUCCESS on missing database path
This simplifies the logic of creating the directory path when it doesn't
exist.
2022-09-03 08:24:43 -03:00
David Bremner
25e2790e30 lib/open: refactor call to mkdir into function
This makes the error handling available for re-use. Using
g_mkdir_with_parents also handles the case of a pre-existing
directory. This introduces new functionality, namely creating the
parent directories, which will be useful for creating directories like
'.local/share/notmuch/default'.
2022-09-03 08:24:43 -03:00
David Bremner
6a9ae99099 lib/sexp: add parameter expansion for regex and wildcard
Fix the bug reported at [1].

The parameter expansion for regex and wildcard modifiers has to be
done a bit differently, because their arguments are not s-expressions
defining complete Xapian queries.

[1]: id:87o7yxqxy6.fsf@code.pm
2022-07-01 08:37:00 -03:00
David Bremner
e7ffb74041 lib/sexp: allow * as alias for "" in range searches.
It can be tedious to use "" inside of a string, e.g. in a shell script.
2022-06-25 19:49:55 -03:00
David Bremner
7863234586 lib/sexp: special case "" as an argument in lastmod ranges.
Support this syntax for consistency with (date from to) ranges.
2022-06-25 19:49:55 -03:00
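The two range commits above (e7ffb74041 and 7863234586) treat "" and * alike as an open endpoint. A rough sketch of that idea follows. The names `is_open_endpoint`, `RevisionRange` and `parse_range` are hypothetical, and mapping an open lower bound to 0 and an open upper bound to the largest representable value is an assumption made for illustration.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Hypothetical helper: "" and "*" both mean "no bound on this side".
inline bool is_open_endpoint (const std::string &s)
{
    return s.empty () || s == "*";
}

// Hypothetical result type for a parsed revision range.
struct RevisionRange {
    unsigned long from;
    unsigned long to;
};

// Assumption: an open lower bound becomes 0 and an open upper bound
// becomes the maximum unsigned long. std::stoul throws on bad input.
inline RevisionRange parse_range (const std::string &from, const std::string &to)
{
    RevisionRange range;
    range.from = is_open_endpoint (from) ? 0UL : std::stoul (from);
    range.to = is_open_endpoint (to)
               ? std::numeric_limits<unsigned long>::max ()
               : std::stoul (to);
    assert (range.from <= range.to || ! is_open_endpoint (to));
    return range;
}
```

Accepting * avoids having to quote an empty string inside a shell script, which is the motivation given in e7ffb74041.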
David Bremner
6f749dd24a lib: check for writable db in n_m_tags_maildir_flags
The database needs to be writable because the list of stored file
names will change in general.
2022-06-25 16:06:34 -03:00
David Bremner
3f27cce71f lib: add NOTMUCH_STATUS_CLOSED_DATABASE, use in _n_d_ensure_writable
In order for a database to actually be writeable, it must be the case that it
is open, not just the correct type of Xapian object. By explicitly
checking, we are able to provide better error reporting, in particular
for the previously broken test in T566-lib-message.
2022-06-25 16:06:18 -03:00
David Bremner
7e654e2a45 lib: Add missing private status values.
These were missed when the corresponding status codes were added.
2022-06-25 16:06:10 -03:00
David Bremner
e70df92085 lib/tag: handle NULL argument to notmuch_tags_valid
Make the behaviour when passed NULL consistent with
notmuch_filenames_valid. The library already passes the result of
notmuch_message_get_tags without checking for NULL, so it should be
handled.
2022-06-25 16:05:45 -03:00
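The NULL handling described in e70df92085 amounts to a guard at the top of the validity check. The sketch below uses a hypothetical `Tags` iterator struct and a hypothetical `tags_valid` function. The real notmuch_tags_t is opaque and its layout is not assumed here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical iterator over a fixed array of tag names.
struct Tags {
    const char *const *items;
    std::size_t count;
    std::size_t pos;
};

// A NULL iterator is simply "not valid", so callers can pass the result
// of a getter straight through without checking it first.
inline bool tags_valid (const Tags *tags)
{
    if (tags == nullptr)
        return false;
    return tags->pos < tags->count;
}
```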
David Bremner
879ec9d76a lib/message: check return status from _n_m_add_{path,folder}_terms
Mainly to propagate information about Xapian exceptions.
2022-06-25 12:55:02 -03:00
David Bremner
b102d0ad11 lib/message: check return status of _n_m_{add,remove}_term
Xapian exceptions are not something that can be ignored, in general.
2022-06-25 12:55:02 -03:00
David Bremner
7d5a9bd3ae lib: define macro NODISCARD
In either C++17 (or later) mode, or when running cppcheck, this can be
used to selectively generate warnings about discarded return values.
2022-06-25 12:55:02 -03:00
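A macro of the kind 7d5a9bd3ae describes could look roughly like the sketch below. The exact condition used in the library is not shown in the commit message, so the preprocessor test here is an assumption (it relies on cppcheck defining `__cppcheck__`). The `sketch_status` enum and `sketch_add_term` function are hypothetical and exist only to show the macro in use.

```cpp
#include <cassert>

// Expand to the C++17 attribute when it is available, or when the code
// is being analysed by cppcheck. Otherwise expand to nothing.
#if (defined(__cplusplus) && __cplusplus >= 201703L) || defined(__cppcheck__)
#define NODISCARD [[nodiscard]]
#else
#define NODISCARD
#endif

// Hypothetical status type and function, for illustration only.
enum sketch_status { SKETCH_SUCCESS = 0, SKETCH_FAILURE = 1 };

// Callers that ignore the return value get a warning in C++17 mode.
NODISCARD inline sketch_status sketch_add_term (bool ok)
{
    assert (SKETCH_SUCCESS == 0);
    return ok ? SKETCH_SUCCESS : SKETCH_FAILURE;
}
```

Because the macro expands to nothing in older language modes, annotated declarations still compile there, and the warnings appear only where the attribute is supported.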
David Bremner
bc80ff829a lib/message: drop _notmuch_message_get_thread_id_only
This function has been unused since commit 4083fd8.
2022-06-25 12:55:02 -03:00
David Bremner
f48d2e2ff8 lib/message: catch exceptions in _n_m_add_term
Some code movement is needed to make sure the cache is only
invalidated when the Xapian operation succeeds.
2022-06-25 12:55:02 -03:00
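The ordering concern in f48d2e2ff8 (invalidate the cache only when the backend operation succeeds) can be sketched with a stand-in backend. Everything here is hypothetical: `Message`, `backend_add_term`, `message_add_term` and the status names are illustrative, and `std::runtime_error` stands in for a Xapian exception.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical status codes returned across a C-style boundary.
enum add_status { ADD_SUCCESS, ADD_BACKEND_EXCEPTION };

// Hypothetical message object with a term set and a metadata cache flag.
struct Message {
    std::set<std::string> terms;
    bool cache_valid = true;
    bool backend_fails = false; // lets the sketch simulate a failure
};

// Stand-in for the backend call that may throw.
inline void backend_add_term (Message &m, const std::string &term)
{
    if (m.backend_fails)
        throw std::runtime_error ("simulated backend failure");
    m.terms.insert (term);
}

// Translate exceptions into a status code, since C callers cannot catch
// them. The cache is invalidated only after the backend call succeeds.
inline add_status message_add_term (Message &m, const std::string &term)
{
    assert (! term.empty ());
    try {
        backend_add_term (m, term);
    } catch (const std::exception &) {
        return ADD_BACKEND_EXCEPTION; // cache left untouched
    }
    m.cache_valid = false;
    return ADD_SUCCESS;
}
```

If the invalidation were placed before the backend call, a failed call would leave the cache marked stale even though nothing changed.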
David Bremner
4f8a2d2253 lib/message: use false from stdbool.h
As far as I know, this is just a style / consistency thing, unless
notmuch code starts defining FALSE inconsistently with false.
2022-05-26 08:30:00 -03:00
David Bremner
6810881705 lib: fix uninitialized field in message objects.
Initially reported by Eliza Vasquez [1] (via valgrind).

[1]: id:87o7zxj086.fsf@eliza.
2022-05-26 08:09:32 -03:00
Michael J Gruber
785f9d656d fix build without sfsexp
a1d139de ("lib: add sexp: prefix to Xapian (infix) query parser.",
2022-04-09) introduced sfsexp infix queries. This requires the infix
preprocessor to be built in a way which does not require sfsexp when
notmuch is built without it.

Make the preprocessor throw a Xapian error in this case (and fix the
build).

Signed-off-by: Michael J Gruber <git@grubix.eu>
2022-04-15 14:17:31 -03:00
David Bremner
a1d139de4d lib: add sexp: prefix to Xapian (infix) query parser.
This is analogous to the "infix" prefix provided by the s-expression
based query parser.
2022-04-15 08:25:46 -03:00
David Bremner
8ed6a172b3 lib: do not phrase parse prefixed bracketed subexpressions
Since Xapian does not preserve quotes when passing the subquery to a
field processor, we have to make a guess as to what the user
intended. Here the added assumption is that a string surrounded by
parens is not intended to be a phrase.
2022-03-19 07:27:29 -03:00
David Bremner
7f8af14bdc lib: bump minor version to 6.
One new status value and one configuration value added.
2022-01-29 18:13:26 -04:00
David Bremner
2c1d1107f5 lib: strip trailing '/' from pathnames (sexp queries).
This change makes the sexp query parser consistent with the infix one
in ignoring trailing '/'. Here we do a bit better and ignore any
number of trailing '/'.
2022-01-27 07:48:27 -04:00
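Ignoring any number of trailing '/' characters, as 2c1d1107f5 describes, can be sketched in a few lines. The helper name `strip_trailing_slashes` is hypothetical, and the behaviour for a path consisting only of slashes (it becomes empty here) is an assumption of this sketch rather than something the commit message states.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: drop every '/' at the end of a path term.
// Interior slashes are preserved.
inline std::string strip_trailing_slashes (std::string path)
{
    while (! path.empty () && path.back () == '/')
        path.pop_back ();
    assert (path.empty () || path.back () != '/');
    return path;
}
```

With this normalization, "lists/notmuch", "lists/notmuch/" and "lists/notmuch///" all map to the same term.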
David Bremner
c62c22c9fb lib: drop trailing slash for path and folder searches (infix)
This resolves an old bug reported by David Edmondson in 2014 [1]. The fix
is only needed for the "boolean" case, as probabilistic / phrase
searching already ignores punctuation.

This fix is only for the infix (xapian provided) query parser.

[1]: id:cunoasuolcv.fsf@gargravarr.hh.sledj.net
2022-01-27 07:48:27 -04:00
David Bremner
0a32741fce lib/parse-sexp: handle lastmod queries.
This particular choice of converting strings to integers requires C++11.
2022-01-26 07:41:02 -04:00
David Bremner
77ab961a1d lib/parse-sexp: support actual date queries.
The default argument processing overlaps somewhat with what is already
done in _notmuch_date_strings_to_query, but we can give more specific
error messages for the s-expression context.

The extra generality of _sexp_parse_range will be useful when we
implement additional range prefixes (at least 'lastmod' is needed).
2022-01-26 07:41:02 -04:00
David Bremner
bf3cc5eed2 lib/date: factor out date range parsing.
This will allow re-using the same logic in the s-expression parser.
2022-01-26 07:41:02 -04:00
David Bremner
303f207a54 lib/parse-sexp: support zero argument date queries
These are not too practical, although they may simplify some user
query generation code. Mainly this adds a new prefix keyword to the
parser.
2022-01-26 07:41:02 -04:00
David Bremner
2786aa4d54 lib/database: delete stemmer on destroy
Commit [0] left the stemmer object accessible, but did not add
de-allocation code to notmuch_database_destroy. This commit corrects
that oversight.

Leak originally reported by Austin Ray [1].

[0]: 3202e0d1fe
[1]: id:20220105224538.m36lnjn7rf3ieonc@athena
2022-01-22 21:14:29 -04:00
David Bremner
df7c5acd75 lib/config: move g_key_file_get_string before continue
In [1] Austin Ray reported some memory leaks in
notmuch_database_open. One of those leaks is caused by jumping to the
next key without freeing val. This change avoids that leak.

[1]: id:20220105224538.m36lnjn7rf3ieonc@athena
2022-01-22 21:14:29 -04:00
David Bremner
79936ac93e lib/config: add known config key "show.extra_headers"
Used in a following commit to enable including extra headers beyond
the default in structured output.
2022-01-18 08:09:14 -04:00
David Bremner
fad2e7540b lib/open: no default mail root in split configurations
If we know the configuration is split, but there is no mail root
defined, this indicates a (lack of) configuration error. Currently
this can only arise in XDG configurations.
2022-01-15 15:59:39 -04:00
David Bremner
64212c7b91 lib/config: make sure the config map exists when loading defaults
We should not rely on one of the other "_notmuch_config_load_*"
functions being called before this one.
2022-01-15 15:59:27 -04:00
David Bremner
63b4c46983 lib/open: use notmuch->params to track split status
Persisting this status will allow us to use the information in other
compilation units, in particular when setting configuration defaults.
2022-01-15 15:53:31 -04:00
David Bremner
fd0edeb561 lib/open: use db struct as talloc ctx for choose_database_path
The extra talloc struct "local" was left over from before the notmuch
struct was allocated earlier. Having the notmuch struct available in
this function will allow more flexibility to track the configuration
variations (e.g. split vs. non-split).
2022-01-15 15:51:33 -04:00
David Bremner
3eb25c94bd Merge branch 'release'
2021-12-29 14:20:49 -04:00
David Bremner
25e0f5e592 lib/open: do not consider .notmuch alone as an existing database.
It makes perfect sense for users to want to pre-create .notmuch,
e.g. to install hooks, so we should handle the case of a .notmuch
directory without an actual xapian database more gracefully.
2021-12-29 14:11:21 -04:00
David Bremner
18cdd21b8b lib/config: use g_key_file_get_string to read config values
Unlike the previous g_key_file_get_value, this version processes
escape codes for whitespace and \. The remaining two broken tests from
the last commit are because "notmuch config get" treats every value as
a list, and thus the previously introduced stripping of leading
whitespace applies.
2021-12-04 12:17:09 -04:00
David Bremner
1e7d33961e Merge branch 'release'
2021-12-04 09:27:30 -04:00
David Bremner
59aac9cef3 lib/config: don't overwrite database.path if the caller passed it
If the user passed a path, and we opened it, then we consider that the
definitive definition of "database.path". This makes libnotmuch
respond more gracefully to certain erroneous combinations of
NOTMUCH_CONFIG settings and config file contents.
2021-12-03 20:52:11 -04:00