When dealing with recursive queries (i.e. thread:{foo}) it turns out
to be useful just to deal with the underlying Xapian objects, and not
wrap them in notmuch objects.
The previous code had the somewhat bizarre effect that the (notmuch
specific) query string was "*" (interpreted as MatchAll) and the
allegedly parsed xapian_query was "MatchNothing".
This commit also reduces code duplication.
At least to the degree that the Xapian QueryParser based parser
also supports them. Support short alias 'rx' as it seems to make more
complex queries nicer to read.
This is equivalent to adding the same field name "" for multiple
prefixes in the Xapian query parser, but we have to explicitely
construct the resulting query.
The many tests potentially overkill, but they could catch typos in the
prefixes table. As a simplifying assumption, for now we assume a
single argument to the wildcard operator, as this matches the Xapian
semantics. The name 'starts-with' is chosen to emphasize the supported
case of wildcards in currrent (1.4.x) Xapian.
We use "boolean" to describe fields that should generate terms
literally without stemming or phrase splitting. This terminology
might not be ideal but it is already enshrined in
notmuch-search-terms(7).
Anything that is quoted or not purely word characters is considered a
phrase. Phrases are not stemmed, because the stems do not have
positional information in the database. It is less efficient to scan
the term twice, but it avoids a second pass to add prefixes, so maybe
it balances out. In any case, it seems unlikely query parsing is very
often a bottleneck.
All operations and (Xapian) fields will eventually have an entry in
the prefixes table. The flags field is just a placeholder for now, but
will eventually distinguish between various kinds of prefixes.
There is not much of a parser here yet, but it already does some
useful error reporting. Most functionality sketched in the
documentation is not implemented yet; detailed documentation will
follow with the implementation.
Set the parsing syntax when the (notmuch) query object is
created. Initially the library always returns a trivial query that
matches all messages when using s-expression syntax.
It seems better to select the syntax at query creation time because
the lazy parsing is an implementation detail.
The configure part is essentially the same as the other checks using
pkg-config. Since the optional inclusion of this feature changes what
options are available to the user, include it in the "built_with"
pseudo-configuration keys.
When using notmuch-reply and guessing the From: address from
Delivered-To headers, I had the wrong address chosen today. This was
because the messages from the notmuch list contain these headers in this
order:
Delivered-To: hannu.hartikainen@gmail.com
...
Delivered-To: hannu@hrtk.in
In my .notmuch-config I have the following configuration:
primary_email=hannu@hrtk.inother_email=hannu.hartikainen@gmail.com;...
Before this change, notmuch-reply would guess From: @gmail.com because
that is the first Delivered-To header present. After the change, the
primary address is chosen as I would expect.
Notmuch 0.32 corresponds to libnotmuch 5.4 as indicated by docstrings;
however, the minor number wasn't bumped. Any libnotmuch downstream
consumer using the LIBNOTMUCH_CHECK_VERSION macro to support multiple
versions won't be able to access the new 5.4 functions.
Signed-off-by: Austin Ray <austin@austinray.io>
Both notmuch_database_open() and notmuch_database_open_verbose()'s
documentation state they call notmuch_database_open_with_config() with
config_path=NULL; however, their implementations pass an empty string.
The empty string is the correct value to maintain their original
behavior of not loading the user's configuration so their documentation
is incorrect.
This change addresses two known issues with large sets of changes to
the database. The first is that as reported by Steven Allen [1],
notmuch commits are not "flushed" when they complete, which means that
if there is an open transaction when the database closes (or e.g. the
program crashes) then all changes since the last commit will be
discarded (nothing is irrecoverably lost for "notmuch new", as the
indexing process just restarts next time it is run). This does not
really "fix" the issue reported in [1]; that seems rather difficult
given how transactions work in Xapian. On the other hand, with the
default settings, this should mean one only loses less than a minutes
worth of work. The second issue is the occasionally reported "storm"
of disk writes when notmuch finishes. I don't yet have a test for
this, but I think committing as we go should reduce the amount of work
when finalizing the database.
[1]: id:20151025210215.GA3754@stebalien.com
Unfortunately, it doesn't make a difference if we call
cancel_transaction or not, all uncommited changes are discarded if
there is an open (unflushed) transaction.
Since most memory allocation is (ultimately) in the talloc context
defined by a notmuch_database_t pointer, this gives a more complete
view of memory still allocated at program shutdown.
We also change the talloc report in notmuch.c to mode "a" to avoid
clobbering the newly reported log.
This prevents the message document getting multiple thread-id terms
when there are multiple files with the same message-id.
This change shifts some thread ids, requiring adjustments to other tests.
Although this default worked for "notmuch config get", it didn't work
most other places. Restore the previous functionality, with the
wrinkle that XDG locations will shadow $HOME/mail if they exist.
This fixes a bug reported by Jack Kamm in id:87eeefdc8b.fsf@gmail.com
In principle this could be done without depending on C++11 features,
but these features should be available since gcc 4.8.1, and this
localized usage is easy to replace if it turns out to be problematic
for portability.
Prior to 0.32, notmuch had the (undocumented) behaviour that it
expanded a relative value of database.path with respect to $HOME. In
0.32 this was special cased for database.path but broken for
database.mail_root, which causes problems for at least notmuch-new
when database.path is set to a relative path.
The change in T030-config.sh reflects a user visible, but hopefully
harmless behaviour change; the expanded form of the paths will now be
printed by notmuch config.
When compat canonicalize_file_name was introduced, it was limited to
C code only because it was used by C code only during that time.
>From 5ec6fd4d, (lib/open: check for split configuration when creating
database., 2021-02-16), lib/open.cc, which is C++, relies on the
existent of canonicalize_file_name.
However, we can't blindly enable canonicalize_file_name for C++ code,
because different implementation has different additional signature for
C++ and users can arbitrarily add -DHAVE_CANONICALIZE_FILE_NAME=0 to
{C,CXX}FLAGS.
Let's move our implementation into a util library.
Helped-by: Tomi Ollila <tomi.ollila@iki.fi>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Ignoring this return value seems like a bad idea in general, and in
particular it has been hiding one or more bugs related to handling
long directory names.
This is intended to fix the slow behaviour of "notmuch new" (and possibly
"notmuch reindex") when large numbers of files are deleted.
The underlying issue [1] seems to be the Xapian glass backend spending
a large amount of time in db.has_positions when running queries with
large-ish amounts of unflushed changes.
This commit removes two uses of Xapian queries [2], and replaces them with
an approximation of what Xapian would do after optimizing the
queries. This avoids the calls to has_positions (which are in any case
un-needed because we are only using boolean terms here).
[1] Thanks to "andres" on IRC for narrowing down the performance
bottleneck.
[2] Thanks to Olly Betts of Xapian fame for talking me a through a fix
that does not require people to update Xapian.
Since the library searches in several locations for a config file, the
caller does not know which of these is chosen in the usual case of
passing NULL as a config file. This changes provides an API for the
caller to retrieve the name of the config file chosen. It will be
tested in a following commit.