It turns out there is not really much code in query-fp.cc useful for
supporting the new syntax. The code we could potentially factor out
amounts to calling notmuch_database_get_config; both the key
construction and the parsing of the results are specific to the query
syntax involved.
This provides functionality analogous to query: in the Xapian
QueryParser based parser. Perhaps counterintuitively, the saved
queries currently have to be in the original query syntax (i.e. not
s-expressions).
One subtle aspect is the replacement of _find_prefix with
_notmuch_database_prefix, which understands user headers. Otherwise
the code mainly consists of creating a fake prefix record (since the
user prefixes are not in the prefix table) and error handling.
This is necessary so that programs can take infix syntax queries from
a user and use the sexp query syntax to construct e.g. a refinement of
that query.
It will be convenient not to have to construct a notmuch query object
when parsing subqueries, so the commit rewrites the query
expansion (currently only used for thread:{} queries) using only
Xapian. As a bonus it seems about 15% faster in initial experiments.
When dealing with recursive queries (i.e. thread:{foo}) it turns out
to be useful just to deal with the underlying Xapian objects, and not
wrap them in notmuch objects.
The previous code had the somewhat bizarre effect that the (notmuch
specific) query string was "*" (interpreted as MatchAll) and the
allegedly parsed xapian_query was "MatchNothing".
This commit also reduces code duplication.
At least to the degree that the Xapian QueryParser based parser
also supports them. Support short alias 'rx' as it seems to make more
complex queries nicer to read.
This is equivalent to adding the same field name "" for multiple
prefixes in the Xapian query parser, but we have to explicitely
construct the resulting query.
The many tests potentially overkill, but they could catch typos in the
prefixes table. As a simplifying assumption, for now we assume a
single argument to the wildcard operator, as this matches the Xapian
semantics. The name 'starts-with' is chosen to emphasize the supported
case of wildcards in currrent (1.4.x) Xapian.
We use "boolean" to describe fields that should generate terms
literally without stemming or phrase splitting. This terminology
might not be ideal but it is already enshrined in
notmuch-search-terms(7).
Anything that is quoted or not purely word characters is considered a
phrase. Phrases are not stemmed, because the stems do not have
positional information in the database. It is less efficient to scan
the term twice, but it avoids a second pass to add prefixes, so maybe
it balances out. In any case, it seems unlikely query parsing is very
often a bottleneck.
All operations and (Xapian) fields will eventually have an entry in
the prefixes table. The flags field is just a placeholder for now, but
will eventually distinguish between various kinds of prefixes.
There is not much of a parser here yet, but it already does some
useful error reporting. Most functionality sketched in the
documentation is not implemented yet; detailed documentation will
follow with the implementation.
Set the parsing syntax when the (notmuch) query object is
created. Initially the library always returns a trivial query that
matches all messages when using s-expression syntax.
It seems better to select the syntax at query creation time because
the lazy parsing is an implementation detail.
The configure part is essentially the same as the other checks using
pkg-config. Since the optional inclusion of this feature changes what
options are available to the user, include it in the "built_with"
pseudo-configuration keys.
When using notmuch-reply and guessing the From: address from
Delivered-To headers, I had the wrong address chosen today. This was
because the messages from the notmuch list contain these headers in this
order:
Delivered-To: hannu.hartikainen@gmail.com
...
Delivered-To: hannu@hrtk.in
In my .notmuch-config I have the following configuration:
primary_email=hannu@hrtk.inother_email=hannu.hartikainen@gmail.com;...
Before this change, notmuch-reply would guess From: @gmail.com because
that is the first Delivered-To header present. After the change, the
primary address is chosen as I would expect.
Notmuch 0.32 corresponds to libnotmuch 5.4 as indicated by docstrings;
however, the minor number wasn't bumped. Any libnotmuch downstream
consumer using the LIBNOTMUCH_CHECK_VERSION macro to support multiple
versions won't be able to access the new 5.4 functions.
Signed-off-by: Austin Ray <austin@austinray.io>
Both notmuch_database_open() and notmuch_database_open_verbose()'s
documentation state they call notmuch_database_open_with_config() with
config_path=NULL; however, their implementations pass an empty string.
The empty string is the correct value to maintain their original
behavior of not loading the user's configuration so their documentation
is incorrect.
This change addresses two known issues with large sets of changes to
the database. The first is that as reported by Steven Allen [1],
notmuch commits are not "flushed" when they complete, which means that
if there is an open transaction when the database closes (or e.g. the
program crashes) then all changes since the last commit will be
discarded (nothing is irrecoverably lost for "notmuch new", as the
indexing process just restarts next time it is run). This does not
really "fix" the issue reported in [1]; that seems rather difficult
given how transactions work in Xapian. On the other hand, with the
default settings, this should mean one only loses less than a minutes
worth of work. The second issue is the occasionally reported "storm"
of disk writes when notmuch finishes. I don't yet have a test for
this, but I think committing as we go should reduce the amount of work
when finalizing the database.
[1]: id:20151025210215.GA3754@stebalien.com
Unfortunately, it doesn't make a difference if we call
cancel_transaction or not, all uncommited changes are discarded if
there is an open (unflushed) transaction.
Since most memory allocation is (ultimately) in the talloc context
defined by a notmuch_database_t pointer, this gives a more complete
view of memory still allocated at program shutdown.
We also change the talloc report in notmuch.c to mode "a" to avoid
clobbering the newly reported log.
This prevents the message document getting multiple thread-id terms
when there are multiple files with the same message-id.
This change shifts some thread ids, requiring adjustments to other tests.
Although this default worked for "notmuch config get", it didn't work
most other places. Restore the previous functionality, with the
wrinkle that XDG locations will shadow $HOME/mail if they exist.
This fixes a bug reported by Jack Kamm in id:87eeefdc8b.fsf@gmail.com
In principle this could be done without depending on C++11 features,
but these features should be available since gcc 4.8.1, and this
localized usage is easy to replace if it turns out to be problematic
for portability.
Prior to 0.32, notmuch had the (undocumented) behaviour that it
expanded a relative value of database.path with respect to $HOME. In
0.32 this was special cased for database.path but broken for
database.mail_root, which causes problems for at least notmuch-new
when database.path is set to a relative path.
The change in T030-config.sh reflects a user visible, but hopefully
harmless behaviour change; the expanded form of the paths will now be
printed by notmuch config.
When compat canonicalize_file_name was introduced, it was limited to
C code only because it was used by C code only during that time.
>From 5ec6fd4d, (lib/open: check for split configuration when creating
database., 2021-02-16), lib/open.cc, which is C++, relies on the
existent of canonicalize_file_name.
However, we can't blindly enable canonicalize_file_name for C++ code,
because different implementation has different additional signature for
C++ and users can arbitrarily add -DHAVE_CANONICALIZE_FILE_NAME=0 to
{C,CXX}FLAGS.
Let's move our implementation into a util library.
Helped-by: Tomi Ollila <tomi.ollila@iki.fi>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Ignoring this return value seems like a bad idea in general, and in
particular it has been hiding one or more bugs related to handling
long directory names.
This is intended to fix the slow behaviour of "notmuch new" (and possibly
"notmuch reindex") when large numbers of files are deleted.
The underlying issue [1] seems to be the Xapian glass backend spending
a large amount of time in db.has_positions when running queries with
large-ish amounts of unflushed changes.
This commit removes two uses of Xapian queries [2], and replaces them with
an approximation of what Xapian would do after optimizing the
queries. This avoids the calls to has_positions (which are in any case
un-needed because we are only using boolean terms here).
[1] Thanks to "andres" on IRC for narrowing down the performance
bottleneck.
[2] Thanks to Olly Betts of Xapian fame for talking me a through a fix
that does not require people to update Xapian.
Since the library searches in several locations for a config file, the
caller does not know which of these is chosen in the usual case of
passing NULL as a config file. This changes provides an API for the
caller to retrieve the name of the config file chosen. It will be
tested in a following commit.
Eventually we want to do all opening of databases in the top
level (main function). This means that detection of missing databases
needs to move out of subcommands. It also requires updating the
library to use the new NO_DATABASE status code.
This matches functionality in the the CLI function
notmuch_config_get_database_path, which was previously used in the CLI
code for all calls to open a database.
The layer of shims here seems a bit wasteful compared to just calling
the corresponding string map functions directly, but it allows control
over the API (calling with notmuch_database_t *) and flexibility for
future changes.
_trial_open can't know if the PATH_ERROR return value will cause the
error message to be returned from the library, so it's up the caller
to clean up if not.
Like the hook directory, we primarily need a way to communicate this
directory between various components, but we may as well let the user
configure it.
Most of the diff is generalizing choose_hook_dir to work for both
backup and hook directories.
Choose sibling directory of xapian database, as .notmuch may not
exist.
libgen.h is already used in debugger.c, so it is not a new dependency
/ potential portability problem.
This changes some error reporting, either intentionally by reporting
the highest level missing directory, or by side effect from looking in
XDG locations when given null database location.
The main functionality will be tested when notmuch-new is converted to
support split configuration. Here only the somewhat odd case of split
mail root which is actually symlinked to the database path is tested.
Introduce a new configuration value for the mail root, and use it to
locate mail messages in preference to the database.path (which
previously implied the mail messages were also in this location.
Initially only a subset of the CLI is tested in a split
configuration. Further changes will be needed for the remainder of the
CLI to work in split configurations.
The idea is to allow reuse in n_d_create_with_config. This is
primarily code movement, with some changes in error messages to reduce
the number of input parameters.
This is slightly more tidy, but more importantly it allows for re-use
of this code in n_d_create_with_config. That re-use will be crucial
when we no longer call n_d_open_with_config from
n_d_create_with_config.
This removes duplication between the struct element and the
configuration string_map entry. Create a simple wrapper for setting
the database path that makes sure the trailing / is stripped.
In the future Xapian will apparently support this more conveniently
for the cases other than READ_ONLY => READ_ONLY
Conceptually this function seems to fit better in lib/open.cc;
database.cc is still large enough that moving the function makes
sense.
This will allow re-opening in a different mode (read/write
vs. read-only) with current Xapian API. It will also prove useful when
updating the compact functions to support more flexible database
location.
Include the (currently unused) mode argument which will specify which
mode to re-open the database in. Functionality and docs to be
finalized in a followup commit.
Based on a patch from Michael J Gruber [1]. As of glib 2.67 (more
specifically [2]), including "gmime-extra.h" inside an extern "C"
block causes build failures, because glib is using C++ features.
Observing that "gmime-extra.h" is no longer needed in
notmuch-private.h, which can simply delete that include, but
we have to correspondingly move the includes which might include
it (in particular crypto.h) out of the extern "C" block also.
This seems less fragile than only moving gmime-extra, and relying on
preprocessor sentinels to keep the deeper includes from happening.
Move to the include to the outside of the extern block.
[1]: id:aee618a3d41f7889a7449aa16893e992325a909a.1613055071.git.git@grubix.eu
[2]: https://gitlab.gnome.org/GNOME/glib/-/merge_requests/1715
The hook directory configuration needs to be kept in synch with the
other configuration information, so add scaffolding to support this at
database opening time.