If the user passed a path, and we opened it, then we consider that
definitive definition of "database.path". This makes libnotmuch
respond more gracefully to certain erroneous combinations of
NOTMUCH_CONFIG settings and config file contents.
It is confusing to use two different names (sexp vs sexpr) when
compared with the command line option --query=sexp and (furthermore)
singular vs plural when compared with the man page title.
This is a bit different than n_d_{open,create}_with_config, since
there are several non-zero status codes where we do want to return a
non-NULL database structure.
This code previously relied on _finish_open to free the notmuch struct
on errors (except for the case of database == NULL, which was a
potential double free). When we removed those frees from _finish_open,
we introduced a (small) memory leak.
In this commit, fix the memory leak, and harmonize the on-error
behaviour with n_d_open_with_config.
During refactoring for 0.32, the code that set notmuch=NULL on various
errors was moved into _finish_open. This meant that the the code which
relied on that to set *database to NULL on error was no longer
correct. It also introduced a potential double free, since the notmuch
struct was deallocated inside _finish_open (via n_d_destroy).
In this commit we revert to "allocator frees", and leave any cleanup
to the caller of _finish_open. This allows us to get back the
behaviour of setting *database to NULL with a small change. Other
callers of _finish_open will need free notmuch on errors.
There are some enum typedefs with the enum name:
typedef enum _name_t { ... } name_t;
We don't need or use the enum names _name_t for anything, and not all
of the enum typedefs have them. We have the typedefs specifically to
use the typedef name.
Use the anonymous enum in the typedefs:
typedef enum { ... } name_t;
Remove the comment markers from the placeholder NOTMUCH_DEPRECATED(),
added in commit e5f3c3ed50 ("lib: add stub for
notmuch_database_open_with_config").
As discussed in the thread starting at [1], the fully qualified domain
name is a bit tricky to get reproducibly, might reveal information
people prefer to keep private, and somewhat unlikely to provide
reliable mail routing.
The new approach of $current_username@localhost is better for the
first two considerations, and probably at least as good as a test mail
address.
[1]: id:87sfyibqhj.fsf@tethera.net
Macros implement lazy evaluation and lexical scope. The former is
needed to make certain natural constructs work sensibly (e.g. (tag
,param)) but the latter is mainly future-proofing in case the DSL is
is extended to allow local bindings.
For technical background, see chapters 6 and 17 of [1] (or some other
intermediate programming languages textbook).
[1] http://cs.brown.edu/courses/cs173/2012/book/
It turns out there is not really much code in query-fp.cc useful for
supporting the new syntax. The code we could potentially factor out
amounts to calling notmuch_database_get_config; both the key
construction and the parsing of the results are specific to the query
syntax involved.
This provides functionality analogous to query: in the Xapian
QueryParser based parser. Perhaps counterintuitively, the saved
queries currently have to be in the original query syntax (i.e. not
s-expressions).
One subtle aspect is the replacement of _find_prefix with
_notmuch_database_prefix, which understands user headers. Otherwise
the code mainly consists of creating a fake prefix record (since the
user prefixes are not in the prefix table) and error handling.
This is necessary so that programs can take infix syntax queries from
a user and use the sexp query syntax to construct e.g. a refinement of
that query.
It will be convenient not to have to construct a notmuch query object
when parsing subqueries, so the commit rewrites the query
expansion (currently only used for thread:{} queries) using only
Xapian. As a bonus it seems about 15% faster in initial experiments.
When dealing with recursive queries (i.e. thread:{foo}) it turns out
to be useful just to deal with the underlying Xapian objects, and not
wrap them in notmuch objects.
The previous code had the somewhat bizarre effect that the (notmuch
specific) query string was "*" (interpreted as MatchAll) and the
allegedly parsed xapian_query was "MatchNothing".
This commit also reduces code duplication.
At least to the degree that the Xapian QueryParser based parser
also supports them. Support short alias 'rx' as it seems to make more
complex queries nicer to read.
This is equivalent to adding the same field name "" for multiple
prefixes in the Xapian query parser, but we have to explicitely
construct the resulting query.
The many tests potentially overkill, but they could catch typos in the
prefixes table. As a simplifying assumption, for now we assume a
single argument to the wildcard operator, as this matches the Xapian
semantics. The name 'starts-with' is chosen to emphasize the supported
case of wildcards in currrent (1.4.x) Xapian.
We use "boolean" to describe fields that should generate terms
literally without stemming or phrase splitting. This terminology
might not be ideal but it is already enshrined in
notmuch-search-terms(7).
Anything that is quoted or not purely word characters is considered a
phrase. Phrases are not stemmed, because the stems do not have
positional information in the database. It is less efficient to scan
the term twice, but it avoids a second pass to add prefixes, so maybe
it balances out. In any case, it seems unlikely query parsing is very
often a bottleneck.
All operations and (Xapian) fields will eventually have an entry in
the prefixes table. The flags field is just a placeholder for now, but
will eventually distinguish between various kinds of prefixes.
There is not much of a parser here yet, but it already does some
useful error reporting. Most functionality sketched in the
documentation is not implemented yet; detailed documentation will
follow with the implementation.
Set the parsing syntax when the (notmuch) query object is
created. Initially the library always returns a trivial query that
matches all messages when using s-expression syntax.
It seems better to select the syntax at query creation time because
the lazy parsing is an implementation detail.
The configure part is essentially the same as the other checks using
pkg-config. Since the optional inclusion of this feature changes what
options are available to the user, include it in the "built_with"
pseudo-configuration keys.
When using notmuch-reply and guessing the From: address from
Delivered-To headers, I had the wrong address chosen today. This was
because the messages from the notmuch list contain these headers in this
order:
Delivered-To: hannu.hartikainen@gmail.com
...
Delivered-To: hannu@hrtk.in
In my .notmuch-config I have the following configuration:
primary_email=hannu@hrtk.inother_email=hannu.hartikainen@gmail.com;...
Before this change, notmuch-reply would guess From: @gmail.com because
that is the first Delivered-To header present. After the change, the
primary address is chosen as I would expect.
Notmuch 0.32 corresponds to libnotmuch 5.4 as indicated by docstrings;
however, the minor number wasn't bumped. Any libnotmuch downstream
consumer using the LIBNOTMUCH_CHECK_VERSION macro to support multiple
versions won't be able to access the new 5.4 functions.
Signed-off-by: Austin Ray <austin@austinray.io>
Both notmuch_database_open() and notmuch_database_open_verbose()'s
documentation state they call notmuch_database_open_with_config() with
config_path=NULL; however, their implementations pass an empty string.
The empty string is the correct value to maintain their original
behavior of not loading the user's configuration so their documentation
is incorrect.
This change addresses two known issues with large sets of changes to
the database. The first is that as reported by Steven Allen [1],
notmuch commits are not "flushed" when they complete, which means that
if there is an open transaction when the database closes (or e.g. the
program crashes) then all changes since the last commit will be
discarded (nothing is irrecoverably lost for "notmuch new", as the
indexing process just restarts next time it is run). This does not
really "fix" the issue reported in [1]; that seems rather difficult
given how transactions work in Xapian. On the other hand, with the
default settings, this should mean one only loses less than a minutes
worth of work. The second issue is the occasionally reported "storm"
of disk writes when notmuch finishes. I don't yet have a test for
this, but I think committing as we go should reduce the amount of work
when finalizing the database.
[1]: id:20151025210215.GA3754@stebalien.com
Unfortunately, it doesn't make a difference if we call
cancel_transaction or not, all uncommited changes are discarded if
there is an open (unflushed) transaction.
Since most memory allocation is (ultimately) in the talloc context
defined by a notmuch_database_t pointer, this gives a more complete
view of memory still allocated at program shutdown.
We also change the talloc report in notmuch.c to mode "a" to avoid
clobbering the newly reported log.
This prevents the message document getting multiple thread-id terms
when there are multiple files with the same message-id.
This change shifts some thread ids, requiring adjustments to other tests.
Although this default worked for "notmuch config get", it didn't work
most other places. Restore the previous functionality, with the
wrinkle that XDG locations will shadow $HOME/mail if they exist.
This fixes a bug reported by Jack Kamm in id:87eeefdc8b.fsf@gmail.com
In principle this could be done without depending on C++11 features,
but these features should be available since gcc 4.8.1, and this
localized usage is easy to replace if it turns out to be problematic
for portability.
Prior to 0.32, notmuch had the (undocumented) behaviour that it
expanded a relative value of database.path with respect to $HOME. In
0.32 this was special cased for database.path but broken for
database.mail_root, which causes problems for at least notmuch-new
when database.path is set to a relative path.
The change in T030-config.sh reflects a user visible, but hopefully
harmless behaviour change; the expanded form of the paths will now be
printed by notmuch config.
When compat canonicalize_file_name was introduced, it was limited to
C code only because it was used by C code only during that time.
>From 5ec6fd4d, (lib/open: check for split configuration when creating
database., 2021-02-16), lib/open.cc, which is C++, relies on the
existent of canonicalize_file_name.
However, we can't blindly enable canonicalize_file_name for C++ code,
because different implementation has different additional signature for
C++ and users can arbitrarily add -DHAVE_CANONICALIZE_FILE_NAME=0 to
{C,CXX}FLAGS.
Let's move our implementation into a util library.
Helped-by: Tomi Ollila <tomi.ollila@iki.fi>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Ignoring this return value seems like a bad idea in general, and in
particular it has been hiding one or more bugs related to handling
long directory names.
This is intended to fix the slow behaviour of "notmuch new" (and possibly
"notmuch reindex") when large numbers of files are deleted.
The underlying issue [1] seems to be the Xapian glass backend spending
a large amount of time in db.has_positions when running queries with
large-ish amounts of unflushed changes.
This commit removes two uses of Xapian queries [2], and replaces them with
an approximation of what Xapian would do after optimizing the
queries. This avoids the calls to has_positions (which are in any case
un-needed because we are only using boolean terms here).
[1] Thanks to "andres" on IRC for narrowing down the performance
bottleneck.
[2] Thanks to Olly Betts of Xapian fame for talking me a through a fix
that does not require people to update Xapian.
Since the library searches in several locations for a config file, the
caller does not know which of these is chosen in the usual case of
passing NULL as a config file. This changes provides an API for the
caller to retrieve the name of the config file chosen. It will be
tested in a following commit.
Eventually we want to do all opening of databases in the top
level (main function). This means that detection of missing databases
needs to move out of subcommands. It also requires updating the
library to use the new NO_DATABASE status code.
This matches functionality in the the CLI function
notmuch_config_get_database_path, which was previously used in the CLI
code for all calls to open a database.
The layer of shims here seems a bit wasteful compared to just calling
the corresponding string map functions directly, but it allows control
over the API (calling with notmuch_database_t *) and flexibility for
future changes.
_trial_open can't know if the PATH_ERROR return value will cause the
error message to be returned from the library, so it's up the caller
to clean up if not.
Like the hook directory, we primarily need a way to communicate this
directory between various components, but we may as well let the user
configure it.
Most of the diff is generalizing choose_hook_dir to work for both
backup and hook directories.
Choose sibling directory of xapian database, as .notmuch may not
exist.
libgen.h is already used in debugger.c, so it is not a new dependency
/ potential portability problem.
This changes some error reporting, either intentionally by reporting
the highest level missing directory, or by side effect from looking in
XDG locations when given null database location.
The main functionality will be tested when notmuch-new is converted to
support split configuration. Here only the somewhat odd case of split
mail root which is actually symlinked to the database path is tested.
Introduce a new configuration value for the mail root, and use it to
locate mail messages in preference to the database.path (which
previously implied the mail messages were also in this location.
Initially only a subset of the CLI is tested in a split
configuration. Further changes will be needed for the remainder of the
CLI to work in split configurations.
The idea is to allow reuse in n_d_create_with_config. This is
primarily code movement, with some changes in error messages to reduce
the number of input parameters.