866 commits

Author SHA1 Message Date
David Bremner
1870b3ae4b lib/parse-sexp: support regular expressions
At least to the degree that the Xapian QueryParser based parser
also supports them. Support short alias 'rx' as it seems to make more
complex queries nicer to read.
2021-09-04 17:07:19 -07:00
David Bremner
5cb452c325 lib: factor out query construction from regexp
This will allow re-use of this code outside of the Xapian query parser.
2021-09-04 17:07:19 -07:00
David Bremner
0b98ad5e4e lib/query: generalize exclude handling to s-expression queries
In fact most of the code path is in common, only the caching of terms
in the query needs to be added for s-expression queries.
2021-09-04 17:07:19 -07:00
David Bremner
bafc307190 lib/parse-sexp: handle unprefixed terms.
This is equivalent to adding the same field name "" for multiple
prefixes in the Xapian query parser, but we have to explicitly
construct the resulting query.
2021-09-04 17:07:19 -07:00
David Bremner
0ca4ad2670 lib/parse-sexp: add '*' as syntactic sugar for '(starts-with "")'
Users that insist on using a literal '*' as a tag, can continue to do
so by quoting it when searching.
2021-09-04 17:07:19 -07:00
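
The '*' shorthand described in 0ca4ad2670 can be pictured as a small rewrite over the parsed tree. The following is a minimal sketch, not the library's code: the `Atom` type, its `quoted` flag, and the `desugar` function are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple, Union


class Atom(NamedTuple):
    """Hypothetical parsed atom; 'quoted' records whether it was written in double quotes."""
    text: str
    quoted: bool


Node = Union[Atom, list]


def desugar(node: Node) -> Node:
    """Rewrite an unquoted '*' into the list form (starts-with "")."""
    if isinstance(node, list):
        return [desugar(child) for child in node]
    if node.text == "*" and not node.quoted:
        return [Atom("starts-with", False), Atom("", True)]
    # A quoted "*" stays a literal term, so it can still be used as a tag.
    return node


wildcard_query = [Atom("tag", False), Atom("*", False)]   # (tag *)
literal_query = [Atom("tag", False), Atom("*", True)]     # (tag "*")
```

Under these assumptions, `desugar(wildcard_query)` yields the tree for `(tag (starts-with ""))`, while `desugar(literal_query)` is returned unchanged.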
David Bremner
011d06f4d6 lib/parse-sexp: 'starts-with' wildcard searches
The many tests are potentially overkill, but they could catch typos in the
prefixes table. As a simplifying assumption, for now we assume a
single argument to the wildcard operator, as this matches the Xapian
semantics. The name 'starts-with' is chosen to emphasize the supported
case of wildcards in current (1.4.x) Xapian.
2021-09-04 17:07:19 -07:00
David Bremner
8322f536f5 lib/parse-sexp: add term prefix backed fields
We use "boolean" to describe fields that should generate terms
literally without stemming or phrase splitting.  This terminology
might not be ideal but it is already enshrined in
notmuch-search-terms(7).
2021-09-04 17:07:19 -07:00
David Bremner
90d9c2ad5c lib/parse-sexp: support phrase queries.
Anything that is quoted or not purely word characters is considered a
phrase.  Phrases are not stemmed, because the stems do not have
positional information in the database. It is less efficient to scan
the term twice, but it avoids a second pass to add prefixes, so maybe
it balances out. In any case, it seems unlikely query parsing is very
often a bottleneck.
2021-09-04 17:07:19 -07:00
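
The phrase rule in 90d9c2ad5c ("quoted or not purely word characters") can be sketched as a one-line classifier. This is an approximation under stated assumptions: `is_phrase` is a hypothetical helper, and Python's `\w` only stands in for whatever the library treats as a word character.

```python
import re

# Assumption: Python's \w approximates the library's notion of "word characters".
WORD_ONLY = re.compile(r"\w+")


def is_phrase(text: str, quoted: bool) -> bool:
    """Return True when a term should be treated as a phrase (and so not stemmed)."""
    return quoted or WORD_ONLY.fullmatch(text) is None
```

With this sketch, an unquoted `hello` is a plain (stemmable) word, while `hello world`, `foo-bar`, and a quoted `"hello"` are all phrases.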
David Bremner
200e164dc7 lib/parse-sexp: support subject field
The broken tests are because we do not yet handle phrase searches.
2021-09-04 17:07:19 -07:00
David Bremner
f83cd2a05a lib/parse-sexp: support and, not, and or.
All operations and (Xapian) fields will eventually have an entry in
the prefixes table. The flags field is just a placeholder for now, but
will eventually distinguish between various kinds of prefixes.
2021-09-04 17:07:19 -07:00
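
The table-driven dispatch mentioned in f83cd2a05a might look roughly like the toy evaluator below. Everything here is assumed for illustration: the message ids, the `INDEX` contents, the set-based semantics of each operator, and the treatment of the empty list as "match everything" are not taken from the commit.

```python
from functools import reduce

ALL = frozenset({1, 2, 3, 4})  # hypothetical universe of message ids
INDEX = {"inbox": frozenset({1, 2}), "unread": frozenset({2, 3})}  # toy term index


def op_and(args):
    """Intersect all operands; with no operands, match everything."""
    return reduce(frozenset.intersection, args, ALL)


def op_or(args):
    """Union of all operands; with no operands, match nothing."""
    return reduce(frozenset.union, args, frozenset())


def op_not(args):
    """Assumed semantics: everything that matches none of the operands."""
    return ALL - op_or(args)


# One table entry per operation, looked up by the head of each list.
PREFIXES = {"and": op_and, "or": op_or, "not": op_not}


def evaluate(node):
    """Evaluate a nested-list query against the toy index."""
    if isinstance(node, str):
        return INDEX.get(node, frozenset())
    if not node:
        return ALL  # assumption: the empty list matches all messages
    head, *rest = node
    return PREFIXES[head]([evaluate(child) for child in rest])
```

For example, `evaluate(["and", "inbox", ["not", "unread"]])` selects the messages in the toy index that are in the inbox and not unread.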
David Bremner
a2785c3919 lib/parse-sexp: stem unquoted atoms
This is somewhat less DWIM than the Xapian query parser, but it has
the advantage of simplicity.
2021-09-04 17:07:19 -07:00
David Bremner
3202e0d1fe lib: leave stemmer object accessible
This enables using the same stemmer in both query parsers.
2021-09-04 17:07:19 -07:00
David Bremner
be7e83de96 lib/parse-sexp: parse single terms and the empty list.
There is not much of a parser here yet, but it already does some
useful error reporting. Most functionality sketched in the
documentation is not implemented yet; detailed documentation will
follow with the implementation.
2021-09-04 17:07:19 -07:00
David Bremner
9ae4188610 lib: add new status code for query syntax errors.
This will help provide more meaningful error messages without special
casing on the client side.
2021-09-04 17:07:19 -07:00
David Bremner
c4f2f33a50 lib: define notmuch_query_create_with_syntax
Set the parsing syntax when the (notmuch) query object is
created. Initially the library always returns a trivial query that
matches all messages when using s-expression syntax.

It seems better to select the syntax at query creation time because
the lazy parsing is an implementation detail.
2021-09-04 17:07:19 -07:00
David Bremner
34733fa25e lib: split notmuch_query_create
Most of the function will be re-usable when creating a query from an
s-expression.
2021-09-04 17:07:19 -07:00
David Bremner
a83ad52da4 configure: optional library sfsexp
The configure part is essentially the same as the other checks using
pkg-config. Since the optional inclusion of this feature changes what
options are available to the user, include it in the "built_with"
pseudo-configuration keys.
2021-09-04 17:07:19 -07:00
Hannu Hartikainen
717e3dcdc3 lib: consider all instances of Delivered-To header
When using notmuch-reply and guessing the From: address from
Delivered-To headers, I had the wrong address chosen today. This was
because the messages from the notmuch list contain these headers in this
order:

Delivered-To: hannu.hartikainen@gmail.com
...
Delivered-To: hannu@hrtk.in

In my .notmuch-config I have the following configuration:

primary_email=hannu@hrtk.in
other_email=hannu.hartikainen@gmail.com;...

Before this change, notmuch-reply would guess From: @gmail.com because
that is the first Delivered-To header present. After the change, the
primary address is chosen as I would expect.
2021-08-29 18:10:08 -07:00
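
The behaviour change in 717e3dcdc3 can be illustrated with a small sketch. `guess_from` is a hypothetical function, not notmuch-reply's code; it only assumes that every Delivered-To value is available and that the primary address is tried before the others.

```python
def guess_from(delivered_to, primary, others):
    """Pick a From: address from all Delivered-To values, preferring the primary one."""
    seen = {addr.strip().lower() for addr in delivered_to}
    for candidate in [primary, *others]:
        if candidate.lower() in seen:
            return candidate
    return None  # no configured address appears in the headers


headers = ["hannu.hartikainen@gmail.com", "hannu@hrtk.in"]
primary = "hannu@hrtk.in"
others = ["hannu.hartikainen@gmail.com"]
```

Looking only at the first header (`headers[:1]`) reproduces the old guess of the gmail.com address; passing all headers selects the primary address.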
David Bremner
3df2281746 notmuch release 0.32.3-1 for unstable (sid) [dgit]
[dgit distro=debian no-split --quilt=linear]

Merge tag 'debian/0.32.3-1'

notmuch release 0.32.3-1 for unstable (sid) [dgit]

[dgit distro=debian no-split --quilt=linear]
2021-08-18 21:46:42 -07:00
David Bremner
d930011690 lib/open: look in MAILDIR for database, as documented.
This fixes the bug id:87bl9lx864.fsf@kisara.moe
2021-08-17 17:09:21 -07:00
Austin Ray
f1a310b3a9 lib: bump libnotmuch minor version
Notmuch 0.32 corresponds to libnotmuch 5.4 as indicated by docstrings;
however, the minor number wasn't bumped. Any libnotmuch downstream
consumer using the LIBNOTMUCH_CHECK_VERSION macro to support multiple
versions won't be able to access the new 5.4 functions.

Signed-off-by: Austin Ray <austin@austinray.io>
2021-08-17 16:30:22 -07:00
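
Why the missing bump in f1a310b3a9 matters can be seen in a sketch of a version gate. `check_version` is a hypothetical stand-in that models a "current version is at least the required version" comparison; it is not the LIBNOTMUCH_CHECK_VERSION macro itself, and the micro numbers below are assumed to be 0.

```python
def check_version(current, required):
    """Return True when 'current' (major, minor, micro) is at least 'required'."""
    return tuple(current) >= tuple(required)


# A consumer that only uses the new 5.4 functions when the gate passes.
needs = (5, 4, 0)
before_bump = (5, 3, 0)  # assumed: what the header reported before the fix
after_bump = (5, 4, 0)
```

With the stale minor number the gate fails, so a consumer compiled against the newer library would still take its fallback path.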
Austin Ray
414ba75c81 lib: correct deprecated db open functions' docs
Both notmuch_database_open() and notmuch_database_open_verbose()'s
documentation state they call notmuch_database_open_with_config() with
config_path=NULL; however, their implementations pass an empty string.
The empty string is the correct value to maintain their original
behavior of not loading the user's configuration so their documentation
is incorrect.
2021-08-17 16:30:05 -07:00
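
The distinction that 414ba75c81 documents, between a NULL config path and an empty one, can be summarised in a sketch. The enum and `config_source` are hypothetical; only the three cases (search for a config file, load none, use the given path) come from the commit messages in this log.

```python
from enum import Enum


class ConfigSource(Enum):
    SEARCH = "search the usual locations"
    NONE = "do not load any configuration file"
    EXPLICIT = "load exactly the given file"


def config_source(config_path):
    """Classify a config_path argument; None plays the role of C's NULL."""
    if config_path is None:
        return ConfigSource.SEARCH
    if config_path == "":
        return ConfigSource.NONE
    return ConfigSource.EXPLICIT
```

In these terms, the deprecated open functions behave like `config_source("")`, not `config_source(None)`.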
David Bremner
6e7365fb20 lib: update transaction documentation
Partly this is to recognize the semantics we inherit from Xapian,
partly to mention the new autocommit feature.
2021-06-27 14:06:30 -03:00
David Bremner
e2a3e5fa51 lib: autocommit after some number of completed transactions
This change addresses two known issues with large sets of changes to
the database.  The first is that as reported by Steven Allen [1],
notmuch commits are not "flushed" when they complete, which means that
if there is an open transaction when the database closes (or e.g. the
program crashes) then all changes since the last commit will be
discarded (nothing is irrecoverably lost for "notmuch new", as the
indexing process just restarts next time it is run).  This does not
really "fix" the issue reported in [1]; that seems rather difficult
given how transactions work in Xapian. On the other hand, with the
default settings, this should mean one only loses less than a minute's
worth of work.  The second issue is the occasionally reported "storm"
of disk writes when notmuch finishes. I don't yet have a test for
this, but I think committing as we go should reduce the amount of work
when finalizing the database.

[1]: id:20151025210215.GA3754@stebalien.com
2021-06-27 14:03:00 -03:00
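
The autocommit idea in e2a3e5fa51 amounts to counting completed transactions and committing when a threshold is reached. The sketch below assumes a hypothetical `TransactionTracker` class and a caller-supplied `commit` callback; the threshold value is arbitrary and is not the library's default.

```python
class TransactionTracker:
    """Commit after every 'every' completed transactions (0 disables autocommit)."""

    def __init__(self, commit, every):
        self._commit = commit
        self._every = every
        self._completed = 0

    def end_atomic(self):
        """Record one completed transaction and commit if the threshold is reached."""
        self._completed += 1
        if self._every > 0 and self._completed >= self._every:
            self._commit()
            self._completed = 0


commits = []
tracker = TransactionTracker(lambda: commits.append("commit"), every=3)
for _ in range(7):
    tracker.end_atomic()
```

After seven completed transactions with a threshold of three, two commits have happened and only the seventh transaction would be lost on a crash.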
David Bremner
2f608d2a94 lib/config: add NOTMUCH_CONFIG_AUTOCOMMIT
This will be used to control how often atomic transactions are
committed.
2021-06-27 13:59:42 -03:00
David Bremner
65f923219e database/close: remove misleading code / comment
Unfortunately, it doesn't make a difference if we call
cancel_transaction or not, all uncommitted changes are discarded if
there is an open (unflushed) transaction.
2021-06-27 13:58:17 -03:00
David Bremner
49893c2c61 lib/database: fix style mistake.
The spacing of the declaration was wrong in ea30110.
2021-06-27 13:52:43 -03:00
David Bremner
4b0c6fb2f1 Merge branch 'release'
2021-06-25 09:34:29 -03:00
David Bremner
ea301102ab lib: write talloc report in notmuch_database_destroy
Since most memory allocation is (ultimately) in the talloc context
defined by a notmuch_database_t pointer, this gives a more complete
view of memory still allocated at program shutdown.

We also change the talloc report in notmuch.c to mode "a" to avoid
clobbering the newly reported log.
2021-06-25 09:20:37 -03:00
David Bremner
35d559eb18 lib/config: fix memory leak
This commit fixes a small memory leak (per iterator restart) by
actually using the talloc context intended to be blown away on
restart.
2021-06-25 09:13:04 -03:00
David Bremner
651a1b085b lib/message: use passed database for error handling
'message' should always be initialized if we reach here, but in case it
is not, we still want to be able to log an error message.
2021-06-05 15:41:28 -03:00
David Bremner
b0a11dbc38 lib/{open,message}: make some internal functions static
They are not used outside their file, so being extern seems like an oversight
2021-06-05 15:40:00 -03:00
David Bremner
748352693c lib/thread: add common prefix to debug messages, join lines
This will simplify filtering these messages, e.g. in the test suite.
2021-05-23 08:01:38 -03:00
David Bremner
702635d5f6 Merge branch 'release'
2021-05-22 09:34:55 -03:00
David Bremner
3f4de98e7c lib/n_d_index_file: re-use thread-id of existing message
This prevents the message document getting multiple thread-id terms
when there are multiple files with the same message-id.

This change shifts some thread ids, requiring adjustments to other tests.
2021-05-22 09:08:02 -03:00
David Bremner
c84ccb70f3 Merge branch 'release'
2021-05-15 09:10:58 -03:00
David Bremner
b3258244c8 lib/open: restore default database path of $HOME/mail
Although this default worked for "notmuch config get", it didn't work
most other places. Restore the previous functionality, with the
wrinkle that XDG locations will shadow $HOME/mail if they exist.

This fixes a bug reported by Jack Kamm in id:87eeefdc8b.fsf@gmail.com
2021-05-15 08:40:05 -03:00
David Bremner
b1b6798588 lib/message: mark flag2tag as const
This table is intended to be immutable
2021-05-14 06:39:12 -03:00
David Bremner
929386fad9 lib/generate_thread_id: move static buffer to notmuch_database_t
Work towards the goal of concurrent access to different Xapian
databases from the same process.
2021-05-14 06:38:19 -03:00
David Bremner
8410be8e08 lib: make glib initialization thread-safe
In principle this could be done without depending on C++11 features,
but these features should be available since gcc 4.8.1, and this
localized usage is easy to replace if it turns out to be problematic
for portability.
2021-05-13 22:21:57 -03:00
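
The commit 8410be8e08 does not say which C++11 facility it relies on. As a language-neutral illustration of the general idea only, the sketch below runs an initializer exactly once under a lock; all names are hypothetical and nothing here mirrors the library's code.

```python
import threading

_init_lock = threading.Lock()
_initialized = False
init_calls = []  # records how many times the one-time setup actually ran


def ensure_initialized():
    """Run the one-time setup exactly once, even when called from many threads."""
    global _initialized
    with _init_lock:
        if not _initialized:
            init_calls.append("init")  # stands in for the real one-time setup
            _initialized = True


threads = [threading.Thread(target=ensure_initialized) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

However many threads race into `ensure_initialized`, the setup body runs a single time.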
David Bremner
393c92b042 lib/notmuch_database_reopen: reload some database metadata
In some uses of reopen, new documents and threads may have been
added, and e.g. compaction may have changed the uuid.
2021-05-12 08:40:04 -03:00
David Bremner
1040e7aa07 lib/config: expand relative paths when reading from database
This makes the treatment of relative paths consistent between the
database and config files.
2021-05-10 11:12:58 -03:00
David Bremner
31098c4ae4 lib/config: canonicalize paths relative to $HOME.
Prior to 0.32, notmuch had the (undocumented) behaviour that it
expanded a relative value of database.path with respect to $HOME. In
0.32 this was special cased for database.path but broken for
database.mail_root, which causes problems for at least notmuch-new
when database.path is set to a relative path.

The change in T030-config.sh reflects a user visible, but hopefully
harmless behaviour change; the expanded form of the paths will now be
printed by notmuch config.
2021-05-10 11:08:18 -03:00
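
The rule described in 31098c4ae4, expanding relative paths with respect to $HOME, can be sketched as follows. `expand_relative` is a hypothetical helper and the home directory is passed in explicitly; `posixpath` is used so the example behaves the same on any platform.

```python
import posixpath


def expand_relative(path, home):
    """Leave absolute paths alone; resolve relative ones against the home directory."""
    if posixpath.isabs(path):
        return path
    return posixpath.join(home, path)
```

For example, with a home directory of `/home/user`, a configured value of `mail` becomes `/home/user/mail`, which is also the expanded form a user would then see printed back.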
David Bremner
5f80e106d6 lib/config: remove early free in _get_email_from_passwd_file
This (obvious) bug was caused by cut&pasting the code from
notmuch-config.c into the library and adding on a return at the end.
2021-04-24 12:11:45 -03:00
Đoàn Trần Công Danh
441a327051 compat: rename {,notmuch_}canonicalize_file_name
When compat canonicalize_file_name was introduced, it was limited to
C code only because it was used by C code only during that time.

From 5ec6fd4d (lib/open: check for split configuration when creating
database., 2021-02-16), lib/open.cc, which is C++, relies on the
existence of canonicalize_file_name.

However, we can't blindly enable canonicalize_file_name for C++ code,
because different implementations have different additional signatures for
C++ and users can arbitrarily add -DHAVE_CANONICALIZE_FILE_NAME=0 to
{C,CXX}FLAGS.

Let's move our implementation into a util library.

Helped-by: Tomi Ollila <tomi.ollila@iki.fi>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
2021-04-24 08:07:00 -03:00
David Bremner
084e60d54a lib/n_d_index_file: check return value from _n_m_add_filename
Ignoring this return value seems like a bad idea in general, and in
particular it has been hiding one or more bugs related to handling
long directory names.
2021-04-18 10:02:20 -03:00
David Bremner
9ad19e4454 lib: directly traverse postlists in _n_message_delete
This is intended to fix the slow behaviour of "notmuch new" (and possibly
"notmuch reindex") when large numbers of files are deleted.

The underlying issue [1] seems to be the Xapian glass backend spending
a large amount of time in db.has_positions when running queries with
large-ish amounts of unflushed changes.

This commit removes two uses of Xapian queries [2], and replaces them with
an approximation of what Xapian would do after optimizing the
queries. This avoids the calls to has_positions (which are in any case
un-needed because we are only using boolean terms here).

[1] Thanks to "andres" on IRC for narrowing down the performance
bottleneck.

[2] Thanks to Olly Betts of Xapian fame for talking me through a fix
that does not require people to update Xapian.
2021-04-18 09:50:36 -03:00
David Bremner
f5d4349921 lib: provide notmuch_config_path
Since the library searches in several locations for a config file, the
caller does not know which of these is chosen in the usual case of
passing NULL as a config file. This change provides an API for the
caller to retrieve the name of the config file chosen. It will be
tested in a following commit.
2021-03-27 09:26:14 -03:00
David Bremner
217f819608 CLI+lib: detect missing database in split configurations.
Eventually we want to do all opening of databases in the top
level (main function). This means that detection of missing databases
needs to move out of subcommands. It also requires updating the
library to use the new NO_DATABASE status code.
2021-03-27 09:26:14 -03:00
David Bremner
2e39ce6eb5 lib: add NOTMUCH_STATUS_NO_DATABASE
This will allow more precise return values from various open related functions.
2021-03-27 09:26:14 -03:00