Commit graph

56 commits

Author SHA1 Message Date
Jani Nikula
6987286a5b lib: remove enum names from typedefs
There are some enum typedefs with the enum name:

    typedef enum _name_t { ... } name_t;

We don't need or use the enum names _name_t for anything, and not all
of the enum typedefs have them. We have the typedefs specifically to
use the typedef name.

Use the anonymous enum in the typedefs:

    typedef enum { ... } name_t;
2021-10-23 08:38:53 -03:00
David Bremner
036734252d lib: factor out expansion of saved queries.
This is intended to allow use outside of the Xapian query parser.
2021-09-04 17:07:19 -07:00
David Bremner
4083fd8bec lib/thread-fp: factor out query expansion, rewrite in Xapian
It will be convenient not to have to construct a notmuch query object
when parsing subqueries, so the commit rewrites the query
expansion (currently only used for thread:{} queries) using only
Xapian. As a bonus it seems about 15% faster in initial experiments.
2021-09-04 17:07:19 -07:00
David Bremner
b3bbaf1bc2 lib/query: factor out _notmuch_query_string_to_xapian_query
When dealing with recursive queries (i.e. thread:{foo}) it turns out
to be useful just to deal with the underlying Xapian objects, and not
wrap them in notmuch objects.
2021-09-04 17:07:19 -07:00
David Bremner
5cb452c325 lib: factor out query construction from regexp
This will allow re-use of this code outside of the Xapian query parser.
2021-09-04 17:07:19 -07:00
David Bremner
3202e0d1fe lib: leave stemmer object accessible
This enables using the same stemmer in both query parsers.
2021-09-04 17:07:19 -07:00
David Bremner
be7e83de96 lib/parse-sexp: parse single terms and the empty list.
There is not much of a parser here yet, but it already does some
useful error reporting. Most functionality sketched in the
documentation is not implemented yet; detailed documentation will
follow with the implementation.
2021-09-04 17:07:19 -07:00
David Bremner
e2a3e5fa51 lib: autocommit after some number of completed transactions
This change addresses two known issues with large sets of changes to
the database.  The first is that as reported by Steven Allen [1],
notmuch commits are not "flushed" when they complete, which means that
if there is an open transaction when the database closes (or e.g. the
program crashes) then all changes since the last commit will be
discarded (nothing is irrecoverably lost for "notmuch new", as the
indexing process just restarts next time it is run).  This does not
really "fix" the issue reported in [1]; that seems rather difficult
given how transactions work in Xapian. On the other hand, with the
default settings, this should mean one only loses less than a minutes
worth of work.  The second issue is the occasionally reported "storm"
of disk writes when notmuch finishes. I don't yet have a test for
this, but I think committing as we go should reduce the amount of work
when finalizing the database.

[1]: id:20151025210215.GA3754@stebalien.com
2021-06-27 14:03:00 -03:00
David Bremner
929386fad9 lib/generate_thread_id: move static buffer to notmuch_database_t
Work towards the goal of concurrent access to different Xapian
databases from the same process.
2021-05-14 06:38:19 -03:00
David Bremner
f5d4349921 lib: provide notmuch_config_path
Since the library searches in several locations for a config file, the
caller does not know which of these is chosen in the usual case of
passing NULL as a config file. This changes provides an API for the
caller to retrieve the name of the config file chosen. It will be
tested in a following commit.
2021-03-27 09:26:14 -03:00
David Bremner
6251e2bb9e lib: remove "path" from notmuch struct
This removes duplication between the struct element and the
configuration string_map entry. Create a simple wrapper for setting
the database path that makes sure the trailing / is stripped.
2021-03-20 07:23:40 -03:00
David Bremner
f0717aa380 lib: save path of xapian database in notmuch struct.
This will allow re-opening in a different mode (read/write
vs. read-only) with current Xapian API. It will also prove useful when
updating the compact functions to support more flexible database
location.
2021-03-18 08:03:48 -03:00
David Bremner
4743e87c2c lib: cache configuration information from database
The main goal is to allow configuration information to be temporarily
overridden by a separate config file. That will require further
changes not in this commit.

The performance impact is unclear, and will depend on the balance
between number of queries and number of distinct metadata items read
on the first call to n_d_get_config.
2021-02-06 18:56:05 -04:00
David Bremner
3b40978241 lib: factor out prefix related code to its own file
Reduce the size of database.cc, and limit the scope of prefix_table,
make sure it's accessed via a well-defined internal API.
2020-12-23 09:21:17 -04:00
David Bremner
e34e2a68b6 lib: factor out feature name related code.
database.cc is uncomfortably large, and some of the static data
structures do not need to be shared as much as they are.

This is a somewhat small piece to factor out, but it will turn out to
be helpful to further refactoring.
2020-12-23 09:06:34 -04:00
David Bremner
a09293793f lib: replace use of static_cast for writable databases
static_cast is a bit tricky to understand and error prone, so add a
second pointer to (potentially the same) Xapian database object that
we know has the right subclass.
2020-07-28 08:47:58 -03:00
David Bremner
095d3d7134 lib: move deallocation of memory from n_d_close to n_d_destroy
In order to mimic the "best effort" API of Xapian to provide
information from a closed database when possible, do not
destroy the Xapian database object too early.

Because the pointer to a Xapian database is no longer nulled on close,
introduce a flag to track whether the notmuch database is open or not.
2020-07-22 19:52:55 -03:00
David Bremner
b90d852a2f lib: migrate from Xapian ValueRangeProcessor to RangeProcessor
This will be mandatory as of Xapian 1.5.  The API is also more
consistent with the FieldProcessor API, which helps code re-use a bit.

Note that this switches to using the built-in Xapian support for
prefixes on ranges (i.e. deleted code at beginning of
ParseTimeRangeProcessor::operator(), added prefix to constructor).

Another side effect of the migration is that we are generating smaller
queries, using one OP_VALUE_RANGE instead of an AND of two OP_VALUE_*
queries.
2020-07-11 17:20:09 -03:00
uncrustify
2b62ca2e3b lib: run uncrustify
This is the result of running

     $ uncrustify --replace --config ../devel/uncrustify.cfg *.c *.h *.cc

in the lib directory
2019-06-14 07:41:27 -03:00
David Bremner
b52cda90f0 lib: cache user prefixes in database object
This will be used to avoid needing a database access to resolve a db
prefix from the corresponding UI prefix (e.g. when indexing). Arguably
the setup of the separate header map does not belong here, since it is
about indexing rather than querying, but we currently don't have any
other indexing setup to do.
2019-05-25 07:08:20 -03:00
David Bremner
319dd95ebb lib: add 'body:' field, stop indexing headers twice.
The new `body:` field (in Xapian terms) or prefix (in slightly
sloppier notmuch) terms allows matching terms that occur only in the
body.

Unprefixed query terms should continue to match anywhere (header or
body) in the message.

This follows a suggestion of Olly Betts to use the facility (since
Xapian 1.0.4) to add the same field with multiple prefixes. The double
indexing of previous versions is thus replaced with a query time
expension of unprefixed query terms to the various prefixed
equivalent.

Reindexing will be needed for 'body:' searches to work correctly;
otherwise they will also match messages where the term occur in
headers (demonstrated by the new tests in T530-upgrade.sh)
2019-04-17 08:48:16 -03:00
Jani Nikula
008a5e92eb lib: convert notmuch_bool_t to stdbool internally
C99 stdbool turned 18 this year. There really is no reason to use our
own, except in the library interface for backward
compatibility. Convert the lib internally to stdbool.
2017-10-09 22:27:16 -03:00
David Bremner
4034a7cec7 lib: isolate n_d_add_message and helper functions into own file
'database.cc' is becoming a monster, and it's hard to follow what the
various static functions are used for. It turns out that about 1/3 of
this file notmuch_database_add_message and helper functions not used
by any other function. This commit isolates this code into it's own
file.

Some side effects of this refactoring:

- find_doc_ids becomes the non-static (but still private)
  _notmuch_database_find_doc_ids
- a few instances of 'string' have 'std::' prepended, avoiding the
  need for 'using namespace std;' in the new file.
2017-08-01 21:17:47 -04:00
Jani Nikula
bc11759dd1 build: switch to hiding libnotmuch symbols by default
The dynamic generation of the linker version script for libnotmuch
exports has grown rather complicated.

Reverse the visibility control by hiding symbols by default using
-fvisibility=hidden, and explicitly exporting symbols in notmuch.h
using #pragma GCC visibility. (We could also use __attribute__
((visibility ("default"))) for each exported function, but the pragma
is more convenient.)

The above is not quite enough alone, as it would "leak" a number of
weak symbols from Xapian and C++ standard library. Combine it with a
small static version script that filters out everything except the
notmuch_* symbols that we explicitly exposed, and the C++ RTTI
typeinfo symbols for exception handling.

Finally, as the symbol hiding test can no longer look at the generated
symbol table, switch the test to parse the functions from notmuch.h.
2017-05-12 07:17:18 -03:00
David Bremner
31b8ce4558 lib: create field processors from prefix table
This is a bit more code than hardcoding the two existing field
processors, but it should make it easy to add more.
2017-03-03 07:15:13 -04:00
David Bremner
e17a914b77 lib: add _notmuch_database_reopen
The main expected use is to recover from a Xapian::DatabaseChanged
exception.
2017-02-25 21:09:17 -04:00
David Bremner
0e037c34dd lib: Let Xapian manage the memory for FieldProcessors
It turns out this is exactly what release() is for; Xapian will
deallocate the objects when it's done with them.
2017-02-18 22:18:06 -04:00
David Bremner
e30fa4182f lib: merge internal prefix tables
Replace multiple tables with some flags in a single table. This makes
the code in notmuch_database_open_verbose a bit shorter, and it should
also make it easier to add other options to fields, e.g. regexp
searching.
2017-02-18 22:17:39 -04:00
David Bremner
0abcad7c0e lib: optionally silence Xapian deprecation warnings
This is not ideal, but the new API is not available in Xapian 1.2.x, and
it seems to soon to depend on Xapian >= 1.4
2016-11-15 07:47:55 -04:00
Daniel Kahn Gillmor
6a833a6e83 Use https instead of http where possible
Many of the external links found in the notmuch source can be resolved
using https instead of http.  This changeset addresses as many as i
could find, without touching the e-mail corpus or expected outputs
found in tests.
2016-06-05 08:32:17 -03:00
David Bremner
b9bf3f44ea lib: add support for named queries
This relies on the optional presense of xapian field processors, and the
library config API.
2016-05-25 07:40:44 -03:00
David Bremner
30caaf52b0 lib: make a global constant for query parser flags
It's already kindof gross that this is hardcoded in two different
places. We will also need these later in field processors calling back
into the query parser.
2016-05-25 07:40:44 -03:00
David Bremner
bbf6069252 lib: optionally support single argument date: queries
This relies on the FieldProcessor API, which is only present in xapian
>= 1.3.
2016-05-08 08:17:07 -03:00
Austin Clements
cb08a2ee01 lib: Add "lastmod:" queries for filtering by last modification
The implementation is essentially the same as the date range search
prior to Jani's fancy date parser.
2015-08-14 18:23:49 +02:00
Austin Clements
98ee460eaa lib: API to retrieve database revision and UUID
This exposes the committed database revision to library users along
with a UUID that can be used to detect when revision numbers are no
longer comparable (e.g., because the database has been replaced).
2015-08-13 23:52:51 +02:00
Austin Clements
7f57b747b9 lib: Add per-message last modification tracking
This adds a new document value that stores the revision of the last
modification to message metadata, where the revision number increases
monotonically with each database commit.

An alternative would be to store the wall-clock time of the last
modification of each message.  In principle this is simpler and has
the advantage that any process can determine the current timestamp
without support from libnotmuch.  However, even assuming a computer's
clock never goes backward and ignoring clock skew in networked
environments, this has a fatal flaw.  Xapian uses (optimistic)
snapshot isolation, which means reads can be concurrent with writes.
Given this, consider the following time line with a write and two read
transactions:

   write  |-X-A--------------|
   read 1       |---B---|
   read 2                      |---|

The write transaction modifies message X and records the wall-clock
time of the modification at A.  The writer hangs around for a while
and later commits its change.  Read 1 is concurrent with the write, so
it doesn't see the change to X.  It does some query and records the
wall-clock time of its results at B.  Transaction read 2 later starts
after the write commits and queries for changes since wall-clock time
B (say the reads are performing an incremental backup).  Even though
read 1 could not see the change to X, read 2 is told (correctly) that
X has not changed since B, the time of the last read.  In fact, X
changed before wall-clock time A, but the change was not visible until
*after* wall-clock time B, so read 2 misses the change to X.

This is tricky to solve in full-blown snapshot isolation, but because
Xapian serializes writes, we can use a simple, monotonically
increasing database revision number.  Furthermore, maintaining this
revision number requires no more IO than a wall-clock time solution
because Xapian already maintains statistics on the upper (and lower)
bound of each value stream.
2015-08-13 23:52:51 +02:00
David Bremner
b53e1a2da7 lib: add a log function with output to a string in notmuch_database_t
In principle in the future this could do something fancier than
asprintf.
2015-03-29 00:34:15 +01:00
Todd
0de999aab5 Add the NOTMUCH_FEATURE_INDEXED_MIMETYPES database feature
This feature will exist in all newly created databases, but there is
no upgrade provided for it.  If this flag exists, it indicates that
the database was created after the indexed MIME-types feature was
added.
2015-01-24 16:47:47 +01:00
Austin Clements
ee476f1e76 lib: Enable ghost messages feature
This fixes the broken thread order test.
2014-10-25 19:31:27 +02:00
Austin Clements
1cdb96d3c4 lib: Add a ghost messages database feature
This will be implemented over the next several patches.  The feature
is not yet "enabled" (this does not add it to
NOTMUCH_FEATURES_CURRENT).
2014-10-25 19:25:54 +02:00
Austin Clements
8363c90531 lib: Database version 3: Introduce fine-grained "features"
Previously, our database schema was versioned by a single number.
Each database schema change had to occur "atomically" in Notmuch's
development history: before some commit, Notmuch used version N, after
that commit, it used version N+1.  Hence, each new schema version
could introduce only one change, the task of developing a schema
change fell on a single person, and it all had to happen and be
perfect in a single commit series.  This made introducing a new schema
version hard.  We've seen only two schema changes in the history of
Notmuch.

This commit introduces database schema version 3; hopefully the last
schema version we'll need for a while.  With this version, we switch
from a single version number to "features": a set of named,
independent aspects of the database schema.

Features should make backwards compatibility easier.  For many things,
it should be easy to support databases both with and without a
feature, which will allow us to make upgrades optional and will enable
"unstable" features that can be developed and tested over time.

Features also make forwards compatibility easier.  The features
recorded in a database include "compatibility flags," which can
indicate to an older version of Notmuch when it must support a given
feature to open the database for read or for write.  This lets us
replace the old vague "I don't recognize this version, so something
might go wrong, but I promise to try my best" warnings upon opening a
database with an unknown version with precise errors.  If a database
is safe to open for read/write despite unknown features, an older
version will know that and issue no message at all.  If the database
is not safe to open for read/write because of unknown features, an
older version will know that, too, and can tell the user exactly which
required features it lacks support for.
2014-08-30 10:42:08 -07:00
Jani Nikula
90cd1bac4e lib: add date range query support
Add a custom value range processor to enable date and time searches of
the form date:since..until, where "since" and "until" are expressions
understood by the previously added date/time parser, to restrict the
results to messages within a particular time range (based on the Date:
header).

If "since" or "until" describes date/time at an accuracy of days or
less, the values are rounded according to the accuracy, towards past
for "since" and towards future for "until". For example,
date:november..yesterday would match from the beginning of November
until the end of yesterday. Expressions such as date:today..today
means since the beginning of today until the end of today.

Open-ended ranges are supported (since Xapian 1.2.1), i.e. you can
specify date:..until or date:since.. to not limit the start or end
date, respectively.

CAVEATS:

Xapian does not support spaces in range expressions. You can replace
the spaces with '_', or (in most cases) '-', or (in some cases) leave
the spaces out altogether.

Entering date:expr without ".." (for example date:yesterday) will not
work as you might expect. You can achieve the expected result by
duplicating the expr both sides of ".." (for example
date:yesterday..yesterday).

Open-ended ranges won't work with pre-1.2.1 Xapian, but they don't
produce an error either.

Signed-off-by: Jani Nikula <jani@nikula.org>
2012-10-31 16:55:32 -03:00
Austin Clements
e59cc0031f lib: Add support for nested atomic sections.
notmuch_database_t now keeps a nesting count and we only start a
transaction or commit for the outermost atomic section.

Introduces a new error, NOTMUCH_STATUS_UNBALANCED_ATOMIC.
2011-09-23 21:50:38 -04:00
Austin Clements
206938ec9b Add a generic function to get a list of terms with some prefix.
Replace _notmuch_convert_tags with this and simplify
_create_filenames_for_terms_with_prefix.  This will also come in handy
shortly to get the message file name list.
2011-03-21 02:45:18 -04:00
Carl Worth
bb74e9dff8 lib: Rework interface for maildir_flags synchronization
Instead of having an API for setting a library-wide flag for
synchronization (notmuch_database_set_maildir_sync) we instead
implement maildir synchronization with two new library functions:

	notmuch_message_maildir_flags_to_tags
  and   notmuch_message_tags_to_maildir_flags

These functions are nicely documented here, (though the implementation
does not quite match the documentation yet---as plainly evidenced by
the current results of the test suite).
2010-11-11 03:40:19 -08:00
Michal Sojka
d9d3d3e6f0 Make maildir synchronization configurable
This adds group [maildir] and key 'synchronize_flags' to the
configuration file. Its value enables (true) or diables (false) the
synchronization between notmuch tags and maildir flags. By default,
the synchronization is disabled.
2010-11-10 13:09:32 -08:00
Carl Worth
c81cecf620 lib: Add GCC visibility(hidden) pragmas to private header files.
This prevents any of the private functions from being leaked out
through the library interface (at least when compiling with a
recent-enough gcc to support the visibility pragma).
2010-11-01 22:35:48 -07:00
Carl Worth
98845fdbb2 Avoid database corruption by not adding partially-constructed mail documents.
Previously we were using Xapian's add_document to allocate document ID
values for notmuch_message_t objects.  This had the drawback of adding
a partially constructed mail document to the database. If notmuch was
subsequently interrupted before fully populating this document, then
later runs would be quite confused when seeing the partial documents.

There are reports from the wild of people hitting internal errors of
the form "Message ... has no thread ID" for example, (which is
currently an unrecoverable error).

We fix this by manually allocating document IDs without adding
documents. With this change, we never call Xapian's add_document
method, but only replace_document with either the current document ID
of a message or a new one that we have allocated.
2010-06-04 10:16:53 -07:00
Carl Worth
e0a8dee8bc Fix printf for when uint64_t != unsigned long long int
Thanks to Michal Sojka <sojkam1@fel.cvut.cz> for pointing out the
correct fix, which I verified in the freely-available WG14/N1124 draft
(from the C99 working group) which is available here:

http://www.open-std.org/JTC1/SC22/wg14/www/docs/n1124.pdf
2010-02-09 11:14:16 -08:00
Carl Worth
9439b217c3 Switch from random to sequential thread identifiers.
The sequential identifiers have the advantage of being guaranteed to
be unique (until we overflow a 64-bit unsigned integer), and also take
up half as much space in the "notmuch search" output (16 columns
rather than 32).

This change also has the side effect of fixing a bug where notmuch
could block on /dev/random at startup (waiting for some entropy to
appear). This bug was hit hard by the test suite, (which could easily
exhaust the available entropy on common systems---resulting in large
delays of the test suite).
2010-02-09 11:14:11 -08:00