Commit graph

677 commits

Author SHA1 Message Date
Daniel Kahn Gillmor
0bb05ff693 reindex: drop all properties named with prefix "index."
This allows us to create new properties that will be automatically set
during indexing, and cleared during re-indexing, just by choice of
property name.
2017-10-21 19:53:08 -03:00
Daniel Kahn Gillmor
20ff9de24d index: implement notmuch_indexopts_t with try_decrypt
This is currently mostly a wrapper around _notmuch_crypto_t that keeps
its internals private and doesn't expose any of the GMime API.
However, non-crypto indexing options might also be added later
(e.g. filters or other transformations).
2017-10-21 19:52:47 -03:00
Daniel Kahn Gillmor
0b9e1a2472 properties: add notmuch_message_remove_all_properties_with_prefix()
Subsequent patches will introduce a convention that properties whose
name starts with "index." will be stripped (and possibly re-added)
during re-indexing.  This patch lays the groundwork for doing that.
2017-10-20 07:58:43 -03:00
Daniel Kahn Gillmor
a18bbf7f15 crypto: make shared crypto code behave library-like
If we're going to reuse the crypto code across both the library and
the client, then it needs to report error states properly and not
write to stderr.
2017-10-20 07:58:20 -03:00
Jani Nikula
008a5e92eb lib: convert notmuch_bool_t to stdbool internally
C99 stdbool turned 18 this year. There really is no reason to use our
own, except in the library interface for backward
compatibility. Convert the lib internally to stdbool.
2017-10-09 22:27:16 -03:00
Daniel Kahn Gillmor
e3a6368e8d fix reference to notmuch_message_get_properties 2017-09-24 09:15:24 -03:00
Daniel Kahn Gillmor
f4ac5ecd5c lib: index the content-type of the parts of encrypted messages
This is a logical followup to "lib: index the content type of
signature parts", which will make it easier to record the message
structure of all messages.
2017-09-17 20:01:19 -03:00
Jani Nikula
55c047ee0b lib: index the content type of signature parts
It's useful (*) to be able to easily find messages with certain types
of signatures. Having the mimetype: prefix searches fail for some
content types is also genuinely surprising (*). Index the content type
of signature parts.

While at it, switch to the gmime convenience constants for content and
signature part indexes.

*) At least for developers of email software!
2017-09-17 20:01:00 -03:00
Jani Nikula
930d0aefb1 lib: abstract content type indexing
Make the follow-up change of indexing signature content types
easier. No functional changes.
2017-09-17 20:00:32 -03:00
Jani Nikula
eb29e26a99 build: fix out-of-tree builds, again
Broken, again, by yours truly in bc11759dd1 ("build: switch to
hiding libnotmuch symbols by default"). Reference notmuch.sym via
$(srctree).
2017-09-13 08:48:17 -03:00
Daniel Kahn Gillmor
3445385f95 fix documentation bug (leading quotes break documentation) 2017-09-05 21:54:46 -03:00
David Bremner
debfae20db lib: enforce that n_message_reindex takes headers from first file
This is still a bit stopgap to be only choosing one set of headers,
but this seems like a more defensible set of headers to choose.
2017-09-05 21:51:57 -03:00
David Bremner
0260ee371e lib&cli: use g_object_new instead of g_object_newv
'g_object_newv' is deprecated, and prints annoying warnings. The
warnings suggest using 'g_object_new_with_properties', but that's only
available since glib 2.55 (i.e. a month ago as of this writing).
Since we don't actuall pass any properties, it seems we can just call
'g_object_new'.
2017-09-04 08:04:44 -03:00
David Bremner
0a40ea4b48 lib: add notmuch_message_has_maildir_flag
I considered a higher level interface where the caller passes a tag
name rather than a flag character, but the role of the "unread" tag is
particularly confusing with such an interface.
2017-08-29 21:56:21 -03:00
David Bremner
8a8fb39b0c lib/message: split n_m_maildir_flags_tags, store maildir flags
In a future commit this will allow querying maildir flags seperately
from tags to allow resolving certain conflicts.
2017-08-29 21:51:10 -03:00
Daniel Kahn Gillmor
eb232ee0ab reindex: drop notmuch_param_t, use notmuch_indexopts_t instead
There are at least three places in notmuch that can trigger an
indexing action:

 * notmuch new
 * notmuch insert
 * notmuch reindex

I have plans to add some indexing options (e.g. indexing the cleartext
of encrypted parts, external filters, automated property injection)
that should properly be available in all places where indexing
happens.

I also want those indexing options to be exposed by (and constrained
by) the libnotmuch C API.

This isn't yet an API break because we've never made a release with
notmuch_param_t.

These indexing options are relevant in the listed places (and in the
libnotmuch analogues), but they aren't relevant in the other kinds of
functionality that notmuch offers (e.g. dump/restore, tagging, search,
show, reply).

So i think a generic "param" object isn't well-suited for this case.
In particular:

 * a param object sounds like it could contain parameters for some
   other (non-indexing) operation.  This sounds confusing -- why would
   i pass non-indexing parameters to a function that only does
   indexing?

 * bremner suggests online a generic param object would actually be
   passed as a list of param objects, argv-style.  In this case (at
   least in the obvious argv implementation), the params might be some
   sort of generic string.  This introduces a problem where the API of
   the library doesn't grow as new options are added, which means that
   when code outside the library tries to use a feature, it first has
   to test for it, and have code to handle it not being available.
   The indexopts approach proposed here instead makes it clear at
   compile time and at dynamic link time that there is an explicit
   dependency on that feature, which allows automated tools to keep
   track of what's needed and keeps the actual code simple.

My proposal adds the notmuch_indexopts_t as an opaque struct, so that
we can extend the list of options without causing ABI breakage.

The cost of this proposal appears to be that the "boilerplate" API
increases a little bit, with a generic constructor and destructor
function for the indexopts struct.

More patches will follow that make use of this indexopts approach.
2017-08-23 07:55:12 -03:00
Daniel Kahn Gillmor
b10ce6bc23 database: add n_d_index_file (deprecates n_d_add_message)
We need a way to pass parameters to the indexing functionality on the
first index, not just on reindexing.  The obvious place is in
notmuch_database_add_message.  But since modifying the argument list
would break both API and ABI, we needed a new name.

I considered notmuch_database_add_message_with_params(), but the
functionality we're talking about doesn't always add a message.  It
tries to index a specific file, possibly adding a message, but
possibly doing other things, like adding terms to an existing message,
or failing to deal with message objects entirely (e.g. because the
file didn't contain a message).

So i chose the function name notmuch_database_index_file.

I confess i'm a little concerned about confusing future notmuch
developers with the new name, since we already have a private
_notmuch_message_index_file function, and the two do rather different
things.  But i think the added clarity for people linking against the
future libnotmuch and the capacity for using index parameters makes
this a worthwhile tradeoff.  (that said, if anyone has another name
that they strongly prefer, i'd be happy to go with it)

This changeset also adjusts the tests so that we test whether the new,
preferred function returns bad values (since the deprecated function
just calls the new one).

We can keep the deprecated n_d_add_message function around as long as
we like, but at the next place where we're forced to break API or ABI
we can probably choose to drop the name relatively safely.

NOTE: there is probably more cleanup to do in the ruby and go bindings
to complete the deprecation directly.  I don't know those languages
well enough to attempt a fix; i don't know how to test them; and i
don't know the culture around those languages about API additions or
deprecations.
2017-08-23 07:38:37 -03:00
Yuri Volchkov
cec4a87539 database: move striping of trailing '/' into helper function
Stripping trailing character is not that uncommon
operation. Particularly, the next patch has to perform it as
well. Lets move it to the separate function to avoid code duplication.

Also the new function has a little improvement: if the character to
strip is repeated several times in the end of a string, function
strips them all.

Signed-off-by: Yuri Volchkov <yuri.volchkov@gmail.com>
2017-08-22 18:47:51 -03:00
Daniel Kahn Gillmor
55f9f6505e lib: clarify description of notmuch_database_add_message
Since we're accumulating the index when we add a new file to the
message, the semantics have slightly changed.  This tries to align the
documentation with the actual functionality.
2017-08-20 08:33:46 -03:00
Daniel Kahn Gillmor
5b93fa6e70 lib: add notmuch_message_reindex
This new function asks the database to reindex a given message.
The parameter `indexopts` is currently ignored, but is intended to
provide an extensible API to support e.g. changing the encryption or
filtering status (e.g. whether and how certain non-plaintext parts are
indexed).
2017-08-01 21:17:47 -04:00
David Bremner
34d7753992 lib: add _notmuch_message_remove_indexed_terms
Testing will be provided via use in notmuch_message_reindex
2017-08-01 21:17:47 -04:00
David Bremner
50340bcb78 lib: add notmuch_thread_get_total_files
This is relatively inexpensive in terms of run time and implementation
cost as we are already traversing the list of messages in a thread.
2017-08-01 21:17:47 -04:00
David Bremner
8a8e2b11c2 lib: add notmuch_message_count_files
This operation is relatively inexpensive, as the needed metadata is
already computed by our lazy metadata fetching. The goal is to support
better UI for messages with multipile files.
2017-08-01 21:17:47 -04:00
David Bremner
411675a6ce lib: index message files with duplicate message-ids
The corresponding xapian document just gets more terms added to it,
but this doesn't seem to break anything. Values on the other hand get
overwritten, which is a bit annoying, but arguably it is not worse to
take the values (from, subject, date) from the last file indexed
rather than the first.
2017-08-01 21:17:47 -04:00
David Bremner
4fdabd636e lib: refactor notmuch_database_add_message header parsing
This function is large and hard to understand and modify. Start to
break it down into meaningful pieces.
2017-08-01 21:17:47 -04:00
David Bremner
2f94b3090c lib: factor out message-id parsing to separate file.
This is really pure C string parsing, and doesn't need to be mixed in
with the Xapian/C++ layer. Although not strictly necessary, it also
makes it a bit more natural to call _parse_message_id from multiple
compilation units.
2017-08-01 21:17:47 -04:00
David Bremner
95b52e85b2 lib/n_d_add_message: refactor test for new/ghost messages
The switch is easier to understand than the side effects in the if
test. It also potentially allows us more flexibility in breaking up
this function into smaller pieces, since passing private_status around
is icky.
2017-08-01 21:17:47 -04:00
David Bremner
4034a7cec7 lib: isolate n_d_add_message and helper functions into own file
'database.cc' is becoming a monster, and it's hard to follow what the
various static functions are used for. It turns out that about 1/3 of
this file notmuch_database_add_message and helper functions not used
by any other function. This commit isolates this code into it's own
file.

Some side effects of this refactoring:

- find_doc_ids becomes the non-static (but still private)
  _notmuch_database_find_doc_ids
- a few instances of 'string' have 'std::' prepended, avoiding the
  need for 'using namespace std;' in the new file.
2017-08-01 21:17:47 -04:00
Daniel Kahn Gillmor
d55fffffd7 fix the generated documentation output 2017-07-18 06:53:57 -03:00
Daniel Kahn Gillmor
87bdfbc91f Fix orthography 2017-07-18 06:50:44 -03:00
David Bremner
4ce7591610 lib: paper over allocation difference
In gmime 3.0 this function is "transfer none", so no deallocation is
needed (or permitted)
2017-07-14 21:23:52 -03:00
David Bremner
eeb64cdeeb lib: add version of _n_m_f_get_combinded_header for gmime 3.0
The iterator is gone, so we need a new loop structure.
2017-07-14 21:23:52 -03:00
David Bremner
439c5896b6 lib: refactor _notmuch_messsage_file_get_combined_header
We need to rewrite the loop for gmime-3.0; move the loop body to its
own function to avoid code duplication.  Keep the common exit via
"goto DONE" to make this pure code movement.  It's important to note
that the existing exit path only deallocates the iterator.
2017-07-14 21:23:52 -03:00
David Bremner
c040464a7c lib: wrap use of g_mime_utils_header_decode_date
This changes return type in gmime 3.0
2017-07-14 21:23:52 -03:00
David Bremner
cbb2d5608e lib/cli: replace use of g_mime_message_get_sender
This function changes semantics in gmime-3.0 so make a new function
that provides the same functionality in both
2017-07-14 17:58:09 -03:00
David Bremner
6dd00d6486 lib/index: add simple html filter
The filter just drops all (HTML) tags. As an enabling change, pass the
content type to the filter constructor so we can decide which scanner
to user.
2017-07-01 12:32:27 -03:00
David Bremner
64f81f95a1 lib/index.cc: generalize filter state machine
To match things more complicated than fixed strings, we need states
with multiple out arrows.
2017-07-01 12:32:17 -03:00
David Bremner
4a085a5137 lib/index: separate state table definition from scanner.
We want to reuse the scanner definition with a different table.  This
is mainly code movement, and making the state table part of the filter
struct/class.
2017-07-01 12:32:03 -03:00
David Bremner
20c15bc820 lib/index: generalize name of indexing filter
In followup commits we will generalize the functionality of this
filter to deal with other types of non-indexable content.
2017-07-01 12:31:55 -03:00
Jani Nikula
30c475c1ef build: visibility=default for library structs is no longer needed
Commit d5523ead90 ("Mark some structures in the library interface
with visibility=default attribute.") fixed some mixed visibility
issues with structs. With the symbol default visibility reversed, this
is no longer a problem.
2017-05-13 08:38:18 -03:00
Jani Nikula
bc11759dd1 build: switch to hiding libnotmuch symbols by default
The dynamic generation of the linker version script for libnotmuch
exports has grown rather complicated.

Reverse the visibility control by hiding symbols by default using
-fvisibility=hidden, and explicitly exporting symbols in notmuch.h
using #pragma GCC visibility. (We could also use __attribute__
((visibility ("default"))) for each exported function, but the pragma
is more convenient.)

The above is not quite enough alone, as it would "leak" a number of
weak symbols from Xapian and C++ standard library. Combine it with a
small static version script that filters out everything except the
notmuch_* symbols that we explicitly exposed, and the C++ RTTI
typeinfo symbols for exception handling.

Finally, as the symbol hiding test can no longer look at the generated
symbol table, switch the test to parse the functions from notmuch.h.
2017-05-12 07:17:18 -03:00
Jani Nikula
d5ed9af0e4 build: do not export compat functions from lib
Commits 9db2145272 ("lib/gen-version-script.h: add getline and
getdelim to notmuch.sym if needed") and 3242e29e57 ("build: add
canonicalize_file_name to symbols exported from libnotmuch.so")
started exporting compat functions from libnotmuch so that the cli
could use them. But we shouldn't export such functions from the
library. They are not part of our ABI. Instead, the cli should include
its own copies of the compat functions.
2017-05-11 20:41:10 -03:00
David Bremner
11d47950c1 lib: Add regexp expansion for for tags and paths
From a UI perspective this looks similar to what was already provided
for from, subject, and mid, but the implementation is quite
different. It uses the database's list of terms to construct a term
based query equivalent to the passed regular expression.
2017-05-09 07:44:29 -03:00
David Bremner
eab365c742 lib: Add regexp searching for mid: prefix
The bulk of the change is passing in the field options to the regexp
field processor, so that we can properly handle the
fallback (non-regexp case).
2017-05-09 07:44:15 -03:00
Fredrik Fornwall
e565118172 Replace index(3) with strchr(3)
The index(3) function has been deprecated in POSIX since 2001 and
removed in 2008, and most code in notmuch already calls strchr(3).

This fixes a compilation error on Android whose libc does not have
index(3).
2017-04-20 06:59:22 -03:00
David Bremner
e1c1d33f37 Merge branch 'release'
Another regexp search fix.
2017-03-29 20:58:34 -03:00
David Bremner
cb84f84878 lib: handle empty string in regexp field processors
The non-field processor behaviour is is convert the corresponding
queries into a search for the unprefixed terms. This yields pretty
surprising results so I decided to generate a query that would match
the terms (i.e. none with that prefix) generated for an empty header.
2017-03-29 20:44:32 -03:00
David Bremner
d877240f4e Merge branch 'release'
wildcard search fixes, plus release busywork
2017-03-25 11:51:03 -03:00
David Bremner
38a56b98f9 lib: only trigger phrase processing for regexp fields when needed
The argument is that if the string passed to the field processor has
no spaces, then the added quotes won't have any benefit except for
disabling wildcards. But disabling wildcards doesn't seem very useful
in the normal Xapian query parser, since they're stripped before
generating terms anyway. It does mean that the query 'from:"foo*"' will
not be precisely equivalent to 'from:foo' as it is for the non
field-processor version.
2017-03-24 09:24:13 -03:00
David Bremner
242d5a3ed5 lib: make notmuch_query_add_tag_exclude return a status value
Since this is an ABI breaking change, but we already bumped the SONAME
for the next release
2017-03-22 08:47:13 -03:00
David Bremner
3721bd45d7 lib: replace deprecated n_q_count_threads with status returning version
This function was deprecated in notmuch 0.21.  We re-use the name for
a status returning version, and deprecate the _st name.
2017-03-22 08:35:07 -03:00
David Bremner
5ce8e0b11b lib: replace deprecated n_q_count_messages with status returning version
This function was deprecated in notmuch 0.21.  We re-use the name for
a status returning version, and deprecate the _st name. One or two
remaining uses of the (removed) non-status returning version fixed at
the same time
2017-03-22 08:35:07 -03:00
David Bremner
86cbd215eb lib: replace deprecated n_q_search_messages with status returning version
This function was deprecated in notmuch 0.21.  We re-use the name for
a status returning version, and deprecate the _st name.
2017-03-22 08:35:07 -03:00
David Bremner
1e982de508 lib: replace n_query_search_threads with status returning version
This function was deprecated in notmuch 0.21. We finally remove the
deprecated API, and rename the status returning version to the simpler
name. The status returning is kept as a deprecated alias.
2017-03-22 08:28:09 -03:00
David Bremner
fc63c15833 lib: bump SONAME to libnotmuch5
We plan a sequence of ABI breaking changes. Put the SONAME change in a
separate commit to make reordering easier.
2017-03-22 08:27:58 -03:00
David Bremner
c39f6361d0 rename libutil.a to libnotmuch_util.a
Apparently some systems (MacOS?) have a system library called libutil
and the name conflict causes problems. Since this library is quite
notmuch specific, rename it to something less generic.
2017-03-18 21:37:43 -03:00
David Bremner
a8a2705222 Merge branch 'release'
Merge in memory fixes
2017-03-18 21:02:42 -03:00
Tomi Ollila
06adc27668 lib/message.cc: fix Coverity finding (use after free)
The object where pointer to `data` was received was deleted before
it was used in _notmuch_string_list_append().

Relevant Coverity messages follow:

3: extract
Assigning: data = std::__cxx11::string(message->doc.()).c_str(),
which extracts wrapped state from temporary of type std::__cxx11::string.

4: dtor_free
The internal representation of temporary of type std::__cxx11::string
is freed by its destructor.

5: use after free:
Wrapper object use after free (WRAPPER_ESCAPE)
Using internal representation of destroyed object local data.
2017-03-18 20:59:46 -03:00
David Bremner
62822a4e2d lib: clamp return value of g_mime_utils_header_decode_date to >=0
For reasons not completely understood at this time, gmime (as of
2.6.22) is returning a date before 1900 on bad date input. Since this
confuses some other software, we clamp such dates to 0,
i.e. 1970-01-01.
2017-03-15 21:58:25 -03:00
Jani Nikula
d56a801b67 lib/database: reduce try block scope to things that really need it
No need to maintain the pure C stuff within a try block, it's arguably
confusing. This also reduces indent for a bunch of code. No functional
changes.
2017-03-10 09:21:05 -04:00
Olly Betts
81bd72cebb lib: Fix RegexpPostingSource
Remove incorrect skipping to first match from init(), and add explicit
skip_to() and check() methods to work around xapian-core bug (the
check() method will also improve speed when filtering by one of
these).
2017-03-07 19:44:36 -04:00
David Bremner
dfacfe14f3 lib: query make exclude handling non-destructive
We filter added exclude at add time, rather than modifying the query by
count search. As noted in the comments, there are several ignored
conditions here.
2017-03-04 20:47:25 -04:00
David Bremner
e209b71873 lib: centralize query parsing, store results.
The main goal is to prepare the way for non-destructive (or at least
less destructive) exclude tag handling. It does this by having a
pre-parsed query available for further processing. This also allows us
to provide slightly more precise error messages.
2017-03-04 20:47:25 -04:00
Jani Nikula
f3edc5dc86 lib: use delete[] to free buffer allocated using new[]
Fix warning caught by clang:

lib/regexp-fields.cc:41:2: warning: 'delete' applied to a pointer that was allocated
      with 'new[]'; did you mean 'delete[]'? [-Wmismatched-new-delete]
        delete buffer;
        ^
              []
lib/regexp-fields.cc:37:17: note: allocated with 'new[]' here
        char *buffer = new char[len];
                       ^
2017-03-04 20:42:39 -04:00
David Bremner
6cb1c617a7 lib: add mid: as a synonym for id:
mid: is the url scheme suggested by URL 2392. We also plan to
introduce more flexible searches for mid: than are possible with
id: (in order not to break assumptions about the special behaviour of
id:, e.g. identifying at most one message).
2017-03-03 17:46:48 -04:00
David Bremner
55524bb063 lib: regexp matching in 'subject' and 'from'
the idea is that you can run

% notmuch search subject:/<your-favourite-regexp>/
% notmuch search from:/<your-favourite-regexp>/

or

% notmuch search subject:"your usual phrase search"
% notmuch search from:"usual phrase search"

This feature is only available with recent Xapian, specifically
support for field processors is needed.

It should work with bindings, since it extends the query parser.

This is easy to extend for other value slots, but currently the only
value slots are date, message_id, from, subject, and last_mod. Date is
already searchable;  message_id is left for a followup commit.

This was originally written by Austin Clements, and ported to Xapian
field processors (from Austin's custom query parser) by yours truly.
2017-03-03 17:46:48 -04:00
David Bremner
31b8ce4558 lib: create field processors from prefix table
This is a bit more code than hardcoding the two existing field
processors, but it should make it easy to add more.
2017-03-03 07:15:13 -04:00
David Bremner
7bd63833bf lib/message.cc: use view number to invalidate cached metadata
Currently the view number is incremented by notmuch_database_reopen
2017-02-25 21:15:38 -04:00
David Bremner
e0b22c139c lib: handle DatabaseModifiedError in _n_message_ensure_metadata
The retries are hardcoded to a small number, and error handling aborts
than propagating errors from notmuch_database_reopen. These are both
somewhat justified by the assumption that most things that can go
wrong in Xapian::Database::reopen are rare and fatal. Here's the brief
discussion with Xapian upstream:

   24-02-2017 08:12:57 < bremner> any intuition about how likely
      Xapian::Database::reopen is to fail? I'm catching a
      DatabaseModifiedError somewhere where handling any further errors is
      tricky, and wondering about treating a failed reopen as as "the
      impossible happened, stopping"

   24-02-2017 16:22:34 < olly> bremner: there should not be much scope for
    failure - stuff like out of memory or disk errors, which are probably a
    good enough excuse to stop
2017-02-25 21:13:50 -04:00
David Bremner
e17a914b77 lib: add _notmuch_database_reopen
The main expected use is to recover from a Xapian::DatabaseChanged
exception.
2017-02-25 21:09:17 -04:00
David Bremner
e0e8586fc7 Merge branch 'release'
Merge in g_hash_table read-after-free fix
2017-02-23 09:08:15 -04:00
David Bremner
884dccf293 lib: make _notmuch_message_ensure_property_map static
It's not called outside message.cc
2017-02-23 08:54:36 -04:00
David Bremner
3db9e94b0e lib: make _notmuch_message_ensure_metadata static
It's not called anywhere outside message.cc.
2017-02-23 08:54:25 -04:00
David Bremner
4e649d000b lib: fix g_hash_table related read-after-free bug
The two g_hash_table functions (insert, add) have different behaviour
with respect to existing keys. g_hash_table_insert frees the new key,
while g_hash_table_add (which is really g_hash_table_replace in
disguise) frees the existing key. With this change 'ref' is live until
the end of the function (assuming single-threaded access to
'hash'). We can't guarantee it will continue to be live in the
future (i.e. there may be a future key duplication) so we copy it with
the allocation context passed to parse_references (in practice this is
the notmuch_message_t object whose parents we are finding).

Thanks to Tomi for the simpler approach to the problem based on
reading the fine glib manual.
2017-02-22 06:28:03 -04:00
David Bremner
0e037c34dd lib: Let Xapian manage the memory for FieldProcessors
It turns out this is exactly what release() is for; Xapian will
deallocate the objects when it's done with them.
2017-02-18 22:18:06 -04:00
David Bremner
e30fa4182f lib: merge internal prefix tables
Replace multiple tables with some flags in a single table. This makes
the code in notmuch_database_open_verbose a bit shorter, and it should
also make it easier to add other options to fields, e.g. regexp
searching.
2017-02-18 22:17:39 -04:00
David Bremner
70519319b5 lib: optimize counting documents
From #xapian

olly> bremner: btw, i noticed notmuch count see ms to request all the documents and then ignores them

bremner> hmm. There's something funny about the way that notmuch uses matches in general iirc

olly> it should be able to do: mset = enquire.get_mset (0, 0, notmuch->xapian_db->get_doccount ());
...
olly> get_matches_estimated() will be exact because check_at_least is the size of the database
2017-01-27 21:54:44 -04:00
Steven Allen
4a2ce7b570 docs: fix notmuch_message_properties_value documentation
It returns the value, not the key.
2017-01-15 14:25:00 -04:00
Jani Nikula
c906da9f60 lib: use glib for sha1 digests instead of embedding libsha1
We already depend on glib both directly and indirectly (via gmime). We
might as well make use of its facilities. Drop the embedded libsha1
and use glib for sha1 digests.
2017-01-08 10:50:38 -04:00
Jani Nikula
217404ff86 lib: fix the todo comment placement on NOTMUCH_STATUS_XAPIAN_EXCEPTION
The todo comment got separated from the status it's related to at
commit 3f32fd8a1c ("Add missing comment for
NOTMUCH_STATUS_READONLY_DATABASE."). Later, commit b65ca8e0ba ("lib:
modify notmuch.h for automatic document generation") moved it, but to
the wrong place. Fix the location.
2017-01-07 08:30:08 -04:00
David Bremner
0abcad7c0e lib: optionally silence Xapian deprecation warnings
This is not ideal, but the new API is not available in Xapian 1.2.x, and
it seems to soon to depend on Xapian >= 1.4
2016-11-15 07:47:55 -04:00
David Bremner
7f07a3f0ed lib: replace deprecated xapian call 'flush()' with 'commit()'
This will make notmuch incompatible with Xapian before 1.1.0, which is
more than 6 years old this point.
2016-10-25 18:13:52 -03:00
David Bremner
b2d6f07a02 lib: document API added in 0.23
The API was already documented, but for future readers note when the
functions were added,
2016-10-06 22:46:01 -03:00
David Bremner
af8903df34 require xapian >= 1.2.6
It seems that no-one tried to compile without Xapian compact support
since March of 2015, since that's when I introduced a syntax error in
that branch of the ifdef.

Given the choice of maintaining this underused branch of code, or
bumping the Xapian dependency to a version from 2011, it seems
reasonable to do the latter.
2016-10-06 22:45:46 -03:00
David Bremner
c2e74662bb lib: bump minor version to mark added symbols
This should not change the SONAME, and therefore won't change the
dynamic linking behaviour, but it may help some users debug missing
symbols in case their libnotmuch is too old.
2016-10-01 22:19:07 -03:00
Tomi Ollila
1c3a8e0898 lib/database.cc: fix misleading indentation
Found by gcc 6.1.1 -Wmisleading-indentation option (set by -Wall).
2016-09-28 08:14:08 -03:00
David Bremner
514a0a6a3b lib: add talloc reference from string map iterator to map
This is needed so that when the map is modified during traversal, and
thus unlinked by the database code, the map is not disposed of until the
iterator is done with it.
2016-09-24 10:08:45 -03:00
Daniel Kahn Gillmor
693ca8d8a8 add property: query prefix to search for specific properties
We want to be able to query the properties directly, like:

   notmuch count property:foo=bar

which should return a count of messages where the property with key
"foo" has value equal to "bar".
2016-09-21 18:14:25 -03:00
David Bremner
58fe8fce1d lib: iterator API for message properties
This is a thin wrapper around the string map iterator API just introduced.
2016-09-21 18:14:25 -03:00
David Bremner
b846bdb482 lib: extend private string map API with iterators
Support for prefix based iterators is perhaps overengineering, but I
wanted to mimic the existing database_config API.
2016-09-21 18:14:24 -03:00
David Bremner
b8bb6d7964 lib: basic message-property API
Initially, support get, set and removal of single key/value pair, as
well as removing all properties.
2016-09-21 18:14:24 -03:00
David Bremner
8b03ee1d5a lib: private string map (associative array) API
The choice of array implementation is deliberate, for future iterator support
2016-09-21 18:14:24 -03:00
David Bremner
4dfb69169e lib: read "property" terms from messages.
This is a first step towards providing an API to attach
arbitrary (key,value) pairs to messages and retrieve all of the values
for a given key.
2016-09-21 18:14:24 -03:00
David Bremner
59fed50a82 lib: update cached mtime in notmuch_directory_set_mtime
Without this change, the following code fails

  notmuch_directory_set_mtime(dir, 12345);
  assert(notmuch_directory_get_mtime(dir) == 12345);
2016-08-23 20:58:46 -03:00
David Bremner
9e177b236c lib: reword comment about XFOLDER: prefix
I believe the current one is misleading, because in my experiments
Xapian did not add : when prefix and term were both upper case. Indeed,
it's hard to see how it could, because prefixes are added at a layer
above Xapian in our code. See _notmuch_message_add_term for an example.

Also try to explain why this is a good idea.  As far as I can ascertain,
this is more of an issue for a system trying to work with an unknown set
of prefixes. Since notmuch has a fixed set of prefixes, and we can
hopefully be trusted not to add XGOLD and XGOLDEN as prefixes, it is
harder for problems to arise.
2016-08-18 05:11:37 -03:00
David Bremner
293186d6c6 lib: provide _notmuch_database_log_append
_notmuch_database_log clears the log buffer each time. Rather than
introducing more complicated semantics about for this function, provide
a second function that does not clear the buffer. This is mainly a
convenience function for callers constructing complex or multi-line log
messages.

The changes to query.cc are to make sure that the common code path of
the new function is tested.
2016-08-09 09:34:11 +09:00
David Bremner
3a45d29ed4 lib: add built_with handling for XAPIAN_DB_RETRY_LOCK
This support will be present only if the appropriate version of xapian
is available _and_ the user did not disable the feature when
building. So there really needs to be some way for the user to check.
2016-06-29 09:05:49 +02:00
Istvan Marko
9b60dc3cd9 Use the Xapian::DB_RETRY_LOCK flag when available
Xapian 1.3 has introduced the DB_RETRY_LOCK flag (Xapian bug
275). Detect it in configure and optionally use it. With this flag
commands that need the write lock will wait for their turn instead of
aborting when it's not immediately available.

Amended by db: allow disabling in configure
2016-06-29 09:03:34 +02:00
David Bremner
38f0d44a82 doc: forbid further operations on a closed database
We could add many null pointer checks, but currently I don't see a use
case that justifies it.
2016-06-28 23:20:38 +02:00
David Bremner
44cfa90bdc lib: fix definition of LIBNOTMUCH_CHECK_VERSION
Fix bug reported in id:20160606124522.g2y2eazhhrwjsa4h@flatcap.org

Although the C99 standard 6.10 is a little non-obvious on this point,
the docs for e.g. gcc are unambiguous. And indeed in practice with the
extra space, this code fails

#include <stdio.h>
#define foo (x) (x+1)

int main(int argc, char **argv){
  printf("%d\n",foo(1));
}
2016-06-11 13:01:44 -03:00
David Bremner
4291f32680 lib: fix memory leak of field processor objects
The field processor objects need to be deallocated explicitly just like
the range processors (or a talloc destructor defined).
2016-06-10 09:20:22 -03:00
David Bremner
ba0b95f846 lib: document config metadata
This probably should have been part of 3458e3c89c, but I missed it.
2016-06-07 07:51:57 -03:00
Daniel Kahn Gillmor
6a833a6e83 Use https instead of http where possible
Many of the external links found in the notmuch source can be resolved
using https instead of http.  This changeset addresses as many as i
could find, without touching the e-mail corpus or expected outputs
found in tests.
2016-06-05 08:32:17 -03:00
Tomi Ollila
cf09631a45 lib: whitespace cleanup
Cleaned the following whitespace in lib/* files:

lib/index.cc:              1 line:  trailing whitespace
lib/database.cc            5 lines: 8 spaces at the beginning of line
lib/notmuch-private.h:     4 lines: 8 spaces at the beginning of line
lib/message.cc:            1 line:  trailing whitespace
lib/sha1.c:                1 line:  empty lines at the end of file
lib/query.cc:              2 lines: 8 spaces at the beginning of line
lib/gen-version-script.sh: 1 line:  trailing whitespace
2016-06-05 08:23:28 -03:00
David Bremner
b9bf3f44ea lib: add support for named queries
This relies on the optional presense of xapian field processors, and the
library config API.
2016-05-25 07:40:44 -03:00
David Bremner
30caaf52b0 lib: make a global constant for query parser flags
It's already kindof gross that this is hardcoded in two different
places. We will also need these later in field processors calling back
into the query parser.
2016-05-25 07:40:44 -03:00
David Bremner
92e59568fa lib: config list iterators
Since xapian provides the ability to restrict the iterator to a given
prefix, we expose this ability to the user. Otherwise we mimic the other
iterator interfances in notmuch (e.g. tags.c).
2016-05-25 06:51:16 -03:00
David Bremner
3458e3c89c lib: provide config API
This is a thin wrapper around the Xapian metadata API. The job of this
layer is to keep the config key value pairs from colliding with other
metadata by transparently prefixing the keys, along with the usual glue
to provide a C interface.

The split of _get_config into two functions is to allow returning of the
return value with different memory ownership semantics.
2016-05-24 08:53:03 -03:00
David Bremner
792bea5aff lib/cli: add library API / CLI for compile time options
This is intentionally low tech; if we have more than two options it may
make sense to build up what infrastructure is provided.
2016-05-13 07:29:12 -03:00
David Bremner
bbf6069252 lib: optionally support single argument date: queries
This relies on the FieldProcessor API, which is only present in xapian
>= 1.3.
2016-05-08 08:17:07 -03:00
Daniel Kahn Gillmor
e366bb2227 complete ghost-on-removal-when-shared-thread-exists
To fully complete the ghost-on-removal-when-shared-thread-exists
proposal, we need to clear all ghost messages when the last active
message is removed from a thread.

Amended by db: Remove the last test of T530, as it no longer makes sense
if we are garbage collecting ghost messages.
2016-04-15 07:13:49 -03:00
Daniel Kahn Gillmor
1695415039 On deletion, replace with ghost when other active messages in thread
There is no need to add a ghost message upon deletion if there are no
other active messages in the thread.

Also, if the message being deleted was a ghost already, we can just go
ahead and delete it.
2016-04-15 07:07:23 -03:00
Daniel Kahn Gillmor
9eebae3da4 Introduce _notmuch_message_has_term()
It can be useful to easily tell if a given message has a given term
associated with it.
2016-04-15 07:07:23 -03:00
Daniel Kahn Gillmor
011fc41d4d Add internal functions to search for alternate doc types
Publicly we are only exposing the non-ghost documents (of "type"
"mail").  But internally we might want to inspect the ghost messages
as well.

This changeset adds two new private interfaces to queries to recover
information about alternate document types.
2016-04-15 07:07:23 -03:00
Daniel Kahn Gillmor
604d1e0977 fix thread breakage via ghost-on-removal
implement ghost-on-removal, the solution to T590-thread-breakage.sh
that just adds a ghost message after removing each message.

It leaks information about whether we've ever seen a given message id,
but it's a fairly simple implementation.

Note that _resolve_message_id_to_thread_id already introduces new
message_ids to the database, so i think just searching for a given
message ID may introduce the same metadata leakage.
2016-04-15 07:07:23 -03:00
Jani Nikula
54aeab1962 lib: clean up _notmuch_database_split_path
Make the logic it a bit easier to read. No functional changes.
2016-04-12 20:46:42 -03:00
Jani Nikula
a352d9ceaa lib: fix handling of one character long directory names at top level
The code to skip multiple slashes in _notmuch_database_split_path()
skips back one character too much. This is compensated by a +1 in the
length parameter to the strndup() call. Mostly this works fine, but if
the path is to a file under a top level directory with one character
long name, the directory part is mistaken to be part of the file name
(slash == path in code). The returned directory name will be the empty
string and the basename will be the full path, breaking the indexing
logic in notmuch new.

Fix the multiple slash skipping to keep the slash variable pointing at
the last slash, and adjust strndup() accordingly.

The bug was introduced in

commit e890b0cf40
Author: Carl Worth <cworth@cworth.org>
Date:   Sat Dec 19 13:20:26 2009 -0800

    database: Store the parent ID for each directory document.

just a little over two months after the initial commit in the Notmuch
code history, making this the longest living bug in Notmuch to date.
2016-04-12 20:40:19 -03:00
Tomi Ollila
342910a280 lib: NOTMUCH_DEPRECATED macro also for older compilers
Some compilers (older than gcc 4.5 and clang 2.9) do support
__attribute__ ((deprecated)) but not
__attribute__ ((deprecated("message"))).

Check if clang version is at least 3.0, or gcc version
is at least 4.5 to define NOTMUCH_DEPRECATED as the
latter variant above. Otherwise define NOTMUCH_DEPRECATED
as the former variant above.

For a bit simpler implementation clang 2.9 is not included
to use the newer variant. It is just one release, and the
older one works fine. Clang 3.0 was released around 2011-11
and gcc 5.1 2015-04-22 (therefore newer macro for gcc 4.5+)
2016-03-14 19:54:32 -03:00
Daniel Kahn Gillmor
07b6220a55 clean up stray apostrophe in comment
This is a nit-picky orthographical fix for an nit-picky ontological
comment.
2016-01-16 08:17:15 -04:00
Daniel Kahn Gillmor
e038b95ffe correct comment referring to notmuch_database_remove_message
notmuch_database_remove_message has no leading underscore in its name.
2016-01-16 08:16:51 -04:00
Steven Allen
c946356cdc forbid atomic transactions on writable, upgradable databases
We can't (but currently do) allow upgrades within transactions because
upgrades need their own transactions. We don't want to re-use the
current transaction because bailing out of an upgrade would mean loosing
all previous changes (because our "atomic" transactions don't commit
before hand). This gives us two options:

1. Fail at the beginning of upgrade (tell the user to end the
   transaction, upgrade, and start over).
2. Don't allow the user to start the transaction.

I went with the latter because:

1. There is no reason to call `begin_atomic` unless you intend to to
   write to the database and anyone intending to write to the database
   should upgrade it first.
2. This means that nothing inside an atomic transaction can ever fail
   with NOTMUCH_STATUS_UPGRADE_REQUIRED.
2015-11-23 08:15:37 -04:00
Jani Nikula
506b81679a lib: content disposition values are not case-sensitive
Per RFC 2183, the values for Content-Disposition values are not
case-sensitive. While at it, use the gmime function for getting at the
disposition string instead of referencing the field directly.

This fixes "attachment" tagging and filename term generation for
attachments while indexing.
2015-11-19 07:47:29 -04:00
Steven Allen
10e933a3bb Documentation: fix type name spelling 2015-10-27 08:07:31 -03:00
Jani Nikula
727fcd18c6 lib: add interface to delete directory documents
As mentioned in acd66cdec0 we don't have
an interface to delete directory documents, and they're left behind. Add
the interface.
2015-10-10 09:14:25 -03:00
David Bremner
7a20f26f91 lib: update doxygen comments to add @since for the new _st API
We should probably to this for all new functions introduced from now on.
2015-10-05 20:16:59 -03:00
David Bremner
378ba492a6 lib: migrate thread.cc to new query_search API
here we rely on thread_id_query being attached to the local talloc
context, so no new cleanup code is needed.
2015-10-05 19:53:53 -03:00
David Bremner
2501c2565c lib: migrate notmuch_database_upgrade to new query_search API
Here we depend on the error path cleaning up query
2015-10-05 19:53:11 -03:00
David Bremner
87ee9a53e3 lib: add versions of n_q_count_{message,threads} with status return
Although I think it's a pretty bad idea to continue using the old API,
this allows both a more gentle transition for clients of the library,
and allows us to break one monolithic change into a series
2015-10-05 19:44:07 -03:00
David Bremner
65a6b86873 lib: move query variable to function scope
This is a prelude to deallocating it (if necessary) on the error path.
2015-10-05 19:39:11 -03:00
Jani Nikula
23b8ed610a lib: add support for date:<expr>..! to mean date:<expr>..<expr>
It doesn't seem likely we can support simple date:<expr> expanding to
date:<expr>..<expr> any time soon. (This can be done with a future
version of Xapian, or with a custom query query parser.) In the mean
time, provide shorthand date:<expr>..! to mean the same. This is
useful, as the expansion takes place before interpetation, and we can
use, for example, date:yesterday..! to match from beginning of
yesterday to end of yesterday.

Idea from Mark Walters <markwalters1009@gmail.com>.
2015-09-25 21:55:24 -03:00
David Bremner
93ee4faa4d lib: constify arguments to notmuch_query_get_*
These functions are all just accessors, and it's pretty clear they don't
modify the query struct. This also fixes one warning I created when I
introduced status.c.
2015-09-23 08:58:19 -03:00
Jani Nikula
f460ad4e9a util: move strcase_equal and strcase_hash to util
For future use in both cli and lib.
2015-09-07 09:43:31 -03:00
David Bremner
bd5504ec10 lib: constify argument to notmuch_database_status_string
We don't modify the database struct, so no harm in committing to that.
2015-09-04 08:24:38 -03:00
David Bremner
110694b00b lib: note remaining uses of deprecated message search API
The two remaining cases in the lib seem to require more than a simple
replacement of the old call, with the new call plus a check of the
return value.
2015-09-04 08:08:18 -03:00
David Bremner
f16944c3b4 lib: remove use of notmuch_query_search_messages from query.cc
There is not too much point in worrying about the bad error reporting
here, because the count api is due for the same deprecation.
2015-09-04 08:06:08 -03:00
Austin Clements
cb08a2ee01 lib: Add "lastmod:" queries for filtering by last modification
The implementation is essentially the same as the date range search
prior to Jani's fancy date parser.
2015-08-14 18:23:49 +02:00
Austin Clements
98ee460eaa lib: API to retrieve database revision and UUID
This exposes the committed database revision to library users along
with a UUID that can be used to detect when revision numbers are no
longer comparable (e.g., because the database has been replaced).
2015-08-13 23:52:51 +02:00
Austin Clements
7f57b747b9 lib: Add per-message last modification tracking
This adds a new document value that stores the revision of the last
modification to message metadata, where the revision number increases
monotonically with each database commit.

An alternative would be to store the wall-clock time of the last
modification of each message.  In principle this is simpler and has
the advantage that any process can determine the current timestamp
without support from libnotmuch.  However, even assuming a computer's
clock never goes backward and ignoring clock skew in networked
environments, this has a fatal flaw.  Xapian uses (optimistic)
snapshot isolation, which means reads can be concurrent with writes.
Given this, consider the following time line with a write and two read
transactions:

   write  |-X-A--------------|
   read 1       |---B---|
   read 2                      |---|

The write transaction modifies message X and records the wall-clock
time of the modification at A.  The writer hangs around for a while
and later commits its change.  Read 1 is concurrent with the write, so
it doesn't see the change to X.  It does some query and records the
wall-clock time of its results at B.  Transaction read 2 later starts
after the write commits and queries for changes since wall-clock time
B (say the reads are performing an incremental backup).  Even though
read 1 could not see the change to X, read 2 is told (correctly) that
X has not changed since B, the time of the last read.  In fact, X
changed before wall-clock time A, but the change was not visible until
*after* wall-clock time B, so read 2 misses the change to X.

This is tricky to solve in full-blown snapshot isolation, but because
Xapian serializes writes, we can use a simple, monotonically
increasing database revision number.  Furthermore, maintaining this
revision number requires no more IO than a wall-clock time solution
because Xapian already maintains statistics on the upper (and lower)
bound of each value stream.
2015-08-13 23:52:51 +02:00
David Bremner
765556c1f1 build: extract library versions from notmuch.h
- Make lib/notmuch.h the canonical location for the library versioning
information.

- Since the release-check should never fail now, remove it to reduce
complexity.

- Make the version numbers in notmuch.h consistent with the (now
  deleted) ones in lib/Makefile.local
2015-08-10 13:53:55 +02:00
David Bremner
6b440a0adf lib: add public accessor for database from query
This is to make it easier for clients of the library to update to the
new error code returning versions of notmuch_query_search_messages
2015-08-04 09:11:34 +02:00
David Bremner
4fed7047b2 lib: deprecate notmuch_query_search_{threads, messages}
The CLI (and bindings) code should really be updated to use the new
status-code-returning versions. Here are some warnings to prod us (and
other clients) to do so.
2015-08-04 09:11:25 +02:00
David Bremner
7e2d0ef105 lib: define NOTMUCH_DEPRECATED macro, document its use.
This has been tested with gcc and clang.
2015-08-04 09:11:17 +02:00
Austin Clements
e6ad3a5dd4 lib: Only sync modified message documents
Previously, we updated the database copy of a message on every call to
_notmuch_message_sync, even if nothing had changed.  In particular,
this always happens on a thaw, so a freeze/thaw pair with no
modifications between still caused a database update.

We only modify message documents in a handful of places, so keep track
of whether the document has been modified and only sync it when
necessary.  This will be particularly important when we add message
revision tracking.
2015-08-04 08:54:46 +02:00
David Bremner
882ccb7e49 build: add "set -eu" to version script generation
It turns out that on certain systems like FreeBSD, c++filt is not
installed by default. It's basically OK if we fail the build in that
case, but what's really not OK is for the build to continue and
generate bad binaries.
2015-07-28 21:34:01 +02:00
David Bremner
53035dafe0 lib, ruby: make use of -Wl,--no-undefined configurable
In particular this is supposed to help build on systems (presumably
using a non-gnu ld) where this flag is not available.
2015-06-13 17:52:48 +02:00
David Bremner
32fd74b7aa lib: reject relative paths in n_d_{create,open}_verbose
There are many places in the notmuch code where the path is assumed to be absolute. If someone (TM) wants a project, one could remove these assumptions. In the mean time, prevent users from shooting themselves in the foot.

Update test suite mark tests for this error as no longer broken, and
also convert some tests that used relative paths for nonexistent
directories.
2015-06-12 07:34:50 +02:00
David Bremner
b59ad1a9cc lib: add NOTMUCH_STATUS_PATH_ERROR
The difference with FILE_ERROR is that this is for things that are
wrong with the path before looking at the disk.

Add some 3 tests; two broken as a reminder to actually use this new
code.
2015-06-12 07:34:47 +02:00
J. Lewis Muir
d08af93c65 cli: change "setup" to "set up" where used as a verb
The word "setup" is a noun, not a verb.  Change occurrences of "setup"
where used as a verb to "set up".
2015-05-31 19:14:42 +02:00
David Bremner
9d192da683 lib: eliminate fprintf from _notmuch_message_file_open
You may wonder why _notmuch_message_file_open_ctx has two parameters.
This is because we need sometime to use a ctx which is a
notmuch_message_t. While we could get the database from this, there is
no easy way in C to tell type we are getting.
2015-03-29 00:34:15 +01:00
David Bremner
736ac26407 lib: replace almost all fprintfs in library with _n_d_log
This is not supposed to change any functionality from an end user
point of view. Note that it will eliminate some output to stderr. The
query debugging output is left as is; it doesn't really fit with the
current primitive logging model. The remaining "bad" fprintf will need
an internal API change.
2015-03-29 00:34:15 +01:00