Commit graph

156 commits

Author SHA1 Message Date
David Bremner
33dd5fdc69 lib: catch Xapian exceptions in n_m_add_tag
This is mostly just (horizontal) code movement due to wrapping
everything in a try / catch.
2020-07-14 07:31:45 -03:00
David Bremner
96befd0dd0 lib: catch Xapian exceptions in n_m_count_files
This will require some care for the caller to check the sign, and not
just add error returns into a running total.
2020-07-14 07:31:37 -03:00
David Bremner
00f1abfdf4 lib: catch Xapian exceptions in n_m_get_tags
This allows the function to return an error value rather than
crashing.
2020-07-14 07:12:52 -03:00
David Bremner
e404d8a51d lib: use LOG_XAPIAN_EXCEPTION in n_m_get_date
This should not change functionality, but does slightly reduce code
duplication. Perhaps more importantly it allows consistent changes to
all of the similar exception handling in message.cc.
2020-07-14 07:12:52 -03:00
David Bremner
286161b703 lib: catch exceptions in n_m_get_filenames
This is essentially copied from the change to notmuch_message_get_filename
2020-07-13 07:19:22 -03:00
David Bremner
a606cba32b lib/n_m_g_filename: catch Xapian exceptions, document NULL return
This is the same machinery as applied for

     notmuch_message_get_{thread,message}_id
2020-07-13 07:19:22 -03:00
David Bremner
9201c50204 lib/message: use LOG_XAPIAN_EXCEPTION in n_m_get_header
This is just for consistency, and a small reduction in the amount of
boilerplate.
2020-07-13 07:19:22 -03:00
David Bremner
dbdb860bb9 lib/message: catch exception in n_m_get_thread_id
This allows us to return an error value from the library.
2020-07-03 21:04:43 -03:00
David Bremner
87d462a204 lib: catch error from closed db in n_m_get_message_id
By catching it at the library top level, we can return an error value.
2020-07-03 21:03:51 -03:00
David Bremner
45cfeb2e55 lib: replace STRNCMP_LITERAL in __message_remove_indexed_terms
strncmp looks for a prefix that matches, which is very much not what
we want here. This fixes the bug reported by Franz Fellner in
id:1588595993-ner-8.651@TPL520
2020-05-04 10:55:43 -03:00
uncrustify
2b62ca2e3b lib: run uncrustify
This is the result of running

     $ uncrustify --replace --config ../devel/uncrustify.cfg *.c *.h *.cc

in the lib directory
2019-06-14 07:41:27 -03:00
Daniel Kahn Gillmor
5c3a44681f indexing: record protected subject when indexing cleartext
When indexing the cleartext of an encrypted message, record any
protected subject in the database, which should make it findable and
visible in search.

Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2019-05-29 08:14:44 -03:00
David Bremner
75bdce7952 lib: support user prefix names in term generation
This should not change the indexing process yet as nothing calls
_notmuch_message_gen_terms with a user prefix name. On the other hand,
it should not break anything either.

_notmuch_database_prefix does a linear walk of the list of (built-in)
prefixes, followed by a logarithmic time search of the list of user
prefixes. The latter is probably not really noticable.
2019-05-25 07:17:27 -03:00
David Bremner
97939170b3 n_m_remove_indexed_terms: reduce number of Xapian API calls.
Previously this functioned scanned every term attached to a given
Xapian document. It turns out we know how to read only the terms we
need to preserve (and we might have already done so). This commit
replaces many calls to Xapian::Document::remove_term with one call to
::clear_terms, and a (typically much smaller) number of calls to
::add_term. Roughly speaking this is based on the assumption that most
messages have more text than they have tags.

According to the performance test suite, this yields a roughly 40%
speedup on "notmuch reindex '*'"
2019-05-23 08:00:56 -03:00
David Bremner
319dd95ebb lib: add 'body:' field, stop indexing headers twice.
The new `body:` field (in Xapian terms) or prefix (in slightly
sloppier notmuch) terms allows matching terms that occur only in the
body.

Unprefixed query terms should continue to match anywhere (header or
body) in the message.

This follows a suggestion of Olly Betts to use the facility (since
Xapian 1.0.4) to add the same field with multiple prefixes. The double
indexing of previous versions is thus replaced with a query time
expension of unprefixed query terms to the various prefixed
equivalent.

Reindexing will be needed for 'body:' searches to work correctly;
otherwise they will also match messages where the term occur in
headers (demonstrated by the new tests in T530-upgrade.sh)
2019-04-17 08:48:16 -03:00
David Bremner
0a7181dd16 lib: calculate message depth in thread
This will be used in reparenting messages without useful in-reply-to,
but with useful references
2018-09-06 08:07:13 -03:00
David Bremner
d0b844b358 lib: read reference terms into message struct.
The plan is to use these in resolving threads.
2018-09-06 08:07:12 -03:00
David Bremner
9b568e73e1 lib/thread: sort sibling messages by date
For non-root messages, this should not should anything currently, as
the messages are already added in date order. In the future we will
add some non-root messages in a second pass out of order and the
sorting will be useful. It does fix the order of multiple
root-messages (although it is overkill for that).
2018-09-06 08:07:12 -03:00
Daniel Kahn Gillmor
6a9f26b4a0 lib: make notmuch_message_get_database() take a const notmuch_message_t*
This is technically an API change, but it is not an ABI change, and
it's merely a statement that limits what the library can do.

This is in parallel to notmuch_query_get_database(), which also takes
a const pointer.
2018-05-26 07:32:01 -07:00
Daniel Kahn Gillmor
9088db76d8 lib: expose notmuch_message_get_database()
We've had _notmuch_message_database() internally for a while, and it's
useful.  It turns out to be useful on the other side of the library
interface as well (i'll use it later in this series for "notmuch
show"), so we expose it publicly now.
2018-05-26 07:30:32 -07:00
David Bremner
f0131af6c5 lib: define specialized get_thread_id for use in thread subquery
The observation is that we are only using the messages to get there
thread_id, which is kindof a pessimal access pattern for the current
notmuch_message_get_thread_id
2018-05-07 08:42:53 -03:00
Daniel Kahn Gillmor
6a9626a2fd cli/reindex: destroy stashed session keys when --decrypt=false
There are some situations where the user wants to get rid of the
cleartext index of a message.  For example, if they're indexing
encrypted messages normally, but suddenly they run across a message
that they really don't want any trace of in their index.

In that case, the natural thing to do is:

   notmuch reindex --decrypt=false id:whatever@example.biz

But of course, clearing the cleartext index without clearing the
stashed session key is just silly.  So we do the expected thing and
also destroy any stashed session keys while we're destroying the index
of the cleartext.

Note that stashed session keys are stored in the xapian database, but
xapian does not currently allow safe deletion (see
https://trac.xapian.org/ticket/742).

As a workaround, after removing session keys and cleartext material
from the database, the user probably should do something like "notmuch
compact" to try to purge whatever recoverable data is left in the
xapian freelist.  This problem really needs to be addressed within
xapian, though, if we want it fixed right.
2017-12-08 08:08:47 -04:00
Daniel Kahn Gillmor
4dfcc8c9b2 crypto: index encrypted parts when indexopts try_decrypt is set.
If we see index options that ask us to decrypt when indexing a
message, and we encounter an encrypted part, we'll try to descend into
it.

If we can decrypt, we add the property index.decryption=success.

If we can't decrypt (or recognize the encrypted type of mail), we add
the property index.decryption=failure.

Note that a single message may have both values of the
"index.decryption" property: "success" and "failure".  For example,
consider a message that includes multiple layers of encryption.  If we
manage to decrypt the outer layer ("index.decryption=success"), but
fail on the inner layer ("index.decryption=failure").

Because of the property name, this will be automatically cleared (and
possibly re-set) during re-indexing.  This means it will subsequently
correspond to the actual semantics of the stored index.
2017-10-21 19:53:19 -03:00
Daniel Kahn Gillmor
0bb05ff693 reindex: drop all properties named with prefix "index."
This allows us to create new properties that will be automatically set
during indexing, and cleared during re-indexing, just by choice of
property name.
2017-10-21 19:53:08 -03:00
Jani Nikula
008a5e92eb lib: convert notmuch_bool_t to stdbool internally
C99 stdbool turned 18 this year. There really is no reason to use our
own, except in the library interface for backward
compatibility. Convert the lib internally to stdbool.
2017-10-09 22:27:16 -03:00
David Bremner
debfae20db lib: enforce that n_message_reindex takes headers from first file
This is still a bit stopgap to be only choosing one set of headers,
but this seems like a more defensible set of headers to choose.
2017-09-05 21:51:57 -03:00
David Bremner
0a40ea4b48 lib: add notmuch_message_has_maildir_flag
I considered a higher level interface where the caller passes a tag
name rather than a flag character, but the role of the "unread" tag is
particularly confusing with such an interface.
2017-08-29 21:56:21 -03:00
David Bremner
8a8fb39b0c lib/message: split n_m_maildir_flags_tags, store maildir flags
In a future commit this will allow querying maildir flags seperately
from tags to allow resolving certain conflicts.
2017-08-29 21:51:10 -03:00
Daniel Kahn Gillmor
eb232ee0ab reindex: drop notmuch_param_t, use notmuch_indexopts_t instead
There are at least three places in notmuch that can trigger an
indexing action:

 * notmuch new
 * notmuch insert
 * notmuch reindex

I have plans to add some indexing options (e.g. indexing the cleartext
of encrypted parts, external filters, automated property injection)
that should properly be available in all places where indexing
happens.

I also want those indexing options to be exposed by (and constrained
by) the libnotmuch C API.

This isn't yet an API break because we've never made a release with
notmuch_param_t.

These indexing options are relevant in the listed places (and in the
libnotmuch analogues), but they aren't relevant in the other kinds of
functionality that notmuch offers (e.g. dump/restore, tagging, search,
show, reply).

So i think a generic "param" object isn't well-suited for this case.
In particular:

 * a param object sounds like it could contain parameters for some
   other (non-indexing) operation.  This sounds confusing -- why would
   i pass non-indexing parameters to a function that only does
   indexing?

 * bremner suggests online a generic param object would actually be
   passed as a list of param objects, argv-style.  In this case (at
   least in the obvious argv implementation), the params might be some
   sort of generic string.  This introduces a problem where the API of
   the library doesn't grow as new options are added, which means that
   when code outside the library tries to use a feature, it first has
   to test for it, and have code to handle it not being available.
   The indexopts approach proposed here instead makes it clear at
   compile time and at dynamic link time that there is an explicit
   dependency on that feature, which allows automated tools to keep
   track of what's needed and keeps the actual code simple.

My proposal adds the notmuch_indexopts_t as an opaque struct, so that
we can extend the list of options without causing ABI breakage.

The cost of this proposal appears to be that the "boilerplate" API
increases a little bit, with a generic constructor and destructor
function for the indexopts struct.

More patches will follow that make use of this indexopts approach.
2017-08-23 07:55:12 -03:00
Daniel Kahn Gillmor
5b93fa6e70 lib: add notmuch_message_reindex
This new function asks the database to reindex a given message.
The parameter `indexopts` is currently ignored, but is intended to
provide an extensible API to support e.g. changing the encryption or
filtering status (e.g. whether and how certain non-plaintext parts are
indexed).
2017-08-01 21:17:47 -04:00
David Bremner
34d7753992 lib: add _notmuch_message_remove_indexed_terms
Testing will be provided via use in notmuch_message_reindex
2017-08-01 21:17:47 -04:00
David Bremner
8a8e2b11c2 lib: add notmuch_message_count_files
This operation is relatively inexpensive, as the needed metadata is
already computed by our lazy metadata fetching. The goal is to support
better UI for messages with multipile files.
2017-08-01 21:17:47 -04:00
David Bremner
c040464a7c lib: wrap use of g_mime_utils_header_decode_date
This changes return type in gmime 3.0
2017-07-14 21:23:52 -03:00
Jani Nikula
30c475c1ef build: visibility=default for library structs is no longer needed
Commit d5523ead90 ("Mark some structures in the library interface
with visibility=default attribute.") fixed some mixed visibility
issues with structs. With the symbol default visibility reversed, this
is no longer a problem.
2017-05-13 08:38:18 -03:00
Fredrik Fornwall
e565118172 Replace index(3) with strchr(3)
The index(3) function has been deprecated in POSIX since 2001 and
removed in 2008, and most code in notmuch already calls strchr(3).

This fixes a compilation error on Android whose libc does not have
index(3).
2017-04-20 06:59:22 -03:00
David Bremner
5ce8e0b11b lib: replace deprecated n_q_count_messages with status returning version
This function was deprecated in notmuch 0.21.  We re-use the name for
a status returning version, and deprecate the _st name. One or two
remaining uses of the (removed) non-status returning version fixed at
the same time
2017-03-22 08:35:07 -03:00
David Bremner
a8a2705222 Merge branch 'release'
Merge in memory fixes
2017-03-18 21:02:42 -03:00
Tomi Ollila
06adc27668 lib/message.cc: fix Coverity finding (use after free)
The object where pointer to `data` was received was deleted before
it was used in _notmuch_string_list_append().

Relevant Coverity messages follow:

3: extract
Assigning: data = std::__cxx11::string(message->doc.()).c_str(),
which extracts wrapped state from temporary of type std::__cxx11::string.

4: dtor_free
The internal representation of temporary of type std::__cxx11::string
is freed by its destructor.

5: use after free:
Wrapper object use after free (WRAPPER_ESCAPE)
Using internal representation of destroyed object local data.
2017-03-18 20:59:46 -03:00
David Bremner
62822a4e2d lib: clamp return value of g_mime_utils_header_decode_date to >=0
For reasons not completely understood at this time, gmime (as of
2.6.22) is returning a date before 1900 on bad date input. Since this
confuses some other software, we clamp such dates to 0,
i.e. 1970-01-01.
2017-03-15 21:58:25 -03:00
David Bremner
7bd63833bf lib/message.cc: use view number to invalidate cached metadata
Currently the view number is incremented by notmuch_database_reopen
2017-02-25 21:15:38 -04:00
David Bremner
e0b22c139c lib: handle DatabaseModifiedError in _n_message_ensure_metadata
The retries are hardcoded to a small number, and error handling aborts
than propagating errors from notmuch_database_reopen. These are both
somewhat justified by the assumption that most things that can go
wrong in Xapian::Database::reopen are rare and fatal. Here's the brief
discussion with Xapian upstream:

   24-02-2017 08:12:57 < bremner> any intuition about how likely
      Xapian::Database::reopen is to fail? I'm catching a
      DatabaseModifiedError somewhere where handling any further errors is
      tricky, and wondering about treating a failed reopen as as "the
      impossible happened, stopping"

   24-02-2017 16:22:34 < olly> bremner: there should not be much scope for
    failure - stuff like out of memory or disk errors, which are probably a
    good enough excuse to stop
2017-02-25 21:13:50 -04:00
David Bremner
884dccf293 lib: make _notmuch_message_ensure_property_map static
It's not called outside message.cc
2017-02-23 08:54:36 -04:00
David Bremner
3db9e94b0e lib: make _notmuch_message_ensure_metadata static
It's not called anywhere outside message.cc.
2017-02-23 08:54:25 -04:00
David Bremner
b8bb6d7964 lib: basic message-property API
Initially, support get, set and removal of single key/value pair, as
well as removing all properties.
2016-09-21 18:14:24 -03:00
David Bremner
4dfb69169e lib: read "property" terms from messages.
This is a first step towards providing an API to attach
arbitrary (key,value) pairs to messages and retrieve all of the values
for a given key.
2016-09-21 18:14:24 -03:00
Daniel Kahn Gillmor
6a833a6e83 Use https instead of http where possible
Many of the external links found in the notmuch source can be resolved
using https instead of http.  This changeset addresses as many as i
could find, without touching the e-mail corpus or expected outputs
found in tests.
2016-06-05 08:32:17 -03:00
Tomi Ollila
cf09631a45 lib: whitespace cleanup
Cleaned the following whitespace in lib/* files:

lib/index.cc:              1 line:  trailing whitespace
lib/database.cc            5 lines: 8 spaces at the beginning of line
lib/notmuch-private.h:     4 lines: 8 spaces at the beginning of line
lib/message.cc:            1 line:  trailing whitespace
lib/sha1.c:                1 line:  empty lines at the end of file
lib/query.cc:              2 lines: 8 spaces at the beginning of line
lib/gen-version-script.sh: 1 line:  trailing whitespace
2016-06-05 08:23:28 -03:00
Daniel Kahn Gillmor
e366bb2227 complete ghost-on-removal-when-shared-thread-exists
To fully complete the ghost-on-removal-when-shared-thread-exists
proposal, we need to clear all ghost messages when the last active
message is removed from a thread.

Amended by db: Remove the last test of T530, as it no longer makes sense
if we are garbage collecting ghost messages.
2016-04-15 07:13:49 -03:00
Daniel Kahn Gillmor
1695415039 On deletion, replace with ghost when other active messages in thread
There is no need to add a ghost message upon deletion if there are no
other active messages in the thread.

Also, if the message being deleted was a ghost already, we can just go
ahead and delete it.
2016-04-15 07:07:23 -03:00
Daniel Kahn Gillmor
9eebae3da4 Introduce _notmuch_message_has_term()
It can be useful to easily tell if a given message has a given term
associated with it.
2016-04-15 07:07:23 -03:00