Currently I don't know of a good way of testing this, but at least in
principle a Xapian exception in _notmuch_message_{add,remove}_term
would cause an abort in the library.
This should not change functionality, but does slightly reduce code
duplication. Perhaps more importantly it allows consistent changes to
all of the similar exception handling in message.cc.
This will be mandatory as of Xapian 1.5. The API is also more
consistent with the FieldProcessor API, which helps code re-use a bit.
Note that this switches to using the built-in Xapian support for
prefixes on ranges (i.e. deleted code at beginning of
ParseTimeRangeProcessor::operator(), added prefix to constructor).
Another side effect of the migration is that we are generating smaller
queries, using one OP_VALUE_RANGE instead of an AND of two OP_VALUE_*
queries.
As we prepare to handle S/MIME-encrypted PKCS#7 EnvelopedData (which
is not multipart), we don't want to be limited to passing only
GMimeMultipartEncrypted MIME parts to _notmuch_crypto_decrypt.
There is no functional change here, just a matter of adjusting how we
pass arguments internally.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
When we are indexing, we should treat SignedData parts the same way
that we treat a multipart object, indexing the wrapped part as a
distinct MIME object.
Unfortunately, this means doing some sort of cryptographic
verification whose results we throw away, because GMime doesn't offer
us any way to unwrap without doing signature verification.
I've opened https://github.com/jstedfast/gmime/issues/67 to request
the capability from GMime but for now, we'll just accept the
additional performance hit.
As we do this indexing, we also apply the "signed" tag, by analogy
with how we handle multipart/signed messages. These days, that kind
of change should probably be done with a property instead, but that's
a different set of changes. This one is just for consistency.
Note that we are currently *only* handling signedData parts, which are
basically clearsigned messages. PKCS#7 parts can also be
envelopedData and authEnvelopedData (which are effectively encryption
layers), and compressedData (which afaict isn't implemented anywhere,
i've never encountered it). We're laying the groundwork for indexing
these other S/MIME types here, but we're only dealing with signedData
for now.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
strncmp looks for a prefix that matches, which is very much not what
we want here. This fixes the bug reported by Franz Fellner in
id:1588595993-ner-8.651@TPL520
Xapian 1.4 is over 3 years old now (1.4.0 released 2016-06-24),
and 1.2 has been deprecated in Notmuch version 0.27 (2018-06-13).
Xapian 1.4 supports compaction, field processors and retry locking;
conditionals checking compaction and field processors were removed
but user may want to disable retry locking at configure time so it
is kept.
Apparently doxygen needs its comments formatted in a specific way to
notice that the group is closed.
Without this fix, with doxygen 1.8.16-2 we see:
```
doxygen ./doc/doxygen.cfg
…/notmuch/lib/notmuch.h:2322: warning: end of file while inside a group
```
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
The documentation for notmuch_config_list_key warns that that the
returned value will be destroyed by the next call to
notmuch_config_list_key, but it neglected to mention that calling
notmuch_config_list_value would also destroy it (by calling
notmuch_config_list_key). This is surprising, and caused a use after
free bug in _setup_user_query_fields (first noticed by an OpenBSD
porter, so kudos to the OpenBSD malloc implementation). This change
fixes that use-after-free bug.
When encountering a message that has been mangled in the "mixed up"
way by an intermediate MTA, notmuch should instead repair it and index
the repaired form.
When it does this, it also associates the index.repaired=mixedup
property with the message. If a problem is found with this repair
process, or an improved repair process is proposed later, this should
make it easy for people to reindex the relevant message. The property
will also hopefully make it easier to diagnose this particular problem
in the future.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
When we notice a legacy-display part during indexing, it makes more
sense to avoid indexing it as part of the message body.
Given that the protected subject will already be indexed, there is no
need to index this part at all, so we skip over it.
If this happens during indexing, we set a property on the message:
index.repaired=skip-protected-headers-legacy-display
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Our _notmuch_message_crypto_potential_payload implementation could
only return a failure if bad arguments were passed to it. It is an
internal function, so if that happens it's an entirely internal bug
for notmuch.
It will be more useful for this function to return whether or not the
part is in fact a cryptographic payload, so we dispense with the
status return.
If some future change suggests adding a status return back, there are
only a handful of call sites, and no pressure to retain a stable API,
so it could be changed easily. But for now, go with the simpler
function.
We will use this return value in future patches, to make different
decisions based on whether a part is the cryptographic payload or not.
But for now, we just leave the places where it gets invoked marked
with (void) to show that the result is ignored.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
This adds no functionality directly, but is a useful starting point
for adding new repair functionality.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
When indexing the cleartext of an encrypted message, record any
protected subject in the database, which should make it findable and
visible in search.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
This should not change the indexing process yet as nothing calls
_notmuch_message_gen_terms with a user prefix name. On the other hand,
it should not break anything either.
_notmuch_database_prefix does a linear walk of the list of (built-in)
prefixes, followed by a logarithmic time search of the list of user
prefixes. The latter is probably not really noticable.
This will be used to avoid needing a database access to resolve a db
prefix from the corresponding UI prefix (e.g. when indexing). Arguably
the setup of the separate header map does not belong here, since it is
about indexing rather than querying, but we currently don't have any
other indexing setup to do.
Previously this functioned scanned every term attached to a given
Xapian document. It turns out we know how to read only the terms we
need to preserve (and we might have already done so). This commit
replaces many calls to Xapian::Document::remove_term with one call to
::clear_terms, and a (typically much smaller) number of calls to
::add_term. Roughly speaking this is based on the assumption that most
messages have more text than they have tags.
According to the performance test suite, this yields a roughly 40%
speedup on "notmuch reindex '*'"
Without this,
$ make time-test OPTIONS=--small
leads to fatal errors from too many open files.
Thanks to st-gourichon-fid for bringing this problem to my attention in IRC.
Rather than storing the lower level stdio FILE object, we store a
GMime stream. This allows both transparent decompression, and passing
the stream into GMime for parsing. As a side effect, we can let GMime
close the underlying OS stream (indeed, that stream isn't visible here
anymore).
This change is enough to get notmuch-{new,search} working, but there is still
some work required for notmuch-show, to be done in a following commit.
This is a functional change, not a straight translation, because we
are no longer directly invoking g_mime_parser_options_get_default(),
but the GMime source has indicated that the options parameter for
g_mime_parser_construct_message() is "nullable" since upstream commit
d0ebdd2ea3e6fa635a2a551c846e9bc8b6040353 (which itself precedes GMime
3.0).
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Several GMime 2.6 functions sprouted a change in the argument order in
GMime 3.0. We had a compatibility layer here to be able to handle
compiling against both GMime 2.6 and 3.0. Now that we're using 3.0
only, rip out the compatibility layer for those functions with changed
argument lists, and explicitly use the 3.0 argument lists.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Several of these #defines were not actually used in the notmuch
codebase any longer. And as of GMime 3.0, g_mime_init takes no
arguments, so we can also drop the bogus RFC2047 argument that we were
passing and then #defining away.
signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
This means dropping GMimeCryptoContext and notmuch_config arguments.
All the argument changes are to internal functions, so this is not an
API or ABI break.
We also get to drop the #define for g_mime_3_unused.
signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
In _index_mime_part, we don't need to extract the content-type from
the part until just before we use it, so we also defer it lazily.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
The new `body:` field (in Xapian terms) or prefix (in slightly
sloppier notmuch) terms allows matching terms that occur only in the
body.
Unprefixed query terms should continue to match anywhere (header or
body) in the message.
This follows a suggestion of Olly Betts to use the facility (since
Xapian 1.0.4) to add the same field with multiple prefixes. The double
indexing of previous versions is thus replaced with a query time
expension of unprefixed query terms to the various prefixed
equivalent.
Reindexing will be needed for 'body:' searches to work correctly;
otherwise they will also match messages where the term occur in
headers (demonstrated by the new tests in T530-upgrade.sh)