Currently, notmuch has the levers needed to set coherent crypto policy
around how cleartext is indexed, which also has an impact on how
messages are rendered. But we don't have a lot of documentation about
how to do sensible things. This is an initial attempt to address
that.
The first example shows a way to selectively index specific messages.
The next two examples are about aligning the existing database with
crypto indexing policy
The default crypto policy is to not index cleartext, and to only
decrypt messages on display when explicitly requested.
The other sensible crypto policy is to index cleartext while stashing
session keys. messages indexed in this way will be searchable, and
will be decrypted on display automatically unless the user explicitly
asks for it to *not* be decrypted.
The policy for indexing *new* messages is stored in the database as
the config variable index.decrypt.
But setting policy for new messages doesn't retroactively affect
already indexed messages.
This patch attempts to document ways that someone can efficiently
align their pre-existing database with their new policy.
I'm not sure this is the right place to document these examples, but i
do want them to be user-facing and relatively easy to find. I'm happy
to entertain suggestions for where else we should put them.
In some cases (e.g. when building a publicly-visible e-mail archive)
it doesn't make any sense to restrict visibility of the message to the
current user account.
This adds a --world-readable boolean option for "notmuch insert", so
that those who want to archive their mail publicly can feed their
archiver with:
notmuch insert --world-readable
Other local delivery agents (postfix's local, and dovecot's lda) all
default to delivery in mode 0600 rather than relying on the user's
umask, so this fix doesn't change the default.
Also, this does not override the user's umask. if the umask is
already set tight, it will not become looser as the result of passing
--world-readable.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Need to be clearer about specifying time ranges using timestamps.
Legacy syntax which predates the date prefix is still supported, but
timestamps used in conjunction with the date prefix require additional
syntax.
Make all parameter descriptions etc. use reStructuredText definition
lists with uniform style and indentation. Remove redundant indentation
from around the lists. Remove blank lines between term lines and
definition blocks. Use four spaces for indentation.
This is almost completely whitespace and paragraph reflow changes.
This brings the --decrypt argument to "notmuch reply" into line with
the other --decrypt arguments (in "show", "new", "insert", and
"reindex"). This patch is really just about bringing consistency to
the user interface.
We also use the recommended form in the emacs MUA when replying, and
update test T350 to match.
We also expand tab completion for it, update the emacs bindings, and
update T350, T357, and T450 to match.
Make use of the bool-to-keyword backward-compatibility feature.
Add support for using /<regex>/ style regular expressions in
new.ignore, mixed with the old style verbatim file and directory
basenames. The regex is matched against the relative path from the
database path.
Having first a list of prefixes followed by detailed descriptions was
viable when we didn't have all that many prefixes. Now, arranging the
prefix descriptions in a definition list makes more sense.
While at it, include all the supported prefix forms, especially some
missing regex ones.
Now that the range of sensible decryption policies has come into full
view, we take a bit of space to document the distinctions.
Most people will use either "auto" or "true" -- but we provide "false"
and "nostash" to handle use cases that might reasonably be requested.
Note also that these can be combined in sensible ways. Like, if your
mail comes in regularly to a service that doesn't have access to your
secret keys, but does have access to your index, and you feel
comfortable adding selected encrypted messages to the index after
you've read them, you could stay in "auto" normally, and then when you
find yourself reading an indexable message (e.g. one you want to be
able to search for in the future, and that you don't mind exposing to
whatever entities have access to your inde), you can do:
notmuch reindex --decrypt=true id:whatever@example.biz
That leaves your default the same (still "auto") but you get the
cleartext index and stashed session key benefits for that particular
message.
Here's the configuration choice for people who want a cleartext index,
but don't want stashed session keys.
Interestingly, this "nostash" decryption policy is actually the same
policy that should be used by "notmuch show" and "notmuch reply",
since they never modify the index or database when they are invoked
with --decrypt.
We take advantage of this parallel to tune the behavior of those
programs so that we're not requesting session keys from GnuPG during
"show" and "reply" that we would then otherwise just throw away.
If you're going to store the cleartext index of an encrypted message,
in most situations you might just as well store the session key.
Doing this storage has efficiency and recoverability advantages.
Combined with a schedule of regular OpenPGP subkey rotation and
destruction, this can also offer security benefits, like "deletable
e-mail", which is the store-and-forward analog to "forward secrecy".
But wait, i hear you saying, i have a special need to store cleartext
indexes but it's really bad for me to store session keys! Maybe
(let's imagine) i get lots of e-mails with incriminating photos
attached, and i want to be able to search for them by the text in the
e-mail, but i don't want someone with access to the index to be
actually able to see the photos themselves.
Fret not, the next patch in this series will support your wacky
uncommon use case.
There are some situations where the user wants to get rid of the
cleartext index of a message. For example, if they're indexing
encrypted messages normally, but suddenly they run across a message
that they really don't want any trace of in their index.
In that case, the natural thing to do is:
notmuch reindex --decrypt=false id:whatever@example.biz
But of course, clearing the cleartext index without clearing the
stashed session key is just silly. So we do the expected thing and
also destroy any stashed session keys while we're destroying the index
of the cleartext.
Note that stashed session keys are stored in the xapian database, but
xapian does not currently allow safe deletion (see
https://trac.xapian.org/ticket/742).
As a workaround, after removing session keys and cleartext material
from the database, the user probably should do something like "notmuch
compact" to try to purge whatever recoverable data is left in the
xapian freelist. This problem really needs to be addressed within
xapian, though, if we want it fixed right.
The new "auto" decryption policy is not only good for "notmuch show"
and "notmuch reindex". It's also useful for indexing messages --
there's no good reason to not try to go ahead and index the cleartext
of a message that we have a stashed session key for.
This change updates the defaults and tunes the test suite to make sure
that they have taken effect.
The stashed session keys are stored internally as notmuch properties.
So a user or developer who is reading about those properties might
want to understand how they fit into the bigger picture.
Note here that decrypting with a stored session key no longer needs
-decrypt for "notmuch show" and "notmuch reply".
When showing a message, if the user doesn't specify --decrypt= at all,
but a stashed session key is known to notmuch, notmuch should just go
ahead and try to decrypt the message with the session key (without
bothering the user for access to their asymmetric secret key).
The user can disable this at the command line with --decrypt=false if
they really don't want to look at the e-mail that they've asked
notmuch to show them.
and of course, "notmuch show --decrypt" still works for accessing the
user's secret keys if necessary.
If the user doesn't specify --decrypt= at all, but a stashed session
key is known to notmuch, when replying to an encrypted message,
notmuch should just go ahead and decrypt.
The user can disable this at the command line with --decrypt=false,
though it's not clear why they would ever want to do that.
This new automatic decryption policy should make it possible to
decrypt messages that we have stashed session keys for, without
incurring a call to the user's asymmetric keys.
the command-line interface for indexing (reindex, new, insert) used
--try-decrypt; and the configuration records used index.try_decrypt.
But by comparison with "show" and "reply", there doesn't seem to be
any reason for the "try" prefix.
This changeset adjusts the command-line interface and the
configuration interface.
For the moment, i've left indexopts_{set,get}_try_decrypt alone. The
subsequent changeset will address those.
When doing any decryption, if the notmuch database knows of any
session keys associated with the message in question, try them before
defaulting to using default symmetric crypto.
This changeset does the primary work in _notmuch_crypto_decrypt, which
grows some new parameters to handle it.
The primary advantage this patch offers is a significant speedup when
rendering large encrypted threads ("notmuch show") if session keys
happen to be cached.
Additionally, it permits message composition without access to
asymmetric secret keys ("notmuch reply"); and it permits recovering a
cleartext index when reindexing after a "notmuch restore" for those
messages that already have a session key stored.
Note that we may try multiple decryptions here (e.g. if there are
multiple session keys in the database), but we will ignore and throw
away all the GMime errors except for those that come from last
decryption attempt. Since we don't necessarily know at the time of
the decryption that this *is* the last decryption attempt, we'll ask
for the errors each time anyway.
This does nothing if no session keys are stashed in the database,
which is fine. Actually stashing session keys in the database will
come as a subsequent patch.
Now that it's easy to add argument specific modifiers in opt
descriptions, add a new .allow_empty field to allow empty strings for
individual string arguments while retaining strict checks
elsewhere. Use this for notmuch insert --folder, where the empty
string means top level folder.
Enable override of the index.try_decrypt setting on a per-run basis
when invoking "notmuch reindex". This allows the possibility of (for
example) an emacs keybinding that adds the cleartext of the currently
shown decrypted message to the index, making it searchable in the
future.
It also enables one-time indexing of all messages matching some query,
like so:
notmuch reindex tag:encrypted and\
not property:index.decryption=success and\
from:alice@example.org
We also update the documentation and tab completion, and add a few
more tests.
Enable override of the index.try_decrypt setting on a per-message
basis when invoking "notmuch insert".
We also update the documentation and tab completion, and add more tests.
Enable override of the index.try_decrypt setting during "notmuch new"
on a per-invocation basis.
We update the documentation and tab completion, and also add a test.
By default, notmuch won't try to decrypt on indexing. With this
patch, we make it possible to indicate a per-database preference using
the config variable "index.try_decrypt", which by default will be
false.
At indexing time, the database needs some way to know its internal
defaults for how to index encrypted parts. It shouldn't be contingent
on an external config file (since that can't be retrieved from the
database object itself), so we store it in the database.
This behaves similarly to the query.* configurations, which are also
stored in the database itself, so we're not introducing any new
dependencies by requiring that it be stored in the database.
QUERY_STRING was only used in two places, both to test whether a
variable should be stored in (or retrieved from) the database.
Since other configuration variables might be stored in the database in
the future, consolidate that test into a single function.
We also document that these configuration options should not be placed
in the config file.
If we see index options that ask us to decrypt when indexing a
message, and we encounter an encrypted part, we'll try to descend into
it.
If we can decrypt, we add the property index.decryption=success.
If we can't decrypt (or recognize the encrypted type of mail), we add
the property index.decryption=failure.
Note that a single message may have both values of the
"index.decryption" property: "success" and "failure". For example,
consider a message that includes multiple layers of encryption. If we
manage to decrypt the outer layer ("index.decryption=success"), but
fail on the inner layer ("index.decryption=failure").
Because of the property name, this will be automatically cleared (and
possibly re-set) during re-indexing. This means it will subsequently
correspond to the actual semantics of the stored index.
This allows us to create new properties that will be automatically set
during indexing, and cleared during re-indexing, just by choice of
property name.
By default, Sphinx tries to pre-process text through SmartyPants,
which attempts to convert ASCII quotes and dashes to Unicode
characters. Unfortunately, this mangles technical text such as command
lines. For instance, this excerpt from notmuch-tag.rst:
**notmuch** **tag** **--batch** [--input=<*filename*>]
got turned into:
notmuch tag –batch [–input=<filename>]
That's an en-dash and an em-dash respectively.
Not only are these characters visually confusing and could easily be
mistaken for a single dash, copying and pasting such command lines
into a terminal is doomed to result in incomprehensible error
messages.
* doc/conf.py: Disable SmartyPants.
A leading / in paths in a .gitignore file matches the beginning of the
path, meaning that for patterns without slashes, git will match files
only in the current directory as opposed to in any subdirectory.
Prefix relevant paths with / in .gitignore files, to prevent
accidentally ignoring files in subdirectories and possibly slightly
improve the performance of "git status".
gmime 3.0 no longer offers a means to set the path for gpg.
Users can set $PATH anyway if they want to pick a
differently-installed gpg (e.g. /usr/local/bin/gpg), so this isn't
much of a reduction in functionality.
The one main difference is for people who have tried to use "gpg2" to
make use of gpg 2.1, but that isn't usefully co-installable anyway.
the idea is that you can run
% notmuch search subject:/<your-favourite-regexp>/
% notmuch search from:/<your-favourite-regexp>/
or
% notmuch search subject:"your usual phrase search"
% notmuch search from:"usual phrase search"
This feature is only available with recent Xapian, specifically
support for field processors is needed.
It should work with bindings, since it extends the query parser.
This is easy to extend for other value slots, but currently the only
value slots are date, message_id, from, subject, and last_mod. Date is
already searchable; message_id is left for a followup commit.
This was originally written by Austin Clements, and ported to Xapian
field processors (from Austin's custom query parser) by yours truly.