When we notice a legacy-display part during indexing, it makes more
sense to avoid indexing it as part of the message body.
Given that the protected subject will already be indexed, there is no
need to index this part at all, so we skip over it.
If this happens during indexing, we set a property on the message:
index.repaired=skip-protected-headers-legacy-display
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
This adds no functionality directly, but is a useful starting point
for adding new repair functionality.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
The extra flexibility of having both HAVE_EMACS (for yes, there is an
emacs we can use) and WITH_EMACS (the user wants emacs support) lead
to confusion and bugs. We now just force WITH_EMACS to 0 if no
suitable emacs is detected.
In 40b025 we stopped building the notmuch-emacs documentation if
HAVE_EMACS=0 (i.e. no emacs was detected by configure). Unfortunately
we continued to try to install the (non-existent) documentation, which
causes build/install failures.
As a bonus, we also avoid installing the documentation if the user
configures --without-emacs.
Thanks to Ralph Seichter for reporting the problem, and testing
previous versions of this fix.
Since the docstrings are not built in the case of --without-emacs,
even if emacs is detected, don't let sphinx build the emacs docs. This
avoids a large number of error messages due to missing includes. It's
actually a bit surprising sphinx doesn't generate an error for the
missing include files.
It seems our previous attempt with order-only targets was not
sufficient to avoid problems with sphinx-builds doctree cache [0].
Looking around at other people's approaches [1], using separate
doctrees was suggested. I guess there might be a slight loss of
efficiency, but it seems more robust.
[0]: build failures were first noticed in Debian experimental, but I was able to duplicate it in
my usual build environment about 1 in 8 builds.
[1]: in particular
9e3fc1657d
The new `body:` field (in Xapian terms) or prefix (in slightly
sloppier notmuch) terms allows matching terms that occur only in the
body.
Unprefixed query terms should continue to match anywhere (header or
body) in the message.
This follows a suggestion of Olly Betts to use the facility (since
Xapian 1.0.4) to add the same field with multiple prefixes. The double
indexing of previous versions is thus replaced with a query time
expension of unprefixed query terms to the various prefixed
equivalent.
Reindexing will be needed for 'body:' searches to work correctly;
otherwise they will also match messages where the term occur in
headers (demonstrated by the new tests in T530-upgrade.sh)
Without this change, we see this during the build:
sphinx-build -b html -d doc/_build/doctrees -q ./doc doc/_build/html
…/doc/notmuch-emacs.rst:67: WARNING: Unexpected indentation.
…/doc/notmuch-emacs.rst:165: WARNING: Unexpected indentation.
…/doc/notmuch-emacs.rst:306: WARNING: Unexpected indentation.
This source change doesn't seem to have any effect on the generated
HTML, at least.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
This should silence some warnings about the jobserver, but also make
it easier to build the docs where GNU make is called something other
than make.
Based on a patch from aidecoe.
In certain conditions the parallel calls to sphinx-build could
collide, yielding a crash like
Exception occurred:
File "/usr/lib/python3/dist-packages/sphinx/environment.py", line 1261, in get_doctree
doctree = pickle.load(f)
EOFError: Ran out of input
Many of the manpages didn't treat literal text as literal text. I've
tried to normalize some of the restructured text to make it a bit more
regular.
several of the synopsis lines are still untouched by this cleanup, but
i'm not sure what the right way to represent those is in .rst,
actually.
In particular find that if i rebuild the manpages, sometimes i end up
with some of the synopsis lines showing – (U+2013 EN DASH) where they
should have -- (2 × U+002D HYPHEN-MINUS) in the generated nroff
output, though i have not tracked down the source of this error yet.
This is mainly to improve discoverability. It seems that doing
variable cross-references is not easy without using some sphinx
extension/customization.
Add fancy new feature, which makes "notmuch show" capable of actually
indexing messages that it just decrypted.
This enables a workflow where messages can come in in the background
and be indexed using "--decrypt=auto". But when showing an encrypted
message for the first time, it gets automatically indexed.
This is something of a departure for "notmuch show" -- in particular,
because it requires read/write access to the database. However, this
might be a common use case -- people get mail delivered and indexed in
the background, but only want access to their secret key to happen
when they're directly interacting with notmuch itself.
In such a scenario, they couldn't search newly-delivered, encrypted
messages, but they could search for them once they've read them.
Documentation of this new feature also uses a table form, similar to
that found in the description of index.decrypt in notmuch-config(1).
A notmuch UI that wants to facilitate this workflow while also
offering an interactive search interface might instead make use of
these additional commands while the user is at the console:
Count received encrypted messages (if > 0, there are some things we
haven't yet tried to index, and therefore can't yet search):
notmuch count tag:encrypted and \
not property:index.decryption=success and \
not property:index.decryption=failure
Reindex those messages:
notmuch reindex --try-decrypt=true tag:encrypted and \
not property:index.decryption=success and \
not property:index.decryption=failure
I think we've diverged enough from the Xapian query parser
that we can't rely on that syntax description [1]. As far as I can
tell, [1] also only discusses quotes in the context of phrases.
[1]: https://xapian.org/docs/queryparser.html
Currently, notmuch has the levers needed to set coherent crypto policy
around how cleartext is indexed, which also has an impact on how
messages are rendered. But we don't have a lot of documentation about
how to do sensible things. This is an initial attempt to address
that.
The first example shows a way to selectively index specific messages.
The next two examples are about aligning the existing database with
crypto indexing policy
The default crypto policy is to not index cleartext, and to only
decrypt messages on display when explicitly requested.
The other sensible crypto policy is to index cleartext while stashing
session keys. messages indexed in this way will be searchable, and
will be decrypted on display automatically unless the user explicitly
asks for it to *not* be decrypted.
The policy for indexing *new* messages is stored in the database as
the config variable index.decrypt.
But setting policy for new messages doesn't retroactively affect
already indexed messages.
This patch attempts to document ways that someone can efficiently
align their pre-existing database with their new policy.
I'm not sure this is the right place to document these examples, but i
do want them to be user-facing and relatively easy to find. I'm happy
to entertain suggestions for where else we should put them.
In some cases (e.g. when building a publicly-visible e-mail archive)
it doesn't make any sense to restrict visibility of the message to the
current user account.
This adds a --world-readable boolean option for "notmuch insert", so
that those who want to archive their mail publicly can feed their
archiver with:
notmuch insert --world-readable
Other local delivery agents (postfix's local, and dovecot's lda) all
default to delivery in mode 0600 rather than relying on the user's
umask, so this fix doesn't change the default.
Also, this does not override the user's umask. if the umask is
already set tight, it will not become looser as the result of passing
--world-readable.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Need to be clearer about specifying time ranges using timestamps.
Legacy syntax which predates the date prefix is still supported, but
timestamps used in conjunction with the date prefix require additional
syntax.
Make all parameter descriptions etc. use reStructuredText definition
lists with uniform style and indentation. Remove redundant indentation
from around the lists. Remove blank lines between term lines and
definition blocks. Use four spaces for indentation.
This is almost completely whitespace and paragraph reflow changes.
This brings the --decrypt argument to "notmuch reply" into line with
the other --decrypt arguments (in "show", "new", "insert", and
"reindex"). This patch is really just about bringing consistency to
the user interface.
We also use the recommended form in the emacs MUA when replying, and
update test T350 to match.
We also expand tab completion for it, update the emacs bindings, and
update T350, T357, and T450 to match.
Make use of the bool-to-keyword backward-compatibility feature.
Add support for using /<regex>/ style regular expressions in
new.ignore, mixed with the old style verbatim file and directory
basenames. The regex is matched against the relative path from the
database path.
Having first a list of prefixes followed by detailed descriptions was
viable when we didn't have all that many prefixes. Now, arranging the
prefix descriptions in a definition list makes more sense.
While at it, include all the supported prefix forms, especially some
missing regex ones.
Now that the range of sensible decryption policies has come into full
view, we take a bit of space to document the distinctions.
Most people will use either "auto" or "true" -- but we provide "false"
and "nostash" to handle use cases that might reasonably be requested.
Note also that these can be combined in sensible ways. Like, if your
mail comes in regularly to a service that doesn't have access to your
secret keys, but does have access to your index, and you feel
comfortable adding selected encrypted messages to the index after
you've read them, you could stay in "auto" normally, and then when you
find yourself reading an indexable message (e.g. one you want to be
able to search for in the future, and that you don't mind exposing to
whatever entities have access to your inde), you can do:
notmuch reindex --decrypt=true id:whatever@example.biz
That leaves your default the same (still "auto") but you get the
cleartext index and stashed session key benefits for that particular
message.
Here's the configuration choice for people who want a cleartext index,
but don't want stashed session keys.
Interestingly, this "nostash" decryption policy is actually the same
policy that should be used by "notmuch show" and "notmuch reply",
since they never modify the index or database when they are invoked
with --decrypt.
We take advantage of this parallel to tune the behavior of those
programs so that we're not requesting session keys from GnuPG during
"show" and "reply" that we would then otherwise just throw away.
If you're going to store the cleartext index of an encrypted message,
in most situations you might just as well store the session key.
Doing this storage has efficiency and recoverability advantages.
Combined with a schedule of regular OpenPGP subkey rotation and
destruction, this can also offer security benefits, like "deletable
e-mail", which is the store-and-forward analog to "forward secrecy".
But wait, i hear you saying, i have a special need to store cleartext
indexes but it's really bad for me to store session keys! Maybe
(let's imagine) i get lots of e-mails with incriminating photos
attached, and i want to be able to search for them by the text in the
e-mail, but i don't want someone with access to the index to be
actually able to see the photos themselves.
Fret not, the next patch in this series will support your wacky
uncommon use case.
There are some situations where the user wants to get rid of the
cleartext index of a message. For example, if they're indexing
encrypted messages normally, but suddenly they run across a message
that they really don't want any trace of in their index.
In that case, the natural thing to do is:
notmuch reindex --decrypt=false id:whatever@example.biz
But of course, clearing the cleartext index without clearing the
stashed session key is just silly. So we do the expected thing and
also destroy any stashed session keys while we're destroying the index
of the cleartext.
Note that stashed session keys are stored in the xapian database, but
xapian does not currently allow safe deletion (see
https://trac.xapian.org/ticket/742).
As a workaround, after removing session keys and cleartext material
from the database, the user probably should do something like "notmuch
compact" to try to purge whatever recoverable data is left in the
xapian freelist. This problem really needs to be addressed within
xapian, though, if we want it fixed right.
The new "auto" decryption policy is not only good for "notmuch show"
and "notmuch reindex". It's also useful for indexing messages --
there's no good reason to not try to go ahead and index the cleartext
of a message that we have a stashed session key for.
This change updates the defaults and tunes the test suite to make sure
that they have taken effect.
The stashed session keys are stored internally as notmuch properties.
So a user or developer who is reading about those properties might
want to understand how they fit into the bigger picture.
Note here that decrypting with a stored session key no longer needs
-decrypt for "notmuch show" and "notmuch reply".
When showing a message, if the user doesn't specify --decrypt= at all,
but a stashed session key is known to notmuch, notmuch should just go
ahead and try to decrypt the message with the session key (without
bothering the user for access to their asymmetric secret key).
The user can disable this at the command line with --decrypt=false if
they really don't want to look at the e-mail that they've asked
notmuch to show them.
and of course, "notmuch show --decrypt" still works for accessing the
user's secret keys if necessary.
If the user doesn't specify --decrypt= at all, but a stashed session
key is known to notmuch, when replying to an encrypted message,
notmuch should just go ahead and decrypt.
The user can disable this at the command line with --decrypt=false,
though it's not clear why they would ever want to do that.
This new automatic decryption policy should make it possible to
decrypt messages that we have stashed session keys for, without
incurring a call to the user's asymmetric keys.
the command-line interface for indexing (reindex, new, insert) used
--try-decrypt; and the configuration records used index.try_decrypt.
But by comparison with "show" and "reply", there doesn't seem to be
any reason for the "try" prefix.
This changeset adjusts the command-line interface and the
configuration interface.
For the moment, i've left indexopts_{set,get}_try_decrypt alone. The
subsequent changeset will address those.
When doing any decryption, if the notmuch database knows of any
session keys associated with the message in question, try them before
defaulting to using default symmetric crypto.
This changeset does the primary work in _notmuch_crypto_decrypt, which
grows some new parameters to handle it.
The primary advantage this patch offers is a significant speedup when
rendering large encrypted threads ("notmuch show") if session keys
happen to be cached.
Additionally, it permits message composition without access to
asymmetric secret keys ("notmuch reply"); and it permits recovering a
cleartext index when reindexing after a "notmuch restore" for those
messages that already have a session key stored.
Note that we may try multiple decryptions here (e.g. if there are
multiple session keys in the database), but we will ignore and throw
away all the GMime errors except for those that come from last
decryption attempt. Since we don't necessarily know at the time of
the decryption that this *is* the last decryption attempt, we'll ask
for the errors each time anyway.
This does nothing if no session keys are stashed in the database,
which is fine. Actually stashing session keys in the database will
come as a subsequent patch.
Now that it's easy to add argument specific modifiers in opt
descriptions, add a new .allow_empty field to allow empty strings for
individual string arguments while retaining strict checks
elsewhere. Use this for notmuch insert --folder, where the empty
string means top level folder.
Enable override of the index.try_decrypt setting on a per-run basis
when invoking "notmuch reindex". This allows the possibility of (for
example) an emacs keybinding that adds the cleartext of the currently
shown decrypted message to the index, making it searchable in the
future.
It also enables one-time indexing of all messages matching some query,
like so:
notmuch reindex tag:encrypted and\
not property:index.decryption=success and\
from:alice@example.org
We also update the documentation and tab completion, and add a few
more tests.
Enable override of the index.try_decrypt setting on a per-message
basis when invoking "notmuch insert".
We also update the documentation and tab completion, and add more tests.
Enable override of the index.try_decrypt setting during "notmuch new"
on a per-invocation basis.
We update the documentation and tab completion, and also add a test.
By default, notmuch won't try to decrypt on indexing. With this
patch, we make it possible to indicate a per-database preference using
the config variable "index.try_decrypt", which by default will be
false.
At indexing time, the database needs some way to know its internal
defaults for how to index encrypted parts. It shouldn't be contingent
on an external config file (since that can't be retrieved from the
database object itself), so we store it in the database.
This behaves similarly to the query.* configurations, which are also
stored in the database itself, so we're not introducing any new
dependencies by requiring that it be stored in the database.
QUERY_STRING was only used in two places, both to test whether a
variable should be stored in (or retrieved from) the database.
Since other configuration variables might be stored in the database in
the future, consolidate that test into a single function.
We also document that these configuration options should not be placed
in the config file.
If we see index options that ask us to decrypt when indexing a
message, and we encounter an encrypted part, we'll try to descend into
it.
If we can decrypt, we add the property index.decryption=success.
If we can't decrypt (or recognize the encrypted type of mail), we add
the property index.decryption=failure.
Note that a single message may have both values of the
"index.decryption" property: "success" and "failure". For example,
consider a message that includes multiple layers of encryption. If we
manage to decrypt the outer layer ("index.decryption=success"), but
fail on the inner layer ("index.decryption=failure").
Because of the property name, this will be automatically cleared (and
possibly re-set) during re-indexing. This means it will subsequently
correspond to the actual semantics of the stored index.
This allows us to create new properties that will be automatically set
during indexing, and cleared during re-indexing, just by choice of
property name.
By default, Sphinx tries to pre-process text through SmartyPants,
which attempts to convert ASCII quotes and dashes to Unicode
characters. Unfortunately, this mangles technical text such as command
lines. For instance, this excerpt from notmuch-tag.rst:
**notmuch** **tag** **--batch** [--input=<*filename*>]
got turned into:
notmuch tag –batch [–input=<filename>]
That's an en-dash and an em-dash respectively.
Not only are these characters visually confusing and could easily be
mistaken for a single dash, copying and pasting such command lines
into a terminal is doomed to result in incomprehensible error
messages.
* doc/conf.py: Disable SmartyPants.
A leading / in paths in a .gitignore file matches the beginning of the
path, meaning that for patterns without slashes, git will match files
only in the current directory as opposed to in any subdirectory.
Prefix relevant paths with / in .gitignore files, to prevent
accidentally ignoring files in subdirectories and possibly slightly
improve the performance of "git status".
gmime 3.0 no longer offers a means to set the path for gpg.
Users can set $PATH anyway if they want to pick a
differently-installed gpg (e.g. /usr/local/bin/gpg), so this isn't
much of a reduction in functionality.
The one main difference is for people who have tried to use "gpg2" to
make use of gpg 2.1, but that isn't usefully co-installable anyway.
the idea is that you can run
% notmuch search subject:/<your-favourite-regexp>/
% notmuch search from:/<your-favourite-regexp>/
or
% notmuch search subject:"your usual phrase search"
% notmuch search from:"usual phrase search"
This feature is only available with recent Xapian, specifically
support for field processors is needed.
It should work with bindings, since it extends the query parser.
This is easy to extend for other value slots, but currently the only
value slots are date, message_id, from, subject, and last_mod. Date is
already searchable; message_id is left for a followup commit.
This was originally written by Austin Clements, and ported to Xapian
field processors (from Austin's custom query parser) by yours truly.
In most part, our .rst documents are indented with 8 spaces instead
of tabs. Bring the rest of the lines to the same format.
Also, on one (supposedly empty) line, trailing spaces were removed.
If the --hello parameter is given, display the notmuch hello buffer
instead of the message composition buffer if no message composition
parameters are given.
Signed-off-by: Jani Nikula <jani@nikula.org>
Install man pages based on $(MAN_GZIP_FILES), which directly
corresponds to the man page source rst files. This way we can filter
the man pages to be installed as needed.
Use $(wildcard ...) to generate the list of man pages based on the rst
source files present in the man page directories, instead of reading
conf.py. This has three main benefits:
1) This makes the man page build slightly less complicated and easier
to understand. At least there are fewer moving parts.
2) This makes the build fail if we add a man page rst file, but fail
to add it to conf.py.
3) We can use Sphinx constructs in conf.py that are not available when
importing the file into a normal python program such as
mkdocdeps.py.
If Sphinx fails to create any of the roff files, having touch create
them hides the errors until someone realizes, possibly much later,
that the resulting files are empty. (Note that gzip doesn't fail on
empty input files.) Sphinx will change the timestamps of any files it
has written anyway.