Some MTAs mangle e-mail messages in transit in ways that are
repairable.
Microsoft Exchange (in particular, the version running today on
Office365's mailservers) appears to mangle multipart/encrypted
messages in a way that makes them undecryptable by the recipient.
I've documented this in section 4.1 "Mixed-up encryption" of draft -00
of
https://tools.ietf.org/html/draft-dkg-openpgp-pgpmime-message-mangling
Fortunately, it's possible to repair such a message, and notmuch can
do that so that a user who receives an encrypted message from a user
of office365.com can still decrypt the message.
Enigmail already knows about this particular kind of mangling. It
describes it as "broken PGP email format probably caused by an old
Exchange server", and it tries to repair by directly changing the
message held by the user. if this kind of repair goes wrong, the
repair process can cause data loss
(https://sourceforge.net/p/enigmail/bugs/987/, yikes).
The tests introduced here are currently broken. In subsequent
patches, i'll introduce a non-destructive form of repair for notmuch
so that notmuch users can read mail that has been mangled in this way,
and the tests will succeed.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Enigmail generates a "legacy-display" part when it sends encrypted
mail with a protected Subject: header. This part is intended to
display the Subject for mail user agents that are capable of
decryption, but do not know how to deal with embedded protected
headers.
This part is the first child of a two-part multipart/mixed
cryptographic payload within a cryptographic envelope that includes
encryption (that is, it is not just a cleartext signed message). It
uses Content-Type: text/rfc822-headers.
That is:
A └┬╴multipart/encrypted
B ├─╴application/pgp-encrypted
C └┬╴application/octet-stream
* ╤ <decryption>
D └┬╴multipart/mixed; protected-headers=v1 (cryptographic payload)
E ├─╴text/rfc822-headers; protected-headers=v1 (legacy-display part)
F └─╴… (actual message body)
In discussions with jrollins, i've come to the conclusion that a
legacy-display part should be stripped entirely from "notmuch show"
and "notmuch reply" now that these tools can understand and interpret
protected headers.
You can tell when a message part is a protected header part this way:
* is the payload (D) multipart/mixed with exactly two children?
* is its first child (E) Content-Type: text/rfc822-headers?
* does the first child (E) have the property protected-headers=v1?
* do all the headers in the body of the first child (E) match
the protected headers in the payload part (D) itself?
If this is the case, and we already know how to deal with the
protected header, then there is no reason to try to render the
legacy-display part itself for the user.
Furthermore, when indexing, if we are indexing properly, we should
avoid indexing the text in E as part of the message body.
'notmuch reply' is an interesting case: the standard use of 'notmuch
reply' will end up omitting all mention of protected Subject:.
The right fix is for the replying MUA to be able to protect its
headers, and for it to set them appropriately based on headers found
in the original message.
If a replying MUA is unable to protect headers, but still wants the
user to be able to see the original header, a replying MUA that
notices that the original message's subject differs from the proposed
reply subject may choose to include the original's subject in the
quoted/attributed text. (this would be a stopgap measure; it's not
even clear that there is user demand for it)
This test suite change indicates what we want to happen for this case
(the tests are currently broken), and includes three additional TODO
suggestions of subtle cases for anyone who wants to flesh out the test
suite even further. (i believe all these cases should be already
fixed by the rest of this series, but haven't had time to write the
tests for the unusual cases)
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
When the user knows the signer's key, we want "notmuch show" to be
able to verify the signature of an encrypted and signed message
regardless of whether we are using a stashed session key or not.
I wrote this test because I was surprised to see signature
verification failing when viewing some encrypted messages after
upgrading to GPGME 1.13.0-1 in debian experimental.
The added tests here all pass with GPGME 1.12.0, but the final test
fails with 1.13.0, due to some buggy updates to GPGME upstream: see
https://dev.gnupg.org/T3464 for more details.
While the bug needs to be fixed in GPGME, notmuch's test suite needs
to make sure that GMime is doing what we expect it to do; i was a bit
surprised that it hadn't caught the problem, hence this patch.
I've fixed this bug in debian experimental with gpgme 1.13.0-2, so the
tests should pass on any debian system. I've also fixed it in the
gpgme packages (1.13.0-2~ppa1) in the ubuntu xenial PPA
(ppa:notmuch/notmuch) that notmuch uses for Travis CI.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
We want to make sure that internally-forwarded messages don't end up
"bubbling up" when they aren't actually the cryptographic payload.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Up to this point, we've tested protected headers on messages that have
either been encrypted or signed, but not both.
This adds a couple tests of signed+encrypted messages, one where the
subject line is masked (outside subject line is "Subject Unavailable")
and another where it is not (outside Subject: matches inner Subject:)
See the discussion at
https://dkg.fifthhorseman.net/blog/e-mail-cryptography.html#protected-headers
for more details about the nuances between signed, stripped, and
stubbed headers.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Make sure that we emit the correct cryptographic envelope status for
cleartext signed messages.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Adding another test to ensure that we handle protected headers
gracefully when no external subject is present.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Here we add several variant e-mail messages, some of which have
correctly-structured protected headers, and some of which do not. The
goal of the tests is to ensure that the right protected subjects get
reported.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
We did not have a test showing what message decryption looks like
within notmuch-emacs. This change gives us a baseline for future work
on the notmuch-emacs interface.
This differs from previous revisions of this patch in that it should
be insensitive to the order in which the local filesystem readdir()s
the underlying maildir.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
The current scheme of choosing the replyto (i.e. the default parent
for threading purposes) does not work well for mailers that put
the oldest Reference last.
There are 3 threads here, two synthetic, and one anonymized one using
data from Gregor. They test various aspects of thread
ordering/construction in the presence of replies to ghost messages.
'quite' on IRC reported that notmuch new was grinding to a halt during
initial indexing, and we eventually narrowed the problem down to some
html parts with large embedded images. These cause the number of terms
added to the Xapian database to explode (the first 400 messages
generated 4.6M unique terms), and of course the resulting terms are
not much use for searching.
The second test is sanity check for any "improved" indexing of HTML.
These 210 messages are in several long threads, which is good for
testing our threading code, and may be useful just as a larger test
corpus in the future.
As Daniel Kahn Gillmor <dkg@fifthhorseman.net> reports in
id:87d1ngv95p.fsf@alice.fifthhorseman.net, notmuch show combines
multiple Cc: fields into one, while notmuch reply does not. While such
messages are in violation of RFC 5322, it would be reasonable to
expect notmuch to be consistent. Add a known broken test to document
this expectation.
This also starts a new "broken" corpus for messages which are broken.
Details:
The original message is formatted using the message printing in
notmuch-show.c. For Cc:, it uses g_mime_message_get_recipients(),
which apparently combines all Cc: fields into one internally.
The addresses in the reply headers, OTOH, are based on headers queried
through libnotmuch. It boils down to g_mime_object_get_header() in
lib/message-file.c, which returns only the first occurence of header.
We largely use the corpus under test/corpus for
testing. Unfortunately, many of our tests have grown to depend on
having exactly this set of messages, making it hard to add new message
files for testing specific cases.
We do use a lot of add_message from within the tests, but it's not
possible to use that for adding broken messages, and adding several
messages at once can get unwieldy.
Move the basic corpus under tests/corpora/default, and make it
possible to add new, independent corpora along its side. This means
tons of renames with a few tweaks to add_email_corpus function in
test-lib.sh to let tests specify which corpus to use.