For those messages, compute a synthetic Message-ID based on the SHA1
of the whole message, in the same way that notmuch would do. See:
http://git.notmuchmail.org/git/notmuch/blob/HEAD:/lib/sha1.c
To do the above, rewrite get_message_id() to accumulate header lines,
parse them to check for Message-ID, and fallback to SHA1 computation
if it is not present.
Thanks to:
- Jan N. Klug for preliminary versions of this patch
- Tomi Ollila for suggesting an elegant implementation
Besides generally cleaning up the code and separating the general
content ID handling from the w3m-specific code, this fixes several
problems.
Foremost is that, previously, the code roughly assumed that referenced
parts would be in the same multipart/related as the reference.
According to RFC 2392, nothing could be further from the truth:
content IDs are supposed to be globally unique and globally
addressable. This is nonsense, but this patch at least fixes things
so content IDs can be anywhere in the same message.
As a side-effect of the above, this handles multipart/alternate
content-IDs more in line with RFC 2046 section 5.1.2 (not that I've
ever seen this in the wild). This also properly URL-decodes cid:
URLs, as per RFC 2392 (the previous code did not), and applies crypto
settings from the show buffer (the previous code used the global
crypto settings).
Unibyte strings are meant for representing binary data. In practice,
using unibyte versus multibyte strings affects *almost* nothing. It
does happen to matter if we use the binary data in an image descriptor
(which is, helpfully, not documented anywhere and getting it wrong
results in opaque errors like "Not a PNG image: <giant binary spew
that is, in fact, a PNG image>").
`notmuch-get-bodypart-content' could do two very different things,
depending on conditions: for text/* parts other than text/html, it
would return the part content as a multibyte Lisp string *after*
charset conversion, while for other parts (including text/html), it
would return binary part content without charset conversion.
This commit completes the split of `notmuch-get-bodypart-content' into
two different and explicit APIs: `notmuch-get-bodypart-binary' and
`notmuch-get-bodypart-text'. It updates all callers to use one or the
other depending on what's appropriate.
The new function, `notmuch-get-bodypart-binary', replaces
`notmuch-get-bodypart-internal'. Whereas the old function was really
meant for internal use in `notmuch-get-bodypart-content', it was used
in a few other places. Since the difference between
`notmuch-get-bodypart-content' and `notmuch-get-bodypart-internal' was
unclear, these other uses were always confusing and potentially
inconsistent. The new call clearly requests the part as undecoded
binary.
This is step 1 of 2 in separating `notmuch-get-bodypart-content' into
two APIs for retrieving either undecoded binary or decoded text.
Questions related to the way that probabilistic prefixes and phrases
are handled come up quite often and it is nicer to have the documentation self contained. Hopefully putting it in subsections prevents it from being overwhelming.
Adds new entry to the NEWS file, and updates the search terms section
of the man page. The search terms section needs to be updated again
once the new section in the documentation covering probablistic terms
has been committed.
This adds the indexing support for the "mimetype:" term and removes
the broken test flag. The indexing is probablistic in Xapian terms,
which gives a better experience to end users. Standard content-types
of the form "foo/bar" are automatically interpreted as phrases in
Xapian due to the embedded slash.
Assume, separate messages with application/pdf and application/x-pdf
are indexed, then:
- mimetype:application/x-pdf will find only the application/x-pdf
- mimetype:application/pdf will find only the application/pdf
- mimetype:pdf will find both of the messages
This feature will exist in all newly created databases, but there is
no upgrade provided for it. If this flag exists, it indicates that
the database was created after the indexed MIME-types feature was
added.
Adds three failing unit tests for searching of mime-types.
An attempt was made at adding a negative test (i.e. searching for a
non-existent mime-type and ensuring it didn't return a message), but
that test would always pass making it pointless.
We set header-line-format to the message subject, but if the subject
contains percents, the next character is interpreted as a formatting
control, which is not desired.
When something in tests fails one possibility to test is to run
the test script as `bash -x TXXX-testname.sh`. As stderr (fd 2) was
redirected to separate file during test execution also this set -x
(xtrace) output would also go there.
test-lib.sh saves the stderr to fd 7 from where it can be restored,
and bash has BASH_XTRACEFD variable, which is now given the same value
7, making bash to output all xtrade information (consistently) there.
This lib file used to save fd's 1 & 2 to 6 & 7 (respectively) in
test_begin_subtest(), but as those needs to be set *before* XTRACEFD
variable is set those are now saved at the beginning of the lib (once).
This is safe and simple thing to do.
To make xtrace output more verbose PS4 variable was set to contain the
source file, line number and if execution is in function, that function
name. Setting this variable has no effect when not xtracing.
As it is known that fd 6 is redirected stdout, printing status can now
use that fd, instead of saving stdout to fd 5 and use it.
_thread_set_subject_from_message sometimes replaces the subject, making the
cur_subject point to free'd memory
==6550== ERROR: AddressSanitizer: heap-use-after-free on address 0x601a0000bec0 at pc 0x4464a4 bp 0x7fffa40be910 sp 0x7fffa40be908
READ of size 1 at 0x601a0000bec0 thread T0
#0 0x4464a3 in _thread_add_matched_message /home/todd/.apps/notmuch/lib/thread.cc:369
#1 0x443c2c in notmuch_threads_get /home/todd/.apps/notmuch/lib/query.cc:496
#2 0x41d947 in do_search_threads /home/todd/.apps/notmuch/notmuch-search.c:131
#3 0x40a3fe in main /home/todd/.apps/notmuch/notmuch.c:345
#4 0x7f4e535b4ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
#5 0x40abe6 in _start ??:?
0x601a0000bec0 is located 96 bytes inside of 134-byte region [0x601a0000be60,0x601a0000bee6)
freed by thread T0 here:
#0 0x7f4e54e6933a in __interceptor_free ??:?
#1 0x7f4e54482fab in _talloc_free ??:?
previously allocated by thread T0 here:
#0 0x7f4e54e6941a in malloc ??:?
#1 0x7f4e54485b5d in talloc_strdup ??:?
==22884== ERROR: AddressSanitizer: heap-buffer-overflow on address 0x601600008291 at pc 0x7ff6295680e5 bp 0x7fff4ab9aa40 sp 0x7fff4ab9aa08
READ of size 1 at 0x601600008291 thread T0
#0 0x7ff6295680e4 in __interceptor_strcmp ??:?
#1 0x44763b in _thread_add_message /home/todd/.apps/notmuch/lib/thread.cc:255
#2 0x4459e8 in notmuch_threads_get /home/todd/.apps/notmuch/lib/query.cc:496
#3 0x41e2a7 in do_search_threads /home/todd/.apps/notmuch/notmuch-search.c:131
#4 0x40a408 in main /home/todd/.apps/notmuch/notmuch.c:345
#5 0x7ff627cb9ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
#6 0x40abf3 in _start ??:?
0x601600008291 is located 0 bytes to the right of 97-byte region [0x601600008230,0x601600008291)
allocated by thread T0 here:
#0 0x7ff62956e41a in malloc ??:?
#1 0x7ff628b8ab5d in talloc_strdup ??:?
The TAB-initiated address completion generates completion candidates
synchronously, blocking the UI. Since this can take long time, it is
better to let the use know what's happening.
At the moment, the test-lib fills in any missing headers. This makes
it impossible to test our handling of empty subjects. This will
allow us to use a special dummy subject -- `@FORCE_EMPTY` -- to force
the subject to remain empty.
Currently the thread is named based on either the oldest or newest
matching message (depending on the search order). If this message has
an empty subject, though, the thread will show up with an empty
subject in the search results. (See the thread starting with
`id:1412371140-21051-1-git-send-email-david@tethera.net` for an
example.)
This changes the behavior so it will use a non-empty name for the
thread if possible. We name threads based on (a) non-empty matches for
the query, and (b) the search order. If the search order is
oldest-first (as in the default inbox) it chooses the oldest matching
non-empty message as the subject. If the search order is newest-first
it chooses the newest one.
Make a new customizable variable instead of relying on
message-cite-function because the default for the latter changed
between emacs releases.
The defcustom is borrowed from the message.el source, with minor
modifications.
The faces used when washing messages should be notmuch specific and
inherit from the underlying emacs face rather than using it
directly. This allows the washed face to be modified without requiring
the modification of the underlying face.
Currently we hardcode "python" in several places. This makes things
hard for people who have only commands called python3 and/or
python2. We also add the name to sh.config to eventually replace the
current workaround in the test suite.
As discussed in
id:8cc9dd580ad672527e12f43706f9803b2c8e99d8.1405220724.git.wking@tremily.us,
execfile is unavailable in python3.
The approach of this commit avoids modifying the python module path,
which is arguably preferable since it avoids potentially accidentally
importing a module from the wrong place.
Tamas Szakaly points out [1] that the bug fixed in 51b073c still
exists in at least one place. This change follows the suggestion of
[2] and creates a block scope temporary std::string to avoid the rules
of iterators temporaries.
[1]: id:20141226113755.GA64154@pamparam
[2]: id:20141226230655.GA41992@pamparam
We generally do not support mbox files, but for historical reasons
we've supported single-message mbox files, with a deprecation
message. We've tried dropping the support altogether, but backed out
of it because we'd need to stop indexing them, while keeping support
for previously indexed files. This would be more complicated than
simply supporting single-message mbox files. Therefore, drop the
deprecation message, and just silently accept single-message mboxes.
Currently, if a From-header is of the form:
"" <address@example.com>
the empty string will be treated as a valid real-name, and the entry
in the search results will be empty.
The new behavior here is that we treat an empty real-name field as if
it were null, so that the email address will be used in the search
results instead.
Signed-off-by: Jesse Rosenthal <jrosenthal@jhu.edu>
We test for whether a quoted empty email address
"" <address@example.com>
will show up as the address, instead of the empty string. This is
marked as known-broken, since the current behavior is to use the empty
string.
This is a new test file, since handling of unusual email addresses
doesn't seem to fit well in any of our existing tests.
Signed-off-by: Jesse Rosenthal <jrosenthal@jhu.edu>
`with-current-notmuch-show-message' applies a `no-conversion' coding
system when reading a raw message from notmuch. That coding system
should _not_ be applied when the body of the macro is evaluated, as it
can cause file operations used during that evaluation to incorrectly
apply the `no-conversion' coding system.
This was discovered when a user's .signature file contained non-ASCII
characters. When a message is forwarded, the `no-conversion' coding
system was applied to the reading of the .signature file, resulting in
raw rather than UTF-8 interpretation of the data.