This imports a message with ISO-8859-2 encoded characters, then opens
the database using the python bindings. We peek through all mesage
parts, afterwards print the message id.
Signed-off-by: Florian Klink <flokli@flokli.de>
Signed-off-by: Andreas Rammhold <andreas@rammhold.de>
currently, notmuch's get_message_parts() opens the file in text mode and passes
the file object to email.message_from_file(fp). In case the email contains
UTF-8 characters, reading might fail inside email.parser with the following exception:
File "/usr/lib/python3.6/site-packages/notmuch/message.py", line 591, in get_message_parts
email_msg = email.message_from_binary_file(fp)
File "/usr/lib/python3.6/email/__init__.py", line 62, in message_from_binary_file
return BytesParser(*args, **kws).parse(fp)
File "/usr/lib/python3.6/email/parser.py", line 110, in parse
return self.parser.parse(fp, headersonly)
File "/usr/lib/python3.6/email/parser.py", line 54, in parse
data = fp.read(8192)
File "/usr/lib/python3.6/codecs.py", line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 1865: invalid continuation byte
To fix this, read file in binary mode and pass to
email.message_from_binary_file(fp).
Unfortunately, Python 2 doesn't support
email.message_from_binary_file(fp), so keep using
email.message_from_file(fp) there.
Signed-off-by: Florian Klink <flokli@flokli.de>
Commit a7964c86d1 ("emacs: Sanitize authors and subjects in search
and show") added sanitization of header information for display. Do
the same for reply subjects.
This fixes the long-standing annoying artefact of certain versions of
mailman using tab as folding whitespace, leading to tabs in reply
subjects.
This is a logical followup to "lib: index the content type of
signature parts", which will make it easier to record the message
structure of all messages.
It's useful (*) to be able to easily find messages with certain types
of signatures. Having the mimetype: prefix searches fail for some
content types is also genuinely surprising (*). Index the content type
of signature parts.
While at it, switch to the gmime convenience constants for content and
signature part indexes.
*) At least for developers of email software!
Add known broken subtests for searching signed and encrypted messages
using mimetype: prefix search for the content-types of signed and
encrypted parts.
Quoting from the elisp reference:
For other types (e.g., lists, vectors, strings), two arguments
with the same contents or elements are not necessarily ‘eq’ to
each other.
Thanks to "Attic Hermit" for the fix.
The deprecated Database.add_message now calls the new index_file with
correct number of arguments (without an extra `self`), and returns the
tuple from index_file - as it used to do before.
This change also adds a DeprecationWarning to the function.
Switch to a local version of enriched-decode-display-prop if we
encounter a text/enriched part. This is to mitigate
https://bugs.gnu.org/28350. Normally it would be prudent to remove the
override afterwards, but in this case just leave it in.
Notes from db:
This doesn't disable text/enriched, just one feature of it.
In [1] Mark showed that the the current code (d7a49e81) is not
consistent in it's handling of subjects of messages with duplicate
message-ids (or in notmuch-speak, of messages with multiple files).
notmuch-search uses indexing order and explicitedly preserves the
first. notmuch-show (apparently) uses alphabetical (or at least xapian
term order) of filenames. In a perfect world we would probably report
all subjects in the json output; at the very least we should be
consistent.
[1]: id:87378dny3d.fsf@qmul.ac.uk
In [1], Mark gave a test that was behaving strangly. This turns out to
be specific to reindexing. I suppose one could argue that picking the
lexicographically last file name is a defensible choice, but it's
almost as easy to take the first, which seems more intuitive. So mark
the current situation as broken.
[1]: id:1503859703-2973-1-git-send-email-markwalters1009@gmail.com
The existing test for notmuch search had the first in filename order
the same as the first indexed, which made it harder to understand what
the underlying behaviour is. Add a file with a lexicographically
smaller name, but later index time to clarify this.
The original intent of this test was to verify that notmuch show was
not crashing when the first file (where headers are being read from)
was deleted. Run the output through some sanitization so that as we
add and delete copies we don't have to update this test.
notmuch-tree did not protect against concurrent refreshes like
notmuch-search, meaning, hitting '=' (notmuch-refresh-this-buffer)
quickly will spawn multiple parallel notmuch processes, and clobber
the existing results in the current buffer.
* notmuch-tree.el: Add a guard to notmuch-tree-refresh-view similar to
the one in notmuch-search.
'g_object_newv' is deprecated, and prints annoying warnings. The
warnings suggest using 'g_object_new_with_properties', but that's only
available since glib 2.55 (i.e. a month ago as of this writing).
Since we don't actuall pass any properties, it seems we can just call
'g_object_new'.
Changed "" quotes to '' as we're not supposed to dynamically
alter python program (via shell $variable expansion).
Added space to python program to match general python style.
Replaced $* with 'idiomatic' "$@" to serve as better example.
In [1], Vladimir Panteleev observed that the In-Reply-To and
References headers could be wrapped in the 'default' output format of
notmuch-reply, depending on the version of Emacs creating the
message. In my own experiments notmuch-reply sometimes wraps headers
with only one message-id if that message-id is long enough. However it
happens, this causes the previous approach using grep to fail.
Since I found the proposed unwrapping shell fragment in [1] a bit hard
to follow, I decided to write a little python script instead. Then
Tomi suggested a slight generalization of my script, and here we are.
[1] id:20170817175145.3204-7-notmuch@thecybershadow.net
Builds not requiring sudo access run in a container, which will have
better performance and less overhead on the Travis infrastructure.
Use the apt addon to install dependencies instead of explicit apt-get
commands.
On some system configurations, setting a breakpoint on the "add_file"
function then issuing "continue" in gdb causes the debugger to
seemingly jump over the add_file invocation. This results in a test
failure, as the "Handle files vanishing between scandir and add_file"
subtest expects add_file to be called and fail due to the vanishing
file. The compiler optimization level also plays a role - the problem
can be reproduced with CFLAGS having -O2 but not -Og.
This problem was observed manifesting as a test failure on Travis CI
configured with "dist: trusty" and "sudo: false". It was not
reproducible on a local Docker image of Travis' runtime environment,
so Travis' virtualization infrastructure likely plays a role as well.
* T050-new.sh: Breakpoint notmuch_database_add_message instead of
add_file to the same effect, and avoid bad gdb behaviour on Travis
CI.
Amended by db:
s/notmuch_database_add_message/notmuch_database_index_file/
Somehow the wrapper function doesn't work as a breakpoint; perhaps due
to inlining.
Travis now offers Ubuntu Trusty (14.04 LTS) VMs as test runners, which
is gradually becoming the default. We can opt in to using Trusty now
so that we no longer need to manually update zlib to a newer version.
Download the test message database used for the T530-upgrade.sh test.
If the additional load on the web server is undesired, Travis can be
instructed to cache the file.
Fix the following cppcheck errors:
notmuch-count.c:207: error: Resource leak: input
notmuch-tag.c:238: error: Resource leak: input
We know that the program is shutting down here, but it does no harm to
clean up a bit.
The advantage of having a target as opposed to running cppcheck by
hand
- reuse list of source files
- output errors in a format parsable, e.g. by emacs
- returns exit code 1 on any error, for possibly use in other
targets.
For the moment, leave this as an optional target. If desired, it can
be added to e.g. the release targets in the same way as the test
target.
Using two levels of directory for the stamps is arguably
overengineering, but it doesn't really cost anything, and leaves open
the possibility of putting other kinds of stamp files there.
This only checks "new" source files (w.r.t. their last check). A future target
(cppcheck-all ?) could blow away the stamp files first.
Sometimes using $@ as the target in the quiet build lines can be
confusing. Accept an optional second parameter in the quiet variable
function to specify the target.
We reorder reading maildir flags to avoid overwriting 'new.tags'. The
inverted status of 'unread' means the maildir flag needs to be checked
a second time.
I backpedalled here on the idea of supporting 'new.tags' without
'unread' in the presence of maildir syncing. For files in 'new/', it
seems quite natural to tag them as 'unread'.
I considered a higher level interface where the caller passes a tag
name rather than a flag character, but the role of the "unread" tag is
particularly confusing with such an interface.