11 commits

Author SHA1 Message Date
Jani Nikula
473930bb6f lib: replace the header parser with gmime
The notmuch library includes a full blown message header parser. Yet
the same message headers are parsed by gmime during indexing. Switch
to gmime parsing completely.

These are the main changes:

* Gmime stops header parsing at the first invalid header, and presumes
  the message body starts from there. The current parser is quite
  liberal in accepting broken headers. The change means we will be
  much pickier about accepting invalid messages.

* The current parser converts tabs used in header folding to
  spaces. Gmime preserves the tabs. Due to a broken python library used
  in mailman, there are plenty of mailing lists that produce headers
  with tabs in header folding, and we'll see plenty of tabs. (This
  change has been mitigated in preparatory patches.)

* For pure header parsing, the current parser is likely faster than
  gmime, which parses the whole message rather than just the
  headers. Since we parse the message and its headers using gmime for
  indexing anyway, this avoids an extra header parsing round when
  adding new messages. In case of duplicate messages, we'll end up
  parsing the full message although just headers would be
  sufficient. All in all this should still speed up 'notmuch new'.

* Calls to notmuch_message_get_header() may be slightly slower than
  previously for headers that are not indexed in the database, due to
  parsing of the whole message. Within the notmuch code base, notmuch
  reply is the only such user.
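
The tab-versus-space difference in header folding described above can be illustrated with a small sketch. It is not the code of the old notmuch parser or of gmime; `unfold_header_value` is a hypothetical helper, and collapsing each fold to exactly one space is an assumption made for the illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not notmuch's or gmime's API: return a newly
 * allocated copy of a raw header value in which every fold (a line
 * break followed by spaces or tabs) is collapsed to a single space.
 * The caller frees the result with free(). Returns NULL if the
 * allocation fails. */
char *
unfold_header_value (const char *raw)
{
    size_t len = strlen (raw);
    char *out = malloc (len + 1);
    size_t i = 0, o = 0;

    if (out == NULL)
        return NULL;

    while (i < len) {
        size_t eol = 0;   /* length of the line break at i: 0, 1 or 2 */

        if (raw[i] == '\r' && raw[i + 1] == '\n')
            eol = 2;
        else if (raw[i] == '\n')
            eol = 1;

        if (eol && (raw[i + eol] == ' ' || raw[i + eol] == '\t')) {
            /* A fold: skip the line break and the whitespace run. */
            i += eol;
            while (raw[i] == ' ' || raw[i] == '\t')
                i++;
            out[o++] = ' ';
        } else {
            /* Ordinary character, or a line break that is not a fold. */
            out[o++] = raw[i++];
        }
    }
    out[o] = '\0';
    return out;
}
```

With this sketch, `unfold_header_value ("a\n\tb")` yields `a b`, whereas a parser that only strips the line break and preserves the tab would yield `a<TAB>b`.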
2014-04-05 12:53:04 -03:00
Jani Nikula
71521f06b0 lib/cli: pass GMIME_ENABLE_RFC2047_WORKAROUNDS to g_mime_init()
As explained by Jeffrey Stedfast, the author of GMime, quoted in [1]:

> Passing the GMIME_ENABLE_RFC2047_WORKAROUNDS flag to g_mime_init()
> *should* solve the decoding problem mentioned in the thread. This
> flag should be safe to pass into g_mime_init() without any bad side
> effects and my unit tests do test that code-path.

The thread being referred to is [2].

[1] id:87bo56viyo.fsf@nikula.org
[2] id:08cb1dcd-c5db-4e33-8b09-7730cb3d59a2@gmail.com
2013-09-14 14:13:43 -03:00
Tomi Ollila
27dacc7947 lib/message-file.c: use g_malloc () & g_free () in hash table values
The message->headers hash table values get data returned by
g_mime_utils_header_decode_text ().

The pointer returned by g_mime_utils_header_decode_text is from the
following line in rfc2047_decode_tokens

        return g_string_free (decoded, FALSE);

The docs for g_string_free say

 Frees the memory allocated for the GString. If free_segment is TRUE
 it also frees the character data. If it's FALSE, the caller gains
 ownership of the buffer and must free it after use with g_free().

The remaining frees and allocations referring to message->headers hash
values have been changed to use the g_free and g_malloc functions.

This combines and completes the changes started by David Bremner.
2012-12-24 19:02:05 -04:00
Stewart Smith
c86d77b16a Fix appending of Received headers
We're not properly concatenating the Received headers if we parse them
while requesting a header that isn't Received.

This fixes notmuch-reply address detection in a bunch of situations.
2011-06-10 17:03:14 -07:00
Anton Khirnov
d3fdb76c8d lib/message-file: plug three memleaks.
Signed-off-by: Jameson Graef Rollins <jrollins@finestructure.net>
2011-06-03 12:30:55 -07:00
Dirk Hohndel
5b8b0377cb Make Received: header special in notmuch_message_file_get_header
With this patch the Received: header becomes special in the way
we treat headers - this is the only header for which we concatenate
all the instances we find (instead of just returning the first one).

This will be used in the From guessing code for replies as we need to
be able to walk ALL of the Received: headers in a message to have a
good chance to guess which mailbox this email was delivered to.
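
A minimal sketch of the lookup rule described above: first instance for ordinary headers, all instances concatenated for Received. The `struct header` list, the `lookup_header` name, and the single-space separator are assumptions for illustration; the real implementation keeps headers in a hash table and may join values differently.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical in-memory header list; not notmuch's data structure. */
struct header {
    const char *name;
    const char *value;
};

/* Return a newly allocated value for `name`, or NULL if the header is
 * absent or an allocation fails. For "Received", all instances are
 * joined with a single space (the separator is an assumption); for any
 * other header only the first instance is returned. The caller frees
 * the result with free(). */
char *
lookup_header (const struct header *headers, size_t count, const char *name)
{
    int is_received = (strcasecmp (name, "received") == 0);
    char *result = NULL;
    size_t used = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        size_t vlen;
        char *grown;

        if (strcasecmp (headers[i].name, name) != 0)
            continue;

        vlen = strlen (headers[i].value);
        /* Room for an optional separator, the value, and the NUL. */
        grown = realloc (result, used + vlen + 2);
        if (grown == NULL) {
            free (result);
            return NULL;
        }
        result = grown;
        if (used > 0)
            result[used++] = ' ';
        memcpy (result + used, headers[i].value, vlen + 1);
        used += vlen;

        if (! is_received)
            break;   /* first instance wins for ordinary headers */
    }
    return result;
}
```

Given two Received headers and two Subject headers, a lookup of `received` returns both Received values joined together, while a lookup of `Subject` returns only the first one.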

Signed-off-by: Dirk Hohndel <hohndel@infradead.org>
2010-04-26 14:44:06 -07:00
Dirk Hohndel
a48f368778 fix notmuch_message_file_get_header
Fix notmuch_message_file_get_header to always return the first instance
of the header you are looking for.

Signed-off-by: Dirk Hohndel <hohndel@infradead.org>
2010-04-06 18:47:28 -07:00
Carl Worth
8cf72920e1 message_file_get_header: Use break where more clear than continue.
Calling continue here worked only because we set a flag before the
continue, check the flag at the beginning of the loop, and *then*
break. It's much clearer to just break in the first place.
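
The two loop shapes being compared might look roughly like the sketch below. It is an illustration of the pattern only, not the code touched by the commit; both function names and the string-search task are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Flag-and-continue style: correct, but the exit is indirect. */
int
find_with_flag (const char **names, size_t count, const char *wanted)
{
    int found = -1;
    int done = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (done)
            break;
        if (strcmp (names[i], wanted) == 0) {
            found = (int) i;
            done = 1;
            continue;   /* the next iteration sees the flag and breaks */
        }
    }
    return found;
}

/* Direct style: leave the loop as soon as the match is found. */
int
find_with_break (const char **names, size_t count, const char *wanted)
{
    int found = -1;
    size_t i;

    for (i = 0; i < count; i++) {
        if (strcmp (names[i], wanted) == 0) {
            found = (int) i;
            break;
        }
    }
    return found;
}
```

Both functions return the index of the first match, or -1 when there is none; the second one simply needs no flag to get there.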
2009-11-17 18:37:45 -08:00
Mikhail Gusarov
dc5a9d8eb2 Close message file after parsing message headers
Keeping unused files open often leads to "Too many open files" errors.

Signed-off-by: Mikhail Gusarov <dottedmag@dottedmag.net>
2009-11-17 08:53:16 -08:00
Carl Worth
091d18c54c notmuch show: Avoid segmentation for message with no subject.
It's safer to return an empty string rather than NULL for missing
header values.
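
As a rough illustration of why an empty string is the safer return value, consider the sketch below. `header_or_empty` is a hypothetical wrapper, not a notmuch function; it only shows the idea of never handing NULL to callers that treat the result as a C string.

```c
#include <stddef.h>

/* Hypothetical wrapper: map a missing header value (NULL) to "" so that
 * callers can pass the result to strlen(), printf("%s"), and similar
 * functions without a NULL check. */
const char *
header_or_empty (const char *value)
{
    return value != NULL ? value : "";
}
```

A caller can then write `strlen (header_or_empty (subject))` even when the message has no Subject header, where `strlen (NULL)` would be undefined behavior.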
2009-11-11 23:00:58 -08:00
Carl Worth
1465493210 libify: Move library sources down into lib directory.
A "make" invocation still works from the top-level, but not from
down inside the lib directory yet.
2009-11-09 16:24:03 -08:00
Renamed from message-file.c