notmuch

mirror of https://git.notmuchmail.org/git/notmuch synced 2024-11-22 10:58:10 +01:00

Author	SHA1	Message	Date
Austin Clements	610f0e0992	lib: Reject multi-message mboxes and deprecate single-message mbox Previously, we would treat multi-message mboxes as one giant email, which, besides the obvious incorrect indexing, often led to out-of-memory errors for archival mboxes. Now we explicitly reject multi-message mboxes. For historical reasons, we retain support for single-message mboxes, but officially deprecate this behavior.	2012-11-26 21:12:10 -04:00
Michal Sojka	40edc971a8	Convert non-UTF-8 parts to UTF-8 before indexing them This fixes a bug that prevented searching for non-ASCII words in such parts. The code here was copied from show_text_part_content(), because the show command already does the needed conversion when showing the message.	2012-02-29 07:41:39 -04:00
Jameson Graef Rollins	ac7f843064	Ignore encrypted parts when indexing. It appears to be an oversight that encrypted parts were indexed previously. The terms generated from encrypted parts are meaningless and do nothing but add bloat to the database. It is not worth indexing the encrypted content, just as it's not worth indexing the signatures in signed parts.	2011-12-29 17:44:43 -04:00
Jameson Graef Rollins	1d6b49561f	tag signed/encrypted during notmuch new This patch adds the tag "signed" to messages with any multipart/signed parts, and the tag "encrypted" to messages with any multipart/encrypted parts. This only occurs when messages are indexed during notmuch new, so a database rebuild is required to have old messages tagged.	2011-05-27 16:22:00 -07:00
Carl Worth	c7b4d15d0a	Fix to index the "Re" term present in any subject. This was a misfeature where notmuch had extra code that just threw away legitimate information. It was never indexing an initial "Re" term in a subject. But some users have legitimately wanted to search for this term. The original code was written this way merely for strict compatibility with the indexing performed by sup, but we're not taking advantage of that now anyway.	2010-11-23 18:11:04 -08:00
Carl Worth	67c3bc9db4	lib: Add some missing static qualifiers. These various functions and data are all used only locally, so should be marked static. Ensuring we get these right will avoid us accidentally leaking unintended symbols through the library interface.	2010-11-01 21:58:43 -07:00
martin f. krafft	449a418c65	Do not segfault on empty mime parts notmuch previously unconditionally checked mime parts for various properties, but not for NULL, which is the case if libgmime encounters an empty mime part. Upon encounter of an empty mime part, the following is printed to stderr (the second line due to my patch): (process:17197): gmime-CRITICAL **: g_mime_message_get_mime_part: assertion `GMIME_IS_MESSAGE (message)' failed Warning: Not indexing empty mime part. This is probably a bug that should get addressed in libgmime, but for now, my patch is an acceptable workaround. Signed-off-by: martin f. krafft <madduck@madduck.net>	2010-04-13 08:49:06 -07:00
Carl Worth	2bc0af15aa	Eliminate some useless gobject boilerplate. If we had external users of this filter then they might expect some of these macros to exist. But since this is just internal, that's just unneeded noise.	2010-02-04 17:26:00 -08:00
Carl Worth	3767c6f9f9	notmuch new: Don't index uuencoded data. With modern MIME attachments, we're already avoiding indexing the attachments. But for old-school uuencoded data in the mail, we have been directly indexing the encoded data as terms, (which is not useful at all---nobody will ever try to search based on the seemingly random uuencoded data). Additionally, indexing a modestly large uuencoded file seems to make Xapian go insane, (consuming lots of memory). We fix both problems by detecting uuencoded content and not performing any indexing of it.	2010-02-04 17:08:11 -08:00
Carl Worth	6ef6ddba80	Index content from citations and signatures. In the presentation we often omit citations and signatures, but this is not content that should be omitted from the index, (especially when the citation detection is wrong---see cases where a line beginning with "From" is corrupted to ">From" by mail processing tools).	2010-01-06 10:32:06 -08:00
Carl Worth	6ca6c089e9	database: Store mail filename as a new 'direntry' term, not as 'data'. Instead of storing the complete message filename in the data portion of a mail document we now store a 'direntry' term that contains the document ID of a directory document and also the basename of the message filename within that directory. This will allow us to easily store multiple filenames for a single message, and will also allow us to find mail documents for files that previously existed in a directory but that have since been deleted.	2010-01-06 10:32:05 -08:00
Carl Worth	e5316b320a	lib/index: Fix memory leak for email addresses without names. We carefully noted the fact that we had locally allocated the string here, but then we neglected to free it. Switch to talloc instead which makes it easier to get the behavior we want. It's simpler since we can just call talloc_free unconditionally, without having to track the state of whether we allocated the storage for name or not.	2009-12-01 12:40:13 -08:00
Ingmar Vanhassel	2ce25b93a7	Typsos	2009-11-18 03:21:36 -08:00
Carl Worth	4d35c3544d	Don't create "contact" terms in the database. We never did export any interface to get at these, and when I went to use these, I found them inadequate, (because I wanted to distinguish addresses found in From: from those found in To:). Meanwhile, it was easy enough to extract addresses with a search like: notmuch show tag:sent | grep ^To: so the storage of contact terms was just wasting space. Stop that.	2009-11-12 09:38:24 -08:00
Carl Worth	1465493210	libify: Move library sources down into lib directory. A "make" invocation still works from the top-level, but not from down inside the lib directory yet.	2009-11-09 16:24:03 -08:00

15 commits