notmuch

mirror of https://git.notmuchmail.org/git/notmuch synced 2024-11-22 19:08:09 +01:00

Author	SHA1	Message	Date
Carl Worth	f9bbd7baa0	Add full-text indexing using the GMime library for parsing. This is based on the old notmuch-index-message.cc from early in the history of notmuch, but considerably cleaned up now that we have some experience with Xapian and know just what we want to index, (rather than just blindly trying to index exactly what sup does). This does slow down notmuch_database_add_message a lot, but I've got some ideas for getting some time back.	2009-10-28 12:50:10 -07:00
Carl Worth	d07dd49aac	Fix incorrect name of _notmuch_thread_get_subject. Somehow this naming with an underscore crept in, (but only in the private header, so notmuch.c was compiling with no prototype). Fix to be the notmuch_thread_get_subject originally intended.	2009-10-26 20:11:58 -07:00
Carl Worth	c12823648e	Add public notmuch_thread_get_subject And use this in "notmuch search" to display subject line as well as thread ID.	2009-10-26 17:35:31 -07:00
Carl Worth	8e96a87fff	Remove all calls to g_strdup_printf Replacing them with calls to talloc_asprintf if possible, otherwise to asprintf (with its painful error-handling leaving the pointer undefined).	2009-10-26 15:17:10 -07:00
Carl Worth	94f01d9de9	Add notmuch_thread_get_tags And augment "notmuch search" to print tag values as well as thread ID values. This tool is almost usable now.	2009-10-26 14:46:14 -07:00
Carl Worth	ef3ab5781a	tags: Replace sort() and reset() with prepare_iterator(). The previous functions were always called together, so we might as well just have one function for this. Also, the reset() name was poor, and prepare_iterator() is much more descriptive.	2009-10-26 14:12:56 -07:00
Carl Worth	3dce200788	tags: Re-implement tags iterator to avoid having C++ in the interface We want to be able to iterate over tags stored in various ways, so the previous TermIterator-based tags object just wasn't general enough. The new interface is nice and simple, and involves only C datatypes.	2009-10-26 14:02:51 -07:00
Carl Worth	1ba3d46fab	Add an initial implementation of a notmuch_thread_t object. We've now got a new notmuch_query_search_threads and a notmuch_threads_result_t iterator. The thread object itself doesn't do much yet, (just allows one to get the thread_id), but that's at least enough to see that "notmuch search" is actually doing something now, (since it has been converted to print thread IDs instead of message IDs). And maybe that's all we need. Getting the messages belonging to a thread is as simple as a notmuch_query_search_messages with a string of "thread:<thread-id>". Though it would be convenient to add notmuch_thread_get_messages which could use the existing notmuch_message_results_t iterator. Now we just need an implementation of "notmuch show" and we'll have something somewhat usable.	2009-10-25 23:12:20 -07:00
Carl Worth	884ac59256	Re-enable the warning for unused parameters. It's easy enough to squelch the warning with an __attribute__ ((unused)) and it might just catch something for us in the future.	2009-10-25 15:53:27 -07:00
Carl Worth	a360670c03	Change database to store only a single thread ID per message. Instead of supporting multiple thread IDs, we now merge together thread IDs if one message is ever found to belong to more than one thread. This allows for constructing complete threads when, for example, a child message doesn't include a complete list of References headers back to the beginning of the thread. It also simplifies dealing with mapping a message ID to a thread ID which is now a simple get_thread_id just like get_message_id, (and no longer an iterator-based thing like get_tags).	2009-10-25 14:54:13 -07:00
Carl Worth	7b227a6bf7	Add an INTERNAL_ERROR macro and use it for all internal errors. We were previously just doing fprintf;exit at each point, but I wanted to add file and line-number details to all messages, so it makes sense to use a single macro for that.	2009-10-25 10:54:49 -07:00
Carl Worth	3b8e3ab666	add_message: Propagate error status from notmuch_message_create_for_message_id What a great feeling to remove an XXX comment.	2009-10-25 10:54:43 -07:00
Carl Worth	1c2bac747e	Drop the storage of thread ID(s) in a value. Now that we are iterating over the thread terms instead, we can drop this redundant storage (which should shrink our database a tiny bit).	2009-10-25 00:31:20 -07:00
Carl Worth	28dd86af05	Implement notmuch_tags_t on top of new notmuch_terms_t The generic notmuch_terms_t iterator should provide support for notmuch_thread_ids_t when we switch as well, (And it would be interesting to see if we could reasonably make this support a PostingIterator too. Time will tell.)	2009-10-25 00:31:13 -07:00
Carl Worth	9ec68aa9c4	Shuffle the value numbers around in the database. First, it's nice that for now we don't have any users yet, so we can make incompatible changes to the database layout like this without causing trouble. ;-) There are a few reasons for this change. First, we now use value 0 uniformly as a timestamp for both mail and timestamp documents, (which lets us clean up an ugly and fragile bare 0 in the add_value and get_value calls in the timestamp code). Second, I want to drop the thread value entirely, so putting it at the end of the list means we can drop it as a compatible change in the future. (I almost want to drop the message-ID value too, but it's nice to be able to sort on it to get diff-able output from "notmuch dump".) But the thread value we never use as a value, (we would never sort on it, for example). And it's totally redundant with the thread terms we store already. So expect it to disappear soon.	2009-10-24 23:05:08 -07:00
Carl Worth	68a10091d6	Add notmuch_database_set_timestamp and notmuch_database_get_timestamp These will be very helpful to implement an efficient "notmuch new" command which imports new mail messages that have appeared.	2009-10-23 14:31:01 -07:00
Carl Worth	668f20bdfb	database: Add private find_unique_doc_id and find_unique_document functions These are a generalization of the unique-ness testing of notmuch_database_find_message. More preparation for directory timestamps.	2009-10-23 14:24:07 -07:00
Carl Worth	6b228e4509	sha1: Add new notmuch_sha1_of_string function We'll be using this for storing really long terms in the database and when we just need to look them up, (and never read back the original data directly from the database). For example, storing arbitrarily long directory paths in the database along with mtime timestamps. Note that if we did want to store arbitrarily long terms and also be able to read them back, the Xapian folks recommend splitting the term off with multiple prefixes. See the note near the end of this page: http://trac.xapian.org/wiki/FAQ/UniqueIds	2009-10-23 13:54:53 -07:00
Carl Worth	6ccdffcd87	add_message: Fix to not add multiple documents with the same message ID Here's the second big fix to message-ID handling, (the first was to generate message IDs when an email contained none). Now, with no document missing a message ID, and no two documents having the same message ID, we have a nice consistent database where the message ID can be used as a unique key.	2009-10-23 06:00:10 -07:00
Carl Worth	1b5d8984c6	Add _notmuch_message_create_for_message_id This is the last piece needed for add_message to be able to properly support a message with a duplicate message ID. This function creates a new notmuch_message_t object but one that may reference an existing document in the database.	2009-10-23 05:53:52 -07:00
Carl Worth	17548e314a	Add internal functions for manipulating a new notmuch_message_t This will support the add_message function in incrementally creating state in a new notmuch_message_t. The new functions are _notmuch_message_set_filename _notmuch_message_add_thread_id _notmuch_message_ensure_thread_id _notmuch_message_set_date _notmuch_message_sync	2009-10-23 05:48:52 -07:00
Carl Worth	c78358fa8a	Move thread_id generation code from database.cc to message.cc It's really up to the message to decide how to generate these.	2009-10-23 05:25:58 -07:00
Carl Worth	6a4992bc61	Generate message ID (using SHA1) when a mail message contains none. This is important as we're using the message ID as the unique key in our database. So previously, all messages with no message ID would be treated as the same message---not good at all.	2009-10-22 15:31:56 -07:00
Carl Worth	defd216487	Add notmuch_message_add_tag and notmuch_message_remove_tag With these two added, we now have enough functionality in the library to implement "notmuch restore".	2009-10-21 15:56:33 -07:00
Carl Worth	0bbfa57014	notmuch-private.h: Move NOTMUCH_BEGIN_DECLS earlier We actually need this before the include of xutil.h, but it was previously stuck randomly among various system includes. Instead, put it at the top, right after including the notmuch.h header that defines it.	2009-10-21 15:51:13 -07:00
Carl Worth	6c5054ebee	database: Add new notmuch_database_find_message With this function, and the recently added support for notmuch_message_get_thread_ids, we now recode the find_thread_ids function to work just the way we expect a user of the public notmuch API to work. Not too bad really.	2009-10-21 15:40:20 -07:00
Carl Worth	22b2265cac	Rename NOTMUCH_MAX_TERM to NOTMUCH_TERM_MAX Just better consistency with our naming schemes.	2009-10-21 14:10:00 -07:00
Carl Worth	6142216132	Move find_prefix function from database.cc to message.cc It's definitely a better fit there for now, (and can likely eventually be made static as add_term moves from database to message as well).	2009-10-21 14:07:40 -07:00
Carl Worth	9ec5189a56	Move declarations for xutil.c from notmuch-private to new xutil.h. The motivation here is that our top-level notmuch.c main program wants to start using these, but we don't want it to see into notmuch-private.h, (since our main program is a test vehicle for the "public" notmuch interface in notmuch.h).	2009-10-21 13:57:02 -07:00
Carl Worth	65baa4f4e7	notmuch dump: Fix the sorting of results. To properly support sorting in notmuch_query we now use an Enquire object. We also throw in a QueryParser too, so we're really close to being able to support arbitrary full-text searches. I took a look at the supported QueryParser syntax and chose a set of flags for everything I like, (such as supporting Boolean operators in either case ("AND" or "and"), supporting phrase searching, supporting + and - to include/preclude terms, and supporting a trailing * on any term as a wildcard).	2009-10-21 00:35:56 -07:00
Carl Worth	466a7bbf62	Implement 'notmuch dump'. This is a fairly big milestone for notmuch. It's our first command to do anything besides building the index, so it proves we can actually read valid results out from the index. It also puts in place almost all of the API and infrastructure we will need to allow searching of the database. Finally, with this change we are now using talloc inside of notmuch which is truly a delight to use. And now that I figured out how to use C++ objects with talloc allocation, (it requires grotty parts of C++ such as "placement new" and "explicit destructors"), we are valgrind-clean for "notmuch dump", (as in "no leaks are possible").	2009-10-20 21:21:39 -07:00
Carl Worth	cd4a8734d3	Rename private notmuch_message_t to notmuch_message_file_t This is in preparation for a new, public notmuch_message_t. Eventually, the public notmuch_message_t is going to grow enough features to need to be file-backed and will likely need everything that's now in message-file.c. So we may fold these back into one object/implementation in the future.	2009-10-20 15:09:51 -07:00
Carl Worth	b6dd413903	Protect definition of _GNU_SOURCE. I was getting a duplicate definition of this from somewhere, so getting compiler warnings without this protection.	2009-10-19 22:34:59 -07:00
Carl Worth	371091139a	Rework message parsing to use getline rather than mmap. The line-based parsing can be a bit awkward when wanting to peek ahead, (say, for folded header values), but it's so convenient to be able to trust that a string terminator exists on every line so it cleans up the code considerably.	2009-10-19 16:38:44 -07:00
Carl Worth	0e777a8f80	notmuch: Switch from gmime to custom, ad-hoc parsing of headers. Since we're currently just trying to stitch together In-Reply-To and References headers we don't need that much sophistication. It's when we later add full-text searching that GMime will be useful. So for now, even though my own code here is surely very buggy compared to GMime it's also a lot faster. And speed is what we're after for the initial index creation.	2009-10-19 13:00:43 -07:00
Carl Worth	10c176ba0e	notmuch: Start actually adding messages to the index. This is the beginning of the notmuch library as well, with its interface in notmuch.h. So far we've got create, open, close, and add_message (all with a notmuch_database prefix). The current add_message function has already been whittled down from what we have in notmuch-index-message to add only references, message-id, and thread-id to the index, (that is---just enough to do thread-linkage but nothing for full-text searching). The concept here is to do something quickly so that the user can get some data into notmuch and start using it. (The most interesting stuff is then thread-linkage and labels like inbox and unread.) We can defer the full-text indexing of the body of the messages for later, (such as in the background while the user is reading mail). The initial thread-stitching step is still slower than I would like. We may have to stop using libgmime for this step as its overhead is not worth it for the simple case of just parsing the message-id, references, and in-reply-to headers.	2009-10-18 20:56:30 -07:00

36 commits