notmuch

mirror of https://git.notmuchmail.org/git/notmuch synced 2024-11-22 10:58:10 +01:00

Author	SHA1	Message	Date
Austin Clements	567bcbc294	Store "from" and "subject" headers in the database. This is a rebase and cleanup of Istvan Marko's patch from id:m3pqnj2j7a.fsf@zsu.kismala.com Search retrieves these headers for every message in the search results. Previously, this required opening and parsing every message file. Storing them directly in the database significantly reduces IO and computation, speeding up search by between 50% and 10X. Taking full advantage of this requires a database rebuild, but it will fall back to the old behavior for messages that do not have headers stored in the database.	2011-11-14 17:10:58 -04:00
David Bremner	1dedfc90f6	xutil.c: remove duplicate copies, create new library libutil.a to contain xutil. We keep the lib/xutil.c version. As a consequence, also factor out _internal_error and associated macros. It might be overkill to make a new file error_util.c for this, but _internal_error does not really belong in database.cc.	2011-10-30 23:09:49 -03:00
Austin Clements	bfe4555325	lib: Remove message document directly after removing the last file name. Previously, notmuch_database_remove_message would remove the message file name, sync the change to the message document, re-find the message document, and then delete it if there were no more file names. An interruption after sync'ing would result in a file-name-less, permanently un-removable zombie message that would produce errors and odd results in searches. We could wrap this in an atomic section, but it's much simpler to eliminate the round-about approach and just delete the message document instead of sync'ing it if we removed the last filename.	2011-09-23 21:50:39 -04:00
Carl Worth	d5523ead90	Mark some structures in the library interface with visibility=default attribute. As of gcc 4.6, there are new warnings from -Wattributes along the lines of: warning: ‘_notmuch_messages’ declared with greater visibility than the type of its field ‘_notmuch_messages::iterator’ [-Wattributes] To squelch these, we decorate all such containing structs with __attribute__((visibility("default"))). We take care to let only the C++ compiler see this, (since the C compiler would otherwise warn about ignored visibility attributes on types).	2011-05-11 13:27:15 -07:00
Austin Clements	f3c1eebfaf	Implement an internal generic string list and use it. This replaces the guts of the filename list and tag list, making those interfaces simple iterators over the generic string list. The directory, message filename, and tags-related code now build generic string lists and then wrap them in specific iterators. The real wins come in later patches, when we use these for even more generic functionality. As a nice side-effect, this also eliminates the annoying dependency on GList in the tag list.	2011-03-21 02:45:18 -04:00
Carl Worth	99cfa27030	Add support for folder-based searching. A new "folder:" prefix in the query string can now be used to match the directories in which mail files are stored. The addition of this feature causes the recently added search-by-folder tests to now pass.	2011-01-15 15:37:43 -08:00
Austin Clements	b3caef1f06	Optimize thread search using matched docid sets. This reduces thread search's 1+2t Xapian queries (where t is the number of matched threads) to 1+t queries and constructs exactly one notmuch_message_t for each message instead of 2 to 3. notmuch_query_search_threads eagerly fetches the docids of all messages matching the user query instead of lazily constructing message objects and fetching thread ID's from term lists. _notmuch_thread_create takes a seed docid and the set of all matched docids and uses a single Xapian query to expand this docid to its containing thread, using the matched docid set to determine which messages in the thread match the user query instead of using a second Xapian query. This reduces the amount of time required to load my inbox from 4.523 seconds to 3.025 seconds (1.5X faster).	2010-12-07 16:40:05 -08:00
Carl Worth	95dd5fe5d7	notmuch_message_tags_to_maildir_flags: Do nothing outside of "new" and "cur" Some people use notmuch with non-maildir files, (for example, email messages in MH format, or else cool things like using sluk[] to suck down feeds into a format that notmuch can index). To better support uses like that, don't do any renaming for files that are not in a directory named either "new" or "cur". [] https://github.com/krl/sluk/	2010-11-11 14:32:17 -08:00
Carl Worth	1d02dd64af	lib: Add new, public notmuch_message_get_filenames This augments the existing notmuch_message_get_filename by allowing the caller access to all filenames in the case of multiple files for a single message. To support this, we split the iterator (notmuch_filenames_t) away from the list storage (notmuch_filename_list_t) where previously these were a single object (notmuch_filenames_t). Then, whenever the user asks for a file or filename, the message object lazily creates a complete notmuch_filename_list_t and then: For notmuch_message_get_filename, returns the first filename in the list. For notmuch_message_get_filenames, creates and returns a new iterator for the filename list.	2010-11-11 03:40:19 -08:00
Carl Worth	d87db88432	lib: Add new implementation of notmuch_filenames_t The new implementation is simply a talloc-based list of strings. The former support (a list of database terms with a common prefix) is implemented by simply pre-iterating over the terms and populating the list. This should provide no performance disadvantage as callers of things like notmuch_directory_get_child_files are very likely to always iterate over all filenames anyway. This new implementation of notmuch_filenames_t is in preparation for adding API to query all of the filenames for a single message.	2010-11-11 03:40:19 -08:00
Carl Worth	bb74e9dff8	lib: Rework interface for maildir_flags synchronization Instead of having an API for setting a library-wide flag for synchronization (notmuch_database_set_maildir_sync) we instead implement maildir synchronization with two new library functions: notmuch_message_maildir_flags_to_tags and notmuch_message_tags_to_maildir_flags These functions are nicely documented here, (though the implementation does not quite match the documentation yet---as plainly evidenced by the current results of the test suite).	2010-11-11 03:40:19 -08:00
Michal Sojka	088801a14a	Maildir synchronization This patch allows bi-directional synchronization between maildir flags and certain tags. The flag-to-tag mapping is defined by flag2tag array. The synchronization works this way: 1) Whenever notmuch new is executed, the following happens: o New messages are tagged with configured new_tags. o For new or renamed messages with maildir info present in the file name, the tags defined in flag2tag are either added or removed depending on the flags from the file name. 2) Whenever notmuch tag (or notmuch restore) is executed, a new set of flags based on the tags is constructed for every message and a new file name is prepared based on the old file name but with the new flags. If the flags differ and the old message was in 'new' directory then this is replaced with 'cur' in the new file name. If the new and old file names differ, the file is renamed and notmuch database is updated accordingly. The rename happens before the database is updated. In case of crash between rename and database update, the next run of notmuch new brings the database in sync with the mail store again.	2010-11-10 13:09:31 -08:00
Carl Worth	c81cecf620	lib: Add GCC visibility(hidden) pragmas to private header files. This prevents any of the private functions from being leaked out through the library interface (at least when compiling with a recent-enough gcc to support the visibility pragma).	2010-11-01 22:35:48 -07:00
Carl Worth	7b78eb4af6	Add support (and tests) for messages with really long message IDs. Scott Henson reported an internal error that occurred when he tried to add a message that referenced another message with a message ID well over 300 characters in length. The bug here was running into a Xapian limit for the length of metadata key names, (which is even more restrictive than the Xapian limit for the length of terms). We fix this by noticing long message ID values and instead using a message ID of the form "notmuch-sha1-<sha1_sum_of_message_id>". That is, we use SHA1 to generate a compressed, (but still unique), version of the message ID. We add support to the test suite to exercise this fix. The tests add a message referencing the long message ID, then add the message with the long message ID, then finally add another message referencing the long ID. Each of these tests exercise different code paths where the special handling is implemented. A final test ensures that all three messages are stitched together into a single thread---guaranteeing that the three code paths all act consistently.	2010-06-04 13:35:07 -07:00
Carl Worth	98845fdbb2	Avoid database corruption by not adding partially-constructed mail documents. Previously we were using Xapian's add_document to allocate document ID values for notmuch_message_t objects. This had the drawback of adding a partially constructed mail document to the database. If notmuch was subsequently interrupted before fully populating this document, then later runs would be quite confused when seeing the partial documents. There are reports from the wild of people hitting internal errors of the form "Message ... has no thread ID" for example, (which is currently an unrecoverable error). We fix this by manually allocating document IDs without adding documents. With this change, we never call Xapian's add_document method, but only replace_document with either the current document ID of a message or a new one that we have allocated.	2010-06-04 10:16:53 -07:00
Dirk Hohndel	5b8b0377cb	Make Signed-off-by: Dirk Hohndel <hohndel@infradead.org>	2010-04-26 14:44:06 -07:00
Dirk Hohndel	57561414d7	Add authors member to message message->authors contains the author's name (as we want to print it) get / set methods are declared in notmuch-private.h Signed-off-by: Dirk Hohndel <hohndel@infradead.org>	2010-04-26 11:44:49 -07:00
Jesse Rosenthal	4971b85641	Name thread based on matching msgs instead of first msg. At the moment all threads are named based on the name of the first message in the thread. However, this can cause problems if people either start new threads by replying-all (as unfortunately, many out there do) or change the subject of their mails to reflect a shift in a thread on a list. This patch names threads based on (a) matches for the query, and (b) the search order. If the search order is oldest-first (as in the default inbox) it chooses the oldest matching message as the subject. If the search order is newest-first it chooses the newest one. Reply prefixes ("Re: ", "Aw: ", "Sv: ", "Vs: ") are ignored (case-insensitively) so a Re: won't change the subject. Note that this adds a "sort" argument to _notmuch_thread_create and _thread_add_matched_message, so that when constructing the thread we can be aware of the sort order. Signed-off-by: Jesse Rosenthal <jrosenthal@jhu.edu>	2010-04-21 14:56:53 -07:00
Carl Worth	4e5d2f22db	lib: Rename iterator functions to prepare for reverse iteration. We rename 'has_more' to 'valid' so that it can function whether iterating in a forward or reverse direction. We also rename 'advance' to 'move_to_next' to setup parallel naming with the proposed functions 'move_to_first', 'move_to_last', and 'move_to_previous'.	2010-03-09 09:22:29 -08:00
Carl Worth	d12801c8b4	lib: Split the database upgrade into two phases for safer operation. The first phase copies data from the old format to the new format without deleting anything. This allows an old notmuch to still use the database if the upgrade process gets interrupted. The second phase performs the deletion (after updating the database version number). If the second phase is interrupted, there will be some unused data in the database, but it shouldn't cause any actual harm.	2010-01-09 11:13:12 -08:00
Carl Worth	909f52bd8c	lib: Implement versioning in the database and provide upgrade function. The recent support for renames in the database is our first time (since notmuch has had more than a single user) that we have a database format change. To support smooth upgrades we now encode a database format version number in the Xapian metadata. Going forward notmuch will emit a warning if used to read from a database with a newer version than it natively supports, and will refuse to write to a database with a newer version. The library also provides functions to query the database format version: notmuch_database_get_version to ask if notmuch wants a newer version than that: notmuch_database_needs_upgrade and a function to actually perform that upgrade: notmuch_database_upgrade	2010-01-07 18:26:31 -08:00
Carl Worth	807aef93d3	Prefer READ_ONLY consistently over READONLY. Previously we had NOTMUCH_DATABASE_MODE_READ_ONLY but NOTMUCH_STATUS_READONLY_DATABASE which was ugly and confusing. Rename the latter to NOTMUCH_STATUS_READ_ONLY_DATABASE for consistency.	2010-01-07 10:29:05 -08:00
Carl Worth	f93b7218c3	lib: Consolidate checks for read-only database. Previously, many checks were deep in the library just before a cast operation. These have now been replaced with internal errors and new checks have instead been added at the beginning of all top-level entry points requiring a read-write database. The new checks now also use a single function for checking and printing the error message. This will give us a convenient location to extend the check, (such as based on database version as well).	2010-01-07 10:19:44 -08:00
Carl Worth	d807e28f43	lib: Implement new notmuch_directory_t API. This new directory object provides all the infrastructure needed to detect when files or directories are deleted or renamed. There's still code needed on top of this (within "notmuch new") to actually do that detection.	2010-01-06 10:32:06 -08:00
Carl Worth	498edff503	database: Abstract _filename_to_direntry from _add_message The code to map a filename to a direntry is something that we're going to want in a future _remove_message function, so put it in a new function _notmuch_database_filename_to_direntry .	2010-01-06 10:32:05 -08:00
Carl Worth	1376a90db6	database: Allowing storing multiple filenames for a single message ID. The library interface is unchanged so far, (still just notmuch_database_add_message), but internally, the old _set_filename function is now _add_filename instead.	2010-01-06 10:32:05 -08:00
Carl Worth	6ca6c089e9	database: Store mail filename as a new 'direntry' term, not as 'data'. Instead of storing the complete message filename in the data portion of a mail document we now store a 'direntry' term that contains the document ID of a directory document and also the basename of the message filename within that directory. This will allow us to easily store multiple filenames for a single message, and will also allow us to find mail documents for files that previously existed in a directory but that have since been deleted.	2010-01-06 10:32:05 -08:00
Carl Worth	84742d86ab	database: Split _find_parent_id into _split_path and _find_directory_id Some pending commits want the _split_path functionality separate from mapping a directory to a document ID. The split_path function now returns the basename as well as the directory name.	2010-01-06 10:32:05 -08:00
Carl Worth	406ec4b15d	database: Export _notmuch_database_find_parent_id for internal use. We'll soon have mail documents referring to their parent directory's directory documents, so we'll need access to _find_parent_id in files such as message.cc.	2010-01-06 10:32:05 -08:00
Carl Worth	ba12bf1f26	lib: Abstract the extraction of a relative path from set_filename We'll soon be having multiple entry points that accept a filename path, so we want common code for getting a relative path from a potentially absolute path.	2010-01-06 10:32:05 -08:00
Carl Worth	8c6b7d311c	lib: Add missing value to notmuch_private_status_t enum. And fix the initialization such that the private enum will always have distinct values from the public enum even if we similarly miss the addition of a new public value in the future.	2010-01-06 10:32:05 -08:00
Fernando Carrijo	db68eea013	Nuke the remainings of _notmuch_message_add_thread_id. The function _notmuch_message_add_thread_id has been removed from the private interface of notmuch. There's no reason for one to keep a declaration of its prototype in the code base. Also, lets update a commentary that referenced that function and escaped from previous scrutiny. Signed-off-by: Fernando Carrijo <fcarrijo@yahoo.com.br>	2009-12-09 12:09:55 -08:00
Jeffrey C. Ollie	95f97540a0	Remove unused notmuch_parse_date function prototype. notmuch_parse_date is not implemented, so remove the unused function prototype. Signed-off-by: Jeffrey C. Ollie <jeff@ocjtech.us>	2009-12-03 17:07:22 -08:00
Carl Worth	880b21a097	Makefile: Incorporate getline implementation into the build. It's unconditional for a very short time. We expect to soon be building it only if necessary.	2009-12-01 16:33:17 -08:00
Carl Worth	70962fabf9	lib/messages.c: Make message searches stream as well. Xapian provides an iterator-based interface to all search results. So it was natural to make notmuch_messages_t be iterator-based as well. Which we did originally. But we ran into a problem when we added two APIs, (_get_replies and _get_toplevel_messages), that want to return a messages iterator that's not based on a Xapian search result. My original compromise was to use notmuch_message_list_t as the basis for all returned messages iterators in the public interface. This had the problem of introducing extra latency at the beginning of a search for messages, (the call would block while iterating over all results from Xapian, converting to a message list). In this commit, we remove that initial conversion and instead provide two alternate implementations of notmuch_messages_t (one on top of a Xapian iterator and one on top of a message list). With this change, I tested a "notmuch search" returning many results as previously taking about 7 seconds before results started appearing, and now taking only 2 seconds.	2009-11-24 11:33:09 -08:00
Chris Wilson	f379aa5284	Permit opening the notmuch database in read-only mode. We only rarely need to actually open the database for writing, but we always create a Xapian::WritableDatabase. This has the effect of preventing searches and like whilst updating the index. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Carl Worth <cworth@cworth.org>	2009-11-21 22:04:49 +01:00
Ingmar Vanhassel	2ce25b93a7	Typsos	2009-11-18 03:21:36 -08:00
Carl Worth	0da0131096	database: Make _parse_message_id static once again. We had exposed this to the internal implementation for a short time, (only while we had the silly code fetching In-Reply-To values from message files instead of from the database). Make this private again as it should be.	2009-11-17 18:50:13 -08:00
Keith Packard	d025e89ac7	Fix "too many open files" bug by closing message files when done with them. The message file header parsing code parses only enough of the file to find the desired header fields, then it leaves the file open until the next header parsing call or when the message is no longer in use. If a large number of messages end up being active, this will quickly run out of file descriptors. Here, we add support to explicitly close the message file within a message, (_notmuch_message_close) and call that from thread construction code. Signed-off-by: Keith Packard <keithp@keithp.com> Edited-by: Carl Worth <cworth@cworth.org>: Many portions of Keith's original patch have since been solved other ways, (such as the code that changed the handling of the In-Reply-To header). So the final version is clean enough that I think even Keith would be happy to have his name on it.	2009-11-17 18:37:13 -08:00
Carl Worth	24a25ffba9	Remove the talloc_owner argument from create_for_message_id. This function has only one caller, and that one caller was passing the same value for both talloc_owner and the notmuch database. Dropping the redundant argument simplifies the documentation of this function considerably.	2009-11-17 17:42:32 -08:00
Carl Worth	933caf814f	notmuch show: Implement proper thread ordering/nesting of messages. We now properly analyze the in-reply-to headers to create a proper tree representing the actual thread and present the messages in this correct thread order. Also, there's a new "depth:" value added to the "message{" header so that clients can format the thread as desired, (such as by indenting replies).	2009-11-15 20:41:45 -08:00
Carl Worth	d136a1e2cf	Add _notmuch_message_get_in_reply_to. The existing notmuch_message_get_header is almost good enough for this, except that we also need to remove the '<' and '>' delimiters. We'll probably want to implement this function with database storage in the future rather than loading the email message.	2009-11-15 20:36:51 -08:00
Carl Worth	b97756926f	Remove obsolete notmuch_message_get_subject prototype. This prototype has been sitting around for a while with no function implementing it. I wonder if there's a compiler warning I could turn on to catch these things.	2009-11-15 20:34:24 -08:00
Carl Worth	f970d8078c	lib/messages: Add new notmuch_message_list_t to internal interface. Previously, the notmuch_messages_t object was a linked list built on top of a linked-list node with the odd name of notmuch_message_list_t. Now, we've got much more sane naming with notmuch_message_list_t being a list built on a linked-list node named notmuch_message_node_t. And now the public notmuch_messages_t object is a separate iterator based on notmuch_message_node_t. This means the interfaces for the new notmuch_message_list_t object are now made available to the library internals.	2009-11-15 20:31:30 -08:00
Carl Worth	9b1c6c250b	Export _parse_message_id to the library implementation. Not exported through the public interface, but the thread code is going to want to be able to parse In-Reply-To headers so needs access to this code.	2009-11-15 20:21:43 -08:00
Carl Worth	d3349358c6	lib: Move notmuch_messages_t code from query.cc to new messages.c The new object is simply a linked-list of notmuch_message_t objects, (unlike the old object which contained a couple of Xapian iterators). This works now by the query code immediately iterating over all results and creating notmuch_message_t objects for them, (rather than waiting to create the objects until the notmuch_messages_get call as we did earlier). The point of this change is to allow other instances of lists of messages, (such as in notmuch_thread_t), that are not directly related to Xapian search results.	2009-11-14 23:05:17 -08:00
Carl Worth	c168e24174	notmuch search: Print the number of matched/total messages for each thread. Note that we don't print the number of unread messages, but instead the number of messages that matched the search terms. This is in keeping with our philosophy that the inbox is nothing more than a search view. If we search for messages with an inbox tag, then that's what we'll get a count of. (And if somebody does want to see unread counts, then they can search for the "unread" tag.) Getting the number of matched messages is really nice when doing historical searches. For example in a search like: notmuch search tag:sent (where the "sent" tag has been applied to all messages originating from the user's email address)---here it's really nice to be able to see a thread where the user just mentioned one point [1/13] vs. really getting involved in the discussion [10/29].	2009-11-12 22:01:44 -08:00
Carl Worth	ec6d3506db	notmuch search: Print all authors contributing to a thread. We've now expanded the notmuch_thread_create function to fire off a secondary database query to find all the messages that belong to this particular thread. This allows us to now have the complete authors' list for the thread, and will also make it trivial to print accurate message counts for threads in the future.	2009-11-12 21:09:54 -08:00
Carl Worth	1465493210	libify: Move library sources down into lib directory. A "make" invocation still works from the top-level, but not from down inside the lib directory yet.	2009-11-09 16:24:03 -08:00

49 commits