Commit graph

81 commits

Author SHA1 Message Date
Austin Clements
b3caef1f06 Optimize thread search using matched docid sets.
This reduces thread search's 1+2t Xapian queries (where t is the
number of matched threads) to 1+t queries and constructs exactly one
notmuch_message_t for each message instead of 2 to 3.
notmuch_query_search_threads eagerly fetches the docids of all
messages matching the user query instead of lazily constructing
message objects and fetching thread ID's from term lists.
_notmuch_thread_create takes a seed docid and the set of all matched
docids and uses a single Xapian query to expand this docid to its
containing thread, using the matched docid set to determine which
messages in the thread match the user query instead of using a second
Xapian query.

This reduces the amount of time required to load my inbox from 4.523
seconds to 3.025 seconds (1.5X faster).
2010-12-07 16:40:05 -08:00
Carl Worth
d064bd696c lib: Eliminate some redundant includes of xapian.h
Most files including this already include database-private.h which
includes xapian.h already.
2010-11-01 23:24:40 -07:00
Carl Worth
e83b40138e lib: Add two functions: notmuch_query_get_query_string and _get_sort
It can be handy to be able to query these settings from an existing
query object.
2010-10-28 10:30:26 -07:00
Carl Worth
f6cb896bc4 lib: Fix notmuch_query_search_threads to return NULL on any Xapian exception.
Previously, if the underlying search_messages hit an exception and returned
NULL, this function would ignore that and return a non-NULL, (but empty)
threads object. Fix this to properly propagate the error.
2010-10-22 17:56:58 -07:00
Carl Worth
138fd38afe lib: Ensure notmuch_query_search_messages returns NULL on an exception.
Previously, this function may have segfaulted immediately after
reporting the exception.
2010-04-24 07:27:50 -07:00
Sebastian Spaeth
aadb15a002 query.cc: allow to return query results unsorted
Previously, we always sorted the returned results by some string value,
(newest-to-oldest by default), however in some cases (as when applying
tags to a search result) we are not interested in any special order.

This introduces a NOTMUCH_SORT_UNSORTED value that does just that. It is
not used at the moment anywhere in the code.

Signed-off-by: Sebastian Spaeth <Sebastian@SSpaeth.de>
2010-04-21 16:06:05 -07:00
Jesse Rosenthal
4971b85641 Name thread based on matching msgs instead of first msg.
At the moment all threads are named based on the name of the first message
in the thread. However, this can cause problems if people either start
new threads by replying-all (as unfortunately, many out there do) or
change the subject of their mails to reflect a shift in a thread on a
list.

This patch names threads based on (a) matches for the query, and (b) the
search order. If the search order is oldest-first (as in the default
inbox) it chooses the oldest matching message as the subject. If the
search order is newest-first it chooses the newest one.

Reply prefixes ("Re: ", "Aw: ", "Sv: ", "Vs: ") are ignored
(case-insensitively) so a Re: won't change the subject.

Note that this adds a "sort" argument to _notmuch_thread_create and
_thread_add_matched_message, so that when constructing the thread we can
be aware of the sort order.

Signed-off-by: Jesse Rosenthal <jrosenthal@jhu.edu>
2010-04-21 14:56:53 -07:00
Carl Worth
e100871981 lib: Handle "*" as a query string to match all messages.
This seems like a generally useful thing to support, (but the previous
support through an empty string was not convenient for some users,
(such as the command-line client).
2010-04-09 17:43:58 -07:00
Carl Worth
4e5d2f22db lib: Rename iterator functions to prepare for reverse iteration.
We rename 'has_more' to 'valid' so that it can function whether
iterating in a forward or reverse direction. We also rename
'advance' to 'move_to_next' to setup parallel naming with
the proposed functions 'move_to_first', 'move_to_last', and
'move_to_previous'.
2010-03-09 09:22:29 -08:00
Carl Worth
45b1856782 lib: Explicitly set BoolWeight when searching.
All notmuch searches currently sort by value (either date or message
ID) so it's just wasted effort for Xapian to compute relevance values
for each result. We now explicitly tell Xapian that we're uninterested
in the relevance values.
2010-01-09 11:16:40 -08:00
Jeffrey C. Ollie
e991148b00 Silence compiler warning by initializing a variable.
If Xapian threw an exception on notmuch_query_count_messages the count
variable could be used uninitialized.  Initialize count to solve the
problem.

Signed-off-by: Jeffrey C. Ollie <jeff@ocjtech.us>
2009-11-27 18:38:06 -08:00
Carl Worth
70962fabf9 lib/messages.c: Make message searches stream as well.
Xapian provides an interator-based interface to all search results.
So it was natural to make notmuch_messages_t be iterator-based as
well. Which we did originally.

But we ran into a problem when we added two APIs, (_get_replies and
_get_toplevel_messages), that want to return a messages iterator
that's *not* based on a Xapian search result. My original compromise
was to use notmuch_message_list_t as the basis for all returned
messages iterators in the public interface.

This had the problem of introducing extra latency at the beginning
of a search for messages, (the call would block while iterating over
all results from Xapian, converting to a message list).

In this commit, we remove that initial conversion and instead provide
two alternate implementations of notmuch_messages_t (one on top of a
Xapian iterator and one on top of a message list).

With this change, I tested a "notmuch search" returning *many* results
as previously taking about 7 seconds before results started appearing,
and now taking only 2 seconds.
2009-11-24 11:33:09 -08:00
Carl Worth
94eb9aacd4 lib/query: Drop the first and max_messages arguments from search_messages.
These only existed to support the chunky-searching hack, but that
was recently dropped anyway.
2009-11-23 20:25:13 -08:00
Carl Worth
ba3554b804 lib/query: Fix notmuch_threads_t to stream results rather than blocking.
Previously, notmuch_query_search_threads would do all the work, so the
caller would block until all results were processed. Now, we do the
work as we go, as the caller iterates with notmuch_threads_next. This
means that once results start coming back from "notmuch search" they
just keep continually streaming.

There's still some initial blocking before the first results appear
because the notmuch_messages_t object has the same bug (for now).
2009-11-23 20:18:57 -08:00
Carl Worth
1fd8b7866f notmuch search: Remove the chunked-searching hack.
This was a poor workaround around the fact that the existing
notmuch_threads_t object is implemented poorly. It's got a fine
iterartor-based interface, but the implementation does all of the
work up-front in _create rather than doing the work incrementally
while iterating.

So to start fixing this, first get rid of all the hacks we had working
around this. This drops the --first and --max-threads options from the
search command, (but hopefully nobody was using them
anyway---notmuch.el certainly wasn't).
2009-11-23 20:17:37 -08:00
Keith Packard
53f8cc5651 Add 'notmuch count' command to show the count of matching messages
Getting the count of matching threads or messages is a fairly
expensive operation. Xapian provides a very efficient mechanism that
returns an approximate value, so use that for this new command.

This returns the number of matching messages, not threads, as that is
cheap to compute.

Signed-off-by: Keith Packard <keithp@keithp.com>
2009-11-23 06:33:54 +01:00
Carl Worth
e2341cbc09 Catch and optionally print about exception at database->flush.
If an earlier exception occurred, then it's not unexpected for the
flush to fail as well. So in that case, we'll silently catch the
exception. Otherwise, make some noise about things going wrong at the
time of flush.
2009-11-22 03:54:20 +01:00
Carl Worth
591f901241 Print information about where Xapian exception occurred.
Previously, our Xapian exception reports where identical so they
were hard to track down.
2009-11-22 03:51:35 +01:00
Eric Anholt
59c241ebd0 When a search query triggers a Xapian exception, log what the query was.
In my script containing a series of queries to be run on new mail for
setting up tags, it's nice to see which query I typed wrong.

Signed-off-by: Eric Anholt <eric@anholt.net>
2009-11-21 00:18:15 +01:00
Adrian Perez
e5da2b701f Allow lone "not" search operators
As suggested by Keith in FLAG_PURE_NOT allows for expressions like:

  notmuch search NOT tag:inbox

Note that this way a search like:

  notmuch search foobar NOT tag:inbox

should not be written instead:

  notmuch search foobar AND NOT tag:inbox

In my opinion, the latter feels more natural and is somewhat more explicit.
It gives a better clue of what the search is about instead of assuming that
an implicit AND operator is there.
2009-11-19 01:42:31 +01:00
Carl Worth
3334865725 notmuch search: Change default search order to be newest messages first.
This is what most people want for a _search_ command. It's often
different for actually reading mail in an inbox, (where it makes more
sense to have results displayed in chronological order), but in such a
case, ther user is likely using an interface that can simply pass the
--sort=oldest-first option to "notmuch search".

Here we're also change the sort enum from NOTMUCH_SORT_DATE and
NOTMUCH_SORT_DATE_REVERSE to NOTMUCH_SORT_OLDEST_FIRST and
NOTMUCH_SORT_NEWEST_FIRST. Similarly we replace the --reverse option
to "notmuch search" with two options: --sort=oldest-first and
--sort=newest-first.

Finally, these changes are all tracked in the emacs interface, (which
has no change in its behavior).
2009-11-17 20:58:30 -08:00
Carl Worth
f970d8078c lib/messages: Add new notmuch_message_list_t to internal interface.
Previously, the notmuch_messages_t object was a linked list built on
top of a linked-list node with the odd name of notmuch_message_list_t.

Now, we've got much more sane naming with notmuch_message_list_t being
a list built on a linked-list node named notmuch_message_node_t. And
now the public notmuch_messages_t object is a separate iterator based
on notmuch_message_node_t. This means the interfaces for the new
notmuch_message_list_t object are now made available to the library
internals.
2009-11-15 20:31:30 -08:00
Carl Worth
d3349358c6 lib: Move notmuch_messages_t code from query.cc to new messages.c
The new object is simply a linked-list of notmuch_message_t objects,
(unlike the old object which contained a couple of Xapian iterators).
This works now by the query code immediately iterator over all results
and creating notmuch_message_t objects for them, (rather than waiting
to create the objects until the notmuch_messages_get call as we did
earlier).

The point of this change is to allow other instances of lists of
messages, (such as in notmuch_thread_t), that are not directly related
to Xapian search results.
2009-11-14 23:05:17 -08:00
Carl Worth
f7b49d658a notmuch search: Add support for a --reverse option to reverse sort order.
Note that the difference between thread results in date order and
thread results in reverse-date order is not simply a matter of
reversing the final results. When sorting in date order, the threads
are sorted by the oldest message in the thread. When sorting in
reverse-date order, the threads are sorted by the newest message in
the thread.

This difference means that we might want an explicit option in the
interface to reverse the order, (even though the default will be to
display the inbox in date order and global searches in reverse-date
order).
2009-11-12 22:35:16 -08:00
Carl Worth
c168e24174 notmuch search: Print the number of matched/total messages for each thread.
Note that we don't print the number of *unread* messages, but instead
the number of messages that matched the search terms. This is in
keeping with our philosophy that the inbox is nothing more than a
search view. If we search for messages with an inbox tag, then that's
what we'll get a count of. (And if somebody does want to see unread
counts, then they can search for the "unread" tag.)

Getting the number of matched messages is really nice when doing
historical searches. For example in a search like:

	notmuch search tag:sent

(where the "sent" tag has been applied to all messages originating
from the user's email address)---here it's really nice to be able to
see a thread where the user just mentioned one point [1/13] vs. really
getting involved in the discussion [10/29].
2009-11-12 22:01:44 -08:00
Carl Worth
ec6d3506db notmuch search: Print all authors contributing to a thread.
We've now expanded the notmuch_thread_create function to fire off a
secondary database query to find all the messages that belong to this
particular thread. This allows us to now have the complete authors'
list for the thread, and will also make it trivial to print accurate
message counts for threads in the future.
2009-11-12 21:09:54 -08:00
Carl Worth
bbf4b8e4ae notmuch_query_search_threads: Avoid returning more threads than asked for.
I thought it would be safe enough to return a few extra threads,
(since we happened to already get the relevant messages out of the
database). The problem is that then requires the caller to carefully
read the number of threads returned and adjust its next "first" value
accordingly. The interface is much simpler to use if we simply return
exactly what is asked for and no more.
2009-11-12 20:31:22 -08:00
Carl Worth
e4a7c2b870 notmuch search: Fix a second bug in the change to support incremental searches.
The search was dropping the first thread from the results.

When am I going to break down and write a test suite? It's long
overdue now.
2009-11-12 20:12:16 -08:00
Carl Worth
523f1ce0a5 notmuch search: Fix to actually return something.
This serves me right for committing untested code. The
notmuch_query_search_threads was totally broken, (it didn't properly
treat -1 as being unlimited and instead returned no threads in that
case).
2009-11-12 20:09:12 -08:00
Carl Worth
93dcc3b695 libnotmuch: Underlying support for doing partial-results searches.
The library interface now allows the caller to do incremental searches,
(such as one page of results at a time). Next we'll just need to hook
this up to "notmuch search" and the emacs interface.
2009-11-12 16:47:27 -08:00
Carl Worth
1465493210 libify: Move library sources down into lib directory.
A "make" invocation still works from the top-level, but not from
down inside the lib directory yet.
2009-11-09 16:24:03 -08:00
Renamed from query.cc (Browse further)