Commit graph

317 commits

Author SHA1 Message Date
Jan Janak
c3c52e464b notmuch: New function to retrieve all tags from the database.
This patch adds a new function called notmuch_database_get_all_tags
which can be used to obtain a list of all tags from the database
(in other words, the list contains all tags from all messages). The
function produces an alphabetically sorted list.

To add support for the new function, we rip the guts off of
notmuch_message_get_tags and put them in a new generic function
called _notmuch_convert_tags. The generic function takes a
Xapian::TermIterator as argument and uses the iterator to find tags.
This makes the function usable with different Xapian objects.

Function notmuch_message_get_tags is then reimplemented to call the
generic function with message->doc.termlist_begin() as argument.

Similarly, we implement notmuch_database_get_all_tags; the
function calls the generic function with db->xapian_db->allterms_begin()
as argument.

Finally, notmuch_database_get_all_tags is exported through
lib/notmuch.h

Signed-off-by: Jan Janak <jan@ryngle.com>
2009-11-26 07:01:52 -08:00
Carl Worth
70962fabf9 lib/messages.c: Make message searches stream as well.
Xapian provides an iterator-based interface to all search results.
So it was natural to make notmuch_messages_t be iterator-based as
well. Which we did originally.

But we ran into a problem when we added two APIs, (_get_replies and
_get_toplevel_messages), that want to return a messages iterator
that's *not* based on a Xapian search result. My original compromise
was to use notmuch_message_list_t as the basis for all returned
messages iterators in the public interface.

This had the problem of introducing extra latency at the beginning
of a search for messages, (the call would block while iterating over
all results from Xapian, converting to a message list).

In this commit, we remove that initial conversion and instead provide
two alternate implementations of notmuch_messages_t (one on top of a
Xapian iterator and one on top of a message list).

With this change, I tested a "notmuch search" returning *many* results:
previously it took about 7 seconds before results started appearing,
and now it takes only 2 seconds.
2009-11-24 11:33:09 -08:00
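
The "two alternate implementations" idea from commit 70962fabf9 above can be sketched in plain C. This is only an illustration, not the notmuch code: every name here (node_t, messages_t, messages_from_list and so on) is hypothetical, and an integer array with a position stands in for the Xapian iterator. The caller sees one iterator type, and nothing is converted up front.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a linked list of messages. */
typedef struct node {
    int id;                 /* stand-in for a message object */
    struct node *next;
} node_t;

typedef enum { BACKEND_LIST, BACKEND_CURSOR } backend_t;

/* One public iterator type with two possible backends. */
typedef struct {
    backend_t backend;
    const node_t *node;     /* used when backend == BACKEND_LIST */
    const int *ids;         /* used when backend == BACKEND_CURSOR */
    size_t pos, len;        /* cursor position; stands in for a search iterator */
} messages_t;

messages_t messages_from_list(const node_t *head)
{
    messages_t m = { BACKEND_LIST, head, NULL, 0, 0 };
    return m;
}

messages_t messages_from_cursor(const int *ids, size_t len)
{
    messages_t m = { BACKEND_CURSOR, NULL, ids, 0, len };
    return m;
}

int messages_has_more(const messages_t *m)
{
    if (m->backend == BACKEND_LIST)
        return m->node != NULL;
    return m->pos < m->len;
}

int messages_get(const messages_t *m)
{
    assert(messages_has_more(m));
    if (m->backend == BACKEND_LIST)
        return m->node->id;
    return m->ids[m->pos];
}

void messages_advance(messages_t *m)
{
    if (!messages_has_more(m))
        return;
    if (m->backend == BACKEND_LIST)
        m->node = m->node->next;
    else
        m->pos++;
}

/* Example consumer: works the same way for either backend. */
int sum_ids(messages_t m)
{
    int sum = 0;
    for (; messages_has_more(&m); messages_advance(&m))
        sum += messages_get(&m);
    return sum;
}
```

A consumer such as sum_ids never learns which backend it was handed, so the search path can hand out results as the cursor produces them while the list-based callers keep working.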
Carl Worth
94eb9aacd4 lib/query: Drop the first and max_messages arguments from search_messages.
These only existed to support the chunky-searching hack, but that
was recently dropped anyway.
2009-11-23 20:25:13 -08:00
Carl Worth
ba3554b804 lib/query: Fix notmuch_threads_t to stream results rather than blocking.
Previously, notmuch_query_search_threads would do all the work, so the
caller would block until all results were processed. Now, we do the
work as we go, as the caller iterates with notmuch_threads_next. This
means that once results start coming back from "notmuch search" they
just keep continually streaming.

There's still some initial blocking before the first results appear
because the notmuch_messages_t object has the same bug (for now).
2009-11-23 20:18:57 -08:00
Carl Worth
1fd8b7866f notmuch search: Remove the chunked-searching hack.
This was a poor workaround for the fact that the existing
notmuch_threads_t object is implemented poorly. It's got a fine
iterator-based interface, but the implementation does all of the
work up-front in _create rather than doing the work incrementally
while iterating.

So to start fixing this, first get rid of all the hacks we had working
around this. This drops the --first and --max-threads options from the
search command, (but hopefully nobody was using them
anyway---notmuch.el certainly wasn't).
2009-11-23 20:17:37 -08:00
Carl Worth
793cbf8049 Add rudimentary date-based search.
The rudimentary aspect here is that the date ranges are specified with
UNIX timestamp values (number of seconds since 1970-01-01 UTC). One
thing that can help here is using the date program to determine
timestamps, such as:

	$(date +%s -d 2009-10-01)..$(date +%s)

Long-term, we'll probably need to do our own query parsing to be able
to support directly-specified dates and also relative expressions like
"since:'2 months ago'".
2009-11-23 17:17:08 +01:00
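
For commit 793cbf8049 above, the same kind of "start..end" range can be computed without the date program. The sketch below is an assumption-laden illustration: the function names are hypothetical, and it treats the date as midnight UTC, whereas "date -d" interprets it in the local time zone.

```c
#include <assert.h>
#include <stdio.h>

/* Days in each month of a non-leap year. */
static const int days_in_month[12] = {
    31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};

int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Seconds since 1970-01-01 00:00:00 UTC for midnight UTC on the given date.
 * Only dates from 1970 onward are supported. */
long long utc_midnight_timestamp(int year, int month, int day)
{
    long long days = 0;

    assert(year >= 1970 && month >= 1 && month <= 12 && day >= 1 && day <= 31);

    for (int y = 1970; y < year; y++)
        days += is_leap_year(y) ? 366 : 365;

    for (int m = 1; m < month; m++) {
        days += days_in_month[m - 1];
        if (m == 2 && is_leap_year(year))
            days += 1;      /* February 29th */
    }

    days += day - 1;
    return days * 86400LL;
}

/* Write "<start>..<end>" into buf and return the number of characters
 * that the full string needs (as snprintf does). */
int format_timestamp_range(char *buf, size_t size, long long start, long long end)
{
    return snprintf(buf, size, "%lld..%lld", start, end);
}
```

Calling utc_midnight_timestamp(2009, 10, 1) gives the start of the range, and the end would normally come from time(NULL).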
Keith Packard
53f8cc5651 Add 'notmuch count' command to show the count of matching messages
Getting the count of matching threads or messages is a fairly
expensive operation. Xapian provides a very efficient mechanism that
returns an approximate value, so use that for this new command.

This returns the number of matching messages, not threads, as that is
cheap to compute.

Signed-off-by: Keith Packard <keithp@keithp.com>
2009-11-23 06:33:54 +01:00
Bart Trojanowski
ceee152fca fix notmuch-new bug when database path ends with a trailing /
I configured my database.path with a trailing /, and after running notmuch
new every notmuch search would fail with error messages like this:

  Error opening /inbox/cur/1258565257.000211.mbox:2,S: No such file or directory

The actual bug was in the filename normalization for storage in the
database.  The database.path was removed from the full filename, but if
the database.path from the config file contained a trailing /, the
relative file name would retain an extra leading /... which made it look
like an absolute path after it was read out from the DB.

Signed-off-by: Bart Trojanowski <bart@jukie.net>
2009-11-23 04:37:01 +01:00
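
The normalization problem described in commit ceee152fca above can be illustrated with a small C sketch. It is not the actual fix; relative_filename is a hypothetical helper that tolerates trailing slashes on the configured path and never returns a name with a leading slash.

```c
#include <assert.h>
#include <string.h>

/* Return a pointer into full_path to the part relative to db_path,
 * or NULL if full_path does not live under db_path.
 * Hypothetical helper, written only to illustrate the bug. */
const char *relative_filename(const char *db_path, const char *full_path)
{
    size_t len;
    const char *rel;

    assert(db_path != NULL && full_path != NULL);

    /* Ignore any trailing '/' on the configured database path. */
    len = strlen(db_path);
    while (len > 0 && db_path[len - 1] == '/')
        len--;

    if (strncmp(full_path, db_path, len) != 0)
        return NULL;

    /* The prefix must end on a path-component boundary. */
    rel = full_path + len;
    if (*rel != '/' && *rel != '\0')
        return NULL;

    /* Drop the separator(s) so the result never looks absolute. */
    while (*rel == '/')
        rel++;

    return rel;
}
```

With this approach "/home/user/mail" and "/home/user/mail/" produce the same relative name for a given message file.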
Chris Wilson
3e4ab913db lib/database.cc: coding style
Carl claims he must have been distracted when he wrote this...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2009-11-22 05:26:59 +01:00
Chris Wilson
530df68258 Makefile: Magic silent rules.
Use the facilities of GNU make to create a magic function that will
on the first invocation print a description of how to enable verbose
compile lines and then print the quiet rule.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Carl Worth <cworth@cworth.org>
Cc: Mikhail Gusarov <dottedmag@dottedmag.net>

[ickle: Rebased, and duplicate command string eliminated.]
[ickle: Fixed verbose bug pointed out by Mikhail]
2009-11-22 04:29:29 +01:00
Carl Worth
5d56e931b9 add_message: Use sha-1 in place of overly long message ID.
Since Xapian has a limit on the maximum length of a term, we have
to check for that before trying to add the message ID as a term.

This fixes the bug reported by Mike Hommey here:

	<20091120132625.GA19246@glandium.org>

I've also constructed 20 files with a range of message ID lengths
centered around the Xapian term-length limit which I'll use to seed a
new test suite soon.
2009-11-22 04:03:49 +01:00
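
The length check from commit 5d56e931b9 above might look roughly like the following. Several things here are assumptions made for illustration: the limit of 245 bytes, the "hash-" prefix, and all function names. To keep the sketch short it uses a 64-bit FNV-1a hash as a stand-in; the commit itself uses SHA-1, and a real term would also carry a prefix that counts against the limit.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed maximum term length; hypothetical constant for this sketch. */
#define TERM_MAX 245

/* 64-bit FNV-1a, used here only as a stand-in for SHA-1. */
uint64_t fnv1a64(const char *s)
{
    uint64_t h = 0xcbf29ce484222325ULL;     /* FNV offset basis */
    for (; *s != '\0'; s++) {
        h ^= (unsigned char) *s;
        h *= 0x100000001b3ULL;              /* FNV prime */
    }
    return h;
}

/* Write the term to use for message_id into out.
 * Short IDs are used as-is; overly long ones are replaced by a
 * fixed-length digest string. */
void message_id_term(const char *message_id, char *out, size_t out_size)
{
    assert(message_id != NULL && out != NULL);
    assert(out_size > TERM_MAX);

    if (strlen(message_id) <= TERM_MAX)
        snprintf(out, out_size, "%s", message_id);
    else
        snprintf(out, out_size, "hash-%016" PRIx64, fnv1a64(message_id));
}
```

The important property is that the same long message ID always maps to the same short term, so later lookups by message ID still find the document.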
Carl Worth
f336ee034b get_timestamp: Ensure that return value is 0 in case of exception.
Just to be on the safe side of things.
2009-11-22 03:55:39 +01:00
Carl Worth
e2341cbc09 Catch and optionally report an exception at database->flush.
If an earlier exception occurred, then it's not unexpected for the
flush to fail as well. So in that case, we'll silently catch the
exception. Otherwise, make some noise about things going wrong at the
time of flush.
2009-11-22 03:54:20 +01:00
Carl Worth
717279fbcf Add a missing print after catching an exception.
Without this, trying to debug this exception was *really* confusing.
2009-11-22 03:52:55 +01:00
Carl Worth
591f901241 Print information about where Xapian exception occurred.
Previously, our Xapian exception reports were identical so they
were hard to track down.
2009-11-22 03:51:35 +01:00
Carl Worth
b725481cb3 Fix freak case problem that broke the compile.
I think I must have bumped some emacs keybinding that changed the case
of a word here.
2009-11-21 22:29:31 +01:00
Carl Worth
637f99d8f3 Rename NOTMUCH_DATABASE_MODE_WRITABLE to NOTMUCH_DATABASE_MODE_READ_WRITE
And correspondingly, READONLY to READ_ONLY.
2009-11-21 22:10:18 +01:00
Chris Wilson
f379aa5284 Permit opening the notmuch database in read-only mode.
We only rarely need to actually open the database for writing, but we
always create a Xapian::WritableDatabase. This has the effect of
preventing searches and the like whilst updating the index.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Carl Worth <cworth@cworth.org>
2009-11-21 22:04:49 +01:00
Eric Anholt
59c241ebd0 When a search query triggers a Xapian exception, log what the query was.
In my script containing a series of queries to be run on new mail for
setting up tags, it's nice to see which query I typed wrong.

Signed-off-by: Eric Anholt <eric@anholt.net>
2009-11-21 00:18:15 +01:00
Carl Worth
3ae12b1e28 add_message: Re-fix handling of non-mail files.
More fallout from _get_header now returning "" for missing headers.

The bug here is that we would no longer detect that a file is not an
email message and give up on it like we should.

And this time, I actually audited all callers to
notmuch_message_get_header, so hopefully we're done fixing this
bug over and over.
2009-11-20 21:46:37 +01:00
Carl Worth
656e4c413d notmuch_database_add_message: Add missing error-value propagation.
Thanks to Mike Hommey for doing the analysis that led to noticing that
this was missing.
2009-11-20 21:02:11 +01:00
Carl Worth
52292c5485 add_message: Properly handle missing Message-ID once again.
There's been a fair amount of fallout from when we changed
message_file_get_header from returning NULL to returning "" for
missing headers. This is yet more fallout from that, (where we were
accepting an empty message-ID rather than generating one like we want
to).
2009-11-20 19:36:01 +01:00
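
The fallout described in commits 3ae12b1e28 and 52292c5485 above comes from callers that only tested for NULL. A minimal sketch of a check that treats both NULL and "" as missing follows; header_is_missing and choose_message_id are hypothetical names, not functions from the library.

```c
#include <assert.h>
#include <stddef.h>

/* A header counts as missing whether the lookup returned NULL
 * (the old convention) or "" (the new one). */
int header_is_missing(const char *value)
{
    return value == NULL || value[0] == '\0';
}

/* Pick the message ID to store: the header value if there is one,
 * otherwise the generated fallback supplied by the caller. */
const char *choose_message_id(const char *header_value, const char *generated)
{
    assert(generated != NULL);
    return header_is_missing(header_value) ? generated : header_value;
}
```

Routing every caller through one predicate like this is one way to avoid fixing the same bug at each call site.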
Carl Worth
31b54bc787 Avoid access of a Xapian iterator's object when there's nothing there.
This eliminates a crash when a message (either corrupted or a non-mail
file that wasn't properly detected as not being mail) has no In-Reply-To
header, (and so few terms that trying to skip to the prefix of the
In-Reply-To terms actually brings us to the end of the termlist).
2009-11-20 12:06:11 +01:00
Adrian Perez
e5da2b701f Allow lone "not" search operators
As suggested by Keith, FLAG_PURE_NOT allows for expressions like:

  notmuch search NOT tag:inbox

Note that this way a search like:

  notmuch search foobar NOT tag:inbox

should now be written instead as:

  notmuch search foobar AND NOT tag:inbox

In my opinion, the latter feels more natural and is somewhat more explicit.
It gives a better clue of what the search is about instead of assuming that
an implicit AND operator is there.
2009-11-19 01:42:31 +01:00
Ingmar Vanhassel
2ce25b93a7 Typsos
2009-11-18 03:21:36 -08:00
Carl Worth
fc3a3be337 link_message: Avoid segfault when In-Reply-To header is empty.
This was recently introduced in commit:

	64c03ae97f

which was adding extra checks to avoid adding a self-referencing
message.

How many times am I going to fix a dumb regression like this and say
"we really need a test suite" before I actually sit down and write the
test suite?
2009-11-18 01:36:30 -08:00
Carl Worth
3334865725 notmuch search: Change default search order to be newest messages first.
This is what most people want for a _search_ command. It's often
different for actually reading mail in an inbox, (where it makes more
sense to have results displayed in chronological order), but in such a
case, the user is likely using an interface that can simply pass the
--sort=oldest-first option to "notmuch search".

Here we also change the sort enum from NOTMUCH_SORT_DATE and
NOTMUCH_SORT_DATE_REVERSE to NOTMUCH_SORT_OLDEST_FIRST and
NOTMUCH_SORT_NEWEST_FIRST. Similarly we replace the --reverse option
to "notmuch search" with two options: --sort=oldest-first and
--sort=newest-first.

Finally, these changes are all tracked in the emacs interface, (which
has no change in its behavior).
2009-11-17 20:58:30 -08:00
Carl Worth
0da0131096 database: Make _parse_message_id static once again.
We had exposed this to the internal implementation for a short time,
(only while we had the silly code fetching In-Reply-To values from
message files instead of from the database). Make this private again
as it should be.
2009-11-17 18:50:13 -08:00
Carl Worth
c50891f449 database: Add "replyto" to the database schema documentation.
Maybe the lack of this documentation is why I forgot we were actually
storing this and wrote the ugly code to fetch In-Reply-To from message
files rather than from the database.
2009-11-17 18:48:38 -08:00
Carl Worth
6e9fdf0abf database: Rename "ref" prefix name to "reference"
Which is more consistent with the XREFERENCE prefix used in the terms
in the database. Also remove some stale documentation describing the
removal of resolved references from the database (we no longer do
this).
2009-11-17 18:44:02 -08:00
Carl Worth
8cf72920e1 message_file_get_header: Use break where more clear than continue.
Calling continue here worked only because we set a flag before the
continue, and check the flag at the beginning of the loop, and *then*
break. It's much more clear to just break in the first place.
2009-11-17 18:37:45 -08:00
Keith Packard
d025e89ac7 Fix "too many open files" bug by closing message files when done with them.
The message file header parsing code parses only enough of the file to
find the desired header fields, then it leaves the file open until the
next header parsing call or when the message is no longer in use. If a
large number of messages end up being active, this will quickly run
out of file descriptors.

Here, we add support to explicitly close the message file within a
message, (_notmuch_message_close) and call that from thread
construction code.

Signed-off-by: Keith Packard <keithp@keithp.com>

Edited-by: Carl Worth <cworth@cworth.org>:

Many portions of Keith's original patch have since been solved other
ways, (such as the code that changed the handling of the In-Reply-To
header). So the final version is clean enough that I think even Keith
would be happy to have his name on it.
2009-11-17 18:37:13 -08:00
Carl Worth
64c03ae97f add_message: Don't add any self-references to the database.
In our scheme it's illegal for any message to refer to itself, (nor
would it be useful for anything anyway). Cut these self-references off
at the source, before they trip up any internal errors.
2009-11-17 17:55:37 -08:00
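
Cutting self-references off at the source, as commit 64c03ae97f above describes, amounts to a filter applied before anything is stored. The following C sketch shows the idea with a hypothetical function; the real code adds terms to a database document rather than filling an array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy into out every reference that differs from message_id and
 * return how many were kept. out must have room for n entries. */
size_t filter_self_references(const char *message_id,
                              const char **refs, size_t n,
                              const char **out)
{
    size_t kept = 0;

    assert(message_id != NULL);

    for (size_t i = 0; i < n; i++) {
        /* A message referring to itself is dropped silently. */
        if (strcmp(refs[i], message_id) != 0)
            out[kept++] = refs[i];
    }
    return kept;
}
```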
Carl Worth
f7eaeff242 message_get_thread_id: Generate internal error if message has no thread ID.
This case was happening when a message had its own message ID in its
In-Reply-To header. The thread-resolution code would find the
partially constructed message, (with no thread ID yet), get garbage
from this function, and then march right along with that garbage.

With this commit, a self-cyclic message like this will now trigger an
internal error rather than marching along silently. (And a subsequent
commit will remove the call to this function in this case.)
2009-11-17 17:42:32 -08:00
Carl Worth
24a25ffba9 Remove the talloc_owner argument from create_for_message_id.
This function has only one caller, and that one caller was passing the
same value for both talloc_owner and the notmuch database. Dropping
the redundant argument simplifies the documentation of this function
considerably.
2009-11-17 17:42:32 -08:00
Carl Worth
387828c435 get_in_reply_to: Implement via the database, not by opening mail file.
This reduces our reliance on open message_file objects, (so is a step
toward fixing the "too many open files" bug), but more importantly, it
means we don't load a self-referencing in-reply-to header, (since we
weed those out before adding any replyto terms to the database).
2009-11-17 17:40:19 -08:00
Carl Worth
12d3014d88 Fix broken commit.
Oops. I should have actually compiled before pushing.
2009-11-17 09:04:14 -08:00
Mikhail Gusarov
469ea9ebc6 Include <stdint.h> to get uint32_t in C++ file with gcc 4.4
Signed-off-by: Mikhail Gusarov <dottedmag@dottedmag.net>
2009-11-17 08:53:19 -08:00
Mikhail Gusarov
dc5a9d8eb2 Close message file after parsing message headers
Keeping unused files open makes "Too many open files" errors likely.

Signed-off-by: Mikhail Gusarov <dottedmag@dottedmag.net>
2009-11-17 08:53:16 -08:00
Carl Worth
0dab6a2c1e add_message: Avoid a memory leak when user holds on to message return.
When this function was originally written, the 'message' object was
always destroyed locally, so I thought it would be good to use a NULL
talloc context to make it more obvious if there was any leak.

Since then, however, this function has been changed to optionally
return the added message, and in that case we *don't* free the message
locally, so let's let the database be the talloc context.
2009-11-17 08:50:14 -08:00
Carl Worth
933caf814f notmuch show: Implement proper thread ordering/nesting of messages.
We now properly analyze the in-reply-to headers to create a proper
tree representing the actual thread and present the messages in this
correct thread order. Also, there's a new "depth:" value added to the
"message{" header so that clients can format the thread as desired,
(such as by indenting replies).
2009-11-15 20:41:45 -08:00
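
The "depth:" value from commit 933caf814f above can be derived from In-Reply-To links alone. The sketch below is a simplified illustration with hypothetical types and names: it computes only the depth of each message by walking up the parent chain, and leaves out the tree traversal that produces the display order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *id;             /* message ID */
    const char *in_reply_to;    /* parent's message ID, or NULL */
} msg_t;

/* Index of the message with the given ID, or -1 if it is not in the thread. */
int find_message(const msg_t *msgs, int n, const char *id)
{
    for (int i = 0; i < n; i++) {
        if (strcmp(msgs[i].id, id) == 0)
            return i;
    }
    return -1;
}

/* Depth of msgs[index]: 0 for a top-level message, parent's depth + 1
 * for a reply. A parent that is not in the thread makes the message
 * top-level. The depth < n bound stops runaway loops on cyclic input. */
int message_depth(const msg_t *msgs, int n, int index)
{
    int depth = 0;

    assert(index >= 0 && index < n);

    while (msgs[index].in_reply_to != NULL && depth < n) {
        int parent = find_message(msgs, n, msgs[index].in_reply_to);
        if (parent < 0)
            break;
        index = parent;
        depth++;
    }
    return depth;
}
```

A client could then indent each message by its depth, which is the use the commit message suggests.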
Carl Worth
d136a1e2cf Add _notmuch_message_get_in_reply_to.
The existing notmuch_message_get_header is *almost* good enough for
this, except that we also need to remove the '<' and '>'
delimiters. We'll probably want to implement this function with
database storage in the future rather than loading the email message.
2009-11-15 20:36:51 -08:00
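
Removing the '<' and '>' delimiters, as commit d136a1e2cf above requires, is a small string operation. Here is a hedged C sketch with a hypothetical function name; it also trims surrounding whitespace, which is an assumption about what a header value may contain rather than something the commit states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy value into out without surrounding whitespace and without one
 * enclosing pair of '<' and '>'. Returns the length of the result,
 * or -1 if out is too small. */
int strip_angle_brackets(const char *value, char *out, size_t out_size)
{
    const char *start = value;
    const char *end;
    size_t len;

    assert(value != NULL && out != NULL);

    while (*start == ' ' || *start == '\t')
        start++;

    end = start + strlen(start);
    while (end > start && (end[-1] == ' ' || end[-1] == '\t' ||
                           end[-1] == '\r' || end[-1] == '\n'))
        end--;

    /* Only strip the delimiters when both are present. */
    if (end - start >= 2 && *start == '<' && end[-1] == '>') {
        start++;
        end--;
    }

    len = (size_t) (end - start);
    if (len + 1 > out_size)
        return -1;

    memcpy(out, start, len);
    out[len] = '\0';
    return (int) len;
}
```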
Carl Worth
b97756926f Remove obsolete notmuch_message_get_subject prototype.
This prototype has been sitting around for a while with no function
implementing it. I wonder if there's a compiler warning I could turn
on to catch these things.
2009-11-15 20:34:24 -08:00
Carl Worth
f970d8078c lib/messages: Add new notmuch_message_list_t to internal interface.
Previously, the notmuch_messages_t object was a linked list built on
top of a linked-list node with the odd name of notmuch_message_list_t.

Now, we've got much more sane naming with notmuch_message_list_t being
a list built on a linked-list node named notmuch_message_node_t. And
now the public notmuch_messages_t object is a separate iterator based
on notmuch_message_node_t. This means the interfaces for the new
notmuch_message_list_t object are now made available to the library
internals.
2009-11-15 20:31:30 -08:00
Carl Worth
9034e396b6 database: Fix a typo in a commit.
Nothing significant here, but we might as well not keep things
misspelled when we notice.
2009-11-15 20:23:27 -08:00
Carl Worth
9b1c6c250b Export _parse_message_id to the library implementation.
Not exported through the public interface, but the thread code is
going to want to be able to parse In-Reply-To headers so needs access
to this code.
2009-11-15 20:21:43 -08:00
Carl Worth
54be14098b _thread_add_messages: Remove unused variable.
I'm not sure how I let this warning go by unfixed for a while. Fix
it now.
2009-11-15 20:21:12 -08:00
Carl Worth
d3349358c6 lib: Move notmuch_messages_t code from query.cc to new messages.c
The new object is simply a linked-list of notmuch_message_t objects,
(unlike the old object which contained a couple of Xapian iterators).
This works now by the query code immediately iterating over all results
and creating notmuch_message_t objects for them, (rather than waiting
to create the objects until the notmuch_messages_get call as we did
earlier).

The point of this change is to allow other instances of lists of
messages, (such as in notmuch_thread_t), that are not directly related
to Xapian search results.
2009-11-14 23:05:17 -08:00
Carl Worth
c979fc5b05 notmuch_tags_advance: Make safe against excessive calls.
Previously, an excess call would have caused a crash. Now it simply
does nothing. Also, make notmuch_tags_get use a similar, consistent
early return for a NULL iterator.
2009-11-14 23:02:55 -08:00
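
The defensive behaviour described in commit c979fc5b05 above can be shown on a toy iterator. The types and function names below are hypothetical stand-ins for the real tags object; the point is only the early returns.

```c
#include <assert.h>
#include <stddef.h>

typedef struct tag_node {
    const char *tag;
    struct tag_node *next;
} tag_node_t;

typedef struct {
    tag_node_t *current;    /* NULL once the iterator is exhausted */
} tags_t;

int tags_has_more(const tags_t *tags)
{
    return tags != NULL && tags->current != NULL;
}

/* Returns NULL instead of dereferencing an exhausted iterator. */
const char *tags_get(const tags_t *tags)
{
    if (!tags_has_more(tags))
        return NULL;
    return tags->current->tag;
}

/* An excess call is a no-op rather than a crash. */
void tags_advance(tags_t *tags)
{
    if (!tags_has_more(tags))
        return;
    tags->current = tags->current->next;
}
```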
Carl Worth
ed2643333c notmuch search: Fix thread dates to come only from matched messages.
We were properly sorting the threads based only on matched messages,
but we were displaying the date based on the total messages in the
thread, which led to inconsistent and very confusing results.
2009-11-12 23:10:04 -08:00
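
The consistency fix in commit ed2643333c above comes down to deriving the displayed date from the same subset of messages that drives the sort. A minimal sketch, with hypothetical names, follows. Whether the thread shows its newest or oldest matched message is not stated in the commit; this sketch assumes the newest.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    long long date;     /* UNIX timestamp of the message */
    int matched;        /* non-zero if the message matched the search */
} thread_msg_t;

/* Newest date among the matched messages of a thread,
 * or -1 if no message matched. Unmatched messages are ignored
 * even when they are newer. */
long long thread_display_date(const thread_msg_t *msgs, size_t n)
{
    long long newest = -1;

    assert(msgs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (msgs[i].matched && msgs[i].date > newest)
            newest = msgs[i].date;
    }
    return newest;
}
```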