The previous functions were always called together, so we might as
well just have one function for this. Also, the reset() name was
poor, and prepare_iterator() is much more descriptive.
We want to be able to iterate over tags stored in various ways, so
the previous TermIterator-based tags object just wasn't general
enough. The new interface is nice and simple, and involves only
C datatypes.
We will soon be wanting multiple different implementations of
notmuch_tags_t iterators, so we need to keep the actual structure
as an implementation detail inside of tags.cc.
We want to start using this from both message.cc and thread.cc so we
need it in a place we can share the code. This also requires a new
notmuch-private-cxx.h header file for interfaces that include
C++-specific datatypes (such as Xapian::Document).
We had documented both notmuch_thread_results_get and
notmuch_message_results_get to return NULL if (! has_more)
but we hadn't actually implemented that. Fix.
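
As a rough illustration of the documented contract, here is a minimal
sketch of an iterator whose get function returns NULL once has_more is
false. The results_t type and the results_* names are hypothetical
stand-ins over a plain array, not the actual notmuch structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a results iterator: a cursor over a
 * fixed array of strings. */
typedef struct {
    const char **items;
    size_t count;
    size_t index;
} results_t;

int
results_has_more (const results_t *results)
{
    assert (results != NULL);

    return results->index < results->count;
}

/* Return the current item, or NULL once the iterator is exhausted. */
const char *
results_get (const results_t *results)
{
    if (! results_has_more (results))
        return NULL;

    return results->items[results->index];
}

/* Move to the next item; a no-op once the iterator is exhausted. */
void
results_advance (results_t *results)
{
    if (results_has_more (results))
        results->index++;
}

/* Walk the remaining items with the usual for-loop idiom. */
size_t
results_count_remaining (results_t *results)
{
    size_t n = 0;

    for (; results_has_more (results); results_advance (results))
        n++;

    return n;
}
```

With the guard in get, a caller that runs past the end gets NULL back
instead of reading beyond the array.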
We've now got a new notmuch_query_search_threads and a
notmuch_threads_result_t iterator. The thread object itself
doesn't do much yet, (just allows one to get the thread_id),
but that's at least enough to see that "notmuch search" is
actually doing something now, (since it has been converted
to print thread IDs instead of message IDs).
And maybe that's all we need. Getting the messages belonging
to a thread is as simple as a notmuch_query_search_messages
with a string of "thread:<thread-id>".
Though it would be convenient to add notmuch_thread_get_messages
which could use the existing notmuch_message_results_t iterator.
Now we just need an implementation of "notmuch show" and we'll
have something somewhat usable.
Along with renaming notmuch_results_t to notmuch_message_results_t.
The new type is quite a mouthful, but I don't expect it to be
used much other than the for-loop idiom in the documentation,
(which does at least fit nicely within 80 columns).
This is all in preparation for the addition of a new
notmuch_query_search_threads of course.
Even with the recent warnings work, gcc didn't tell me about a static
function that I'm not calling? Apparently I get "defined but not
used" in C files, but not C++ files. That's bogus, and yet one more
reason for me to push the C++ to a minimal lower layer.
I didn't notice this because `xapian-config --cxxflags` gives empty
output on my system. But for someone with the xapian library
installed in some non-standard location this would be important.
I didn't end up adding any of the warnings options that aren't allowed
for C++, (such as -Wold-style-definition, -Wnested-externs,
-Werror-implicit-function-declaration, -Wstrict-prototypes,
-Wmissing-prototypes, or -Wbad-function-cast). So for now we can
drop the separate C and C++ variables for warnings.
Having to enumerate all the enum values at every switch is annoying,
but this warning actually found a bug, (missing support for
NOTMUCH_STATUS_OUT_OF_MEMORY in notmuch_status_to_string).
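
To show the kind of bug this catches, here is a small sketch of a
status-to-string function written as a switch with no default case.
The status_t enum and its strings are made up for illustration. With
gcc's -Wswitch (part of -Wall) or -Wswitch-enum, removing any one of
the case labels below should produce a warning about the unhandled
enumeration value.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical status codes, for illustration only. */
typedef enum {
    STATUS_SUCCESS = 0,
    STATUS_OUT_OF_MEMORY,
    STATUS_FILE_ERROR
} status_t;

const char *
status_to_string (status_t status)
{
    /* No default label here, so the compiler can tell us when a
     * newly added enum value has no case of its own. */
    switch (status) {
    case STATUS_SUCCESS:
        return "No error occurred";
    case STATUS_OUT_OF_MEMORY:
        return "Out of memory";
    case STATUS_FILE_ERROR:
        return "Something went wrong trying to read or write a file";
    }

    /* Only reached for values outside the enum. */
    return "Unknown error status value";
}
```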
When adding -Wextra we also add -Wno-unused-parameter since that
warning means well enough, but is really annoying in practice.
So the warnings we fix here are basically all comparisons between
signed and unsigned values.
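
For anyone wondering why these comparisons deserve a warning, here is
a small sketch. The function names are made up for illustration. The
first function spells out the conversion that C applies implicitly
when an int is compared with an unsigned int; the second checks the
sign before converting.

```c
#include <assert.h>

/* What "a < b" really computes when a is int and b is unsigned:
 * a is converted to unsigned first, so a negative a becomes a
 * very large value. */
int
converted_less (int a, unsigned int b)
{
    return (unsigned int) a < b;
}

/* Handle the negative case explicitly before converting. */
int
checked_less (int a, unsigned int b)
{
    if (a < 0)
        return 1;

    return (unsigned int) a < b;
}

/* A few sanity checks showing where the two versions disagree. */
void
sign_compare_demo (void)
{
    assert (converted_less (-1, 1u) == 0);
    assert (checked_less (-1, 1u) == 1);
    assert (converted_less (0, 1u) == checked_less (0, 1u));
}
```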
Instead of supporting multiple thread IDs, we now merge together
thread IDs if one message is ever found to belong to more than one
thread. This allows for constructing complete threads when, for
example, a child message doesn't include a complete list of References
headers back to the beginning of the thread.
It also simplifies dealing with mapping a message ID to a thread ID
which is now a simple get_thread_id just like get_message_id, (and no
longer an iterator-based thing like get_tags).
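
The merge itself can be pictured with a toy in-memory model. This is
only a sketch under stated assumptions: the message_t struct and
merge_threads function are hypothetical, and the real code works on
database documents rather than an array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory message record with exactly one thread ID. */
typedef struct {
    const char *message_id;
    const char *thread_id;
} message_t;

/* Move every message in thread 'loser' over to thread 'winner'.
 * Returns the number of messages that changed thread. */
size_t
merge_threads (message_t *messages, size_t count,
               const char *winner, const char *loser)
{
    size_t i, moved = 0;

    assert (winner != NULL && loser != NULL);

    for (i = 0; i < count; i++) {
        if (strcmp (messages[i].thread_id, loser) == 0) {
            messages[i].thread_id = winner;
            moved++;
        }
    }

    return moved;
}
```

Since each message ends up with a single thread ID, looking it up is a
plain field access rather than an iteration.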
We dropped the THREAD_ID value from the database a while back, but here
is code that's carefully computing that value and then never doing
anything with it. Delete, delete, delete.
The function was getting too long-winded before. And since I'm about
to change how we handle the thread linking, it's convenient to have
it in an isolated function.
We were previously just doing fprintf;exit at each point, but I
wanted to add file and line-number details to all messages, so it
makes sense to use a single macro for that.
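
A minimal sketch of what such a macro could look like follows. The
format_error helper, the message layout, and the exact shape of the
INTERNAL_ERROR macro are assumptions for illustration, not
necessarily what the tree contains.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format "Internal error: <message> (<file>:<line>)" into buf.
 * Kept separate from the macro so it can be exercised without
 * exiting the process. */
void
format_error (char *buf, size_t size, const char *file, int line,
              const char *format, ...)
{
    char message[256];
    va_list va;

    assert (buf != NULL && size > 0);

    va_start (va, format);
    vsnprintf (message, sizeof (message), format, va);
    va_end (va);

    snprintf (buf, size, "Internal error: %s (%s:%d)",
              message, file, line);
}

/* Report the error with the caller's file and line, then exit.
 * The first argument must be a printf-style format string. */
#define INTERNAL_ERROR(...)                                     \
    do {                                                        \
        char error_buf_[512];                                   \
        format_error (error_buf_, sizeof (error_buf_),          \
                      __FILE__, __LINE__, __VA_ARGS__);         \
        fprintf (stderr, "%s\n", error_buf_);                   \
        exit (1);                                               \
    } while (0)
```

A call site then shrinks to a single line such as
INTERNAL_ERROR ("unexpected value %d", value), and every message
picks up its location for free.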
The "notmuch setup" output was getting overwhelmingly verbose.
Also, some people might not have a lot of mail, so might never need
this optimization. It's much better to move the hint to the time
when the user could actually benefit from it, (it's easy to detect
that "notmuch new" took more than 1 second, and we know if there
are any read-only directories there or not).
Aside from increased code sharing, the benefit here is that now
thread_ids iterates over the terms of a message rather than the
thread_id value. So we'll now be able to drop that value.
The generic notmuch_terms_t iterator should provide support for
notmuch_thread_ids_t when we switch as well. (And it would be
interesting to see if we could reasonably make this support a
PostingIterator too. Time will tell.)
First, it's nice that for now we don't have any users yet, so we
can make incompatible changes to the database layout like this
without causing trouble. ;-)
There are a few reasons for this change. First, we now use value 0
uniformly as a timestamp for both mail and timestamp documents, (which
lets us clean up an ugly and fragile bare 0 in the add_value and
get_value calls in the timestamp code).
Second, I want to drop the thread value entirely, so putting it at the
end of the list means we can drop it as a compatible change in the
future. (I almost want to drop the message-ID value too, but it's nice
to be able to sort on it to get diff-able output from "notmuch dump".)
But the thread value we never use as a value, (we would never sort on
it, for example). And it's totally redundant with the thread terms we
store already. So expect it to disappear soon.
We're now dropping all pretense of keeping the database directly
compatible with sup's current xapian backend. (But perhaps someone
might write a new notmuch backend for sup in the future.)
In coming up with the prefix values here, I tried to follow the
conventions of http://xapian.org/docs/omega/termprefixes.html as
closely as makes sense, (with some domain translation from "web"
to "email archive").
The idea here is that only some of the prefix names (such as "id" and
"tag") actually make sense in external user-supplied query
strings. Other things like "type" are internal implementation details
of how we store things in the database. So internal machinery will add
those terms to the database and we don't need to support them in the
string itself.
With this, we can now simply loop over the external prefix values to
let the query parser know about them. So as we add prefixes in the
future, we'll only need to add them to this list.
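
To make the external-list idea concrete, here is a toy sketch of the
table and the loop. The prefix letters are illustrative placeholders,
and expand_user_term is a hypothetical stand-in for what the real
query parser does once it has been told about each external prefix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    const char *name;    /* what the user types, e.g. "tag" */
    const char *prefix;  /* what gets stored in the term */
} prefix_t;

/* Only prefixes that make sense in user-supplied query strings.
 * The letters here are placeholders for illustration. */
static const prefix_t EXTERNAL_PREFIXES[] = {
    { "thread", "G" },
    { "tag", "K" },
    { "id", "Q" }
};

#define ARRAY_SIZE(arr) (sizeof (arr) / sizeof (arr[0]))

/* Turn "name:value" into "<prefix>value" if name is an external
 * prefix. Returns 1 on success, 0 if the name is not recognized
 * (which includes internal-only names such as "type"). */
int
expand_user_term (const char *input, char *out, size_t size)
{
    size_t i;

    assert (input != NULL && out != NULL);

    for (i = 0; i < ARRAY_SIZE (EXTERNAL_PREFIXES); i++) {
        const prefix_t *p = &EXTERNAL_PREFIXES[i];
        size_t len = strlen (p->name);

        if (strncmp (input, p->name, len) == 0 && input[len] == ':') {
            snprintf (out, size, "%s%s", p->prefix, input + len + 1);
            return 1;
        }
    }

    return 0;
}
```

Adding a new user-visible prefix in this sketch means adding one row
to the table and nothing else.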
It's not much of a script, (we don't have that many commands after
all), but it's the kind of thing that's nice to have and gives the
tool a slightly more polished feel.
The key for this is to call add_boolean_prefix on the QueryParser
object. That tells the query parser to take something like "tag:inbox"
and transform it into the "Linbox" term and do what it needs to do to
make this term a requirement of the search. We're starting to have a
real system here.
Also, I didn't want to expose the ugly name of "msgid" to the user, so
we add a prefix name of simply "id" instead.
I'm planning to change prefix values soon, which would break code
like this. So eliminate the fragility by going through our existing
_find_prefix function.
Here's the big bug that was preventing any searches from working at
all like desired. I did the work to carefully pick out exactly the
flags that I wanted, and then I threw it away by trying to combine
them with & instead of | (so just passing 0 for flags instead).
Much better now.
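
The mistake is easy to reproduce in isolation. In this sketch the
flag names and values are made up for illustration and are not the
query parser's real constants. Each flag occupies its own bit, so
combining them with & leaves nothing set.

```c
#include <assert.h>

/* Hypothetical single-bit flags. */
enum {
    FLAG_BOOLEAN  = 1 << 0,
    FLAG_PHRASE   = 1 << 1,
    FLAG_LOVEHATE = 1 << 2,
    FLAG_WILDCARD = 1 << 3
};

/* The bug: distinct bits have no bits in common, so this is 0. */
unsigned int
flags_with_and (void)
{
    return FLAG_BOOLEAN & FLAG_PHRASE & FLAG_LOVEHATE & FLAG_WILDCARD;
}

/* The fix: | sets every requested bit. */
unsigned int
flags_with_or (void)
{
    return FLAG_BOOLEAN | FLAG_PHRASE | FLAG_LOVEHATE | FLAG_WILDCARD;
}

/* Testing for a flag is where & actually belongs. */
int
flag_is_set (unsigned int flags, unsigned int flag)
{
    assert (flag != 0);

    return (flags & flag) != 0;
}
```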
It's nice that Xapian provides a little function to print a textual
representation of the entire query tree. So now, if you compile
like so:
    make CFLAGS=-DDEBUG_QUERY
then you get a nice output of the query string received by the query
module, and the final query actually being sent to Xapian.
This isn't behaving at all like it's documented yet, (for example,
it's returning message IDs not thread IDs[*]). In fact, the output
code is just a copy of the body of "notmuch dump", so all you
get for now is message ID and tags.
But this should at least be enough to start exercising the query
functionality, (which is currently very buggy).
[*] I'll want to convert the database to store thread documents
before fixing that.
The current problem is that when this function fails the caller
doesn't get any information about what the particular failure
was, (something in the filesystem? or in Xapian?). We should fix
that.
With some recent testing, the timestamp was failing, (overflowing
the term limit), and reporting an error, but the top-level notmuch
command was still returning a success return value.
I think it's high time to add a test suite, (and the code base is
small enough that if we add it now it shouldn't be *too* hard to
shoot for a very high coverage percentage).
The previous code was only correct as long as the timestamp prefix
was only a single character. But with the recent change to a
multi-character prefix, this broke. So fix it now.
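
The general shape of the fix is to skip strlen (prefix) characters
rather than a hard-coded one. A minimal sketch follows; the
term_value name and the "XTIMESTAMP" prefix used in the example
are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the part of 'term' that follows 'prefix', or NULL if the
 * term does not start with that prefix. Works for a prefix of any
 * length, unlike "term + 1". */
const char *
term_value (const char *term, const char *prefix)
{
    size_t len;

    assert (term != NULL && prefix != NULL);

    len = strlen (prefix);
    if (strncmp (term, prefix, len) != 0)
        return NULL;

    return term + len;
}
```

For a term of "XTIMESTAMP/home/mail", skipping a single character
would yield "TIMESTAMP/home/mail", while term_value with the full
prefix yields "/home/mail".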
I've decided not to try for sup compatibility at the level of the
xapian database. There's just too much about sup's usage of the
database that I don't like, (beyond the embedded ruby data structures
there is redundant storage of message IDs, thread IDs, and dates (in
both terms and values)).
I'm going to fix that up in the database of notmuch, with some other
changes as well. (I plan to drop "reference" terms once linkage to a
thread ID through the reference is established. I also plan to add
actual documents to represent threads.)
So with all that incompatibility, I might as well make my own prefix
values. And while doing that, I should try to be as compatible as
possible with the conventions described here:
http://xapian.org/docs/omega/termprefixes.html
With this, "notmuch new" is now plenty fast even with large archives
spanning many sub-directories. Document this both in "notmuch help"
and also in the output of "notmuch setup".