This code comes courtesy of Brian Gladman and Mikhail Gusarov.
Both files are available under the GPL and were downloaded as
version 0.2 of libsha1 from git://github.com/dottedmag/libsha1.git
with the following commit:
commit d0f0e7e0dc5ce2d58972cb5a492183c0d4e58433
Author: Mikhail Gusarov <dottedmag@dottedmag.net>
Date: Mon Oct 20 22:38:47 2008 +0700
Version bump.
Signed-off-by: Mikhail Gusarov <dottedmag@dottedmag.net>
It's pretty easy to do with all the right infrastructure in place.
Now that I can get my tags from sup to notmuch, maybe I'll be able
to start reading mail again.
We actually need this before the include of xutil.h, but
it was previously stuck randomly among various system
includes. Instead, put it at the top, right after the include of
the notmuch.h header that defines it.
This is where we wanted to put the note to recommend the user
call notmuch_message_destroy if the lifetime of the message
is much shorter than the lifetime of the query. (Somehow this
had ended up in the documentation of notmuch_message_get_tags
before.)
Previously, this would allocate new memory with every call. That
was with talloc, of course, so there wasn't any leaking (eventually).
But since we're now calling this internally we want to be a little
less wasteful. It's easy enough to just stash the result into the
message on the first call, and then just return that on subsequent
calls.
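
A minimal sketch of the stash-on-first-call idea, using plain malloc
rather than talloc to stay self-contained. The example_message_t type,
its fields, and example_message_get_id are all hypothetical stand-ins,
not the real notmuch structures:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a message object (not notmuch_message_t). */
typedef struct {
    const char *raw_header; /* where the value is looked up from */
    char *message_id;       /* cached copy, NULL until first requested */
    int lookups;            /* counts real lookups, for illustration only */
} example_message_t;

/* Return the message ID, doing the allocation and copy only once. */
const char *
example_message_get_id (example_message_t *message)
{
    if (message->message_id == NULL) {
        size_t len = strlen (message->raw_header) + 1;

        message->message_id = malloc (len);
        if (message->message_id == NULL)
            return NULL;
        memcpy (message->message_id, message->raw_header, len);
        message->lookups++;
    }

    /* Subsequent calls just hand back the stashed pointer. */
    return message->message_id;
}
```

The caller never owns the returned string here; it lives as long as
the message does.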
With this function, and the recently added support for
notmuch_message_get_thread_ids, we now recode the find_thread_ids
function to work just the way we expect a user of the public
notmuch API to work. Not too bad really.
Along with all of the notmuch_thread_ids_t iterator functions.
Using a consistent idiom seems better here rather than returning
a comma-separated string and forcing the user to parse it.
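
Roughly what the iterator idiom looks like from the caller's side.
This is only a sketch: the example_thread_ids_t type and the
has_more/get/advance names below are assumptions for illustration, not
the actual notmuch declarations:

```c
#include <stddef.h>

/* Hypothetical iterator over a fixed list of thread IDs. */
typedef struct {
    const char *const *ids; /* the underlying list */
    size_t count;           /* number of entries in ids */
    size_t current;         /* index of the entry get() will return */
} example_thread_ids_t;

/* Nonzero while there is still an ID to read. */
int
example_thread_ids_has_more (const example_thread_ids_t *thread_ids)
{
    return thread_ids->current < thread_ids->count;
}

/* Current ID, or NULL once the iterator is exhausted. */
const char *
example_thread_ids_get (const example_thread_ids_t *thread_ids)
{
    if (! example_thread_ids_has_more (thread_ids))
        return NULL;
    return thread_ids->ids[thread_ids->current];
}

/* Step to the next ID; does nothing once exhausted. */
void
example_thread_ids_advance (example_thread_ids_t *thread_ids)
{
    if (example_thread_ids_has_more (thread_ids))
        thread_ids->current++;
}
```

A caller then writes a plain for loop over has_more/advance and never
has to split a string on commas.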
The motivation here is that our top-level notmuch.c main program
wants to start using these, but we don't want it to see into
notmuch-private.h, (since our main program is a test vehicle
for the "public" notmuch interface in notmuch.h).
I'm too lazy to see what the RFC says, but I know that having
whitespace inside a message-ID is sure to confuse things. And
besides, this makes things more compatible with sup so that
I have some hope of importing sup labels.
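
For illustration, a sketch of dropping whitespace from a message-ID in
place. It assumes the fix amounts to removing spaces, tabs, and line
breaks; example_strip_whitespace is a hypothetical helper, not the
function in the tree:

```c
#include <stddef.h>

/* Remove every space, tab, CR and LF from id, rewriting it in place. */
void
example_strip_whitespace (char *id)
{
    char *out = id;

    for (const char *in = id; *in != '\0'; in++) {
        if (*in != ' ' && *in != '\t' && *in != '\r' && *in != '\n')
            *out++ = *in;
    }

    *out = '\0';
}
```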
To properly support sorting in notmuch_query we now use an
Enquire object. We also throw in a QueryParser too, so we're
really close to being able to support arbitrary full-text
searches.
I took a look at the supported QueryParser syntax and chose
a set of flags for everything I like, (such as supporting
Boolean operators in either case ("AND" or "and"), supporting
phrase searching, supporting + and - to include/preclude terms,
and supporting a trailing * on any term as a wildcard).
This is to help keep the report looking clean when a new report
is shorter than a previous report, (say, when crossing the
boundary from over one minute remaining to less than one minute
remaining).
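
One way to get that effect, sketched under the assumption that the
report is redrawn over the same terminal line: pad the new text with
spaces out to the length of the previous one. The function name and
signature are made up for this example:

```c
#include <stdio.h>
#include <string.h>

/* Format text into buf, left-justified and padded with spaces to at
 * least prev_len characters so leftovers of a longer, earlier report
 * get overwritten. Returns the unpadded length, which the caller
 * passes back in as prev_len next time. */
size_t
example_format_report (char *buf, size_t size,
                       const char *text, size_t prev_len)
{
    snprintf (buf, size, "%-*s", (int) prev_len, text);

    return strlen (text);
}
```

The caller would print buf followed by a carriage return and keep the
returned length around for the next update.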
This used to be here, but I must have accidentally dropped it
when reformatting the progress report recently.
Using the address of a static char* was clever, but really
unnecessary. An empty string is much less magic, and even
easier to understand as the way to query everything from
the database.
Previously we were leaking[*] memory in that the memory footprint of
a "notmuch dump" run would continue to grow until the output was
complete, and then finally all the memory would be freed.
Now, the memory footprint is small and constant, O(1) rather than
O(n) in the number of messages.
[*] Not leaking in a valgrind sense---every byte was still carefully
being accounted for and freed eventually.
None of these are strictly necessary, (everything was leak-free
without them), but notmuch_message_destroy can actually be useful
when one query has many message results, but only one needs
to be live at a time.
The destroy functions for results and tags are fairly gratuitous, as
there's unlikely to be any benefit from calling them. But they're all
easy to add, (all of these functions are just wrappers for talloc_free),
and we do so for consistency and completeness.
This is a fairly big milestone for notmuch. It's our first command
to do anything besides building the index, so it proves we can
actually read valid results out from the index.
It also puts in place almost all of the API and infrastructure we
will need to allow searching of the database.
Finally, with this change we are now using talloc inside of notmuch
which is truly a delight to use. And now that I figured out how
to use C++ objects with talloc allocation, (it requires grotty
parts of C++ such as "placement new" and "explicit destructors"),
we are valgrind-clean for "notmuch dump", (as in "no leaks are
possible").
This is in preparation for a new, public notmuch_message_t.
Eventually, the public notmuch_message_t is going to grow enough
features to need to be file-backed and will likely need everything
that's now in message-file.c. So we may fold these back into one
object/implementation in the future.
The recent change from GIOChannel to getline, (with a semantic
change of the newline terminator now being included in the
result that setup_command sees), broke this.
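
A small sketch of coping with that semantic change: getline hands back
the line with its newline still attached, so the caller chops it off
before comparing. example_chomp is a hypothetical helper, not the code
that went into setup_command:

```c
#include <stddef.h>

/* Remove a single trailing newline, if present. len is the length
 * reported by getline; the new length is returned. */
size_t
example_chomp (char *line, size_t len)
{
    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';

    return len;
}
```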
I'm trying to chase down 3 still-reachable pointers to glib hash
tables.
This change didn't help with that, but I think destroy might be a
better semantic match for what I actually want. (It shouldn't matter
though since I never take any additional references.)
We were properly freeing this memory when the thread-ids list was not
empty, but leaking it when it was.
Thanks, of course, to valgrind along with the G_SLICE=always-malloc
environment variable which makes leak checking with glib almost
bearable.
We were careful to free this memory when we finished parsing the
headers, but we missed it for the case of closing the message
without ever parsing all of the headers.
I was incorrectly using the return value of stat (-1) instead of
errno (ENOENT) to try to construct the error message here.
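
The distinction, as a sketch: stat only ever returns 0 or -1, and the
actual reason lives in errno. The function name and the message
wording below are illustrative assumptions:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Returns 0 if path can be examined. Otherwise writes a readable
 * message into msg and returns the errno value from stat. */
int
example_check_path (const char *path, char *msg, size_t size)
{
    struct stat st;

    if (stat (path, &st) == -1) {
        /* Save errno right away, before another call can clobber it.
         * Passing the -1 return value to strerror would be the bug. */
        int err = errno;

        snprintf (msg, size, "Error reading %s: %s", path, strerror (err));
        return err;
    }

    return 0;
}
```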
Also, while we're here, reword the error message to not have
"stat" in it, which in spite of what a Unix programmer will
tell you, is not actually a word.
Since we allow the user to enter a custom directory, we need to
let the user know how to make this persistent. Of course, a better
answer would be to take what the user entered and shove it into
a ~/.notmuch-config file or so, but for now this will have to do.
When documenting these functions I described support for a
NOTMUCH_BASE environment variable to be consulted in the case
of a NULL path. Only, I had forgotten to actually write the
code.
This code exists now, with a new, exported function:
notmuch_database_default_path
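
A sketch of the lookup order described above. The fallback argument is
an assumption added for the example (what happens when NOTMUCH_BASE is
unset is not stated here), and the example_ functions are hypothetical,
not the exported function itself:

```c
#include <stdlib.h>

/* Value of NOTMUCH_BASE if set and non-empty, else the fallback
 * supplied by the caller (an assumption of this sketch). */
const char *
example_default_path (const char *fallback)
{
    const char *base = getenv ("NOTMUCH_BASE");

    if (base != NULL && base[0] != '\0')
        return base;

    return fallback;
}

/* An explicit path wins; a NULL path means "consult the default". */
const char *
example_resolve_path (const char *path, const char *fallback)
{
    if (path != NULL)
        return path;

    return example_default_path (fallback);
}
```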
A simple bug meant that the correct value was being inserted into
the hash table, but a NULL value would be returned in some cases.
(If the value was already in the hash table at the beginning of
the call then the correct value would be returned, but if the
function had to parse to reach it then it would return NULL.)
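
To make the two paths concrete, here is a sketch with a small array
standing in for the hash table and a counter standing in for the
parser position. Every name is hypothetical; only the shape of the
fast path and the slow path is meant to match:

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_MAX_HEADERS 8

/* Headers are "parsed" lazily: only the first `parsed` entries count
 * as already stored in the cache. */
typedef struct {
    const char *names[EXAMPLE_MAX_HEADERS];
    const char *values[EXAMPLE_MAX_HEADERS];
    size_t count;  /* total headers in the input */
    size_t parsed; /* how many have been parsed so far */
} example_headers_t;

const char *
example_get_header (example_headers_t *headers, const char *name)
{
    /* Fast path: the value was stored by an earlier call. */
    for (size_t i = 0; i < headers->parsed; i++) {
        if (strcmp (headers->names[i], name) == 0)
            return headers->values[i];
    }

    /* Slow path: keep parsing until the header turns up. */
    while (headers->parsed < headers->count) {
        size_t i = headers->parsed++;

        /* Return the value just stored; handing back NULL at this
         * point is the kind of bug described above. */
        if (strcmp (headers->names[i], name) == 0)
            return headers->values[i];
    }

    return NULL;
}
```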
This was tripping up the recently-added code to ignore messages
with NULL From:, Subject:, and To: headers, (which is fortunate
since otherwise the broken parsing might have stayed hidden for
longer).
The big update here is the addition of the dump and restore commands
which are next on my list. Also, I've now come up with a syntax for
documenting the arguments of sub-commands.
This is helpful for things like indexes that other mail programs
may have left around. It also means we can make the initial
instructions much easier, (the user need not worry about moving
away auxiliary files from some other email program).
These were just little tests while getting comfortable with
GMime and xapian. I'll likely use pieces of these as notmuch
continues, but for now let's not distract anyone looking
at notmuch with these.
And the code will live on in the history if I need to look
at it.
I noticed this style during a recent Debian install and I liked
how much less busy it is compared to what we had before, (while
still telling the user everything she might want).
The line-based parsing can be a bit awkward when wanting to peek
ahead, (say, for folded header values), but being able to trust
that a string terminator exists on every line is so convenient
that it cleans up the code considerably.
Looks like we can copy in a hash-table implementation, (from cairo,
say), and then a few _ascii_ functions from glib, (we'll need to
switch a few current uses of things like isspace, etc. to
locale-independent versions as well). So not too hard to free ourselves
of glib for now, (until we add GMime back in later, of course).
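
For a sense of scale, a locale-independent isspace is only a few
lines. This sketch accepts the six characters that isspace accepts in
the "C" locale; the name is made up, and it is not claimed to match
the glib function character for character:

```c
/* Nonzero for space, tab, newline, vertical tab, form feed and
 * carriage return, whatever the current locale says. */
int
example_ascii_isspace (char c)
{
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\v' || c == '\f' || c == '\r';
}
```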