We know that matched messages are always added in order, so we can
always just grab the subject from the first message. This is the same
approach that was used previously in _thread_add_message. That is, the
recent feature of renaming a thread based on the subject of the
"first" matched message is as simple as moving the subject assignment
from _thread_add_message to _thread_add_matched_message.
At the moment all threads are named based on the name of the first message
in the thread. However, this can cause problems if people either start
new threads by replying-all (as unfortunately, many out there do) or
change the subject of their mails to reflect a shift in a thread on a
list.
This patch names threads based on (a) matches for the query, and (b) the
search order. If the search order is oldest-first (as in the default
inbox) it chooses the oldest matching message as the subject. If the
search order is newest-first it chooses the newest one.
Reply prefixes ("Re: ", "Aw: ", "Sv: ", "Vs: ") are ignored
(case-insensitively) so a Re: won't change the subject.
Note that this adds a "sort" argument to _notmuch_thread_create and
_thread_add_matched_message, so that when constructing the thread we can
be aware of the sort order.
Signed-off-by: Jesse Rosenthal <jrosenthal@jhu.edu>
When constructing a thread, we usually run a nested query to find all
messages in the thread that match the original search string. However,
we need to have special-case handling of an original search string of
"*" now that that is a supported means of specifying all messages.
The special-case ends up bein quite simple---we do less work, (just
skipping the nested search since we know that all messages must
match). I had been wanting to write this identical code to more
efficiently handle "notmuch search thread:<foo>" which was previously
running two identical searches. So that case is now more efficient as
well.
This encodes the library version into the library, where the linking
binary can pick it up, and the linker can even enforce mismatches in
the minor release, (such as linking a binary against version 1.2 and
then attempting to run it against version 1.1).
I'm not sure which system Aaron used, but on the machine I have access
to, (Darwin 8.11.0), the -shared and -dylib_install_name options are
not recognized. Instead I use -dynamic_lib and -install_name as
documented here:
http://www.finkproject.org/doc/porting/shared.php
This patch adds a configure check for OS X (actually Darwin),
and sets up the Makefiles to build a proper shared library on
that platform.
Signed-off-by: Aaron Ecay <aaronecay@gmail.com>
notmuch previously unconditionally checked mime parts for various
properties, but not for NULL, which is the case if libgmime encounters
an empty mime part.
Upon encounter of an empty mime part, the following is printed to
stderr (the second line due to my patch):
(process:17197): gmime-CRITICAL **: g_mime_message_get_mime_part: assertion `GMIME_IS_MESSAGE (message)' failed
Warning: Not indexing empty mime part.
This is probably a bug that should get addressed in libgmime, but for
not, my patch is an acceptable workaround.
Signed-off-by: martin f. krafft <madduck@madduck.net>
Reviewed-by: Carl Worth <cworth@cworth.org>:
The original proposal for having different open modes used the name
WRITABLE. I didn't like that name, (easy to misspell as WRITEABLE even
for native English speakers). So we renamed it to READ_WRITE
immediately, but apparently some of the documentation held the old
name for a while.
Previously, we were only adding the reference terms for cases where
the referenced message did not yet exist in the database. For thread
presentation, it's useful to have the connection information provided
by the references, even when the messages are present. So add this
term unconditionally.
This function was recently modified, (to include a metadata lookup for
a message's thread ID before looking for parent/child thread IDs), but
the documentation wasn't updated. Fix that.
There are two primary cases in this function, (the message exists in
the database or it does not). Previously the code for these two cases
was split and intermingled with goto-spaghetti connections.
This allows us to thread messages even when we receive them out of
order, or never receive the root.
The thread ids for messages that aren't present but are referred to are
stored as metadata in the database and then retrieved if we ever get
that message.
When determining the thread id for a message we also check for this
metadata so that we can thread descendants of a message together before
we receive it.
Edited by Carl Worth <cworth@cworth.org>: Split this portion of the
commit from the earlier-applied portion adding test cases.
This seems like a generally useful thing to support, (but the previous
support through an empty string was not convenient for some users,
(such as the command-line client).
fix notmuch_message_file_get_header to always return the first instance
of the header you are looking for
Signed-off-by: Dirk Hohndel <hohndel@infradead.org>
With the original quiet function, there's an actual purpose (hiding
excessively long compiler command lines so that warnings and errors
from the compiler can be seen).
But with things like quiet_symlink there's nothing quieter. In fact
"SYMLINK" is longer than "ln -sf". So all this is doing is hiding the
actual command from the user for no real benefit.
The only actual reason we implemented the quiet_* functions was to be
able to neatly right-align the command name and left-align the arguments.
Let's give up on that, and just left-align everything, simplifying the
Makefiles considerably. Now, the only instances of a captialized command
name in the output is if there's some actually shortening of the command
itself.
We add a magic line to the beginning of each Makefile.local file to
help the editor know that it should use makefile mode for editing the
file, (even though the filename isn't exactly "Makefile").
Edited-by: Carl Worth <cworth@cworth.org>: Expand treatment from
emacs/Makefile.local to each instance of Makefile.local.
The idea here is to allow a new user of notmuch to be able to run
notmuch immediately after compiling, (without having to install
the shared library first). This also ensures that the test suite
tests the locally compiled library, and not whatever installled
version of the library the dynamic linker happens to find.
When I wanted to create a debian package from the current master, make
install failed because of non-existent include directory. This patch
fixes this minor issue.
We had a fairly ugly violation of modularity with the top-level
Makefile.local isntalling everything, (even when the build commands
for the library were down in lib/Makefile.local).
For the case of adding a file that already exist, (with the same
filename). In this case, nothing will happen to the database, but
that wasn't clear before.
We were previously maintaining two lists of the child Makefile
fragments---one for the includes and another for the dependencies. So,
of course, they drifted and the dependency list wasn't up to date.
We fix this by adding a single subdirs variable, and then using GNU
Makefile substitution to generate both the include and the dependency
lists.
Some side effect of this change caused the '=' assignment of the dir
variable to not work anymore. I'm not sure why that is, but using ':='
makes sense here and fixes the problem.
Commit cd467caf renamed notmuch_query_search to notmuch_query_search_messages.
Commit 1ba3d46f created notmuch_query_search_threads. We better keep the docs
of notmuch_query_create consistent with those changes.
Signed-off-by: Fernando Carrijo <fcarrijo@yahoo.com.br>
Edited-by: Carl Worth to explicitly list the full name of each
function being referenced.
We rename 'has_more' to 'valid' so that it can function whether
iterating in a forward or reverse direction. We also rename
'advance' to 'move_to_next' to setup parallel naming with
the proposed functions 'move_to_first', 'move_to_last', and
'move_to_previous'.
Thanks to Michal Sojka <sojkam1@fel.cvut.cz> for pointing out the
correct fix, which I verified in the freely-available WG14/N1124 draft
(from the C99 working group) which is available here:
http://www.open-std.org/JTC1/SC22/wg14/www/docs/n1124.pdf
The sequential identifiers have the advantage of being guaranteed to
be unique (until we overflow a 64-bit unsigned integer), and also take
up half as much space in the "notmuch search" output (16 columns
rather than 32).
This change also has the side effect of fixing a bug where notmuch
could block on /dev/random at startup (waiting for some entropy to
appear). This bug was hit hard by the test suite, (which could easily
exhaust the available entropy on common systems---resulting in large
delays of the test suite).
If we had external users of this filter then they might expect some of
these macros to exist. But since this is just internal, that's just
unneeded noise.
With modern MIME attachments, we're already avoiding indexing the
attachments. But for old-school uuencoded data in the mail, we have
been directly indexing the encoded data as terms, (which is not useful
at all---nobody will ever ytry to search based on the seemingly random
uuencoded data).
Additionally, indexing a modestly large uuencoded file seems to make
Xapian go insane, (consuming *lots* of memory).
We fix both problems by detecting uuencoded content and not performing
any indexing of it.
Previously we were printing a number of messages upgraded so far. The
original motivation for this was to accurately reflect the fact that
there are two passes, (so each message is processed twice and it's not
accurate to represent with a single count). But as it turns out, the
second pass takes zero time (relatively speaking) so we're still not
accounting for it.
If nothing else, the percentage-based reporting makes for a cleaner
API for the progress_notify function.
The WDF is the "within-document frequency" value for a particular
term. It's intended to provide an indication of how frequent a term is
within a document, (for use in computing relevance). Xapian's term
generator already computes WDF values when we use that, (which we do
for indexing all mail content).
We don't use the term generator when adding single terms for things
that don't actually appear in the mail document, (such as tags, the
filename, etc.). In this case, the WDF value for these terms doesn't
matter much.
But Xapian's flint backend can be more efficient with changes to terms
that don't affect the document "length". So there's a performance
advantage for manipulating tags (with the flint backend) if the WDF of
these terms is 0.
All notmuch searches currently sort by value (either date or message
ID) so it's just wasted effort for Xapian to compute relevance values
for each result. We now explicitly tell Xapian that we're uninterested
in the relevance values.
The first phase copies data from the old format to the new format
without deleting anything. This allows an old notmuch to still use the
database if the upgrade process gets interrupted. The second phase
performs the deletion (after updating the database version number). If
the second phase is interrupted, there will be some unused data in the
database, but it shouldn't cause any actual harm.