This function was recently modified, (to include a metadata lookup for
a message's thread ID before looking for parent/child thread IDs), but
the documentation wasn't updated. Fix that.
There are two primary cases in this function, (the message exists in
the database or it does not). Previously the code for these two cases
was split and intermingled with goto-spaghetti connections.
This allows us to thread messages even when we receive them out of
order, or never receive the root.
The thread ids for messages that aren't present but are referred to are
stored as metadata in the database and then retrieved if we ever get
that message.
When determining the thread id for a message we also check for this
metadata so that we can thread descendants of a message together before
we receive it.
Edited by Carl Worth <cworth@cworth.org>: Split this portion of the
commit from the earlier-applied portion adding test cases.
This seems like a generally useful thing to support, (but the previous
support through an empty string was not convenient for some users,
(such as the command-line client).
fix notmuch_message_file_get_header to always return the first instance
of the header you are looking for
Signed-off-by: Dirk Hohndel <hohndel@infradead.org>
With the original quiet function, there's an actual purpose (hiding
excessively long compiler command lines so that warnings and errors
from the compiler can be seen).
But with things like quiet_symlink there's nothing quieter. In fact
"SYMLINK" is longer than "ln -sf". So all this is doing is hiding the
actual command from the user for no real benefit.
The only actual reason we implemented the quiet_* functions was to be
able to neatly right-align the command name and left-align the arguments.
Let's give up on that, and just left-align everything, simplifying the
Makefiles considerably. Now, the only instances of a captialized command
name in the output is if there's some actually shortening of the command
itself.
We add a magic line to the beginning of each Makefile.local file to
help the editor know that it should use makefile mode for editing the
file, (even though the filename isn't exactly "Makefile").
Edited-by: Carl Worth <cworth@cworth.org>: Expand treatment from
emacs/Makefile.local to each instance of Makefile.local.
The idea here is to allow a new user of notmuch to be able to run
notmuch immediately after compiling, (without having to install
the shared library first). This also ensures that the test suite
tests the locally compiled library, and not whatever installled
version of the library the dynamic linker happens to find.
When I wanted to create a debian package from the current master, make
install failed because of non-existent include directory. This patch
fixes this minor issue.
We had a fairly ugly violation of modularity with the top-level
Makefile.local isntalling everything, (even when the build commands
for the library were down in lib/Makefile.local).
For the case of adding a file that already exist, (with the same
filename). In this case, nothing will happen to the database, but
that wasn't clear before.
We were previously maintaining two lists of the child Makefile
fragments---one for the includes and another for the dependencies. So,
of course, they drifted and the dependency list wasn't up to date.
We fix this by adding a single subdirs variable, and then using GNU
Makefile substitution to generate both the include and the dependency
lists.
Some side effect of this change caused the '=' assignment of the dir
variable to not work anymore. I'm not sure why that is, but using ':='
makes sense here and fixes the problem.
Commit cd467caf renamed notmuch_query_search to notmuch_query_search_messages.
Commit 1ba3d46f created notmuch_query_search_threads. We better keep the docs
of notmuch_query_create consistent with those changes.
Signed-off-by: Fernando Carrijo <fcarrijo@yahoo.com.br>
Edited-by: Carl Worth to explicitly list the full name of each
function being referenced.
We rename 'has_more' to 'valid' so that it can function whether
iterating in a forward or reverse direction. We also rename
'advance' to 'move_to_next' to setup parallel naming with
the proposed functions 'move_to_first', 'move_to_last', and
'move_to_previous'.
Thanks to Michal Sojka <sojkam1@fel.cvut.cz> for pointing out the
correct fix, which I verified in the freely-available WG14/N1124 draft
(from the C99 working group) which is available here:
http://www.open-std.org/JTC1/SC22/wg14/www/docs/n1124.pdf
The sequential identifiers have the advantage of being guaranteed to
be unique (until we overflow a 64-bit unsigned integer), and also take
up half as much space in the "notmuch search" output (16 columns
rather than 32).
This change also has the side effect of fixing a bug where notmuch
could block on /dev/random at startup (waiting for some entropy to
appear). This bug was hit hard by the test suite, (which could easily
exhaust the available entropy on common systems---resulting in large
delays of the test suite).
If we had external users of this filter then they might expect some of
these macros to exist. But since this is just internal, that's just
unneeded noise.
With modern MIME attachments, we're already avoiding indexing the
attachments. But for old-school uuencoded data in the mail, we have
been directly indexing the encoded data as terms, (which is not useful
at all---nobody will ever ytry to search based on the seemingly random
uuencoded data).
Additionally, indexing a modestly large uuencoded file seems to make
Xapian go insane, (consuming *lots* of memory).
We fix both problems by detecting uuencoded content and not performing
any indexing of it.
Previously we were printing a number of messages upgraded so far. The
original motivation for this was to accurately reflect the fact that
there are two passes, (so each message is processed twice and it's not
accurate to represent with a single count). But as it turns out, the
second pass takes zero time (relatively speaking) so we're still not
accounting for it.
If nothing else, the percentage-based reporting makes for a cleaner
API for the progress_notify function.
The WDF is the "within-document frequency" value for a particular
term. It's intended to provide an indication of how frequent a term is
within a document, (for use in computing relevance). Xapian's term
generator already computes WDF values when we use that, (which we do
for indexing all mail content).
We don't use the term generator when adding single terms for things
that don't actually appear in the mail document, (such as tags, the
filename, etc.). In this case, the WDF value for these terms doesn't
matter much.
But Xapian's flint backend can be more efficient with changes to terms
that don't affect the document "length". So there's a performance
advantage for manipulating tags (with the flint backend) if the WDF of
these terms is 0.
All notmuch searches currently sort by value (either date or message
ID) so it's just wasted effort for Xapian to compute relevance values
for each result. We now explicitly tell Xapian that we're uninterested
in the relevance values.
The first phase copies data from the old format to the new format
without deleting anything. This allows an old notmuch to still use the
database if the upgrade process gets interrupted. The second phase
performs the deletion (after updating the database version number). If
the second phase is interrupted, there will be some unused data in the
database, but it shouldn't cause any actual harm.
The recent support for renames in the database is our first time
(since notmuch has had more than a single user) that we have a
database format change. To support smooth upgrades we now encode a
database format version number in the Xapian metadata.
Going forward notmuch will emit a warning if used to read from a
database with a newer version than it natively supports, and will
refuse to write to a database with a newer version.
The library also provides functions to query the database format
version:
notmuch_database_get_version
to ask if notmuch wants a newer version than that:
notmuch_database_needs_upgrade
and a function to actually perform that upgrade:
notmuch_database_upgrade
Previously we had NOTMUCH_DATABASE_MODE_READ_ONLY but
NOTMUCH_STATUS_READONLY_DATABASE which was ugly and confusing. Rename
the latter to NOTMUCH_STATUS_READ_ONLY_DATABASE for consistency.
Previously, many checks were deep in the library just before a cast
operation. These have now been replaced with internal errors and new
checks have instead been added at the beginning of all top-levelentry
points requiring a read-write database.
The new checks now also use a single function for checking and
printing the error message. This will give us a convenient location to
extend the check, (such as based on database version as well).
The original wording made it sound like this function was just doing
some string manipulation. But this function actually creates new
directory documents as a side effect. So make that explicit in its
documentation.
When a notmuch database is upgraded to the new database format, (to
support file rename and deletion), any message documents corresponding
to deleted files will not currently be upgraded. This means that a
search matching these documents will find no filenames in the expected
place.
Go ahead and return the filename as originally stored, (rather than
aborting with an internal error), in this case.
Similar to the return value of notmuch_database_add_message, we now
enhance the return value of notmuch_database_remove_message to
indicate whether the message document was entirely removed (SUCCESS)
or whether only this filename was removed and the document exists
under other filenamed (DUPLICATE_MESSAGE_ID).
Previously, adding a filename with the same message ID as an existing
message would do nothing. But we recently fixed this to instead add
the new filename to the existing message document. So update the
documentation to match now.
In the presentation we often omit citations and signatures, but this
is not content that should be omitted from the index, (especially
when the citation detection is wrong---see cases where a line
beginning with "From" is corrupted to ">From" by mail processing
tools).