We need a way to pass parameters to the indexing functionality on the
first index, not just on reindexing. The obvious place is in
notmuch_database_add_message. But since modifying the argument list
would break both API and ABI, we needed a new name.
I considered notmuch_database_add_message_with_params(), but the
functionality we're talking about doesn't always add a message. It
tries to index a specific file, possibly adding a message, but
possibly doing other things, like adding terms to an existing message,
or failing to deal with message objects entirely (e.g. because the
file didn't contain a message).
So i chose the function name notmuch_database_index_file.
I confess i'm a little concerned about confusing future notmuch
developers with the new name, since we already have a private
_notmuch_message_index_file function, and the two do rather different
things. But i think the added clarity for people linking against the
future libnotmuch and the capacity for using index parameters makes
this a worthwhile tradeoff. (that said, if anyone has another name
that they strongly prefer, i'd be happy to go with it)
This changeset also adjusts the tests so that we test whether the new,
preferred function returns bad values (since the deprecated function
just calls the new one).
We can keep the deprecated n_d_add_message function around as long as
we like, but at the next place where we're forced to break API or ABI
we can probably choose to drop the name relatively safely.
NOTE: there is probably more cleanup to do in the ruby and go bindings
to complete the deprecation directly. I don't know those languages
well enough to attempt a fix; i don't know how to test them; and i
don't know the culture around those languages about API additions or
deprecations.
Recently a user reported a crash in notmuch new, but because of
missing error reporting, all they could say was "A Xapian exception
occured". This commit adds the extra information available about
the error message in the exception.
Split file ignores in count_files to fixed and user configured in
order to not have to call _entry_in_ignore_list twice when debugging
is enabled. Minor detail.
If some software other than notmuch new renames or removes files
during the notmuch new scan (specifically after scandir but before
indexing the file), keep going instead of bailing out. Failing to
index the file is just a race condition between notmuch and the other
software; the rename could happen after the notmuch new scan
anyway. It's not fatal, and we'll catch the renamed files on the next
scan.
Add a new exit code for when files vanished, so the caller has a
chance to detect the race and re-run notmuch new to recover.
Reported by Paul Wise <pabs@debian.org> at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=843127
Many of the external links found in the notmuch source can be resolved
using https instead of http. This changeset addresses as many as i
could find, without touching the e-mail corpus or expected outputs
found in tests.
The code in add_file seems to assume that NOTMUCH_STATUS_FILE_ERROR is
never returned from add_message. This turns out to be false (although it
seems to only happen in certain fairly rare race conditions).
There was a problem with the directory documents being left behind when
the filesystem directory was removed. This was worked around in [1].
However, that ignored the fact that the directory documents are also
still listed by notmuch_directory_get_child_directories() leading to
confusing results when running notmuch new. The directory documents are
found and queued for removal over and over again.
Fix the problem for real by removing the directory documents. This fixes
the tests flagged as broken in [2].
The (non-deterministic) hack test from [3] also still passes with this
change.
[1] commit acd66cdec0
[2] commit ed9ceda623
[3] id:1441445731-4362-1-git-send-email-jani@nikula.org
The side effect is that all of add_files_state will be initialized to
zero, removing any lingering doubt that some of it might not be
initialized. It's not a small struct, and the initialization is
scattered around a bit, so this makes the code more readable.
The library does not have a function to remove a directory document
for a path. Usually this doesn't matter except for a slight waste of
space. However, if the same directory gets added to the filesystem
again, the old directory document is found with the old mtime. Reset
the directory mtime on removal to avoid problems.
The corner case that can hit this problem is renaming directories back
and forth. Renaming does not change the mtime of the directory in the
filesystem, and thus the old db directory document mtime may match the
fs mtime of the directory.
The long term fix might be to add a library function to remove a
directory document, however this is a much simpler and faster fix for
the time being.
The function notmuch_exit_if_unmatched_db_uuid is split from
notmuch_process_shared_options because it needs an open notmuch
database.
There are two exceptional cases in uuid handling.
1) notmuch config and notmuch setup don't currently open the database,
so it doesn't make sense to check the UUID.
2) notmuch compact opens the database inside the library, so we either
need to open the database just to check uuid, or change the API.
Try to narrow down what part of the code adds files and directories to
the queue(s) to be deleted.
Update one test. The output is slightly confusing, but I believe it is
correct, resulting from a directory being discovered but containing only ignored files.
Unfortunately it seems trickier to support --config globally
The non-trivial changes are in notmuch.c; most of the other changes
consists of blindly inserting two lines into every subcommand.
The compatibility wrapper ensures that clients calling
notmuch_database_open will receive consistent output for now.
The changes to notmuch-{new,search} and test/symbol-test are just to
make the test suite pass.
The use of IGNORE_RESULT is justified by two things. 1) I don't know
what else to do. 2) asprintf guarantees the output string is NULL if
an error occurs, so at least we are not passing garbage back.
The version number has always been pretty meaningless to the user and
it's about to become even more meaningless with the introduction of
"features". Hopefully, the database will remain on version 3 for some
time to come; however, the introduction of new features over time in
version 3 will necessitate upgrades within version 3. It would be
confusing if we always tell the user they've been "upgraded to version
3". If the user wants to know what's new, they should read the news.
All we do here is calculate the backup filename, and call the existing
dump routine.
Also take the opportunity to add a message about being safe to
interrupt.
Support for dirent.d_type is OS-specific. Previously, we used
_DIRENT_HAVE_D_TYPE to detect support for this, but this is apparently
a glic-ism (FreeBSD, for example, supports d_type, but does not define
this). Since there's no cross-platform way to detect support for
dirent.d_type, detect it using a test compile at configure time.
The notmuch_new_command() function has grown huge, chop it up a
bit. This should also be helpful when adding a --quiet option to
notmuch new. No functional changes.
Apart from the status codes for format mismatches, the non-zero exit
status codes have been arbitrary. Make the cli consistently return
either EXIT_SUCCESS or EXIT_FAILURE.
This allows specifying config file as a top level argument to notmuch,
and generally makes it possible to override config file options in
main(), without having to touch the subcommands.
If the config file does not exist, one will be created for the notmuch
main command and setup and help subcommands. Help is special in this
regard; the config is created just to avoid errors about missing
config, but it will not be saved.
This also makes notmuch config the talloc context for subcommands.
We now have a notmuch_config_is_new() function to query whether a
config was created or not. Change the notmuch_config_open() is_new
parameter into boolean create_new to determine whether the function
should create a new config if one doesn't exist. This reduces the
complexity of the API.
Use the notmuch argument parser to handle arguments in "notmuch
new". As a side effect, this fixes broken STRNCMP_LITERAL usage that
accepts, for example, --verbosefoo for --verbose.
We now test for user ignore patterns before attempting to determine if
a directory entry is itself a directory. As a result, we no longer
abort for broken symlinks if the user has explicitly ignored them.
This fixes the test added in the previous patch. It also slightly
changes the debug output checked by another test of ignores.
Since starting at the top of a directory tree and recursing within
that tree are now identical operations, there's no need for both
add_files and add_files_recursive. This eliminates add_files (which
did nothing more than call add_files_recursive after the previous
patch) and renames add_files_recursive to add_files.
Previously, add_files_recursive could have been called on a symlink to
a non-directory. Hence, calling it on a non-directory was not an
error, so a separate function, add_files, existed to fail loudly in
situations where the path had to be a directory.
With the new stat-ing logic, add_files_recursive is always called on
directories, so the separation of this logic is no longer necessary.
Hence, this patch moves the strict error checking previously done by
add_files into add_files_recursive.
This moves our logic to get a file's type into one function. This has
several benefits: we can support OSes and file systems that do not
provide dirent.d_type or always return DT_UNKNOWN, complex
symlink-handling logic has been replaced by a simple stat fall-through
in one place, and the error message for un-stat-able file is more
accurate (previously, the error always mentioned directories, even
though a broken symlink is not a directory).
Previously, notmuch_database_get_directory did not indicate whether or
not the returned directory object was newly created, which required a
workaround to distinguish newly created directory objects with no
child messages from directory objects that had no mtime set but did
have child messages. Now that notmuch_database_get_directory
distinguishes whether or not the directory object exists in the
database, this workaround is no longer necessary.
Previously, notmuch_database_get_directory had no way to indicate how
it had failed. This changes its prototype to return a status code and
set an out-argument to the retrieved directory, like similar functions
in the library API. This does *not* change its currently broken
behavior of creating directory objects when they don't exist, but it
does document it and paves the way for fixing this. Also, it can now
check for a read-only database and return
NOTMUCH_STATUS_READ_ONLY_DATABASE instead of crashing.
In the interest of atomicity, this also updates calls from the CLI so
that notmuch still compiles.
This is the notmuch_database_create equivalent of the previous change.
In this case, there were places where errors were not being propagated
correctly in notmuch_database_create or in calls to it. These have
been fixed, using the new status value.
It has been a long-standing issue that notmuch_database_open doesn't
return any indication of why it failed. This patch changes its
prototype to return a notmuch_status_t and set an out-argument to the
database itself, like other functions that return both a status and an
object.
In the interest of atomicity, this also updates every use in the CLI
so that notmuch still compiles. Since this patch does not update the
bindings, the Python bindings test fails.
Previously, if we failed to find the message by filename in
remove_filename, we would return immediately from the function without
ending its atomic block. Now this code follows the usual goto DONE
idiom to perform cleanup.
This was going to stdout. I removed the newline at the beginning of
printing the fatal error message because it wouldn't make sense if you
were only looking at the stderr stream (e.g., you had redirected
stdout to /dev/null).