This replaces the guts of the filename list and tag list, making those
interfaces simple iterators over the generic string list. The
directory, message filename, and tags-related code now build generic
string lists and then wraps them in specific iterators. The real wins
come in later patches, when we use these for even more generic
functionality.
As a nice side-effect, this also eliminates the annoying dependency on
GList in the tag list.
This performs a single pass over a message's term list to fetch the
thread ID, message ID, and reply-to, rather than requiring a pass for
each. Xapian decompresses the term list anew for each iteration, so
this reduces the amount of time spent decompressing message metadata.
This reduces my inbox search from 3.102 seconds to 2.555 seconds (1.2X
faster).
With the previous commit, unexpected output before or between search results
would be displayed. However, trailing junk from the "notmuch search" output
would still be silently swallowed.
The most common case for an error message from "notmuch search" would be
an invalid command-line, and in that case, there would be no search results
and the trailing error message would get swallowed.
We fix the process sentinel to check for leftover data and add it to the
final buffer. We also add a test case to ensure this works.
This fixes the recently-added emacs-large-search-buffer test. This is
as simple as saving any trailing input and then pre-prepending it on
the next call.
MAny thanks to Thomas Schwinge <thomas@schwinge.name> for tracking
down this problem and contributing a preliminary version of this fix.
Rather than silently swallowing unexpected output, the emacs interface will now
display it. This will allow error messages to actually arrive at the emacs
interface (though not in an especially pretty way). This also allows for easier
investigation of the inadvertent swallowing of search results that span page
boundaries (as demonstrated by the recent added emacs-large-search-buffer test).
The page-boundary bug has been present since a commit from 2009-11-24:
93af7b5745
Many thanks to Thomas Schwinge for tracking that bug down and
contributing the test for it.
The new name is more descriptive of the bug being tested. Also, the test
is rewritten slightly so that it's much more plain to see how the bug
manifests itself, (that messages are droped from the emacs result at
regular intervals). Primarily, this is by collapsing the large blobs
used to inflate the message subjects.
The most recent commit optimized the implementation of this
function. This commit simply updates the relevant comments to match
the new implementation.
The db_files and db_subdirs are unnecessary for unchanged directories.
maildir with 10000 e-mails:
old version:
$ time ./notmuch new
No new mail.
real 0m0.053s
user 0m0.028s
sys 0m0.026s
new version:
$ time ./notmuch new
No new mail.
real 0m0.032s
user 0m0.009s
sys 0m0.023s
Signed-off-by: Karel Zak <kzak@redhat.com>
Reviewed-by: Austin Clements <amdragon@mit.edu>
Looks good (faster than, but provably equivalent to the original code!
notmuch_directory_get_child_* are side-effect free,
db_files/db_subdirs aren't used between where they were set in the old
code and where they are set in the new code, and db_files/db_subdirs
are initialized to NULL when declared).
Another timing data point:
Old code: ./notmuch new 0.77s user 0.28s system 99% cpu 1.051 total
New code: ./notmuch new 0.09s user 0.27s system 98% cpu 0.368 total
This supports the case of a user running "configure --prefix=/foo" then later
updating the soruce (including the configure script) and re-running make.
In this case, the make invocation will re-run configure. Before this change,
this run of configure would lose the user's carefully chosen prefix. This
is now fixed so that configrue is re-run with the user's options.
Such as:
mkdir build
cd build
../configure
make
This is implemented by having the configure script set a srcdir
variable in Makefile.config, and then sprinkling $(srcdir) into
various make rules. We also use vpath directives to convince GNU make
to find the source files from the original source directory.
In the original json code, search matching nothing would return a
valid, empty json array (that is, "[]"). I broke this in commit
6dcb7592e3 when adding support for
--output=threads|messages|tags. This time, while fixing the bug also
add a test to the test suite to help avoid future regressions.
Now that we understand the bug here, we rename this test to
search-insufficient-from-quoting to clarify the bug being exercised,
(which occurs when the From: line contains an unquoted '.' character).
We also mark these tests as expected failures until the bug gets fixed.
Don't require the caller of _notmuch_doc_id_set_init to pass in a
correct bound; instead compute it from the array. This simplifies the
caller and makes this interface easier to use correctly.
Remove the repeated "sizeof (doc_ids->bitmap[0])" that bothered cworth
by instead defining macros to compute the word and bit offset of a
given bit in the doc ID set bitmap.
The call-process to notmuch in notmuch-query.el was previously sending
stderr into the output buffer. This means that if there is any stderr
the JSON parsing breaks. Unfortunately call-process does not support
sending stderr to a separate buffer or to the minibuffer [0], but it
does support sending it to /dev/null. So we do that here instead.
[0] a bug was filed against emacs (#7842)
Without this patch, it might happen that the remaining time or processing
rate were calculated just after start where nothing was processed yet.
This resulted into division by a very small number (or zero) and the
printed information was of little value.
Instead of printing nonsenses we print only that the operation is in
progress. The estimates will be printed later, after there is enough data.
This was originally intended to help support filenames with spaces in
them, but this actually breaks things when someone sets a command with
a space in it, (such as CC="ccache cc").
Instead, we now only set a custom IFS when acting on the
newline-separated list of files from /sbin/ldconfig.
Previously, the user didn't know whether the pipe command succeeded or
not. It was only possible to find it out by manually inspecting
the work done (or not done) by the command or by manually switching to
*notmuch-pipe* buffer and determine it from command output. For this
the user had to first find the text corresponding to the last run of
pipe command as the buffer accumulated the output from all pipe commands.
This patch changes the following. The *notmuch-pipe* buffer is erased
before every pipe command so it contains only the output from the last
command. Additionally, when the command failed, the *notmuch-pipe* buffer
is shown and an error message is displayed.
with the output of pipe command.
Currently, there are two places in the test framework that contain very
long list on a single line. Whenever a test is added (or changed) in
several branches and these branches are merged, it results in conflict
which is hard to resolve because one has to go through the whole long
line to find where the conflict is.
This patch splits these long lists to several lines so that the
conflicts are easier to resolve.
add --bashcompletiondir and --zshcompletiondir (like --emacslispdir) to choose
installation dir for bash/zsh completion files
Make some features optional:
--without-emacs / --with-emacs=no do not install lisp file
--without-bash-completion / --with-bash-completion=no do not install bash
files
--without-zsh-completion / --with-zsh-completion=no do not install zsh files
By default, everything is enabled. You can reenable something with
--with-feature=yes
notmuch new reports progress only during the "first" phase when the
files on disk are traversed and indexed. After this phase, other
operations like rename detection and maildir flags synchronization are
performed, but the user is not informed about them. Since these
operations can take significant time, we want to inform the user about
them.
This patch enhances the progress reporting facility that was already
present. The timer that triggers reporting is not stopped after the
first phase but continues to run until all operations are finished. The
rename detection and maildir flag synchronization are enhanced to report
their progress.
If there are several tags applied to the new messages, it is beneficial
to store them to the database at one, because it saves some time,
especially when the notmuch new is run for the first time.
This patch decreased the time for initial import from 1h 35m to 1h 14m.
This is a simplified version of a patch originally by Michal Sojka
<sojkam1@fel.cvut.cz> which is designed to have the same performance
benefits. Michal said the following:
When notmuch new is run for the first time, it is not necessary to
defer maildir flags synchronization to later because we already know
that no files will be removed.
Performing the maildinr flag synchronization immediately after the
message is added to the database has the advantage that the message
is likely hot in the disk cache so the synchronization is faster.
Additionally, we also save one database query for each message,
which must be performed when the operation is deferred.
Without this patch, the first notmuch new of 200k messages (3 GB)
took 1h and 46m out of which 20m was maildir flags
synchronization. With this patch, the whole operation took only 1h
and 36m.
Unlike Michal's patch, this version does the deferral for any new
message, rather than doing it only on the first run of "notmuch new".
Here's a bitty patch to the vim plugin; it now calculates the primary email
of the user based on a call to notmuch config. There's still a lot of work
that needs to get done on notmuch.vim, e.g., the ability to have multiple
emails/accounts.
Currently, whenever we call index_terms multiple times for a single
field, the term generator is being reset to position 0 each time. This
means that with text such as:
To: a@b.c, x@y.z
one can get a bogus match by searching for:
To: a@y.c
Thanks to Mark Anderson for reporting the bug, (and providing a nice,
minimal test case that inspired what is used here).
With talloc, we were already freeing all memory by the time we exited
the loop, but that didn't help with excess use of memory inside the
loop, (which was mostly from tallocing some objects with the incorrect
parent).
Thanks to Andrew Tridgell for sitting next to me and teaching me to
use talloc_report_full to find these leaks.
A new "folder:" prefix in the query string can now be used to match
the directories in which mail files are stored.
The addition of this feature causes the recently added
search-by-folder tests to now pass.