When running "notmuch new --verbose", ANSI escapes are used. This may not be
desirable when the output of the command is *not* being sent to a terminal
(e.g. when piping output into another command). In that case each file
processed is printed in a new line and ANSI escapes are not used at all.
For very large mail boxes, it is desirable to know which files are being
processed e.g. when a crash occurs to know which one was the cause. Also,
it may be interesting to have a better idea of how the operation is
progressing when processing mailboxes with big messages.
This patch adds support for printing messages as they are processed by
"notmuch new":
* The "new" command now supports a "--verbose" flag.
* When running in verbose mode, the file path of the message about to be
processed is printed in the following format:
current/total: /path/to/message/file
Where "current" is the number of messages processed so far and "total" is
the total count of files to be processed.
The status line is erased using an ANSI sequence "\033[K" (erase current
line from the cursor to the end of line) each time it is refreshed. This
should not pose a problem because nearly every terminal supports it.
* The signal handler for SIGALRM and the timer are not enabled when running
in verbose mode, because we are already printing progress with each file,
periodical reports are not neccessary.
Check that the stdout is connected to an interactive terminal with
isatty() before installing the periodic timer to print progress reports.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
I felt sorry for Carl trying to step through an exception from xapian
and suffering from the SIGALARMs..
We can detect if the user launched notmuch under a debugger by either
checking our cmdline for the presence of the gdb string or querying if
valgrind is controlling our process. For the latter we need to add a
compile time check for the valgrind development library, and so add the
initial support to build Makefile.config from configure.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Carl Worth <cworth@cworth.org>
[ickle: And do not install the timer when under the debugger]
We only rarely need to actually open the database for writing, but we
always create a Xapian::WritableDatabase. This has the effect of
preventing searches and like whilst updating the index.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Carl Worth <cworth@cworth.org>
This reverts commit 9794f19017.
The feature makes a lot of sense for the initial import, but it's not
as clear whether it makes sense for ongoing "notmuch new" runs. We
might need to make this opt-in by configuration.
This patch adds maildir directory name as the tag name for
messages. This helps in adding tags using filtering already
provided by procmail.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Carl says: This has similar performance benefits as the previous
patch, and I fixed similar style issues here as well, (including
missing more of a commit message than the one-line summary).
This gives a rather decent reduction in number of seeks required when
reading a Maildir that isn't in pagecache.
Most filesystems give some locality on disk based on inode numbers.
In ext[234] this is the inode tables, in XFS groups of sequential inode
numbers are together on disk and the most significant bits indicate
allocation group (i.e inode 1,000,000 is always after inode 1,000).
With this patch, we read in the whole directory, sort by inode number
before stat()ing the contents.
Ideally, directory is sequential and then we make one scan through the
file system stat()ing.
Since the universe is not ideal, we'll probably seek during reading the
directory and a fair bit while reading the inodes themselves.
However... with readahead, and stat()ing in inode order, we should be
in the best place possible to hit the cache.
In a (not very good) benchmark of "how long does it take to find the first
15,000 messages in my Maildir after 'echo 3 > /proc/sys/vm/drop_caches'",
this patch consistently cut at least 8 seconds off the scan time.
Without patch: 50 seconds
With patch: 38-42 seconds.
(I did this in a previous maildir reading project and saw large improvements too)
By installing a signal handler for SIGINT we can ensure that no work
that is already complete will be lost if the user interrupts a
"notmuch new" run with Control-C.
I recently discovered that mb2md has the annoying bug of creating
files with mtime of 0, and notmuch then promptly ignored them,
(thinking that its timestamps initialized to 0 were just as new).
We fix notmuch to not exclude messages based on a database timestamp
of 0.
Leaving this variable uninitialized caused notmuch to display a random
number while counting files for the new database.
Signed-off-by: Keith Packard <keithp@keithp.com>
Now that the client sources are alone here in their own directory,
(with all the library sources down inside the lib directory), we can
break the client up into multiple files without mixing the files up.
The hope is that these smaller files will be easier to manage and
maintain.