This gives a rather decent reduction in the number of seeks required when
reading a Maildir that isn't in pagecache.
Most filesystems give some locality on disk based on inode numbers.
In ext[234] this is the inode tables; in XFS, groups of sequential inode
numbers are together on disk and the most significant bits indicate the
allocation group (i.e. inode 1,000,000 is always after inode 1,000).
With this patch, we read in the whole directory and sort by inode number
before stat()ing the contents.
Ideally, the directory is sequential and we then make one scan through the
file system stat()ing.
Since the universe is not ideal, we'll probably seek while reading the
directory and a fair bit while reading the inodes themselves.
However... with readahead, and stat()ing in inode order, we should be
in the best place possible to hit the cache.
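A minimal sketch of the read-sort-stat idea described above, not the
actual patch. The names stat_in_inode_order, struct entry and cmp_ino
are made up for illustration, and error handling is kept to the bare
minimum:

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* One directory entry: the name and the inode number readdir() reported.
 * (Hypothetical type, for illustration only.) */
struct entry {
    ino_t ino;
    char *name;
};

/* qsort() comparator: ascending inode number. */
int
cmp_ino (const void *a, const void *b)
{
    ino_t ia = ((const struct entry *) a)->ino;
    ino_t ib = ((const struct entry *) b)->ino;

    return (ia > ib) - (ia < ib);
}

/* Read every entry of 'path', sort by inode number, then stat() in that
 * order. Returns the number of entries successfully stat()ed, or -1 if
 * the directory cannot be opened. */
long
stat_in_inode_order (const char *path)
{
    DIR *dir = opendir (path);
    struct dirent *d;
    struct entry *entries = NULL;
    size_t count = 0, alloc = 0, i;
    long statted = 0;

    if (dir == NULL)
        return -1;

    /* Pass 1: slurp the whole directory without touching any inode. */
    while ((d = readdir (dir)) != NULL) {
        size_t len = strlen (d->d_name) + 1;

        if (strcmp (d->d_name, ".") == 0 || strcmp (d->d_name, "..") == 0)
            continue;
        if (count == alloc) {
            size_t new_alloc = alloc ? alloc * 2 : 64;
            struct entry *tmp = realloc (entries, new_alloc * sizeof (*tmp));

            if (tmp == NULL)
                break;
            entries = tmp;
            alloc = new_alloc;
        }
        entries[count].name = malloc (len);
        if (entries[count].name == NULL)
            break;
        memcpy (entries[count].name, d->d_name, len);
        entries[count].ino = d->d_ino;
        count++;
    }
    closedir (dir);

    if (count > 1)
        qsort (entries, count, sizeof (*entries), cmp_ino);

    /* Pass 2: stat() in inode order, so reads tend to move forward on disk. */
    for (i = 0; i < count; i++) {
        char full[4096];
        struct stat st;

        snprintf (full, sizeof (full), "%s/%s", path, entries[i].name);
        if (stat (full, &st) == 0)
            statted++;
        free (entries[i].name);
    }
    free (entries);

    return statted;
}
```

Calling stat_in_inode_order ("/path/to/Maildir/cur") would walk a single
directory this way. Whether it helps depends on the filesystem laying
out inodes roughly in numeric order, as discussed above.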
In a (not very good) benchmark of "how long does it take to find the first
15,000 messages in my Maildir after 'echo 3 > /proc/sys/vm/drop_caches'",
this patch consistently cut at least 8 seconds off the scan time.
Without patch: 50 seconds
With patch: 38-42 seconds.
(I did this in a previous maildir reading project and saw large improvements too)
Previously, Ubuntu 9.10, gcc 4.4.1 was getting:
/usr/bin/ld: lib/notmuch.a(database.o): in function global
constructors keyed to BOOLEAN_PREFIX_INTERNAL:database.cc(.text+0x3a):
error: undefined reference to 'std::ios_base::Init::Init()'
That is, give a nice error message and exit if no search terms are
provided. Thanks to Priit Laes <plaes@plaes.org> for reporting the
error and providing an early version of the fix.
Do not use the -C command-line option of install. Older versions, commonly
found in distributions like Debian, do not seem to support it, so running
make install on such systems (tested on Debian Lenny) fails.
Signed-off-by: Jan Janak <jan@ryngle.com>
This was recently introduced in commit:
64c03ae97f
which was adding extra checks to avoid adding a self-referencing
message.
How many times am I going to fix a dumb regression like this and say
"we really need a test suite" before I actually sit down and write the
test suite?
We take the recently created text from the notmuch manual page and
update the "notmuch help" command to use similar text. In particular,
we add a new "notmuch help search-terms" for documenting the search
syntax that is common to several commands.
I set out merely to add documentation for the recently-added options
for "notmuch search" (--first, --max-threads, and --sort), but ended
up revamping a lot. A significant change is a new SEARCH SYNTAX
section separate from "notmuch search" that is referred to in the
documentation of search, show, reply, and tag.
Also many sections were updated to reflect recent changes, (such as
the dropping of the NOTMUCH_BASE environment variable, the addition of
the .notmuch-config file, etc.)
This is what most people want for a _search_ command. It's often
different for actually reading mail in an inbox, (where it makes more
sense to have results displayed in chronological order), but in such a
case, the user is likely using an interface that can simply pass the
--sort=oldest-first option to "notmuch search".
Here we also change the sort enum from NOTMUCH_SORT_DATE and
NOTMUCH_SORT_DATE_REVERSE to NOTMUCH_SORT_OLDEST_FIRST and
NOTMUCH_SORT_NEWEST_FIRST. Similarly we replace the --reverse option
to "notmuch search" with two options: --sort=oldest-first and
--sort=newest-first.
Finally, these changes are all tracked in the emacs interface, (which
has no change in its behavior).
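As a rough illustration of the renamed enum and the two --sort values,
here is a sketch of how the option might be parsed. Only the enum value
names and the option strings come from this change; the typedef name
sort_order_t and the helper parse_sort_option are hypothetical:

```c
#include <string.h>

/* Value names as introduced by this change; the typedef name is made up. */
typedef enum {
    NOTMUCH_SORT_OLDEST_FIRST,
    NOTMUCH_SORT_NEWEST_FIRST
} sort_order_t;

/* Hypothetical helper: if 'arg' is "--sort=oldest-first" or
 * "--sort=newest-first", store the matching value in *sort and return 1.
 * Return 0 for anything else and leave *sort untouched. */
int
parse_sort_option (const char *arg, sort_order_t *sort)
{
    const char *prefix = "--sort=";
    size_t prefix_len = strlen (prefix);
    const char *value;

    if (strncmp (arg, prefix, prefix_len) != 0)
        return 0;

    value = arg + prefix_len;
    if (strcmp (value, "oldest-first") == 0) {
        *sort = NOTMUCH_SORT_OLDEST_FIRST;
        return 1;
    }
    if (strcmp (value, "newest-first") == 0) {
        *sort = NOTMUCH_SORT_NEWEST_FIRST;
        return 1;
    }

    return 0;
}
```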
This is one of those cases where total time is not the metric of
interest. We increase the total time of the search, (by doing some
redundant work for the initial threads). But more significantly, we
give the user *some* results nearly instantaneously, (so that the user
might see the result of interest without ever even waiting for the
complete results to come in).
We had exposed this to the internal implementation for a short time,
(only while we had the silly code fetching In-Reply-To values from
message files instead of from the database). Make this private again
as it should be.
Maybe the lack of this documentation is why I forgot we were actually
storing this and wrote the ugly code to fetch In-Reply-To from message
files rather than from the database.
Which is more consistent with the XREFERENCE prefix used in the terms
in the database. Also remove some stale documentation describing the
removal of resolved references from the database (we no longer do
this).
Calling continue here worked only because we set a flag before the
continue, check the flag at the beginning of the loop, and *then*
break. It's much clearer to just break in the first place.
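To make the difference concrete, here is a toy loop in both shapes. This
is not the code that was changed, just an invented example (finding the
first negative value in an array) with the same control flow:

```c
#include <stddef.h>

/* Old shape: set a flag, continue, and rely on the check at the top of
 * the loop to break on the next iteration. */
size_t
first_negative_with_flag (const int *values, size_t n)
{
    size_t i, found = n;
    int done = 0;

    for (i = 0; i < n; i++) {
        if (done)
            break;
        if (values[i] < 0) {
            found = i;
            done = 1;
            continue;
        }
    }

    return found;
}

/* New shape: just break where the decision is made. */
size_t
first_negative_with_break (const int *values, size_t n)
{
    size_t i, found = n;

    for (i = 0; i < n; i++) {
        if (values[i] < 0) {
            found = i;
            break;
        }
    }

    return found;
}
```

Both return the index of the first negative value, or n if there is
none. The second says so directly.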
The message file header parsing code parses only enough of the file to
find the desired header fields, then it leaves the file open until the
next header parsing call or when the message is no longer in use. If a
large number of messages end up being active, this will quickly run
out of file descriptors.
Here, we add support to explicitly close the message file within a
message, (_notmuch_message_close) and call that from thread
construction code.
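A simplified sketch of the pattern, assuming a stand-in struct message
rather than the real notmuch types. The functions message_get_header and
message_close here are hypothetical, and unlike the real parser this
version rewinds on every call and ignores folded headers:

```c
#include <stdio.h>
#include <string.h>
#include <strings.h>

/* Hypothetical stand-in for a message object. */
struct message {
    const char *filename;
    FILE *file;                 /* NULL whenever the file is closed */
};

/* Release the file descriptor. Safe to call more than once. */
void
message_close (struct message *m)
{
    if (m->file != NULL) {
        fclose (m->file);
        m->file = NULL;
    }
}

/* Copy the value of header 'name' into 'out'. Opens the file lazily and
 * deliberately leaves it open afterwards, which is what makes an explicit
 * close necessary. Returns 1 if the header was found, 0 otherwise. */
int
message_get_header (struct message *m, const char *name,
                    char *out, size_t out_size)
{
    char line[1024];
    size_t name_len = strlen (name);

    if (out_size == 0)
        return 0;

    if (m->file == NULL) {
        m->file = fopen (m->filename, "r");
        if (m->file == NULL)
            return 0;
    } else {
        rewind (m->file);
    }

    while (fgets (line, sizeof (line), m->file) != NULL) {
        if (line[0] == '\n' || line[0] == '\r')
            break;              /* blank line ends the header block */
        if (strncasecmp (line, name, name_len) == 0 &&
            line[name_len] == ':') {
            const char *value = line + name_len + 1;

            while (*value == ' ' || *value == '\t')
                value++;
            snprintf (out, out_size, "%s", value);
            out[strcspn (out, "\r\n")] = '\0';
            return 1;
        }
    }

    return 0;
}
```

A caller that builds many of these objects at once would call
message_close as soon as it has the headers it needs, so that only a
handful of descriptors are open at any moment.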
Signed-off-by: Keith Packard <keithp@keithp.com>
Edited-by: Carl Worth <cworth@cworth.org>:
Many portions of Keith's original patch have since been solved other
ways, (such as the code that changed the handling of the In-Reply-To
header). So the final version is clean enough that I think even Keith
would be happy to have his name on it.
This really should be impossible---if there are no messages, then what
was the thread object created from? During recent debugging, it was
useful to have this error detected and reported.
In our scheme it's illegal for any message to refer to itself, (nor
would it be useful for anything anyway). Cut these self-references off
at the source, before they trip up any internal errors.
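Roughly, cutting self-references off at the source amounts to a filter
like the following. This is a sketch only; the function name
filter_self_references and its signature are invented here and are not
the actual notmuch code:

```c
#include <stddef.h>
#include <string.h>

/* Copy into 'out' every entry of 'refs' that differs from the message's
 * own ID, preserving order. 'out' must have room for n_refs pointers.
 * Returns the number of references kept. */
size_t
filter_self_references (const char *message_id,
                        const char *const *refs, size_t n_refs,
                        const char **out)
{
    size_t i, kept = 0;

    for (i = 0; i < n_refs; i++) {
        if (strcmp (refs[i], message_id) == 0)
            continue;           /* a message may not refer to itself */
        out[kept++] = refs[i];
    }

    return kept;
}
```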
This case was happening when a message had its own message ID in its
In-Reply-To header. The thread-resolution code would find the
partially constructed message, (with no thread ID yet), get garbage
from this function, and then march right along with that garbage.
With this commit, a self-cyclic message like this will now trigger an
internal error rather than marching along silently. (And a subsequent
commit will remove the call to this function in this case.)
This function has only one caller, and that one caller was passing the
same value for both talloc_owner and the notmuch database. Dropping
the redundant argument simplifies the documentation of this function
considerably.
This reduces our reliance on open message_file objects, (so is a step
toward fixing the "too many open files" bug), but more importantly, it
means we don't load a self-referencing in-reply-to header, (since we
weed those out before adding any replyto terms to the database).
We'll be a much more polite package this way. And the user can change
the prefix by editing Makefile.config. Still to be done is to make
configure write out Makefile.config and to add a --prefix option to
configure.
I was confusing myself with some rules installing to directories and
some installing to files. We do still install to a filename when
simultaneously renaming, (such as notmuch-completion.bash to notmuch).
When this function was originally written, the 'message' object was
always destroyed locally, so I thought it would be good to use a NULL
talloc context to make it more obvious if there was any leak.
Since then, however, this function has been changed to optionally
return the added message, and in that case we *don't* free the message
locally, so let's let the database be the talloc context.
The documentation for 'next-line' suggests that 'forward-line' is a
better choice for non-interactive usage. That appears to be the case
here; using next-line caused emacs to spin forever for me.
Signed-off-by: Keith Packard <keithp@keithp.com>
This is the default separator used by mailman, so there's a lot of
clutter in thread displays without this. Also, we now provide a nice
variable to the user (notmuch-show-signature-regexp) for configuring
this.
I think there's a GMime bug that we're getting parts decoded without a
final newline (the encoded parts seem to have them just fine). We can
work around the bug easily enough by finding a part-closing delimiter
that is not at the beginning of a line, and if so, just insert a
newline.
Without this, the one-line-summary of the next message would continue
on the same line as the last line of the previous message, (and this
would often happen for mailing-list messages where mailman would add
an extra part for its signature block).
notmuch restore used to only add tags; now that it clears existing
tags, it needs to operate on messages even if the new tag list is empty.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Carl Worth <cworth@cworth.org>:
I fixed up the indentation here, (someday we might switch to 8-space
indents, but we haven't yet).
This makes it much easier to actually read the subject lines.
The user can set notmuch-search-authors-width to control the width of
the column.
Two possible ideas for improving this support further:
1. Make the excess authors invisible instead of removing them from
the buffer, (which means that isearch could still find them).
2. Have the user variable control a percentage of the window width
rather than being a fixed number of columns.
Now that we're actually adding text to the buffer for the indentation,
our old approach of using positions to record regions to manipulate is
no longer correct. Fortunately, it's easy to switch from positions to
markers which are robust, (just call point-marker instead of point and
all relevant functions accept markers as well as points).
I also finally fixed the bug where the text "[6 line signature]" we
display was causing the one-line-summary of the next message to be on
the same line rather than at the beginning of the next line where it
belongs.
We now properly analyze the in-reply-to headers to create a proper
tree representing the actual thread and present the messages in this
correct thread order. Also, there's a new "depth:" value added to the
"message{" header so that clients can format the thread as desired,
(such as by indenting replies).
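One way to picture the depth value: a root message has depth 0 and each
reply is one deeper than the message it replies to. The sketch below
computes that from in-reply-to links over a flat array. The struct msg
type and both functions are hypothetical, and the real code builds an
actual tree rather than doing repeated lookups:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flat view of a thread. */
struct msg {
    const char *id;
    const char *in_reply_to;    /* NULL for a thread root */
};

/* Index of the message with the given ID, or n if it is not present. */
size_t
find_msg (const struct msg *msgs, size_t n, const char *id)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp (msgs[i].id, id) == 0)
            return i;

    return n;
}

/* Depth of msgs[i]: 0 for a root, otherwise parent's depth plus one.
 * The walk is capped at n steps so a reference cycle cannot loop forever. */
size_t
message_depth (const struct msg *msgs, size_t n, size_t i)
{
    size_t depth = 0, cur = i;

    while (depth < n && msgs[cur].in_reply_to != NULL) {
        size_t parent = find_msg (msgs, n, msgs[cur].in_reply_to);

        if (parent == n)
            break;              /* parent not in this thread: treat as root */
        cur = parent;
        depth++;
    }

    return depth;
}
```

A client could then indent each message by its depth when displaying
the thread.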
It's funny that I picked up the habit of always including a space
before a left parenthesis from Keith, and now he's in the habit of
contributing code without it.
The existing notmuch_message_get_header is *almost* good enough for
this, except that we also need to remove the '<' and '>'
delimiters. We'll probably want to implement this function with
database storage in the future rather than loading the email message.
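The delimiter stripping itself is simple. A minimal sketch, assuming the
header value holds a single message ID; the function name
strip_message_id is invented for this example:

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly malloc()ed copy of 'header' with leading whitespace and
 * the surrounding '<' and '>' removed. The caller frees the result.
 * Returns NULL if 'header' is NULL or allocation fails. */
char *
strip_message_id (const char *header)
{
    const char *start, *end;
    size_t len;
    char *id;

    if (header == NULL)
        return NULL;

    start = header;
    while (*start == ' ' || *start == '\t')
        start++;
    if (*start == '<')
        start++;

    /* The ID ends at the closing '>' or at the first whitespace. */
    end = start;
    while (*end != '\0' && *end != '>' && *end != ' ' &&
           *end != '\t' && *end != '\r' && *end != '\n')
        end++;

    len = (size_t) (end - start);
    id = malloc (len + 1);
    if (id == NULL)
        return NULL;
    memcpy (id, start, len);
    id[len] = '\0';

    return id;
}
```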