Commit graph

134 commits

Author SHA1 Message Date
David Bremner
b846bdb482 lib: extend private string map API with iterators
Support for prefix based iterators is perhaps overengineering, but I
wanted to mimic the existing database_config API.
2016-09-21 18:14:24 -03:00
David Bremner
8b03ee1d5a lib: private string map (associative array) API
The choice of array implementation is deliberate, for future iterator support
2016-09-21 18:14:24 -03:00
David Bremner
4dfb69169e lib: read "property" terms from messages.
This is a first step towards providing an API to attach
arbitrary (key,value) pairs to messages and retrieve all of the values
for a given key.
2016-09-21 18:14:24 -03:00
David Bremner
293186d6c6 lib: provide _notmuch_database_log_append
_notmuch_database_log clears the log buffer each time. Rather than
introducing more complicated semantics about for this function, provide
a second function that does not clear the buffer. This is mainly a
convenience function for callers constructing complex or multi-line log
messages.

The changes to query.cc are to make sure that the common code path of
the new function is tested.
2016-08-09 09:34:11 +09:00
Daniel Kahn Gillmor
6a833a6e83 Use https instead of http where possible
Many of the external links found in the notmuch source can be resolved
using https instead of http.  This changeset addresses as many as i
could find, without touching the e-mail corpus or expected outputs
found in tests.
2016-06-05 08:32:17 -03:00
Tomi Ollila
cf09631a45 lib: whitespace cleanup
Cleaned the following whitespace in lib/* files:

lib/index.cc:              1 line:  trailing whitespace
lib/database.cc            5 lines: 8 spaces at the beginning of line
lib/notmuch-private.h:     4 lines: 8 spaces at the beginning of line
lib/message.cc:            1 line:  trailing whitespace
lib/sha1.c:                1 line:  empty lines at the end of file
lib/query.cc:              2 lines: 8 spaces at the beginning of line
lib/gen-version-script.sh: 1 line:  trailing whitespace
2016-06-05 08:23:28 -03:00
Daniel Kahn Gillmor
9eebae3da4 Introduce _notmuch_message_has_term()
It can be useful to easily tell if a given message has a given term
associated with it.
2016-04-15 07:07:23 -03:00
Daniel Kahn Gillmor
011fc41d4d Add internal functions to search for alternate doc types
Publicly we are only exposing the non-ghost documents (of "type"
"mail").  But internally we might want to inspect the ghost messages
as well.

This changeset adds two new private interfaces to queries to recover
information about alternate document types.
2016-04-15 07:07:23 -03:00
Jani Nikula
f460ad4e9a util: move strcase_equal and strcase_hash to util
For future use in both cli and lib.
2015-09-07 09:43:31 -03:00
Austin Clements
7f57b747b9 lib: Add per-message last modification tracking
This adds a new document value that stores the revision of the last
modification to message metadata, where the revision number increases
monotonically with each database commit.

An alternative would be to store the wall-clock time of the last
modification of each message.  In principle this is simpler and has
the advantage that any process can determine the current timestamp
without support from libnotmuch.  However, even assuming a computer's
clock never goes backward and ignoring clock skew in networked
environments, this has a fatal flaw.  Xapian uses (optimistic)
snapshot isolation, which means reads can be concurrent with writes.
Given this, consider the following time line with a write and two read
transactions:

   write  |-X-A--------------|
   read 1       |---B---|
   read 2                      |---|

The write transaction modifies message X and records the wall-clock
time of the modification at A.  The writer hangs around for a while
and later commits its change.  Read 1 is concurrent with the write, so
it doesn't see the change to X.  It does some query and records the
wall-clock time of its results at B.  Transaction read 2 later starts
after the write commits and queries for changes since wall-clock time
B (say the reads are performing an incremental backup).  Even though
read 1 could not see the change to X, read 2 is told (correctly) that
X has not changed since B, the time of the last read.  In fact, X
changed before wall-clock time A, but the change was not visible until
*after* wall-clock time B, so read 2 misses the change to X.

This is tricky to solve in full-blown snapshot isolation, but because
Xapian serializes writes, we can use a simple, monotonically
increasing database revision number.  Furthermore, maintaining this
revision number requires no more IO than a wall-clock time solution
because Xapian already maintains statistics on the upper (and lower)
bound of each value stream.
2015-08-13 23:52:51 +02:00
David Bremner
9d192da683 lib: eliminate fprintf from _notmuch_message_file_open
You may wonder why _notmuch_message_file_open_ctx has two parameters.
This is because we need sometime to use a ctx which is a
notmuch_message_t. While we could get the database from this, there is
no easy way in C to tell type we are getting.
2015-03-29 00:34:15 +01:00
David Bremner
9b73a8bcc9 lib: add private function to extract the database for a message.
This is needed by logging in functions outside message.cc that take
only a notmuch_message_t object.
2015-03-29 00:34:15 +01:00
David Bremner
b53e1a2da7 lib: add a log function with output to a string in notmuch_database_t
In principle in the future this could do something fancier than
asprintf.
2015-03-29 00:34:15 +01:00
Jani Nikula
08757767de lib: fix clang build warnings
Fix the following warning produced by clang 3.5.0:

lib/message.cc:899:4: warning: comparison of constant 64 with expression of type 'notmuch_message_flag_t' (aka '_notmuch_message_flag') is always true [-Wtautological-constant-out-of-range-compare]
        ! NOTMUCH_TEST_BIT (message->lazy_flags, flag))
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./lib/notmuch-private.h:70:6: note: expanded from macro 'NOTMUCH_TEST_BIT'
    (_NOTMUCH_VALID_BIT(bit) ? !!((val) & (1ull << (bit))) : 0)
     ^~~~~~~~~~~~~~~~~~~~~~~
./lib/notmuch-private.h:68:26: note: expanded from macro '_NOTMUCH_VALID_BIT'
    ((bit) >= 0 && (bit) < CHAR_BIT * sizeof (unsigned long long))
                   ~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2015-02-25 23:09:39 +01:00
Jani Nikula
41b870fba5 lib: abstract bit validity check in bit test/set/clear macros
Reduce duplication in the bit test/set/clear macros. No functional
changes.
2015-02-25 23:08:35 +01:00
Austin Clements
bc9c50602d lib: Internal support for querying and creating ghost messages
This updates the message abstraction to support ghost messages: it
adds a message flag that distinguishes regular messages from ghost
messages, and an internal function for initializing a newly created
(blank) message as a ghost message.
2014-10-25 19:26:54 +02:00
Austin Clements
d99491f274 lib: Introduce macros for bit operations
These macros help clarify basic bit-twiddling code and are written to
be robust against C undefined behavior of shift operators.
2014-10-25 19:26:43 +02:00
Austin Clements
54ec8a0fd8 lib: Move message ID compression to _notmuch_message_create_for_message_id
Previously, this was performed by notmuch_database_add_message.  This
happens to be the only caller currently (which is why this was safe),
but we're about to introduce more callers, and it makes more sense to
put responsibility for ID compression in the lower-level function
rather than requiring each caller to handle it.
2014-10-11 07:09:54 +02:00
Charles Celerier
df8885f62c lib: Start all function names in notmuch-private.h with
As noted in devel/STYLE, every private library function should start
with _notmuch. This patch corrects function naming that did not adhere
to this style in lib/notmuch-private.h. In particular, the old function
names that now begin with _notmuch are

    notmuch_sha1_of_file
    notmuch_sha1_of_string
    notmuch_message_file_close
    notmuch_message_file_get_header
    notmuch_message_file_open
    notmuch_message_get_author
    notmuch_message_set_author

Signed-off-by: Charles Celerier <cceleri@cs.stanford.edu>
2014-07-13 12:25:29 -03:00
Jani Nikula
473930bb6f lib: replace the header parser with gmime
The notmuch library includes a full blown message header parser. Yet
the same message headers are parsed by gmime during indexing. Switch
to gmime parsing completely.

These are the main changes:

* Gmime stops header parsing at the first invalid header, and presumes
  the message body starts from there. The current parser is quite
  liberal in accepting broken headers. The change means we will be
  much pickier about accepting invalid messages.

* The current parser converts tabs used in header folding to
  spaces. Gmime preserve the tabs. Due to a broken python library used
  in mailman, there are plenty of mailing lists that produce headers
  with tabs in header folding, and we'll see plenty of tabs. (This
  change has been mitigated in preparatory patches.)

* For pure header parsing, the current parser is likely faster than
  gmime, which parses the whole message rather than just the
  headers. Since we parse the message and its headers using gmime for
  indexing anyway, this avoids and extra header parsing round when
  adding new messages. In case of duplicate messages, we'll end up
  parsing the full message although just headers would be
  sufficient. All in all this should still speed up 'notmuch new'.

* Calls to notmuch_message_get_header() may be slightly slower than
  previously for headers that are not indexed in the database, due to
  parsing of the whole message. Within the notmuch code base, notmuch
  reply is the only such user.
2014-04-05 12:53:04 -03:00
Jani Nikula
1fa8e40561 lib: make folder: prefix literal
In xapian terms, convert folder: prefix from probabilistic to boolean
prefix, matching the paths, relative from the maildir root, of the
message files, ignoring the maildir new and cur leaf directories.

folder:foo matches all message files in foo, foo/new, and foo/cur.

folder:foo/new does *not* match message files in foo/new.

folder:"" matches all message files in the top level maildir and its
new and cur subdirectories.

This change constitutes a database change: bump the database version
and add database upgrade support for folder: terms. The upgrade also
adds path: terms.

Finally, fix the folder search test for literal folder: search, as
some of the folder: matching capabilities are lost in the
probabilistic to boolean prefix change.
2014-03-11 19:51:22 -03:00
Jani Nikula
db465e443f lib: fix clang build
Long story short, fix build on recent (3.2+) clang.

The long story for posterity follows.

gcc 4.6 added new warnings about structs with greater visibility than
their fields. The warnings were silenced by adjusting visibility in

commit d5523ead90
Author: Carl Worth <cworth@cworth.org>
Date:   Wed May 11 13:23:13 2011 -0700

    Mark some structures in the library interface with visibility=default attribute.

Later on,

commit 3b76adf9e2
Author: Austin Clements <amdragon@MIT.EDU>
Date:   Sat Jan 14 19:17:33 2012 -0500

    lib: Add support for automatically excluding tags from queries

changed visibility of struct _notmuch_string_list for the same reason, and

commit 1a53f9f116
Author: Mark Walters <markwalters1009@gmail.com>
Date:   Thu Mar 1 22:30:38 2012 +0000

    lib: Add the exclude flag to notmuch_query_search_threads

split the struct _notmuch_string_list and its typedef
notmuch_string_list_t as a way to make a forward declaration for
_notmuch_thread_create().

The subtle difference was that the struct definition now had 'visible'
in it, while the typedef didn't, and it was within the #pragma GCC
visibility push(hidden) block. This went unnoticed, as the then common
versions of clang didn't care about this.

A later change in clang (I did not dig into when this change was
introduced) caused the following error:

CXX  -O2 lib/database.o
In file included from lib/database.cc:21:
In file included from ./lib/database-private.h:33:
./lib/notmuch-private.h:479:8: error: visibility does not match previous declaration
struct visible _notmuch_string_list {
       ^
./lib/notmuch-private.h:67:33: note: expanded from macro 'visible'
                                ^
./lib/notmuch-private.h:52:13: note: previous attribute is here
            ^
1 error generated.
make: *** [lib/database.o] Error 1

This is slightly misleading due to the reference to the #pragma. The
real culprit is the typedef within the #pragma.

We could just add 'visible' to the typedef, or move the typedef
outside of the #pragma, and be done with it, but juggle the
declarations a bit to accommodate moving the typedef back with the
struct, and keep the visibility attribute in one place.

The problem was originally reported by Simonas Kazlauskas
<s@kazlauskas.me> in id:20130418102507.GA23688@godbox but I was only
able to reproduce and investigate now that I upgraded clang.
2013-09-01 07:06:54 -03:00
Austin Clements
3fbb518335 lib: Document which strings are returned in UTF-8
Any string that ultimately comes from notmuch_message_file_get_header
is in UTF-8.
2013-08-13 17:43:34 +02:00
Mark Walters
38698d8659 lib: add --exclude=all option
Adds a exclude all option to the lib which means that excluded
messages are completely ignored (as if they had actually been
deleted).
2013-05-13 21:32:03 -03:00
Austin Clements
d6e3905df7 lib: Eliminate _notmuch_message_list_append
This API invited micro-optimized and complicated list pointer
manipulation and is no longer used.
2013-02-18 20:20:38 -04:00
Austin Clements
5394924e6c lib: Separate list of all messages from top-level messages
Previously, thread.cc built up a list of all messages, then
proceeded to tear it apart to transform it into a list of
top-level messages.  Now we simply build a new list of top-level
messages.

This simplifies the interface to _notmuch_message_add_reply,
eliminates the pointer acrobatics from
_resolve_thread_relationships, and will enable us to do things
with the list of all messages in the following patches.
2013-02-18 20:20:24 -04:00
Justus Winter
faf6ede3ef Fix the COERCE_STATUS macro
Fix the COERCE_STATUS macro to handle _internal_error being declared
as void function.

Note that the function _internal_error does not return. Evaluating to
NOTMUCH_STATUS_SUCCESS is done purely to appease the compiler.

Signed-off-by: Justus Winter <4winter@informatik.uni-hamburg.de>
2012-09-27 12:51:51 -03:00
Austin Clements
fe1ca14104 lib: Make notmuch_database_get_directory return NULL if the directory is not found
Using the new support from _notmuch_directory_create, this makes
notmuch_database_get_directory a read-only operation that simply
returns the directory object if it exists or NULL otherwise.  This
also means that notmuch_database_get_directory can work on read-only
databases.

This change breaks the directory mtime workaround in notmuch-new.c by
fixing the exact issue it was working around.  This permits mtime
update races to prevent scans of changed directories, which
non-deterministically breaks a few tests.  The next patch fixes this.
2012-05-23 22:30:55 -03:00
Austin Clements
67ae2377a9 lib: Perform the same transformation to _notmuch_database_filename_to_direntry
Now _notmuch_database_filename_to_direntry takes a flags argument and
can indicate if the necessary directory documents do not exist.
Again, callers have been updated, but retain their original behavior.
2012-05-23 22:30:43 -03:00
Austin Clements
0c950146a1 lib: Perform the same transformation to _notmuch_database_find_directory_id
Now _notmuch_database_find_directory_id takes a flags argument, which
it passes through to _notmuch_directory_create and can indicate if the
directory does not exist.  Again, callers have been updated, but
retain their original behavior.
2012-05-23 22:30:32 -03:00
Austin Clements
f69314fbd3 lib: Make directory document creation optional for _notmuch_directory_create
Previously this function would create directory documents if they
didn't exist.  As a result, it could only be used on writable
databases.  This adds an argument to make creation optional and to
make this function work on read-only databases.  We use a flag
argument to avoid a bare boolean and to permit future expansion.

Both callers have been updated, but currently retain the old behavior.
We'll take advantage of the new argument in the following patches.
2012-05-23 22:30:20 -03:00
Jani Nikula
de0557477d lib: work around talloc_steal usage from C++ code
Implicit typecast from 'void *' to 'T *' is okay in C, but not in
C++. In talloc_steal, an explicit cast is provided for type safety in
some GCC versions. Otherwise, a cast is required. Provide a template
function for this to maintain type safety, and redefine talloc_steal
to use it.

The template must be outside the extern "C" block (NOTMUCH_BEGIN_DECLS
and NOTMUCH_END_DECLS), but keep it within the GCC visibility #pragma.

No functional changes, apart from making the library build with
compilers other than recent GCC.

Signed-off-by: Jani Nikula <jani@nikula.org>
2012-04-15 09:42:15 -03:00
Mark Walters
1a53f9f116 lib: Add the exclude flag to notmuch_query_search_threads
Add the NOTMUCH_MESSAGE_FLAG_EXCLUDED flag to
notmuch_query_search_threads. Implemented by inspecting the tags
directly in _notmuch_thread_create/_thread_add_message rather than as
a Xapian query for speed reasons.

Note notmuch_thread_get_matched_messages now returns the number of
non-excluded matching messages. This API is not totally desirable but
fixing it means breaking binary compatibility so we delay that.
2012-03-02 08:28:39 -04:00
Mark Walters
c9eb94d7fb lib: Make notmuch_query_search_messages set the exclude flag
Add a flag NOTMUCH_MESSAGE_FLAG_EXCLUDED which is set by
notmuch_query_search_messages for excluded messages. Also add an
option omit_excluded_messages to the search that we do not want the
excludes at all.

This exclude flag will be added to notmuch_query_search threads in the
next patch.
2012-03-02 08:27:47 -04:00
Austin Clements
3b76adf9e2 lib: Add support for automatically excluding tags from queries
This is useful for tags like "deleted" and "spam" that people
generally want to exclude from query results.  These exclusions will
be overridden if a tag is explicitly mentioned in a query.
2012-01-16 21:06:35 -04:00
Austin Clements
567bcbc294 Store "from" and "subject" headers in the database.
This is a rebase and cleanup of Istvan Marko's patch from
id:m3pqnj2j7a.fsf@zsu.kismala.com

Search retrieves these headers for every message in the search
results.  Previously, this required opening and parsing every message
file.  Storing them directly in the database significantly reduces IO
and computation, speeding up search by between 50% and 10X.

Taking full advantage of this requires a database rebuild, but it will
fall back to the old behavior for messages that do not have headers
stored in the database.
2011-11-14 17:10:58 -04:00
David Bremner
1dedfc90f6 xutil.c: remove duplicate copies, create new library libutil.a to contain xutil.
We keep the lib/xutil.c version. As a consequence, also factor out
_internal_error and associated macros.  It might be overkill to make a
new file error_util.c for this, but _internal_error does not really
belong in database.cc.
2011-10-30 23:09:49 -03:00
Austin Clements
bfe4555325 lib: Remove message document directly after removing the last file name.
Previously, notmuch_database_remove_message would remove the message
file name, sync the change to the message document, re-find the
message document, and then delete it if there were no more file names.
An interruption after sync'ing would result in a file-name-less,
permanently un-removable zombie message that would produce errors and
odd results in searches.  We could wrap this in an atomic section, but
it's much simpler to eliminate the round-about approach and just
delete the message document instead of sync'ing it if we removed the
last filename.
2011-09-23 21:50:39 -04:00
Carl Worth
d5523ead90 Mark some structures in the library interface with visibility=default attribute.
As of gcc 4.6, there are new warnings from -Wattributes along the lines of:

	warning: ‘_notmuch_messages’ declared with greater visibility
	than the type of its field ‘_notmuch_messages::iterator’
	[-Wattributes]

To squelch these, we decorate all such containing structs with
__attribute__((visibility("default"))). We take care to let only the
C++ compiler see this, (since the C compiler would otherwise warn
about ignored visibility attributes on types).
2011-05-11 13:27:15 -07:00
Austin Clements
f3c1eebfaf Implement an internal generic string list and use it.
This replaces the guts of the filename list and tag list, making those
interfaces simple iterators over the generic string list.  The
directory, message filename, and tags-related code now build generic
string lists and then wraps them in specific iterators.  The real wins
come in later patches, when we use these for even more generic
functionality.

As a nice side-effect, this also eliminates the annoying dependency on
GList in the tag list.
2011-03-21 02:45:18 -04:00
Carl Worth
99cfa27030 Add support for folder-based searching.
A new "folder:" prefix in the query string can now be used to match
the directories in which mail files are stored.

The addition of this feature causes the recently added
search-by-folder tests to now pass.
2011-01-15 15:37:43 -08:00
Austin Clements
b3caef1f06 Optimize thread search using matched docid sets.
This reduces thread search's 1+2t Xapian queries (where t is the
number of matched threads) to 1+t queries and constructs exactly one
notmuch_message_t for each message instead of 2 to 3.
notmuch_query_search_threads eagerly fetches the docids of all
messages matching the user query instead of lazily constructing
message objects and fetching thread ID's from term lists.
_notmuch_thread_create takes a seed docid and the set of all matched
docids and uses a single Xapian query to expand this docid to its
containing thread, using the matched docid set to determine which
messages in the thread match the user query instead of using a second
Xapian query.

This reduces the amount of time required to load my inbox from 4.523
seconds to 3.025 seconds (1.5X faster).
2010-12-07 16:40:05 -08:00
Carl Worth
95dd5fe5d7 notmuch_message_tags_to_maildir_flags: Do nothing outside of "new" and "cur"
Some people use notmuch with non-maildir files, (for example, email
messages in MH format, or else cool things like using sluk[*] to suck
down feeds into a format that notmuch can index).

To better support uses like that, don't do any renaming for files that
are not in a directory named either "new" or "cur".

[*] https://github.com/krl/sluk/
2010-11-11 14:32:17 -08:00
Carl Worth
1d02dd64af lib: Add new, public notmuch_message_get_filenames
This augments the existing notmuch_message_get_filename by allowing
the caller access to all filenames in the case of multiple files for a
single message.

To support this, we split the iterator (notmuch_filenames_t) away from
the list storage (notmuch_filename_list_t) where previously these were
a single object (notmuch_filenames_t). Then, whenever the user asks
for a file or filename, the message object lazily creates a complete
notmuch_filename_list_t and then:

	For notmuch_message_get_filename, returns the first filename
	in the list.

	For notmuch_message_get_filenames, creates and returns a new
	iterator for the filename list.
2010-11-11 03:40:19 -08:00
Carl Worth
d87db88432 lib: Add new implementation of notmuch_filenames_t
The new implementation is simply a talloc-based list of strings. The
former support (a list of database terms with a common prefix) is
implemented by simply pre-iterating over the terms and populating the
list. This should provide no performance disadvantage as callers of
thigns like notmuch_directory_get_child_files are very likely to
always iterate over all filenames anyway.

This new implementation of notmuch_filenames_t is in preparation for
adding API to query all of the filenames for a single message.
2010-11-11 03:40:19 -08:00
Carl Worth
bb74e9dff8 lib: Rework interface for maildir_flags synchronization
Instead of having an API for setting a library-wide flag for
synchronization (notmuch_database_set_maildir_sync) we instead
implement maildir synchronization with two new library functions:

	notmuch_message_maildir_flags_to_tags
  and   notmuch_message_tags_to_maildir_flags

These functions are nicely documented here, (though the implementation
does not quite match the documentation yet---as plainly evidenced by
the current results of the test suite).
2010-11-11 03:40:19 -08:00
Michal Sojka
088801a14a Maildir synchronization
This patch allows bi-directional synchronization between maildir
flags and certain tags. The flag-to-tag mapping is defined by flag2tag
array.

The synchronization works this way:

1) Whenever notmuch new is executed, the following happens:
   o New messages are tagged with configured new_tags.
   o For new or renamed messages with maildir info present in the file
     name, the tags defined in flag2tag are either added or removed
     depending on the flags from the file name.

2) Whenever notmuch tag (or notmuch restore) is executed, a new set of
   flags based on the tags is constructed for every message and a new
   file name is prepared based on the old file name but with the new
   flags. If the flags differs and the old message was in 'new'
   directory then this is replaced with 'cur' in the new file name. If
   the new and old file names differ, the file is renamed and notmuch
   database is updated accordingly.

   The rename happens before the database is updated. In case of crash
   between rename and database update, the next run of notmuch new
   brings the database in sync with the mail store again.
2010-11-10 13:09:31 -08:00
Carl Worth
c81cecf620 lib: Add GCC visibility(hidden) pragmas to private header files.
This prevents any of the private functions from being leaked out
through the library interface (at least when compiling with a
recent-enough gcc to support the visibility pragma).
2010-11-01 22:35:48 -07:00
Carl Worth
7b78eb4af6 Add support (and tests) for messages with really long message IDs.
Scott Henson reported an internal error that occurred when he tried to
add a message that referenced another message with a message ID well
over 300 characters in length. The bug here was running into a Xapian
limit for the length of metadata key names, (which is even more
restrictive than the Xapian limit for the length of terms).

We fix this by noticing long message ID values and instead using a
message ID of the form "notmuch-sha1-<sha1_sum_of_message_id>". That
is, we use SHA1 to generate a compressed, (but still unique), version
of the message ID.

We add support to the test suite to exercise this fix. The tests add a
message referencing the long message ID, then add the message with the
long message ID, then finally add another message referencing the long
ID. Each of these tests exercise different code paths where the
special handling is implemented.

A final test ensures that all three messages are stitched together
into a single thread---guaranteeing that the three code paths all act
consistently.
2010-06-04 13:35:07 -07:00
Carl Worth
98845fdbb2 Avoid database corruption by not adding partially-constructed mail documents.
Previously we were using Xapian's add_document to allocate document ID
values for notmuch_message_t objects.  This had the drawback of adding
a partially constructed mail document to the database. If notmuch was
subsequently interrupted before fully populating this document, then
later runs would be quite confused when seeing the partial documents.

There are reports from the wild of people hitting internal errors of
the form "Message ... has no thread ID" for example, (which is
currently an unrecoverable error).

We fix this by manually allocating document IDs without adding
documents. With this change, we never call Xapian's add_document
method, but only replace_document with either the current document ID
of a message or a new one that we have allocated.
2010-06-04 10:16:53 -07:00
Dirk Hohndel
5b8b0377cb Make Received: header special in notmuch_message_file_get_header
With this patch the Received: header becomes special in the way
we treat headers - this is the only header for which we concatenate
all the instances we find (instead of just returning the first one).

This will be used in the From guessing code for replies as we need to
be able to walk ALL of the Received: headers in a message to have a
good chance to guess which mailbox this email was delivered to.

Signed-off-by: Dirk Hohndel <hohndel@infradead.org>
2010-04-26 14:44:06 -07:00
Dirk Hohndel
57561414d7 Add authors member to message
message->authors contains the author's name (as we want to print it)
get / set methods are declared in notmuch-private.h

Signed-off-by: Dirk Hohndel <hohndel@infradead.org>
2010-04-26 11:44:49 -07:00
Jesse Rosenthal
4971b85641 Name thread based on matching msgs instead of first msg.
At the moment all threads are named based on the name of the first message
in the thread. However, this can cause problems if people either start
new threads by replying-all (as unfortunately, many out there do) or
change the subject of their mails to reflect a shift in a thread on a
list.

This patch names threads based on (a) matches for the query, and (b) the
search order. If the search order is oldest-first (as in the default
inbox) it chooses the oldest matching message as the subject. If the
search order is newest-first it chooses the newest one.

Reply prefixes ("Re: ", "Aw: ", "Sv: ", "Vs: ") are ignored
(case-insensitively) so a Re: won't change the subject.

Note that this adds a "sort" argument to _notmuch_thread_create and
_thread_add_matched_message, so that when constructing the thread we can
be aware of the sort order.

Signed-off-by: Jesse Rosenthal <jrosenthal@jhu.edu>
2010-04-21 14:56:53 -07:00
Carl Worth
4e5d2f22db lib: Rename iterator functions to prepare for reverse iteration.
We rename 'has_more' to 'valid' so that it can function whether
iterating in a forward or reverse direction. We also rename
'advance' to 'move_to_next' to setup parallel naming with
the proposed functions 'move_to_first', 'move_to_last', and
'move_to_previous'.
2010-03-09 09:22:29 -08:00
Carl Worth
d12801c8b4 lib: Split the database upgrade into two phases for safer operation.
The first phase copies data from the old format to the new format
without deleting anything. This allows an old notmuch to still use the
database if the upgrade process gets interrupted. The second phase
performs the deletion (after updating the database version number). If
the second phase is interrupted, there will be some unused data in the
database, but it shouldn't cause any actual harm.
2010-01-09 11:13:12 -08:00
Carl Worth
909f52bd8c lib: Implement versioning in the database and provide upgrade function.
The recent support for renames in the database is our first time
(since notmuch has had more than a single user) that we have a
database format change. To support smooth upgrades we now encode a
database format version number in the Xapian metadata.

Going forward notmuch will emit a warning if used to read from a
database with a newer version than it natively supports, and will
refuse to write to a database with a newer version.

The library also provides functions to query the database format
version:

	notmuch_database_get_version

to ask if notmuch wants a newer version than that:

	notmuch_database_needs_upgrade

and a function to actually perform that upgrade:

	notmuch_database_upgrade
2010-01-07 18:26:31 -08:00
Carl Worth
807aef93d3 Prefer READ_ONLY consistently over READONLY.
Previously we had NOTMUCH_DATABASE_MODE_READ_ONLY but
NOTMUCH_STATUS_READONLY_DATABASE which was ugly and confusing. Rename
the latter to NOTMUCH_STATUS_READ_ONLY_DATABASE for consistency.
2010-01-07 10:29:05 -08:00
Carl Worth
f93b7218c3 lib: Consolidate checks for read-only database.
Previously, many checks were deep in the library just before a cast
operation. These have now been replaced with internal errors and new
checks have instead been added at the beginning of all top-levelentry
points requiring a read-write database.

The new checks now also use a single function for checking and
printing the error message. This will give us a convenient location to
extend the check, (such as based on database version as well).
2010-01-07 10:19:44 -08:00
Carl Worth
d807e28f43 lib: Implement new notmuch_directory_t API.
This new directory ojbect provides all the infrastructure needed to
detect when files or directories are deleted or renamed. There's still
code needed on top of this (within "notmuch new") to actually do that
detection.
2010-01-06 10:32:06 -08:00
Carl Worth
498edff503 database: Abstract _filename_to_direntry from _add_message
The code to map a filename to a direntry is something that we're going
to want in a future _remove_message function, so put it in a new
function _notmuch_database_filename_to_direntry .
2010-01-06 10:32:05 -08:00
Carl Worth
1376a90db6 database: Allowing storing multiple filenames for a single message ID.
The library interface is unchanged so far, (still just
notmuch_database_add_message), but internally, the old
_set_filename function is now _add_filename instead.
2010-01-06 10:32:05 -08:00
Carl Worth
6ca6c089e9 database: Store mail filename as a new 'direntry' term, not as 'data'.
Instead of storing the complete message filename in the data portion
of a mail document we now store a 'direntry' term that contains the
document ID of a directory document and also the basename of the
message filename within that directory. This will allow us to easily
store multple filenames for a single message, and will also allow us
to find mail documents for files that previously existed in a
directory but that have since been deleted.
2010-01-06 10:32:05 -08:00
Carl Worth
84742d86ab database: Split _find_parent_id into _split_path and _find_directory_id
Some pending commits want the _split_path functionality separate from
mapping a directory to a document ID. The split_path function now
returns the basename as well as the directory name.
2010-01-06 10:32:05 -08:00
Carl Worth
406ec4b15d database: Export _notmuch_database_find_parent_id for internal use.
We'll soon have mail documents referring to their parent directory's
directory documents, so we'll need access to _find_parent_id in files
such as message.cc.
2010-01-06 10:32:05 -08:00
Carl Worth
ba12bf1f26 lib: Abstract the extraction of a relative path from set_filename
We'll soon be having multiple entry points that accept a filename
path, so we want common code for getting a relative path from a
potentially absolute path.
2010-01-06 10:32:05 -08:00
Carl Worth
8c6b7d311c lib: Add missing value to notmuch_private_status_t enum.
And fix the initialization such that the private enum will always have
distinct values from the public enum even if we similarly miss the
addition of a new public value in the future.
2010-01-06 10:32:05 -08:00
Fernando Carrijo
db68eea013 Nuke the remainings of _notmuch_message_add_thread_id.
The function _notmuch_message_add_thread_id has been removed
from the private interface of notmuch. There's no reason for
one to keep a declaration of its prototype in the code base.
Also, lets update a commentary that referenced that function
and escaped from previous scrutiny.

Signed-off-by: Fernando Carrijo <fcarrijo@yahoo.com.br>
2009-12-09 12:09:55 -08:00
Jeffrey C. Ollie
95f97540a0 Remove unused notmuch_parse_date function prototype.
notmuch_parse_date is not implemented, so remove the unused function
prototype.

Signed-off-by: Jeffrey C. Ollie <jeff@ocjtech.us>
2009-12-03 17:07:22 -08:00
Carl Worth
880b21a097 Makefile: Incorporate getline implementation into the build.
It's unconditional for a very short time. We expect to soon be
building it only if necessary.
2009-12-01 16:33:17 -08:00
Carl Worth
70962fabf9 lib/messages.c: Make message searches stream as well.
Xapian provides an interator-based interface to all search results.
So it was natural to make notmuch_messages_t be iterator-based as
well. Which we did originally.

But we ran into a problem when we added two APIs, (_get_replies and
_get_toplevel_messages), that want to return a messages iterator
that's *not* based on a Xapian search result. My original compromise
was to use notmuch_message_list_t as the basis for all returned
messages iterators in the public interface.

This had the problem of introducing extra latency at the beginning
of a search for messages, (the call would block while iterating over
all results from Xapian, converting to a message list).

In this commit, we remove that initial conversion and instead provide
two alternate implementations of notmuch_messages_t (one on top of a
Xapian iterator and one on top of a message list).

With this change, I tested a "notmuch search" returning *many* results
as previously taking about 7 seconds before results started appearing,
and now taking only 2 seconds.
2009-11-24 11:33:09 -08:00
Chris Wilson
f379aa5284 Permit opening the notmuch database in read-only mode.
We only rarely need to actually open the database for writing, but we
always create a Xapian::WritableDatabase. This has the effect of
preventing searches and like whilst updating the index.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Carl Worth <cworth@cworth.org>
2009-11-21 22:04:49 +01:00
Ingmar Vanhassel
2ce25b93a7 Typsos 2009-11-18 03:21:36 -08:00
Carl Worth
0da0131096 database: Make _parse_message_id static once again.
We had exposed this to the internal implementation for a short time,
(only while we had the silly code fetching In-Reply-To values from
message files instead of from the database). Make this private again
as it should be.
2009-11-17 18:50:13 -08:00
Keith Packard
d025e89ac7 Fix "too many open files" bug by closing message files when done with them.
The message file header parsing code parses only enough of the file to
find the desired header fields, then it leaves the file open until the
next header parsing call or when the message is no longer in use. If a
large number of messages end up being active, this will quickly run
out of file descriptors.

Here, we add support to explicitly close the message file within a
message, (_notmuch_message_close) and call that from thread
construction code.

Signed-off-by: Keith Packard <keithp@keithp.com>

Edited-by: Carl Worth <cworth@cworth.org>:

Many portions of Keith's original patch have since been solved other
ways, (such as the code that changed the handling of the In-Reply-To
header). So the final version is clean enough that I think even Keith
would be happy to have his name on it.
2009-11-17 18:37:13 -08:00
Carl Worth
24a25ffba9 Remove the talloc_owner argument from create_for_message_id.
This function has only one caller, and that one caller was passing the
same value for both talloc_owner and the notmuch database. Dropping
the redundant argument simplifies the documentation of this function
considerably.
2009-11-17 17:42:32 -08:00
Carl Worth
933caf814f notmuch show: Implement proper thread ordering/nesting of messages.
We now properly analyze the in-reply-to headers to create a proper
tree representing the actual thread and present the messages in this
correct thread order. Also, there's a new "depth:" value added to the
"message{" header so that clients can format the thread as desired,
(such as by indenting replies).
2009-11-15 20:41:45 -08:00
Carl Worth
d136a1e2cf Add _notmuch_message_get_in_reply_to.
The existing notmuch_message_get_header is *almost* good enough for
this, except that we also need to remove the '<' and '>'
delimiters. We'll probably want to implement this function with
database storage in the future rather than loading the email message.
2009-11-15 20:36:51 -08:00
Carl Worth
b97756926f Remove obsolete notmuch_message_get_subject prototype.
This prototype has been sitting around for a while with no function
implementing it. I wonder if there's a compiler warning I could turn
on to catch these things.
2009-11-15 20:34:24 -08:00
Carl Worth
f970d8078c lib/messages: Add new notmuch_message_list_t to internal interface.
Previously, the notmuch_messages_t object was a linked list built on
top of a linked-list node with the odd name of notmuch_message_list_t.

Now, we've got much more sane naming with notmuch_message_list_t being
a list built on a linked-list node named notmuch_message_node_t. And
now the public notmuch_messages_t object is a separate iterator based
on notmuch_message_node_t. This means the interfaces for the new
notmuch_message_list_t object are now made available to the library
internals.
2009-11-15 20:31:30 -08:00
Carl Worth
9b1c6c250b Export _parse_message_id to the library implementation.
Not exported through the public interface, but the thread code is
going to want to be able to parse In-Reply-To headers so needs access
to this code.
2009-11-15 20:21:43 -08:00
Carl Worth
d3349358c6 lib: Move notmuch_messages_t code from query.cc to new messages.c
The new object is simply a linked-list of notmuch_message_t objects,
(unlike the old object which contained a couple of Xapian iterators).
This works now by the query code immediately iterator over all results
and creating notmuch_message_t objects for them, (rather than waiting
to create the objects until the notmuch_messages_get call as we did
earlier).

The point of this change is to allow other instances of lists of
messages, (such as in notmuch_thread_t), that are not directly related
to Xapian search results.
2009-11-14 23:05:17 -08:00
Carl Worth
c168e24174 notmuch search: Print the number of matched/total messages for each thread.
Note that we don't print the number of *unread* messages, but instead
the number of messages that matched the search terms. This is in
keeping with our philosophy that the inbox is nothing more than a
search view. If we search for messages with an inbox tag, then that's
what we'll get a count of. (And if somebody does want to see unread
counts, then they can search for the "unread" tag.)

Getting the number of matched messages is really nice when doing
historical searches. For example in a search like:

	notmuch search tag:sent

(where the "sent" tag has been applied to all messages originating
from the user's email address)---here it's really nice to be able to
see a thread where the user just mentioned one point [1/13] vs. really
getting involved in the discussion [10/29].
2009-11-12 22:01:44 -08:00
Carl Worth
ec6d3506db notmuch search: Print all authors contributing to a thread.
We've now expanded the notmuch_thread_create function to fire off a
secondary database query to find all the messages that belong to this
particular thread. This allows us to now have the complete authors'
list for the thread, and will also make it trivial to print accurate
message counts for threads in the future.
2009-11-12 21:09:54 -08:00
Carl Worth
1465493210 libify: Move library sources down into lib directory.
A "make" invocation still works from the top-level, but not from
down inside the lib directory yet.
2009-11-09 16:24:03 -08:00
Renamed from notmuch-private.h (Browse further)