Commit graph

88 commits

Author SHA1 Message Date
Austin Clements
30de720ba0 lib: Invalidate message metadata in _notmuch_message_gen_terms
Previously, we invalidated stored message metadata in
_notmuch_message_add_term and _notmuch_message_remove_term, but not in
_notmuch_message_gen_terms.  This doesn't currently result in any bugs
because of our limited uses of _notmuch_message_gen_terms, but it may
could cause trouble in the future.
2014-08-04 18:57:55 -03:00
Charles Celerier
df8885f62c lib: Start all function names in notmuch-private.h with
As noted in devel/STYLE, every private library function should start
with _notmuch. This patch corrects function naming that did not adhere
to this style in lib/notmuch-private.h. In particular, the old function
names that now begin with _notmuch are

    notmuch_sha1_of_file
    notmuch_sha1_of_string
    notmuch_message_file_close
    notmuch_message_file_get_header
    notmuch_message_file_open
    notmuch_message_get_author
    notmuch_message_set_author

Signed-off-by: Charles Celerier <cceleri@cs.stanford.edu>
2014-07-13 12:25:29 -03:00
Austin Clements
dc64ab6720 lib: Separate all phrases indexed by _notmuch_message_gen_terms
This adds a 100 termpos gap between all phrases indexed by
_notmuch_message_gen_terms.  This fixes a bug where terms from the end
of one header and the beginning of another header could match together
in a single phrase and a separate bug where term positions of
un-prefixed terms overlapped.

This fix only affects newly indexed messages.  Messages that are
already indexed won't benefit from this fix without re-indexing, but
the fix won't make things any worse for existing messages.
2014-06-18 18:03:18 -03:00
Jani Nikula
1fa8e40561 lib: make folder: prefix literal
In xapian terms, convert folder: prefix from probabilistic to boolean
prefix, matching the paths, relative from the maildir root, of the
message files, ignoring the maildir new and cur leaf directories.

folder:foo matches all message files in foo, foo/new, and foo/cur.

folder:foo/new does *not* match message files in foo/new.

folder:"" matches all message files in the top level maildir and its
new and cur subdirectories.

This change constitutes a database change: bump the database version
and add database upgrade support for folder: terms. The upgrade also
adds path: terms.

Finally, fix the folder search test for literal folder: search, as
some of the folder: matching capabilities are lost in the
probabilistic to boolean prefix change.
2014-03-11 19:51:22 -03:00
Jani Nikula
59823f9642 lib: add support for path: prefix searches
The path: prefix is a literal boolean prefix matching the paths,
relative from the maildir root, of the message files.

path:foo matches all message files in foo (but not in foo/new or
foo/cur).

path:foo/new matches all message files in foo/new.

path:"" matches all message files in the top level maildir.

path:foo/** matches all message files in foo and recursively in all
subdirectories of foo.

path:** matches all message files recursively, i.e. all messages.
2014-03-11 19:51:22 -03:00
Jani Nikula
4d150eba67 lib: refactor folder term update after filename removal
Abstract some blocks of code for reuse. No functional changes.
2014-03-11 19:51:22 -03:00
Tomi Valkeinen
075d53dde5 lib: fix error handling
Currently if a Xapian exception happens in notmuch_message_get_header,
the exception is not caught leading to crash. In
notmuch_message_get_date the exception is caught, but an internal error
is raised, again leading to crash.

This patch fixes the error handling by making both functions catch the
Xapian exceptions, print an error and return NULL or 0.

The 'notmuch->exception_reported' is also set, as is done elsewhere,
even if I don't really get the idea of that field.

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@iki.fi>
2014-01-18 14:47:35 -04:00
Louis Rilling
a9b2135c75 tags_to_maildir_flags: Don't rename if no flags change
notmuch_message_tags_to_maildir_flags() unconditionally moves messages from
maildir directory "new/" to maildir directory "cur/", which makes messages lose
their "new" status in the MUA. However some users want to keep this "new"
status after, for instance, an auto-tagging of new messages.

However, as Austin mentioned and according to the maildir specification,
messages living in "new/" are not allowed to have flags, even if mutt allows it
to happen. For this reason, this patch prevents moving messages from "new/" to
"cur/", only if no flags have to be changed. It's hopefully enough to satisfy
mutt (and maybe other MUAs showing the "new" status) users checking the "new"
status.

Changelog:
* v2: Fix bool type as well as NULL returned despite having no errors (Austin
      Clements)
* v4: Tag the related test (contributed by Michal Sojka) as working

Signed-off-by: Louis Rilling <l.rilling@av7.net>

[Condition for keeping messages in new/ was extended to satisfy all
 tests from the previous patch. -Michal Sojka]

[Added by David Bremner, to keep the tests passing at each commit]

update insert tests for new maildir synchronization rules

As of id:1355952747-27350-4-git-send-email-sojkam1@fel.cvut.cz
we are more conservative about moving messages from ./new to ./cur.
This updates the insert tests to match
2013-09-03 20:41:51 -03:00
Vladimir Marek
51b073c6f2 lib/message.cc: stale pointer bug (v3)
Xapian::TermIterator::operator* returns std::string which is destroyed
as soon as (*i).c_str() finishes. The remembered pointer 'term' then
references invalid memory.

Signed-off-by: Vladimir Marek <vlmarek@volny.cz>
2013-05-03 21:17:56 -03:00
Austin Clements
5394924e6c lib: Separate list of all messages from top-level messages
Previously, thread.cc built up a list of all messages, then
proceeded to tear it apart to transform it into a list of
top-level messages.  Now we simply build a new list of top-level
messages.

This simplifies the interface to _notmuch_message_add_reply,
eliminates the pointer acrobatics from
_resolve_thread_relationships, and will enable us to do things
with the list of all messages in the following patches.
2013-02-18 20:20:24 -04:00
Jani Nikula
5505d55515 lib: fix warnings when building with clang
Building notmuch with CC=clang and CXX=clang++ produces the warnings:

CC -O2 lib/tags.o
lib/tags.c:43:5: warning: expression result unused [-Wunused-value]
    talloc_steal (tags, list);
    ^~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/talloc.h:345:143: note: expanded from:
  ...__location__); __talloc_steal_ret; })
                    ^~~~~~~~~~~~~~~~~~
1 warning generated.

CXX -O2 lib/message.o
lib/message.cc:791:5: warning: expression result unused [-Wunused-value]
    talloc_reference (message, message->tag_list);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/talloc.h:932:36: note: expanded from:
  ...(_TALLOC_TYPEOF(ptr))_talloc_reference_loc((ctx),(ptr), __location__)
     ^                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.

Check talloc_reference() return value, and explicitly ignore
talloc_steal() return value as it has no failure modes, to silence the
warnings.
2012-12-01 08:10:32 -04:00
Austin Clements
b88030bda6 lib: Treat messages in new/ as maildir messages with no flags set
Previously, notmuch new only synchronized maildir flags to tags for
files with a maildir "info" part.  Since messages in new/ don't have
an info part, notmuch would ignore them for flag-to-tag
synchronization.

This patch makes notmuch consider messages in new/ to be legitimate
maildir messages that simply have no maildir flags set.  The most
visible effect of this is that such messages now automatically get the
unread tag.
2012-06-10 20:14:56 -03:00
Austin Clements
750231bae8 lib: Only synchronize maildir flags for messages in maildirs
Previously, we synchronized flags to tags for any message that looked
like it had maildir flags in its file name, regardless of whether it
was in a maildir-like directory structure.  This was asymmetric with
tag-to-flag synchronization, which only applied to messages in
directories named new/ and cur/ (introduced by 95dd5fe5).

This change makes our interpretation stricter and addresses this
asymmetry by only synchronizing flags to tags for messages in
directories named new/ or cur/.  It also prepares us to treat messages
in new/ as maildir messages, even though they lack maildir flags.
2012-06-10 20:13:58 -03:00
Austin Clements
93ab4c7d11 lib: Move _filename_is_in_maildir
This way notmuch_message_maildir_flags_to_tags can call it.  It makes
more sense for this to be just above all of the maildir
synchronization code rather than mixed in the middle.
2012-06-10 20:13:45 -03:00
Austin Clements
d9f61c26a1 lib: Don't needlessly create directory docs in _notmuch_message_remove_filename
Previously, if passed a filename with a directory that did not exist
in the database, _notmuch_message_remove_filename would needlessly
create that directory document.  Fix it so that doesn't happen.
2012-05-23 22:32:12 -03:00
Austin Clements
67ae2377a9 lib: Perform the same transformation to _notmuch_database_filename_to_direntry
Now _notmuch_database_filename_to_direntry takes a flags argument and
can indicate if the necessary directory documents do not exist.
Again, callers have been updated, but retain their original behavior.
2012-05-23 22:30:43 -03:00
Louis Rilling
b9360be2bd tags_to_maildir_flags: Cleanup double assignement
The for loop right after already does the job.

Signed-off-by: Louis Rilling <l.rilling@av7.net>
2011-11-21 20:32:32 -04:00
Louis Rilling
21b13c3932 lib: Kill last usage of C++ type bool
Signed-off-by: Louis Rilling <l.rilling@av7.net>
2011-11-21 20:32:07 -04:00
Austin Clements
567bcbc294 Store "from" and "subject" headers in the database.
This is a rebase and cleanup of Istvan Marko's patch from
id:m3pqnj2j7a.fsf@zsu.kismala.com

Search retrieves these headers for every message in the search
results.  Previously, this required opening and parsing every message
file.  Storing them directly in the database significantly reduces IO
and computation, speeding up search by between 50% and 10X.

Taking full advantage of this requires a database rebuild, but it will
fall back to the old behavior for messages that do not have headers
stored in the database.
2011-11-14 17:10:58 -04:00
Ali Polatel
02a3076711 lib: make find_message{,by_filename) report errors
Previously, the functions notmuch_database_find_message() and
notmuch_database_find_message_by_filename() functions did not properly
report error condition to the library user.

For more information, read the thread on the notmuch mailing list
starting with my mail "id:871uv2unfd.fsf@gmail.com"

Make these functions accept a pointer to 'notmuch_message_t' as argument
and return notmuch_status_t which may be used to check for any error
condition.

restore: Modify for the new notmuch_database_find_message()
new: Modify for the new notmuch_database_find_message_by_filename()
2011-10-04 07:55:29 +03:00
Austin Clements
bfe4555325 lib: Remove message document directly after removing the last file name.
Previously, notmuch_database_remove_message would remove the message
file name, sync the change to the message document, re-find the
message document, and then delete it if there were no more file names.
An interruption after sync'ing would result in a file-name-less,
permanently un-removable zombie message that would produce errors and
odd results in searches.  We could wrap this in an atomic section, but
it's much simpler to eliminate the round-about approach and just
delete the message document instead of sync'ing it if we removed the
last filename.
2011-09-23 21:50:39 -04:00
Austin Clements
e4379c43e2 lib: Indicate if there are more filenames after removal.
Make _notmuch_message_remove_filename return
NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID if the message has more filenames
and fix callers to handle this.
2011-09-23 21:50:39 -04:00
Austin Clements
62445dd023 lib: Add missing status check in _notmuch_message_remove_filename.
Previously, this function would synchronize the folder list even if
removing the file name failed.  Now it returns immediately if removing
the file name fails.
2011-09-12 23:36:00 -03:00
Mark Anderson
8a856e5c38 Fix folder: coherence issue
Add removal of all ZXFOLDER terms to removal of all XFOLDER terms for
each message filename removal.

The existing filename-list reindexing will put all the needed terms
back in.  Test search-folder-coherence now passes.

Signed-off-by:Mark Anderson <ma.skies@gmail.com>
2011-06-29 14:13:16 -07:00
Pieter Praet
8bb6f7869c fix sum moar typos [comments in source code]
Various typo fixes in comments within the source code.

Signed-off-by: Pieter Praet <pieter@praet.org>

Edited-by: Carl Worth <cworth@cworth.org> Restricted to just
source-code comments, (and fixed fix of "descriptios" to "descriptors"
rather than "descriptions").
2011-06-23 15:58:39 -07:00
Carl Worth
d5523ead90 Mark some structures in the library interface with visibility=default attribute.
As of gcc 4.6, there are new warnings from -Wattributes along the lines of:

	warning: ‘_notmuch_messages’ declared with greater visibility
	than the type of its field ‘_notmuch_messages::iterator’
	[-Wattributes]

To squelch these, we decorate all such containing structs with
__attribute__((visibility("default"))). We take care to let only the
C++ compiler see this, (since the C compiler would otherwise warn
about ignored visibility attributes on types).
2011-05-11 13:27:15 -07:00
Carl Worth
2f3a76c569 Remove some variables which were set but not used.
gcc (at least as of version 4.6.0) is kind enough to point these out to us,
(when given -Wunused-but-set-variable explicitly or implicitly via -Wunused
or -Wall).

One of these cases was a legitimately unused variable. Two were simply
variables (named ignored) we were assigning only to squelch a warning about
unused function return values. I don't seem to be getting those warnings
even without setting the ignored variable. And the gcc docs. say that the
correct way to squelch that warning is with a cast to (void) anyway.
2011-05-11 13:27:14 -07:00
Austin Clements
d19c5de17a Add the tag list to the unified message metadata pass.
Now each caller of notmuch_message_get_tags only gets a new iterator,
instead of a whole new list.  In principle this could cause problems
with iterating while modifying tags, but through the magic of talloc
references, we keep the old tag list alive even after the cache in the
message object is invalidated.

This reduces my index search from the 3.102 seconds before the unified
metadata pass to 1.811 seconds (1.7X faster).  Combined with the
thread search optimization in b3caef1f06,
that makes this query 2.5X faster than when I started.
2011-03-21 02:45:18 -04:00
Austin Clements
f271071330 Add the file name list to the unified message metadata pass.
Even if the caller never uses the file names, there is little cost to
simply fetching the file name terms.  However, retrieving the full
paths requires additional database work, so the expansion from terms
to full paths is performed lazily.

This also simplifies clearing the filename cache, since that's now
handled by the generic metadata cache code.

This further reduces my inbox search from 3.102 seconds before the
unified metadata pass to 2.206 seconds (1.4X faster).
2011-03-21 02:45:18 -04:00
Austin Clements
206938ec9b Add a generic function to get a list of terms with some prefix.
Replace _notmuch_convert_tags with this and simplify
_create_filenames_for_terms_with_prefix.  This will also come in handy
shortly to get the message file name list.
2011-03-21 02:45:18 -04:00
Austin Clements
f3c1eebfaf Implement an internal generic string list and use it.
This replaces the guts of the filename list and tag list, making those
interfaces simple iterators over the generic string list.  The
directory, message filename, and tags-related code now build generic
string lists and then wraps them in specific iterators.  The real wins
come in later patches, when we use these for even more generic
functionality.

As a nice side-effect, this also eliminates the annoying dependency on
GList in the tag list.
2011-03-21 02:45:18 -04:00
Austin Clements
d9b0ae918f Use a single unified pass to fetch scalar message metadata.
This performs a single pass over a message's term list to fetch the
thread ID, message ID, and reply-to, rather than requiring a pass for
each.  Xapian decompresses the term list anew for each iteration, so
this reduces the amount of time spent decompressing message metadata.

This reduces my inbox search from 3.102 seconds to 2.555 seconds (1.2X
faster).
2011-03-21 02:45:18 -04:00
Carl Worth
db70f3f0c4 lib: Save and restore term position in message while indexing.
This fixes the recently addead search-position-overlap bug as
demonstrated in the test of the same name.
2011-01-26 15:59:19 +10:00
Carl Worth
99cfa27030 Add support for folder-based searching.
A new "folder:" prefix in the query string can now be used to match
the directories in which mail files are stored.

The addition of this feature causes the recently added
search-by-folder tests to now pass.
2011-01-15 15:37:43 -08:00
Carl Worth
36161181df Correct some minor typos in a comment
Nothing too important here. Just some misspellings I noticed while reading
nearby code.
2011-01-15 15:37:43 -08:00
Austin Clements
b3caef1f06 Optimize thread search using matched docid sets.
This reduces thread search's 1+2t Xapian queries (where t is the
number of matched threads) to 1+t queries and constructs exactly one
notmuch_message_t for each message instead of 2 to 3.
notmuch_query_search_threads eagerly fetches the docids of all
messages matching the user query instead of lazily constructing
message objects and fetching thread ID's from term lists.
_notmuch_thread_create takes a seed docid and the set of all matched
docids and uses a single Xapian query to expand this docid to its
containing thread, using the matched docid set to determine which
messages in the thread match the user query instead of using a second
Xapian query.

This reduces the amount of time required to load my inbox from 4.523
seconds to 3.025 seconds (1.5X faster).
2010-12-07 16:40:05 -08:00
Carl Worth
7278383005 lib: Fix missing initialization of status field.
This could have been a problematic bug. Fortuinately "gcc -O2" warns
about it.
2010-11-11 20:54:41 -08:00
Carl Worth
fe8eeaf4a5 lib: Add two missing static qualifiers
The debian packaging is nice enough to notice when we accidentally
leak private symbols to the public interface.
2010-11-11 20:53:21 -08:00
Carl Worth
96d99c3837 tags_to_maildir_flags: Fix to preserve existing, unsupported flags
This is to prevent notmuch from destroying any information the user
has encoded as flags in the maildir filename. Tests are also added to
the test suite to verify the documented behavior.
2010-11-11 16:36:02 -08:00
Carl Worth
95dd5fe5d7 notmuch_message_tags_to_maildir_flags: Do nothing outside of "new" and "cur"
Some people use notmuch with non-maildir files, (for example, email
messages in MH format, or else cool things like using sluk[*] to suck
down feeds into a format that notmuch can index).

To better support uses like that, don't do any renaming for files that
are not in a directory named either "new" or "cur".

[*] https://github.com/krl/sluk/
2010-11-11 14:32:17 -08:00
Carl Worth
37a8096fdc notmuch_message_tags_to_maildir_flags: Don't exit on failure to rename.
It is totally legitimate for a non-maildir directory to be named "new"
(and not have a directory next to it named "cur"). To support this
case at least, be silent about any rename failure.
2010-11-11 03:50:42 -08:00
Carl Worth
71a3201885 notmuch_message_tags_to_maildir_flags: Fix to rename multiple files
This function was documented as modifying every filename associated
with the message. Fix it to actually do that.
2010-11-11 03:47:11 -08:00
Carl Worth
404db1de90 maildir_flags_to_tags: Avoid interpreting "no info" as "no flags set".
If a filename has no maildir info at all, (that is, it does not
contain the sequence ":2,"), we consider this distinct from a filename
with an empty maildir info, (the ":2," separator is present, but no
flags characters follow).

Specifically, we regard a missing info field as providing no
information, so tags will remain unchanged. On the other hand, an info
field that is present but has no flags set will cause various tags to
be cleared, (or in the case of "unread", added).

This fixes the "remove info" case of the maildir-sync tests in the
test suite.
2010-11-11 03:40:19 -08:00
Carl Worth
81cbaafc0f Fix notmuch_message_tags_to_maildir_flags to effect rename immediately
We have tests to ensure that when the notmuch library renames a file
that that rename takes place immediately in the database, (without
requiring something like "notmuch new" to notice the change).

This was working when the code was first added, but recently broke in
the reworking of the maildir-synchronization interface since the
tags_to_maildir_flags function can no longer assume that it is being
called as part of _notmuch_message_sync.

Fortunately, the fix is as simple as adding an explicit call to
_notmuch_message_sync.
2010-11-11 03:40:19 -08:00
Carl Worth
4b6063397f Fix notmuch_message_maildir_flags_to_tags to iterate over filenames
As documented, this function now iterates over all filenames for the
message, computing a logical OR of the flags set on the filenames,
then uses the final result to set tags on the message.

This change fixes 3 of the 10 maildir-sync tests that have been
failing since being added.
2010-11-11 03:40:19 -08:00
Carl Worth
1d02dd64af lib: Add new, public notmuch_message_get_filenames
This augments the existing notmuch_message_get_filename by allowing
the caller access to all filenames in the case of multiple files for a
single message.

To support this, we split the iterator (notmuch_filenames_t) away from
the list storage (notmuch_filename_list_t) where previously these were
a single object (notmuch_filenames_t). Then, whenever the user asks
for a file or filename, the message object lazily creates a complete
notmuch_filename_list_t and then:

	For notmuch_message_get_filename, returns the first filename
	in the list.

	For notmuch_message_get_filenames, creates and returns a new
	iterator for the filename list.
2010-11-11 03:40:19 -08:00
Carl Worth
d422dcf0a2 lib: Remove the notion of TAGS_INVALID
This rather ugly hack was recently obviated by the removal of the
notmuch_database_set_maildir_sync function. Now, clients must make
explicit calls to do any syncrhonization between maildir flags and
tags. So the library no longer needs to worry about doing inconsistent
synchronization while a message is only partially added.
2010-11-11 03:40:19 -08:00
Carl Worth
bb74e9dff8 lib: Rework interface for maildir_flags synchronization
Instead of having an API for setting a library-wide flag for
synchronization (notmuch_database_set_maildir_sync) we instead
implement maildir synchronization with two new library functions:

	notmuch_message_maildir_flags_to_tags
  and   notmuch_message_tags_to_maildir_flags

These functions are nicely documented here, (though the implementation
does not quite match the documentation yet---as plainly evidenced by
the current results of the test suite).
2010-11-11 03:40:19 -08:00
Carl Worth
2c262042ac lib: Remove the synchronization of 'T' flag with "deleted" tag.
Tags in a notmuch database affect all messages with the identical
message-ID. But maildir tags affect individual files. And since
multiple files can contain the identical message-ID, there is not a
one-to-one correspondence between messages affected by tags and flags.

This is particularly dangerous with the 'T' (== "trashed") maildir
flag and the corresponding "deleted" tag in the notmuch
database. Since these flags/tags are often used to trigger
irreversible deletion operations, the lack of one-to-one
correspondence can be potentially dangerous.

For example, consider the following sequence:

  1. A third-party application is used to identify duplicate messages
     in the mail store, and mark all-but-one of each duplicate with
     the 'T' flag for subsequent deletion.

  2. A "notmuch new" operation reads that 'T' flag, adding the
     "deleted" flag to the corresponding messages within the notmuch
     database.

  3. A subsequent notmuch operation, (such as a "notmuch dump; notmuch
     restore" cycle) synchronized the "deleted" tag back to the mail
     store, applying the 'T' flag to all(!) filenames with duplicate
     message IDs.

  4. A third-party application reads the 'T' flags and irreversibly
     deletes all mail messages which had any duplicates(!).

In order to avoid this scenario, we simply refuse to synchronize the
'T' flag with the "deleted" tag. Instead, applications can set 'T' and
act on it to delete files, or can set "deleted" and act on it to
delete files. But in either case the semantics are clear and there is
never dangerous propagation through the one-to-many mapping of notmuch
message objects to files.
2010-11-11 02:35:03 -08:00
Michal Sojka
d9d3d3e6f0 Make maildir synchronization configurable
This adds group [maildir] and key 'synchronize_flags' to the
configuration file. Its value enables (true) or diables (false) the
synchronization between notmuch tags and maildir flags. By default,
the synchronization is disabled.
2010-11-10 13:09:32 -08:00