Commit graph

3 commits

Author SHA1 Message Date
Carl Worth
56218ddbb4 index: Don't bother indexing quoted portions of messages (and signatures).
Our old notmuch-index-message.cc code had this, but I originally
left it out when adding indexing back in. I was concerned primarily
with mistakenly detecting signature markers and omitting important
text, (for example, I often do long lines of "----" as section
separators).

But now I see that there's a performance benefit to skipping the
quotations, (about 120 files/sec. instead of 95 files/sec.). I mitigated
the bogus signature checking by recognizing nothing other than the
all-time classic "-- ".
2009-10-28 15:41:42 -07:00
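
The filtering described in the commit above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual notmuch code (which is C/C++). The function name `indexable_lines` is hypothetical, and treating any line that starts with ">" as a quotation is an assumption, since the commit message does not say how quotations are detected. The one detail taken from the message is that only the exact "-- " line is treated as a signature marker, so a separator line such as "----" is still indexed.

```python
def indexable_lines(body: str) -> list[str]:
    """Return the body lines that would be handed to the indexer.

    Hypothetical helper: skips quoted lines and stops at a signature.
    """
    kept = []
    for line in body.splitlines():
        # Only the classic marker (dash, dash, space) starts a signature;
        # everything after it is dropped.
        if line == "-- ":
            break
        # Assumption: a quoted line is one that begins with ">".
        if line.startswith(">"):
            continue
        kept.append(line)
    return kept
```

With this rule, a long run of dashes used as a section separator is kept, because it is not exactly "-- ".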
Carl Worth
3a91df21ca index: Store "Full Name <user@example.com>" addresses in the database
We put these in as a separate term so that they can be extracted.
We don't actually need this for searching, since typing an email
address in as a search term will already trigger a phrase search
that does exactly what's wanted.
2009-10-28 13:09:08 -07:00
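
As a rough illustration of the idea in the commit above, the sketch below keeps the whole "Full Name <user@example.com>" string as one extractable value alongside the individual words that a phrase search would match. It is a Python sketch under stated assumptions, not notmuch's implementation: the function name `address_terms` and the dictionary keys are hypothetical, and the regular-expression word split is only a stand-in for whatever tokenization Xapian performs.

```python
import re
from email.utils import formataddr, parseaddr


def address_terms(header_value: str) -> dict:
    """Split an address header into one full-address value plus word terms."""
    name, addr = parseaddr(header_value)
    # The complete address is kept as a single value so it can be
    # read back out later, not just matched.
    full = formataddr((name, addr)) if name else addr
    # Stand-in tokenizer (assumption): lowercase alphanumeric runs.
    words = re.findall(r"[a-z0-9]+", header_value.lower())
    return {"full_address": full, "words": words}
```

The word terms alone are enough to find a message by address; the separate full-address value exists only so the original string can be recovered.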
Carl Worth
f9bbd7baa0 Add full-text indexing using the GMime library for parsing.
This is based on the old notmuch-index-message.cc from early in
the history of notmuch, but considerably cleaned up now that
we have some experience with Xapian and know just what we want
to index, (rather than just blindly trying to index exactly
what sup does).

This does slow down notmuch_database_add_message a *lot*, but I've
got some ideas for getting some time back.
2009-10-28 12:50:10 -07:00