notmuch/test/corpora
David Bremner 77c9ec1fdd test: add known broken test for indexing html
'quite' on IRC reported that notmuch new was grinding to a halt during
initial indexing, and we eventually narrowed the problem down to some
html parts with large embedded images. These cause the number of terms
added to the Xapian database to explode (the first 400 messages
generated 4.6M unique terms), and of course the resulting terms are
not much use for searching.

The second test is a sanity check for any "improved" indexing of HTML.
2017-04-20 06:59:40 -03:00
broken    test: add known broken test for reply to message with multiple Cc headers  2016-09-17 08:41:29 -03:00
default   test: make it possible to have multiple corpora                            2016-09-17 08:39:34 -03:00
html      test: add known broken test for indexing html                              2017-04-20 06:59:40 -03:00
lkml/cur  test: add 'lkml' corpus                                                    2017-04-13 21:55:43 -03:00
README    test: add known broken test for indexing html                              2017-04-20 06:59:40 -03:00

This directory contains email corpora for testing.

default
  The default corpus is based on about 50 messages from early in the
  history of the notmuch mailing list, which allows for reliably
  testing commands that need to operate on a not-totally-trivial
  number of messages.

broken
  The broken corpus contains messages that are broken and/or RFC
  non-compliant, ensuring we deal with them in a sane way.

html
  The html corpus contains messages with HTML parts, for testing how
  such parts are indexed.
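
To illustrate why HTML parts with large embedded images can inflate the
number of unique index terms, here is a minimal Python sketch. It is not
notmuch's or Xapian's actual indexing code: the tokenizer `unique_terms`
and the helper `html_with_embedded_image` are hypothetical, the "image" is
seeded random bytes, and embedding it as a base64 data URI is an assumption
about how such images appear in a message.

```python
import base64
import random
import re


def unique_terms(text):
    """Return the set of lowercase alphanumeric runs in text.

    A deliberately naive tokenizer, used only for illustration; it is
    not the term generator that notmuch or Xapian actually uses.
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def html_with_embedded_image(body, image_size, seed=0):
    """Wrap body in HTML and append a fake image as a base64 data URI.

    The image is image_size seeded pseudo-random bytes standing in for
    real image data (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    raw = bytes(rng.randrange(256) for _ in range(image_size))
    encoded = base64.b64encode(raw).decode("ascii")
    return (
        "<html><body><p>" + body + "</p>"
        '<img src="data:image/png;base64,' + encoded + '">'
        "</body></html>"
    )


body = "notmuch new was grinding to a halt during initial indexing"
plain_terms = unique_terms(body)
html_terms = unique_terms(html_with_embedded_image(body, image_size=20000))

# The base64 payload splits at '+', '/' and '=' into many short runs,
# nearly all of them distinct and none of them useful for searching.
print("unique terms, plain text:", len(plain_terms))
print("unique terms, HTML with embedded image:", len(html_terms))
```

Under these assumptions, almost all of the extra terms come from the
encoded image rather than from the readable text, which is the kind of
growth the html corpus is meant to expose.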