mirror of
https://git.notmuchmail.org/git/notmuch
synced 2024-11-29 06:04:11 +01:00
77c9ec1fdd
'quite' on IRC reported that notmuch new was grinding to a halt during initial indexing, and we eventually narrowed the problem down to some HTML parts with large embedded images. These cause the number of terms added to the Xapian database to explode (the first 400 messages generated 4.6M unique terms), and of course the resulting terms are not much use for searching. The second test is a sanity check for any "improved" indexing of HTML.
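To make the term explosion concrete, here is a rough sketch of the effect. It is not how Xapian's term generator tokenizes text: it simply treats every run of alphanumeric characters as a "term". The HTML snippet, the `count_terms` helper, and the variable names are all made up for illustration.

```shell
# Illustration only: approximates a "term" as a run of alphanumeric
# characters. This is an assumption, not Xapian's actual tokenizer.

# Hypothetical HTML part with a (very short) inline image as a data: URI.
html='<p>the password is hunter2</p><img src="data:image/png;base64,iVBORw0KGgo/AAAANSUhEUgAA+AAEAAAAB/CAYAAADJ">'

# Count the unique alphanumeric runs read from stdin.
count_terms() {
    tr -cs '[:alnum:]' '\n' | sed '/^$/d' | sort -u | wc -l | tr -d ' '
}

# Naive: tokenize the raw markup, attribute values included.
naive=$(printf '%s\n' "$html" | count_terms)

# Strip everything between < and > first, so the base64 payload
# never reaches the tokenizer.
stripped=$(printf '%s\n' "$html" | sed 's/<[^>]*>/ /g' | count_terms)

echo "naive: $naive terms, tags stripped: $stripped terms"
```

Even with this tiny payload, most of the naive terms come from the base64 data rather than from the visible text. A real embedded image is many kilobytes of such data, and every chunk is effectively unique.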
19 lines
581 B
Bash
Executable file
#!/usr/bin/env bash
test_description="indexing of html parts"
. ./test-lib.sh || exit 1

add_email_corpus html

test_begin_subtest 'embedded images should not be indexed'
test_subtest_known_broken
notmuch search kwpza7svrgjzqwi8fhb2msggwtxtwgqcxp4wbqr4wjddstqmeqa7 > OUTPUT
test_expect_equal_file /dev/null OUTPUT

test_begin_subtest 'non tag text should be indexed'
notmuch search hunter2 | notmuch_search_sanitize > OUTPUT
cat <<EOF > EXPECTED
thread:XXX 2009-11-17 [1/1] David Bremner; test html attachment (inbox unread)
EOF
test_expect_equal_file EXPECTED OUTPUT

test_done
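The second subtest compares against `thread:XXX` rather than a real thread ID because the search output is sanitized first. The sketch below is a hypothetical stand-in for that step, not the `notmuch_search_sanitize` helper from test-lib.sh. It assumes thread IDs are 16 hexadecimal digits, and the sample ID is invented.

```shell
# Hypothetical stand-in for the sanitizing step; the real helper
# lives in the test library and may differ.
# Assumption: a thread ID is "thread:" followed by 16 hex digits.
sanitize_thread_ids() {
    sed -E 's/thread:[0-9a-f]{16}/thread:XXX/'
}

# Made-up search result line with an invented thread ID.
printf '%s\n' 'thread:0000000000000001 2009-11-17 [1/1] David Bremner; test html attachment (inbox unread)' \
    | sanitize_thread_ids
```

Replacing the ID keeps the expected output stable across runs, since thread IDs depend on the order in which messages are indexed.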