mirror of
https://git.notmuchmail.org/git/notmuch
synced 2024-12-18 07:24:51 +01:00
77c9ec1fdd
'quite' on IRC reported that notmuch new was grinding to a halt during initial indexing, and we eventually narrowed the problem down to some HTML parts with large embedded images. These cause the number of terms added to the Xapian database to explode (the first 400 messages generated 4.6M unique terms), and of course the resulting terms are not much use for searching. The second test is a sanity check for any "improved" indexing of HTML.
15 lines
354 B
Text
From: David Bremner <david@example.net>
To: David Bremner <david@example.net>
Subject: test html attachment
Date: Tue, 17 Nov 2009 21:28:38 +0600
Message-ID: <87d1dajhgf.fsf@example.net>
MIME-Version: 1.0
Content-Type: text/html
Content-Disposition: inline; filename=test.html

<html>
<body>
<input value="a>swordfish">
</body>
hunter2
</html>
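
The message above appears designed to exercise two cases for an HTML-aware indexer: a `>` inside a quoted attribute value must not end the tag early (so "swordfish" should not become a search term), while text outside any tag ("hunter2") should still be indexed. The following is a minimal sketch of that idea in Python, not notmuch's actual filter; the function name `visible_text` and the whitespace-based term splitting are assumptions made for illustration only.

```python
def visible_text(html: str) -> str:
    """Return the text outside of tags, treating quoted attribute values as opaque.

    Hypothetical helper for illustration; it is not part of notmuch.
    """
    out = []
    in_tag = False
    quote = None  # quote character currently open inside a tag, if any
    for ch in html:
        if in_tag:
            if quote:
                # Inside a quoted attribute value: only the matching quote ends it.
                if ch == quote:
                    quote = None
            elif ch in ('"', "'"):
                quote = ch
            elif ch == ">":
                in_tag = False
                out.append(" ")  # keep words on either side of a tag separate
        elif ch == "<":
            in_tag = True
        else:
            out.append(ch)
    return "".join(out)


# Body of the test message shown above.
body = """<html>
<body>
<input value="a>swordfish">
</body>
hunter2
</html>
"""

# Assumption: terms are simply whitespace-separated words of the visible text.
terms = visible_text(body).split()
print(terms)
```

A scanner that ended the tag at the first `>` would instead leak `swordfish">` into the visible text, which is the failure the attribute in this fixture is meant to catch.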