[sup-compat] Don't index mime parts with content-disposition of attachment

Here's another change which I'm making for sup compatibility against
my better judgment. It seems that sup never indexes content from
mime parts with content-disposition of attachment. But these
attachments are often very indexable, (for example, the first one
I encountered was a small shell script).

So I'll have to think a bit about whether or not I want to revert
this commit. To do this properly we would really want to distinguish
between attachments that are indexable, (such as text), and those
that aren't, (such as binaries). I know the mime-type alone isn't
always sufficient here as even this little plaintext shell script
was attached as octet-stream.
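
One way to make that distinction without trusting the declared mime-type
is to sniff the decoded bytes. The following is only a rough sketch, not
code from this commit: looks_like_text is a hypothetical helper, and the
4096-byte prefix and 5% threshold are assumed values chosen for
illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of this commit: guess whether an
// attachment's decoded bytes are text worth indexing, ignoring the
// declared MIME type entirely.
static bool
looks_like_text (const std::string &data)
{
    // Only inspect a bounded prefix so large attachments stay cheap.
    // (4096 is an assumed, arbitrary limit.)
    const std::size_t limit = data.size () < 4096 ? data.size () : 4096;
    std::size_t suspicious = 0;

    // Nothing to index in an empty part.
    if (limit == 0)
        return false;

    for (std::size_t i = 0; i < limit; i++) {
        unsigned char c = static_cast<unsigned char> (data[i]);

        // A NUL byte is a strong sign of binary data.
        if (c == 0)
            return false;

        // Count control characters other than common whitespace.
        if (c < 0x20 && c != '\t' && c != '\n' && c != '\r' && c != '\f')
            suspicious++;
    }

    // Assumed threshold: treat the data as text when fewer than 5% of
    // the inspected bytes are suspicious control characters.
    return suspicious * 20 < limit;
}
```

With a check like this, a shell script attached as octet-stream would
still be classified as text, while an executable would be rejected at its
first NUL byte. The heuristic is crude: UTF-16 text, for instance,
contains NUL bytes and would be treated as binary.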

And if we wanted to get really fancy we could run things like antiword
to generate text from non-text attachments and index their output.
Carl Worth 2009-10-14 16:20:45 -07:00
parent 7c9dbbad40
commit 653ff260f5

@@ -444,6 +444,7 @@ gen_terms_part (Xapian::TermGenerator term_gen,
         strcmp (disposition->disposition, GMIME_DISPOSITION_ATTACHMENT) == 0)
     {
         add_term (term_gen.get_document (), "label", "attachment");
+        return;
     }
     byte_array = g_byte_array_new ();