notmuch/test/encoding
Michal Sojka 74f8f15adc test: Add test for searching of uncommonly encoded messages
Emails that are encoded differently than as ASCII or UTF-8 are not
indexed properly by notmuch. It is not possible to search for non-ASCII
words within those messages.
2012-02-29 07:34:54 -04:00

33 lines
1.3 KiB
Bash
Executable file
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

#!/usr/bin/env bash
test_description="encoding issues"
. ./test-lib.sh
test_begin_subtest "Message with text of unknown charset"
add_message '[content-type]="text/plain; charset=unknown-8bit"' \
"[body]=irrelevant"
output=$(notmuch show id:${gen_msg_id} 2>&1 | notmuch_show_sanitize)
test_expect_equal "$output" " message{ id:msg-001@notmuch-test-suite depth:0 match:1 filename:/XXX/mail/msg-001
header{
Notmuch Test Suite <test_suite@notmuchmail.org> (2001-01-05) (inbox unread)
Subject: Test message #1
From: Notmuch Test Suite <test_suite@notmuchmail.org>
To: Notmuch Test Suite <test_suite@notmuchmail.org>
Date: Fri, 05 Jan 2001 15:43:57 +0000
header}
body{
part{ ID: 1, Content-type: text/plain
irrelevant
part}
body}
message}"
test_begin_subtest "Search for ISO-8859-2 encoded message"
test_subtest_known_broken
add_message '[content-type]="text/plain; charset=iso-8859-2"' \
'[content-transfer-encoding]=8bit' \
'[subject]="ISO-8859-2 encoded message"' \
"[body]=$'Czech word tu\350\362\341\350\350\355 means pinguin\'s.'" # ISO-8859-2 characters are generated by shell's escape sequences
output=$(notmuch search tučňáččí 2>&1 | notmuch_show_sanitize)
test_expect_equal "$output" "thread:0000000000000002 2001-01-05 [1/1] Notmuch Test Suite; ISO-8859-2 encoded message (inbox unread)"
test_done