mirror of https://git.notmuchmail.org/git/notmuch (synced 2024-11-25 04:18:08 +01:00)
test: add known broken test for indexing text/* attachments
The general problem of indexing attachments requires some help to turn things into text, but (most?) text/* should be doable internally, possibly with optimizations as for the text/html case.
parent a832f940e1
commit 8eabd6388e
2 changed files with 290 additions and 0 deletions
@@ -455,4 +455,12 @@ Date: Fri, 17 Jun 2016 22:14:41 -0400
 EOF
 test_expect_equal_file EXPECTED OUTPUT
 
+add_email_corpus indexing
+
+test_begin_subtest "index text/* attachments"
+test_subtest_known_broken
+notmuch search id:20200930101213.2m2pt3jrspvcrxfx@localhost.localdomain > EXPECTED
+notmuch search id:20200930101213.2m2pt3jrspvcrxfx@localhost.localdomain and ersatz > OUTPUT
+test_expect_equal_file_nonempty EXPECTED OUTPUT
+
 test_done
@@ -0,0 +1,282 @@
From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: <SRS0=/pzd=DH=vger.kernel.org=linux-man-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-8.3 required=3.0 tests=BAYES_00,DKIM_SIGNED,
        DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,
        HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,
        SPF_PASS,URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no
        version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id AFE3FC4727E
        for <linux-man@archiver.kernel.org>; Wed, 30 Sep 2020 10:12:21 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id 4E0D62074A
        for <linux-man@archiver.kernel.org>; Wed, 30 Sep 2020 10:12:21 +0000 (UTC)
Authentication-Results: mail.kernel.org;
        dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Osm9Pn67"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1725823AbgI3KMU (ORCPT <rfc822;linux-man@archiver.kernel.org>);
        Wed, 30 Sep 2020 06:12:20 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50038 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1725779AbgI3KMU (ORCPT
        <rfc822;linux-man@vger.kernel.org>); Wed, 30 Sep 2020 06:12:20 -0400
Received: from mail-pf1-x443.google.com (mail-pf1-x443.google.com [IPv6:2607:f8b0:4864:20::443])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5026DC061755
        for <linux-man@vger.kernel.org>; Wed, 30 Sep 2020 03:12:20 -0700 (PDT)
Received: by mail-pf1-x443.google.com with SMTP id b124so832681pfg.13
        for <linux-man@vger.kernel.org>; Wed, 30 Sep 2020 03:12:20 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20161025;
        h=date:from:to:cc:subject:message-id:references:mime-version
        :content-disposition:in-reply-to:user-agent;
        bh=qR1FJVXOhU6/g+m4SoSco3vMtV+CNvRvNyXS1xuG+T4=;
        b=Osm9Pn67G380QiA1ORltntJShSHlKg/KZZfKV8ebvfEXJw9893EO0N6J6GDR+zkmHi
        TOQuIe7x9y95Pipm54rWWEW33U3gwoXRHsPc2Kivm6L8Ixb+f0T0rMPKw/FOkL8OGo9t
        WmmSvnlErAXHqBq9aRAJJsf2bSlDgdAyYY1Qe6PSq2hKi2rg+sOy1Vaj4RqZ6jTK/DWY
        tX28Ql0XS3kKWp0Lc8MNsSP+SXlcdwHQYll5LeReAg1oi++hICgWphuMmo3OH+2B1WtO
        hMH7VuUONqbuE1aLoZ6PyyUlCeN1soJd8bKY0cmY0TKCsw0Jvkuh/XzYDVNi6wOSM6Ez
        okpA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20161025;
        h=x-gm-message-state:date:from:to:cc:subject:message-id:references
        :mime-version:content-disposition:in-reply-to:user-agent;
        bh=qR1FJVXOhU6/g+m4SoSco3vMtV+CNvRvNyXS1xuG+T4=;
        b=TJU+duGLhruSES/5sJy4y1wfcltfokDpA58edkSUJyasvsooUo67VNtOB3ZK49iHm5
        C/cjy0ExxTECB0aM6p+B1jcePdWoPUaVBY9bVd/Q5DNhm4KhTO8ON96gB43d2rLWLOiK
        /Y1vCu+MwOpY0JQTojbC140s/JYccR/KPapTmbUkRzrpmeoYqw8CbBPV60rQxYCn9GUu
        FeCXJY5q9OfaYW1viQZoBL5n1IMMpJDVa61Q8gZ33b3wRCvQv/x1eZCsVlYpjcqf7Umc
        /Amx3i27cxvo8pSvvwiTzrlJHJv0Gkytz13i7s+zW+XKzZRyzy3yirtU2DFTGat6FeMn
        H8Ig==
X-Gm-Message-State: AOAM530Yon7xNOW6kiuy6bVpbpwbzR/9pldRB49OtZaSAHAZg7Gyf7qE
        JXgAH20rZzYlwqOZyeZCeAwtWh09PeI=
X-Google-Smtp-Source: ABdhPJxzyZAVDBtMwQ5+dUqVg37y/LgZByrSaTxvhS6wnx6sJuG8ROItw0CwDAg939XUVADeje/nZQ==
X-Received: by 2002:a63:c547:: with SMTP id g7mr1563654pgd.234.1601460739764;
        Wed, 30 Sep 2020 03:12:19 -0700 (PDT)
Received: from localhost.localdomain ([1.129.172.177])
        by smtp.gmail.com with ESMTPSA id k14sm1804437pjd.45.2020.09.30.03.12.17
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 30 Sep 2020 03:12:19 -0700 (PDT)
Date: Wed, 30 Sep 2020 20:12:15 +1000
From: "G. Branden Robinson" <g.branden.robinson@gmail.com>
To: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Jakub Wilk <jwilk@jwilk.net>, linux-man@vger.kernel.org
Subject: Re: [PATCH 1/2] system_data_types.7: srcfix
Message-ID: <20200930101213.2m2pt3jrspvcrxfx@localhost.localdomain>
References: <20200925080330.184303-1-colomar.6.4.3@gmail.com>
        <20200927061015.4obt73pdhyh7wecu@localhost.localdomain>
        <20200928132959.x4koforqnzohxh5u@jwilk.net>
        <9b8303fe-969e-c9f0-e3cd-0590b342d5bf@gmail.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
        protocol="application/pgp-signature"; boundary="jg2hlfugxpumieke"
Content-Disposition: inline
In-Reply-To: <9b8303fe-969e-c9f0-e3cd-0590b342d5bf@gmail.com>
User-Agent: NeoMutt/20180716
Precedence: bulk
List-ID: <linux-man.vger.kernel.org>
X-Mailing-List: linux-man@vger.kernel.org


--jg2hlfugxpumieke
Content-Type: multipart/mixed; boundary="wl6i3r6gpq7ibouc"
Content-Disposition: inline


--wl6i3r6gpq7ibouc
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Hi Jakub and Michael,

At 2020-09-29T14:13:26+0200, Michael Kerrisk (man-pages) wrote:
> On 9/28/20 3:29 PM, Jakub Wilk wrote:
> > Hi Branden!
> >=20
> > In groff_man_style(7) you wrote:
> >> Unused macro arguments are more often simply omitted, or good style
> >> suggests that a more appropriate macro be chosen, that earlier
> >> arguments are more important than later ones, or that arguments
> >> have identical significance such that skipping any is superfluous.
> >=20
> > After 15 minutes of gawking at this sentence, I still don't
> > understand what are you trying to say here. The sentence should be
> > either thoroughly rephrased or removed.
>=20
> I must say that I too found it hard to parse. I presume, Branden,
> that you mean:
>=20
> [[
> Unused macro arguments are more often simply omitted, or good style=20
> suggests
> EITHER (1)=20
> that a more appropriate macro be chosen,=20
> (2)
> that earlier arguments are more important than later ones, or
> (3)
> that arguments have=20
> identical significance such that skipping any is superfluous.
> ]]

You got it. But it was too much work.

> But it takes a few scans to work that out. Perhaps break this into
> smaller pieces, or add some explicit structuring elements to the
> sentence?

I was trying to be comprehensive with respect to several anti-patterns I
had in mind. However, using the anti-patterns concretely is premature
at that point in the page. So I both expanded and relocated the
material.

I'm attaching what I've just committed to groff git.

Further feedback is welcome, of course; revision of documentation is a
process that is never completed, only abandoned. And I haven't given up
yet. :)

Thank you both for your reviews.

Regards,
Branden

--wl6i3r6gpq7ibouc
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="excise_standardese.diff"
Content-Transfer-Encoding: quoted-printable

commit dd2c4cf05a659ae7127e342924668ff0fa0deaa1
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
Date: Wed Sep 30 19:56:38 2020 +1000

    groff_man_style(7): Clarify empty macro arguments.
   =20
    Rewrite some ersatz standardese I had managed to concoct regarding why
    empty macro arguments are usually not needed. Put an expanded
    discussion, with anti-patterns and remedies, in section "Notes", with
    forward reference from subsection "Macro reference preliminaries".
   =20
    Thanks to Jakub Wilk and Michael Kerrisk for the critique.

diff --git a/tmac/groff_man.7.man.in b/tmac/groff_man.7.man.in
index c62d97ba..b96cbaf4 100644
--- a/tmac/groff_man.7.man.in
+++ b/tmac/groff_man.7.man.in
@@ -281,23 +281,8 @@ but the
 package is designed such that this should seldom be necessary.
 _ifstyle()dnl
 .
-Unused macro arguments are more often simply omitted,
-.\" antipattern: '.TP ""' (just '.TP' will do)
-or good style suggests that a more appropriate macro be chosen,
-.\" antipattern: '.BI "" italic bold' (use '.IB' instead)
-that earlier arguments are more important than later ones,
-.\" antipattern: '.TH foo 1 "" "foo "1.2.3"' (don't skip the date!)
-.\" antipattern: '.IP "" 4n' (use .TP or .RS/.RE, depending on needs)
-or that arguments have identical significance such that skipping any is
-superfluous.
-.\" antipattern: '.B one two "" three' (pointless)
-.\" Technically, the above has a side-effect of additional space
-.\" between "two" and "three", but there are much more obvious ways of
-.\" getting it if desired.
-.\" .B "one two three"
-.\" .B one "two " three
-.\" .B one two " three"
-.\" .B one two\~ three
+See section \(lqNotes\(rq below for examples of cases where better
+alternatives to empty arguments in macro calls are available.
 _endif()dnl
 .
 Most macro arguments are strings that will be output as text;
@@ -3235,6 +3220,63 @@ Some tips on troubleshooting your man pages follow.
 .
 .
 .TP
+\(bu Do I ever need to use an empty macro argument ("")?
+Probably not.
+.
+When this seems necessary,
+often a shorter or clearer alternative is available.
+.
+.\" antipattern: '.TP ""' (just '.TP' will do)
+.\" antipattern: '.BI "" italic bold' (use '.IB' instead)
+.\" antipattern: '.TH foo 1 "" "foo 1.2.3"' (don't skip the date!)
+.\" antipattern: '.IP "" 4n' (use .TP or .RS/.RE, depending on needs)
+.\" antipattern: '.B one two "" three' (pointless)
+.\" Technically, the above has a side-effect of additional space
+.\" between "two" and "three", but there are much more obvious ways of
+.\" getting it if desired.
+.\" .B "one two three"
+.\" .B one "two " three
+.\" .B one two " three"
+.\" .B one two\~ three
+.TS
+c c
+lfCB lfCB.
+Instead of.\|.\. .\|.\|.do this.
+_
+\&.TP \(dq\(dq .TP
+\&.BI \(dq\(dq italic-text bold-text .IB italic-text bold-text
+\&.TH foo 1 \(dq\(dq \(dqfoo 1.2.3\(dq .TH foo 1 \
+\f(CIyyyy\fP-\f(CImm\fP-\f(CIdd\fP \(dqfoo 1.2.3\(dq
+\&.IP \(dq\(dq 4n .TP 4n
+\&.B one two \(dq\(dq three .B one two three
+.TE
+.
+.
+.IP
+In the title heading
+.RB ( .TH ),
+the date of the page's last revision is more important than packaging
+information;
+it should not be omitted.
+.
+Ideally,
+a page maintainer will keep both up to date.
+.
+.
+.IP
+In the last example,
+the empty argument does have a subtly different effect than its
+suggested replacement;
+the empty argument becomes an additional space character\(embut it is a
+regular breaking space,
+so it can be discarded at the end of an output line.
+.
+It is better not to be subtle,
+particularly with space,
+which can be overlooked in source and rendered forms.
+.
+.
+.TP
 .RB \(bu " .RS" " doesn't indent relative to my indented paragraph"
 The
 .B .RS

--wl6i3r6gpq7ibouc--

--jg2hlfugxpumieke
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEh3PWHWjjDgcrENwa0Z6cfXEmbc4FAl90WfUACgkQ0Z6cfXEm
bc5raQ/9GhXG/5U7McaEEu+aW1IgaTYTMbsMpew5u3tBlj3/IenGzsy8wDO912BD
aHPSedYoc485k1Vh/Kowyx569RhyIXiMtH7uINCEtheMSUNgITNFqXo8mhaqVMlU
3JoV12btQddOIqHnGX6c5V9Z38KXFmVctD6CxjLaWGLp/Bu9tSKwSaHOOmtUYyOv
fYpMzr0amd4z9f+O8PPnToqBhwUitEvis1ZHYU6gIj8VwOjD0gNsWjA9HR3uC3c9
GK/R5przMANrNejzSgofm0/yAL6a61WhqhYEtzLUYu2NFnsyNJWzITNsNnoxzgQ5
liKL0Onmw0YWjOo4Z9Zht9Iyd6JhJxW0uRwlpFhE6UlCkFHK8nbv3NbHT2xlx/po
rxY5jDC3Ap3+mdYHY8k5o8vFd4QOXc2bSTuDRZoWtFZQsjnl4Fpkqks1W54Txq4y
o3Vu9aOPx//Jfi8sDc/qD/mFnyUu+AMFWjIj8UxQN4HmbrbXg/DEczRfP68DjOiX
ssy/0Rmm/H1cu7oBMoSss63mpk/NvPTSzzCR+VhU4PHQ7rxSZYS105tzkBVfe37e
hSS00rQVWe2YnI1KkfJHFjzveHiPXf+IxC0Z4PpJuLhl+pIZ/FgxJ5yEkX0XVUIy
aYRzKI3JaJktYl6WvulKSBPzQxIyOgrqVkZW4lv/uTh64pE6E5w=
=oeam
-----END PGP SIGNATURE-----

--jg2hlfugxpumieke--