emacs: Improve the regexp used to match id:'s in messages

This regexp agrees with Xapian query syntax much more closely, though
we specifically disallow various cases that would be confusing in the
context of an email body (e.g., punctuation at the end of an id: link
is not considered part of the id: link because it's probably part of
the surrounding text).

In particular, this handles id: links that are not surrounded by
quotes much better, which stash is much more likely to generate now
that we don't quote id's that don't need to be quoted.  It also
handles quoted id: links better.

We update the buttonization test to reflect the new pattern.
Austin Clements 2012-11-15 14:49:53 -05:00 committed by David Bremner
parent 65801835ee
commit 580997252f
2 changed files with 29 additions and 11 deletions

@@ -996,6 +996,24 @@ message at DEPTH in the current thread."
   "Insert the forest of threads FOREST."
   (mapc (lambda (thread) (notmuch-show-insert-thread thread 0)) forest))
+(defvar notmuch-id-regexp
+  (concat
+   ;; Match the id: prefix only if it begins a word (to disallow, for
+   ;; example, matching cid:).
+   "\\<id:\\("
+   ;; If the term starts with a ", then parse Xapian's quoted boolean
+   ;; term syntax, which allows for anything as long as embedded
+   ;; double quotes are escaped by doubling them.  We also disallow
+   ;; newlines (which Xapian allows) to prevent runaway terms.
+   "\"\\([^\"\n]\\|\"\"\\)*\""
+   ;; Otherwise, parse Xapian's unquoted syntax, which goes up to the
+   ;; next space or ).  We disallow ] and [.,:;?!] as the last character
+   ;; because these are probably part of the surrounding text, not
+   ;; part of the id.  This doesn't match single character ids; meh.
+   "\\|[^\"[:space:])][^[:space:])]*[^])[:space:].,:;?!]"
+   "\\)")
+  "The regexp used to match id: links in messages.")
 (defun notmuch-show-buttonise-links (start end)
   "Buttonise URLs and mail addresses between START and END.
@@ -1004,7 +1022,7 @@ a corresponding notmuch search."
   (goto-address-fontify-region start end)
   (save-excursion
     (goto-char start)
-    (while (re-search-forward "id:\\(\"?\\)[^[:space:]\"]+\\1" end t)
+    (while (re-search-forward notmuch-id-regexp end t)
       ;; remove the overlay created by goto-address-mode
       (remove-overlays (match-beginning 0) (match-end 0) 'goto-address t)
       (make-text-button (match-beginning 0) (match-end 0)

@@ -136,23 +136,23 @@ To: Notmuch Test Suite <test_suite@notmuchmail.org>
 Date: Fri, 05 Jan 2001 15:43:57 +0000
 <<id:abc>>
-<<id:abc.def.>> <<id:abc,def,>> <<id:abc;def;>> <<id:abc:def:>>
-<<id:foo@bar.?baz?>> <<id:foo@bar!.baz!>>
-(<<id:foo@bar.baz)>> [<<id:foo@bar.baz]>>
-<<id:foo@bar.baz...>>
+<<id:abc.def>>. <<id:abc,def>>, <<id:abc;def>>; <<id:abc:def>>:
+<<id:foo@bar.?baz>>? <<id:foo@bar!.baz>>!
+(<<id:foo@bar.baz>>) [<<id:foo@bar.baz>>]
+<<id:foo@bar.baz>>...
 <<id:2+2=5>>
 <<id:=_-:/.[]@$%+>>
-<<id:abc)def>>
-<<id:ab>>"c def
+<<id:abc>>)def
+<<id:ab"c>> def
 <<id:"abc">>def
-<<id:"ab">>"c"def
-id:"ab c"def
+<<id:"ab""c">>def
+<<id:"ab c">>def
 <<id:"abc">>.def
 id:"abc
 "
-<<id:)>>
+id:)
 id:
-c<<id:xxx>>
+cid:xxx
 EOF
 test_expect_equal_file OUTPUT EXPECTED
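
To experiment with the new pattern outside Emacs, the sketch below ports notmuch-id-regexp to Python's re module and marks matches with << >> the way the expected test output above does. This is an approximation, not notmuch code: the names ID_RE and buttonise are hypothetical, Emacs's \< is replaced by a (?<!\w) lookbehind, and [[:space:]] by \s, so behaviour may differ for unusual syntax tables or Unicode whitespace.

```python
import re

# Approximate Python port of notmuch-id-regexp (an assumption for
# illustration; the real pattern is the Emacs Lisp one in the diff).
ID_RE = re.compile(
    # "id:" must begin a word, so "cid:" is not matched.
    r'(?<!\w)id:('
    # Quoted term: anything except a newline, with "" as an escaped quote.
    r'"(?:[^"\n]|"")*"'
    # Unquoted term: runs to the next space or ")", at least two
    # characters, and must not end in "]" or one of . , : ; ? !
    r'|[^"\s)][^\s)]*[^\])\s.,:;?!]'
    r')'
)


def buttonise(text):
    """Wrap each id: link in << >>, mimicking the test's button markers."""
    return ID_RE.sub(lambda m: "<<" + m.group(0) + ">>", text)


if __name__ == "__main__":
    samples = [
        'id:abc.def.',
        '(id:foo@bar.baz)',
        'id:ab"c def',
        'id:"ab""c"def',
        'id:)',
        'cid:xxx',
    ]
    for sample in samples:
        print(buttonise(sample))
```

Running the script prints each sample with its id: link, if any, wrapped in << >>, which can be compared line by line against the "+" lines of the test hunk.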