nmbug and notmuch-report are developer tools. It's 2018, and all
developers should have python3 available.
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Correct URLs that have crept into the notmuch codebase with http://
when https:// is possible.
As part of this conversion, this changeset also indicates the current
preferred upstream URLs for both gmime and sup. The new URLs are
https-enabled, the old ones are not.
This also fixes T310-emacs.sh, thanks to Bremner for catching it.
Changes since 0.2:
* Accept failures to unset core.worktree in clone (0a155847,
2017-10-10, unreleased).
* Use --no-renames in log (f9189a06, 2016-09-26, v0.24).
* Auto-checkout in clone if it wouldn't clobber (7ef3b653, 2017-10-10,
unreleased).
* Add a 'help' command for folks who don't like --help
(9d25c97d, 2014-10-03, v0.20).
* Setup a 'config' branch on clone to track origin/config (244f8739,
2015-03-22, v0.20). This branch may be consumed by
notmuch-report(1).
* Only error for invalid diff lines in tags/ (57225988, 2017-10-16,
unreleased).
* Ignore # comments in 'notmuch dump ...' output (9bbc54bd,
2016-03-27, v0.22).
* Respect 'expect' in _spawn(..., wait=True) (e263c5b1, 2017-10-10,
unreleased).
* Update URLs in documentation (554b90b5 and 6a833a6e8, 2016-06-02,
v0.23).
Avoid:
Traceback (most recent call last):
File "/home/nmbug/bin/nmbug", line 834, in <module>
args.func(**kwargs)
File "/home/nmbug/bin/nmbug", line 385, in checkout
status = get_status()
File "/home/nmbug/bin/nmbug", line 580, in get_status
maybe_deleted = _diff_index(index=index, filter='D')
File "/home/nmbug/bin/nmbug", line 658, in _diff_index
for id, tag in _unpack_diff_lines(stream=p.stdout):
File "/home/nmbug/bin/nmbug", line 678, in _unpack_diff_lines
'Invalid line in diff: {!r}'.format(line.strip()))
ValueError: Invalid line in diff: u'.mailmap'
With this commit, folks can commit READMEs, .mailmap, etc. to their
nmbug repositories, and 'nmbug diff' and 'status' won't choke on them.
If you want to check for this sort of thing, you can set --log-level
to info or greater. nmbug will still error if the unrecognized path
is under tags/, since that's more likely to be a user error.
We currently auto-checkout after pull and merge to make those more
convenient. They're guarded against data-loss with a leading
_insist_committed(). This commit adds the same convenience to clone,
since in most cases users will have no NMBPREFIX-prefixed tags in
their database when they clone. Users that *do* have
NMBPREFIX-prefixed tags will get a warning (and I've bumped the
default log level to warning so folks who don't set --log-level will
see it) like:
$ nmbug clone http://nmbug.notmuchmail.org/git/nmbug-tags.git
Cloning into '/tmp/nmbug-clone.g9dvd0tv'...
Checking connectivity: 16674, done.
Branch config set up to track remote branch config from origin.
Not checking out to avoid clobbering existing tags: notmuch::0.25, ...
Since 6311cfaf (init: do not set unnecessary core.worktree,
2016-09-25, 2.11.0 [1]), Git has no longer set core.worktree when
--separate-git-dir is used. This broke clone with:
$ nmbug clone http://nmbug.notmuchmail.org/git/nmbug-tags.git
Cloning into '/tmp/nmbug-clone.33gg442e'...
Checking connectivity: 16674, done.
['git', '--git-dir', '/home/wking/.nmbug', 'config', '--unset', 'core.worktree'] exited with 5
$ echo $?
1
The initial discussion that led to the Git change is in [2], and
there is some more discussion around this specific change in [3].
There is some useful background on working trees in this 2009 message
[4]. There is also a git-worktree(1) since df0b6cfb (worktree: new
place for "git prune --worktrees", 2015-06-29, 2.5.0 [5]) which grew
the ability to add new worktrees in 799767cc (Merge branch
'es/worktree-add', 2015-07-13, 2.5.0 [6]). Folks relying on
core.worktree in the --separate-git-dir case fall into the "former
case" in [4], and as Junio pointed out in that message, Git
operations like 'add' don't really work there.
In nmbug we don't want core.worktree, because our effective working
tree is the notmuch database. By accepting failed core.worktree
unsets, clone will work with Git versions older and newer than 2.11.0.
[1]: 6311cfaf93
[2]: https://public-inbox.org/git/CALqjkKZO_y0DNcRJjooyZ7Eso7yBMGhvZ6fE92oO4Su7JeCeng@mail.gmail.com/
[3]: https://public-inbox.org/git/87h94d8cwi.fsf@kyleam.com/
[4]: https://public-inbox.org/git/7viqbsw2vn.fsf@alter.siamese.dyndns.org/
[5]: df0b6cfbda
[6]: 799767cc98
Reported-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
Fixing a bug from 7f2cb3be (nmbug: Translate to Python, 2014-10-03).
The bug had no direct impact though, because none of the wait=True
callers were setting expect.
Also add expected codes to the debug messages, to help log readers
understand why nonzero exits are occasionally accepted.
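A rough sketch of the 'expect' idea, assuming a simplified helper
that always waits for the child. The name spawn and its signature are
illustrative and do not reproduce nmbug's _spawn():

```python
import subprocess
import sys


def spawn(args, expect=(0,)):
    """Run args, wait for it to exit, and return its exit status.

    Raises CalledProcessError when the status is not listed in
    'expect', so callers can opt in to tolerating specific failures.
    """
    status = subprocess.call(args)
    if status not in expect:
        raise subprocess.CalledProcessError(status, args)
    return status


if __name__ == '__main__':
    # A child that exits with 5, standing in for a failed
    # 'git config --unset core.worktree'.
    child = [sys.executable, '-c', 'import sys; sys.exit(5)']
    print(spawn(child, expect=(0, 5)))
```

With expect=(0, 5), the exit status 5 from the failed core.worktree
unset would be accepted instead of aborting the clone.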
A leading / in a .gitignore pattern anchors the pattern to the
directory containing the .gitignore file. A pattern without any slash
matches files in that directory and in every subdirectory, while the
same pattern with a leading / matches only in that directory.
Prefix relevant paths with / in .gitignore files, to prevent
accidentally ignoring files in subdirectories and possibly slightly
improve the performance of "git status".
Git has supported this since b68ea12e (diff.c: respect diff.renames
config option, 2006-07-07, v1.4.2). All of our information is in the
paths (the files are empty), so we don't want rename detection. By
using --no-renames, we get entries like:
$ nmbug log -- e473b453a2
commit e473b453a25c072b5df67d834d822121373321f5
Author: David Bremner <david@tethera.net>
Date: Sun Sep 25 07:54:11 2016 -0300
D tags/1474196252-31700-1-git-send-email-markwalters1009@gmail.com/0.23
A tags/1474196252-31700-1-git-send-email-markwalters1009@gmail.com/pushed
...
Instead of the old:
$ nmbug log -- e473b453a2
commit e473b453a25c072b5df67d834d822121373321f5
Author: David Bremner <david@tethera.net>
Date: Sun Sep 25 07:54:11 2016 -0300
R100 tags/1474196252-31700-1-git-send-email-markwalters1009@gmail.com/0.23 tags/1474196252-31700-1-git-send-email-markwalters1009@gmail.com/pushed
Many of the external links found in the notmuch source can be resolved
using https instead of http. This changeset addresses as many as I
could find, without touching the e-mail corpus or expected outputs
found in tests.
Lines starting with # have always (for a long time, anyway) been ignored
by notmuch-restore, but have not been generated by notmuch-dump
previously. In order to make nmbug robust against such output, ignore
comment lines.
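A minimal sketch of comment-tolerant parsing, assuming dump lines of
the simplified shape '+tag +tag -- id:MESSAGE-ID' with no encoding of
special characters. The name parse_dump is hypothetical:

```python
def parse_dump(lines):
    """Map message ids to tag sets from 'notmuch dump'-style lines."""
    tags_by_id = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            # Blank lines and '#' comments carry no tag information.
            continue
        tags, sep, ident = line.partition('-- id:')
        if not sep:
            raise ValueError('Unrecognized dump line: {!r}'.format(line))
        tags_by_id[ident] = set(
            tag[1:] for tag in tags.split() if tag.startswith('+'))
    return tags_by_id
```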
To describe the script and config file format, so folks don't have to
dig through NEWS or the script's source to get that information.
The Makefile and conf.py are excerpted from the main doc/ directory
with minor simplifications and adjustments. The devel/nmbug/ scripts
are largely independent of notmuch, and separating the docs here
allows packagers to easily build the docs and install the scripts in a
separate package, without complicating notmuch's core build/install
process.
status-config.json wasn't obviously associated with the old
nmbug-status, now notmuch-report. The new name is
${CONFIGURED_SCRIPT}.json, so the association should be clear.
This script generates reports based on notmuch queries, and doesn't
really have anything to do with nmbug, except for sharing the NMBGIT
environment variable.
For example:
"query": ["tag:a", "tag:b or tag:c"]
is now converted to:
( tag:a ) and ( tag:b or tag:c )
instead of the old:
tag:a and tag:b or tag:c
This helps us avoid confusion due to Xapian's higher-precedence AND
[1], where the old query would be interpreted as:
( tag:a and tag:b ) or tag:c
[1]: http://xapian.org/docs/queryparser.html
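The conversion can be sketched as follows. The helper name
join_queries is hypothetical; only the input and output strings come
from the example above:

```python
def join_queries(queries):
    """AND together query fragments, parenthesizing each fragment.

    Wrapping every fragment keeps an 'or' inside one fragment from
    being regrouped by the higher-precedence 'and' between fragments.
    """
    return ' and '.join('( {} )'.format(query) for query in queries)


if __name__ == '__main__':
    print(join_queries(['tag:a', 'tag:b or tag:c']))
```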
These were broken by b70386a4 (Move the generated date from the top of
the page to the footer, 2014-05-31), which moved 'Generated ...' to
the footer with the opening tag, but didn't replace the blurb opening
tag or add a closing tag after 'Generated ...'.
We've been leading off with h2s since 3e5fb88f (contrib/nmbug: add
nmbug-status script, 2012-07-07), but the semantically-correct headers
are:
<h1>{title}</h1>
...
<h2>Views</h2>
...
<h3>View 1</h3>
...
<h3>View 2</h3>
...
We can always add additional CSS if the default h1 formatting is too
intense.
We already have a 'filename' variable with the name, so stay DRY and
use that variable here.
Also fix a missing-whitespace error from bed8b674 (nmbug-status:
Clarify errors for illegible configs, 2014-05-10), wrapping on the
sentence to match similar error-generation earlier in this function.
Let each view have a "sort" key, typically used with values
"oldest-first" or "newest-first" (although all values in Query.SORT
are accepted), and sort the results accordingly. Oldest first remains
the default.
The dynamic approach of mapping sort values is as suggested by
W. Trevor King <wking@tremily.us>.
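One way to picture the dynamic mapping, assuming the config value is
upper-cased and its hyphens turned into underscores before the
attribute lookup. The Sort class is a hypothetical stand-in for
Query.SORT with placeholder values, so the sketch runs without the
notmuch bindings:

```python
class Sort(object):
    """Hypothetical stand-in for Query.SORT (values are placeholders)."""
    OLDEST_FIRST = 0
    NEWEST_FIRST = 1


def sort_value(view, sort_enum=Sort, default='oldest-first'):
    """Translate a view's "sort" key into a constant on sort_enum."""
    name = view.get('sort', default)
    attribute = name.upper().replace('-', '_')
    try:
        return getattr(sort_enum, attribute)
    except AttributeError:
        raise ValueError('unknown sort order: {!r}'.format(name))
```

Because the lookup is by attribute name, any constant defined on the
enumeration is accepted without listing each one in the script.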
When loading configs from Git, the bare branch name (without a
refs/heads/ prefix or similar) matches all branches of that name
(including remote-tracking branches):
.nmbug $ git show-ref config
48f3bbf1d1492e5f3d2f01de6ea79a30d3840f20 refs/heads/config
48f3bbf1d1492e5f3d2f01de6ea79a30d3840f20 refs/remotes/origin/config
4b6dbd9ffd152e7476f5101eff26747f34497cee refs/remotes/wking/config
Instead of relying on the ordering of the matching references, use
--heads to ensure we only match local branches.
Carl Worth pointed out that errors like:
$ ./nmbug-status
fatal: Not a git repository: '/home/cworth/.nmbug'
fatal: Not a git repository: '/home/cworth/.nmbug'
Traceback (most recent call last):
File "./nmbug-status", line 254, in <module>
config = read_config(path=args.config)
File "./nmbug-status", line 73, in read_config
return json.load(fp)
File "/usr/lib/python2.7/json/__init__.py", line 290, in load
**kw)
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
are not particularly clear. With this commit, we'll get output like:
$ ./nmbug-status
fatal: Not a git repository: '/home/wking/.nmbug'
No local branch 'config' in /home/wking/.nmbug. Checkout a local
config branch or explicitly set --config.
which is much more accessible. I've also added user-friendly messages
for a number of other config-parsing errors.
This allows us to capture stdout and stderr separately, and do other
explicit subprocess manipulation without resorting to external
packages. It should be compatible with Python 2.7 and later
(including the 3.x series).
Most of the user-facing interface is the same, but there are a few
changes, where reproducing the original interface was too difficult or
I saw a chance to make the underlying Git UI accessible:
* 'nmbug help' has been split between the general 'nmbug --help' and
the command-specific 'nmbug COMMAND --help'.
* Commands are no longer split into "most common", "other useful", and
"less common" sets. If we need something like this, I'd prefer
workflow examples highlighting common commands in the module
docstring (available with 'nmbug --help').
* 'nmbug commit' now only uses a single argument for the optional
commit-message text. I wanted to expose more of the underlying 'git
commit' UI, since I personally like to write my commit messages in
an editor with the notes added by 'git commit -v' to jog my memory.
Unfortunately, we're using 'git commit-tree' instead of 'git
commit', and commit-tree is too low-level for editor-launching. I'd
be interested in rewriting commit() to use 'git commit', but that
seemed like it was outside the scope of this rewrite. So I'm not
supporting all of Git's commit syntax in this patch, but I can at
least match 'git commit -m MESSAGE' in requiring command-line commit
messages to be a single argument.
* The default repository for 'nmbug push' and 'nmbug fetch' is now the
current branch's upstream (branch.<name>.remote) instead of
'origin'. When we have to, we extract this remote by hand, but
where possible we just call the Git command without a repository
argument, and leave it to Git to figure out the default.
* 'nmbug push' accepts multiple refspecs if you want to explicitly
specify what to push. Otherwise, the refspec(s) pushed depend on
push.default. The Perl version hardcoded 'master' as the pushed
refspec.
* 'nmbug pull' defaults to the current branch's upstream
(branch.<name>.remote and branch.<name>.merge) instead of hardcoding
'origin' and 'master'. It also supports multiple refspecs if for
some crazy reason you need an octopus merge (but mostly to avoid
breaking consistency with 'git pull').
* 'nmbug log' now execs 'git log', as there's no need to keep the
Python process around once we've launched Git there.
* 'nmbug status' now catches stderr, and doesn't print errors like:
No upstream configured for branch 'master'
The Perl implementation had just learned to avoid crashing on that
case, but wasn't yet catching the dying subprocess's stderr.
* 'nmbug archive' now accepts positional arguments for the tree-ish
and additional 'git archive' options. For example, you can run:
$ nmbug archive HEAD -- --format tar.gz
I wish I could have preserved the argument order from 'git archive'
(with the tree-ish at the end), but I'm not sure how to make
argparse accept arbitrary positional arguments (some of which take
arguments). Flipping the order to put the tree-ish first seemed
easiest.
* 'nmbug merge' and 'pull' no longer checkout HEAD before running
their command, because blindly clobbering the index seems overly
risky.
* In order to avoid creating a dirty index, 'nmbug commit' now uses
the default index (instead of nmbug.index) for composing the commit.
That way the index matches the committed tree. To avoid leaving a
broken index after a failed commit, I've wrapped the whole thing in
a try/except block that resets the index to match the pre-commit
treeish on errors. That means that 'nmbug commit' will ignore
anything you've cached in the index via direct Git calls, and you'll
either end up with an index matching your notmuch tags and the new
HEAD (after a successful commit) or an index matching the original
HEAD (after a failed commit).
If we don't have an upstream, there is nothing to merge, so nothing is
unmerged. This avoids errors like:
$ nmbug status
error: No upstream configured for branch 'master'
error: No upstream configured for branch 'master'
fatal: ambiguous argument '@{upstream}': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
'git rev-parse @{upstream}' exited with nonzero value
You might not have an upstream if you're only using nmbug locally to
version-control your tags.
Sometimes we want to catch Git errors and handle them, instead of
dying with an error message. This lower-level version of git() allows
us to get the error status when we want it.
Our repository [1] has a post-update hook that rebuilds the status
page after each push. Since that may happen several times a day, we
might as well show the build time (as well as the date) in the footer.
The trailing 'Z' is the ISO 8601 designator for UTC. Now that we're
showing times, it's nice to be explicit about the timezone we're
using.
The rename from date -> datetime gives us backward-compatibility for
folks that *do* only want the date. We keep the old date formatting
to support those folks.
[1]: http://nmbug.tethera.net/git/nmbug-tags.git
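A small sketch of a context carrying both keys. The function name and
the exact format strings are assumptions; only the 'date' and
'datetime' names and the trailing 'Z' come from the description
above:

```python
from datetime import datetime, timezone


def footer_context(now=None):
    """Return 'date' and 'datetime' strings for the page footer, in UTC."""
    if now is None:
        now = datetime.now(timezone.utc)
    now = now.astimezone(timezone.utc)
    return {
        # Kept for templates that only want the day.
        'date': now.strftime('%Y-%m-%d'),
        # The trailing 'Z' marks the timestamp as UTC.
        'datetime': now.strftime('%Y-%m-%dT%H:%M:%SZ'),
    }
```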
Rather than splitting this context into header-only and footer-only
groups, just dump it all in a shared dict. This will make it easier
to eventually split the header/footer templates out of this script
(e.g. if we want to load them from the config file).
It's useful reference information, but anyone who wants it will look
for and find it. We don't need this front-and-center. Follow the
pattern set by our header template with a triple-quoted string.
The gray <hr> styling is less aggressive. IE uses 'color' for drawing
the rule, while Gecko and Opera use the border or 'background-color'
[1].
[1]: https://bugzilla.mozilla.org/show_bug.cgi?id=239386
Prefer a docstring to a header comment so we can use it as the
ArgumentParser description (formatted with 'nmbug-status --help').
Script readers still have it near the top of the file. Since it's a
docstring, use PEP 257's summary-line-and-body format [1].
[1]: http://legacy.python.org/dev/peps/pep-0257/#multi-line-docstrings
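A sketch of the pattern, assuming the description is passed in
explicitly so the example is self-contained; in a script the argument
would typically be the module's __doc__. The DESCRIPTION text and the
build_parser name are hypothetical:

```python
import argparse

# Hypothetical text in PEP 257 summary-line-and-body form.
DESCRIPTION = """Generate HTML for one or more notmuch searches.

Each search is configured as a view, and every view is rendered as
its own section of the page.
"""


def build_parser(description):
    """Create a parser whose --help output shows the given docstring."""
    return argparse.ArgumentParser(
        description=description,
        # Preserve the docstring's own line breaks in --help output.
        formatter_class=argparse.RawDescriptionHelpFormatter)


if __name__ == '__main__':
    build_parser(DESCRIPTION).print_help()
```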
If a git repository is non-bare, and core.worktree is not set, git
tries to deduce the worktree. This deduction is not always helpful, e.g.
% git --git-dir=$HOME/.nmbug clean -f
would likely delete most of the files in the current directory.
With two branches getting fetched (master and config), the branch
referenced by FETCH_HEAD is ambiguous. For example, I have:
$ cat FETCH_HEAD
41d7bfa7184cc93c9dac139d1674e9530799e3b0 \
not-for-merge branch 'config' of http://nmbug.tethera.net/git/nmbug-tags
acd379ccb973c45713eee9db177efc530f921954 \
not-for-merge branch 'master' of http://nmbug.tethera.net/git/nmbug-tags
(where I wrapped the line by hand). This means that FETCH_HEAD
references the config branch:
$ git rev-parse FETCH_HEAD
41d7bfa7184cc93c9dac139d1674e9530799e3b0
which breaks all of the FETCH_HEAD logic in nmbug (where FETCH_HEAD is
assumed to point to the master branch).
Instead of relying on FETCH_HEAD, use @{upstream} as the
remote-tracking branch that should be merged/diffed/integrated into
HEAD. @{upstream} was added in Git v1.7.0 (2010-02-12) [1], so
relying on it should be fairly safe. One tricky bit is that bare
repositories don't set upstream tracking branches by default:
$ git clone --bare http://nmbug.tethera.net/git/nmbug-tags.git nmbug-bare
$ cd nmbug-bare
$ git remote show origin
* remote origin
Fetch URL: http://nmbug.tethera.net/git/nmbug-tags.git
Push URL: http://nmbug.tethera.net/git/nmbug-tags.git
HEAD branch: master
Local refs configured for 'git push':
config pushes to config (up to date)
master pushes to master (up to date)
While in a non-bare clone:
$ git clone http://nmbug.tethera.net/git/nmbug-tags.git
$ cd nmbug-tags
$ git remote show origin
* remote origin
Fetch URL: http://nmbug.tethera.net/git/nmbug-tags.git
Push URL: http://nmbug.tethera.net/git/nmbug-tags.git
HEAD branch: master
Remote branches:
config tracked
master tracked
Local branch configured for 'git pull':
master merges with remote master
Local ref configured for 'git push':
master pushes to master (up to date)
From the clone docs [2]:
--bare::
Make a 'bare' Git repository…
Also the branch heads at the remote are copied directly
to corresponding local branch heads, without mapping
them to `refs/remotes/origin/`. When this option is
used, neither remote-tracking branches nor the related
configuration variables are created.
To use @{upstream}, we need the local vs. remote-tracking
distinction, so this commit adds 'nmbug clone', replacing the
previously suggested --bare clone with a non-bare --no-checkout
--separate-git-dir clone into a temporary work directory. After
which:
$ git rev-parse @{upstream}
acd379ccb973c45713eee9db177efc530f921954
gives us the master-branch commit. Existing nmbug users will have to
run the configuration tweaks and re-fetch by hand. If you don't have
any local commits, you could also blow away your NMBGIT repository and
re-clone from scratch:
$ nmbug clone http://nmbug.tethera.net/git/nmbug-tags.git
Besides removing the ambiguity of FETCH_HEAD, this commit allows users
to configure which upstream branch they want nmbug to track via 'git
config', in case they want to change their upstream repository.
[1]: http://git.kernel.org/cgit/git/git.git/tree/Documentation/RelNotes/1.7.0.txt
[2]: http://git.kernel.org/cgit/git/git.git/tree/Documentation/git-clone.txt
Make nmbug-status more generally usable outside of nmbug by not
hardcoding notmuch related things.
This lets anyone publish html search views to mailing list messages
with a custom config file, independent of nmbug.
Python's dict has no __values__() method that OrderedDict().values()
(the stub provided in nmbug-status) could call to provide an ordered
list of values. Renaming this thinko to values() makes our stub work
as expected: dict items are listed in the order in which they were
added to the dictionary.
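A minimal insertion-ordered stub along these lines might look like
the following. This is an illustrative reconstruction under the name
OrderedDictStub, not the exact class from nmbug-status:

```python
class OrderedDictStub(dict):
    """Minimal insertion-ordered dict for Pythons lacking OrderedDict."""

    def __init__(self):
        super(OrderedDictStub, self).__init__()
        self._keys = []  # keys in the order they were first added

    def __setitem__(self, key, value):
        if key not in self:
            self._keys.append(key)
        super(OrderedDictStub, self).__setitem__(key, value)

    def values(self):
        # Callers use values(), so the override must carry that name.
        for key in self._keys:
            yield self[key]
```

A method named __values__() would never be reached by callers, so
values() would fall back to dict's own implementation.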
David [1] and Tomi [2] both feel that the user's choice of LANG is not
explicit enough to have such a strong effect on nmbug-status. For
example, cron jobs usually default to LANG=C, and that is going to
give you ASCII output:
$ LANG=C python -c 'import locale; print(locale.getpreferredencoding())'
ANSI_X3.4-1968
Trying to print Unicode author names (and other strings) in that
encoding would crash nmbug-status with a UnicodeEncodeError. To avoid
that, this patch hardcodes UTF-8, which can handle generic Unicode,
and is the preferred encoding (regardless of LANG settings) for
everyone who has chimed in on the list so far. I'd prefer trusting
LANG, but in the absence of any users that prefer non-UTF-8 encodings
I'm fine with this approach.
While we could achieve the same effect on the output content by
dropping the previous patch (nmbug-status: Encode output using the
user's locale), Tomi also wanted UTF-8 hardcoded as the config-file
encoding [2]. Keeping the output encoding patch and then adding this
to hardcode both the config-file and output encodings at once seems
the easiest route, now that fd29d3f (nmbug-status: Decode Popen output
using the user's locale, 2014-02-10) has landed in master.
[1]: id="877g8z4v4x.fsf@zancas.localnet"
http://article.gmane.org/gmane.mail.notmuch.general/17202
[2]: id="m2vbwj79lu.fsf@guru.guru-group.fi"
http://article.gmane.org/gmane.mail.notmuch.general/17209
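A sketch of hardcoding one encoding for both directions, assuming
binary streams for the config input and the generated output. The
names ENCODING, read_config, and write_output here are illustrative:

```python
import io
import json

ENCODING = 'UTF-8'  # used for both config input and generated output


def read_config(stream):
    """Parse JSON from a binary stream, decoding it as UTF-8."""
    return json.loads(stream.read().decode(ENCODING))


def write_output(text, stream):
    """Write text to a binary stream as UTF-8, regardless of LANG."""
    stream.write(text.encode(ENCODING))


if __name__ == '__main__':
    out = io.BytesIO()
    write_output(u'J\u00fcrgen', out)
    print(repr(out.getvalue()))
```

Since the locale is never consulted, the same bytes are produced
under LANG=C as under a UTF-8 locale.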
Instead of always writing UTF-8, allow the user to configure the
output encoding using their locale. This is useful for previewing
output in the terminal, for poor souls that don't use UTF-8 locales
;).
We already had the tbody with a blank row separating threads (which is
not colored); this commit adds a bit of spacing to separate messages
within a thread. It will also add a bit of colored padding above the
first message and below the final message, but the main goal is to add
padding *between* two-row message blocks.
<--- new padding
thread-1, message-1, row-1 (class="message-first")
thread-1, message-1, row-2 (class="message-last")
<--- new padding
spacer tbody with a blank row
<--- new padding
thread-2, message-1, row-1 (class="message-first")
thread-2, message-1, row-2 (class="message-last")
<--- new padding
<--- new padding
thread-2, message-2, row-1 (class="message-first")
thread-2, message-2, row-2 (class="message-last")
<--- new padding
'message-id' and 'from' now have sensitive characters escaped using
xml.sax.saxutils.escape [1]. The 'subject' data was already being
converted to a link into Gmane; I've escape()d that too, so it doesn't
need to be handled in the same block as 'message-id' and 'from'.
This prevents broken HTML if subjects etc. contain characters that
would otherwise be interpreted as HTML markup.
[1]: http://docs.python.org/3/library/xml.sax.utils.html#xml.sax.saxutils.escape
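A short sketch of escaping these fields before they are interpolated
into HTML. The helper escape_fields and the message dict layout are
assumptions; xml.sax.saxutils.escape is the standard-library function
cited above:

```python
from xml.sax.saxutils import escape


def escape_fields(message, fields=('message-id', 'from', 'subject')):
    """Return a copy of message with the named fields escaped."""
    return dict(
        (key, escape(value) if key in fields else value)
        for key, value in message.items())
```

escape() replaces '&', '<', and '>', so an author such as
'Alice <alice@example.com>' no longer opens a bogus HTML tag.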
Also allow manual id overrides from the JSON config. Slugging avoids
errors like:
Bad value '#Possible bugs' for attribute href on element a:
Whitespace in fragment component. Use %20 in place of spaces.
from http://validator.w3.org.
I tried just quoting the titles (e.g. 'Possible%20bugs'), but that
didn't work (at least with Firefox 24.2.0). Slugging avoids any
ambiguity over when the quotes are expanded in the client. The specs
are unclear about quoting, saying only [1]:
Value: Any string, with the following restrictions:
must be at least one character long
must not contain any space characters
[1]: http://dev.w3.org/html5/markup/global-attributes.html#common.attrs.id
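A sketch of slugging with a manual override, assuming runs of
non-word characters collapse to a single hyphen. The helpers slug and
view_id and the exact regular expression are illustrative, not taken
from the script:

```python
import re


def slug(title):
    """Collapse runs of non-word characters into single hyphens."""
    return re.sub(r'\W+', '-', title).strip('-')


def view_id(view):
    """Prefer an explicit "id" from the config, else slug the title."""
    return view.get('id') or slug(view['title'])
```

The result contains no whitespace, so it can be used both as the id
attribute and as the href fragment.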