Unify the subtests by requiring test_begin_subtest before
test_expect_success. (Similar change for test_expect_code will
follow.)
This increases clarity in the test scripts by having a separate line
for the start of the subtest with the heading, and makes it possible
to simplify the test infrastructure by making all subtests similar.
The only place where we use the implicit prereq check is T000-basic.sh
where we check that it works. It's an added complication that we don't
use. Remove it.
The test_have_prereq function can still be used for the same effect in
subtests that use test_begin_subtest. For now, this will make it
impossible to have prereqs in one-line subtests that don't require
test_begin_subtest. This will be fixed in follow-up work.
Known broken tests are, well, known broken. Do not print the result
diff for them unless V=1 is specified. Now that the test description
is printed also when known broken tests fail, the user can also skip
to running the individual failing tests.
We already use this directory for dtach sockets, so it makes sense to
put gnupg sockets there as well. There doesn't seem to be a clean way
to put a fully functional socket in a different location than
GNUPGHOME.
This reverts commit e7b88e8b0a.
It turns out that this does not work well in environments without a
running systemd (or some other provider of /run/user)
Instead of just having the first filename for the message, list all
duplicate filenames of the message as a list in the formatted
outputs. This bumps the format version to 3.
$NOTMUCH_PYTHON is sourced from sh.config, configured by
./configure and stated to be used as:
"Name of python command to use in configure and the test suite."
This enables the shortened socket pathes in /run or equivalent. The
explicit call to gpgconf is needed for nonstandard GNUPGHOME settings.
(amended according to id:m2fujatr4k.fsf@guru.guru-group.fi)
GnuPG 2.1.16 is now injecting the full issuer fingerprint in its
signatures, which makes them about 32 octets larger when
ascii-armored.
This change in size means that the size of the MIME parts will vary
depending on the version of gpg that the user has installed. at any
rate, the signature part should be non-zero (this is true for
basically any MIME part), so we just test for that instead of an exact
size.
Moved the 2 basename(1) executions to the test failure branch in
test_expect_equal_file ().
The output of basename(1) executions in function test_expect_equal_file ()
are only used when tests fails -- when all tests pass these 2 basename(1)
executions are no longer done at all.
In case of the test script is to be relaunced under valgrind, or --tee
is requested, use the $BASH shell variable to locate the command
interpreter. The $SHELL variable is re-set by non-interactive shells
so in case the shell uses some other shell (e.g. zsh) for interactive
use these bash scripts continue to work.
We largely use the corpus under test/corpus for
testing. Unfortunately, many of our tests have grown to depend on
having exactly this set of messages, making it hard to add new message
files for testing specific cases.
We do use a lot of add_message from within the tests, but it's not
possible to use that for adding broken messages, and adding several
messages at once can get unwieldy.
Move the basic corpus under tests/corpora/default, and make it
possible to add new, independent corpora along its side. This means
tons of renames with a few tweaks to add_email_corpus function in
test-lib.sh to let tests specify which corpus to use.
The trick of having a common header file doesn't work to share between
test scripts, so make an include file in the test directory.
The use of #include <notmuch-test.h> looks slightly pretentious, but
the include file is not actually in the current (temporary) directory.
Place PYTHONPATH to the environment when python is executed in a way
that current shell environment is not affected. This also allows adding
the old value of PYTHONPATH to the end of the new value (otherwise it
would have been appended again and again when test_python is called).
At the same time, use -B option to avoid writing .pyc files to
bindings/python/* (which are not cleared out by distclean).
Drop the (unused) prefix code which preserved the original stdout of the
python program and opened sys.stdout to OUTPUT. In place of that there
is now note how (debug) information can be printed to original stdout.
Previously LD_LIBRARY_PATH was exported (and environment changed)
in the middle of test case execution, when a function setting it
was called.
Previously the old contents of LD_LIBRARY_PATH was lost (if any)
when it was re-set and exported. In some systems the old contents of
LD_LIBRARY_PATH was needed to e.g. locate suitable gmime library.
Added die() function to test-lib.sh with the following first use of it:
If notmuch new fails during email corpus addition the database is
most probably inexistent or broken and the added corpus would be
unusable while running single tests, giving misleading failures
("only" full 'make test' cleans out old corpus).
Many of the external links found in the notmuch source can be resolved
using https instead of http. This changeset addresses as many as i
could find, without touching the e-mail corpus or expected outputs
found in tests.
Most of the infrastructure here is general, only the validation/dispatch
is hardcoded to a particular prefix.
A notable change in behaviour is that notmuch-config now opens the
database e.g. on every call to list, which fails with an error message
if the database doesn't exit yet.
Files in test directories had only copyright of a single individual,
of which code was adapted here as a base of the test system.
Since then many Notmuch Developers have contributed to the test
system, which is now acknowledged with a constant string in some
of the test files.
The README file in test directory instructed new files contain a
copyright notice, but that has never been done (and it is also not
needed). To simplify things a bit (and lessen confusion) this
instruction is now removed.
As a side enchangement, all of the 3 entries in the whole source
tree cd'ing to `dirname` of "$0" now uses syntax cd "$(dirname "$0")".
This makes these particular lines work when current working directory
is e.g. /c/Program Files/notmuch/test/.
(Probably it would fail elsewhere, though.)
This is mainly for the test suite. We already expect the tests to be
run in the same environment as configure was run, at least to get the
name of the python interpreter. So we are not really imposing a new
restriction.
The test is pretty much cut and paste from the PGP/MIME version, with
obvious updates taken from notmuch output. This also requires setting
up gpgsm infrastucture.
Test the ability of notmuch-mua-mail to send S/MIME signed (and
encrypted) messages; this really relies on existing functionality in
message-mode.
The generated keys and messages will later be useful for testing the
notmuch CLI.
ALTERNATE_EDITOR causes emacsclient to run an alternate editor if the
emacs server is not ready. This can collide with intended
functionality in test-lib.sh.
If the ALTERNATE_EDITOR is set but empty, emacsclient runs emacs
daemon and tries to connect to it. When this happens the emacs run by
test-lib.sh fails to start the server and the subsequent attempts to
use the server fail because the daemon started by emacsclient does not
know about notmuch-test-progn. This leads to test suite failure due to
time out on any emacs test.
This exposes the committed database revision to library users along
with a UUID that can be used to detect when revision numbers are no
longer comparable (e.g., because the database has been replaced).
The files (test) scripts source (with builtin command `.`) provides
information which the scripts depend, and without the `source` to
succeed allowing script to continue may lead to dangerous situations
(e.g. rm -rf "${undefined_variable}"/*).
At the end of all source (.) lines construct ' || exit 1' was added;
In our case the script script will exit if it cannot find (or read) the
file to be sourced. Additionally script would also exits if the last
command of the sourced file exited nonzero.
Previously we globally modified these variables, which tended to cause
problems for people using message-mode, but not notmuch-mua-mail, to
send mail.
User visible changes:
- Calling notmuch-fcc-header-setup is no longer optional. OTOH, it
seems to do the right thing if notmuch-fcc-dirs is set to nil.
- The Fcc header is visible during message composition
- The name in the mode line is changed, and no longer matches exactly
the menu label.
- Previously notmuch-mua-send-and-exit was never called. Either we
misunderstood define-mail-user-agent, or it had a bug. So there was
no difference if the user called message-send-and-exit directly. Now
there will be.
- User bindings to C-c C-c and C-c C-s in message-mode-map are
overridden. The user can override them in notmuch-message-mode-map,
but then they're on their own for Fcc handling.
The configure script chooses "python" if both python and python{2,3}
exist exists, so this could change the version of python used to run
the test suite.
The checking for ${NOTMUCH_PYTHON} in the test suite is arguably
over-engineering, since the configure step will fail if it can't find
it.
This is to limit the copy-pasta involved in running C tests. I decided
to keep things simple and not try to provide an actual C skeleton.
The setting of LD_LIBRARY_PATH is to force using the built libnotmuch
rather than any potential system one.
When something in tests fails one possibility to test is to run
the test script as `bash -x TXXX-testname.sh`. As stderr (fd 2) was
redirected to separate file during test execution also this set -x
(xtrace) output would also go there.
test-lib.sh saves the stderr to fd 7 from where it can be restored,
and bash has BASH_XTRACEFD variable, which is now given the same value
7, making bash to output all xtrade information (consistently) there.
This lib file used to save fd's 1 & 2 to 6 & 7 (respectively) in
test_begin_subtest(), but as those needs to be set *before* XTRACEFD
variable is set those are now saved at the beginning of the lib (once).
This is safe and simple thing to do.
To make xtrace output more verbose PS4 variable was set to contain the
source file, line number and if execution is in function, that function
name. Setting this variable has no effect when not xtracing.
As it is known that fd 6 is redirected stdout, printing status can now
use that fd, instead of saving stdout to fd 5 and use it.
At the moment, the test-lib fills in any missing headers. This makes
it impossible to test our handling of empty subjects. This will
allow us to use a special dummy subject -- `@FORCE_EMPTY` -- to force
the subject to remain empty.
The unread/read changes will use the post-command-hook. test_emacs
does not call the post-command-hook. This adds a notmuch-test-progn
which takes a list of commands as argument and executes them in turn
but runs the post-command-hook after each one.
The caller can batch operations (ie to stop post-command-hook from
being interleaved) by wrapping the batch of operations inside a progn.
We also explicitly run the post-command-hook before getting the output
from a test; this makes sense as this will be a place the user would
be seeing the information.
At least in emacs24, this removes the "site-lisp" directories from the
load path in addition to enforcing --no-site-lisp --no-init-file.
This works around a slightly mysterious bug on Debian that causes
test-lib.el not to load when there is cl-lib.el(c) in some site-lisp
directory. It should be harmless in general since we really don't
want to load any files from addon packages to emacs.
The printf builtin "%(fmt)T" specifier (which allows time values
to use strftime-like formatting) is introduced in bash 4.2.
Trying to execute this in pre-4.2 bash will fail -- and if this
happens execute the fallback piece of perl code to do the same thing.
The test names assigned to NOTMUCH_SKIP_TESTS variable can now be given
with or without the Tddd- prefix for tester convenience:
The test name without Tddd -prefix stays constant even when test filenames
are renumbered.
The test name with Tddd -prefix is printed out when tests run.
Previously, we stripped the "Tnnn-" part from the test name when
printing its description at the beginning of each test. However, this
makes it difficult to find the source script for a test (e.g., when a
test fails). Put this prefix back.
Script `notmuch-test` expects the results file have T\d\d\d- part
intact so the results files (and some test output files) are now
name as such.
Without this change `notmuch-test` will exit in case the test
script it was executing exited with nonzero value.
The T\d\d\d- part is dropped in new variable $this_test_bare which is
used in progress informational messages and when loading .el files in
emacs tests (whenever $this_test_bare.el exists).
All test scripts to be executed are now named as T\d\d\d-name.sh,
numers in increments of 10.
This eases adding new tests and developers to see which are test scripts
that are executed by test suite and in which order.
There is an obscure bug in notmuch-hello that very occasionally causes
emacs_deliver_message to fail. Since it it doesn't serve any actual
purpose in the function we delete it, and leave tracking down the the
bug for another day.
Most of the tests previously using emacs_deliver_message do not use
the actual transmitted message, so we replace it with a simpler (and
presumably more reliable function) that only saves (and indexes) an
fcc copy of the message.
When NOTMUCH_TEST_QUIET environment variable is set to non-null value
messages when new test script starts and when test PASSes are disabled.
This eases picking the cases when tests FAIL (as those are still printed).
In preparation for quiet mode print empty line before writing the
test description. This is done now in function designed for it --
it will also be called when test fails.
test-lib.sh sometimes did equivalent of `basename "$0" .sh`, sometimes
skipping the basename part and sometimes .sh part. This worked as
we never had path components in $0 (more than ./) nor .sh ending.
Now the equivalent of `basename "$0" .sh` is done once and used
everywhere. In the future we may have .sh suffix in test names
-- removing those is a good idea.
The choice of decreasing timestamps is a hack which reduces the number
of existing tests which fail. This can be changed to increasing
if/when somebody wants update another 47 tests.
add a new function notmuch_date_sanitize for rfc822-ish things. Add
date sanitization to notmuch_show_sanitize_all and use it more places.
This is all in aid of a transition to unique timestamps on messages.
Eventually we want test messages to have distinct dates to avoid
reproducability problems. This sanitization will prevent some test
failures when that change is made.
Replace the use of a local function in maildir-sync with
notmuch_json_show_sanitize
When executed command line is written to *Notmuch errors* buffer,
shell-quote-argument will backslash-escape any char that is not in
"POSIX filename characters" (i.e. matching "[^-0-9a-zA-Z_./\n]").
Currently in two emacs tests shell has expanded $PWD as part of
emacs variable, which will later be fed to #'shell-quote-argument
and finally written to ERROR file. If $PWD contained non-POSIX
filename characters, data in ERROR file will not match $PWD when
later comparing in shell. Therefore, in these two particular cases
the escaped $PWD is replaced with YYY in ERROR file and expected
content is adjusted accordingly.
One test (reply to encrypted message in the crypto test) recently
started failing on some systems. The failure I saw were two extra
lines of the form
<87d2nbc5xg.fsf@host.i-did-not-set--mail-host-address--so-tickle-me>
The test pipes the output through
grep -v -e '^In-Reply-To:' -e '^References:'
which would normally these two ids but it does not, in this case,
because they are so long they get put on a separate line in the output.
To fix this we set mail-host-address for emacs deliver. example.com
seems a sensible address to use. This is short enough that we don't
get the line breaks above and the tests then all pass.
When 'xpg_echo' bash shell option is unset (usually the default)
echo builtin does not expand backslash-escape sequences by default
(i.e. '\n' is echoed as '\n' instead of newline). Not all bash
installations have this feature we depend on activated by default.
Note that the feature is bash (and GNU /bin/echo) specific. It is used
as it is convenient. If portability is needed (elsewhere) use printf(1)
(also often available as a shell builtin).
When execution of tests is interrupted by signal coming outside of the
test system itself, output just one line "interrupted by signal <num>"
message to standard output. This distinguishes the case from internal
exit and reduces noise.
Set the variable '$test_subtest_name' in all functions which starts
a new test and use that variable in all functions that output
test results.
Additionally output the latest '$test_subtest_name' in case of
abnormal exit, to avoid confusion.
The TERM environment variable is set to 'dumb' when running tests, but
the original value of it is stored for echoing colors and running emacs
(somewhat interactively) in detached session. Emacs requires some
terminal control sequences to be available for interactive operation.
In case original TERM is (also) 'dumb' (or unset/empty) emacs cannot
run interactively. To fix this problem dtach (and emacs as it's child
process) is run with TERM=vt100 in case original TERM was unset, empty
or 'dumb'. This way there is a chance to run emacs tests with different
user terminals and potentially find problems there.
notmuch_json_show_sanitize replaced "filename" field values even in part
structures, where the value is predictable. Make it only normalize the
filename value if it is an absolute path (begins with slash), which is
true of the Maildir filenames that were intended to be normalized away.
test_expect_equal_json uses json.tool from the system Python. While
Python 2 wasn't picky about the encoding of stdin, Python 3 decodes
stdin strictly according to the environment. Since we set LC_ALL=C
for the tests, Python 3's json.tool was assuming stdin would be in
ASCII and aborting when it couldn't decode the UTF-8 characters from
some of the JSON tests. This patch sets the PYTHONIOENCODING
environment variable to utf-8 when invoking json.tool to override
Python's default encoding choice.
Previously, the test framework generated a variable name for each
external prereq as a poor man's associative array. Unfortunately,
prereqs names may not be legal variable names, leading to
unintelligible bash errors like
test_missing_external_prereq_emacsclient.emacs24_=t: command not found
Using proper associative arrays to track prereqs, in addition to being
much cleaner than generating variable names and using grep to
carefully construct unique string lists, removes restrictions on
prereq names.
Previously, if a test script aborted (e.g., because it passed too few
arguments to a test function), the test driver loop would simply
continue on to the next test script and the final results would
declare that everything passed (except that the test count would look
suspiciously low, but maybe you just misremembered how many tests
there were).
Now, if a test script exits with a non-zero status and did not produce
a final results file, we propagate that failure out of the driver loop
immediately.
To keep this simple, this patch removes the PID from the test-results
file name. This PID was inherited from the git test system and seems
unnecessary, since the file name already includes the name of the test
script and the test-results directory is created anew for each run.
And require that if TEST_EMACS is specified, so is TEST_EMACSCLIENT.
Previously, the test framework always used "emacsclient", even if the
Emacs in use was overridden by TEST_EMACS. This causes problems if
both Emacs 23 and Emacs 24 are installed, the Emacs 23 emacsclient is
the system default, but TEST_EMACS is set to emacs24. Specifically,
with an Emacs 24 server and an Emacs 23 client, emacs tests that run
very quickly may produce no output from emacsclient, causing the test
to fail.
The Emacs server uses a very simple line-oriented protocol in which
the client sends a request to evaluate an expression and the server
sends a request to print the result of evaluation. Prior to Emacs bzr
commit 107565 on March 11th, 2012 (released in Emacs 24.1), if
multiple commands were sent to the emacsclient between when it sent
the evaluation command and when it entered its receive loop, it would
only process the first response command, ignoring the rest of the
received buffer. This wasn't a problem with the Emacs 23 server
because it sent only the command to print the evaluation result.
However, the Emacs 24 server first sends an unprompted command
specifying the PID of the Emacs server, then processes the evaluation
request, then sends the command to print the result. If the
evaluation is fast enough, it can send both of these commands before
emacsclient enters the receive loop. Hence, if an Emacs 24 server is
used with an Emacs 23 emacsclient, it may miss the response printing
command, ultimately causing intermittent notmuch test failures.
The use of --background option (instead of shell '&') ensures that
smtp-dummy is listening its server socket until execution of shell
script can continue, thus the client will always have socket where
to connect.
smtp-dummy outputs smtp_dummy_pid variable in shell assignment format;
eval'ing that output makes that variable available for the shell.
As the smtp-dummy instance is no longer child process of the script
the SIGKILL signal sent to it will ensure it is going away in case
the mail sender fails to connect to smtp-dummy.
Obviates the need to create a 'NOTMUCH_NEW' clone which runs
'notmuch new --debug'. This will be used in a later patch.
Doesn't cause any issues for other tests.
Since $TEST_DIRECTORY is an absolute path, any filenames generated
with it will be complete paths. Only use the basename to generate
suffixes for filenames.
Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
Most Emacs tests end with a call to (test-output), which saves the
buffer to a filed called OUTPUT. Previously, if the test code failed
with an exception before this call, the test framework would then
compare against the OUTPUT file from the last Emacs test, resulting in
confusing diffs.
This requires one tweak to an emacs test that made two calls to
test_emacs and expected an OUTPUT file from the first call. We simply
reverse the order of the test_emacs calls.
Before the change, test_expect_equal_file() function treated the first
argument as "actual output file" and the second argument as "expected
output file". When the test fails, the files are copied for later
inspection. The first files was copied to "$testname.output" and the
second file to "$testname.expected". The argument order for
test_expect_equal_file() is often wrong which results in confusing
diff output and incorrectly named files.
The patch solves the issue by changing test_expect_equal_file() to
treat arguments just as two files, without any special properties
(like "actual" and "expected"). The file names for copying is now
based on the given file name: "$testname.$file1" and
"$testname.$file2". E.g. if test_expect_equal_file() is called with
"OUTPUT" and "EXPECTED", the copied files can be named
"emacs.1.OUTPUT" and "emacs.1.EXPECTED".
The down side of this approach is that diff argument order depends on
test_expect_equal_file() argument order. So sometimes we get diff
from expected to actual results, and sometimes the other way around.
But the files are always named correctly.
Previously, we used a variety of ad-hoc canonicalizations for JSON
output in the test suite, but were ultimately very sensitive to JSON
irrelevancies such as whitespace. This introduces a new test
comparison function, test_expect_equal_json, that first pretty-prints
*both* the actual and expected JSON and the compares the result.
The current implementation of this simply uses Python's json.tool to
perform pretty-printing (with a fallback to the identity function if
parsing fails). However, since the interface it introduces is
semantically high-level, we could swap in other mechanisms in the
future, such as another pretty-printer or something that does not
re-order object keys (if we decide that we care about that).
In general, this patch does not remove the existing ad-hoc
canonicalization because it does no harm. We do have to remove the
newline-after-comma rule from notmuch_json_show_sanitize and
filter_show_json because it results in invalid JSON that cannot be
pretty-printed.
Most of this patch simply replaces test_expect_equal and
test_expect_equal_file with test_expect_equal_json. It changes the
expected JSON in a few places where sanitizers had placed newlines
after commas inside strings.
Before the change, messages generated by generate_message() used "Test
message #N" for default subject where N is the generated messages
counter. Since message subject is commonly present in expected
results, there is a chance of breaking other tests when a new
generate_message() call is added. The patch changes default subject
value for generated messages to subtest name if it is available. If
subtest name is not available (i.e. message is generated during test
initialization), the old default value is used (in this case it is
fine to have the counter in the subject).
Another benefit of this change is a sane default value for subject in
generated messages, which would allow to simplify code like:
test_begin_subtest "test for a cool feature"
add_message [subject]="message for test for a cool feature"
Emails that are encoded differently than as ASCII or UTF-8 are not
indexed properly by notmuch. It is not possible to search for non-ASCII
words within those messages.
When tests are skipped due to missing prereqs, those prereqs are only
displayed when running with the `--verbose' option. This is essential
information when troubleshooting, so always send to stdout.
Add a new test function to allow simpler testing of emacs
functionality.
`test_emacs_expect_t' takes one argument - a lisp expression to
evaluate. The test passes if the expression returns `t', otherwise it
fails and the output is reported to the tester.
When checking for a running emacs, test_emacs evaluates the empty list
'()'. This returns 'nil' when emacs is running, which is then
prepended to the actual test result. Given that it is not part of the
actual test output the test harness can incorrectly report test
failure (or success).
Used emacs (whitespace-cleanup) function to "cleanup blank problems"
in test files where that could be done without breaking tests;
test/emacs was partially, and test/multipart was fully reverted.
When running the Emacs tests in verbose mode, only the first missing
prereq is reported because the `run_emacs' function is short-circuited
early:
#+begin_example
emacs: Testing emacs interface
missing prerequisites: [0] emacs(1)
skipping test: [0] Basic notmuch-hello view in emacs
SKIP [0] Basic notmuch-hello view in emacs
#+end_example
This can lead to situations reminiscent of "dependency hell", so instead
of returning based on each individual `test_require_external_prereq's exit
status, we now do so only after checking all the prereqs:
#+begin_example
emacs: Testing emacs interface
missing prerequisites: [0] dtach(1) emacs(1) emacsclient(1)
skipping test: [0] Basic notmuch-hello view in emacs
SKIP [0] Basic notmuch-hello view in emacs
#+end_example
Also added missing prereq for dtach(1).
It makes no sense to run test-lib.sh, so it makes no sense to give it
an interpreter. This is particularly annoying for Emacs users who
have executable-insert set, since the presence of the #! line will
cause Emacs to mark test-lib.sh executable when saving it, which will
in turn case the 'basic' test to fail.
January 5, 2001 was a Tuesday, not a Friday. Jameson fixed this exact
problem for the multipart test in ec2b0a98cc, but not for
generate_message itself.
As Jameson pointed out in ec2b0a98cc, if we want to test date parsing,
we should do it separately.