The TERM environment variable is set to 'dumb' when running tests, but
the original value of it is stored for echoing colors and running emacs
(somewhat interactively) in detached session. Emacs requires some
terminal control sequences to be available for interactive operation.
In case original TERM is (also) 'dumb' (or unset/empty) emacs cannot
run interactively. To fix this problem dtach (and emacs as it's child
process) is run with TERM=vt100 in case original TERM was unset, empty
or 'dumb'. This way there is a chance to run emacs tests with different
user terminals and potentially find problems there.
notmuch_json_show_sanitize replaced "filename" field values even in part
structures, where the value is predictable. Make it only normalize the
filename value if it is an absolute path (begins with slash), which is
true of the Maildir filenames that were intended to be normalized away.
test_expect_equal_json uses json.tool from the system Python. While
Python 2 wasn't picky about the encoding of stdin, Python 3 decodes
stdin strictly according to the environment. Since we set LC_ALL=C
for the tests, Python 3's json.tool was assuming stdin would be in
ASCII and aborting when it couldn't decode the UTF-8 characters from
some of the JSON tests. This patch sets the PYTHONIOENCODING
environment variable to utf-8 when invoking json.tool to override
Python's default encoding choice.
Previously, the test framework generated a variable name for each
external prereq as a poor man's associative array. Unfortunately,
prereqs names may not be legal variable names, leading to
unintelligible bash errors like
test_missing_external_prereq_emacsclient.emacs24_=t: command not found
Using proper associative arrays to track prereqs, in addition to being
much cleaner than generating variable names and using grep to
carefully construct unique string lists, removes restrictions on
prereq names.
Previously, if a test script aborted (e.g., because it passed too few
arguments to a test function), the test driver loop would simply
continue on to the next test script and the final results would
declare that everything passed (except that the test count would look
suspiciously low, but maybe you just misremembered how many tests
there were).
Now, if a test script exits with a non-zero status and did not produce
a final results file, we propagate that failure out of the driver loop
immediately.
To keep this simple, this patch removes the PID from the test-results
file name. This PID was inherited from the git test system and seems
unnecessary, since the file name already includes the name of the test
script and the test-results directory is created anew for each run.
And require that if TEST_EMACS is specified, so is TEST_EMACSCLIENT.
Previously, the test framework always used "emacsclient", even if the
Emacs in use was overridden by TEST_EMACS. This causes problems if
both Emacs 23 and Emacs 24 are installed, the Emacs 23 emacsclient is
the system default, but TEST_EMACS is set to emacs24. Specifically,
with an Emacs 24 server and an Emacs 23 client, emacs tests that run
very quickly may produce no output from emacsclient, causing the test
to fail.
The Emacs server uses a very simple line-oriented protocol in which
the client sends a request to evaluate an expression and the server
sends a request to print the result of evaluation. Prior to Emacs bzr
commit 107565 on March 11th, 2012 (released in Emacs 24.1), if
multiple commands were sent to the emacsclient between when it sent
the evaluation command and when it entered its receive loop, it would
only process the first response command, ignoring the rest of the
received buffer. This wasn't a problem with the Emacs 23 server
because it sent only the command to print the evaluation result.
However, the Emacs 24 server first sends an unprompted command
specifying the PID of the Emacs server, then processes the evaluation
request, then sends the command to print the result. If the
evaluation is fast enough, it can send both of these commands before
emacsclient enters the receive loop. Hence, if an Emacs 24 server is
used with an Emacs 23 emacsclient, it may miss the response printing
command, ultimately causing intermittent notmuch test failures.
The use of --background option (instead of shell '&') ensures that
smtp-dummy is listening its server socket until execution of shell
script can continue, thus the client will always have socket where
to connect.
smtp-dummy outputs smtp_dummy_pid variable in shell assignment format;
eval'ing that output makes that variable available for the shell.
As the smtp-dummy instance is no longer child process of the script
the SIGKILL signal sent to it will ensure it is going away in case
the mail sender fails to connect to smtp-dummy.
Obviates the need to create a 'NOTMUCH_NEW' clone which runs
'notmuch new --debug'. This will be used in a later patch.
Doesn't cause any issues for other tests.
Since $TEST_DIRECTORY is an absolute path, any filenames generated
with it will be complete paths. Only use the basename to generate
suffixes for filenames.
Signed-off-by: Ethan Glasser-Camp <ethan@betacantrips.com>
Most Emacs tests end with a call to (test-output), which saves the
buffer to a filed called OUTPUT. Previously, if the test code failed
with an exception before this call, the test framework would then
compare against the OUTPUT file from the last Emacs test, resulting in
confusing diffs.
This requires one tweak to an emacs test that made two calls to
test_emacs and expected an OUTPUT file from the first call. We simply
reverse the order of the test_emacs calls.
Before the change, test_expect_equal_file() function treated the first
argument as "actual output file" and the second argument as "expected
output file". When the test fails, the files are copied for later
inspection. The first files was copied to "$testname.output" and the
second file to "$testname.expected". The argument order for
test_expect_equal_file() is often wrong which results in confusing
diff output and incorrectly named files.
The patch solves the issue by changing test_expect_equal_file() to
treat arguments just as two files, without any special properties
(like "actual" and "expected"). The file names for copying is now
based on the given file name: "$testname.$file1" and
"$testname.$file2". E.g. if test_expect_equal_file() is called with
"OUTPUT" and "EXPECTED", the copied files can be named
"emacs.1.OUTPUT" and "emacs.1.EXPECTED".
The down side of this approach is that diff argument order depends on
test_expect_equal_file() argument order. So sometimes we get diff
from expected to actual results, and sometimes the other way around.
But the files are always named correctly.
Previously, we used a variety of ad-hoc canonicalizations for JSON
output in the test suite, but were ultimately very sensitive to JSON
irrelevancies such as whitespace. This introduces a new test
comparison function, test_expect_equal_json, that first pretty-prints
*both* the actual and expected JSON and the compares the result.
The current implementation of this simply uses Python's json.tool to
perform pretty-printing (with a fallback to the identity function if
parsing fails). However, since the interface it introduces is
semantically high-level, we could swap in other mechanisms in the
future, such as another pretty-printer or something that does not
re-order object keys (if we decide that we care about that).
In general, this patch does not remove the existing ad-hoc
canonicalization because it does no harm. We do have to remove the
newline-after-comma rule from notmuch_json_show_sanitize and
filter_show_json because it results in invalid JSON that cannot be
pretty-printed.
Most of this patch simply replaces test_expect_equal and
test_expect_equal_file with test_expect_equal_json. It changes the
expected JSON in a few places where sanitizers had placed newlines
after commas inside strings.
Before the change, messages generated by generate_message() used "Test
message #N" for default subject where N is the generated messages
counter. Since message subject is commonly present in expected
results, there is a chance of breaking other tests when a new
generate_message() call is added. The patch changes default subject
value for generated messages to subtest name if it is available. If
subtest name is not available (i.e. message is generated during test
initialization), the old default value is used (in this case it is
fine to have the counter in the subject).
Another benefit of this change is a sane default value for subject in
generated messages, which would allow to simplify code like:
test_begin_subtest "test for a cool feature"
add_message [subject]="message for test for a cool feature"
Emails that are encoded differently than as ASCII or UTF-8 are not
indexed properly by notmuch. It is not possible to search for non-ASCII
words within those messages.
When tests are skipped due to missing prereqs, those prereqs are only
displayed when running with the `--verbose' option. This is essential
information when troubleshooting, so always send to stdout.
Add a new test function to allow simpler testing of emacs
functionality.
`test_emacs_expect_t' takes one argument - a lisp expression to
evaluate. The test passes if the expression returns `t', otherwise it
fails and the output is reported to the tester.
When checking for a running emacs, test_emacs evaluates the empty list
'()'. This returns 'nil' when emacs is running, which is then
prepended to the actual test result. Given that it is not part of the
actual test output the test harness can incorrectly report test
failure (or success).
Used emacs (whitespace-cleanup) function to "cleanup blank problems"
in test files where that could be done without breaking tests;
test/emacs was partially, and test/multipart was fully reverted.
When running the Emacs tests in verbose mode, only the first missing
prereq is reported because the `run_emacs' function is short-circuited
early:
#+begin_example
emacs: Testing emacs interface
missing prerequisites: [0] emacs(1)
skipping test: [0] Basic notmuch-hello view in emacs
SKIP [0] Basic notmuch-hello view in emacs
#+end_example
This can lead to situations reminiscent of "dependency hell", so instead
of returning based on each individual `test_require_external_prereq's exit
status, we now do so only after checking all the prereqs:
#+begin_example
emacs: Testing emacs interface
missing prerequisites: [0] dtach(1) emacs(1) emacsclient(1)
skipping test: [0] Basic notmuch-hello view in emacs
SKIP [0] Basic notmuch-hello view in emacs
#+end_example
Also added missing prereq for dtach(1).
It makes no sense to run test-lib.sh, so it makes no sense to give it
an interpreter. This is particularly annoying for Emacs users who
have executable-insert set, since the presence of the #! line will
cause Emacs to mark test-lib.sh executable when saving it, which will
in turn case the 'basic' test to fail.
January 5, 2001 was a Tuesday, not a Friday. Jameson fixed this exact
problem for the multipart test in ec2b0a98cc, but not for
generate_message itself.
As Jameson pointed out in ec2b0a98cc, if we want to test date parsing,
we should do it separately.
As we start to pay more attention to emacs24, it helps to be able to
select a different version of emacs to run the tests with to verify
version specific bugs.
A separate variable TEST_EMACS is needed to avoid being overwritten by the
make variable EMACS in Makefile.config
For what it's worth, the value of emacs is chosen at the time
tmp.emacs/run_emacs is created, so is fixed for all subtests.
The idea is that $test_count could be used in tests to label
intermediate files. The output enabled by this patch (and --debug)
helps figure out which OUTPUT.nn file belongs to which test in case
several subtests write to OUTPUT.$test_count
Some distros (Arch Linux) ship Python as python2 and Python 3 as python.
Checking for python2 is necessary for the Python tests to work on these
platforms.
The new test_python() function makes writing Python tests a little easier:
- it sets the environment variables as needed
- it redirects stdout to the OUTPUT file (like test_emacs()).
This commit also declares python as an external prereq.
The stdout redirection is required to avoid trouble when running commands like
"python 'script' | sort > OUTPUT": in such a case, any error due to a missing
external prereq would be "swallowed" by sort, resulting to a failed test instead
of a skipped one.
The patch adds two auxiliary functions and a variable:
notmuch_counter_reset
$notmuch_counter_command
notmuch_counter_value
They allow to count how many times notmuch binary is called.
notmuch_counter_reset() function generates a script that counts how
many times it is called and resets the counter to zero. The function
sets $notmuch_counter_command variable to the path to the generated
script that should be called instead of notmuch to do the counting.
The notmuch_counter_value() function returns the current counter
value.
The fake missing binary functions check if the binary has already be
added to the diagnostic message to avoid duplicates. Unfortunately,
this check was buggy because the message string does not have the
trailing space.
test_missing_external_prereq_${binary}_ variable indicates that the
binary is missing. It must be set in test_declare_external_prereq()
outside of the fake $binary() function.
Some tests (e.g. crypto) do a common initialization required for all
subtests. The patch adds a check for missing external dependencies
during this initialization. If any prerequisites are missing, all
subtests are skipped.
The check is run on the first call of test_reset_state_ function, so
no changes for the tests are needed.
There is existing support for general prerequisites in the test suite.
But it is not very convenient to use: every test case has to keep
track for it's dependencies and they have to be explicitly listed.
The patch aims to add better support for a particular type of external
dependencies: external executables. The main idea is to replace
missing external binaries with shell functions that have the same
name. These functions always fail and keep track of missing
dependencies for a subtest. The result reporting functions later can
check that an external binaries are missing and correctly report SKIP
result instead of FAIL. The primary benefit is that the test cases do
not need to declare their dependencies or be changed in any way.
If mail sending from emacs fails before it has chance to connect
to the smtp-dummy mail server, the opportunistic QUIT message
sending makes smtp-dummy to exit.
Due to 108-character limit in unix domain socket path this change
is required; it is more probable that length of ${TMPDIR:-/tmp} is
shorter than length of path to the current directory of notmuch test
source directory. One can expect to create reasonable-length unix
domain sockets wherever $TMPDIR points to.
The TEST_TMPDIR if first needed to hold dtach's socket (due
to 108-character limit in socket file names). Later it can be
used to hold other temporary files; directory deleted at exit.
Do not redirect test_emacs stderr to /dev/null. Test_emacs uses
emacsclient(1) now and it does not print unwanted messages (like
those from `message') to stderr. But it does print useful
errors, e.g. when emacs server connection fails, given expression
is not valid or undefined function is called.
Set SCREENRC and SYSSCREENRC environment variables to "/dev/null"
as suggested by Jim Paris to avoid potential problems with
screen(1) configuration files.
Before the change, emacs run in daemon mode without any visible
buffers. Turns out that this affects emacs behavior in some
cases. In particular, `window-end' function returns `point-max'
instead of the last visible position. That makes it hard or
impossible to implement some tests. The patch runs emacs in a
detached screen(1) session. So that it works exactly as if it
has a visible window.
Note: screen terminates when emacs exits. So the patch does not
introduce new "running processes left behind" issues.
- explain test_expect_equal_file
- remove mention of test_expect_failure, since that function was removed.
Based on id:"1317317811-29540-1-git-send-email-thomas@schwinge.name"
This means that test_subtest_known_broken needs to be called before
every known broken subtest, which is no different than what is
documented for the test_begin_subtest case.
The assumption is that every test ends up calling either skipping,
calling test_ok_ or test_failure_ and and the latter in turn delegate
to the known_broken versions in the case where
test_subtest_known_broken_ is set.
There is existing support for broken tests. But it is not convenient
to use. The primary issue is that we have to maintain a set of
test_expect_*_failure functions which are equivalent to the normal
test_expect_* counterparts except for what functions are called for
result reporting. The patch adds test_subtest_known_broken function
which marks a subset as broken, making the normal test_expect_*
functions behave as test_expect_*_failure. All test_expect_*_failure
functions are removed. Test_known_broken_failure_ is changed to
format details the same way as test_failure_ does.
Another benefit of this change is that the diff when a broken test is
fixed would be small and nice.
Documentation is updated accordingly.
Before the change, every Emacs test ran in a separate Emacs
instance. Starting Emacs many times wastes considerable time and
it gets worse as the test suite grows. The patch solves this by
using a single Emacs server and emacsclient(1) to run multiple
tests. Emacs server is started on the first test_emacs call and
stopped when test_done is called. We take care not to leave
orphan Emacs processes behind when test is terminated by whatever
reason: Emacs server runs a watchdog that periodically checks
that the test is still running.
Some tests need to provide user input. Before the change, this
was done using echo(1) to Emacs stdin. This no longer works and
instead `standard-input' variable is set accordingly to make
`read' return the appropriate string.
Without this, mail messages delivered by emacs_deliver_message might
not be seen by the next invocation of "notmuch new", (which can lead
to test-suite failures if emacs_deliver_message is fast enough).
Change add_email_corpus, emacs_deliver_message and tests to use
$TEST_DIRECTORY instead of '..'.
This improves the behavior of the usage of --root=<dir>, as the
assumption of what '..' means will usually be incorrect.
Document -root option in README and update valgrind to work with
-root.
Instead of generating auxiliary run_emacs script every time
test_emacs is run, do it once in the beginning of the test.
Also, use absolute paths in the script to make it more robust.
Using `setq' for setting variables in Emacs tests affect other
tests that may run in the same Emacs environment. Currently it
works because each test is run in a separate Emacs instance. But
in the future multiple tests will run in a single Emacs instance.
The patch changes all variables to use `let', so the scope of the
change is limited to a single test.
Most test_emacs calls have long arguments that consist of many
expressions. Putting them on a single line makes it hard to read
and produces poor diff when they are changed. The patch puts
every expression in test_emacs calls on a separate line.
Various typo fixes in comments within the source code.
Signed-off-by: Pieter Praet <pieter@praet.org>
Edited-by: Carl Worth <cworth@cworth.org> Restricted to just
source-code comments, (and fixed fix of "descriptios" to "descriptors"
rather than "descriptions").
In the master branch in test/emacs two tests access the build users home
directory, so does emacs_deliver_message in the crypto branch.
The tests should not touch the build user's home directory. The patch
creates a directory in the temporary test directory and sets home
accordingly.
In case of a non-existent home directory, the tests are failing without
this patch.
Signed-off-by: Jameson Graef Rollins <jrollins@finestructure.net>
Before the change, test_expect_equal_file moved files it compared
in case of failure. The patch changes it to copy the files
instead. This allows testing non-temporary files which are
stored in git.
Note: the change should not result in new temporary files left
after the tests. Test_expect_equal_file used to move files only
on failure, so callers had to cleanup them anyway.
This adds a new "crypto" test script to the test suite to test
PGP/MIME signature verification and message decryption. Included here
is a test GNUPGHOME with a test secret key (passwordless), and test
for:
* signing/verification
* signing/verification with full owner trust
* verification with signer key unavailable
* encryption/decryption
* decryption failure with missing key
* encryption/decryption + signing/verfifying
* reply to encrypted message
* verification of signature from revoked key
These tests are not expected to pass now, but will as crypto
functionality is included.
We need to be able to test for the presence of a newline at the end of
output. There's no good way to capture trailing newlines in bash, so
redirecting output to a file is the next best thing. This new
function should be used when testing for output that is expected to
have trailing newlines.
The next commit will demonstrate the use of this.
Change #!/bin/bash at start of tests to "#!/usr/bin/env bash". That way
systems running on bash < 4 can prepend bash >= 4 to path before
running the tests.
The patch adds test-lib.el file for Emacs tests auxiliary stuff.
Currently, it implements two functions: `visible-buffer-string'
and `visible-buffer-substring'. These are similar to standard
counterparts without "visible-" prefix but exclude invisible
text. The functions are not used anywhere at the moment but
should be useful for testing hiding/showing in the Emacs
interface.
Edited-by: Carl Worth <cworth@cworth.org> Fixed "basic" test to ignore
new test-lib.el file.
Previously, this directory was only preserved for failing tests. But
it's important to be able to easily debug known-broken tests, so
preserve the actual vs. expected output for those as well.
When test_begin_subtest is not followed by corresponding test_expect_equal,
the output of the rest of the test script is errornously suppressed. Add
code to detect these bugs in test scripts.
The newline was removed from say_color in commit 222926ab to allow
printing test status in the beginning of the line. Error messages are
never followed by other text so we add the newline to error function.
Git-style tests (test_expect_success etc.) suppress stdout and stderr
unless -v is given. Notmuch-style tests (created by test_begin_subtest
and test_expect_equal) do not have this behavior so implement it the
same.
Additionally, for both test styles, the test-lib.sh is changed so that
the content of suppressed stdout and stderr is shown in case of failed
test.
Finally a test for this functionality is added to basic tests.
We have test names like maildir-sync now, so it's cleaner if the
temporary files created are named things like maildir-sync-10.out
rather than maildir-10.out. Presumably the extra stripping here came
from naming conventions in git's test suite.
This was too rude of a thing to do and could easily introduce
problems, (as reported by Rob Browning whose environment required some
HOME-specific things for shell startup).
Instead, implement more focused changes to ensure that particular file
in $HOME don't cause problems. Specifically, we fix known problems
with ~/.signature and ~/.mailrc here.
After any emacs test failure, the tmp.emacs directory will have this
run_emacs script in it which the user can use to run emacs within the
test suite environment, (pointing at the test suite's notmuch
database, using the local notmuch command-line program, and the local
notmuch emacs lisp code).
The bash code in the test suite is using associative arrays which were
only added to bash as of release 4.0.
If the test suite is run with an older bash, we now immediately error
out and explain the situation, (instead of emitting confusing error
messages and failing dozens of tests, which is what happened before
this change).
We set the HOME environment variable to the test directory to avoid
the tests relying on any configuration files from the test author's
own home directory, (such as ${HOME}/.emacs or similar).
We simulate the act of selecting the "inbox" saved search from
notmuch-hello and the act of selecting a desired thread from the
notmuch-search results.
The test for the navigation of notmuch-hello is currently marked as
BROKEN since its output is in the opposite order compared to the
'(notmuch-search "tag:inbox")' test. This question of ordering is a
currently open issue on the notmuch mailing list, so we'll let the
test suite reflect that for now.
Finally, this commit also abstracts some common emacs lisp code,
(waiting for the current buffer's process to complete), into a new
notmuch-test-wait function that is made available to anything calling
test_emacs.
This should be quite handy for doing automated testing of the
emacs-based functionality in notmuch. This function invokes emacs with
the necessary command-line arguments, (to run in batch mode with no
local initialization, to load the notmuch code from the source
directory, and to ensure an 80-column width).
When the NOTMUCH variable was originally invented it was used as an
explicit path to the notmuch binary being tested. Today, the test
suite sets the PATH variable instead, so the NOTMUCH variable always
has a value of simply "notmuch".
We simplifying that by using the constant value rather than the
continual variable reference.
Now that we can usefully pass section names via the NOTMUCH_SKIP_TESTS
environment variable, it's useful to actually print those names out
for the user. Then, since we're now printing these names, let's use
nicer names, (not excessively long but also not using abbreviations
like "msg").
In order for --valgrind to be useful, we drop noisy additional output of
all of the commands being executed in verbose mode. This makes --verbose
alone quite useless, so we don't document it any more.
Also, add a zlib valgrind suppression that was showing up frequently in the
test suite.
By scanning test-lib.sh for occurrences of "git" or "GIT", I found
that most of those are internal things, (like the GIT_TEST_TEE_STARTED
variable). But GIT_SKIP_TESTS is part of the user-interface to the
test suite, so we rename it to reference notmuch rather than git.
Also, the GIT_TRACE warning is git-specific, so we drop that as well.
The original git test suite works by concatenating many commands into
a very long string (each separated by &&). This is painful to work
with since it prevents the editor from helping by parsing the shell
script, indenting, colorizing, etc.
Instead, we switch this back to something like the original notmuch
test suite, and add two new functions to test-lib.sh
(test_begin_subtest and test_expect_equal) to support these.
This also fixes the test suite to once again display the diff when a
test fails to generate the expected input.