Skip to content

t: add lint-style.pl and convert grep to test_grep#2135

Open
mmontalbo wants to merge 6 commits into
gitgitgadget:masterfrom
mmontalbo:mm/test-grep-docs
Open

t: add lint-style.pl and convert grep to test_grep#2135
mmontalbo wants to merge 6 commits into
gitgitgadget:masterfrom
mmontalbo:mm/test-grep-docs

Conversation

@mmontalbo
Copy link
Copy Markdown

@mmontalbo mmontalbo commented Jun 2, 2026

The test suite has a test_grep wrapper that prints file contents on
assertion failure, making debugging easier. Many tests still use
bare 'grep' for assertions, which silently swallows context on
failure.

This series adds a lint tool (lint-style.pl) to mechanically detect
and convert these, then applies it across the test suite.

The tool reuses the shared shell parser (t/lib-shell-parser/)
rather than reimplementing shell parsing. This gives us proper
handling of heredocs, $(...), pipes, quoting, case patterns, and
control flow structures for free. The only new parsing logic is
the grep assertion classifier, which works on the already-parsed
token stream.

The approach:

  • Positive matching: is_grep_assertion() examines parsed tokens
    and classifies whether a grep looks like an assertion (PATTERN +
    FILE) vs a filter (piped, redirected, inside $(), with -c/-l/-L).
    Only assertions are converted.

  • Missing file detection: greps at assertion level with no file
    argument (reading from empty stdin) are flagged as likely bugs.
    Three pre-existing instances were found and fixed in patch 5.

  • Auto-fix: --fix applies the conversions mechanically. The -q
    flag is dropped (test_grep checks status natively), and
    negation is rewritten from '! grep' to 'test_grep !'.

Structure:

1/6 t/README: document test_grep helper
2/6 t: extract chainlint's parser into shared module
3/6 t: fix Lexer line count for $() inside double-quoted strings
4/6 t: add lint-style.pl with test_grep negation rule
5/6 t: fix grep assertions missing file arguments
6/6 t: lint and convert grep assertions to test_grep

Patch 2 extracts chainlint.pl's Lexer, ShellParser, and
ScriptParser into t/lib-shell-parser/Parser.pm. ScriptParser's
check_test() becomes a no-op in the module; chainlint.pl defines
ChainlintParser (a ScriptParser subclass that runs TestParser for
&&-chain detection), and lint-style.pl defines LintParser (a
ScriptParser subclass that runs grep lint rules).

Patch 3 fixes a pre-existing Lexer bug where multi-line $()
substitutions inside double-quoted strings caused line numbers to
drift. chainlint is unaffected (it uses byte offsets), but
lint-style.pl needs accurate token line numbers to locate and fix
specific lines in the original file.

Patch 4 introduces the lint-style.pl framework (LintParser
subclass, Makefile targets, fixture infrastructure) with a small,
complete rule (! test_grep -> test_grep !) so reviewers can see
the machinery in action before the bulk conversion.

Patch 5 fixes three test bugs where grep assertions were missing
their file arguments, causing them to pass vacuously (all three
pass with the corrected arguments). These were independently
discovered by the missing-file detection rule introduced in
patch 6.

Patch 6 adds the main rule, its fixtures, and the mechanical
conversion of ~2800 assertions across ~340 files, including sourced
test fragments in t/t5411/ and heredoc test bodies in t/t5564/.

To verify the conversion (patch 6 adds both the rule and the
mechanical conversion in the same commit, so apply the full
series and re-run --fix to confirm it produces no further
changes):

git checkout &&
perl t/lint-style.pl --fix t/t*.sh t/test-lib*.sh t/lib-.sh
t/
-tests.sh t/perf/.sh t/t5411/.sh

As an independent completeness check:

grep -rn '^\s*!\sgrep\s' t/t*.sh t/lib-.sh t/-tests.sh
t/test-lib*.sh t/perf/*.sh |
grep -v test_grep | grep -v lint-ok | grep -v '#.*grep'

Known limitations:

  • One grep in t/t1400 asserts against .git/packed-refs which
    does not exist on the reftable backend; suppressed with
    '# lint-ok'.

  • Two greps in t/t3901 inside case branches that inherit piped
    stdin from two lines above are suppressed with '# lint-ok'.

  • One grep in t/t6437 uses glob expansion (grep -q content *)
    which breaks test_grep's file check; suppressed with '# lint-ok'.

  • One grep in t/t7527 captures $? for later use rather than
    asserting inline; suppressed with '# lint-ok'.

@mmontalbo mmontalbo force-pushed the mm/test-grep-docs branch 2 times, most recently from 3258ee2 to cfa4313 Compare June 3, 2026 02:33
test_grep is a wrapper around grep for test assertions that prints
the file contents on failure for easier debugging.  It also accepts
'!' as its first argument for negation, which preserves the
diagnostic output that '! test_grep' would suppress.

Despite being widely used (and the preferred replacement for bare
grep in assertions), test_grep has no entry in t/README alongside
the other documented helpers like test_cmp and test_line_count.
Add one.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
@mmontalbo mmontalbo force-pushed the mm/test-grep-docs branch 27 times, most recently from 1f8f6e1 to 5d5366a Compare June 3, 2026 11:11
@mmontalbo mmontalbo force-pushed the mm/test-grep-docs branch 12 times, most recently from 3509bca to a878d13 Compare June 4, 2026 01:24
mmontalbo added 5 commits June 3, 2026 20:05
Move the Lexer, ShellParser, and ScriptParser packages from
chainlint.pl into t/lib-shell-parser.pl so they can be reused by
other tools.  ScriptParser's check_test() is a no-op in the shared
module; callers subclass ScriptParser and override it.

chainlint.pl defines TestParser (&&-chain detection) and
ChainlintParser (a ScriptParser subclass whose check_test runs
TestParser and formats the results).  The shared module is loaded
via do() for portability with minimal Perl installations.

A subsequent commit introduces lint-style.pl which needs the same
shell parser to properly tokenize test scripts.  Sharing the parser
avoids reimplementing heredoc handling, $(...) nesting, pipe
tracking, quoting, and test body extraction.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
scan_dqstring's post-loop newline counter re-counts newlines that
were already counted during recursive parsing of $() bodies.  This
happens because scan_dollar's returned text can contain newlines
(from token text of multi-line strings and from \n command separator
tokens), and the catch-all counter at the end of scan_dqstring
counts all of them again.

Fix this by counting newlines inline as non-special characters are
consumed, and removing the post-loop catch-all.  Each newline is
now counted exactly once: literal newlines at the inline match,
line splices at the \<newline> handler, and $() newlines by
scan_token during the recursive parse.

This does not affect chainlint's output because chainlint annotates
the original body text using byte offsets, not token line numbers.
It does matter for tools like lint-style.pl (introduced in a
subsequent commit) that use token line numbers to locate and fix
specific lines in the original file.

Add check-shell-parser.pl to verify that the Lexer reports correct
line numbers after multi-line $() in double-quoted strings.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Add a mechanical lint checker for test scripts, similar in spirit to
check-non-portable-shell.pl but focused on test conventions rather
than portability.

The tool defines LintParser, a subclass of ScriptParser (from the
shared lib-shell-parser/lib-perl.pl module).  ScriptParser's
parse_cmd() finds test_expect_success blocks and calls check_test()
for each body; LintParser overrides check_test() to run lint rules
on the parsed commands.  A "# lint-ok" comment suppresses all
checks for intentional style violations.

The first rule detects '! test_grep' and replaces it with
'test_grep !'.  Shell-level negation suppresses the diagnostic
output that test_grep prints on failure; the built-in negation
preserves it.

Three violations inside test bodies are converted via --fix.  One
additional violation in a helper function outside test_expect_success
(t7900's test_geometric_repack_needed) is converted manually, since
the parser only processes test bodies.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Three grep assertions were missing their file arguments, causing
them to read from empty stdin instead of the intended file:

- t2402: '! grep ...' should read from 'out', matching the
  grep on the preceding line.
- t7507: the closing quote is in the wrong place, making the
  entire 'diff --git actual' a single pattern with no file
  argument instead of pattern 'diff --git' and file 'actual'.
- t7700: '! grep ...' should read from 'packlist', matching
  the redirect on the preceding line.

Without file arguments these greps always succeed (empty stdin
matches nothing), so the assertions were not actually checking
anything.  All three tests pass with the corrected file arguments,
confirming the intended behavior is sound.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
Extend lint-style.pl with a rule that detects bare 'grep' used as a
test assertion and converts it to test_grep.  test_grep prints the
file contents on failure, making test debugging significantly easier.

parse_commands() is extended to split at shell structural tokens
({, }, (, ), |) and keywords (if, then, for, etc.), and each
command gains a token_pos index so that rules can scan backward and
forward in the token stream for context.

Three new functions implement the grep-assertion rule:

 - is_filter_context() scans the surrounding tokens for pipes,
   control-flow keywords (if/elif/while/until), for-in value
   lists, and brace groups with output redirects.

 - is_grep_assertion() classifies a grep command: convertible
   assertion (pattern and file present), filter (not an assertion),
   or missing file argument (flagged as a likely bug).

 - check_bare_grep() ties them together and calls
   report_violation() with the appropriate fix.

The --fix mode handles:
 - Replacing 'grep' with 'test_grep'
 - Moving negation from '! grep' to 'test_grep !'
 - Stripping the -q flag (test_grep inherently checks match status)

Five files require '# lint-ok' annotations for intentional grep
usage that cannot be mechanically converted: t1400 (packed-refs
may not exist on reftable), t3901 (piped stdin via case block),
t6437 (glob argument breaks test_grep's test -f check), t7450
(file may not exist after failed MINGW clone), and t7527 ($?
capture on the next line).

The test-lint-style scope is extended to include sourced test
fragments in subdirectories (t5411/*.sh and similar) via a new
TSOURCED variable.

Run '--fix' to convert all ~2800 grep assertions across ~340 files
in the test suite.  test-lib-functions.sh and lib-rebase.sh are
excluded from linting since they implement test infrastructure
rather than test assertions.

Signed-off-by: Michael Montalbo <mmontalbo@gmail.com>
@mmontalbo mmontalbo force-pushed the mm/test-grep-docs branch from a878d13 to 92fdf29 Compare June 4, 2026 03:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant