Batch phpcs calls into a single invocation by sirbrillig · Pull Request #110 · sirbrillig/phpcs-changed

sirbrillig · 2026-04-20T23:23:33Z

Fixes #115

Summary

Replaces the 2N per-file phpcs launches in runGitWorkflow and runSvnWorkflow with a single phpcs invocation that scans all file versions at once.
All modified and unmodified file contents are written to a temp directory (/tmp/phpcs-changed-XXXX/new/… and /tmp/phpcs-changed-XXXX/old/…) and phpcs runs once across all of them, eliminating startup overhead that previously scaled linearly with the number of changed files.

This is a different approach to the two-pass scan idea in #114

Approach

Each workflow now has three phases:

Pre-batch — check caches, determine which files are new (no unmodified version), collect the lists of files that need scanning.
Batch — single getPhpcsOutputForGitBatch / getPhpcsOutputForSvnBatch call writes all content to a temp dir and runs phpcs once; results are cached individually per file.
Filter — parse results and compute new messages per file as before.

Trade-off

The batch approach always scans the unmodified version of uncached files even when the modified version has no messages. Previously the per-file path skipped the unmodified scan in that case. The trade-off is intentional: one saved phpcs startup cost (≥250 ms) outweighs the cost of a few extra file sniffs.

Performance

Measured with benchmark.sh on 10 staged PHP files (PSR2 standard, 10 runs):

	Mean	Relative
trunk: 20 phpcs calls	5.74 s ± 0.11 s	1.00×
batch (this branch): 1 phpcs call	2.44 s ± 0.05 s	2.36× faster

Replace the 2N per-file phpcs launches in runGitWorkflow and runSvnWorkflow with a single batch invocation. All modified and unmodified file contents are written to a temp directory and phpcs is run once on all of them, eliminating the startup overhead cost that scaled linearly with the number of changed files. - Add getPhpcsOutputForGitBatch / getPhpcsOutputForSvnBatch to ShellOperator - Implement batch methods in UnixShell using a temp dir layout (new/ and old/) - Override batch methods in TestShell to delegate to existing per-file mocks - Rewrite runGitWorkflow and runSvnWorkflow with pre-batch/batch/filter phases - Add and update tests for the new batch behavior

…ied files runBatchPhpcs keyed its results by original file path, but every file is scanned as both a modified (new/) and an unmodified (old/) temp file under the same original path, so one silently overwrote the other. Both 'new' and 'old' then resolved to the same phpcs output, defeating the new-vs-old filtering that the tool exists to perform. Key batch results by temp path instead, and map each side back to its original files through its own temp-to-original map.

The workflow tests previously overrode getPhpcsOutputFor{Git,Svn}Batch in the test shells, looping over the per-file phpcs methods. That left the real ShellRunner batch path (temp-file writing, single combined phpcs invocation, and per-file JSON splitting) completely untested -- which is how the new/old collision bug went unnoticed. The test shells now let the real batch path run and intercept only the final combined phpcs invocation, synthesizing its output from the temp files the batch writes (buildBatchPhpcsOutput). Command registrations drop the obsolete '| phpcs' per-file suffix, and the shells prefer an exact command match so a file-contents command no longer shadows that same command piped to git hash-object. With the real batch covered, the now-unused per-file methods (getPhpcsOutputOf{Modified,Unmodified}{Git,Svn}File) and their helpers (getPhpcsCommand, processPhpcsOutput) are removed from ShellOperator, UnixShell, WindowsShell, and ShellRunner.

…ports The test shells now record the intercepted batch phpcs invocation and expose wasCommandCalledContaining(), so the standard-configured workflow tests assert that --standard reaches the phpcs command (coverage previously provided by the per-file phpcs registrations). Also removes use statements in TestShell that became unused when the per-file batch overrides were deleted.

wasCommandCalledContaining matched the raw recorded command, but on Windows escapeshellarg() emits double quotes, so --standard="standard" never matched the single-quoted needle. Normalize double quotes to single quotes before comparing, matching how the test shells already normalize commands.

The batch path silently swallowed phpcs processing errors (eg: an uninstalled standard). phpcs writes a non-JSON error to stdout in that case, and runBatchPhpcs returned an empty result, which mapped each file to empty output and produced a bogus phantom STDIN success with exit 0. Treat any phpcs output that is not decodable JSON with a 'files' key as a failure and throw a ShellException, restoring the pre-batch behavior where a processing error surfaces and exits non-zero.

The batch path materialized each file's content by capturing the content command (git show / cat) through the line-oriented executeCommand(), which rejoins exec() output and unconditionally appends a trailing newline (and collapses runs of trailing blank lines). The temp file phpcs scanned therefore always ended in \n, so the batch path missed PSR2.Files.EndFileNewline.NoneFound (and .TooMany) for files that genuinely lack a trailing newline. Add ShellPlatform::writeCommandOutputToFile(), which redirects the content command's stdout straight to the temp file, preserving exact bytes. writeTempFile() now uses it and throws when the content command fails, so a fetch failure surfaces instead of producing a silently-empty temp file. The mocked TestShell cannot reproduce the exec() byte corruption, so add a UnixShellTest exercising the real shell to lock in the byte-preserving contract.

sirbrillig · 2026-06-20T21:42:13Z

Working on making sure this branch's output is identical to trunk.

testFullGitWorkflowForEmptyNewFile mocked git show returning exit 1 with phpcs's 'You must supply at least one file' text, which described the old per-file path that piped empty stdin into phpcs. The batch path instead writes the empty content to a temp file and passes it as an argument, so git show succeeds (exit 0) and phpcs emits an Internal.NoCodeFound warning. Because the file is new, that warning is reported as a new message. Update the test to register that real behavior and assert the warning is reported.

testFullSvnWorkflowForEmptyNewFile mocked cat returning phpcs's 'You must supply at least one file' text, describing the old per-file path that piped empty stdin into phpcs. The batch path writes the empty content to a temp file and passes it as an argument, so cat succeeds and phpcs emits an Internal.NoCodeFound warning that is reported as a new message for the new file. Mirror the git workflow test fix.

Inlining one shell argument per temp file overflowed the OS ARG_MAX limit once a batch reached thousands of files, failing with a cryptic "Argument list too long". Write all temp paths to a phpcs --file-list file in the batch temp dir instead, keeping the command line a constant size regardless of file count. This also handles paths with spaces cleanly. Update the test mock shell to read the file list and add a regression guard asserting files are never inlined as phpcs arguments.

sirbrillig force-pushed the batch-phpcs-invocations branch from a75e2c2 to 129b081 Compare April 20, 2026 23:26

sirbrillig force-pushed the batch-phpcs-invocations branch from 129b081 to 74ce1eb Compare April 21, 2026 03:16

sirbrillig mentioned this pull request Apr 21, 2026

Fix benchmark script to be generic and not specific to batch process #113

Merged

sirbrillig marked this pull request as ready for review April 21, 2026 03:39

This was referenced Apr 21, 2026

Batch phpcs into two invocations: one for new files, one for old #114

Open

Improve performance by reducing the number of phpcs calls #115

Open

sirbrillig added 8 commits June 20, 2026 11:41

Refactor to simplify runXWorkflow functions

4f976a5

Remove unused runXWorkflowForFile methods

fa57560

sirbrillig added 3 commits June 20, 2026 17:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Batch phpcs calls into a single invocation#110

Batch phpcs calls into a single invocation#110
sirbrillig wants to merge 12 commits into
trunkfrom
batch-phpcs-invocations

sirbrillig commented Apr 20, 2026 •

edited

Loading

Uh oh!

sirbrillig commented Jun 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

sirbrillig commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Approach

Trade-off

Performance

Uh oh!

sirbrillig commented Jun 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

sirbrillig commented Apr 20, 2026 •

edited

Loading