@kolkov kolkov commented Jan 15, 2026

Summary

Fix a performance regression where the pattern [^\s]+\.txt caused the extreme benchmark to hang.

Problem

  • isSafeForReverseSuffix only recognized .* and .+ wildcards
  • CharClass Plus patterns like [^\s]+ were not recognized as safe wildcards
  • Result: the pattern fell back to the slow path (266 ms/MB instead of completing in microseconds)

Solution

  • Add CharClass Plus patterns to the whitelist of safe wildcard patterns
  • Patterns like [^\s]+\.txt and [\w]+\.go now use the ReverseSuffix optimization
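
The whitelist idea above can be sketched with the standard library's regexp/syntax package. This is only an illustration under stated assumptions: the helper names isWildcardPrefix and safeForReverseSuffix are hypothetical, and the real isSafeForReverseSuffix in this repository may inspect a different representation or apply extra conditions. The sketch accepts a pattern only when it is a wildcard (.*, .+, or a character class under +) followed by literals.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// isWildcardPrefix reports whether re has one of the wildcard shapes named in
// the PR description: .* or .+ (the original whitelist), or a character class
// under + such as [^\s]+ (the shape this PR adds).
// Hypothetical helper for illustration, not the library's own code.
func isWildcardPrefix(re *syntax.Regexp) bool {
	if re.Op != syntax.OpStar && re.Op != syntax.OpPlus {
		return false
	}
	switch re.Sub[0].Op {
	case syntax.OpAnyChar, syntax.OpAnyCharNotNL:
		return true // .* or .+
	case syntax.OpCharClass:
		return re.Op == syntax.OpPlus // [class]+ only
	}
	return false
}

// safeForReverseSuffix reports whether pattern parses as a whitelisted
// wildcard followed only by literals, e.g. [^\s]+\.txt.
func safeForReverseSuffix(pattern string) (bool, error) {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		return false, err
	}
	if re.Op != syntax.OpConcat || len(re.Sub) < 2 {
		return false, nil
	}
	if !isWildcardPrefix(re.Sub[0]) {
		return false, nil
	}
	for _, sub := range re.Sub[1:] {
		if sub.Op != syntax.OpLiteral {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	for _, p := range []string{`.*\.txt`, `[^\s]+\.txt`, `[\w]+\.go`, `foo\.txt`} {
		ok, err := safeForReverseSuffix(p)
		fmt.Println(p, ok, err)
	}
}
```

Before this change, the equivalent of the OpCharClass case was missing, so [^\s]+\.txt would have been rejected by such a check and routed to the slow path.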

Testing

  • Added TestReverseSuffix_CharClassPlus test
  • Local benchmark: suffix_find now completes in 398µs (was timing out)
  • All existing tests pass
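
To show what a check along the lines of TestReverseSuffix_CharClassPlus could verify, here is a minimal sketch of a suffix-first search for [^\s]+\.txt, cross-checked against the standard regexp package. It is not the library's implementation: findNonSpaceSuffix, isSpace, and stdlibFind are hypothetical names, and the plain byte scans stand in for whatever the engine does internally. The idea is to locate the literal suffix first, scan backward over non-whitespace bytes to find the match start, then extend forward to honor greedy matching.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// isSpace mirrors Go's \s class, which is exactly [\t\n\f\r ].
func isSpace(b byte) bool {
	return b == ' ' || b == '\t' || b == '\n' || b == '\f' || b == '\r'
}

// findNonSpaceSuffix returns [start, end) of the leftmost match of
// [^\s]+<suffix> in s, or (-1, -1) if there is none.
// Assumes suffix is non-empty and contains no whitespace.
func findNonSpaceSuffix(s, suffix string) (int, int) {
	from := 0
	for {
		i := strings.Index(s[from:], suffix)
		if i < 0 {
			return -1, -1
		}
		i += from
		if i > 0 && !isSpace(s[i-1]) {
			// Reverse scan: walk back to the start of the non-space run.
			start := i - 1
			for start > 0 && !isSpace(s[start-1]) {
				start--
			}
			// Forward scan: the greedy class runs to the end of the
			// non-space run, then gives back up to the last suffix.
			runEnd := i + len(suffix)
			for runEnd < len(s) && !isSpace(s[runEnd]) {
				runEnd++
			}
			end := strings.LastIndex(s[:runEnd], suffix) + len(suffix)
			return start, end
		}
		from = i + 1 // suffix had no non-space byte before it; keep looking
	}
}

var txtRe = regexp.MustCompile(`[^\s]+\.txt`)

// stdlibFind is the reference result from the standard library.
func stdlibFind(s string) (int, int) {
	loc := txtRe.FindStringIndex(s)
	if loc == nil {
		return -1, -1
	}
	return loc[0], loc[1]
}

func main() {
	for _, s := range []string{"readme.txt", "see notes.txt now", " .txt", "a.txt.txt"} {
		gs, ge := findNonSpaceSuffix(s, ".txt")
		ws, we := stdlibFind(s)
		fmt.Printf("%q sketch=[%d,%d) stdlib=[%d,%d)\n", s, gs, ge, ws, we)
	}
}
```

The search touches the haystack mostly through the literal lookup, which is why recognizing the pattern as safe matters so much for throughput.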

Impact

  • Fixes extreme benchmark CI hang
  • No regressions in other patterns

Bug: [^\s]+\.txt pattern caused extreme benchmark to hang
Root cause: isSafeForReverseSuffix only recognized .* and .+ wildcards,
not CharClass Plus patterns like [^\s]+

Fix: Add CharClass Plus to whitelist of safe wildcard patterns

Verified: suffix_find pattern now completes in 398µs (was timing out)
@github-actions

Benchmark Comparison

Comparing main → PR #94

Summary: geomean 252.6n → 251.9n (-0.30%)

⚠️ Potential regressions detected:

Benchmark                                               main (time/op)  PR #94 (time/op)  Delta
AnchoredAlt_ManyBranches_Stdlib/NoMatch-4               85.95n ± ∞ ¹    86.51n ± ∞ ¹      +0.65% (p=0.032 n=5)
FatTeddyFallback/large_haystack_1KB-4                   308.8n ± ∞ ¹    312.6n ± ∞ ¹      +1.23% (p=0.008 n=5)
IPRegex_Find/coregex_64KB_sparse-4                      3.107µ ± ∞ ¹    3.358µ ± ∞ ¹      +8.08% (p=0.008 n=5)
IPRegex_Find/coregex_1MB_sparse-4                       2.733µ ± ∞ ¹    3.040µ ± ∞ ¹     +11.23% (p=0.008 n=5)
IPRegex_Find/stdlib_6MB_sparse-4                        128.0µ ± ∞ ¹   3506.6µ ± ∞ ¹   +2640.06% (p=0.008 n=5)
IPRegex_Find/coregex_6MB_sparse-4                       3.638µ ± ∞ ¹    6.134µ ± ∞ ¹     +68.61% (p=0.008 n=5)

Full results available in workflow artifacts. CI runners have ~10-20% variance.
For accurate benchmarks, run locally: ./scripts/bench.sh --compare

@kolkov kolkov merged commit 84efa57 into main Jan 15, 2026
15 checks passed
@kolkov kolkov deleted the hotfix/reverse-suffix-charclass branch January 15, 2026 10:59