@kolkov (Contributor) commented Jan 15, 2026

Summary

Release v0.11.0 - resolves Issue #79 (UTF-8 dot performance regression).

New Features

  • UseAnchoredLiteral strategy - O(1) specialized matching for ^prefix.*suffix$ patterns

    • Pattern ^/.*[\w-]+\.php$: 32-133x faster than stdlib (was 5.3x slower)
    • 17th strategy in meta-engine
  • V11-002 ASCII runtime detection - SIMD-accelerated input classification

    • Dual NFA compilation for patterns with .
    • Up to 1.6x faster on ASCII input

Bug Fixes

  • OnePass DFA handles StateLook anchors (^, $, \A, \z)
  • Suffix extraction skips trailing anchors for O(1) rejection

Internal

  • meta.go refactored from 2821 lines into 6 focused files (no API changes)
  • Linter fixes in anchored_literal.go

Performance (Issue #79 pattern)

Input          coregex   stdlib   Speedup
Short (24B)    7.6 ns    241 ns   32x
Medium (45B)   7.8 ns    347 ns   44x
Long (78B)     7.9 ns    516 ns   65x
No match       4.4 ns    590 ns   133x

Test plan

  • go test ./... passes
  • go test -race ./... passes
  • golangci-lint run - 0 issues
  • CI pipeline green
  • Benchmark regression check (regex-bench)

Closes #79

Implements dual-NFA compilation for patterns with '.':
- UTF-8 NFA: handles all valid UTF-8 codepoints (~28 states per '.')
- ASCII NFA: optimized for ASCII-only input (1-2 states per '.')

At runtime, input is checked using SIMD (AVX2 on x86-64) to
determine if all bytes are ASCII. If so, the faster ASCII NFA is used.

Performance for Issue #79 pattern ^/.*[\w-]+\.php$:
- Short input (6B): 3.7x faster
- Medium input (23B): 2.3x faster
- Long input (49B): 1.5x faster

SIMD isASCII throughput: 20-41 GB/s (AVX2), 10+ GB/s (SWAR fallback)
NFA state reduction: 6x for single '.', 2.8x for Issue #79 pattern
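
The SWAR fallback and the runtime choice between the two NFAs can be illustrated in portable Go. This is a minimal sketch, not the code in simd/ascii_*.go: the names isASCII and pickNFA are hypothetical, and the AVX2 path in the PR is assembly that is not reproduced here.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// highBits has the top bit of every byte set. Any byte >= 0x80
// (non-ASCII) leaves at least one of these bits set after masking.
const highBits = 0x8080808080808080

// isASCII reports whether every byte of b is below 0x80.
// It tests 8 bytes per iteration (SWAR) and checks the tail bytewise.
func isASCII(b []byte) bool {
	for len(b) >= 8 {
		if binary.LittleEndian.Uint64(b)&highBits != 0 {
			return false
		}
		b = b[8:]
	}
	for _, c := range b {
		if c >= 0x80 {
			return false
		}
	}
	return true
}

// pickNFA is a hypothetical dispatcher: it names which of the two
// compiled NFAs would be used for this input.
func pickNFA(input []byte) string {
	if isASCII(input) {
		return "ascii"
	}
	return "utf8"
}

func main() {
	fmt.Println(pickNFA([]byte("/index.php")))
	fmt.Println(pickNFA([]byte("/página/index.php")))
}
```

The check is one pass over the input, so it only pays off when the ASCII NFA saves more time than the scan costs, which is consistent with the speedup shrinking on longer inputs above.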

New files:
- simd/ascii_*.go, simd/ascii_amd64.s: SIMD ASCII detection
- nfa/pattern_analysis.go: ContainsDot() helper
- meta/ascii_optimization_test.go: integration tests
- nfa/compile_ascii_test.go: NFA compilation tests

Config: EnableASCIIOptimization (default: true)

ExtractSuffixes was returning empty for patterns like '\.php$'
because the last AST element is OpEndLine ($), not the literal.

Now skips trailing anchors (OpEndLine, OpEndText) to find the
actual suffix literal.
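
The idea of skipping trailing anchors can be sketched with the standard regexp/syntax package. This is an illustration under assumptions, not the PR's ExtractSuffixes: trailingLiteral is a hypothetical helper, it only looks at the top-level concatenation, and it ignores case-folded literals.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// trailingLiteral returns the literal that ends a pattern, skipping any
// trailing end anchors ($ or \z). It returns "" when the last non-anchor
// element is not a plain literal.
func trailingLiteral(pattern string) (string, error) {
	re, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		return "", err
	}
	// A single-element pattern is not wrapped in OpConcat.
	subs := []*syntax.Regexp{re}
	if re.Op == syntax.OpConcat {
		subs = re.Sub
	}
	// Walk backwards past end anchors to reach the real last element.
	i := len(subs) - 1
	for i >= 0 && (subs[i].Op == syntax.OpEndLine || subs[i].Op == syntax.OpEndText) {
		i--
	}
	if i < 0 || subs[i].Op != syntax.OpLiteral {
		return "", nil
	}
	// Case-insensitive literals need extra handling; skip them here.
	if subs[i].Flags&syntax.FoldCase != 0 {
		return "", nil
	}
	return string(subs[i].Rune), nil
}

func main() {
	for _, p := range []string{`\.php$`, `^/.*[\w-]+\.php$`, `a.*`} {
		s, err := trailingLiteral(p)
		fmt.Printf("%q -> %q (err=%v)\n", p, s, err)
	}
}
```

With a non-empty suffix in hand, an input that does not end in it can be rejected without running any automaton.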

Issue #79 improvements:
- Wrong suffix: 2017ns → 15ns (91x faster than stdlib!)
- Wrong prefix: already fast at 5.2ns (13x faster than stdlib)
- Matching: 972ns vs stdlib 788ns (1.23x slower, was 5.3x)

epsilonClosureOnePass was not handling StateLook states, causing
OnePass DFA to fail for all patterns with explicit anchors.

Now treats anchor states as epsilon transitions:
- Start anchors (^, \A): Always satisfied in anchored mode
- End anchors ($, \z): Follow epsilon; match verified at input end
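
The two rules above can be sketched as an epsilon closure over a toy NFA. Everything here is hypothetical (the kind, state, and closure names and the state layout are not taken from the PR), and the single needsEnd flag is a simplification: a real implementation would track the end-of-input condition per path.

```go
package main

import "fmt"

type kind int

const (
	kindByte  kind = iota // consumes one input byte
	kindSplit             // pure epsilon fork
	kindLook              // zero-width anchor (^, $, \A, \z)
	kindMatch             // accepting state
)

type state struct {
	kind      kind
	next      []int // successor state IDs
	endAnchor bool  // for kindLook: true for $ and \z
}

// closure returns the byte-consuming and match states reachable from
// start without consuming input. Look states are followed like epsilon
// edges. needsEnd reports that an end anchor was crossed, so the caller
// must still verify that the match ends at the end of the input.
func closure(states []state, start int) (reached []int, needsEnd bool) {
	seen := make([]bool, len(states))
	stack := []int{start}
	for len(stack) > 0 {
		id := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if seen[id] {
			continue
		}
		seen[id] = true
		s := states[id]
		switch s.kind {
		case kindSplit:
			stack = append(stack, s.next...)
		case kindLook:
			// Start anchors hold trivially in anchored mode;
			// end anchors are deferred to the end-of-input check.
			if s.endAnchor {
				needsEnd = true
			}
			stack = append(stack, s.next...)
		default:
			reached = append(reached, id)
		}
	}
	return reached, needsEnd
}

func main() {
	// Hand-built states for ^a$: look(^) -> byte 'a' -> look($) -> match.
	states := []state{
		{kind: kindLook, next: []int{1}},
		{kind: kindByte, next: []int{2}},
		{kind: kindLook, next: []int{3}, endAnchor: true},
		{kind: kindMatch},
	}
	r, end := closure(states, 0)
	fmt.Println(r, end)
	r, end = closure(states, 2)
	fmt.Println(r, end)
}
```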

Simple anchored patterns like ^abc$ now work correctly.
Patterns with .* are still rejected (inherent ambiguity).

Add specialized O(1) matching for ^prefix.*suffix$ patterns:
- Pattern detection via AST analysis (DetectAnchoredLiteral)
- Fast matching with O(k) prefix/suffix checks (MatchAnchoredLiteral)
- 256-byte lookup table for charclass bridge verification
- Proper UTF-8 handling (reject non-ASCII in ASCII-only charclass)
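
A minimal sketch of the matching steps listed above, hand-built for ^/.*[\w-]+\.php$. It is not the PR's MatchAnchoredLiteral: the anchoredLiteral type and its fields are hypothetical, and this version keeps a newline scan over the .* span (because '.' does not match '\n'), so it is linear in the worst case. How the PR handles that span is not shown in the source.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// anchoredLiteral is a hypothetical precompiled form of
// ^prefix.*[class]+suffix$.
type anchoredLiteral struct {
	prefix, suffix []byte
	bridge         [256]bool // true for bytes the char class accepts
}

// newPHPMatcher builds the matcher for ^/.*[\w-]+\.php$ by hand.
func newPHPMatcher() *anchoredLiteral {
	m := &anchoredLiteral{prefix: []byte("/"), suffix: []byte(".php")}
	for c := 0; c < 256; c++ {
		b := byte(c)
		m.bridge[c] = b == '-' || b == '_' ||
			(b >= '0' && b <= '9') || (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
	}
	return m
}

func (m *anchoredLiteral) match(in []byte) bool {
	// Need room for the prefix, at least one bridge byte, and the suffix.
	if len(in) < len(m.prefix)+1+len(m.suffix) {
		return false
	}
	if !bytes.HasPrefix(in, m.prefix) || !bytes.HasSuffix(in, m.suffix) {
		return false
	}
	last := len(in) - len(m.suffix) - 1 // byte that must satisfy the class
	if !m.bridge[in[last]] {
		return false // non-ASCII bytes are never in the table
	}
	// '.' does not match '\n', so the .* span must be newline-free.
	return bytes.IndexByte(in[len(m.prefix):last], '\n') < 0
}

func main() {
	m := newPHPMatcher()
	re := regexp.MustCompile(`^/.*[\w-]+\.php$`)
	inputs := []string{"/index.php", "/a/b/c-d.php", "/.php", "/x.phtml",
		"index.php", "/a\nb.php", "/é.php"}
	for _, s := range inputs {
		// Print both results so any disagreement with stdlib is visible.
		fmt.Printf("%q sketch=%v stdlib=%v\n", s, m.match([]byte(s)), re.MatchString(s))
	}
}
```

The prefix and suffix comparisons cost O(k) in the literal lengths, and the bridge check is a single table lookup on the byte just before the suffix.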

Benchmark results vs stdlib for ^/.*[\w-]+\.php$:
- Short input: 24x faster (13 ns vs 322 ns)
- Medium input: 28x faster (14 ns vs 399 ns)
- Long input: 51x faster (10 ns vs 489 ns)
- No match: 99x faster (7 ns vs 689 ns)

Issue #79 resolution: from 5.3x SLOWER to 24-99x FASTER than stdlib.

meta.go split into focused files (line counts in parentheses):
- engine.go (230): Engine struct, Stats, core API
- compile.go (526): Compilation, builders
- find.go (749): Find methods returning *Match
- find_indices.go (733): Zero-alloc FindIndices methods
- ismatch.go (353): IsMatch boolean methods
- findall.go (285): FindAll*, Count, FindSubmatch
- meta.go (81): Package documentation only

Also fixes linter issues in anchored_literal.go

Documentation updates:
- CHANGELOG.md: v0.11.0 release notes with UseAnchoredLiteral, ASCII detection
- README.md: 17 strategies, added AnchoredLiteral to strategy table
- ROADMAP.md: v0.11.0 current, v0.12.0 CompositeSearcher next
- OPTIMIZATIONS.md: Added AnchoredLiteral as 9th key optimization

github-actions bot commented Jan 15, 2026

Benchmark Comparison

Comparing main → PR #96

Summary: geomean 252.5n 252.3n -0.05%

⚠️ Potential regressions detected:

Accelerate/memchr1-4       120.5n ± ∞ ¹   121.9n ± ∞ ¹  +1.16% (p=0.008 n=5)
LazyDFAAlternation-4       255.5n ± ∞ ¹   256.4n ± ∞ ¹  +0.35% (p=0.008 n=5)
geomean                               ³                +0.00%               ³
geomean                               ³                +0.00%               ³
OnePassSearch-4      49.01n ± ∞ ¹   56.56n ± ∞ ¹  +15.41% (p=0.008 n=5)
geomean              32.49n         34.90n         +7.40%
geomean                         ³                +0.00%               ³
geomean                         ³                +0.00%               ³
BranchDispatch_Coregex/Digits-4                            8.050n ± ∞ ¹    9.095n ± ∞ ¹   +12.98% (p=0.008 n=5)
AnchoredAlt_ManyBranches_Stdlib/DELETE-4                   116.5n ± ∞ ¹    118.2n ± ∞ ¹    +1.46% (p=0.008 n=5)

Full results available in workflow artifacts. CI runners have ~10-20% variance.
For accurate benchmarks, run locally: ./scripts/bench.sh --compare

@kolkov merged commit 35f5319 into main on Jan 15, 2026 (15 checks passed)
@kolkov deleted the release/v0.11.0 branch on January 15, 2026 20:07
Development

Successfully merging this pull request may close these issues.

noticeably slower than core regexp
