Skip to content

Deduplicate issues when different rules flag the same issue#61

Open
oshorefueled wants to merge 2 commits intomainfrom
feat/dedup
Open

Deduplicate issues when different rules flag the same issue#61
oshorefueled wants to merge 2 commits intomainfrom
feat/dedup

Conversation

@oshorefueled
Copy link
Contributor

@oshorefueled oshorefueled commented Feb 22, 2026

Summary

This PR implements issue deduplication logic in the CLI by natively grouping issues by their match text and using Array.reduce() to safely execute heuristics for tie-breaking. It eliminates overlapping violation noise across all output formats (TTY, JSON, and Vale JSON) while retaining stability and computational efficiency.

Why

Multiple overlapping rules triggering on the exact same text caused cluttered outputs, degrading readability with duplicated messaging.

Scope

  • In scope:
    • Exact string/line deduplication
    • Issue buffering behind the Orchestrator formatting pipeline
    • Tie-breaking heuristic based on suggestion presence, summary length, and severity

Behavior Impact

  • User-facing changes: yes. Greatly reduces duplicate/spam logs when multiple criteria trigger over the exact same target string within documents.
  • Breaking changes: no. The pipeline operates cleanly within its API boundaries, with backward compatibility.
  • Operational impact: None

Risk and Mitigations

  • Risk level: low
  • Primary risks:
    • Deduplication might inadvertently suppress distinct issues that happen to trigger on the exact same line and text but represent functionally separate problems.
  • Mitigations:
    • Empty matched strings are intentionally retained as separate issues outside of tie-breaker scopes.
  • Rollback plan:
    • Remove array buffering wrapper over reportIssue, reverting directly to streaming execution if false positives are reported.

API / Contract / Schema Changes

  • Marking ReportIssueParams as @deprecated. Subscribing functions safely upgraded to RawIssue.

Follow-ups

  • Optional follow-up task handling fuzzy-character boundary overlap logic instead of relying on strictly exact group[0].match criteria tracking.

Known tradeoffs:

  • Exact string matching ignores partial token subsets (e.g. tracking "foo bar" alongside "foo"), resulting in both being reported. This remains out of scope for now.

Summary by CodeRabbit

  • New Features

    • Issue deduplication: duplicate violations on the same file/line are consolidated so users see a single, best-quality report (prefers suggested fixes, longer summaries, higher severity; preserves multiple reports when no exact match).
    • Consolidated reporting pass so deduplicated issues are presented once instead of repeated across checks.
  • Tests

    • Added comprehensive tests covering deduplication behaviors and tie-breaking rules.

- Introduce `filterDuplicateIssues` utility natively keyed by `group[0].match`
- Implement Array.reduce() heuristics to deterministically tiebreak based on overlapping output severity and instruction length
- Preserve matched instances that return empty strings as verified explicit overlaps
- Seamlessly wrap `RawIssue[]` payloads behind orchestrator formatting pipeline
- Deprecate `ReportIssueParams` alias for API backward compatibility
@coderabbitai
Copy link

coderabbitai bot commented Feb 22, 2026

📝 Walkthrough

Walkthrough

A new issue deduplication utility groups RawIssue objects by file, line, and match text and selects a single representative per group. The orchestrator now collects RawIssue arrays from evaluation paths, aggregates and deduplicates them, and reports deduplicated issues. Types were extended with a RawIssue interface.

Changes

Cohort / File(s) Summary
Deduplication Utility
src/cli/issue-deduplication.ts, tests/issue-deduplication.test.ts
Added filterDuplicateIssues(RawIssue[]) and tests. Groups issues by file/line/match, preserves empty-match multiples, and selects best issue by suggestion presence, summary length, severity, then original order.
Type System Updates
src/cli/types.ts
Introduced RawIssue interface (file, line, column, severity, summary, ruleName, outputFormat, jsonFormatter, optional suggestion/fix/scoreText/match). Added issues?: RawIssue[] to ErrorTrackingResult. Deprecated ReportIssueParams in favor of RawIssue.
Orchestrator Integration
src/cli/orchestrator.ts
Refactored flows to accumulate RawIssue[] from routePromptResult, extractAndReportCriterion, and locateAndReportViolations. evaluateFile aggregates, calls filterDuplicateIssues, then reports deduplicated issues. Function signatures updated to return issues: RawIssue[].

Sequence Diagram

sequenceDiagram
    participant EF as evaluateFile
    participant EP as extractAndReportCriterion
    participant RP as routePromptResult
    participant FD as filterDuplicateIssues
    participant RI as reportIssue

    EF->>EP: iterate criteria -> collect results (including issues)
    EP->>RP: evaluate prompts -> accumulate RawIssue[]
    RP-->>EP: return issues array
    EP-->>EF: return criterion issues
    EF->>EF: aggregate all issues from criteria
    EF->>FD: pass aggregated RawIssue[]
    FD->>FD: group by file/line/match<br/>select best per group
    FD-->>EF: return deduplicated issues
    EF->>RI: report each deduplicated issue
    RI-->>EF: reporting complete
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐰 Hop-hop, I sorted the heap of reports,

Lines and matches trimmed into neat cohorts.
I chose the best hint, the clearest summary light,
One issue remains, tidy and bright. ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the primary change: implementing deduplication of issues when different rules flag the same issue.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/dedup

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/cli/orchestrator.ts (3)

349-371: ⚠️ Potential issue | 🟡 Minor

Same issue: the !got early return also omits issues.

The early return at lines 353–371 when a criterion is not found in the result also lacks the issues property. While issues would be empty at this point (no issues pushed yet), including it would be consistent with the other return paths and avoids a subtle undefined vs [] discrepancy for callers that check criterionResult.issues.

Proposed fix
     return {
       errors: 0,
       warnings: 0,
       userScore: 0,
       maxScore,
       hadOperationalErrors,
       hadSeverityErrors,
       scoreEntry: { id: ruleName, scoreText: "-", score: 0.0 },
       scoreComponent: {
         criterion: nameKey,
         rawScore: 0,
         maxScore: 4,
         weightedScore: 0,
         weightedMaxScore: weightNum,
         normalizedScore: 0,
         normalizedMaxScore: 10,
       },
+      issues,
     };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/orchestrator.ts` around lines 349 - 371, The early-return branch that
runs when the criterion lookup (const got = result.criteria.find(...)) fails
returns an object missing the issues property, causing callers to receive
undefined instead of an empty array; update that return to include issues: [] so
the shape matches other paths (keep other fields like scoreEntry,
scoreComponent, userScore, maxScore, hadOperationalErrors, hadSeverityErrors
unchanged), referencing the same variables nameKey and ruleName used in the
current return.

316-347: ⚠️ Potential issue | 🔴 Critical

Bug: issues array is not included in the early return for the missing-target path.

The local issues array is populated at line 316–328, but the early return on lines 329–346 does not include the issues property. This means the "target not found" issue is constructed but silently dropped — it will never reach the deduplication pipeline or be reported.

🐛 Proposed fix — include `issues` in the early return
     return {
       errors: 1,
       warnings: 0,
       userScore: 0,
       maxScore,
       hadOperationalErrors,
       hadSeverityErrors,
       scoreEntry: { id: ruleName, scoreText: "0.0/10", score: 0.0 },
       scoreComponent: {
         criterion: nameKey,
         rawScore: 0,
         maxScore: 4,
         weightedScore: 0,
         weightedMaxScore: weightNum,
         normalizedScore: 0,
         normalizedMaxScore: 10,
       },
+      issues,
     };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/orchestrator.ts` around lines 316 - 347, The early return constructs
and populates the local issues array but does not include it in the returned
object; update the return value so it includes an issues property set to the
existing issues variable (alongside scoreEntry, scoreComponent, errors,
warnings, etc.) so the "target not found" issue is preserved and flows into the
deduplication/reporting pipeline; ensure the returned shape/type remains
compatible with the callers expecting issues.

918-939: ⚠️ Potential issue | 🟠 Major

Error/warning counts are accumulated before deduplication, causing a mismatch with reported issues.

totalErrors and totalWarnings (lines 918–919) reflect the pre-dedup count from all prompts, but the actual issues emitted to the user are the post-dedup set (line 937). This means the returned EvaluateFileResult.errors/.warnings can overcount relative to what the user sees, which may affect exit codes or summary display.

Consider either:

  1. Recomputing error/warning counts from deduplicatedIssues after filtering, or
  2. Documenting that counts reflect rule triggers, not unique reported issues.
🛠️ Option 1 — recompute counts from deduplicated issues
   // Deduplicate issues
   const deduplicatedIssues = filterDuplicateIssues(allIssues);

+  // Recompute error/warning counts from deduplicated issues
+  totalErrors = 0;
+  totalWarnings = 0;
+  for (const issue of deduplicatedIssues) {
+    if (issue.severity === Severity.ERROR) totalErrors++;
+    else if (issue.severity === Severity.WARNING) totalWarnings++;
+  }
+
   // Group and format output appropriately
   for (const issue of deduplicatedIssues) {
     reportIssue(issue);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/orchestrator.ts` around lines 918 - 939, totalErrors/totalWarnings
are currently tallied from raw prompt results before deduplication, causing a
mismatch with the actual issues emitted; after calling filterDuplicateIssues
(deduplicatedIssues), recompute error and warning counts (and update
hadOperationalErrors/hadSeverityErrors if needed) by iterating
deduplicatedIssues to count severity types and set
totalErrors/totalWarnings/hadOperationalErrors/hadSeverityErrors accordingly,
then proceed to reportIssue — this ensures EvaluateFileResult reflects the
post-deduplication issue set.
🧹 Nitpick comments (4)
src/cli/types.ts (1)

61-74: RawIssue couples raw data with mutable formatter references.

jsonFormatter (line 69) embeds a mutable formatter instance into what is conceptually an issue data record. Every buffered RawIssue carries the same shared formatter reference, so this works today, but it means RawIssue objects cannot be serialized, compared structurally (e.g. in tests without a shared instance), or used across different formatter contexts.

Consider splitting into a pure data type (file/line/severity/match/…) and resolving the formatter at report time, rather than at issue-construction time. This would also simplify testing (no need to instantiate JsonFormatter in test fixtures).

Not blocking since the pattern is consistent with the existing EvaluationContext / ReportIssueParams design.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/types.ts` around lines 61 - 74, The RawIssue interface currently
embeds a mutable formatter instance via the jsonFormatter field, coupling
formatter state to issue data; change RawIssue into a pure data shape (remove
jsonFormatter) that only contains file, line, column, severity, summary,
ruleName, outputFormat, suggestion, fix, scoreText, and match, and update call
sites that construct RawIssue to stop passing formatter instances; instead,
resolve and inject the appropriate formatter when producing reports (e.g., in
the code paths that use EvaluationContext or ReportIssueParams) so formatters
are looked up/instantiated at report time rather than stored on each RawIssue.
src/cli/issue-deduplication.ts (1)

26-33: Grouping key could collide when file paths or match text contain colons.

The composite key ${issue.file}:${issue.line}:${matchText} uses : as a delimiter. If file contains colons (e.g., Windows paths like C:\Users\...) or matchText starts with digits, distinct issues could map to the same key.

In practice this is very unlikely to cause a false dedup since all three components would need to align, but a safer delimiter (e.g., \0) or a structured key approach would eliminate the risk entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/issue-deduplication.ts` around lines 26 - 33, The composite key
construction using `${issue.file}:${issue.line}:${matchText}` can collide when
file or matchText contain colons; update the grouping key logic in the loop that
builds grouped (references: issues, matchText, key, grouped) to use a
collision-safe representation such as JSON.stringify([issue.file, issue.line,
matchText]) or join with a null delimiter ('\0'), then use that new key for
grouped.get/set so distinct issues never alias incorrectly.
tests/issue-deduplication.test.ts (2)

7-17: Shared JsonFormatter instance in test fixture couples tests to formatter internals.

BASE_ISSUE instantiates new JsonFormatter() which is shared (by reference after spread) across all test issues. This works because filterDuplicateIssues doesn't interact with the formatter, but it means tests will break if the fixture is reused for integration tests that do. A lightweight stub or {} as any would decouple the fixture, though this conflicts with the "no any" guideline.

Not blocking — the current approach is pragmatic for unit tests.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/issue-deduplication.test.ts` around lines 7 - 17, BASE_ISSUE currently
creates a real shared JsonFormatter instance (new JsonFormatter()) which couples
tests to formatter internals; change BASE_ISSUE to use a lightweight stub or
per-test factory to avoid sharing — e.g. replace jsonFormatter: new
JsonFormatter() with a non-shared stub typed as JsonFormatter using a safe cast
(jsonFormatter: {} as unknown as JsonFormatter) or implement a createBaseIssue()
function that returns a fresh RawIssue per test; update references to BASE_ISSUE
in tests or use the factory so filterDuplicateIssues and other unit tests no
longer share the same JsonFormatter instance.

19-84: Good test coverage of the main deduplication paths.

The suite validates all four heuristic tiers and the empty-match preservation. Two small coverage gaps to consider:

  1. match: undefined (as opposed to match: ''): Both coerce to '' in the implementation, but a test would document that undefined match behaves identically to empty string.
  2. Tie-breaker 3 (original order): When two issues are identical across all heuristic dimensions, prev should win. Adding a test for this final fallback would lock in the stability guarantee.

Neither is blocking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/issue-deduplication.test.ts` around lines 19 - 84, Add two small tests
to tests/issue-deduplication.test.ts to cover the gaps: (1) a test using
BASE_ISSUE to create two issues where one has match: undefined and the other has
match: '' and assert filterDuplicateIssues([...]) returns both (verifying
undefined is treated like empty match and preserved), and (2) a test for
tie-breaker 3 where you create two issues identical in file/line/match/summary
length/severity/suggestion and assert that filterDuplicateIssues([first,
second]) keeps the first (prev) by checking filtered[0] === first; reference
filterDuplicateIssues and BASE_ISSUE when adding these assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/cli/orchestrator.ts`:
- Around line 349-371: The early-return branch that runs when the criterion
lookup (const got = result.criteria.find(...)) fails returns an object missing
the issues property, causing callers to receive undefined instead of an empty
array; update that return to include issues: [] so the shape matches other paths
(keep other fields like scoreEntry, scoreComponent, userScore, maxScore,
hadOperationalErrors, hadSeverityErrors unchanged), referencing the same
variables nameKey and ruleName used in the current return.
- Around line 316-347: The early return constructs and populates the local
issues array but does not include it in the returned object; update the return
value so it includes an issues property set to the existing issues variable
(alongside scoreEntry, scoreComponent, errors, warnings, etc.) so the "target
not found" issue is preserved and flows into the deduplication/reporting
pipeline; ensure the returned shape/type remains compatible with the callers
expecting issues.
- Around line 918-939: totalErrors/totalWarnings are currently tallied from raw
prompt results before deduplication, causing a mismatch with the actual issues
emitted; after calling filterDuplicateIssues (deduplicatedIssues), recompute
error and warning counts (and update hadOperationalErrors/hadSeverityErrors if
needed) by iterating deduplicatedIssues to count severity types and set
totalErrors/totalWarnings/hadOperationalErrors/hadSeverityErrors accordingly,
then proceed to reportIssue — this ensures EvaluateFileResult reflects the
post-deduplication issue set.

---

Nitpick comments:
In `@src/cli/issue-deduplication.ts`:
- Around line 26-33: The composite key construction using
`${issue.file}:${issue.line}:${matchText}` can collide when file or matchText
contain colons; update the grouping key logic in the loop that builds grouped
(references: issues, matchText, key, grouped) to use a collision-safe
representation such as JSON.stringify([issue.file, issue.line, matchText]) or
join with a null delimiter ('\0'), then use that new key for grouped.get/set so
distinct issues never alias incorrectly.

In `@src/cli/types.ts`:
- Around line 61-74: The RawIssue interface currently embeds a mutable formatter
instance via the jsonFormatter field, coupling formatter state to issue data;
change RawIssue into a pure data shape (remove jsonFormatter) that only contains
file, line, column, severity, summary, ruleName, outputFormat, suggestion, fix,
scoreText, and match, and update call sites that construct RawIssue to stop
passing formatter instances; instead, resolve and inject the appropriate
formatter when producing reports (e.g., in the code paths that use
EvaluationContext or ReportIssueParams) so formatters are looked up/instantiated
at report time rather than stored on each RawIssue.

In `@tests/issue-deduplication.test.ts`:
- Around line 7-17: BASE_ISSUE currently creates a real shared JsonFormatter
instance (new JsonFormatter()) which couples tests to formatter internals;
change BASE_ISSUE to use a lightweight stub or per-test factory to avoid sharing
— e.g. replace jsonFormatter: new JsonFormatter() with a non-shared stub typed
as JsonFormatter using a safe cast (jsonFormatter: {} as unknown as
JsonFormatter) or implement a createBaseIssue() function that returns a fresh
RawIssue per test; update references to BASE_ISSUE in tests or use the factory
so filterDuplicateIssues and other unit tests no longer share the same
JsonFormatter instance.
- Around line 19-84: Add two small tests to tests/issue-deduplication.test.ts to
cover the gaps: (1) a test using BASE_ISSUE to create two issues where one has
match: undefined and the other has match: '' and assert
filterDuplicateIssues([...]) returns both (verifying undefined is treated like
empty match and preserved), and (2) a test for tie-breaker 3 where you create
two issues identical in file/line/match/summary length/severity/suggestion and
assert that filterDuplicateIssues([first, second]) keeps the first (prev) by
checking filtered[0] === first; reference filterDuplicateIssues and BASE_ISSUE
when adding these assertions.

…ost-dedup

- Include issues array in the target-not-found early return so the
  constructed issue is not silently dropped before deduplication
- Add issues: [] to the criterion-not-found early return for
  consistency with all other return paths
- Remove pre-dedup totalErrors/totalWarnings accumulation; recompute
  counts from deduplicatedIssues so reported counts match emitted output
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/cli/orchestrator.ts (2)

920-944: ⚠️ Potential issue | 🟡 Minor

hadSeverityErrors can be true while totalErrors is 0 after deduplication.

hadSeverityErrors is computed pre-deduplication (line 922), but totalErrors is recomputed post-deduplication (lines 937–940). These can diverge because filterDuplicateIssues tie-breaks on summary length before severity — an ERROR with a short summary is dropped in favour of a WARNING with a longer summary. The caller then sees a non-zero exit code (hadSeverityErrors = true) but an output reporting zero errors.

🛡️ Proposed fix — recompute `hadSeverityErrors` from deduplicated issues
  // Recompute counts from deduplicated issues so they match what is reported
  for (const issue of deduplicatedIssues) {
    if (issue.severity === Severity.ERROR) totalErrors += 1;
    else totalWarnings += 1;
  }
+
+  // Recompute hadSeverityErrors from deduplicatedIssues for consistency with totalErrors
+  if (deduplicatedIssues.some(issue => issue.severity === Severity.ERROR)) {
+    hadSeverityErrors = true;
+  } else {
+    hadSeverityErrors = false;
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/orchestrator.ts` around lines 920 - 944, The boolean
hadSeverityErrors is set before deduplication and can disagree with
post-deduplication counts; after obtaining deduplicatedIssues (from
filterDuplicateIssues) recompute hadSeverityErrors (and optionally
hadOperationalErrors if relevant) by iterating deduplicatedIssues and setting
hadSeverityErrors = true if any issue.severity === Severity.ERROR (and set
totalErrors/totalWarnings there as you already do), so the exit-status flag
matches the output reported by reportIssue; update any downstream logic that
uses hadSeverityErrors to use the recomputed value.

147-156: ⚠️ Potential issue | 🟡 Minor

Stale JSDoc — function no longer "reports" violations.

The comment says "Locates and reports each violation" but after this PR the function only locates violations and collects them into a RawIssue[]reportIssue is now called exclusively in evaluateFile after deduplication.

📝 Proposed update
-/*
- * Locates and reports each violation using pre/post evidence markers.
- * If location matching fails (missing markers, content mismatch), logs warning
- * and continues processing. Returns hadOperationalErrors=true if any violations
- * couldn't be located, signaling text matching issues vs. content quality issues.
- */
+/*
+ * Locates each violation using pre/post evidence markers and returns them as RawIssue[].
+ * If location matching fails (missing markers, content mismatch), logs warning
+ * and continues processing. Returns hadOperationalErrors=true if any violations
+ * couldn't be located, signaling text matching issues vs. content quality issues.
+ */
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/orchestrator.ts` around lines 147 - 156, The JSDoc for
locateAndReportViolations is stale: it says the function "Locates and reports
each violation" but the function now only locates violations and returns
hadOperationalErrors and an array of RawIssue; update the comment above the
locateAndReportViolations function to remove "reports" and clearly state it only
locates violations, logs any location/matching warnings, and returns
hadOperationalErrors plus the collected RawIssue[] for reporting to be done
elsewhere (e.g., in evaluateFile) — keep the description concise and reference
the returned fields hadOperationalErrors and issues (RawIssue[]).
🧹 Nitpick comments (2)
src/cli/orchestrator.ts (2)

86-100: Update reportIssue to accept RawIssue directly, completing the deprecation migration.

ReportIssueParams is marked @deprecated per this PR, and the function is now exclusively called with RawIssue objects (line 944). The signature should be updated to close the gap.

♻️ Proposed update
-function reportIssue(params: ReportIssueParams): void {
+function reportIssue(params: RawIssue): void {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/orchestrator.ts` around lines 86 - 100, Update the reportIssue
function signature to accept a RawIssue directly instead of the deprecated
ReportIssueParams: change the parameter type from ReportIssueParams to RawIssue
and update the destructuring inside reportIssue (file, line, column, severity,
summary, ruleName, outputFormat, jsonFormatter, suggestion, fix, scoreText,
match) to read from that RawIssue; also remove or replace any import/usages of
the deprecated ReportIssueParams type in this module and ensure callers (e.g.,
places constructing RawIssue objects) still compile with the new signature,
keeping the function name reportIssue unchanged.

595-629: totalErrors / totalWarnings in this path are now dead code.

evaluateFile no longer consumes promptResult.errors or promptResult.warnings — it recomputes counts exclusively from deduplicatedIssues (lines 937–940). The same dead chain exists in extractAndReportCriterion, whose returned errors/warnings are accumulated by routePromptResult but are themselves never consumed upstream.

Consider removing the accumulation entirely and dropping the errors/warnings fields from the return shapes of both routePromptResult and extractAndReportCriterion, or at minimum adding a comment noting these values are now advisory only.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/orchestrator.ts` around lines 595 - 629, The totals totalErrors /
totalWarnings computed inside the loop (and returned via
extractAndReportCriterion/routePromptResult) are now dead—evaluateFile derives
final counts from deduplicatedIssues—so remove the unused accumulation and the
errors/warnings fields (or mark them advisory). Specifically: delete/update the
logic that increments totalErrors/totalWarnings in the loop that builds ruleName
and calls locateAndReportViolations, remove the errors/warnings properties from
extractAndReportCriterion and routePromptResult return shapes (and any code that
only aggregates promptResult.errors/promptResult.warnings), and update
evaluateFile to rely solely on deduplicatedIssues; if you prefer a minimal
change, add a clear comment in evaluateFile and extractAndReportCriterion
explaining these counts are advisory-only. Ensure references to evaluateFile,
extractAndReportCriterion, routePromptResult, promptResult.errors,
promptResult.warnings, and deduplicatedIssues are updated consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cli/orchestrator.ts`:
- Around line 705-707: Change the types so `issues` is non-optional on
ErrorTrackingResult and ProcessCriterionResult (e.g., `issues: RawIssue[]`),
update any places that construct those results (notably
`extractAndReportCriterion` and `routePromptResult`) to keep returning an
initialized array, and remove the defensive guard around `issues` in
orchestrator.ts (the `if (criterionResult.issues) {
issues.push(...criterionResult.issues); }` block) so you can directly push
`criterionResult.issues`; also scan for other callers that assumed optional and
adjust them to expect an array.

---

Outside diff comments:
In `@src/cli/orchestrator.ts`:
- Around line 920-944: The boolean hadSeverityErrors is set before deduplication
and can disagree with post-deduplication counts; after obtaining
deduplicatedIssues (from filterDuplicateIssues) recompute hadSeverityErrors (and
optionally hadOperationalErrors if relevant) by iterating deduplicatedIssues and
setting hadSeverityErrors = true if any issue.severity === Severity.ERROR (and
set totalErrors/totalWarnings there as you already do), so the exit-status flag
matches the output reported by reportIssue; update any downstream logic that
uses hadSeverityErrors to use the recomputed value.
- Around line 147-156: The JSDoc for locateAndReportViolations is stale: it says
the function "Locates and reports each violation" but the function now only
locates violations and returns hadOperationalErrors and an array of RawIssue;
update the comment above the locateAndReportViolations function to remove
"reports" and clearly state it only locates violations, logs any
location/matching warnings, and returns hadOperationalErrors plus the collected
RawIssue[] for reporting to be done elsewhere (e.g., in evaluateFile) — keep the
description concise and reference the returned fields hadOperationalErrors and
issues (RawIssue[]).

---

Nitpick comments:
In `@src/cli/orchestrator.ts`:
- Around line 86-100: Update the reportIssue function signature to accept a
RawIssue directly instead of the deprecated ReportIssueParams: change the
parameter type from ReportIssueParams to RawIssue and update the destructuring
inside reportIssue (file, line, column, severity, summary, ruleName,
outputFormat, jsonFormatter, suggestion, fix, scoreText, match) to read from
that RawIssue; also remove or replace any import/usages of the deprecated
ReportIssueParams type in this module and ensure callers (e.g., places
constructing RawIssue objects) still compile with the new signature, keeping the
function name reportIssue unchanged.
- Around line 595-629: The totals totalErrors / totalWarnings computed inside
the loop (and returned via extractAndReportCriterion/routePromptResult) are now
dead—evaluateFile derives final counts from deduplicatedIssues—so remove the
unused accumulation and the errors/warnings fields (or mark them advisory).
Specifically: delete/update the logic that increments totalErrors/totalWarnings
in the loop that builds ruleName and calls locateAndReportViolations, remove the
errors/warnings properties from extractAndReportCriterion and routePromptResult
return shapes (and any code that only aggregates
promptResult.errors/promptResult.warnings), and update evaluateFile to rely
solely on deduplicatedIssues; if you prefer a minimal change, add a clear
comment in evaluateFile and extractAndReportCriterion explaining these counts
are advisory-only. Ensure references to evaluateFile, extractAndReportCriterion,
routePromptResult, promptResult.errors, promptResult.warnings, and
deduplicatedIssues are updated consistently.
ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd64099 and 4820146.

📒 Files selected for processing (1)
  • src/cli/orchestrator.ts

Comment on lines +705 to +707
if (criterionResult.issues) {
issues.push(...criterionResult.issues);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find type definitions for ProcessCriterionResult and ErrorTrackingResult
echo "=== Searching for ProcessCriterionResult and ErrorTrackingResult type definitions ==="
rg -n "type ProcessCriterionResult|interface ProcessCriterionResult|type ErrorTrackingResult|interface ErrorTrackingResult" src/cli --type ts -A 10

echo ""
echo "=== Searching for RawIssue type definition ==="
rg -n "type RawIssue|interface RawIssue" src/cli --type ts -A 5

Repository: TRocket-Labs/vectorlint

Length of output: 1484


🏁 Script executed:

#!/bin/bash
# Find implementations of extractAndReportCriterion and routePromptResult
echo "=== Searching for extractAndReportCriterion implementation ==="
rg -n "function extractAndReportCriterion|const extractAndReportCriterion" src/cli --type ts -A 40 | head -80

echo ""
echo "=== Searching for routePromptResult implementation ==="
rg -n "function routePromptResult|const routePromptResult" src/cli --type ts -A 40 | head -80

Repository: TRocket-Labs/vectorlint

Length of output: 4702


🏁 Script executed:

#!/bin/bash
# Get the full return statements for extractAndReportCriterion
echo "=== extractAndReportCriterion return statement (around line 361 and end of function) ==="
sed -n '267,450p' src/cli/orchestrator.ts | tail -40

echo ""
echo "=== routePromptResult return statement (around line 559-700) ==="
sed -n '690,750p' src/cli/orchestrator.ts

Repository: TRocket-Labs/vectorlint

Length of output: 3079


🏁 Script executed:

#!/bin/bash
# Check for all return statements in extractAndReportCriterion
echo "=== All return statements in extractAndReportCriterion (lines 267-450) ==="
sed -n '267,450p' src/cli/orchestrator.ts | grep -n "return {"

echo ""
echo "=== Full view of extractAndReportCriterion function end (around line 440-460) ==="
sed -n '420,460p' src/cli/orchestrator.ts

Repository: TRocket-Labs/vectorlint

Length of output: 1240


🏁 Script executed:

#!/bin/bash
# Get the complete final return statement of extractAndReportCriterion
echo "=== Final return statement of extractAndReportCriterion (lines 456-475) ==="
sed -n '456,475p' src/cli/orchestrator.ts

echo ""
echo "=== Check line numbers - extract full function signature and locate all returns ==="
rg -n "^function extractAndReportCriterion" src/cli/orchestrator.ts -A 200 | grep -E "return \{|^[0-9]+:function" | head -20

Repository: TRocket-Labs/vectorlint

Length of output: 749


🏁 Script executed:

#!/bin/bash
# Check the early return statements to see if they include issues
echo "=== Early return 1 (around line 329) ==="
sed -n '325,340p' src/cli/orchestrator.ts

echo ""
echo "=== Early return 2 (around line 354) ==="
sed -n '350,365p' src/cli/orchestrator.ts

echo ""
echo "=== Check if issues field is included in early returns ==="
sed -n '325,365p' src/cli/orchestrator.ts | grep -A 12 "return {"

Repository: TRocket-Labs/vectorlint

Length of output: 1609


🏁 Script executed:

#!/bin/bash
# Get the complete final return statement to verify issues is included
echo "=== Complete final return of extractAndReportCriterion (lines 456-480) ==="
sed -n '456,480p' src/cli/orchestrator.ts

Repository: TRocket-Labs/vectorlint

Length of output: 597


Make issues field required on ProcessCriterionResult and ErrorTrackingResult types.

issues is currently declared optional (issues?: RawIssue[]) on ErrorTrackingResult (and inherited by ProcessCriterionResult), but both extractAndReportCriterion and routePromptResult always initialize and return the field in every code path—even early returns include it. Making issues required in the type definitions would eliminate the need for the guards at lines 705-707 and accurately reflect the implementation invariant.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/orchestrator.ts` around lines 705 - 707, Change the types so `issues`
is non-optional on ErrorTrackingResult and ProcessCriterionResult (e.g.,
`issues: RawIssue[]`), update any places that construct those results (notably
`extractAndReportCriterion` and `routePromptResult`) to keep returning an
initialized array, and remove the defensive guard around `issues` in
orchestrator.ts (the `if (criterionResult.issues) {
issues.push(...criterionResult.issues); }` block) so you can directly push
`criterionResult.issues`; also scan for other callers that assumed optional and
adjust them to expect an array.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant