feat: add --agent flag for coding agent deviation file output #212

Open
sohankshirsagar wants to merge 1 commit into main from sohan/agent-flag

@sohankshirsagar
Contributor

Summary

Adds an --agent flag to tusk drift run that writes rich deviation data to .tusk/logs/agent-run-{timestamp}/ as markdown files with YAML frontmatter. This output is designed for consumption by coding agent skills that can analyze deviations locally without waiting for CI.

What it does

  • tusk drift run --cloud --agent writes a deviation file per failed test with full context: request details, response diff, outbound call match quality, and mock-not-found events
  • An index.md summary is written after all tests complete
  • --agent-output-dir <path> overrides the default output location
  • Works with all existing flags and both interactive (TUI) and non-interactive modes
  • Status and warning messages go to stderr, so they don't interfere with --print or --output-format json

Deviation file format

Each file has YAML frontmatter (for fast grep/filtering) and markdown body sections:

  • Frontmatter: deviation_id, endpoint, method, path, failure_type, status codes, mock status, duration
  • Request: method, path, headers (auth masked), pretty-printed body
  • Response Diff: status change line + plaintext unified diff of response body
  • Outbound Call Context: table of mock matches with quality/scope
  • Mock Not Found Events: details for outbound calls with no matching recording
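
To make the layout above concrete, here is a minimal sketch of how such a file could be rendered. It is not the code from this PR: the `Deviation` struct, the `renderDeviation` helper, and the `expected_status` / `actual_status` key names are assumptions made for illustration, and only a subset of the frontmatter fields and body sections is shown.

```go
package main

import (
	"fmt"
	"strings"
)

// Deviation is a hypothetical record for illustration only; the PR's real
// types are not shown on this page.
type Deviation struct {
	ID             string
	Method         string
	Path           string
	FailureType    string
	ExpectedStatus int
	ActualStatus   int
	BodyDiff       string // plaintext unified diff of the response body
}

// renderDeviation builds a markdown document with YAML frontmatter followed
// by "Request" and "Response Diff" sections.
func renderDeviation(d Deviation) string {
	fence := strings.Repeat("`", 3) // markdown code fence
	var b strings.Builder

	// Frontmatter: one flat key per line so it stays easy to grep.
	// %q quoting is adequate for simple ASCII values in YAML.
	b.WriteString("---\n")
	fmt.Fprintf(&b, "deviation_id: %q\n", d.ID)
	fmt.Fprintf(&b, "method: %s\n", d.Method)
	fmt.Fprintf(&b, "path: %q\n", d.Path)
	fmt.Fprintf(&b, "failure_type: %s\n", d.FailureType)
	fmt.Fprintf(&b, "expected_status: %d\n", d.ExpectedStatus) // assumed key name
	fmt.Fprintf(&b, "actual_status: %d\n", d.ActualStatus)     // assumed key name
	b.WriteString("---\n\n")

	// Body sections.
	fmt.Fprintf(&b, "## Request\n\n%s %s\n\n", d.Method, d.Path)
	fmt.Fprintf(&b, "## Response Diff\n\nStatus: %d -> %d\n\n", d.ExpectedStatus, d.ActualStatus)
	fmt.Fprintf(&b, "%sdiff\n%s\n%s\n", fence, d.BodyDiff, fence)
	return b.String()
}

func main() {
	fmt.Print(renderDeviation(Deviation{
		ID:             "get-users-1",
		Method:         "GET",
		Path:           "/users/1",
		FailureType:    "status_mismatch",
		ExpectedStatus: 200,
		ActualStatus:   500,
		BodyDiff:       "-{\"ok\": true}\n+{\"error\": \"boom\"}",
	}))
}
```

Keeping the frontmatter flat is what makes a query such as `grep -l "failure_type: status_mismatch"` across a run directory practical.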

Design decisions

  • No ## Deviations list section — the response diff already shows exactly what changed; a truncated deviation list was redundant noise for an agent consumer
  • Reuses existing difflib.UnifiedDiff engine for body diffs, just without ANSI colors and TUI box borders (FormatJSONDiffPlain)
  • No file cleanup — stale runs accumulate

Edge cases handled

  • Server crashes, cancelled tests, retried-after-crash tests
  • Large response bodies (>100KB) truncated with size info
  • Unsafe filename characters sanitized
  • All write failures are warnings, never fatal
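
Two of these cases (filename sanitization and body truncation) can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the helper names `sanitizeFilename` and `truncateBody`, the allowed character set, and the wording of the truncation notice are all hypothetical. Only the 100KB threshold comes from the description above.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// maxBodyBytes mirrors the ">100KB" threshold from the PR description.
const maxBodyBytes = 100 * 1024

// sanitizeFilename replaces every character outside a conservative allow-list
// with '-'. The allow-list is an assumption for this sketch.
func sanitizeFilename(name string) string {
	var b strings.Builder
	for _, r := range name {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9',
			r == '-', r == '_', r == '.':
			b.WriteRune(r)
		default:
			b.WriteByte('-')
		}
	}
	return b.String()
}

// truncateBody cuts body to at most limit bytes and appends size info.
// It backs up to a rune boundary so a multi-byte UTF-8 character is not split.
func truncateBody(body string, limit int) string {
	if len(body) <= limit {
		return body
	}
	cut := limit
	for cut > 0 && !utf8.RuneStart(body[cut]) {
		cut--
	}
	return fmt.Sprintf("%s\n... [truncated: showing %d of %d bytes]", body[:cut], cut, len(body))
}

func main() {
	fmt.Println(sanitizeFilename("GET /users/1?x=y"))
	big := strings.Repeat("x", maxBodyBytes+10)
	out := truncateBody(big, maxBodyBytes)
	fmt.Println(out[len(out)-50:])
}
```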

Example markdown file
[Image: screenshot of an example deviation file]


@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

		dir = candidate
		break
	}
}

Unbounded loop if Stat returns unexpected error

Medium Severity

The for i := 2; ; i++ loop in NewAgentWriter only breaks when os.IsNotExist(err) is true. If os.Stat(candidate) returns an error that is not an "is not exist" error (e.g., permission denied, I/O error, or path-too-long), the condition os.IsNotExist(err) evaluates to false and the loop continues indefinitely, hanging the process. There's no upper-bound check or fallback for non-ENOENT errors.
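
One possible way to address this is a bounded search that returns unexpected `os.Stat` errors instead of looping on them. The sketch below is not the PR's `NewAgentWriter`; the helper name `pickFreeDir`, the cap of 1000 attempts, and the timestamp format in `demo` are assumptions. It also stays check-then-use, so it does not address concurrent runs.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// maxAttempts is an arbitrary cap chosen for this sketch.
const maxAttempts = 1000

// pickFreeDir returns base if nothing exists at that path, otherwise the first
// of base-2, base-3, ... that does not exist. Any Stat error other than
// "not exist" is returned to the caller instead of being retried.
func pickFreeDir(base string) (string, error) {
	for i := 1; i <= maxAttempts; i++ {
		candidate := base
		if i > 1 {
			candidate = fmt.Sprintf("%s-%d", base, i)
		}
		_, err := os.Stat(candidate)
		switch {
		case errors.Is(err, fs.ErrNotExist):
			return candidate, nil // free name found
		case err != nil:
			return "", fmt.Errorf("stat %s: %w", candidate, err)
		}
		// err == nil: the path exists, try the next suffix.
	}
	return "", fmt.Errorf("no free directory name after %d attempts", maxAttempts)
}

// demo exercises pickFreeDir inside a throwaway temp directory.
func demo() (string, string, error) {
	tmp, err := os.MkdirTemp("", "agent-demo-")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(tmp)

	base := filepath.Join(tmp, "agent-run-20260328-120000") // hypothetical timestamp format
	first, err := pickFreeDir(base)
	if err != nil {
		return "", "", err
	}
	if err := os.Mkdir(first, 0o755); err != nil {
		return "", "", err
	}
	second, err := pickFreeDir(base)
	return first, second, err
}

func main() {
	first, second, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "Warning:", err)
		return
	}
	fmt.Println(filepath.Base(first), filepath.Base(second))
}
```

Because the PR treats write failures as warnings, a caller could log the returned error and skip agent output rather than abort the run.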

		fmt.Fprintf(os.Stderr, "Warning: failed to write agent index file: %v\n", err)
	}
	fmt.Fprintf(os.Stderr, "Agent deviation files written to: %s\n", agentWriter.OutputDir())
}

Cancelled tests inflate failure count in index

Low Severity

writeAgentResult skips cancelled tests entirely (no RecordPassedTest, no WriteDeviation), but countPassedFailed counts cancelled tests as "failed" since their Passed field is false. The index header will report more failures than the deviation table contains — e.g. "3 failed" with only 2 deviation entries — because cancelled tests are invisible in w.results but included in totalTests - passedTests.
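
A sketch of the filtering this comment implies is shown below. The `TestResult` struct here is a stand-in: the `Passed` field is mentioned in the comment above, but the `Cancelled` field and the `countForIndex` helper are assumptions, since the real `runner.TestResult` definition is not shown on this page.

```go
package main

import "fmt"

// TestResult is a hypothetical stand-in for runner.TestResult.
type TestResult struct {
	TestID    string
	Passed    bool
	Cancelled bool // assumed field; the real type may expose cancellation differently
}

// countForIndex counts only tests that actually ran to completion, so the
// index header matches the number of deviation entries.
func countForIndex(results []TestResult) (total, passed, failed int) {
	for _, r := range results {
		if r.Cancelled {
			continue // cancelled tests produce no deviation file, so leave them out
		}
		total++
		if r.Passed {
			passed++
		} else {
			failed++
		}
	}
	return total, passed, failed
}

func main() {
	results := []TestResult{
		{TestID: "a", Passed: true},
		{TestID: "b", Passed: false},
		{TestID: "c", Passed: false},
		{TestID: "d", Passed: false, Cancelled: true},
	}
	total, passed, failed := countForIndex(results)
	fmt.Printf("%d total, %d passed, %d failed\n", total, passed, failed)
}
```

With these inputs the cancelled test `d` is excluded, so the header would report 2 failures for 2 deviation entries.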


Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

3 issues found across 5 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="cmd/run.go">

<violation number="1" location="cmd/run.go:753">
P2: Cancelled tests inflate the failure count in the agent index. `writeAgentResult` skips cancelled tests (no `RecordPassedTest`, no `WriteDeviation`), but `countPassedFailed` counts them as failed since `Passed` is false. The index header will report e.g. "3 failed" while only 2 deviation files exist. Filter out cancelled tests before computing the totals passed to `WriteIndex`.</violation>
</file>

<file name="internal/runner/agent_writer.go">

<violation number="1" location="internal/runner/agent_writer.go:46">
P1: The collision handling is racy; concurrent runs can still write into the same `agent-run-*` directory.</violation>

<violation number="2" location="internal/runner/agent_writer.go:49">
P2: Unbounded loop: if `os.Stat(candidate)` returns an error other than `ENOENT` (e.g. permission denied, I/O error), `os.IsNotExist(err)` is false so the loop never breaks. Add a fallback for unexpected errors or cap the iteration count.</violation>
</file>

dir := filepath.Join(baseDir, fmt.Sprintf("agent-run-%s", timestamp))

// Handle concurrent runs: if directory exists, append counter
if _, err := os.Stat(dir); err == nil {
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 28, 2026

P1: The collision handling is racy; concurrent runs can still write into the same agent-run-* directory.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/runner/agent_writer.go, line 46:

<comment>The collision handling is racy; concurrent runs can still write into the same `agent-run-*` directory.</comment>

<file context>
@@ -0,0 +1,450 @@
+	dir := filepath.Join(baseDir, fmt.Sprintf("agent-run-%s", timestamp))
+
+	// Handle concurrent runs: if directory exists, append counter
+	if _, err := os.Stat(dir); err == nil {
+		for i := 2; ; i++ {
+			candidate := fmt.Sprintf("%s-%d", dir, i)
</file context>
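
For context on the race: `os.Stat` followed by a later directory creation is a check-then-use pattern, so two processes can both see the same name as free. A common alternative is to let `os.Mkdir` do the check, since it fails with an "already exists" error if another process created the directory first. The sketch below assumes a hypothetical `createRunDir` helper, an arbitrary attempt cap, and a made-up timestamp format; it is not the PR's code.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// maxAttempts is an arbitrary cap chosen for this sketch.
const maxAttempts = 1000

// createRunDir creates and returns a fresh agent-run-<timestamp>[-N] directory
// under baseDir. os.Mkdir either creates the directory or reports that it
// already exists, so two concurrent callers cannot both claim the same name.
func createRunDir(baseDir, timestamp string) (string, error) {
	if err := os.MkdirAll(baseDir, 0o755); err != nil {
		return "", err
	}
	base := filepath.Join(baseDir, "agent-run-"+timestamp)
	for i := 1; i <= maxAttempts; i++ {
		candidate := base
		if i > 1 {
			candidate = fmt.Sprintf("%s-%d", base, i)
		}
		err := os.Mkdir(candidate, 0o755)
		if err == nil {
			return candidate, nil // this caller owns the directory
		}
		if !errors.Is(err, fs.ErrExist) {
			return "", fmt.Errorf("mkdir %s: %w", candidate, err) // permission, I/O, etc.
		}
		// Name already taken: try the next suffix.
	}
	return "", fmt.Errorf("no free directory name after %d attempts", maxAttempts)
}

// demo simulates two runs that share a timestamp.
func demo() (string, string, error) {
	tmp, err := os.MkdirTemp("", "agent-demo-")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(tmp)

	a, err := createRunDir(tmp, "20260328-120000") // hypothetical timestamp format
	if err != nil {
		return "", "", err
	}
	b, err := createRunDir(tmp, "20260328-120000")
	return a, b, err
}

func main() {
	a, b, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "Warning:", err)
		return
	}
	fmt.Println(filepath.Base(a), filepath.Base(b))
}
```

This shape also bounds the loop and surfaces unexpected errors, which would cover the separate unbounded-loop finding on the same code.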

OnAllCompleted: func(results []runner.TestResult, tests []runner.Test, exec *runner.Executor) {
	// Write agent index after all tests complete (interactive mode)
	if agentWriter != nil {
		passed, _ := countPassedFailed(results)
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 28, 2026

P2: Cancelled tests inflate the failure count in the agent index. writeAgentResult skips cancelled tests (no RecordPassedTest, no WriteDeviation), but countPassedFailed counts them as failed since Passed is false. The index header will report e.g. "3 failed" while only 2 deviation files exist. Filter out cancelled tests before computing the totals passed to WriteIndex.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At cmd/run.go, line 753:

<comment>Cancelled tests inflate the failure count in the agent index. `writeAgentResult` skips cancelled tests (no `RecordPassedTest`, no `WriteDeviation`), but `countPassedFailed` counts them as failed since `Passed` is false. The index header will report e.g. "3 failed" while only 2 deviation files exist. Filter out cancelled tests before computing the totals passed to `WriteIndex`.</comment>

<file context>
@@ -701,6 +748,15 @@ func runTests(cmd *cobra.Command, args []string) error {
 			OnAllCompleted: func(results []runner.TestResult, tests []runner.Test, exec *runner.Executor) {
+				// Write agent index after all tests complete (interactive mode)
+				if agentWriter != nil {
+					passed, _ := countPassedFailed(results)
+					if err := agentWriter.WriteIndex(len(results), passed); err != nil {
+						fmt.Fprintf(os.Stderr, "Warning: failed to write agent index file: %v\n", err)
</file context>

Comment on lines +49 to +52
if _, err := os.Stat(candidate); os.IsNotExist(err) {
	dir = candidate
	break
}
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 28, 2026

P2: Unbounded loop: if os.Stat(candidate) returns an error other than ENOENT (e.g. permission denied, I/O error), os.IsNotExist(err) is false so the loop never breaks. Add a fallback for unexpected errors or cap the iteration count.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At internal/runner/agent_writer.go, line 49:

<comment>Unbounded loop: if `os.Stat(candidate)` returns an error other than `ENOENT` (e.g. permission denied, I/O error), `os.IsNotExist(err)` is false so the loop never breaks. Add a fallback for unexpected errors or cap the iteration count.</comment>

<file context>
@@ -0,0 +1,450 @@
+	if _, err := os.Stat(dir); err == nil {
+		for i := 2; ; i++ {
+			candidate := fmt.Sprintf("%s-%d", dir, i)
+			if _, err := os.Stat(candidate); os.IsNotExist(err) {
+				dir = candidate
+				break
</file context>
Suggested change
- if _, err := os.Stat(candidate); os.IsNotExist(err) {
- 	dir = candidate
- 	break
- }
+ if _, err := os.Stat(candidate); os.IsNotExist(err) {
+ 	dir = candidate
+ 	break
+ } else if err != nil {
+ 	// Unexpected error (permission denied, I/O error, etc.): use this candidate
+ 	dir = candidate
+ 	break
+ }
