Handle excluded input column in event expansion#6

Closed
rasmusfaber wants to merge 110 commits into main from fix/handle-excluded-input

Conversation

@rasmusfaber

Collaborator

Summary

  • _expand_events_in_df crashed with KeyError when the input column was excluded from scan results but input_data was present
  • Added early-return guard for missing input column
  • Added unit test covering the exact failure scenario
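
A minimal sketch of the early-return guard described above. It is not the actual patch: `expand_events` is a hypothetical stand-in for `_expand_events_in_df`, a plain dict of column lists stands in for the DataFrame so the example has no dependencies, and the JSON-decoding step is only an assumed placeholder for the real expansion logic.

```python
import json


def expand_events(columns: dict[str, list]) -> dict[str, list]:
    """Hypothetical stand-in for _expand_events_in_df.

    `columns` maps a column name to its list of cell values
    (an assumed substitute for a DataFrame).
    """
    # Guard: if either column is missing there is nothing to expand,
    # so return the data untouched instead of raising KeyError below.
    if "input" not in columns or "input_data" not in columns:
        return columns

    expanded = dict(columns)
    # Assumed expansion step: decode input_data where it is present,
    # otherwise keep the original input value.
    expanded["input"] = [
        json.loads(data) if data is not None else original
        for original, data in zip(columns["input"], columns["input_data"])
    ]
    return expanded
```

With the guard in place, a result set that carries `input_data` but not `input` passes through unchanged, which is the scenario the new unit test is described as covering.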

Test plan

  • New unit test test_expand_events_no_input_column passes
  • Existing test_event_expansion.py tests still pass

🤖 Generated with Claude Code

jjallaire and others added 30 commits February 23, 2026 11:09
…ai#294)

* initial work on transcript node detection

* transcript nodes for typescript

* more work on design

* wip

* infra events

* fixups

* utility agents

* add support for branches

* timeline ui

* updates

* add section on registration to docs

* regenerate docs

* timeline panel

* multiple timelines

* custom outline

* remove type suffix

* custom outline design notes

* design docs

* implementation plan

* more synthetic nodes

* Add swimlane row computation and make node times non-nullable

Phase 1 of timeline core logic: implement computeSwimLaneRows() which
transforms an AgentNode's children into SwimLaneRow[] for rendering as
horizontal swimlane bars. Handles sequential, iterative (multiple spans),
and parallel (overlapping) agent patterns with case-insensitive grouping.

Also makes start_time/end_time non-nullable across both Python and
TypeScript node types, since every Event has a required timestamp field.
This eliminates pervasive null checks throughout the codebase. Container
nodes use epoch sentinel for the degenerate empty-content case.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* save timeline context

* Add content item building for timeline detail panel

Phase 2 of timeline core logic: implement buildContentItems() which
transforms an AgentNode into a flat list of ContentItems (event,
agent_card, branch_card). Branch cards are inserted after the event
matching their forkedAt UUID; unmatched branches append at the end.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
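
The branch-card placement rule in the commit message above can be illustrated roughly as follows. This is a Python sketch of behaviour the commit implements in TypeScript; the dict shapes and the `uuid` / `forked_at` keys are assumptions made for illustration, and agent cards are left out for brevity.

```python
def build_content_items(events: list[dict], branches: list[dict]) -> list[dict]:
    """Flatten events and branches into one ordered list of content items."""
    event_ids = {event["uuid"] for event in events}

    # Index branches by the UUID of the event they forked from.
    by_fork: dict[str, list[dict]] = {}
    for branch in branches:
        by_fork.setdefault(branch["forked_at"], []).append(branch)

    items: list[dict] = []
    for event in events:
        items.append({"type": "event", "event": event})
        # Branch cards go directly after the event they forked at.
        for branch in by_fork.get(event["uuid"], []):
            items.append({"type": "branch_card", "branch": branch})

    # Branches whose fork point matches no event are appended at the end.
    for branch in branches:
        if branch["forked_at"] not in event_ids:
            items.append({"type": "branch_card", "branch": branch})
    return items
```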

* Add marker computation for timeline UI

Phase 3 of timeline core logic: implement collectMarkers() which finds
error, compaction, and branch markers in an AgentNode at configurable
depth (direct, children, recursive). Includes isErrorEvent() and
isCompactionEvent() helpers for event classification.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* update context

* commit timeline

* ruff

* import event tree span

* rename to timeline

* rename to timelinebranch

* rename to timeline node

* update docs

* add timeline scanning target

* add support for timeline=true

* improved auto-span detection

* timeline scanning doc

* don't do fingerprinting across compaction boundaries

* rename timeline => timelines

* update scanning doc

* update with working guidelines

* message_numbering function

* add commit id

* messages_by_compaction

* add commit id

* message chunking

* update heading

* message chunking

* update scanning doc

* phase 5 timeline control

* improve compaction handling

* add depth parameter

* claude code transcript source

* extract answer functions

* update plan

* refactor llm_scanner

* update doc

* parallel segment scanning

* llm scanner reduction

* update reference

* draft doc enhancements

* remove scan_segments

* Improve minimap

* clear breadcrumbs on reload

* Baseline events

* fine tuning

* collapsible

* Drive minimap with selection

* tweak minimap

* market improvements

* Emplace timeline outline as placeholder

* Show selection in breadcrumbs

* Add marker controller to test rig

* Toggleable outline

* reorganize code

* Refactoring

* filter_timeline rather than include param

* scorers section

* factor out timeline detection

* move function

* exclude scorers from transcript_messages by default

* claude code import fixes

* correct token counting

* propagate to ts

* reorder imports

* fix lint

* improve import perf

* baselines script

* improve code quality

* more tests for claude code source

* remove calls to getattr

* improve event routing

* correct total tokens

* correctly populate task_repeat

* properly group assistant messages

* populate description field

* more flexible slicing out of compactions

* messages_by_compaction function

* trajectories example

* ensure previous messages are flushed after compaction

* Make sure parent of selection is navigable

* Improve spacing

* populate top level claude code messages using timeline

* collapse tool output in messages view

* import_cc example

* merge sessions

* scout import

* no gitignore

* improve import ux

* overwrite flag

* apply limit by time for claude code

* some tool views for claude code

* improve /clear handling

* yaml parsing cleanup

* improve transcript dir handling

* code review feedback

* code review feedback

* remove spurious user command messages

* doc plan

* consolidate messages_by_compaction into span_messages

* initial restructuring of scanner docs

* add docs on multi-label

* update transcript_fields

* update documentation plan

* improve handling of model instances w/ multiprocessing

* update scanner_ir with changes in llm_scanner

* address feedback

* improve custom scanner doc

* more doc improvements

* improve timeline docs

* improve scanner tools

* simplify llm reducer

* update reference

* scout import docs

* improve llm reducer

* improve default reducers

* support for images from claude code logs

* fix majority reducer

* code review feedback

* more liberal text extraction

* regen docs

* doc fixes

* extract_refs

* update test_generate tests

* remove timelines from docs

* Revert "remove timelines from docs"

This reverts commit 6e3a887.

* clean out some design and examples files

* dev version callouts

* more dev callouts

* update changelog

* ruff format

* fix date format in python 3.10

* format typescript

* reformat

* update uv lock

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Charles Teague <charles@merdianlabs.ai>
* add timeline_dump and timeline_load functions

* import timeline functions from inspect_ai

* remove cc events
* set assistant message id from underlying jsonl event

* update inspect dep

* call inspect_swe for cc events

* add inspect-swe dependency

* tweak changelog
…ianlabs-ai#298)

* timelines: Improve agent detection logic in `timeline_build()`

* fix tests

* update python tests
Implements a new source for importing traces from Weights & Biases Weave,
following the existing pattern established by langsmith, logfire, and phoenix.

Features:
- Async generator that yields Transcript objects from Weave traces
- Provider format detection for OpenAI, Anthropic, and Google
- Tree building for hierarchical call/span structures
- Event conversion (ModelEvent, ToolEvent, SpanBeginEvent, SpanEndEvent)
- Message extraction using inspect_ai converters
- Retry logic with tenacity for API resilience
- Support for time-based filtering and custom filters

Includes comprehensive test suite with:
- 92 unit tests covering all modules
- Integration test framework with bootstrap.py for real trace creation
- Mock objects simulating real Weave call structures

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Charles Teague <charles@meridianlabs.ai>
Co-authored-by: jjallaire <jj.allaire@gmail.com>
jjallaire and others added 27 commits March 16, 2026 16:51
Summary._report() was counting all items in a resultset (including
value=0 items) as positive results. This caused the sidebar to show
inflated numbers (e.g., 965) while the results list correctly filtered
to only positive matches (e.g., 12).

Now uses is_positive_value() to check each item's value field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
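
To illustrate the counting fix described in the commit message above, here is a small sketch. The `is_positive` helper is a hypothetical stand-in for `is_positive_value()`, whose exact rules are not shown in this PR; the assumption here is simply that falsy values such as 0, False, and None are not positive.

```python
def is_positive(value) -> bool:
    # Assumption: any falsy value (0, False, None, "") is a non-match.
    return bool(value)


def count_all(items: list[dict]) -> int:
    # Old behaviour as described: every item in the resultset is counted.
    return len(items)


def count_positive(items: list[dict]) -> int:
    # New behaviour as described: only items with a positive value are counted.
    return sum(1 for item in items if is_positive(item.get("value")))
```

On a resultset where most items carry `value=0`, `count_all` gives the inflated sidebar figure while `count_positive` agrees with the filtered results list.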
RecorderBuffer loaded the existing _summary.json on both init and
resume, causing summary counters (scans, results) to accumulate across
scan reruns. Now init() passes reset=True to start fresh, while
resume() preserves the existing summary as intended.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
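
The init-versus-resume distinction in the commit message above might look like the sketch below. `SummaryBuffer` is a hypothetical, much-reduced stand-in for `RecorderBuffer`; only the `_summary.json` file name and the `reset` flag come from the commit message, and the counter layout is assumed.

```python
import json
from pathlib import Path


class SummaryBuffer:
    """Hypothetical stand-in for RecorderBuffer's summary handling."""

    def __init__(self, directory: Path, reset: bool = False):
        self.path = directory / "_summary.json"
        if reset or not self.path.exists():
            # init(): start from zero so reruns do not accumulate counters.
            self.summary = {"scans": 0, "results": 0}
        else:
            # resume(): keep whatever the earlier run recorded.
            self.summary = json.loads(self.path.read_text())

    def record(self, results: int) -> None:
        # Update the counters and persist them for a later resume.
        self.summary["scans"] += 1
        self.summary["results"] += results
        self.path.write_text(json.dumps(self.summary))
```

Under this sketch an init path would construct the buffer with `reset=True`, while a resume path would use the default and pick up the saved counters.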
Make error file truncation conditional on reset (matching the summary
logic). Previously errors were always truncated, even on resume.

Also: parametrize counting tests, use monkeypatch for env vars, use
public scan_summary() API in assertions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Charles Teague <charles@meridianlabs.ai>
* Bump to main

* correct types?

* point to latest main

---------

Co-authored-by: Charles Teague <charles@meridianlabs.ai>
…labs-ai#352)

Co-authored-by: Rasmus Faber-Espensen <rfaber@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Charles Teague <charles@meridianlabs.ai>
* Update to take marker fix

* bump to latest

---------

Co-authored-by: Charles Teague <charles@meridianlabs.ai>
When input is excluded from scan results, _expand_events_in_df would
crash with a KeyError because it checked for input_data but not input.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@rasmusfaber rasmusfaber deleted the fix/handle-excluded-input branch March 25, 2026 10:21
@rasmusfaber rasmusfaber restored the fix/handle-excluded-input branch March 25, 2026 10:22
7 participants