Integrate agent transcripts into the DAG (Phase C)#99
📝 Walkthrough
Integrates agent/sidechain transcript entries into the DAG by adding
Changes
Sequence Diagram
sequenceDiagram
participant Directory as Directory Scanner
participant Converter as Converter
participant Integrator as _integrate_agent_entries()
participant DAGBuilder as DAG Builder
participant Renderer as Renderer
Directory->>Converter: load_directory_transcripts()
Converter->>Converter: scan & collect all messages
Converter->>Integrator: _integrate_agent_entries(all_messages)
Integrator->>Integrator: build agentId → anchor UUID map
Integrator->>Integrator: set parentUuid on agent roots
Integrator->>Integrator: assign synthetic sessionIds ("#agent-")
Integrator-->>Converter: modified messages (in-place)
Converter->>DAGBuilder: build_dag_from_entries(all_messages, sidechain_uuids)
DAGBuilder->>DAGBuilder: construct DAG including agent child sessions
DAGBuilder-->>Converter: DAG structure
Converter->>Renderer: generate HTML/navigation
Renderer->>Renderer: filter out agent sessions from nav
Renderer->>Renderer: use effective parent session for agent messages
Renderer-->>Converter: rendered content grouped under parent sessions
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
|
@coderabbitai review |
✅ Actions performed: Review triggered.
Actionable comments posted: 2
🧹 Nitpick comments (1)
test/test_dag_integration.py (1)
741-986: Add one branch-scoped agent regression. These cases only exercise agents anchored on ordinary session trunks. A case where the anchor lives inside a within-session fork would catch regressions where agent messages get regrouped under the base session instead of the branch pseudo-session.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/test_dag_integration.py` around lines 741 - 986, Add a new test (e.g., test_agent_branch_scoped_session) that creates a within-session fork and places an agent anchor inside one branch to ensure agent entries are attached to the branch pseudo-session rather than the base session: use _make_user_entry and _make_assistant_entry to build a main trunk (u1→a1), then create two divergent branch entries both referencing a1 (branch anchor u2_branch and a continuation a2_branch) and include agent sidechain entries with agent_id tied to u2_branch; write them with _write_jsonl and call load_directory_transcripts, then assert the synthetic agent session appears in tree.sessions with parent_session_id equal to the specific branch session id and attachment_uuid equal to the branch anchor's uuid (using tree.sessions, parent_session_id, and attachment_uuid) and that agent entries are ordered between the branch anchor and its continuation in the returned result.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@claude_code_log/renderer.py`:
- Around line 2025-2031: The code incorrectly derives agent parent scope by
slicing msg_session_id (msg_session_id.split("#agent-") ...) which loses branch
context; instead, use the session-tree/explicit render parent information (e.g.
prefer message.render_session_id or call the SessionTree method that resolves
the parent/anchor for a session) to compute agent_parent_session so branch
pseudo-sessions are preserved; update occurrences around
msg_session_id/agent_parent_session (and any logic that feeds
_reorder_session_template_messages) to stop splitting on "#agent-" and use the
SessionTree API or message.render_session_id to determine the true parent
session.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a1c21610-c91a-40de-856a-11a147b04f98
📒 Files selected for processing (6)
claude_code_log/converter.py
claude_code_log/dag.py
claude_code_log/renderer.py
dev-docs/dag.md
test/test_dag_integration.py
work/phase-c-agent-transcripts.md
|
@coderabbitai review |
✅ Actions performed: Review triggered.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
claude_code_log/converter.py (1)
1431-1437: ⚠️ Potential issue | 🟠 Major
Fold agent entries into the parent session cache instead of skipping them.
These messages are grouped back under the parent session later when rendering/paginating, so dropping #agent-... sessions here makes the cached message_count and token totals too low for any session with agents. That skews page assignment and session/page stats away from what the UI actually renders. Coalesce synthetic agent sessions back to the parent session ID here, and mirror the same normalization in _build_session_data_from_messages().
💡 Suggested fix
- session_id = getattr(message, "sessionId", "")
- if not session_id or is_agent_session(session_id):
+ session_id = getattr(message, "sessionId", "")
+ if not session_id:
      continue
+ if is_agent_session(session_id):
+     session_id = session_id.split("#agent-")[0]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@claude_code_log/converter.py` around lines 1431 - 1437, The current branch skips any message whose sessionId is an agent session (is_agent_session(session_id)), which drops those messages from the session cache and undercounts message_count/tokens; instead, when processing session-level data in this block (and similarly inside _build_session_data_from_messages()), detect agent sessions and remap/coalesce them into their parent session id (e.g., derive parent_id from session_id rather than continue), then proceed to accumulate counts/tokens under that parent_id; update both this message-processing branch (where message and SummaryTranscriptEntry are checked) and the _build_session_data_from_messages() normalization so agent sessions are folded into the parent session consistently.
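The coalescing idea in this comment can be sketched in isolation. The block below is a minimal illustration, not the project's code: `fold_to_parent` and `count_messages_per_session` are hypothetical names, and the real `is_agent_session()` / `get_parent_session_id()` helpers may behave differently (in particular, this sketch ignores branch pseudo-sessions and always folds to the base session ID).

```python
from __future__ import annotations

from collections import Counter

AGENT_MARKER = "#agent-"  # separator used by the synthetic IDs in this PR


def fold_to_parent(session_id: str) -> str:
    """Hypothetical helper: strip the synthetic agent suffix, if any."""
    return session_id.split(AGENT_MARKER, 1)[0]


def count_messages_per_session(session_ids: list[str]) -> Counter[str]:
    """Count messages per session, folding agent sessions into their parent."""
    counts: Counter[str] = Counter()
    for session_id in session_ids:
        if not session_id:
            continue  # entries without a session ID are still skipped
        counts[fold_to_parent(session_id)] += 1
    return counts


counts = count_messages_per_session(
    ["s1", "s1", "s1#agent-xyz", "s1#agent-xyz", "", "s2"]
)
```

With this shape, the two agent messages contribute to the total for `s1` instead of disappearing from the session metadata.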
🧹 Nitpick comments (1)
test/test_dag_integration.py (1)
738-1072: Add a regression for session-metadata/pagination coalescing. These tests cover DAG/tree/rendering well, but the PR also changes cache/session aggregation and paginated grouping. A focused assertion around _update_cache_with_session_data() or paginated HTML generation would catch cases where agent messages render inline but are omitted from the parent session's message_count or token totals.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/test_dag_integration.py` around lines 738 - 1072, Add a regression test that exercises _update_cache_with_session_data (via load_directory_transcripts) to ensure agent sidechain entries are included in the parent session's pagination/aggregation metadata: create a main session with an agent anchor and sidechain agent_entries (like existing tests), call load_directory_transcripts(tmp_path, silent=True) then call or verify the cache update path (_update_cache_with_session_data) ran, and assert that session_tree.sessions["s1"] (or the parent session object) has message_count and total token-related fields updated to include the agent messages (e.g., message_count increased by agent entries, total_tokens or similar totals reflect agent content) so paginated HTML generation will include agent messages in parent session aggregates.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3db4e5f1-dedf-4f81-9ffa-473893d3cb36
📒 Files selected for processing (4)
claude_code_log/converter.py
claude_code_log/renderer.py
claude_code_log/utils.py
test/test_dag_integration.py
|
@coderabbitai review |
✅ Actions performed: Review triggered.
Actionable comments posted: 2
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 047fec33-895e-4754-b83c-c8d84df4dae8
📒 Files selected for processing (4)
claude_code_log/converter.py
claude_code_log/renderer.py
claude_code_log/utils.py
test/test_dag_integration.py
def _integrate_agent_entries(messages: list[TranscriptEntry]) -> None:
    """Parent agent entries and assign synthetic session IDs.

    Agent (sidechain) entries share sessionId with their parent session
    but form separate conversation threads. This function:

    1. Builds a map of agentId -> anchor UUID (the main-session User entry
       whose agentId matches, i.e. the tool_result that references the agent)
    2. For each agent's root entry (parentUuid=None, isSidechain=True),
       sets parentUuid to the anchor UUID
    3. Assigns a synthetic sessionId ("{sessionId}#agent-{agentId}") to all
       agent entries so they form separate DAG-lines

    Mutates entries in place (Pydantic v2 models are mutable by default).
    """
    # Build agentId -> anchor UUID map from main-session entries
    agent_anchors: dict[str, str] = {}
    for msg in messages:
        if not isinstance(msg, BaseTranscriptEntry):
            continue
        if msg.isSidechain:
            continue
        # Main-session entries with agentId reference an agent transcript
        if msg.agentId:
            agent_anchors[msg.agentId] = msg.uuid

    if not agent_anchors:
        return

    # Process sidechain entries: parent roots and assign synthetic sessionIds
    for msg in messages:
        if not isinstance(msg, BaseTranscriptEntry):
            continue
        if not msg.isSidechain or not msg.agentId:
            continue
        agent_id = msg.agentId
        # Assign synthetic session ID to separate from main session
        msg.sessionId = f"{msg.sessionId}#agent-{agent_id}"
        # Parent the root entry to the anchor
        if msg.parentUuid is None and agent_id in agent_anchors:
            msg.parentUuid = agent_anchors[agent_id]
Nested agent transcripts still won't get anchored.
load_transcript() already recurses into agent files, but agent_anchors is built only from non-sidechain entries. If an agent launches another agent, that inner anchor lives in a sidechain entry, so the nested root keeps parentUuid=None and becomes a separate root instead of attaching under the outer agent branch.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@claude_code_log/converter.py` around lines 416 - 456, The agent anchoring
logic in _integrate_agent_entries only collects anchors from non-sidechain
entries so nested agents (whose anchor is in a sidechain) remain unanchored;
update the agent_anchors build to include any entry with an agentId (i.e., check
msg.agentId regardless of msg.isSidechain) and when multiple candidates exist
prefer a non-sidechain anchor over a sidechain one (so preserve an existing
non-sidechain anchor and only set/override when the current candidate is a
better choice), keeping references to BaseTranscriptEntry, agentId, isSidechain,
agent_anchors, parentUuid, and sessionId to locate the change.
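The "prefer a non-sidechain anchor" rule suggested above can be sketched as a small selection function. This is only an illustration under assumptions: `build_agent_anchors` and the `(agent_id, uuid, is_sidechain)` candidate tuples are hypothetical, and how the real code extracts candidates (and keeps an agent's own entries from anchoring themselves) is deliberately left out.

```python
from __future__ import annotations


def build_agent_anchors(candidates: list[tuple[str, str, bool]]) -> dict[str, str]:
    """Pick one anchor UUID per agent ID, preferring non-sidechain candidates.

    Each candidate is (agent_id, anchor_uuid, is_sidechain). This tuple shape
    is an assumption made for the sketch, not the project's data model.
    """
    anchors: dict[str, str] = {}
    anchor_is_sidechain: dict[str, bool] = {}
    for agent_id, uuid, is_sidechain in candidates:
        first_seen = agent_id not in anchors
        # Only a main-session candidate may replace a sidechain anchor.
        better = not first_seen and anchor_is_sidechain[agent_id] and not is_sidechain
        if first_seen or better:
            anchors[agent_id] = uuid
            anchor_is_sidechain[agent_id] = is_sidechain
    return anchors


anchors = build_agent_anchors(
    [
        ("agent-b", "side-1", True),   # anchor found inside another agent
        ("agent-b", "main-1", False),  # main-session anchor wins
        ("agent-c", "side-2", True),   # nested agent with only a sidechain anchor
        ("agent-b", "side-3", True),   # does not displace the main-session anchor
    ]
)
```

Unlike the loop in the diff above, which lets the last main-session match overwrite earlier ones, this sketch keeps the first candidate within each class; either choice satisfies the preference described in the comment.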
  # Find all unique session IDs (excluding warmup and agent sessions)
  session_ids: set[str] = set()
  for message in messages:
      if hasattr(message, "sessionId"):
          session_id: str = getattr(message, "sessionId")
-         if session_id and session_id not in warmup_session_ids:
+         if (
+             session_id
+             and session_id not in warmup_session_ids
+             and not is_agent_session(session_id)
+         ):
The no-cache project-summary fallback still leaks synthetic agent sessions.
Filtering agent sessions out here fixes individual session files, but process_projects_hierarchy() falls back to _collect_project_sessions() when cached project data is missing, and that helper still groups by raw sessionId. In that path, #agent-... pseudo-sessions will still show up as standalone project-index sessions, which contradicts the inline-only behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@claude_code_log/converter.py` around lines 1649 - 1658, The fallback path
still groups by raw sessionId and thus leaks synthetic agent sessions; update
_collect_project_sessions() (or the process_projects_hierarchy() fallback call)
to exclude any session_id where is_agent_session(session_id) is true and to skip
warmup_session_ids, mirroring the filtering used earlier (session_id not in
warmup_session_ids and not is_agent_session(session_id)); ensure the grouping
step uses the filtered session_ids variable so pseudo-sessions like "#agent-..."
are never emitted as standalone project-index sessions.
Agent (sidechain) entries are now part of the DAG instead of being
partitioned out and reordered heuristically. _integrate_agent_entries()
parents agent roots to their anchor tool_result and assigns synthetic
session IDs ({sessionId}#agent-{agentId}) so they form separate
DAG-lines spliced at the correct junction point.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
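To make "spliced at the correct junction point" concrete, here is a toy traversal, offered as a sketch only. `Node` and `walk` are hypothetical and much simpler than the project's DAG builder; the assumption is that once an agent root is parented to its anchor, a depth-first walk that visits child lines before the same-session continuation yields anchor, agent entries, continuation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical, minimal stand-in for a transcript entry."""

    uuid: str
    session_id: str
    parent_uuid: Optional[str] = None


def walk(nodes: list[Node]) -> list[str]:
    """Depth-first walk visiting child DAG-lines before the same-session continuation."""
    children: dict[Optional[str], list[Node]] = {}
    for node in nodes:
        children.setdefault(node.parent_uuid, []).append(node)

    order: list[str] = []

    def visit(node: Node) -> None:
        order.append(node.uuid)
        kids = children.get(node.uuid, [])
        # False sorts before True, so entries from a different session
        # (the agent line) come first; sorted() is stable otherwise.
        for kid in sorted(kids, key=lambda k: k.session_id == node.session_id):
            visit(kid)

    for root in children.get(None, []):
        visit(root)
    return order


nodes = [
    Node("u1", "s1"),
    Node("a1", "s1", "u1"),
    Node("tool_result", "s1", "a1"),  # the anchor
    Node("a2", "s1", "tool_result"),  # main-session continuation
    Node("agent_root", "s1#agent-xyz", "tool_result"),  # parented to the anchor
    Node("agent_reply", "s1#agent-xyz", "agent_root"),
]
order = walk(nodes)
```

The agent entries land between the anchor and its continuation without any reordering heuristic, which is the property the integration tests assert for real transcripts.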
Use session hierarchy to determine the correct parent session for agent messages (may be a branch pseudo-session when the anchor is inside a within-session fork). Fall back to extracting it from the synthetic ID.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
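A rough sketch of that lookup order follows. `SessionNode`, `effective_parent_session`, and the branch ID `s1@branch-1` are all hypothetical; only the `parent_session_id` attribute and the `#agent-` separator come from this PR, and the real session tree may be shaped differently.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

AGENT_MARKER = "#agent-"


@dataclass
class SessionNode:
    """Hypothetical stand-in for an entry in the session tree."""

    session_id: str
    parent_session_id: Optional[str] = None


def effective_parent_session(session_id: str, sessions: dict[str, SessionNode]) -> str:
    """Resolve the session an agent message should be grouped under."""
    node = sessions.get(session_id)
    if node is not None and node.parent_session_id:
        # The hierarchy knows the true parent, which may be a branch pseudo-session.
        return node.parent_session_id
    if AGENT_MARKER in session_id:
        # Fallback: derive the base session from the synthetic ID.
        return session_id.split(AGENT_MARKER, 1)[0]
    return session_id


sessions = {"s1#agent-x": SessionNode("s1#agent-x", parent_session_id="s1@branch-1")}
from_tree = effective_parent_session("s1#agent-x", sessions)
from_fallback = effective_parent_session("s1#agent-y", sessions)
```

The fallback alone would lose the branch context, which is why the hierarchy is consulted first.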
Extract `is_agent_session()` helper to `utils.py` and use it to filter agent sessions from: session metadata building, cache population, pagination session collection, and message grouping for pages. Agent messages in pagination are grouped under their parent session.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of dropping agent messages from session metadata, remap their synthetic session IDs to the parent session using get_parent_session_id(). This ensures message counts, token totals, and timestamps in pagination and cache include agent contributions. Add regression test verifying agent messages fold into parent session's message_count and token totals via _build_session_data_from_messages().
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- _integrate_agent_entries: also collect anchors from sidechain entries (for nested agents where agent A spawns agent B), preferring non-sidechain anchors when both exist.
- _collect_project_sessions: coalesce agent sessions into parent via get_parent_session_id(), preventing synthetic #agent- sessions from appearing as standalone entries in the project index.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
With DAG integration, agent messages are already in correct order via DAG traversal; the sidechain reorder function was a no-op for directory mode. To make it safe to remove for single-file mode too, add _integrate_agent_entries() to the single-file path in converter.py so both code paths use consistent DAG-based ordering. _cleanup_sidechain_duplicates() is kept as-is: content-level dedup (Task input/output duplicated in sidechain) cannot be handled at the DAG level since it requires text comparison, not structural ordering.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reset current_render_session to None when entering a new non-agent session in _render_messages(). Previously, the branch tracking state from a within-session fork would leak into subsequent sessions, causing their messages to inherit a stale branch render_session_id.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fork points now render as standalone structural elements (⑂) outside the message box, listing each branch with preview text. Branch headers are compact with ↳ prefix, no session divider, and a back-reference to the fork point.
Changes:
- renderer.py: Extend junction_forward_links with branch preview text, add fork_point_preview attribute to TemplateMessage
- transcript.html: Move fork element outside message box, update branch header to use ↳ prefix and skip session divider
- system_formatters.py: Redesign branch header HTML with compact back-reference (from ⑂ Fork point • preview)
- message_styles.css: Branch headers are transparent, no box shadow, left border only
- session_nav_styles.css: Replace junction-link styles with fork-point element styles
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- .branch-header: margin 2em/1em, fold-bar border override to gray
- .fork-point: margin-bottom 2em, border-radius 8px
- Branch headers show "(in Session X)" for cross-session forks (when branch's original session differs from parent's real session)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add --fork-point-color and --branch-point-color CSS variables to :root
- Replace hardcoded #adb5bd with var() references in fork/branch styles
- Fix fold-bar specificity: use .branch-header .fold-bar[data-border-color="session-header"] to override the yellow/orange session-header color
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entries like "attachment", "permission-mode", "progress", and "file-history-snapshot" have uuid/parentUuid and are part of the DAG chain, but were previously dropped at parse time. This caused false roots: any message whose parentUuid pointed to a dropped entry became an orphan in the DAG. PassthroughTranscriptEntry captures just the DAG-relevant fields (uuid, parentUuid, sessionId, timestamp, isSidechain) so these entries participate in the DAG graph without being rendered.
Changes:
- models.py: Add PassthroughTranscriptEntry, add to TranscriptEntry union
- converter.py: Create PassthroughTranscriptEntry for unknown types with uuid+sessionId; update _repair_parent_chains to skip repair for progress entries that are present as passthrough nodes; update _integrate_agent_entries and _build_session_data_from_messages
- transcript_factory.py: Fall back to PassthroughTranscriptEntry in create_transcript_entry for cache deserialization
- renderer.py: Filter out PassthroughTranscriptEntry in _filter_messages
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
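The false-root effect is easy to reproduce on a three-entry chain. The sketch below uses plain dictionaries and a hypothetical `find_roots` helper rather than the project's models; it only illustrates why keeping the intermediate entry as a passthrough node preserves the chain.

```python
from __future__ import annotations


def find_roots(entries: list[dict]) -> list[str]:
    """Return UUIDs whose parent is missing from the loaded entries."""
    known = {entry["uuid"] for entry in entries}
    return [entry["uuid"] for entry in entries if entry["parentUuid"] not in known]


chain = [
    {"uuid": "u1", "type": "user", "parentUuid": None},
    {"uuid": "att1", "type": "attachment", "parentUuid": "u1"},
    {"uuid": "a1", "type": "assistant", "parentUuid": "att1"},
]

# Dropping the attachment at parse time orphans the assistant entry.
dropped = [entry for entry in chain if entry["type"] != "attachment"]
roots_when_dropped = find_roots(dropped)

# Keeping it as a non-rendered node leaves a single true root.
roots_when_kept = find_roots(chain)
```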
Two fixes found from real-data testing:
1. deduplicate_messages() was dropping PassthroughTranscriptEntry entries with identical timestamps (common for attachments). Use uuid as content_key to prevent false dedup of DAG chain nodes.
2. Fork points with fewer than 2 branches with meaningful preview text (i.e. not UUID-fragment fallbacks) are now elided: they add visual noise without providing navigation value.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
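The first fix can be illustrated with a simplified dedup pass. The key layout `(type, timestamp, content_key)` is an assumption made for this sketch; the real `deduplicate_messages()` may build its key differently, and `deduplicate` and `content_key` here are hypothetical functions over plain dictionaries.

```python
from __future__ import annotations


def content_key(entry: dict) -> str:
    """Assumed key component: passthrough entries are identified by uuid."""
    if entry["type"] == "passthrough":
        return entry["uuid"]
    return entry.get("text", "")


def deduplicate(entries: list[dict]) -> list[dict]:
    """Keep the first entry for each (type, timestamp, content_key) triple."""
    seen: set[tuple[str, str, str]] = set()
    unique: list[dict] = []
    for entry in entries:
        key = (entry["type"], entry["timestamp"], content_key(entry))
        if key in seen:
            continue
        seen.add(key)
        unique.append(entry)
    return unique


same_instant = "2025-01-01T00:00:00Z"
entries = [
    {"type": "passthrough", "uuid": "p1", "timestamp": same_instant},
    {"type": "passthrough", "uuid": "p2", "timestamp": same_instant},  # distinct node
    {"type": "passthrough", "uuid": "p1", "timestamp": same_instant},  # true duplicate
]
kept = [entry["uuid"] for entry in deduplicate(entries)]
```

Two distinct DAG nodes that share a timestamp both survive, while a genuinely repeated entry is still collapsed.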
Two bugs when processing Windows-origin transcripts:
1. deduplicate_messages() collapsed distinct entries sharing a timestamp: system entries used content_key="" (now uuid), and assistant split-content-blocks shared the same message.id (now message.id + block types, so stutters still dedup).
2. SystemTranscriptEntry.level can be None (turn_duration, memory_saved, away_summary subtypes). The factory's getattr fallback never triggered because the attribute exists as None; changed to `transcript.level or "info"`.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address monk review note: use `message.level or "info"` instead of `getattr(message, "level", "info")` for consistency with the system_factory fix (same None-attribute-exists issue).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
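The None-attribute-exists issue is a general Python behaviour and can be shown with a stand-in class. `SystemEntry` below is hypothetical; it only mimics a model whose `level` field is present but set to `None`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemEntry:
    """Hypothetical stand-in for an entry whose level may be None."""

    level: Optional[str] = None


entry = SystemEntry()

# getattr's default applies only when the attribute is missing entirely,
# so an attribute that exists with value None is returned unchanged.
via_getattr = getattr(entry, "level", "info")

# `or` falls back whenever the value is falsy, which covers None.
via_or = entry.level or "info"
```

Note that `or` also replaces an empty string, which is acceptable here because an empty level carries no more meaning than a missing one.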
|
Oops, not sure why it was merged without the typecheck/ty fixes... will follow up on main directly.
…lize naming
Split planning content from as-built reference content per the principle
that ``work/`` holds plans/TODOs (authoritative until the code lands)
and ``dev-docs/`` reflects the current code (authoritative once the
code lands).
Three file moves, one rename, one deletion, one new file:
- ``dev-docs/FOLD_STATE_DIAGRAM.md`` → ``dev-docs/message-hierarchy.md``
(drops the SCREAMING_SNAKE; H1 also updated to "Message Hierarchy
and Fold State" to match content scope).
- ``dev-docs/restoring-archived-sessions.md`` → ``docs/restoring-archived-sessions.md``
(user-facing operations content; ``docs/`` is the new home for
user-facing documentation distinct from contributor docs).
- ``dev-docs/rendering-next.md`` → ``work/rendering-next.md``
("Future Work for the rendering system" — explicitly plan content).
- ``work/phase-c-agent-transcripts.md`` deleted: the work fully landed
via PR #99 (b5aefa0); the as-built reality is now in ``agents.md``.
- ``dev-docs/application_model.md`` (new): entry-point doc.
Inbound references updated in ``CLAUDE.md``, ``CONTRIBUTING.md``,
``claude_code_log/converter.py`` (URL to docs/), and the four dev-docs
that cross-referenced the moved/renamed files.
The new ``application_model.md`` is the entry point: § 1 subsystem
table with pointers to deep-dive docs, § 2 inlined coverage of
subsystems without their own deep-dive (CLI, TUI, cache implementation,
migrations, detail-level filter, image export, performance profiling),
§ 3 data lifecycle ASCII diagram, § 4 cross-cutting glossary
(TranscriptEntry, MessageContent, TemplateMessage, RenderingContext,
session_id and synthetic forms, sidechain, agent_id, fork point,
pair_first/middle/last, detail level, passthrough), § 5 "where to
start reading" by question.
Surviving dev-docs all carry a one-line breadcrumb pointing at
``application_model.md`` so a reader landing on a deep-dive without
context can find the system overview.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ming, clean work/ (#134)
* dev-docs/work/docs restructure: introduce application_model.md, normalize naming
* dev-docs: address branch-heading navigation gap surfaced by docs-walk-through
A reviewer reading the new ``application_model.md`` cold landed at ``dag.md`` for branch sessions, but the doc was silent on where the branch-heading text itself is assembled; they'd have had to grep ``renderer.py`` to find it. Three small fixes close the gap:
1. ``SessionHeaderMessage`` glossary entry in ``application_model.md`` § 4: defines the term, distinguishes trunk vs branch flavours, names the four functions that compose the title (``_branch_label``, ``_enrich_branch_titles``, ``create_session_preview``, ``simplify_command_tags``). Lands between the ``fork point / branch`` and ``pair_first / middle / last`` entries since branch headings are the lookup most likely to lead here.
2. Cross-reference in ``dag.md``'s "Navigation Links" section: one-line callout that title composition is a renderer concern, pointing back at the new glossary entry. Keeps ``dag.md`` focused on structure (its job) without the reader hitting a silent dead end on the labeling question.
3. Cross-cutting-concerns paragraph after the § 1 subsystem table: names "label and preview composition" as the canonical example of a concern that touches several rows (DAG, renderer, parsing) and isn't owned by any single one. Helps the next reader who looks for "where do branch labels live?" and finds neither the DAG row nor the renderer row obviously authoritative.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* dev-docs: sync with main for #36 (JSON), #135 (DAG cycle-break + SIGUSR1), #133 (factory-layer normalisation seam)
Three PRs landed on main since this branch was created. The dev-docs deltas they imply:
- **PR #36 (JSON renderer)**
  - `application_model.md` § 1 subsystem table gains a JSON export row; new § 2.5 "JSON export" describes the renderer (mirrors HTML/Markdown surface, runs through `generate_template_messages` so it inherits all post-factory polishing for free, JSON-specific `_json_default` shim for embedded Pydantic models, output naming via `variant_suffix`).
  - § 2.1 CLI lists `json` alongside `html`/`md`/`markdown` for `--format`.
  - § 3 data lifecycle diagram adds `json/renderer.py` as a third fan-out branch alongside HTML/Markdown.
  - § 5 entry-questions gains a "How do I export to JSON?" pointer.
  - `rendering-architecture.md` § 1 / § 4 list JSON; new `JsonRenderer` subsection in § 7 describes its serialisation approach (single document via `dataclasses.asdict` + shim; only reuses `title_content` from the dispatcher).
  - § 10 Separation of Concerns clarifies that **factories/** is the normalisation seam: display polish for *all* output formats lives there, not in renderers.
  - `implementing-a-tool-renderer.md` Overview + Tests sections note that JSON needs no per-tool integration: serialises the typed factory output directly.
- **PR #135 (DAG cycle-break + SIGUSR1)**
  - `dag.md` Phase 2 (Build DAG) gains a mermaid diagram of the three-step `build_dag` ordering (orphan promotion → cycle break → children build) and explains the ordering rationale (children must be built last because cyclic edges in `parent_uuid` would propagate into `children_uuids` otherwise, hanging downstream walks).
  - Phase 3 documents the `walk_visited` defence-in-depth in `_walk_session_with_forks`: backstop for reintroduced cycles after build time.
  - Invariant 3 (DAG acyclicity) updated from "no cycles" assertion to "actively broken at build time, walker has belt".
  - `application_model.md` new § 2.9 "Diagnosing hangs" describes the SIGUSR1 stack-dump handler installed in `cli.py`.
- **PR #133 (system-info-cleanup)**
  - Already mostly absorbed via the branch-heading navigation gap commit; the renormalisation ("if it shows up right in HTML, it shows up right in JSON") and the explicit "factory-layer normalisation seam" framing in `rendering-architecture.md` § 10 cite `simplify_command_tags` as the canonical example.
Also drive-by: § 2.6 detail-level table now includes the `high` level (was missing: `models.DetailLevel.HIGH` predates this branch but the doc never listed it).
`just ci` clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* dev-docs: address CR findings on PR #134 (page-size scope, image-export modes)
Two doc claims didn't match the actual code; both verified directly against `cli.py` / `converter.py` / `image_export.py` rather than the CR text alone.
- L75-76 page-size scope: clarify pagination targets the combined-transcript output (whole-session packing into N-message pages, with overflow when a single session exceeds N), not per-session HTML files. Verified against `cli.py:547-551` (help text says "combined transcript") and `converter.py:1002` `_assign_sessions_to_pages` (per-page accumulation at the combined-transcript level).
- § 2.7 image-export modes: rewrite to match the actual CLI choices. Mode names were wrong (`inline/referenced/dropped` → `placeholder/embedded/referenced`); defaults were also swapped (now `embedded` for HTML, `referenced` for Markdown). Verified against `cli.py:541-544` and `image_export.py:29-45`.
Skipped CR's L361 nit ("each `.md` is an in-flight or proposed plan" → "each `.md` is a plan"): the longer phrasing carries information about work/'s mixed state and isn't redundant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* CLAUDE.md: ask Claude to keep dev-docs/ in sync with code changes
Per discussion on PR #134: dev-docs/ are useful only if they remain true to the as-built reality. Articulate the principle (code is authoritative; doc is wrong if they disagree) and the work/ → dev-docs/ lifecycle so non-trivial changes prompt a doc update in the same commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- _integrate_agent_entries() in converter.py parents agent roots to their anchor tool_result and assigns synthetic session IDs ({sessionId}#agent-{agentId}) to form separate DAG-lines
The look and feel of fork points and branches is now like this:
Details
How it works
- Entries with an agentId (from entry-level or toolUseResult.agentId) become anchors
- Agent root entries (parentUuid=null) get their parentUuid set to the anchor's UUID
- A synthetic sessionId (s1#agent-xyz) separates agents from their parent session in the DAG
What's kept (conservative approach)
- _reorder_sidechain_template_messages(): now effectively a no-op for integrated agents, kept as fallback
- _cleanup_sidechain_duplicates(): still needed for Task tool input/output dedup
- sidechain_uuids in build_dag(): still needed for unloaded subagent files (e.g. aprompt_suggestion)
Key constraint
entry.sessionId on disk / in cache is never mutated. Synthetic IDs are only assigned in-memory during load_directory_transcripts().
Test plan
- TestAgentDagIntegration tests pass
- test_load_directory_with_sidechains for new DAG ordering
- ty check and ruff check clean
- -src-experiments-claude_p)
🤖 Generated with Claude Code