@@ -120,11 +120,77 @@ The paper finds agents excel at parallelization/batching and struggle with
 vectorization. Surfacing these tags would let users filter the explorer by
 strategy.

-- **Used by:** `/explorer/` optional "Strategy" filter chip.
+- **Used by:** `/explorer/` optional "Strategy" filter chip; **Strategy
+  Explorer** section on the landing page (a `ToolGrid`-style 6–8 column grid,
+  one column per category, tasks listed inside each cell with ✓/✗ for whether
+  the agent matched the human's strategy).
 - **Ideal shape:** per-task labels (`["caching", "vectorization", "io"]`) on
-  the human PR.
+  the human PR. Categories from the paper: `caching`, `vectorization`,
+  `parallelization`, `batching`, `memory`, `io`, `algorithm`,
+  `data-structure`.
+- **Pairs well with:** wishlist #3 (per-task patches). The Strategy Explorer
+  becomes far more interesting if clicking a cell opens a drawer with the
+  *representative diff hunk* for that (strategy, agent) cell — even one
+  ~10-line snippet per cell is enough to read as "this is what vectorization
+  looks like in pandas."
 - **Without it:** strategy taxonomy lives only in the paper, not the site.

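As a rough illustration of the grid described in this entry, the sketch below pivots per-task strategy labels into one column per category and marks each task ✓ or ✗ depending on whether the agent used the same strategy as the human PR. The input shape (a `human` and an `agent` label list per task), the function names, and the task IDs are assumptions made for illustration; the wishlist itself only asks for per-task labels on the human PR.

```python
# Category list copied from the wishlist entry above.
CATEGORIES = [
    "caching", "vectorization", "parallelization", "batching",
    "memory", "io", "algorithm", "data-structure",
]


def build_strategy_grid(tasks):
    """Pivot per-task labels into {category: [(task_id, matched), ...]}.

    `tasks` maps a task id to {"human": [...], "agent": [...]}.
    This input shape is hypothetical, not an existing site format.
    """
    grid = {category: [] for category in CATEGORIES}
    for task_id, labels in sorted(tasks.items()):
        agent_labels = set(labels.get("agent", []))
        for category in labels.get("human", []):
            if category in grid:  # ignore labels outside the taxonomy
                grid[category].append((task_id, category in agent_labels))
    return grid


def render_cell(entries):
    """Turn one grid cell into display strings such as '✓ task-a'."""
    return [f"{'✓' if matched else '✗'} {task_id}" for task_id, matched in entries]


# Made-up example data.
example_tasks = {
    "task-a": {"human": ["caching", "vectorization"], "agent": ["caching"]},
    "task-b": {"human": ["io"], "agent": ["io", "batching"]},
}
grid = build_strategy_grid(example_tasks)
```

Only strategies the human used produce a cell entry, so an agent-only label (here `batching` on `task-b`) leaves its column empty rather than counting as a match.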
+## 8. Agent family / model / cost taxonomy
+
+The leaderboard currently lists agents as flat IDs (e.g.
+`terminus-2,gpt-5`). For an **Agent Explorer** patterned on ccunpacked.dev's
+slash-command catalog, we need to group them by family.
+
+- **Used by:** new **Agent Explorer** section on the landing page (pill grid
+  grouped by agent family, each pill showing the agent's signature strength
+  and a cost-tier badge); future `/agents/` per-agent page.
+- **Ideal shape:** an `agents.json` keyed by agent_id:
+  ```json
+  {
+    "terminus-2,gpt-5": {
+      "agent_family": "Terminus 2",
+      "model_family": "GPT",
+      "model": "gpt-5",
+      "provider": "OpenAI",
+      "cost_tier": "frontier",
+      "open_weights": false,
+      "signature_strength": "module-level optimization",
+      "color_category": "frontier-closed"
+    }
+  }
+  ```
+- **Pairs well with:** wishlist #6 (per-task cost). Cost tier in the taxonomy
+  + per-task cost in the CSV unlocks the paper's "frontier vs. open-weights
+  cost-effectiveness" finding as a visual.
+- **Without it:** agents stay as opaque IDs; we can't surface the
+  family-level story (Terminus + frontier-LLM vs. Aider + open-weights, etc.).
+
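To make the grouping step concrete, here is a minimal sketch of how an `agents.json` in the proposed shape could be turned into the family-grouped pill data. It reuses the example record from this entry; the function name `group_by_family` and the pill fields (`label`, `strength`, `badge`) are hypothetical and not part of any existing site code.

```python
import json
from collections import defaultdict

# Example record copied from the proposed agents.json shape above.
AGENTS_JSON = """
{
  "terminus-2,gpt-5": {
    "agent_family": "Terminus 2",
    "model_family": "GPT",
    "model": "gpt-5",
    "provider": "OpenAI",
    "cost_tier": "frontier",
    "open_weights": false,
    "signature_strength": "module-level optimization",
    "color_category": "frontier-closed"
  }
}
"""


def group_by_family(agents):
    """Group agent records by `agent_family`, one pill dict per agent.

    The pill field names are assumptions chosen for illustration.
    """
    families = defaultdict(list)
    for agent_id, meta in agents.items():
        families[meta["agent_family"]].append({
            "agent_id": agent_id,
            "label": meta["model"],
            "strength": meta["signature_strength"],
            "badge": meta["cost_tier"],
        })
    for pills in families.values():
        pills.sort(key=lambda pill: pill["agent_id"])  # stable display order
    return dict(families)


families = group_by_family(json.loads(AGENTS_JSON))
```

Keeping the flat `agent_id` on each pill means the leaderboard's existing IDs can still be used as lookup keys once the grouped view is rendered.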
+## 9. Structured findings catalog
+
+The paper has ~6 sharply phrased findings (local vs. global optimization,
+strategy strengths, long-tail repository performance, cost efficiency, …).
+`copy.json` already stores these as `{title, description}` pairs, but that's
+just prose. To render them as a **Findings cards grid** (the
+`HiddenFeatures` pattern from ccunpacked), each finding needs a category, a
+headline metric, and a link to where in the site that finding is *visually
+demonstrated*.
+
+- **Used by:** new **Findings** section — tinted cards with category color,
+  one-line description, headline metric chip, "View analysis ↗" link.
+- **Ideal shape:** extend `copy.json` `overview.keyFindings.findings[]`:
+  ```json
+  {
+    "title": "Local vs. Global Optimization",
+    "description": "Agents are better at local or function-level …",
+    "category": "scope",
+    "metric": { "label": "L4 advantage", "value": -0.04 },
+    "link": "/leaderboard/?level=L4"
+  }
+  ```
+- **Without it:** findings stay as plain prose cards — readable but inert,
+  with no visual emphasis on the actual numbers and no path from "claim" to
+  "evidence."
+
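One way to check that a finding is ready for the cards grid is sketched below: it reports which of the proposed fields are missing and formats the headline metric chip. The helper names and the signed two-decimal chip format are assumptions for illustration; the example record is the one from this entry, with its elided description left as-is.

```python
# Fields a finding needs before it can render as a card, per the shape above.
REQUIRED_FIELDS = ("title", "description", "category", "metric", "link")


def missing_fields(finding):
    """Return the names of required fields that a finding record lacks."""
    missing = [name for name in REQUIRED_FIELDS if name not in finding]
    metric = finding.get("metric")
    if isinstance(metric, dict):
        missing += [f"metric.{key}" for key in ("label", "value") if key not in metric]
    return missing


def metric_chip(finding):
    """Format the headline metric; the signed 2-decimal style is an assumption."""
    metric = finding["metric"]
    return f"{metric['label']}: {metric['value']:+.2f}"


# Example record copied from the proposed copy.json extension.
finding = {
    "title": "Local vs. Global Optimization",
    "description": "Agents are better at local or function-level …",
    "category": "scope",
    "metric": {"label": "L4 advantage", "value": -0.04},
    "link": "/leaderboard/?level=L4",
}
```

A finding that still has only `title` and `description` would report `category`, `metric`, and `link` as missing, which is a cheap way to see which entries remain prose-only.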
 ---

 ## Out of scope (intentionally)