feat(v2): papaparse for CSV + mammoth for .docx preview#272
Open
CSV: replace the hand-rolled split-on-comma with papaparse. The previous parser broke on real spreadsheet exports: quoted fields containing commas, escaped quotes, multi-line cells. papaparse is RFC 4180 compliant.

DOCX: render Word documents inline via mammoth (lazy-imported via mammoth/mammoth.browser so inspectors that never open a Word file don't pay the bundle cost). Mammoth produces clean semantic HTML (headings, lists, tables, bold/italic). For a pixel-faithful render the user clicks Open and uses Word / Office Online.

Skipped (deferred):
- xlsx / xls: SheetJS is ~600 KB and demo-day uses CSV instead.
- ppt / pptx: no clean OSS option that's lightweight.
- odt / ods / odp: same shape as their MS counterparts; covered by the same Office Online viewer follow-up.

Bundle additions:
- papaparse@^5.5.3 + @types/papaparse@^5.5.2 (~45 KB gzipped)
- mammoth@^1.12.0 (~200 KB, lazy-loaded, only when a docx is opened)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6 tasks
lilyshen0722 added a commit that referenced this pull request May 6, 2026
Closes the missing inline preview for Word documents: DOCX deliverables that agents produce via officecli now render readably inside the inspector instead of falling through to the 'Open in browser' download path.

Two changes (was open as PR #272 since 2026-04-29; ships here for the ADR-013 demo loop completion):

1. mammoth (1.12.0, ~200KB, lazy-loaded via dynamic import) renders the .docx → HTML inside <DocxPreview>. Wired into ArtifactPreview.tsx when kind === 'docx'. Format loss is acceptable for preview: mammoth produces clean semantic HTML (headings, lists, tables, bold/italic). For a pixel-faithful render users click Open and use Word/Office Online.

2. papaparse (5.5.3) replaces the hand-rolled CSV split in <CsvPreview>. The previous splitter broke on real spreadsheet exports with quoted commas, escaped quotes, and multi-line cells. The preview is RFC 4180 compliant now.

XLSX/PPTX inline preview is still deferred: SheetJS for XLSX is ~600KB (too heavy to ship by default), and PPTX has no clean client-side option (it would need a server-side render via officecli/libreoffice). For the ADR-013 demo, agents producing PPTX/XLSX still get clickable + downloadable pills, just no inline preview pane.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
First of three "artifact polish" PRs queued for review (will not auto-merge).
What this changes
CSV preview: replaces the hand-rolled `split(',')` with papaparse. The previous parser broke on real spreadsheet exports (quoted fields containing commas, escaped quotes, multi-line cells). papaparse is RFC 4180 compliant.
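To make the failure mode concrete, here is a small self-contained sketch. `naiveSplit` is a hypothetical reconstruction of the replaced approach, and `parseCsv` is a minimal quote-aware reader written only for illustration. It is not papaparse's implementation and not the code in this PR, which delegates parsing to papaparse.

```typescript
// Hypothetical reconstruction of the old approach: split each line on commas.
function naiveSplit(line: string): string[] {
  return line.split(",");
}

// Illustration only: a minimal quote-aware CSV reader covering the three
// cases named above (quoted commas, doubled quotes, newlines inside quotes).
function parseCsv(text: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (inQuotes) {
      if (ch === '"') {
        if (text.charAt(i + 1) === '"') {
          field += '"'; // a doubled quote is an escaped literal quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        field += ch; // commas and newlines are literal inside quotes
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text.charAt(i + 1) === "\n") i++; // treat CRLF as one break
      row.push(field);
      field = "";
      rows.push(row);
      row = [];
    } else {
      field += ch;
    }
  }
  // Flush the last record when the input does not end with a line break.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

const sample = 'name,note\n"Doe, Jane","said ""hi""\nthen left"\n';
const brokenCells = naiveSplit('"Doe, Jane",x'); // the quoted comma splits one cell in two
const parsedRows = parseCsv(sample);
```

With the sample above, the naive splitter yields three cells for a two-column row, while the quote-aware reader keeps `Doe, Jane` and the multi-line note as single cells.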
DOCX preview: Word documents render inline via mammoth (`mammoth/mammoth.browser`, lazy-imported so inspectors that never open a Word file don't pay the bundle cost). Mammoth produces clean semantic HTML: headings, lists, tables, bold/italic. For a pixel-faithful render the user clicks Open and uses Word / Office Online.
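The lazy-import pattern described here can be sketched roughly as follows. This is a sketch under assumptions, not the PR's actual component code: `lazyOnce`, `docxToHtml`, and the `DocxConverter` interface are hypothetical names, and the interface covers only the `convertToHtml({ arrayBuffer })` call that mammoth exposes. The loader is passed in as a parameter so the example runs without the library installed.

```typescript
// Hypothetical minimal shape of the converter; mammoth's convertToHtml
// resolves to an object whose `value` property holds the generated HTML.
interface DocxConverter {
  convertToHtml(input: { arrayBuffer: ArrayBuffer }): Promise<{ value: string }>;
}

// Memoize a loader so the underlying chunk is requested at most once,
// and only when a preview first needs it.
function lazyOnce<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = load();
    }
    return cached;
  };
}

// Convert a .docx buffer to HTML using whichever converter the loader yields.
async function docxToHtml(
  loadConverter: () => Promise<DocxConverter>,
  buffer: ArrayBuffer,
): Promise<string> {
  const converter = await loadConverter();
  const result = await converter.convertToHtml({ arrayBuffer: buffer });
  return result.value;
}

// In the app the loader would wrap the dynamic import, for example
// (assumption; the exact module shape depends on the bundler):
//   const loadMammoth = lazyOnce(() => import("mammoth/mammoth.browser"));
//   const html = await docxToHtml(loadMammoth, await file.arrayBuffer());
```

Because the import sits behind a function call, a bundler can place mammoth in its own chunk, and sessions that never open a .docx never fetch it.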
Bundle impact
Skipped (deferred)
Test plan
🤖 Generated with Claude Code