Kitsy is a local-first toolbox for everyday file, media, document, recorder, and todo workflows. It runs as an offline-capable PWA, keeps file processing on the user's device, and uses a small backend session only for optional Google Drive authorization and upload proxying.
Optional Google Drive support lets users sync the todo list into their own hidden Drive app data and save processed outputs into their own Drive. The app still works local-only when Drive is disconnected, unconfigured, or unavailable.
Useful if you want an offline-friendly alternative to TinyWow, 123apps, Smallpdf, iLovePDF, and similar browser tool services.
Please consider leaving a star.
flowchart TD
UI["UI Layer<br/>(React 19 + DaisyUI 5 + Tabler icons)"] --> Router["Routing Layer<br/>(TanStack Start / TanStack Router)"]
Router --> HomeRoute["/ Route"]
Router --> ToolRoute["/tool/$id Route"]
HomeRoute --> SearchRank["search.ts<br/>(intent-aware ranking)"]
ToolRoute --> Registry["tool-registry.ts<br/>(62 registered tools)"]
Registry --> Processors["Processor Functions"]
Processors --> ImgProc["image-processor.ts<br/>(OffscreenCanvas + imagetracerjs)"]
Processors --> PdfProc["pdf-processor.ts<br/>(pdf-lib + pdfjs-dist + qpdf-wasm + signing libs)"]
Processors --> FileProc["file-processor.ts<br/>(fflate + papaparse)"]
Processors --> FfmpegProc["ffmpeg-processor.ts<br/>(FFmpeg.wasm)"]
Registry --> DocInline["Document Viewer<br/>(inline registry processor)"]
DocInline --> docxPrev["docx-preview (DOCX)"]
DocInline --> xlsxLib["exceljs + papaparse<br/>(XLSX/CSV HTML tables)"]
DocInline --> nativeDocs["Native pass-through<br/>(PDF/TXT/JSON)"]
UI --> CollageUI["CollagePanel.tsx<br/>(react-konva)"]
UI --> RecorderUI["RecorderPanel.tsx<br/>(MediaRecorder + capture APIs + canvas composition)"]
UI --> TodoUI["TodoListPanel.tsx<br/>(localStorage + Drive sync + JSON import/export)"]
TodoUI --> TodoModel["todo-list.ts<br/>(schema normalization + merge/search/link parsing)"]
UI --> ShellUI["AppShellProvider.tsx<br/>(offline status + Drive auth/sync + PWA-ready toast)"]
UI --> Header["Header.tsx<br/>(search, cloud, GitHub, debug console, theme)"]
ShellUI --> DriveAuth["google-drive.ts<br/>(Google Identity Services code popup)"]
ShellUI --> ServerFns["server-functions.ts<br/>(OAuth exchange + Drive REST proxy)"]
ShellUI --> SW["src/sw.ts<br/>(Serwist precache runtime)"]
style ImgProc fill:#4ecdc4,color:#000
style PdfProc fill:#ff6b6b,color:#000
style FileProc fill:#ffe66d,color:#000
style FfmpegProc fill:#9b5de5,color:#fff
style DocInline fill:#f4a261,color:#000
- / renders src/routes/index.tsx, reads the q search param, ranks all registry tools with rankToolsByQuery(), and otherwise groups tools by category.
- /tool/$id renders src/routes/tool.$id.tsx, looks up the ID with getToolById(), and passes the selected tool to ToolPanel.
- src/routes/__root.tsx owns the app shell, metadata, theme bootstrap, route preloading, and route-specific PWA manifest switching.
- /tool/todo-list uses /manifest-todo.json; all other routes use /manifest.json.
- There are no per-tool route files.
All tools are objects in src/lib/tool-registry.ts. A tool definition contains:
- id, name, description, category, icon
- acceptedExtensions and optional producedExtensions
- multiple
- optional requiresFiles
- optional uiMode: standard, auto-process, collage, recorder, or todo
- options
- process(files, options) => Promise<ProcessedFile[]>
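As a rough sketch, the fields above could be typed like this. Only the field names, the uiMode values, and the process signature come from the list. The type names, the icon and options types, the boolean type of multiple, and the sample entry are assumptions for illustration, not the registry's actual code.

```typescript
type UiMode = "standard" | "auto-process" | "collage" | "recorder" | "todo"

interface ProcessedFile {
	blob: Blob
	name: string
}

// Placeholder: the real option value shape is not described in this README.
type ToolOptionValues = Record<string, unknown>

interface ToolDefinition {
	id: string
	name: string
	description: string
	category: string
	icon: unknown // assumption: probably an icon component reference
	acceptedExtensions: string[]
	producedExtensions?: string[]
	multiple: boolean // assumption: whether several files may be selected
	requiresFiles?: boolean
	uiMode?: UiMode
	options: unknown[] // assumption: a list of option descriptors
	process: (files: File[], options: ToolOptionValues) => Promise<ProcessedFile[]>
}

// Hypothetical entry that returns every input unchanged.
const passthroughTool: ToolDefinition = {
	id: "example-passthrough",
	name: "Passthrough",
	description: "Returns each file as-is (illustration only).",
	category: "File",
	icon: null,
	acceptedExtensions: [".txt"],
	multiple: true,
	uiMode: "standard",
	options: [],
	process: async (files) => files.map((file) => ({ blob: file, name: file.name })),
}
```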
ToolPanel handles file selection, previews, options, processing, result cards, downloads, ZIP download-all, and Drive upload actions for standard tools. Custom UI modes render:
- auto-process: runs immediately after file selection; currently used by document-viewer.
- collage: renders CollagePanel.
- recorder: renders RecorderPanel.
- todo: renders TodoListPanel.
batch() in the registry sequentially applies single-file processors to multi-file tools. Tools like PDF merge and image-to-PDF handle all files as one batch.
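A minimal sketch of what a batch() wrapper like this might look like. The generic file type F and the Options alias are assumptions made so the example runs outside the browser; the real helper presumably works on File objects.

```typescript
interface ProcessedFile {
	blob: Blob
	name: string
}

type Options = Record<string, unknown>
type SingleFileProcessor<F> = (file: F, options: Options) => Promise<ProcessedFile>

// Wraps a single-file processor so it can serve a multi-file tool.
// Files are awaited one at a time, so only one file is in flight at once.
function batch<F>(processOne: SingleFileProcessor<F>) {
	return async (files: F[], options: Options): Promise<ProcessedFile[]> => {
		const results: ProcessedFile[] = []
		for (const file of files) {
			results.push(await processOne(file, options))
		}
		return results
	}
}
```

Tools such as PDF merge would skip this wrapper, since they need to see every input in a single call.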
FileDropzone accepts drag/drop and hidden file input selection. It builds the input accept string from acceptedExtensions, optional MIME types, and an extra text/csv hint for CSV selection on mobile browsers. ToolCard shows the first four accepted extensions visually, adds a +N badge for the rest, and includes an sr-only metadata block with the tool description and accepted extensions.
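The accept-string and badge logic described above could be sketched as follows. The function names, the mimeTypes parameter, and the choice to emit no filter for the * wildcard are assumptions, not the components' actual code.

```typescript
// Builds the value of <input accept="..."> from a tool's accepted extensions.
function buildAcceptString(acceptedExtensions: string[], mimeTypes: string[] = []): string {
	// Assumption: a wildcard means "any file", so no accept filter is emitted.
	if (acceptedExtensions.includes("*")) return ""
	const parts = [...acceptedExtensions, ...mimeTypes]
	// Extra MIME hint so mobile file pickers offer CSV files.
	if (acceptedExtensions.includes(".csv")) parts.push("text/csv")
	// De-duplicate while keeping the original order.
	return Array.from(new Set(parts)).join(",")
}

// Splits extensions into the visible ones and a "+N" badge for the rest.
function summarizeExtensions(acceptedExtensions: string[], visible = 4) {
	const shown = acceptedExtensions.slice(0, visible)
	const hidden = acceptedExtensions.length - shown.length
	return { shown, badge: hidden > 0 ? `+${hidden}` : null }
}
```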
This list matches the current registry.
| Category | Tools |
|---|---|
| Image | image-convert, image-resize, image-rotate, image-crop, image-upscale, image-collage, image-blur, image-pixelate, image-watermark |
| PDF | pdf-merge, pdf-split, pdf-delete-pages, pdf-reorder, pdf-header-footer, pdf-bates-numbering, pdf-add-blank-pages, pdf-remove-blank-pages, pdf-crop-pages, pdf-overlay-pages, pdf-resize-pages, pdf-n-up, pdf-page-dimensions, pdf-sign-visual, pdf-digital-sign, pdf-validate-signature, pdf-lock, pdf-unlock, pdf-images-to-pdf, pdf-to-images, pdf-compress, pdf-watermark, pdf-rotate, pdf-flatten, pdf-metadata, pdf-strip-metadata, pdf-remove-annotations |
| Video | video-convert, video-trim, video-extract-audio, video-merge, video-audio-merge, video-mute, video-speed, screen-recorder, camera-recorder, video-resize, video-crop, video-watermark, video-extract-frames |
| Audio | audio-convert, audio-trim, audio-merge, audio-recorder, audio-volume, audio-fade |
| Document | document-viewer |
| File | file-zip, file-unzip |
| Data | data-csv-to-json, data-json-to-csv, data-format-json, todo-list |
Current count: 62 tools.
src/lib/image-processor.ts uses OffscreenCanvas and native image loading. SVG input is loaded through HTMLImageElement; other images use createImageBitmap(). It implements image conversion, resize, rotate, crop, upscale, blur, pixelate, text watermark, and raster-to-SVG tracing through imagetracerjs.
src/lib/pdf-processor.ts uses:
- pdf-lib for structural edits, page operations, watermarks, metadata, signatures-as-stamps, image-to-PDF, flattening, annotation removal, and CSV dimension reports.
- pdfjs-dist through src/lib/pdfjs.ts for rendering PDF pages to images/previews.
- @neslinesli93/qpdf-wasm for password lock/unlock.
- zgapdfsigner for certificate-based signing.
- node-forge for signature validation support.
PDF byte output is wrapped through helpers that avoid TS6 Uint8Array<ArrayBufferLike> BlobPart issues by copying/slicing data first.
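One way such a helper might look, as a sketch only: the helper names are hypothetical, and the project's own wrappers may slice instead of copying. The idea is to hand the Blob constructor a freshly allocated ArrayBuffer.

```typescript
// Copies bytes into a new ArrayBuffer so the Blob constructor receives a
// plain ArrayBuffer rather than a Uint8Array<ArrayBufferLike> view.
function toArrayBuffer(bytes: Uint8Array): ArrayBuffer {
	const buffer = new ArrayBuffer(bytes.byteLength)
	// set() copies only the bytes covered by the view, even for subarrays.
	new Uint8Array(buffer).set(bytes)
	return buffer
}

function pdfBlob(bytes: Uint8Array): Blob {
	return new Blob([toArrayBuffer(bytes)], { type: "application/pdf" })
}
```

Because the copy owns its buffer, later writes to the source array do not affect the Blob's contents.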
src/lib/ffmpeg-processor.ts lazy-loads a singleton FFmpeg instance from /ffmpeg/ffmpeg-core.js and /ffmpeg/ffmpeg-core.wasm. prefetchFFmpeg() is called by the app shell so the offline cache is warmed early. Each operation writes browser File data into FFmpeg's virtual filesystem, runs ff.exec([...]), reads output, copies bytes into a clean Uint8Array, creates a Blob, and deletes temporary VFS files.
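The write, exec, read, copy, and cleanup steps can be sketched against a small structural interface. The method names mirror the @ffmpeg/ffmpeg 0.12 API (writeFile, exec, readFile, deleteFile), but the FfmpegLike interface, the runFfmpeg helper, and passing raw bytes instead of a File are assumptions for illustration.

```typescript
// Minimal structural view of an FFmpeg instance (assumed, not the library's own type).
interface FfmpegLike {
	writeFile(path: string, data: Uint8Array): Promise<unknown>
	exec(args: string[]): Promise<number>
	readFile(path: string): Promise<Uint8Array | string>
	deleteFile(path: string): Promise<unknown>
}

async function runFfmpeg(
	ff: FfmpegLike,
	input: { name: string; bytes: Uint8Array },
	outputName: string,
	args: string[],
): Promise<Uint8Array> {
	// Stage the input in FFmpeg's virtual filesystem.
	await ff.writeFile(input.name, input.bytes)
	try {
		const exitCode = await ff.exec(["-i", input.name, ...args, outputName])
		if (exitCode !== 0) throw new Error(`FFmpeg exited with code ${exitCode}`)
		const data = await ff.readFile(outputName)
		if (typeof data === "string") throw new Error("Expected binary output")
		// Copy into a fresh Uint8Array so the result does not alias FFmpeg's memory.
		return new Uint8Array(data)
	} finally {
		// Best-effort cleanup of temporary VFS files, even when exec fails.
		await ff.deleteFile(input.name).catch(() => undefined)
		await ff.deleteFile(outputName).catch(() => undefined)
	}
}
```

The caller would then wrap the returned bytes in a Blob with the appropriate MIME type.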
Supported operations include video/audio conversion, trimming, extracting audio, merging video/audio, muting, speed changes, resizing, cropping, video watermarking through a generated PNG overlay, frame extraction, volume changes, and audio fades.
src/lib/file-processor.ts uses fflate for ZIP creation/extraction and papaparse for CSV/JSON conversion. JSON formatting uses native JSON.parse() and JSON.stringify().
document-viewer is implemented inline in tool-registry.ts and auto-processes after file drop:
- PDF, TXT, and JSON pass through as the original file.
- DOCX passes through and renders with docx-preview in DocxPreview.
- XLSX loads the first worksheet with exceljs and renders an HTML table.
- CSV parses with papaparse and renders an HTML table.
HTML/PDF previews render in iframes; DOCX and text/JSON render inline.
src/components/CollagePanel.tsx uses react-konva on an 800x600 canvas. Users can drag, transform, reorder selected images, and export PNG or JPG directly from the Konva stage.
src/components/RecorderPanel.tsx uses browser capture APIs and MediaRecorder:
- Screen recorder uses getDisplayMedia(), optional camera overlay from getUserMedia(), optional microphone/system audio mixing, canvas composition at 30 FPS, and a draggable/resizable overlay rectangle.
- Camera recorder uses getUserMedia() with optional microphone.
- Audio recorder uses microphone-only getUserMedia().
Output names are generated by src/lib/recorder.ts and use WebM or Ogg depending on supported MIME type. Screen recording is gated to desktop-sized viewports (>= 768px).
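A sketch of how the MIME type and file extension might be chosen. The candidate list, function names, and timestamp format are assumptions; src/lib/recorder.ts may do this differently. The support check is injected so the logic can run outside a browser.

```typescript
// Assumed candidate order: prefer WebM, fall back to Ogg.
const AUDIO_CANDIDATES = [
	"audio/webm;codecs=opus",
	"audio/webm",
	"audio/ogg;codecs=opus",
	"audio/ogg",
]

// Returns the first candidate the environment reports as recordable.
function pickMimeType(candidates: string[], isSupported: (mime: string) => boolean): string | null {
	return candidates.find((mime) => isSupported(mime)) ?? null
}

function extensionFor(mime: string): "webm" | "ogg" {
	return mime.startsWith("audio/ogg") || mime.startsWith("video/ogg") ? "ogg" : "webm"
}

function outputName(prefix: string, mime: string, now: Date): string {
	// Replace characters that are awkward in filenames.
	const stamp = now.toISOString().replace(/[:.]/g, "-")
	return `${prefix}-${stamp}.${extensionFor(mime)}`
}
```

In the browser, the predicate would typically be `(mime) => MediaRecorder.isTypeSupported(mime)`.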
src/components/TodoListPanel.tsx and src/lib/todo-list.ts implement a local-first todo list:
- Primary local key: kitsy.todo-list.v1.
- Blank draft row at the top.
- Inline contenteditable plain-text editing; paste is forced to text/plain.
- Clickable http/https links are rendered outside edit mode with target="_blank" and noopener.
- Search supports token and subsequence matching.
- Filters: open, done, all.
- Reminders are stored as YYYY-MM-DD and "today" compares month/day in UTC.
- Pinned items sort first.
- Deletes are soft deletes (deletedAt) so Drive merge can resolve them.
- Import/export uses JSON arrays. Sync uses a versioned { version: 2, syncedAt, items } document.
- Merge resolution prefers the item with the newest deletedAt, updatedAt, or createdAt, then additional stable tie breakers.
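The merge rule in the last bullet could look roughly like this. The item shape, the use of numeric timestamps, and the tie breaker (a deletion wins, then the first argument) are assumptions; the README does not spell out the actual tie breakers in todo-list.ts.

```typescript
interface TodoItem {
	id: string
	text: string
	createdAt: number // assumption: epoch milliseconds
	updatedAt?: number
	deletedAt?: number
}

// Latest moment at which the item was created, edited, or soft-deleted.
function lastTouched(item: TodoItem): number {
	return Math.max(item.createdAt, item.updatedAt ?? 0, item.deletedAt ?? 0)
}

// Chooses between two copies of the same item.
function pickWinner(a: TodoItem, b: TodoItem): TodoItem {
	const diff = lastTouched(a) - lastTouched(b)
	if (diff !== 0) return diff > 0 ? a : b
	// Hypothetical tie breaker: a deletion wins, otherwise keep the first argument.
	const aDeleted = a.deletedAt !== undefined
	const bDeleted = b.deletedAt !== undefined
	if (aDeleted !== bDeleted) return aDeleted ? a : b
	return a
}

// Merges local and remote lists item by item, keyed on id.
function mergeTodos(local: TodoItem[], remote: TodoItem[]): TodoItem[] {
	const byId = new Map<string, TodoItem>()
	for (const item of [...local, ...remote]) {
		const existing = byId.get(item.id)
		byId.set(item.id, existing ? pickWinner(existing, item) : item)
	}
	return Array.from(byId.values())
}
```

Keeping soft-deleted items in the merged list is what lets a deletion on one device override an older edit on another.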
sequenceDiagram
actor User
participant Drop as FileDropzone / ToolPanel
participant Reg as Tool Registry
participant Proc as Processor Function
participant Blob as Blob URL / Drive
User->>Drop: Selects or drops files
Drop->>Drop: Stores browser File objects
User->>Drop: Clicks Run
Drop->>Reg: tool.process(files, options)
Reg->>Proc: Calls processor
Proc-->>Reg: ProcessedFile[] { blob, name }
Reg-->>Drop: Results
Drop->>Blob: URL.createObjectURL() for download/preview
User->>Blob: Download or Save to Drive
sequenceDiagram
actor User
participant Header as Header / Tool UI
participant Shell as AppShellProvider
participant GIS as Google Identity Services
participant Server as TanStack server functions
participant Google as Google OAuth / Drive REST
User->>Header: Click cloud icon
Header->>Shell: cloud.connect()
Shell->>GIS: Popup requestCode() for Drive scopes
GIS-->>Shell: Authorization code
Shell->>Server: Send code and redirect origin
Server->>Google: Exchange code with server-only client secret
Google-->>Server: Drive authorization data
Server->>Server: Store Drive session in httpOnly cookie
Shell->>Server: Todo appDataFolder sync or file upload FormData
Server->>Google: Refresh authorization as needed, then Drive REST
Drive constants:
- Reconnect hint: kitsy.google-drive.connected
- Todo Drive file: kitsy.todo-sync.v2.json in appDataFolder
- Result folder: visible Kitsy folder in the user's Drive
- Scopes: drive.appdata and drive.file
- Session cookie: kitsy_session in development and __Host-kitsy_session in production
- Uploads proxy through the Kitsy server as multipart Drive uploads.
- Disconnect clears the server session and best-effort revokes the Drive grant.
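The split cookie name follows from how browsers treat the __Host- prefix: it is only accepted with Secure, Path=/, and no Domain attribute, which does not fit plain-HTTP localhost development. A sketch under those rules, where the helper names and the SameSite=Lax choice are assumptions:

```typescript
// Cookie names are taken from the list above.
function sessionCookieName(isProduction: boolean): string {
	return isProduction ? "__Host-kitsy_session" : "kitsy_session"
}

// Builds a Set-Cookie header value for the Drive session.
function buildSessionCookie(value: string, isProduction: boolean): string {
	const attrs = [
		`${sessionCookieName(isProduction)}=${encodeURIComponent(value)}`,
		"Path=/",
		"HttpOnly",
		"SameSite=Lax", // assumption: the actual SameSite policy is not documented here
	]
	// __Host- cookies must be Secure and must not carry a Domain attribute.
	if (isProduction) attrs.push("Secure")
	return attrs.join("; ")
}
```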
Serwist is configured in vite.config.ts and the runtime service worker lives in src/sw.ts.
The production build precaches .output/public plus additional entries:
- /
- /ffmpeg/ffmpeg-core.js
- /ffmpeg/ffmpeg-core.wasm
- /qpdf.wasm
The FFmpeg and qpdf assets are copied into public/ by scripts/stage-wasm-assets.ts during postinstall. Revisions include the installed package versions. The maximum precache file size is 160 * 1024 * 1024 bytes.
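As an illustration of version-based revisions, the extra entries for the WASM assets could be built like this. The revision string format, the function and constant names, and the placeholder versions are assumptions; the / entry is left out because its revision strategy is not described here.

```typescript
interface PrecacheEntry {
	url: string
	revision: string
}

// 160 MiB, matching the limit stated above.
const MAX_PRECACHE_BYTES = 160 * 1024 * 1024

// Ties each asset's revision to the package version that produced it,
// so upgrading a package invalidates the cached copy.
function wasmPrecacheEntries(versions: { ffmpegCore: string; qpdfWasm: string }): PrecacheEntry[] {
	return [
		{ url: "/ffmpeg/ffmpeg-core.js", revision: `ffmpeg-core@${versions.ffmpegCore}` },
		{ url: "/ffmpeg/ffmpeg-core.wasm", revision: `ffmpeg-core@${versions.ffmpegCore}` },
		{ url: "/qpdf.wasm", revision: `qpdf-wasm@${versions.qpdfWasm}` },
	]
}

// Placeholder versions; a real build would read them from the installed packages.
const entries = wasmPrecacheEntries({ ffmpegCore: "0.0.0", qpdfWasm: "0.0.0" })
```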
AppShellProvider registers /sw.js, prefetches FFmpeg, and shows an offline-ready toast once both service worker readiness and FFmpeg prefetch complete. qpdf is precached by the service worker but initialized lazily by matching PDF tools.
src/lib/search.ts ranks tools from the static registry. It supports:
- Exact name and ID matches.
- Name, ID, description, category, keyword, accepted extension, and produced extension matching.
- Conversion intent such as jpg to png.
- Synonyms: shrink, combine, join, record, checklist, photo, and sound.
- Format aliases: jpeg -> jpg, tif -> tiff, text -> txt, yml -> yaml.
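To make the alias and conversion-intent behavior concrete, here is a small sketch. The alias pairs come from the list above; the function names and the regular expression are assumptions and are likely simpler than what search.ts does.

```typescript
const FORMAT_ALIASES: Record<string, string> = {
	jpeg: "jpg",
	tif: "tiff",
	text: "txt",
	yml: "yaml",
}

// Lowercases, strips a leading dot, and maps known aliases to a canonical format.
function normalizeFormat(token: string): string {
	const key = token.trim().toLowerCase().replace(/^\./, "")
	return FORMAT_ALIASES[key] ?? key
}

interface ConversionIntent {
	from: string
	to: string
}

// Recognizes queries shaped like "<format> to <format>".
function parseConversionIntent(query: string): ConversionIntent | null {
	const match = /^\s*\.?(\w+)\s+to\s+\.?(\w+)\s*$/i.exec(query)
	if (!match) return null
	return { from: normalizeFormat(match[1]), to: normalizeFormat(match[2]) }
}
```

A ranker could then boost tools whose accepted extensions include the source format and whose produced extensions include the target.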
Search results are exposed on the homepage through the q query param.
Kitsy uses Google Identity Services in the browser only to obtain an authorization code. The backend exchanges that code with the server-only OAuth client secret, stores the Drive session in an httpOnly cookie, and proxies Drive REST calls.
- Open Google Cloud Console and create or select a project.
- Enable the Google Drive API.
- Configure the OAuth consent screen in Google Auth Platform. Set app name, support email, developer contact, and production links as needed.
- In Data Access, request:
  - https://www.googleapis.com/auth/drive.appdata
  - https://www.googleapis.com/auth/drive.file
- Create an OAuth Client ID with application type Web application.
- Add Authorized JavaScript origins, for example:
  - http://localhost:3000
  - https://your-production-domain.example
- Add Authorized redirect URIs matching the origins, for example:
  - http://localhost:3000
  - https://your-production-domain.example
- Set local env vars:
  - GOOGLE_DRIVE_CLIENT_ID=...
  - GOOGLE_DRIVE_CLIENT_SECRET=...
  - Do not use Vite's public client-env prefix for these values.
- Start the app, click the cloud icon, grant access, and verify the icon changes to connected.
- Files are processed locally in the browser.
- Google Drive is optional and disabled when offline or when GOOGLE_DRIVE_CLIENT_ID or GOOGLE_DRIVE_CLIENT_SECRET is missing.
- OAuth credentials and Drive authorization data stay on the server side. Browser storage only keeps a non-secret reconnect hint.
- Todo sync writes one JSON document into Drive appDataFolder.
- Processed output uploads occur only when the user clicks a Drive save action and are proxied through same-origin server functions.
- DOCX rendering is delegated to docx-preview; error messages are inserted with textContent.
- Todo editing uses plain-text contenteditable handling and React-rendered URL anchors.
- DebugConsole intentionally monkey-patches console methods in the browser to show client logs from the header.
- Large files may hit browser memory limits; there is no streaming-to-disk pipeline.
- WASM codec support depends on the bundled FFmpeg core.
- Some FFmpeg operations may fail for codecs/containers unsupported by the browser-side build.
- Safari support may vary for WASM, capture APIs, and MediaRecorder MIME types.
- Browser storage can be cleared by the user, browser policy, private browsing, or storage pressure.
- Drive todo sync is a timestamp-based item merge. It is not real-time collaboration and has no conflict-resolution UI.
- Google OAuth misconfiguration must be fixed in Google Cloud Console; Kitsy cannot repair it from the browser.
Use nix develop for Node 24 and npm:
nix develop
The shell hook runs npm install, prints node -v and npm -v, and enters zsh. postinstall stages FFmpeg and qpdf WASM assets.
Common commands:
nix develop -c npm run dev
nix develop -c npm run build
nix develop -c npm run preview
nix develop -c npm run test
nix develop -c npm run test:e2e
nix develop -c npm run check
nix develop -c npm run format
Scripts:
- dev: loads .env, imports instrument.server.mjs, and starts Vite dev on port 3000.
- build: runs vite build and copies Sentry server instrumentation into .output/server when present.
- preview: runs Vite preview on port 3000.
- start: starts the Nitro server output with instrumentation.
- test: Vitest.
- test:e2e: Playwright.
- showcase:concat: concatenates successful E2E recordings.
- check, lint, format: Biome.
- Update this README with every behavior, architecture, tooling, or tool-registry change.
- Test browser behavior against the production build (npm run build plus preview), because service worker behavior differs from dev.
- Keep tools in tool-registry.ts; do not add per-tool routes.
- Keep processors stateless and browser-side in src/lib/*-processor.ts.
- Keep file data as native File or Blob until a library requires ArrayBuffer.
- Keep backend behavior scoped to Google Drive authorization, todo sync, and upload proxying.
- Do not add custom service worker logic beyond Serwist configuration and src/sw.ts.
- UI should use DaisyUI component classes. src/styles.css is the Tailwind/DaisyUI setup file, not a place for feature-specific CSS.
- Biome uses tabs, double quotes, and no semicolons.
- React Compiler is enabled in @vitejs/plugin-react; avoid manual useMemo/useCallback unless required for behavior.
- Add or reuse a processor in the appropriate src/lib/*-processor.ts file.
- Add one object to the tools array in src/lib/tool-registry.ts.
- Set acceptedExtensions to strings beginning with . or the wildcard *.
- Add focused processor tests or E2E showcase coverage.
- Update this README, especially the tool list and processor responsibilities.
- Uint8Array<ArrayBufferLike> is not a TS6-safe BlobPart; copy/slice before new Blob().
- FFmpeg and Drive OAuth should be checked against both dev and production preview.
- acceptedExtensions is extension-oriented. MIME handling belongs in file input accept logic or option accept fields.
- document-viewer auto-runs after file selection and does not show a Run button.
- qpdf and FFmpeg assets must be present in public/ after install/build staging.
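The acceptedExtensions rule (strings beginning with . or the wildcard *) is easy to check in a registry test. A minimal sketch, with hypothetical helper names and the added assumption that a bare "." is not a valid extension:

```typescript
// True for the wildcard or for a dot followed by at least one character.
function isValidAcceptedExtension(value: string): boolean {
	return value === "*" || (value.startsWith(".") && value.length > 1)
}

// Returns the entries that break the rule, e.g. MIME types or missing dots.
function invalidExtensions(values: string[]): string[] {
	return values.filter((value) => !isValidAcceptedExtension(value))
}
```

A MIME type such as image/png would be reported here, which matches the note that MIME handling belongs in the accept logic rather than in acceptedExtensions.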
Vitest covers processors, search, registry behavior, recorder helpers, and selected components.
Playwright (tests/e2e/showcase.spec.ts) runs the sparticuz-chromium project against npm run build && npm run preview on port 3000 by default, or PLAYWRIGHT_PORT when set. It generates sample assets with download-samples.ts, drives supported tools, records videos, and writes SUCCESS markers for passing cases. concat-videos.ts combines successful recordings into videos/full-showcase.webm.
Current E2E coverage includes most registered tools. Certificate signing, signature validation, and unlock flows are intentionally covered by focused processor tests where browser fixture setup is less practical.
Stable test IDs:
- file-input
- run-button
- result-card
- result-save-to-drive
- result-save-all-to-drive
- preview
- crop-selection
- crop-resize-handle
- camera-overlay
- camera-overlay-handle
- recorder-toggle
- recorder-mounted
- todo-input
- todo-draft-input
- todo-edit-input
- todo-import
- todo-export
- todo-item
- todo-link
- todo-mounted
- Add a chatbox (DaisyUI chat component) that is available when online, allowing users to send tasks and attachments.
- Have the autonomous LLM complete tasks and return output using frontend tools/functions.
- Support chat history through the AI provider's API.
- Allow the AI to execute tools autonomously without requiring the user to click Run or explicitly confirm actions.
- Provide tool access dynamically using structured Function Calling, not prompt engineering. Function schemas should be generated dynamically from tool-registry.ts.
- Ensure the AI can handle normal conversation context even when the prompt does not explicitly mention file operations.
- Allow the AI to extract file metadata dynamically, such as size and resolution, when needed for an operation.
- For region-based tools like cropping, pixelating, or collages, route the user to the tool UI with the file preloaded and the relevant docs attached.
- If a file is uploaded first and a prompt is entered later, the AI should use the previous context to continue the task.
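Since this feature is only planned, the following is a speculative sketch of how function schemas might be derived from registry entries. The option descriptor shape, the type names, and the JSON-Schema-style parameters object are all assumptions; the target AI provider's function-calling format would decide the final shape.

```typescript
// Hypothetical, simplified view of a tool and its options.
interface ToolOptionSketch {
	key: string
	type: "string" | "number" | "boolean"
	description: string
	required?: boolean
}

interface ToolSketch {
	id: string
	description: string
	options: ToolOptionSketch[]
}

interface FunctionSchema {
	name: string
	description: string
	parameters: {
		type: "object"
		properties: Record<string, { type: string; description: string }>
		required: string[]
	}
}

// Derives one function declaration per tool, so new registry entries are
// exposed to the model without hand-written prompts.
function toFunctionSchema(tool: ToolSketch): FunctionSchema {
	const properties: FunctionSchema["parameters"]["properties"] = {}
	for (const option of tool.options) {
		properties[option.key] = { type: option.type, description: option.description }
	}
	return {
		name: tool.id,
		description: tool.description,
		parameters: {
			type: "object",
			properties,
			required: tool.options.filter((o) => o.required).map((o) => o.key),
		},
	}
}
```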