diff --git a/authors/a1local.md b/authors/a1local.md
new file mode 100644
index 00000000..0cfa9ea9
--- /dev/null
+++ b/authors/a1local.md
@@ -0,0 +1,13 @@
+Author: A1 Local
+Title: AI-assisted open-source contributor
+Description: A1 Local contributes practical developer documentation and small
+product improvements with a focus on source-checked workflows, reproducible
+validation, and clear handoffs for maintainers.
+Author Image: ![A1 Local](https://github.com/a1local.png)
+Author LinkedIn:
+Author Twitter:
+Company Name: A1 Local
+Company Description: Practical web, automation, and technical documentation
+work for small teams.
+Company Logo Dark:
+Company Logo White:
diff --git a/definitions/20260520_definition_glossary_aware_transcription.md b/definitions/20260520_definition_glossary_aware_transcription.md
new file mode 100644
index 00000000..c7f1cc5f
--- /dev/null
+++ b/definitions/20260520_definition_glossary_aware_transcription.md
@@ -0,0 +1,21 @@
+---
+title: "Glossary-Aware Transcription"
+description:
+  "Glossary-aware transcription uses a controlled term list to guide speech-to-text
+  output and review domain-specific words."
+date: 2026-05-20
+author: "A1 Local"
+tags: ["transcription", "speech-to-text", "ai"]
+---
+
+# Glossary-Aware Transcription
+
+Glossary-aware transcription is a speech-to-text workflow that gives the
+transcription system a controlled list of names, acronyms, product terms, and
+domain phrases before transcription and reuses that list during review. It cuts
+errors in similar-sounding words, proper nouns, and technical vocabulary.
+
+For AI engineering teams, glossary-aware transcription is useful because
+transcripts often become source material for summaries, support notes, release
+updates, search indexes, and retrieval systems. Correcting key terms early keeps
+downstream tools from repeating the same mistake.
diff --git a/guides/20260520_sapat_glossary_correction_daytona.md b/guides/20260520_sapat_glossary_correction_daytona.md
new file mode 100644
index 00000000..4c79714d
--- /dev/null
+++ b/guides/20260520_sapat_glossary_correction_daytona.md
@@ -0,0 +1,516 @@
+---
+title: "Build Glossary-Aware Transcription with Sapat"
+description:
+  "Use Sapat in a Daytona workspace to transcribe recordings, preserve domain
+  terms, and review transcripts before AI handoff."
+date: 2026-05-20
+author: "A1 Local"
+tags: ["daytona", "sapat", "transcription", "workflow"]
+---
+
+# Build Glossary-Aware Transcription with Sapat
+
+## Introduction
+
+AI transcription is usually treated as a single command: upload a recording,
+wait for a transcript, and move on. That works for simple audio, but it breaks
+down when a recording contains product names, customer names, acronyms, roadmap
+labels, feature flags, model names, or words from a specific industry. Those
+terms are often the exact reason an AI engineer wants the transcript in the
+first place.
+
+Sapat is a small command-line transcription tool that converts video files to
+MP3 with FFmpeg, sends the audio to a supported provider, and writes a `.txt`
+file beside the source video. The current Sapat CLI supports OpenAI, Groq Cloud,
+and Azure OpenAI through the `--api` option. It also exposes useful controls
+such as `--language`, `--prompt`, `--temperature`, `--quality`, and `--correct`.
+
+This guide shows how to run Sapat inside a [Daytona workspace](../definitions/20240819_definition_daytona%20workspace.md)
+as a repeatable, [glossary-aware transcription](../definitions/20260520_definition_glossary_aware_transcription.md)
+workflow. You will create a workspace, configure one provider, build a small
+term glossary, run Sapat with a prompt that protects domain vocabulary, and
+review the output before it becomes input for summaries, support notes, release
+updates, or a retrieval pipeline.
+
+The goal is not just "get text from audio." The goal is to produce a transcript
+that another engineer or [LLM](../definitions/20241219_definition_llm.md) can
+trust.
+
+## TL;DR
+
+- Use Daytona to keep Sapat, FFmpeg, provider credentials, and review notes in a
+  reproducible workspace.
+- Put product names, speaker names, acronyms, and uncommon terms into a
+  glossary before running transcription.
+- Pass the glossary summary through Sapat's `--prompt` option and keep
+  `--temperature` low for stable output.
+- Review the transcript against the glossary before handing it to another AI
+  workflow.
+- Store the source metadata, raw transcript, corrected transcript, and review
+  notes as separate artifacts.
+
+## Prerequisites
+
+You will need:
+
+- Daytona installed and connected to your Git provider.
+- Python 3.6 or later in the workspace.
+- FFmpeg available in the workspace.
+- API credentials for OpenAI, Groq Cloud, or Azure OpenAI.
+- One or more `.mp4` recordings that you have permission to process.
+
+This guide uses placeholder credentials. Store real credentials as [environment variables](../definitions/20241126_definition_environment_variables.md)
+in a local `.env` file and keep that file out of Git.
+
+## Workflow Overview
+
+![Glossary-aware Sapat transcription workflow](assets/20260520_sapat_glossary_correction_daytona_img1.svg)
+
+The workflow has four artifacts:
+
+- **Recording metadata**: what the file contains, who can see it, and what needs
+  review.
+- **Glossary**: names, acronyms, product terms, and expected spelling.
+- **Sapat transcript**: the `.txt` output written beside each video.
+- **Handoff notes**: the cleaned transcript status and remaining review items.
+
+Separating these files makes the process easier to audit. It also helps when
+you need to rerun transcription with a better prompt without losing your first
+pass.
+
+## Step 1: Create a Daytona Workspace
+
+Create a workspace from the Sapat repository:
+
+```bash
+daytona create https://github.com/nkkko/sapat --code
+```
+
+Inside the workspace, inspect the project:
+
+```bash
+ls
+find src/sapat -maxdepth 3 -type f
+```
+
+The important files are:
+
+- `README.md`, which lists provider credentials and CLI examples.
+- `src/sapat/script.py`, which defines the Click command and supported flags.
+- `src/sapat/transcription/base.py`, which converts video to MP3 and writes the
+  final `.txt` file.
+- `src/sapat/transcription/openai.py`, `groq.py`, and `azure.py`, which contain
+  the provider implementations.
+
+Sapat processes a single file when `input_path` is a file. If `input_path` is a
+directory, it loops over `.mp4` files in that directory.
+
+## Step 2: Install Sapat in the Workspace
+
+You can run Sapat directly from the source tree, but installing the package into
+a virtual environment gives you a cleaner command-line workflow.
+
+```bash
+python -m venv .venv
+source .venv/bin/activate
+python -m pip install --upgrade pip
+python -m pip install build
+python -m build
+python -m pip install dist/sapat-*.whl
+```
+
+Confirm FFmpeg is available:
+
+```bash
+ffmpeg -version
+```
+
+If FFmpeg is missing, install it in the environment your Daytona workspace uses.
+For a Debian or Ubuntu based workspace image, that usually means:
+
+```bash
+sudo apt-get update
+sudo apt-get install -y ffmpeg
+```
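+
+Before configuring a provider, it can help to confirm both tools in one pass.
+The following is a minimal sketch, not part of Sapat. The `require_cmd` helper
+name is hypothetical, and it only checks that each command is on `PATH`:
+
+```bash
+# Hypothetical preflight helper: report whether a command is on PATH.
+require_cmd() {
+  if command -v "$1" >/dev/null 2>&1; then
+    echo "ok: $1"
+  else
+    echo "missing: $1"
+    return 1
+  fi
+}
+
+# Check the tools this guide relies on; keep going if one is missing.
+for tool in python ffmpeg; do
+  require_cmd "$tool" || true
+done
+```
+
+A `missing:` line tells you which tool to install before continuing.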
+
+## Step 3: Configure One Provider
+
+Create a local `.env` file. Start with one provider instead of adding every key
+you own.
+
+For OpenAI:
+
+```bash
+OPENAI_API_KEY=your_openai_api_key_here
+OPENAI_MODEL=whisper-1
+OPENAI_API_ENDPOINT=https://api.openai.com/v1/audio/transcriptions
+OPENAI_MODEL_NAME_CHAT=gpt-4o
+```
+
+For Groq Cloud:
+
+```bash
+GROQCLOUD_API_KEY=your_groq_api_key_here
+GROQCLOUD_MODEL=whisper-large-v3-turbo
+GROQCLOUD_API_ENDPOINT=https://api.groq.com/openai/v1/audio/transcriptions
+GROQCLOUD_MODEL_NAME_CHAT=llama3-8b-8192
+```
+
+For Azure OpenAI:
+
+```bash
+AZURE_OPENAI_API_KEY=your_azure_api_key_here
+AZURE_OPENAI_ENDPOINT=https://DEPLOYMENTENDPOINTNAME.openai.azure.com
+AZURE_OPENAI_DEPLOYMENT_NAME_WHISPER=whisper
+AZURE_OPENAI_API_VERSION_WHISPER=2024-06-01
+AZURE_OPENAI_DEPLOYMENT_NAME_CHAT=gpt-4o
+AZURE_OPENAI_API_VERSION_CHAT=2023-03-15-preview
+```
+
+Add `.env` to `.gitignore` if your working copy does not already ignore it:
+
+```bash
+printf '\n.env\n' >> .gitignore
+```
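+
+Running the `printf` command twice adds a duplicate entry. As an alternative
+sketch, the following appends `.env` only when no exact `.env` line exists and
+then asks Git to confirm the rule. It assumes you run it from the repository
+root:
+
+```bash
+# Append ".env" only when no line in .gitignore already matches it exactly.
+touch .gitignore
+grep -qxF '.env' .gitignore || printf '\n.env\n' >> .gitignore
+
+# Inside a Git work tree, ask Git whether .env is now ignored.
+if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
+  git check-ignore -q .env && echo ".env is ignored"
+fi
+```
+
+If the last command prints nothing, check whether `.env` is already tracked,
+because `git check-ignore` does not report tracked files by default.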
+
+## Step 4: Prepare Recordings and Metadata
+
+Create a predictable folder structure:
+
+```bash
+mkdir -p media/raw media/review media/handoff media/glossary
+```
+
+Put recordings in `media/raw`. Use names that describe the session without
+leaking private data:
+
+```text
+media/raw/
+  customer-research-call-01.mp4
+  roadmap-demo-voiceover.mp4
+```
+
+Before you run transcription, write a short metadata note:
+
+```markdown
+# Recording Metadata
+
+Source file: customer-research-call-01.mp4
+Primary language: English
+Speaker count: 3
+Provider: OpenAI
+Allowed use: internal summary and support insight extraction
+
+Review priority:
+- Customer names
+- Product names
+- Plan names
+- Numbers, dates, and pricing
+- Sentences marked unclear
+
+Privacy notes:
+- Do not publish raw audio.
+- Remove personal phone numbers from public excerpts.
+```
+
+Save it as:
+
+```text
+media/review/customer-research-call-01-metadata.md
+```
+
+This file is boring on purpose. It gives reviewers enough context to judge the
+transcript without replaying the entire recording.
+
+## Step 5: Build a Glossary Prompt
+
+Create a glossary file with the terms that the provider may mishear:
+
+```markdown
+# Transcript Glossary
+
+Product and company terms:
+- Daytona
+- Sapat
+- Dev Container
+- Workspace
+- OpenAI
+- Groq Cloud
+- Azure OpenAI
+
+People and teams:
+- Platform Engineering
+- Developer Experience
+- Support Operations
+
+Acronyms:
+- CDE: cloud development environment
+- QA: quality assurance
+- RAG: retrieval-augmented generation
+
+Style notes:
+- Keep product names in their official capitalization.
+- Keep acronyms uppercase.
+- Do not expand acronyms unless the speaker does.
+```
+
+Save it as:
+
+```text
+media/glossary/customer-research-call-01-glossary.md
+```
+
+Now turn that glossary into a short transcription prompt. The prompt should be
+brief enough to help the model without becoming another document to parse.
+
+```bash
+cat > media/glossary/customer-research-call-01-prompt.txt <<'EOF'
+This recording discusses Daytona, Sapat, Dev Containers, OpenAI, Groq Cloud,
+Azure OpenAI, CDEs, QA, and RAG. Preserve product names and acronyms exactly.
+Use normal punctuation. Do not invent speaker names.
+EOF
+```
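+
+Providers may only read part of a long prompt. OpenAI's speech-to-text guide,
+for example, notes that Whisper considers only the final 224 tokens of a
+prompt. The sketch below counts words as a rough proxy for tokens. The
+`prompt_words` helper name and the 150-word threshold are assumptions for
+illustration, not provider limits:
+
+```bash
+# Hypothetical helper: print the number of whitespace-separated words in a file.
+prompt_words() {
+  wc -w < "$1" | tr -d '[:space:]'
+}
+
+prompt_file=media/glossary/customer-research-call-01-prompt.txt
+if [ -f "$prompt_file" ]; then
+  words=$(prompt_words "$prompt_file")
+  echo "Prompt length: $words words"
+  # 150 is an arbitrary illustration threshold, not a provider limit.
+  if [ "$words" -gt 150 ]; then
+    echo "Consider trimming the prompt to the terms most likely to be misheard."
+  fi
+fi
+```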
+
+## Step 6: Run a First Transcription Pass
+
+Run Sapat with a low temperature and the glossary prompt.
+
+```bash
+sapat media/raw/customer-research-call-01.mp4 \
+  --api openai \
+  --language en \
+  --quality H \
+  --temperature 0 \
+  --prompt "$(cat media/glossary/customer-research-call-01-prompt.txt)"
+```
+
+Sapat will:
+
+1. Convert the `.mp4` file to a temporary `.mp3`.
+2. Send that MP3 to the selected provider.
+3. Write a same-name `.txt` file beside the source video.
+4. Delete the temporary MP3 file.
+
+Sapat writes the transcript to:
+
+```text
+media/raw/customer-research-call-01.txt
+```
+
+Copy the raw transcript into the review folder before making edits:
+
+```bash
+cp media/raw/customer-research-call-01.txt \
+  media/review/customer-research-call-01-raw.txt
+```
+
+## Step 7: Review Terms Before Correcting Style
+
+Do not start by rewriting the transcript. Start by checking whether the core
+terms survived the transcription pass.
+
+```bash
+grep -n -E 'Daytona|Sapat|Groq|Azure|OpenAI|CDE|RAG|QA' \
+  media/review/customer-research-call-01-raw.txt
+```
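+
+The `grep` command above shows matching lines, but it does not tell you which
+glossary term is absent. A per-term count makes gaps easier to spot. This is a
+minimal sketch; the `count_terms` helper name is hypothetical, and it counts
+matching lines rather than individual occurrences:
+
+```bash
+# Hypothetical helper: count lines in a transcript that contain each term.
+count_terms() {
+  transcript=$1
+  shift
+  for term in "$@"; do
+    # grep -c prints 0 and exits non-zero when nothing matches; keep the 0.
+    hits=$(grep -c -w -F -- "$term" "$transcript" || true)
+    printf '%s: %s\n' "$term" "$hits"
+  done
+}
+
+transcript=media/review/customer-research-call-01-raw.txt
+if [ -f "$transcript" ]; then
+  count_terms "$transcript" Daytona Sapat 'Groq Cloud' 'Azure OpenAI' CDE RAG QA
+fi
+```
+
+A count of `0` for a term you expected is a prompt to search for likely
+misspellings before you rerun.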
+
+Create a review note:
+
+```markdown
+# Transcript Review
+
+File: customer-research-call-01-raw.txt
+
+Glossary matches:
+- Daytona: OK
+- Sapat: OK
+- Groq Cloud: check two occurrences
+- CDE: one occurrence transcribed as "CD"
+
+Numbers and dates:
+- Pricing statement at 12:40 needs manual check.
+- Launch date at 18:05 needs manual check.
+
+Unclear sections:
+- 09:30 to 09:48: speaker overlap.
+- 22:10 to 22:25: background noise.
+```
+
+Save it as:
+
+```text
+media/review/customer-research-call-01-review.md
+```
+
+This step is where glossary-aware transcription pays off. If one important term
+is wrong, you can rerun with a more specific prompt instead of editing every
+downstream artifact later.
+
+## Step 8: Rerun with a Tighter Prompt When Needed
+
+If the transcript misses a term, append a small correction hint to the prompt
+file you created in Step 5:
+
+```text
+The speaker says "CDE", meaning cloud development environment. Do not write
+"CD" or "city" when the context is developer workspaces.
+```
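+
+One way to apply the hint is to append it to the prompt file from Step 5, so
+the rerun command can stay unchanged. This sketch assumes the file path used
+earlier in this guide and first saves a copy of the original prompt. The
+`-pass-1` file name is a hypothetical convention:
+
+```bash
+prompt_file=media/glossary/customer-research-call-01-prompt.txt
+mkdir -p "$(dirname "$prompt_file")"
+
+# Keep a copy of the first-pass prompt so both passes stay reproducible.
+if [ -f "$prompt_file" ]; then
+  cp "$prompt_file" "${prompt_file%.txt}-pass-1.txt"
+fi
+
+# Append the correction hint to the prompt that Sapat will read.
+cat >> "$prompt_file" <<'EOF'
+The speaker says "CDE", meaning cloud development environment. Do not write
+"CD" or "city" when the context is developer workspaces.
+EOF
+```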
+
+Then rerun Sapat so the command reads the updated prompt file:
+
+```bash
+sapat media/raw/customer-research-call-01.mp4 \
+  --api openai \
+  --language en \
+  --quality H \
+  --temperature 0 \
+  --prompt "$(cat media/glossary/customer-research-call-01-prompt.txt)"
+```
+
+Copy the second pass separately:
+
+```bash
+cp media/raw/customer-research-call-01.txt \
+  media/review/customer-research-call-01-pass-2.txt
+```
+
+Compare the two passes:
+
+```bash
+diff -u media/review/customer-research-call-01-raw.txt \
+  media/review/customer-research-call-01-pass-2.txt | less
+```
+
+Keep the diff. It is useful evidence when you need to explain why a second
+transcription pass was necessary.
+
+## Step 9: Use the Correction Pass Carefully
+
+Sapat includes a `--correct` flag that asks the configured provider to run a
+chat correction pass after transcription. Use it when your selected provider is
+configured for both transcription and chat completion.
+
+```bash
+sapat media/raw/customer-research-call-01.mp4 \
+  --api openai \
+  --language en \
+  --quality H \
+  --temperature 0 \
+  --correct \
+  --prompt "$(cat media/glossary/customer-research-call-01-prompt.txt)"
+```
+
+Treat the corrected output as a new draft, not as the final source of truth.
+Correction models are useful for punctuation and spelling, but they can also
+smooth over uncertainty. Keep the raw pass, corrected pass, and review notes
+together.
+
+```bash
+cp media/raw/customer-research-call-01.txt \
+  media/review/customer-research-call-01-corrected.txt
+```
+
+## Step 10: Package a Handoff File
+
+Create one final handoff file for the next workflow. This could feed a summary
+prompt, a support insights report, a RAG ingestion job, or a release note draft.
+
+```markdown
+# Transcript Handoff
+
+Source: customer-research-call-01.mp4
+Final transcript: customer-research-call-01-corrected.txt
+Reviewer: your-name
+Date: 2026-05-20
+
+Status:
+- Glossary terms reviewed.
+- Numbers and dates reviewed.
+- Private contact details removed from public excerpts.
+- Two unclear sections left marked for human review.
+
+Do not use for:
+- Legal record.
+- Public quotation without speaker approval.
+- Training data without consent.
+
+Ready for:
+- Internal summary.
+- Support theme extraction.
+- Product feedback clustering.
+```
+
+Save it as:
+
+```text
+media/handoff/customer-research-call-01-handoff.md
+```
+
+## Common Issues and Troubleshooting
+
+**Problem:** Sapat says the input path is invalid.
+
+**Solution:** Confirm the file exists inside the Daytona workspace, not only on
+your local machine. Run `ls media/raw` before the Sapat command.
+
+**Problem:** FFmpeg conversion fails.
+
+**Solution:** Run `ffmpeg -version`. If FFmpeg is missing, install it in the
+workspace image. If FFmpeg exists, test with a short sample file before
+processing long recordings.
+
+**Problem:** The provider rejects the upload because of size.
+
+**Solution:** Split the source recording into smaller files or rerun with a
+lower `--quality` setting. Sapat's OpenAI and Groq provider classes validate a
+25 MB maximum audio file size after MP3 conversion.
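+
+Because Sapat deletes its temporary MP3, you can estimate the size yourself by
+converting a probe file with FFmpeg and checking it. This is a rough sketch:
+the `within_limit` helper and the `probe.mp3` name are hypothetical, it assumes
+25 MB means 25 * 1024 * 1024 bytes, and FFmpeg's default MP3 settings may not
+match the ones Sapat uses:
+
+```bash
+# 25 MB expressed in bytes (assumed binary megabytes).
+max_bytes=$((25 * 1024 * 1024))
+
+# Hypothetical helper: succeed when a file is within the size limit.
+within_limit() {
+  size=$(wc -c < "$1" | tr -d '[:space:]')
+  [ "$size" -le "$max_bytes" ]
+}
+
+# Create the probe first, for example:
+#   ffmpeg -i media/raw/customer-research-call-01.mp4 -vn probe.mp3
+probe=probe.mp3
+if [ -f "$probe" ]; then
+  if within_limit "$probe"; then
+    echo "within limit"
+  else
+    echo "too large: split the recording or lower the quality"
+  fi
+fi
+```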
+
+**Problem:** The transcript keeps misspelling one product name.
+
+**Solution:** Put the exact spelling in the glossary and the `--prompt` text.
+If the term is an acronym, include both the acronym and what it means.
+
+**Problem:** The corrected transcript reads too polished.
+
+**Solution:** Compare the corrected pass against the raw pass. Use correction
+for punctuation and spelling, but keep unclear speech marked instead of guessing
+what a speaker meant.
+
+**Problem:** Directory processing skips a file.
+
+**Solution:** Sapat's directory loop processes files ending in `.mp4`. Convert
+or rename other source formats before running a directory batch.
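+
+To see which files a batch run would pass over, you can list everything in the
+folder that does not end in `.mp4`. This is a minimal sketch with a
+hypothetical `list_skipped` helper. It assumes the directory loop is not
+recursive and that the extension match is case-sensitive, so check
+`src/sapat/script.py` if your results differ:
+
+```bash
+# Hypothetical helper: list files in a directory that do not end in ".mp4"
+# and would therefore be skipped by a batch run (assumed non-recursive).
+list_skipped() {
+  find "$1" -maxdepth 1 -type f ! -name '*.mp4' | sort
+}
+
+if [ -d media/raw ]; then
+  list_skipped media/raw
+fi
+```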
+
+## Confirmation Checklist
+
+Before handing off the transcript, confirm:
+
+- The source recording has metadata and privacy notes.
+- The glossary includes all known names, products, and acronyms.
+- The review notes record the provider, language, quality, and prompt choices.
+- The raw transcript is preserved.
+- The corrected transcript is reviewed against the glossary.
+- Remaining unclear sections are marked instead of invented.
+- The handoff file states what the transcript can and cannot be used for.
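+
+Part of this checklist can be automated by confirming that each artifact
+exists. The sketch below assumes the file names used in this guide; the
+`check_artifacts` helper is hypothetical and checks presence only, not content:
+
+```bash
+# Hypothetical helper: report which expected artifacts exist for a recording.
+check_artifacts() {
+  name=$1
+  missing=0
+  for path in \
+    "media/review/$name-metadata.md" \
+    "media/glossary/$name-glossary.md" \
+    "media/review/$name-raw.txt" \
+    "media/review/$name-review.md" \
+    "media/handoff/$name-handoff.md"; do
+    if [ -f "$path" ]; then
+      echo "ok: $path"
+    else
+      echo "missing: $path"
+      missing=1
+    fi
+  done
+  return "$missing"
+}
+
+check_artifacts customer-research-call-01 || echo "Handoff is not ready yet."
+```
+
+A passing check still needs a human review of terms, numbers, and privacy
+notes.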
+
+## Conclusion
+
+Sapat is intentionally small, which makes it a good fit for repeatable
+transcription workflows in Daytona. By adding a glossary before transcription
+and a review loop after transcription, you turn a raw speech-to-text output into
+a safer engineering artifact. The extra files are simple: metadata, glossary,
+raw transcript, corrected transcript, and handoff notes.
+
+That structure matters when transcripts become inputs for AI summaries,
+customer research, support workflows, or retrieval systems. The earlier you
+protect product names, acronyms, numbers, and privacy constraints, the less
+cleanup you need downstream.
+
+## References
+
+- [Sapat repository](https://github.com/nkkko/sapat)
+- [Sapat README](https://github.com/nkkko/sapat/blob/main/README.md)
+- [Daytona](https://www.daytona.io/)
+- [OpenAI audio transcription API](https://platform.openai.com/docs/guides/speech-to-text)
+- [Groq audio transcription docs](https://console.groq.com/docs/speech-to-text)
+- [Azure OpenAI audio documentation](https://learn.microsoft.com/azure/ai-services/openai/)
diff --git a/guides/assets/20260520_sapat_glossary_correction_daytona_img1.svg b/guides/assets/20260520_sapat_glossary_correction_daytona_img1.svg
new file mode 100644
index 00000000..9a1901c7
--- /dev/null
+++ b/guides/assets/20260520_sapat_glossary_correction_daytona_img1.svg
@@ -0,0 +1,34 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="1200" height="520" viewBox="0 0 1200 520" role="img" aria-labelledby="title desc">
+  <title id="title">Glossary-aware Sapat transcription workflow in Daytona</title>
+  <desc id="desc">A four-step workflow from recording and glossary to Sapat transcription, transcript review, and downstream handoff.</desc>
+  <rect width="1200" height="520" fill="#f7f8fb"/>
+  <rect x="56" y="68" width="244" height="156" rx="16" fill="#ffffff" stroke="#2f5f6f" stroke-width="3"/>
+  <rect x="356" y="68" width="244" height="156" rx="16" fill="#ffffff" stroke="#7d4e24" stroke-width="3"/>
+  <rect x="656" y="68" width="244" height="156" rx="16" fill="#ffffff" stroke="#335c35" stroke-width="3"/>
+  <rect x="956" y="68" width="188" height="156" rx="16" fill="#ffffff" stroke="#4d4a78" stroke-width="3"/>
+  <text x="178" y="122" text-anchor="middle" font-family="Arial, sans-serif" font-size="25" font-weight="700" fill="#172026">Recordings</text>
+  <text x="178" y="158" text-anchor="middle" font-family="Arial, sans-serif" font-size="18" fill="#42505a">MP4 files, metadata,</text>
+  <text x="178" y="184" text-anchor="middle" font-family="Arial, sans-serif" font-size="18" fill="#42505a">privacy notes</text>
+  <text x="478" y="122" text-anchor="middle" font-family="Arial, sans-serif" font-size="25" font-weight="700" fill="#172026">Glossary</text>
+  <text x="478" y="158" text-anchor="middle" font-family="Arial, sans-serif" font-size="18" fill="#42505a">Names, acronyms,</text>
+  <text x="478" y="184" text-anchor="middle" font-family="Arial, sans-serif" font-size="18" fill="#42505a">product terms</text>
+  <text x="778" y="122" text-anchor="middle" font-family="Arial, sans-serif" font-size="25" font-weight="700" fill="#172026">Sapat Run</text>
+  <text x="778" y="158" text-anchor="middle" font-family="Arial, sans-serif" font-size="18" fill="#42505a">Provider, prompt,</text>
+  <text x="778" y="184" text-anchor="middle" font-family="Arial, sans-serif" font-size="18" fill="#42505a">quality flags</text>
+  <text x="1050" y="122" text-anchor="middle" font-family="Arial, sans-serif" font-size="25" font-weight="700" fill="#172026">Handoff</text>
+  <text x="1050" y="158" text-anchor="middle" font-family="Arial, sans-serif" font-size="18" fill="#42505a">Clean transcript</text>
+  <text x="1050" y="184" text-anchor="middle" font-family="Arial, sans-serif" font-size="18" fill="#42505a">and notes</text>
+  <path d="M304 146H348" stroke="#172026" stroke-width="4" fill="none" marker-end="url(#arrow)"/>
+  <path d="M604 146H648" stroke="#172026" stroke-width="4" fill="none" marker-end="url(#arrow)"/>
+  <path d="M904 146H948" stroke="#172026" stroke-width="4" fill="none" marker-end="url(#arrow)"/>
+  <rect x="180" y="310" width="840" height="116" rx="18" fill="#172026"/>
+  <text x="600" y="354" text-anchor="middle" font-family="Arial, sans-serif" font-size="25" font-weight="700" fill="#ffffff">Review loop</text>
+  <text x="600" y="392" text-anchor="middle" font-family="Arial, sans-serif" font-size="19" fill="#e1e7ef">Check terms, rerun with a tighter prompt when needed, then package transcript evidence for downstream AI work.</text>
+  <path d="M778 232V292" stroke="#335c35" stroke-width="4" fill="none" marker-end="url(#arrow)"/>
+  <path d="M478 292V232" stroke="#7d4e24" stroke-width="4" fill="none" marker-end="url(#arrow)"/>
+  <defs>
+    <marker id="arrow" viewBox="0 0 10 10" refX="8" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse">
+      <path d="M0 0L10 5L0 10z" fill="#172026"/>
+    </marker>
+  </defs>
+</svg>