diff --git a/authors/assets/images/jalil-hadj-habib.svg b/authors/assets/images/jalil-hadj-habib.svg
new file mode 100644
index 00000000..734d0001
--- /dev/null
+++ b/authors/assets/images/jalil-hadj-habib.svg
@@ -0,0 +1,6 @@
+
diff --git a/authors/jalil-hadj-habib.md b/authors/jalil-hadj-habib.md
new file mode 100644
index 00000000..c58b8469
--- /dev/null
+++ b/authors/jalil-hadj-habib.md
@@ -0,0 +1,8 @@
+Author: Jalil Hadj Habib
+Title: Full-Stack Developer
+Description: Jalil Hadj Habib is a full-stack developer and information systems engineer focused on Laravel, Vue.js, React, TypeScript, Firebase, APIs, dashboards, and practical workflow tools for business software.
+Company Name:
+Company Description:
+Author Image: /assets/images/jalil-hadj-habib.svg
+Company Logo Dark:
+Company Logo White:
diff --git a/definitions/20260520_definition_transcript_regression_test.md b/definitions/20260520_definition_transcript_regression_test.md
new file mode 100644
index 00000000..0d8e0d59
--- /dev/null
+++ b/definitions/20260520_definition_transcript_regression_test.md
@@ -0,0 +1,24 @@
+---
+title: "Transcript Regression Test"
+description: "A repeatable quality check that compares new speech-to-text output against expected transcript phrases or fixtures."
+date: 2026-05-20
+author: "Jalil Hadj Habib"
+---
+
+# Transcript Regression Test
+
+## Definition
+
+A transcript regression test is a repeatable quality check for speech-to-text
+workflows. It runs known audio fixtures through a transcription pipeline and
+compares the generated transcript against expected phrases, terms, or reference
+text.
+
+## Context and Usage
+
+Transcript regression tests help teams detect output drift when they change a
+transcription provider, model, prompt, audio conversion setting, or correction
+step. They are especially useful when transcripts contain product names,
+technical acronyms, invoice numbers, speaker names, or domain-specific terms
+that must not disappear before the text is used in summaries, search indexes, or
+downstream AI workflows.
diff --git a/guides/20260520_guide_sapat_transcript_regression_tests.md b/guides/20260520_guide_sapat_transcript_regression_tests.md
new file mode 100644
index 00000000..4f4b7031
--- /dev/null
+++ b/guides/20260520_guide_sapat_transcript_regression_tests.md
@@ -0,0 +1,379 @@
+---
+title: "Build Transcript Regression Tests With Sapat"
+description: "Use Daytona and Sapat to create repeatable transcript smoke tests before running larger AI transcription batches."
+date: 2026-05-20
+author: "Jalil Hadj Habib"
+tags: ["daytona", "sapat", "transcription", "testing", "ai"]
+---
+
+# Build Transcript Regression Tests With Sapat
+
+## Introduction
+
+AI transcription tools are easy to run once. They are harder to trust every
+week, across different providers, prompts, languages, and audio quality levels.
+A provider can change a model behind the scenes. A prompt can improve punctuation
+but damage product names. A low-quality MP3 conversion can make a clear demo
+sound like a support call from a moving train.
+
+That is why a small transcript regression harness is useful. Instead of sending
+every recording through Sapat and discovering issues at the end of a batch, you
+keep a few short audio fixtures, expected transcript snippets, and a simple
+quality gate in a reproducible [Daytona workspace]().
+The workflow gives AI engineers a fast answer to one practical question:
+is this Sapat configuration still good enough to trust?
+
+In this guide, you will create a Daytona workspace for
+[`nkkko/sapat`](https://github.com/nkkko/sapat), run Sapat against small fixture
+files, record each run in a manifest, and compare the generated transcripts
+against expected phrases. The goal is not to build a full speech recognition
+benchmark. The goal is a lightweight [transcript regression test]()
+that catches obvious quality drift before it reaches production notes, meeting
+summaries, release packets, or RAG ingestion pipelines.
+
+## TL;DR
+
+- **Create a Daytona workspace** for Sapat so the transcription setup is repeatable.
+- **Keep short fixture recordings** that cover names, acronyms, numbers, and noisy speech.
+- **Run Sapat with one provider at a time** using `--api`, `--quality`, `--language`, `--prompt`, `--temperature`, and optional `--correct`.
+- **Store a manifest** for every smoke run so outputs are comparable.
+- **Fail early** when required phrases are missing from the generated transcript.
+
+## How the Harness Works
+
+The harness has four parts:
+
+1. fixture recordings that represent the content you care about;
+2. expected snippets that must appear in the transcript;
+3. a repeatable Sapat command for each provider or quality setting;
+4. a checker script that compares transcript output with expectations.
+
+This is deliberately smaller than a formal word error rate benchmark. Formal
+benchmarks need aligned transcripts, stable audio corpora, and scoring rules.
+For day-to-day engineering work, the first useful gate is simpler:
+
+- Did the provider keep the product name?
+- Did it preserve the API acronym?
+- Did it capture the invoice number?
+- Did the correction pass introduce or remove critical words?
+- Did the same fixture pass yesterday but fail after a prompt change?
+
+Sapat already gives you the important controls for this harness. The current CLI
+accepts a file or directory input, uses `ffmpeg` to convert media to MP3, writes
+a `.txt` sidecar beside the input file, and supports `--api openai`, `--api groq`,
+or `--api azure`. It also exposes `--quality`, `--language`, `--prompt`,
+`--temperature`, and `--correct`, which are exactly the knobs that tend to change
+transcript output.
+
+## Step 1: Create the Daytona Workspace
+
+Install Daytona if it is not already available on your machine:
+
+```bash
+curl -L https://download.daytona.io/daytona/install.sh | sudo bash
+```
+
+Create a workspace from the Sapat repository:
+
+```bash
+daytona create https://github.com/nkkko/sapat --code
+```
+
+Inside the workspace terminal, confirm the project layout:
+
+```bash
+ls
+find src/sapat -maxdepth 3 -type f | sort
+```
+
+You should see the Sapat package, including the Click-based CLI in
+`src/sapat/script.py` and provider implementations under `src/sapat/transcription`.
+The important behavior for this guide is:
+
+- `sapat <path>` processes one video file or all `.mp4` files in a directory;
+- `--api` chooses `openai`, `groq`, or `azure`;
+- `--quality` chooses MP3 conversion quality: `L`, `M`, or `H`;
+- `--correct` runs an LLM correction pass after transcription;
+- the generated transcript is saved as a `.txt` file beside the source media.
+
+## Step 2: Install Dependencies and Configure Secrets
+
+Create a virtual environment and install dependencies:
+
+```bash
+python -m venv .venv
+source .venv/bin/activate
+pip install -r requirements.txt
+pip install -e .
+```
+
+Confirm the CLI is available:
+
+```bash
+sapat --help
+```
+
+Then create a local `.env` file. Sapat supports Azure OpenAI, Groq, and OpenAI.
+Use only the provider you plan to test first:
+
+```env
+# Groq
+GROQCLOUD_API_KEY=your_groq_key
+GROQCLOUD_MODEL=whisper-large-v3-turbo
+GROQCLOUD_API_ENDPOINT=https://api.groq.com/openai/v1/audio/transcriptions
+GROQCLOUD_MODEL_NAME_CHAT=llama3-8b-8192
+```
+
+Keep `.env` out of Git. Daytona gives you a reproducible workspace, but secrets
+should still be local environment data, not repository content.
+
+## Step 3: Create Fixture and Expectation Files
+
+Create a small test area:
+
+```bash
+mkdir -p transcript-tests/fixtures transcript-tests/expected transcript-tests/runs
+```
+
+Add two or three short `.mp4` files to `transcript-tests/fixtures`. Keep them
+small: 10 to 45 seconds each is enough. Good fixtures cover the words that are
+expensive to lose:
+
+| Fixture | What it should test | Example required phrases |
+| --- | --- | --- |
+| `api-demo.mp4` | acronyms and endpoint names | `Sapat`, `Groq`, `OpenAI`, `webhook` |
+| `support-call.mp4` | noisy speech and numbers | `invoice 4729`, `Friday`, `refund` |
+| `release-note.mp4` | product names and action items | `beta dashboard`, `migration`, `owner` |
+
+For each fixture, create a matching expectation file. Example:
+
+```bash
+cat > transcript-tests/expected/api-demo.expected.txt <<'EOF'
+Sapat
+Groq
+OpenAI
+webhook
+EOF
+```
+
+These files do not need to contain the full transcript. They contain terms that
+must survive the transcription path. This keeps the gate maintainable when
+providers produce slightly different punctuation or sentence breaks.
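+
+Before running anything, it can help to confirm that every fixture has an
+expectation file and vice versa. The sketch below is one way to do that,
+assuming the directory layout and the `.expected.txt` naming used in this
+guide. The function name `unpaired_fixtures` is hypothetical and not part of
+Sapat.
+
+```python
+from pathlib import Path
+
+SUFFIX = ".expected.txt"
+
+
+def unpaired_fixtures(fixtures_dir, expected_dir):
+    """Return (fixtures without expectations, expectations without fixtures)."""
+    fixture_names = {p.stem for p in Path(fixtures_dir).glob("*.mp4")}
+    expected_names = {
+        p.name[: -len(SUFFIX)] for p in Path(expected_dir).glob("*" + SUFFIX)
+    }
+    return (
+        sorted(fixture_names - expected_names),
+        sorted(expected_names - fixture_names),
+    )
+
+
+if __name__ == "__main__":
+    no_expectation, no_fixture = unpaired_fixtures(
+        "transcript-tests/fixtures", "transcript-tests/expected"
+    )
+    for name in no_expectation:
+        print(f"fixture without expectation file: {name}.mp4")
+    for name in no_fixture:
+        print(f"expectation file without fixture: {name}{SUFFIX}")
+```
+
+An unpaired fixture is transcribed but never checked, so it is worth catching
+before a batch run.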
+
+## Step 4: Run a Smoke Transcription With Sapat
+
+Start with one fixture and one provider:
+
+```bash
+sapat transcript-tests/fixtures/api-demo.mp4 \
+ --api groq \
+ --quality M \
+ --language en \
+ --prompt "Technical product demo with API names: Sapat, Groq, OpenAI, webhook." \
+ --temperature 0.3
+```
+
+Sapat converts the media to MP3, transcribes the audio, removes the temporary
+MP3, and writes:
+
+```text
+transcript-tests/fixtures/api-demo.txt
+```
+
+Read the transcript before automating anything:
+
+```bash
+sed -n '1,120p' transcript-tests/fixtures/api-demo.txt
+```
+
+If the output is wildly wrong, adjust only one variable at a time. For example,
+try `--quality H` before changing the prompt, or try `--temperature 0` before
+turning on `--correct`. Regression tests are useful because they keep those
+choices visible.
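+
+One way to keep those choices visible is to hold them in a single settings
+dictionary and build the command from it. This is a minimal sketch, not part of
+Sapat: the helper name `build_sapat_command` and the dictionary keys are
+assumptions, and the only flags it emits are the ones this guide already uses.
+
+```python
+def build_sapat_command(input_path, settings):
+    """Return the argv list for one Sapat run described by a settings dict."""
+    command = [
+        "sapat",
+        str(input_path),
+        "--api", settings["provider"],
+        "--quality", settings["quality"],
+        "--language", settings["language"],
+        "--temperature", str(settings["temperature"]),
+    ]
+    # The prompt and the correction pass are optional in this sketch.
+    if settings.get("prompt"):
+        command += ["--prompt", settings["prompt"]]
+    if settings.get("correct"):
+        command.append("--correct")
+    return command
+
+
+if __name__ == "__main__":
+    settings = {
+        "provider": "groq",
+        "quality": "M",
+        "language": "en",
+        "temperature": 0.3,
+        "prompt": "Technical product demo with API names: Sapat, Groq, OpenAI, webhook.",
+        "correct": False,
+    }
+    print(build_sapat_command("transcript-tests/fixtures/api-demo.mp4", settings))
+```
+
+Passing the returned list to `subprocess.run(command, check=True)` would run
+the same command as the shell example, and the dictionary can be reused when
+recording the run.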
+
+## Step 5: Save a Run Manifest
+
+Create a small manifest after each test run:
+
+```bash
+cat > transcript-tests/runs/$(date -u +%Y%m%dT%H%M%SZ)-groq-api-demo.json <<'EOF'
+{
+ "fixture": "api-demo.mp4",
+ "provider": "groq",
+ "quality": "M",
+ "language": "en",
+ "temperature": 0.3,
+ "correct": false,
+ "prompt": "Technical product demo with API names: Sapat, Groq, OpenAI, webhook.",
+ "output": "api-demo.txt"
+}
+EOF
+```
+
+The manifest is not complicated, but it matters. When a transcript changes, you
+can see whether the provider, quality level, prompt, correction setting, or input
+file changed with it.
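+
+The heredoc above has to be edited by hand for every run. As an alternative,
+here is a hedged sketch that writes the manifest from the settings actually
+used. The function name `write_manifest` and the two extra fields
+(`created_utc`, `transcript_sha256`) are assumptions added for illustration;
+the other keys mirror the manifest shown above.
+
+```python
+import hashlib
+import json
+from datetime import datetime, timezone
+from pathlib import Path
+
+
+def write_manifest(run_dir, transcript_path, settings):
+    """Write one JSON manifest for a finished run and return its path."""
+    transcript_path = Path(transcript_path)
+    now = datetime.now(timezone.utc)
+
+    record = dict(settings)
+    record["output"] = transcript_path.name
+    record["created_utc"] = now.strftime("%Y-%m-%dT%H:%M:%SZ")
+    # Hash the transcript so two runs can be compared without opening the text.
+    record["transcript_sha256"] = hashlib.sha256(
+        transcript_path.read_bytes()
+    ).hexdigest()
+
+    run_dir = Path(run_dir)
+    run_dir.mkdir(parents=True, exist_ok=True)
+    stamp = now.strftime("%Y%m%dT%H%M%SZ")
+    name = f"{stamp}-{settings['provider']}-{transcript_path.stem}.json"
+    manifest_path = run_dir / name
+    manifest_path.write_text(json.dumps(record, indent=2) + "\n", encoding="utf-8")
+    return manifest_path
+```
+
+If two manifests share the same settings but have different hashes, the change
+came from the provider or the input file rather than from your configuration.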
+
+## Step 6: Add a Phrase Gate
+
+Create a simple checker:
+
+```bash
+cat > transcript-tests/check_transcript.py <<'PY'
+from pathlib import Path
+import sys
+
+# Expect exactly two arguments: the expectation file and the transcript file.
+if len(sys.argv) != 3:
+    print("usage: check_transcript.py EXPECTED_FILE TRANSCRIPT_FILE")
+    raise SystemExit(2)
+
+expected_path = Path(sys.argv[1])
+transcript_path = Path(sys.argv[2])
+
+# One required phrase per line; blank lines are ignored. casefold() makes the
+# comparison case-insensitive.
+expected = [
+    line.strip().casefold()
+    for line in expected_path.read_text(encoding="utf-8").splitlines()
+    if line.strip()
+]
+transcript = transcript_path.read_text(encoding="utf-8").casefold()
+
+# Plain substring match: a phrase passes if it appears anywhere in the transcript.
+missing = [phrase for phrase in expected if phrase not in transcript]
+
+if missing:
+    print("Missing required transcript phrases:")
+    for phrase in missing:
+        print(f"- {phrase}")
+    raise SystemExit(1)
+
+print(f"PASS: {transcript_path.name} includes {len(expected)} required phrases")
+PY
+```
+
+Run the gate:
+
+```bash
+python transcript-tests/check_transcript.py \
+ transcript-tests/expected/api-demo.expected.txt \
+ transcript-tests/fixtures/api-demo.txt
+```
+
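+This is a smoke test, not a final editor. It should fail loudly when important
+words disappear and stay quiet when punctuation around them changes. Note that
+the match is a plain substring test, so punctuation or a line break inside a
+multi-word phrase such as `invoice 4729` still counts as a miss.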
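+
+One way to loosen the checker's substring comparison is to normalize both the
+expected phrases and the transcript before comparing. The sketch below is an
+optional variant, not the checker used in the rest of this guide, and the
+function names `normalize` and `missing_phrases` are hypothetical.
+
+```python
+import re
+import unicodedata
+
+
+def normalize(text):
+    """Casefold, turn punctuation into spaces, and collapse whitespace."""
+    text = unicodedata.normalize("NFKC", text).casefold()
+    # Replace every run of characters that are not letters, digits, or
+    # underscores with a single space.
+    text = re.sub(r"[^\w]+", " ", text)
+    return " ".join(text.split())
+
+
+def missing_phrases(expected_phrases, transcript):
+    """Return the expected phrases that do not occur as whole-word runs."""
+    haystack = f" {normalize(transcript)} "
+    missing = []
+    for phrase in expected_phrases:
+        needle = normalize(phrase)
+        if needle and f" {needle} " not in haystack:
+            missing.append(phrase)
+    return missing
+
+
+if __name__ == "__main__":
+    transcript = "Invoice, 4729 is due\nFriday."
+    print(missing_phrases(["invoice 4729", "Friday", "refund"], transcript))
+```
+
+Unlike the substring check, this variant matches whole words only, so `API`
+is not satisfied by `rapid`. That is a design choice: it is stricter about
+word boundaries and more forgiving about punctuation.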
+
+## Step 7: Compare Quality and Correction Settings
+
+Now run the same fixture with a second configuration:
+
+```bash
+sapat transcript-tests/fixtures/api-demo.mp4 \
+ --api groq \
+ --quality H \
+ --language en \
+ --prompt "Technical product demo with API names: Sapat, Groq, OpenAI, webhook." \
+ --temperature 0.3 \
+ --correct
+```
+
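+Run the same phrase gate again. This run changes two settings at once
+(`--quality H` and `--correct`), so if the gate fails, rerun with only one of
+them changed to see which setting caused it. If both `M` and `H` pass, keep `M`
+unless the full transcript shows quality problems. If `--correct` changes a
+required phrase, do not use it blindly for that content type. Correction can
+improve readability, but it can also normalize technical names into more common
+words.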
+
+For a larger comparison, record the results in a table in your run notes. The
+rows below are illustrative:
+
+| Fixture | Provider | Quality | Correct | Phrase gate | Human note |
+| --- | --- | --- | --- | --- | --- |
+| `api-demo.mp4` | Groq | M | no | pass | Best cost-quality tradeoff |
+| `api-demo.mp4` | Groq | H | no | pass | No visible improvement |
+| `api-demo.mp4` | Groq | H | yes | fail | Corrected `Sapat` to `support` |
+
+This gives your team a simple decision log before they process a directory of
+customer calls, podcast episodes, demos, or engineering meetings.
+
+## Step 8: Run a Directory Batch Only After the Gate Passes
+
+Once your fixtures pass, run Sapat on a directory:
+
+```bash
+sapat transcript-tests/fixtures \
+ --api groq \
+ --quality M \
+ --language en \
+ --prompt "Technical conversations with product names, acronyms, and API terms." \
+ --temperature 0.3
+```
+
+Sapat processes every `.mp4` file in the directory. After the run, check each
+expected file against its transcript:
+
+```bash
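+status=0
+for expected in transcript-tests/expected/*.expected.txt; do
+  name="$(basename "$expected" .expected.txt)"
+  # Remember any failure instead of letting a later pass hide it.
+  python transcript-tests/check_transcript.py \
+    "$expected" \
+    "transcript-tests/fixtures/$name.txt" || status=1
+done
+[ "$status" -eq 0 ] && echo "All fixtures passed" || echo "At least one fixture failed"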
+```
+
+If the loop passes, you have enough confidence to process the larger source
+folder using the same provider and settings.
+
+## Common Issues and Troubleshooting
+
+**Problem:** `sapat` is not found after installation.
+
+**Solution:** activate the virtual environment again with
+`source .venv/bin/activate`, or run `pip install -e .` inside the Daytona
+workspace.
+
+**Problem:** `ffmpeg` is missing.
+
+**Solution:** install it in the workspace image or terminal:
+
+```bash
+sudo apt-get update
+sudo apt-get install -y ffmpeg
+```
+
+**Problem:** the transcript is empty or very short.
+
+**Solution:** verify that the source file has an audio track. Run the command
+below and look for a `Stream` line marked `Audio:` in the output:
+
+```bash
+ffprobe transcript-tests/fixtures/api-demo.mp4
+```
+
+**Problem:** product or provider names are transcribed incorrectly even with a good recording.
+
+**Solution:** add those terms to the `--prompt` value and rerun the fixture. If
+the issue only appears after `--correct`, disable correction or tighten the
+correction prompt in the provider implementation before using it in production.
+
+**Problem:** a directory batch overwrote a previous `.txt` output.
+
+**Solution:** copy important transcripts into a timestamped run folder after each
+batch. Sapat writes sidecar `.txt` files next to the media, so the harness should
+treat those files as current outputs, not permanent archives.
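+
+A minimal sketch of that copy step, assuming the `transcript-tests` layout from
+this guide. The function name `archive_transcripts` and the `label` argument
+are hypothetical:
+
+```python
+import shutil
+from datetime import datetime, timezone
+from pathlib import Path
+
+
+def archive_transcripts(fixtures_dir, runs_dir, label):
+    """Copy every .txt sidecar into runs_dir/<UTC timestamp>-<label>/."""
+    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
+    target = Path(runs_dir) / f"{stamp}-{label}"
+    target.mkdir(parents=True, exist_ok=True)
+
+    copied = []
+    for transcript in sorted(Path(fixtures_dir).glob("*.txt")):
+        # copy2 keeps the modification time, which records when the
+        # transcript was written.
+        copied.append(Path(shutil.copy2(transcript, target / transcript.name)))
+    return copied
+
+
+if __name__ == "__main__":
+    for path in archive_transcripts(
+        "transcript-tests/fixtures", "transcript-tests/runs", "groq-M"
+    ):
+        print(f"archived {path}")
+```
+
+Running it after each batch leaves the sidecar files in place for the phrase
+gate while keeping a dated copy for later comparison.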
+
+## Conclusion
+
+Sapat gives AI engineers a practical CLI for converting videos to MP3,
+transcribing them through OpenAI, Groq, or Azure OpenAI, and saving sidecar text
+files. Daytona gives the same team a repeatable place to run that workflow
+without rebuilding the environment every time.
+
+The transcript regression harness connects those pieces. With a few fixtures,
+expected phrases, run manifests, and a small checker script, you can catch drift
+before it hits real content. That makes Sapat more useful for production-like
+workflows where transcripts feed release notes, support summaries, knowledge
+bases, incident reviews, or AI search.
+
+## References
+
+- [Sapat repository](https://github.com/nkkko/sapat)
+- [Daytona repository](https://github.com/daytonaio/daytona)
+- [OpenAI audio transcription API](https://platform.openai.com/docs/guides/speech-to-text)
+- [Groq speech-to-text documentation](https://console.groq.com/docs/speech-to-text)
+- [Azure OpenAI audio documentation](https://learn.microsoft.com/azure/ai-services/openai/whisper-quickstart)
diff --git a/guides/assets/images/20260520_sapat_transcript_regression_tests_flow.svg b/guides/assets/images/20260520_sapat_transcript_regression_tests_flow.svg
new file mode 100644
index 00000000..946d4ee8
--- /dev/null
+++ b/guides/assets/images/20260520_sapat_transcript_regression_tests_flow.svg
@@ -0,0 +1,39 @@
+