Add F1 radio RAG demo (full Opik eval loop) #12
Open
fschlz wants to merge 1 commit into
use-cases/f1_radio_rag: a runnable Typer CLI walking the entire Opik loop over a synthetic F1 team-radio RAG (ChromaDB): ingest -> ask (traced) -> eval (dataset + test suite + ContextRecall/Hallucination metrics) -> optimize (Optimization Studio) -> promote (Prompt Library). Every command has a DRY_RUN path; credentials read from env vars only.
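The DRY_RUN path and the env-var-only credentials described above could look roughly like the sketch below. It is illustrative only and is not the code in this PR: the variable names `OPIK_API_KEY` and `ANTHROPIC_API_KEY`, reading `DRY_RUN` from the environment, and the helper names are all assumptions.

```python
import os

# Assumed variable names; the PR only states that credentials come from env vars.
REQUIRED_VARS = ("OPIK_API_KEY", "ANTHROPIC_API_KEY")
TRUTHY = {"1", "true", "yes", "on"}


def dry_run_enabled(env=None):
    """Return True when DRY_RUN holds a truthy value (assumed convention)."""
    env = os.environ if env is None else env
    return env.get("DRY_RUN", "").strip().lower() in TRUTHY


def load_credentials(env=None, required=REQUIRED_VARS):
    """Read credentials from the environment only; fail loudly if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}


def ask(question, env=None):
    """Hypothetical command body: skip all network calls when DRY_RUN is on."""
    env = os.environ if env is None else env
    if dry_run_enabled(env):
        return f"[dry-run] would retrieve context and summarise: {question}"
    creds = load_credentials(env)
    # A real command would call the retriever and the model here.
    return f"[live] {len(creds)} credentials loaded for: {question}"
```

Checking the dry-run flag before loading credentials means a dry run works on a machine with no keys configured, which is one plausible reason to give every command such a path.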
What
A new `use-cases/f1_radio_rag/` demo: a small, runnable Typer CLI that walks the entire Opik evaluation-and-improvement loop for a RAG use case (summarising F1 team-radio messages across a race weekend).

The loop (one CLI command each)
- `ingest`: load radio messages into a local ChromaDB store (offline)
- `ask`: retrieve + summarise with Claude (via litellm), traced in Opik
- `eval`: build an Opik dataset + test suite; run plain-English assertions (`run_tests`) and the `ContextRecall` + `Hallucination` metrics (`evaluate`)
- `optimize`: `MetaPromptOptimizer` improves the summariser prompt against the dataset
- `promote`: save the optimised prompt to the Prompt Library (versioned)
- `run-all`: the whole chain

Notes
- `uv` project (`pyproject.toml` + `uv.lock`); `pip install` line also in the README.
- `.env.example` provided.
- Data is synthetic (`team_radio` is audio-only); the loop is identical for real transcripts. README carries this + the optimizer-scope caveat.
- [...] `use-cases/README.md`.
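For reviewers who want a feel for the command layout, the chain described above (ingest -> ask -> eval -> optimize -> promote, plus run-all) can be sketched as a tiny dispatcher. This is a stand-in, not the PR's implementation: it uses `argparse` from the standard library instead of Typer so it runs without dependencies, every step is a stub that only reports what it would do, and the names `STEPS`, `run`, and `main` are hypothetical.

```python
import argparse

# Step order and descriptions taken from the PR text; the bodies are stubs.
STEPS = {
    "ingest": "load radio messages into the local store",
    "ask": "retrieve and summarise, with tracing",
    "eval": "build dataset and test suite, run assertions and metrics",
    "optimize": "improve the summariser prompt against the dataset",
    "promote": "save the optimised prompt as a new version",
}


def run(command):
    """Return one report line per executed step; run-all executes every step in order."""
    if command == "run-all":
        names = list(STEPS)
    elif command in STEPS:
        names = [command]
    else:
        raise ValueError(f"unknown command: {command}")
    return [f"[stub] {name}: {STEPS[name]}" for name in names]


def main(argv):
    """Parse one positional command and print the report lines for it."""
    parser = argparse.ArgumentParser(prog="f1-radio-rag-sketch")
    parser.add_argument("command", choices=list(STEPS) + ["run-all"])
    args = parser.parse_args(argv)
    for line in run(args.command):
        print(line)
    return 0
```

Calling `main(["run-all"])` prints the five stub lines in loop order, while `main(["eval"])` prints only the eval line. Keeping the step order in a single mapping means run-all cannot drift from the individual commands.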