Precision oncology powered by Quantum Machine Learning, Multi-LLM Clinical Reasoning, and SOPHiA DDM™ Integration
GenomicOracle is a full-stack genomic analysis platform built for the SOPHiA Genetics × ETH Zurich Future of Health Hackathon (May 8–10, 2026). It combines three pillars of modern precision oncology:
- ⚛️ Quantum Machine Learning — Variational Quantum Circuits (VQC) for pathogenicity scoring and Grover-inspired combinatorial search for multi-variant interactions (PennyLane)
- 🤖 Multi-LLM Clinical Reasoning — Evidence-based treatment recommendations and interactive clinical Q&A powered by Claude, Gemini, or GPT-4o
- 🔬 SOPHiA DDM™ Integration — Native VQS API connectivity and CSV export support for real-world genomic data from clinical labs
The platform ships with two complementary interfaces:
| Interface | Port | Purpose |
|---|---|---|
| GenomicOracle (main.py) | 5000 | Full patient analysis: VCF upload, QML scoring, LLM reports, PDF export, AI chat |
| Genomic Co-Pilot (GenomicSophia/app.py) | 5001 | Clinical triage dashboard: bulk case ranking, 4-metric scoring, PubMed literature search |
| Feature | Description |
|---|---|
| 🧬 VCF File Upload | Parse standard .vcf files or enter genomic variants manually |
| ⚛️ Quantum ML Scoring | Variational Quantum Circuit (VQC) pathogenicity scoring via PennyLane |
| 🔍 Grover-Inspired Search | Quantum combinatorial search across multi-variant state spaces for pathogenic interactions |
| 🤖 LLM Treatment Reports | NCCN/ESMO-aligned clinical oncology reports generated by Claude, Gemini, or GPT-4o |
| 💬 Interactive AI Chat | Context-aware clinical Q&A chatbot on the results page |
| 🎓 VQC Training Dashboard | Live training visualization with loss curves and terminal logs |
| 📄 PDF Report Export | Download formatted clinical genomics reports as PDF (xhtml2pdf) |
| 🔒 Offline Fallback | Full expert rule-base when no API key is configured — covers EGFR, KRAS, BRAF, PIK3CA, ALK, BRCA, and more |
| Feature | Description |
|---|---|
| 📊 4-Metric Case Scoring | ABCD Prediction (35%) + ClinVar Evidence (25%) + Community Frequency (20%) + QA Confidence (20%) |
| 🏥 Case Priority Ranking | 🔴 Critical → 🟠 High → 🟡 Moderate → 🟢 Low urgency tiers |
| 📁 Bulk CSV Analysis | Drop SOPHiA DDM™ exports into /samples — auto-loaded and ranked on startup |
| 🔌 VQS API Integration | Live connection to SOPHiA DDM™ platform via dataset keys |
| 🧠 AI Clinical Summaries | Optional Claude/Gemini/GPT-4o 2-sentence case summaries |
| ⚡ Deterministic Summaries | Always-available rule-based 2-sentence summaries (no API key needed) |
| 📚 PubMed Literature Search | Per-variant paper search via NCBI E-utilities with clickable paper cards |
| 💊 Therapy Matching | 21-gene actionable database with FDA-approved targeted therapies |
| 🌓 Dark/Light Mode | Toggle between dark and light themes |
| Feature | Description |
|---|---|
| 📋 variant_analysis_ai.py | Batch-process up to 10 SOPHiA DDM™ CSV files via Claude API |
| 🎯 AI Importance Scoring | 0–100 importance score per sample with 2-sentence clinical summary |
| 📊 CSV Output | Saves aggregated results to ai_analysis.csv |
┌─────────────────────────────────────────────────────────────────────────┐
│ GenomicOracle Platform │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ main.py — Full Patient Analysis (port 5000) │ │
│ │ ┌──────────────┬─────────────────┬──────────────────────────┐ │ │
│ │ │ VCF Parser │ QML Module │ LLM Module │ │ │
│ │ │ (vcfpy) │ (qml_module.py) │ (llm_module.py) │ │ │
│ │ │ │ │ │ │ │
│ │ │ │ • VQC Scoring │ • Claude Sonnet / Opus │ │ │
│ │ │ │ • VQC Features │ • Gemini 2.5 Flash │ │ │
│ │ │ │ • Grover Search │ • GPT-4o │ │ │
│ │ │ │ (PennyLane) │ • Offline Fallback │ │ │
│ │ └──────────────┴─────────────────┴──────────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ GenomicSophia/app.py — Co-Pilot Triage Dashboard (port 5001) │ │
│ │ ┌───────────────────┬───────────────────┬────────────────────┐ │ │
│ │ │ 4-Metric Scoring │ VQS API Client │ LLM Summaries │ │ │
│ │ │ (ai_engine.py) │ (vqs_client.py) │ (ai_engine.py) │ │ │
│ │ │ │ │ │ │ │
│ │ │ • ABCD Score │ • IAM Auth │ • Claude Sonnet │ │ │
│ │ │ • ClinVar Score │ • VQS Query │ • Gemini Flash │ │ │
│ │ │ • Community Freq │ • Schema Fetch │ • GPT-4o │ │ │
│ │ │ • QA Confidence │ • CSV Parser │ • Deterministic │ │ │
│ │ └───────────────────┴───────────────────┴────────────────────┘ │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ variant_analysis_ai.py — CLI Batch Analyzer │ │
│ │ • Claude API (Opus 4) batch CSV scoring + 2-sentence summaries │ │
│ └────────────────────────────────────────────────────────────────────┘ │
│ │
│ Shared: vqs_client.py · ai_engine.py │
└─────────────────────────────────────────────────────────────────────────┘
- Python 3.10+
- (Optional) An API key for one of: Anthropic Claude, Google Gemini, or OpenAI GPT-4o
# Clone the repository
git clone https://github.com/yacine-baghli/GenomicOracle.git
cd GenomicOracle
# Create a virtual environment
python -m venv .venv
# Activate it
# Windows
.venv\Scripts\activate
# macOS / Linux
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
python main.py
Open http://127.0.0.1:5000 to enter patient data, upload VCF files or enter variants manually, and generate AI-powered clinical reports.
cd GenomicSophia
pip install -r requirements.txt
python app.py
Open http://127.0.0.1:5001. The dashboard auto-loads and ranks all CSV files from samples/ on startup.
# Set API key (Windows syntax; on macOS / Linux use export instead of set)
set ANTHROPIC_API_KEY=sk-ant-...
# Analyze one or more CSV files
python variant_analysis_ai.py samples/patient1.csv samples/patient2.csv --output results.csv
You can configure your LLM provider in two ways:
- Via the UI — Click ⚙️ API Key in the dashboard header, select your provider, paste your key, and save.
- Via environment variables:
# Set one of the following before launching (Windows syntax; on macOS / Linux use export instead of set)
# Anthropic Claude (recommended)
set ANTHROPIC_API_KEY=your-key
# Google Gemini
set GEMINI_API_KEY=your-key
# OpenAI GPT-4o
set OPENAI_API_KEY=your-key
Note: If no API key is provided, both interfaces operate in Offline Deterministic Mode using built-in expert rule-bases that cover 21+ actionable oncological genes (EGFR, KRAS, BRAF, PIK3CA, ALK, BRCA1/2, RET, NTRK, MET, and more).
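The environment-variable fallback described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `select_provider`, the provider labels, and the priority order (Claude, then Gemini, then OpenAI) are assumptions. Only the three variable names and the offline fallback come from this README.

```python
import os

# Variable names come from the README; the priority order is an assumption.
PROVIDER_ENV_VARS = [
    ("claude", "ANTHROPIC_API_KEY"),
    ("gemini", "GEMINI_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
]


def select_provider(env=None):
    """Return (provider, api_key), or ("offline", None) when no key is set."""
    env = os.environ if env is None else env
    for provider, var_name in PROVIDER_ENV_VARS:
        key = env.get(var_name, "").strip()
        if key:
            return provider, key
    # No key configured: fall back to the deterministic rule-base mode.
    return "offline", None


if __name__ == "__main__":
    print(select_provider())
```

Passing a plain dictionary instead of `os.environ` makes the selection easy to exercise without touching the real environment.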
Fill in the patient's name, age, gender, and clinical condition (e.g., Lung Adenocarcinoma).
- Upload a VCF file, or
- Manually enter variants (Gene, HGVS notation, Zygosity)
- Or click Load Sample Data to try one of 5 pre-built clinical scenarios
The platform will:
- Annotate variants against ClinVar
- Score pathogenicity with the Variational Quantum Circuit
- Search for multi-variant interactions via Grover-inspired quantum search
- Generate an LLM-powered clinical oncology report (or expert rule-base fallback)
You can then:
- Read targeted therapy recommendations aligned with NCCN/ESMO guidelines
- Chat with the AI about the patient's genomic profile
- Download the report as a PDF
Navigate to the ⚛ Train Quantum Model page to run the VQC training pipeline with live loss convergence visualization.
| Tab | How |
|---|---|
| 📄 CSV Data | Drop SOPHiA DDM™ export CSVs into GenomicSophia/samples/ — they auto-load. Or upload via the dashboard. |
| 🔌 VQS API | Enter SOPHiA DDM™ credentials + dataset keys from browser DevTools. |
- Load data: CSVs from /samples are auto-loaded on page load
- Rank cases: Click 🔬 Rank All Cases (or 🔍 Analyze for VQS API)
- Review: Cases are sorted by urgency. Click any case card to expand it.
- Explore variants: The variant table shows all scored variants, sorted by composite score
- Literature: Click 📚 on any variant row to search PubMed for relevant papers
- AI Summaries: Set an LLM API key and select a provider to get AI-generated clinical summaries
- 4-metric radar — ABCD Prediction, ClinVar Evidence, Community Frequency, QA Confidence
- 🧠 AI Clinical Summary — 2-sentence LLM narrative (when API key is configured)
- ⚡ Deterministic Analysis — Always-available 2-sentence summary + bullet breakdown + action label
- 💊 Therapy tags — Matched from the 21-gene actionable database
- Variant table — Full priority ranking with gene links to ClinVar, per-variant PubMed search
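A per-variant PubMed lookup of the kind mentioned above could start from an NCBI E-utilities ESearch URL. The sketch below only builds the URL and performs no network call. The function name `pubmed_search_url` and the way gene and variant are joined into one search term are assumptions. The endpoint and the `db`, `term`, `retmode`, and `retmax` parameters are standard ESearch parameters.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(gene, variant=None, max_results=5):
    """Build an ESearch URL for a gene, optionally narrowed by a variant label."""
    # Joining gene and variant with a space is an assumed query style.
    term = f"{gene} {variant}" if variant else gene
    query = urlencode(
        {"db": "pubmed", "term": term, "retmode": "json", "retmax": max_results}
    )
    return f"{ESEARCH_URL}?{query}"


if __name__ == "__main__":
    print(pubmed_search_url("EGFR", "L858R"))
```

The JSON response of ESearch lists PubMed IDs, which a dashboard would then resolve into titles and links with a second request.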
Each variant is scored across 4 metrics (0–100), weighted and combined into a composite score.
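As a rough sketch of that weighting, the snippet below combines four metric scores using the 35/25/20/20 split listed in the Co-Pilot feature table. The function name `composite_score` and the dictionary keys are hypothetical and are not taken from ai_engine.py.

```python
# Weights from the 4-Metric Case Scoring row; key names are illustrative.
WEIGHTS = {"abcd": 0.35, "clinvar": 0.25, "community": 0.20, "qa": 0.20}


def composite_score(metrics):
    """Weighted sum of the four 0-100 metric scores."""
    missing = WEIGHTS.keys() - metrics.keys()
    if missing:
        raise KeyError(f"missing metrics: {sorted(missing)}")
    return sum(WEIGHTS[name] * metrics[name] for name in WEIGHTS)


if __name__ == "__main__":
    example = {"abcd": 95, "clinvar": 90, "community": 90, "qa": 85}
    print(round(composite_score(example), 2))
```

Because the weights sum to 1, the composite stays on the same 0–100 scale as its inputs.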
SOPHiA DDM™'s proprietary ABCD predictor — the primary pathogenicity signal.
| ABCD Class | Base Score | + Consequence Bonus |
|---|---|---|
| A — High confidence pathogenic | 65 | Nonsense/Frameshift/Splice: +30 |
| B — Moderate confidence | 45 | Missense: +12 |
| C — Low confidence | 20 | Inframe: +5 |
| D — Benign prediction | 5 | — |
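One way to read the table above is as a base score per ABCD class plus an independent bonus per consequence type. The sketch below follows that reading. The lowercase consequence labels, the function name `abcd_score`, and the cap at 100 are assumptions.

```python
from typing import Optional

ABCD_BASE = {"A": 65, "B": 45, "C": 20, "D": 5}
# Consequence labels are illustrative spellings of the table entries.
CONSEQUENCE_BONUS = {
    "nonsense": 30,
    "frameshift": 30,
    "splice": 30,
    "missense": 12,
    "inframe": 5,
}


def abcd_score(abcd_class: str, consequence: Optional[str] = None) -> int:
    """Base score for the ABCD class plus a consequence bonus, capped at 100."""
    base = ABCD_BASE[abcd_class.upper()]
    bonus = CONSEQUENCE_BONUS.get((consequence or "").lower(), 0)
    return min(100, base + bonus)  # the cap is an assumption


if __name__ == "__main__":
    print(abcd_score("A", "frameshift"), abcd_score("B", "missense"))
```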
| ClinVar Significance | Points | + Review Status | Points |
|---|---|---|---|
| Pathogenic | +55 | Expert panel | +35 |
| Likely Pathogenic | +40 | Multiple submitters | +25 |
| VUS | +15 | Single submitter | +15 |
| Benign | +5 | No assertion | +5 |
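The ClinVar table can be read as two additive lookups, one for clinical significance and one for review status. The sketch below assumes exactly that. The function name `clinvar_score`, the lowercase label spellings, and the choice to give unrecognized labels zero points are assumptions.

```python
SIGNIFICANCE_POINTS = {
    "pathogenic": 55,
    "likely pathogenic": 40,
    "vus": 15,
    "benign": 5,
}
REVIEW_POINTS = {
    "expert panel": 35,
    "multiple submitters": 25,
    "single submitter": 15,
    "no assertion": 5,
}


def clinvar_score(significance: str, review_status: str) -> int:
    """Sum of significance points and review-status points."""
    # Unknown labels contribute 0 points (an assumption, not from the table).
    sig = SIGNIFICANCE_POINTS.get(significance.strip().lower(), 0)
    rev = REVIEW_POINTS.get(review_status.strip().lower(), 0)
    return sig + rev


if __name__ == "__main__":
    print(clinvar_score("Pathogenic", "Expert panel"))
```

With these values the highest reachable ClinVar score is 90 (Pathogenic with expert-panel review).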
SOPHiA DDM™ lab-wide frequency — rarer variants score higher.
| Community Freq. | Score | Interpretation |
|---|---|---|
| Absent / 0 | 95 | Never seen across all labs |
| < 0.1% | 90 | Ultra-rare |
| < 0.5% | 75 | Rare |
| < 1% | 60 | Low frequency |
| ≥ 10% | 10 | Likely artifact |
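The frequency bands above translate into a simple threshold ladder. In the sketch below the frequency is assumed to be expressed in percent, and the function name `community_score` is hypothetical. The table does not list a score for frequencies from 1% up to 10%, so the sketch returns `None` there instead of guessing.

```python
from typing import Optional


def community_score(freq_percent: Optional[float]) -> Optional[int]:
    """Map a community frequency (in percent) to the score from the table."""
    if freq_percent is None or freq_percent == 0:
        return 95  # absent across all labs
    if freq_percent < 0.1:
        return 90  # ultra-rare
    if freq_percent < 0.5:
        return 75  # rare
    if freq_percent < 1:
        return 60  # low frequency
    if freq_percent >= 10:
        return 10  # likely artifact
    return None  # 1% to under 10%: not specified in the table


if __name__ == "__main__":
    print(community_score(0.05), community_score(12))
```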
Sequencing quality — starts at 50, adjusted by read depth, VAF, and ABCD.
| Factor | Condition | Adjustment |
|---|---|---|
| Read Depth | ≥ 200× | +25 |
| VAF | ≥ 30% | +20 |
| VAF | < 5% | −10 |
| ABCD | A | +10 |
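The adjustments above can be applied in sequence to the starting value of 50. The following sketch covers only the four rows shown in the table. The function name `qa_confidence`, the percent unit for VAF, and the clamp to the 0–100 range are assumptions.

```python
def qa_confidence(read_depth: int, vaf_percent: float, abcd_class: str) -> int:
    """Start at 50 and apply the read-depth, VAF, and ABCD adjustments."""
    score = 50
    if read_depth >= 200:
        score += 25
    if vaf_percent >= 30:
        score += 20
    elif vaf_percent < 5:
        score -= 10
    if abcd_class.upper() == "A":
        score += 10
    # Clamping keeps the metric on the 0-100 scale (an assumption).
    return max(0, min(100, score))


if __name__ == "__main__":
    print(qa_confidence(250, 45.0, "A"), qa_confidence(100, 3.0, "B"))
```

Without the clamp, a deep, high-VAF, class A variant would reach 105, which is why the sketch caps the result.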
Case Score = Top Variant × 0.50 + Average Score × 0.30 + High-ABCD Bonus × 0.20
| Case Score | Tier | Action |
|---|---|---|
| ≥ 70 | 🔴 CRITICAL | Immediate review required |
| ≥ 50 | 🟠 HIGH | Priority review |
| ≥ 30 | 🟡 MODERATE | Standard review |
| < 30 | 🟢 LOW | Routine processing |
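The case-level formula and the tier thresholds can be combined in a few lines. This is a sketch under stated assumptions: the README does not define how the High-ABCD Bonus is computed, so it is passed in as a plain number, and the function names `case_score` and `case_tier` are hypothetical.

```python
def case_score(variant_scores, high_abcd_bonus):
    """Top variant x 0.50 + average score x 0.30 + High-ABCD bonus x 0.20."""
    if not variant_scores:
        raise ValueError("at least one variant score is required")
    top = max(variant_scores)
    average = sum(variant_scores) / len(variant_scores)
    return top * 0.50 + average * 0.30 + high_abcd_bonus * 0.20


def case_tier(score):
    """Map a case score to the urgency tier from the table."""
    if score >= 70:
        return "CRITICAL"
    if score >= 50:
        return "HIGH"
    if score >= 30:
        return "MODERATE"
    return "LOW"


if __name__ == "__main__":
    score = case_score([90, 60, 30], high_abcd_bonus=100)
    print(round(score, 1), case_tier(score))
```

Weighting the top variant at 50% means a single strongly pathogenic finding can lift a case into a high tier even when most of its variants score low.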
The platform maintains a curated database of 21 actionable genes with Tier 1/2 classifications and matched FDA-approved therapies:
| Gene | Tier | Matched Therapies |
|---|---|---|
| EGFR | 1 | Osimertinib, Erlotinib, Gefitinib |
| ALK | 1 | Alectinib, Lorlatinib, Crizotinib |
| BRAF | 1 | Dabrafenib+Trametinib, Vemurafenib |
| BRCA1/2 | 1 | Olaparib, Rucaparib |
| ROS1 | 1 | Crizotinib, Entrectinib |
| RET | 1 | Selpercatinib, Pralsetinib |
| NTRK1/2/3 | 1 | Larotrectinib, Entrectinib |
| KRAS | 2 | Sotorasib (G12C), Adagrasib (G12C) |
| MET | 2 | Capmatinib, Tepotinib |
| HER2/ERBB2 | 2 | Trastuzumab deruxtecan |
| PIK3CA | 2 | Alpelisib |
| ESR1 | 2 | Elacestrant |
| FGFR2/3 | 2 | Pemigatinib, Erdafitinib |
| IDH1/2 | 2 | Ivosidenib, Enasidenib |
Additionally, 6 prognostic markers are tracked: TP53, PTEN, RB1, CDKN2A, APC, SMAD4.
A PennyLane-based quantum circuit (default.qubit simulator) that:
- Encodes variant features into quantum states via RX and RY rotations
- Outputs a pathogenicity score (0–1) from PauliZ expectation values
- Supports post-training VQC feature extraction with interpretable variant topology descriptions
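To make the encode-rotate-measure idea concrete, here is a dependency-free emulation of a single qubit. It does not use PennyLane and is not the circuit in qml_module.py. The one-feature, one-weight layout and the mapping from the PauliZ expectation to a 0–1 score via (1 − ⟨Z⟩)/2 are assumptions chosen for illustration.

```python
import math


def rx(theta):
    """2x2 matrix of an RX rotation."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def ry(theta):
    """2x2 matrix of an RY rotation."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def apply(gate, state):
    """Multiply a 2x2 gate with a 2-component state vector."""
    return [
        gate[0][0] * state[0] + gate[0][1] * state[1],
        gate[1][0] * state[0] + gate[1][1] * state[1],
    ]


def pauli_z_expectation(state):
    """<Z> = P(|0>) - P(|1>)."""
    return abs(state[0]) ** 2 - abs(state[1]) ** 2


def pathogenicity_score(feature, weight):
    """Encode a feature with RX, apply a trainable RY, map <Z> to [0, 1]."""
    state = [1 + 0j, 0j]  # start in |0>
    state = apply(rx(feature), state)  # feature encoding
    state = apply(ry(weight), state)  # variational (trainable) rotation
    # Assumed mapping: <Z> = +1 gives score 0, <Z> = -1 gives score 1.
    return (1 - pauli_z_expectation(state)) / 2


if __name__ == "__main__":
    print(round(pathogenicity_score(0.7, 0.3), 4))
```

For this single-qubit layout the expectation value equals cos(feature) · cos(weight), so the score can be checked by hand. A real VQC would use several qubits, entangling gates, and weights fitted during training.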
Simulates quantum amplitude amplification across multi-variant state spaces to identify clinically significant co-occurring mutations:
| Interaction | Genes | Clinical Impact |
|---|---|---|
| Synergistic Resistance | KRAS + PIK3CA | Combined MAPK/PI3K pathway hyper-activation |
| Negative Prognostic | EGFR + TP53 | Severely limits TKI response duration |
| Synthetic Lethality Bypass | BRAF + PTEN | PTEN loss bypasses BRAF inhibition |
| Synthetic Lethality Synergy | PIK3CA + BRCA2 | Combined PARP + PI3K inhibition response |
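The amplitude-amplification idea can be illustrated with a small classical simulation. The sketch below treats every gene pair as one basis state, marks the pairs that appear in the interaction table above, and applies Grover-style oracle and diffusion steps. The function name `amplify_pairs`, the pairwise state space, and the use of the standard iteration-count estimate are assumptions. It is not the implementation in qml_module.py.

```python
import math
from itertools import combinations

# Interactions copied from the table above.
KNOWN_INTERACTIONS = {
    frozenset({"KRAS", "PIK3CA"}): "Synergistic Resistance",
    frozenset({"EGFR", "TP53"}): "Negative Prognostic",
    frozenset({"BRAF", "PTEN"}): "Synthetic Lethality Bypass",
    frozenset({"PIK3CA", "BRCA2"}): "Synthetic Lethality Synergy",
}


def amplify_pairs(genes):
    """Return {gene pair: probability} after Grover-style amplification."""
    pairs = [frozenset(p) for p in combinations(sorted(set(genes)), 2)]
    marked = [p in KNOWN_INTERACTIONS for p in pairs]
    n, m = len(pairs), sum(marked)
    if n == 0 or m == 0:
        return {}
    amps = [1 / math.sqrt(n)] * n  # uniform superposition
    # Standard estimate; it can be 0 when marked pairs are already common.
    iterations = math.floor(math.pi / 4 * math.sqrt(n / m))
    for _ in range(iterations):
        # Oracle: flip the sign of marked amplitudes.
        amps = [-a if hit else a for a, hit in zip(amps, marked)]
        # Diffusion: reflect every amplitude about the mean.
        mean = sum(amps) / n
        amps = [2 * mean - a for a in amps]
    return {tuple(sorted(p)): a * a for p, a in zip(pairs, amps)}


if __name__ == "__main__":
    # Hypothetical gene list mixing genes from the sample scenarios.
    probs = amplify_pairs(["KRAS", "PIK3CA", "APC", "TP53", "EGFR"])
    for pair, prob in sorted(probs.items(), key=lambda kv: -kv[1])[:3]:
        print(pair, round(prob, 3))
```

With five genes there are ten pairs, two of them marked, and one iteration concentrates most of the probability on those two pairs. On a classical machine a direct dictionary lookup gives the same answer, so the simulation is illustrative only.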
| Patient | Condition | Key Variants |
|---|---|---|
| Jane Smith, 62F | Stage IV NSCLC | EGFR L858R, EGFR T790M, TP53 R273H |
| Michael Johnson, 58M | Metastatic CRC | KRAS G12V, PIK3CA E545K, APC R876* |
| Sarah Williams, 45F | Stage III Melanoma | BRAF V600E, PTEN R130G, CDKN2A R58* |
| Emily Davis, 40F | Metastatic Breast (HR+/HER2-) | PIK3CA H1047R, ESR1 D538G, BRCA2 S1982fs |
| David Brown, 8M | High-Risk Neuroblastoma | MYCN Amplification, ALK F1174L |
8 real-format SOPHiA DDM™ export CSVs are included in GenomicSophia/samples/ (patient1–patient8), containing 125–500+ variants each with full ABCD predictions, ClinVar annotations, community frequencies, and read depth data.
GenomicOracle/
├── main.py # Flask app — full patient analysis (port 5000)
├── qml_module.py # Quantum ML: VQC scoring, features, Grover search
├── llm_module.py # Multi-LLM: treatment reports, chat, offline fallback
├── ai_engine.py # 4-metric scoring engine + CSV parser + LLM summaries
├── vqs_client.py # SOPHiA VQS API client (auth + query + schema)
├── variant_analysis_ai.py # CLI batch analyzer (Claude API)
├── requirements.txt # Python dependencies
├── templates/
│ ├── index.html # Landing page — patient & variant input
│ ├── analysis.html # Variant analysis results
│ ├── results.html # Full clinical report + AI chat
│ ├── results_pdf.html # PDF-optimized report template
│ ├── training.html # VQC training dashboard
│ └── copilot.html # Co-Pilot dashboard (v1)
├── static/
│ └── style.css # UI stylesheet (glassmorphism dark theme)
├── uploads/ # Uploaded VCF files (gitignored)
│
├── GenomicSophia/ # ← Co-Pilot Triage Dashboard (standalone)
│ ├── app.py # Flask routes & API endpoints (port 5001)
│ ├── ai_engine.py # 4-metric scoring + deterministic summaries
│ ├── vqs_client.py # SOPHiA VQS API client
│ ├── requirements.txt # Minimal dependencies (flask, requests)
│ ├── samples/ # 8 SOPHiA DDM™ CSV exports (auto-loaded)
│ │ ├── patient1.csv # ... through patient8.csv
│ │ └── ...
│ ├── templates/
│ │ └── copilot.html # Dashboard HTML
│ └── static/
│   ├── style.css # Dark/Light theme CSS
│   └── app.js # Frontend logic
│
└── GenomicOracle/ # ← Earlier standalone version (separate repo)
  └── ...
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Landing page: patient & variant input |
| POST | /analysis | Run variant analysis (VCF or manual) |
| GET | /results | View clinical report + AI chat |
| POST | /api/chat | Interactive AI chatbot (context-aware) |
| GET | /training | VQC training dashboard |
| POST | /api/train_model | Trigger quantum model training |
| GET | /download_pdf | Download clinical report as PDF |
| POST | /settings | Save LLM provider + API key |
| GET | /copilot | Co-Pilot dashboard (embedded) |
| POST | /api/copilot/analyze | VQS API analysis |
| POST | /api/copilot/csv | CSV file analysis |
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Dashboard |
| GET | /api/samples | List CSV files in /samples |
| POST | /api/upload | Upload & persist CSVs to /samples |
| POST | /api/remove_sample | Remove a CSV from /samples |
| POST | /api/rank | Score & rank all cases (+ optional LLM) |
| POST | /api/analyze | Analyze via VQS API |
| POST | /api/set_key | Store LLM API key in session |
| POST | /api/variant_papers | PubMed paper search for a gene/variant |
| Layer | Technology |
|---|---|
| Backend | Flask (Python 3.10+) |
| Quantum ML | PennyLane (default.qubit simulator) |
| LLM Integration | Anthropic Claude, Google Gemini, OpenAI GPT-4o |
| VCF Parsing | vcfpy |
| PDF Generation | xhtml2pdf |
| SOPHiA DDM™ | VQS API (DuckDB-as-a-service) + CSV exports |
| Literature | PubMed E-utilities (free NCBI API) |
| Frontend | HTML5, CSS3 (glassmorphism), vanilla JavaScript, Inter font |
| Charts | Chart.js (training dashboard) |
flask>=3.0
pennylane>=0.39
vcfpy>=0.13
xhtml2pdf>=0.2
google-genai>=1.0
openai>=1.0
anthropic>=0.30
requests>=2.31
pandas>=2.0
The platform integrates with SOPHiA Genetics' Variant Query Service (VQS) — a DuckDB-powered API for querying genomic datasets stored on the SOPHiA DDM™ platform.
POST https://iam-vandv.sophiagenetics.com/account/token
Body: {"username": "...", "password": "..."}
→ Returns: {"access_token": "Bearer ..."}
POST https://platform-vandv1.sophiagenetics.com/api/variant/query
Params: key=<dataset_key>&engine.paginate=true
Body: {"columns": ["*"], "pagination": {"offset": 0, "limit": 500}}
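The two calls above can be prepared with the Python standard library alone. The sketch below only builds the request objects and sends nothing. The function names, the JSON content type, and passing the returned `access_token` value unchanged in an `Authorization` header are assumptions. The URLs, query parameters, and body shapes are the ones listed above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

IAM_URL = "https://iam-vandv.sophiagenetics.com/account/token"
QUERY_URL = "https://platform-vandv1.sophiagenetics.com/api/variant/query"
JSON_HEADERS = {"Content-Type": "application/json"}  # assumed content type


def build_token_request(username, password):
    """Prepare the authentication request (not sent here)."""
    body = json.dumps({"username": username, "password": password})
    return Request(IAM_URL, data=body.encode("utf-8"), method="POST",
                   headers=JSON_HEADERS)


def build_query_request(access_token, dataset_key, offset=0, limit=500):
    """Prepare one paginated variant query for a dataset key."""
    params = urlencode({"key": dataset_key, "engine.paginate": "true"})
    body = json.dumps(
        {"columns": ["*"], "pagination": {"offset": offset, "limit": limit}}
    )
    # Assumption: the token string is used as-is in the Authorization header.
    headers = dict(JSON_HEADERS, Authorization=access_token)
    return Request(f"{QUERY_URL}?{params}", data=body.encode("utf-8"),
                   method="POST", headers=headers)


if __name__ == "__main__":
    req = build_query_request("Bearer <token>", "<dataset_key>")
    print(req.get_method(), req.full_url)
```

Raising `offset` by `limit` on each call would walk through a dataset page by page. Whether the service signals the last page in some other way is not covered here.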
- Log in to the SOPHiA DDM™ platform
- Open browser DevTools → Network tab
- Navigate to a patient's variant list
- Copy the key parameter from the /api/variant/query request
- Paste it into the VQS API tab in the dashboard
This platform is a hackathon prototype built for educational and demonstration purposes. It is not intended for clinical use. All treatment recommendations should be validated by qualified medical professionals. The quantum ML components use simulated quantum circuits on classical hardware.
This project is open-source under the MIT License.
Built with ❤️ at ETH Zurich for the SOPHiA Genetics × ETH Zurich Future of Health Hackathon 2026