Progress — Scene Graph from Attention ✅ COMPLETE

Scores (VERIFIED)

| Approach | R@20 | R@50 | mR@20 | Combined | Key Insight |
|---|---|---|---|---|---|
| Baseline | 0.124 | 0.175 | 0.017 | 0.123 | Most common predicate for all pairs |
| V1 Attention thresholding | 0.161 | 0.242 | 0.028 | 0.167 | Self-attention weights rank related pairs higher (2.36x) |
| V2 Hidden concat + LogReg | 0.315 | 0.385 | 0.104 | 0.301 | Q/K as subject/object; concatenation preserves info |
| V3 Multi-layer MLP | 0.431 | 0.575 | 0.154 | 0.433 | All decoder layers + attention features in 3680-dim |
| V4 Smoothing + Connectivity | 0.449 | 0.560 | 0.175 | 0.439 | Detection-quality label smoothing + connectivity aux |
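The Combined column is not defined in this file. As a rough consistency check, the sketch below assumes it is a weighted sum of the three metrics with weights 0.4 / 0.4 / 0.2. Those weights are an inference from the table, not a documented formula; they happen to reproduce every reported Combined value to three decimals. The names `combined_score` and `ROWS` are illustrative.

```python
# Assumption: these weights are not stated in this file. They were inferred
# because they reproduce every "Combined" value in the table to 3 decimals.
W_R20, W_R50, W_MR20 = 0.4, 0.4, 0.2


def combined_score(r20: float, r50: float, mr20: float) -> float:
    """Weighted sum of the three metrics under the assumed weights."""
    return W_R20 * r20 + W_R50 * r50 + W_MR20 * mr20


# (R@20, R@50, mR@20, reported Combined), copied from the scores table.
ROWS = {
    "Baseline": (0.124, 0.175, 0.017, 0.123),
    "V1": (0.161, 0.242, 0.028, 0.167),
    "V2": (0.315, 0.385, 0.104, 0.301),
    "V3": (0.431, 0.575, 0.154, 0.433),
    "V4": (0.449, 0.560, 0.175, 0.439),
}

for name, (r20, r50, mr20, reported) in ROWS.items():
    computed = combined_score(r20, r50, mr20)
    print(f"{name}: computed={computed:.3f} reported={reported:.3f}")
```

Under these assumed weights, V4 edges out V3 on Combined even though its R@50 is lower, because the gains in R@20 and mR@20 slightly outweigh the R@50 drop.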

Dataset

  • 5000 images total (4000 train / 1000 test)
  • 34 predicate classes (merged from 50), 24808 train relationships
  • Raw images + JSON annotations (contestants run DETR themselves)

E2E Verification: 35/35 checks passed

Phases

  • Phase 1: Design — insight chain, metric, difficulty
  • Phase 2: Dataset — 1734 images from Visual Genome
  • Phase 2.5: Signal validation — 1.63x attention ratio confirmed
  • Phase 3: Evaluation script — R@20/R@50/mR@20
  • Phase 4: Baseline — 12.3% combined
  • Phase 5: Reference solutions v1-v4 (16.7% → 30.1% → 43.3% → 43.9%)
  • Phase 6: End-to-end verification — ALL PASSED
  • Phase 7: Analysis + Kaggle packaging + notebooks
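For readers unfamiliar with the Phase 3 metrics, here is a minimal sketch of R@K and mR@K. It is not the project's evaluation script. It assumes predictions and ground truth are `(subject_id, predicate, object_id)` triplets matched by exact equality, and that recall is pooled over all ground-truth triplets; the real scorer may match boxes by IoU or average per image instead. All names are hypothetical.

```python
from collections import defaultdict


def _top_k(scored, k):
    """Return the k highest-scoring triplets as a set of (subj, pred, obj)."""
    ranked = sorted(scored, key=lambda item: item[0], reverse=True)[:k]
    return {(s, p, o) for _, s, p, o in ranked}


def recall_at_k(preds, gts, k):
    """R@K: fraction of ground-truth triplets found among the top-k predictions.

    preds: {image_id: [(score, subj, pred, obj), ...]}
    gts:   {image_id: {(subj, pred, obj), ...}}
    """
    hits = total = 0
    for image_id, gt in gts.items():
        top = _top_k(preds.get(image_id, []), k)
        hits += len(gt & top)
        total += len(gt)
    return hits / total if total else 0.0


def mean_recall_at_k(preds, gts, k):
    """mR@K: recall computed per predicate class, then averaged over classes."""
    hits, total = defaultdict(int), defaultdict(int)
    for image_id, gt in gts.items():
        top = _top_k(preds.get(image_id, []), k)
        for triplet in gt:
            predicate = triplet[1]
            total[predicate] += 1
            hits[predicate] += int(triplet in top)
    if not total:
        return 0.0
    return sum(hits[c] / total[c] for c in total) / len(total)


# Toy data: "on" is always recovered, the rarer "near" is missed.
gts = {"img1": {(0, "on", 1), (1, "near", 2)}, "img2": {(0, "on", 1)}}
preds = {
    "img1": [(0.9, 0, "on", 1), (0.8, 1, "on", 2)],
    "img2": [(0.7, 0, "on", 1)],
}
print(recall_at_k(preds, gts, 20), mean_recall_at_k(preds, gts, 20))
```

In the toy data, R@20 is 2/3 while mR@20 is 0.5, which illustrates why mR@K penalizes models that only get frequent predicates right.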

Deliverables

| File | Description |
|---|---|
| data_generation.py | Full data pipeline (VG download → DETR extraction) |
| signal_validation.py | Diagnostic confirming attention encodes relations |
| evaluation.py | R@20/R@50/mR@20 scoring |
| baseline.py | Frequency baseline (12.3%) |
| solution_v1.py | Attention thresholding (16.7%) |
| solution_v2.py | Hidden concat + LogReg (30.1%) |
| solution_v3.py | Multi-layer MLP (43.3%) |
| solution_v4.py | Smoothing + connectivity (43.9%) |
| analysis.md | Full post-mortem |
| baseline_notebook.ipynb | Kaggle baseline notebook |
| reference_solution.ipynb | Reference solution notebook |
| kaggle/ | Full Kaggle package (149.4 MB) |
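The frequency baseline is described only as "most common predicate for all pairs". The sketch below shows one way that idea could look; it is not the contents of baseline.py. It assumes relationships are `(subject_id, predicate, object_id)` triplets and that every ordered pair of detected objects gets the same predicate. The function names are hypothetical, and how the real baseline ranks pairs is not stated here.

```python
from collections import Counter
from itertools import permutations


def most_common_predicate(train_relationships):
    """Return the predicate that occurs most often in the training triplets."""
    counts = Counter(predicate for _, predicate, _ in train_relationships)
    return counts.most_common(1)[0][0]


def predict_all_pairs(object_ids, predicate):
    """Assign the same predicate to every ordered (subject, object) pair."""
    return [(s, predicate, o) for s, o in permutations(object_ids, 2)]


# Toy training set: "on" appears twice, "near" once.
train = [(0, "on", 1), (1, "on", 2), (2, "near", 0)]
top_predicate = most_common_predicate(train)
predictions = predict_all_pairs([0, 1, 2], top_predicate)
print(top_predicate, len(predictions))
```

Because every pair receives the same label, a baseline of this kind can only score on the dominant predicate class, which is consistent with the very low mR@20 reported for the baseline.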