[codex] Refine dataset type detection confidence #238

Merged: cbyrohl merged 1 commit into main from fix-detection-confidence on May 14, 2026

@cbyrohl cbyrohl commented May 14, 2026

Summary

This PR refactors dataset type detection so that evidence confidence and class specificity are handled separately. Existing validators that return CandidateStatus continue to work, while the new DetectionResult values let validators distinguish generic structural or header matches from format-specific markers.

The FLAMINGO regression was caused by ArepoSimulation treating a generic multi-HDF5 layout with NumFilesPerSnapshot == 1 as certain Arepo evidence. That layout now reports generic-header confidence, while SWIFT markers report format-marker confidence, so FLAMINGO reduced snapshots resolve to SwiftSnapshot instead of ArepoSimulation.
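
To illustrate the idea of ranking candidates by evidence confidence first and class specificity second, here is a minimal, self-contained sketch. It is not the code from this PR: `Confidence`, `Detection`, the two validator functions, `resolve`, and the `swift_marker` key are all hypothetical names made up for the example, and the real CandidateStatus and DetectionResult types may look quite different.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Confidence(IntEnum):
    """Hypothetical evidence levels, ordered from weakest to strongest."""
    NONE = 0
    GENERIC_HEADER = 1  # layout or header keys that several formats share
    FORMAT_MARKER = 2   # a marker that only one format writes


@dataclass(frozen=True)
class Detection:
    """Hypothetical stand-in for what a validator reports."""
    name: str
    confidence: Confidence
    specificity: int = 0  # e.g. subclass depth; only used to break ties


def validate_arepo_like(attrs: dict) -> Detection:
    # A generic header key is weak evidence, never certainty.
    if "NumFilesPerSnapshot" in attrs:
        return Detection("ArepoSimulation", Confidence.GENERIC_HEADER, specificity=1)
    return Detection("ArepoSimulation", Confidence.NONE, specificity=1)


def validate_swift_like(attrs: dict) -> Detection:
    # "swift_marker" is an invented key standing in for a real SWIFT marker.
    if attrs.get("swift_marker"):
        return Detection("SwiftSnapshot", Confidence.FORMAT_MARKER)
    return Detection("SwiftSnapshot", Confidence.NONE)


def resolve(attrs: dict) -> Optional[str]:
    """Pick the candidate with the strongest evidence; specificity breaks ties."""
    results = [validate_arepo_like(attrs), validate_swift_like(attrs)]
    candidates = [r for r in results if r.confidence > Confidence.NONE]
    if not candidates:
        return None
    best = max(candidates, key=lambda r: (r.confidence, r.specificity))
    return best.name
```

In this sketch, a header containing both `NumFilesPerSnapshot` and the invented SWIFT marker resolves to `SwiftSnapshot`, even though the Arepo-like candidate has the higher specificity, because confidence is compared before specificity.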

Validation

  • uv run pytest tests/unit/test_discovertypes.py -q
  • SCIDA_TESTDATA_PATH=/newdata/data/public/testdata-astrodask uv run pytest tests/external/swift/test_flamingo.py tests/external/test_type_detection.py -m external -rs
  • SCIDA_TESTDATA_PATH=/newdata/data/public/testdata-astrodask uv run pytest tests/external -m external -rs

Full external astrodask run result: 365 passed, 2 failed, 3 skipped, 9 deselected. The two failures are pre-existing/unrelated issues observed in CI too: test_load_cachefail[TNG50-4_snapshot] and the test_areposimulation_lazy_message[TNG50-4] timing assertion.

@cbyrohl cbyrohl marked this pull request as ready for review May 14, 2026 05:50
@cbyrohl cbyrohl merged commit 37030e5 into main May 14, 2026
4 checks passed