Feature and its Use Cases
Currently, when a user runs verify_dataset.py, the verification report is limited to:
An ASCII table printed to the terminal showing PASS/FAIL per check
An optional raw JSON file via the --json flag
This works for quick checks, but it falls short in two critical areas:
Problem 1: No shareable, readable reports
The ASCII table is only readable in a terminal. A researcher who wants to share verification results with a team, attach them to a paper, or post them in a GitHub issue or PR has no suitable format for doing so: the JSON output is machine-readable but not human-friendly.
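As a rough illustration of what a shareable report could look like, here is a minimal sketch that renders check results as a Markdown table suitable for pasting into an issue or PR. The function name `render_markdown_report`, the `name`/`status` keys, and the `raw_sha256` check are assumptions made for this example; they are not taken from the current verify_dataset.py code.

```python
from typing import Iterable, Mapping


def render_markdown_report(checks: Iterable[Mapping[str, str]]) -> str:
    """Render check results as a Markdown table.

    Hypothetical helper: each check is assumed to be a mapping with
    "name" and "status" keys, which may not match the real report data.
    """
    lines = ["| Check | Status |", "| --- | --- |"]
    for check in checks:
        # Escape pipes so a check name cannot break the table layout.
        name = check["name"].replace("|", "\\|")
        lines.append(f"| {name} | {check['status']} |")
    return "\n".join(lines)


if __name__ == "__main__":
    example = [
        {"name": "raw_sha256", "status": "PASS"},  # hypothetical check name
        {"name": "processed_sha256", "status": "FAIL"},
    ]
    print(render_markdown_report(example))
```

A Markdown table is only one option; the same data could just as well feed an HTML report.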
Problem 2: No diagnostic details when verification fails
When a check fails (e.g., processed_sha256 shows FAIL), the report shows only the expected and actual hashes. It gives no guidance on why the check failed, so the user is left guessing:
Did the environment change? Which packages differ?
Did the processed output change? Where exactly does it diverge?
Is it a Python version issue? A platform issue?
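To make the questions above concrete, here is a minimal sketch of two diagnostics a failure report could include: a package-level diff between the recorded and current environments, and the first byte offset at which two outputs diverge. Both helpers (`diff_packages`, `first_divergence`) and the package versions in the usage example are hypothetical; this assumes the environment is available as a name-to-version mapping, which the current tool may not record.

```python
from typing import Dict, Optional, Tuple

VersionPair = Tuple[Optional[str], Optional[str]]


def diff_packages(expected: Dict[str, str], actual: Dict[str, str]) -> Dict[str, VersionPair]:
    """Return {package: (expected_version, actual_version)} for every mismatch.

    A version of None means the package is missing on that side.
    """
    diffs: Dict[str, VersionPair] = {}
    for name in sorted(set(expected) | set(actual)):
        exp, act = expected.get(name), actual.get(name)
        if exp != act:
            diffs[name] = (exp, act)
    return diffs


def first_divergence(a: bytes, b: bytes) -> Optional[int]:
    """Return the first byte offset where a and b differ, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input is a prefix of the other: they diverge where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


if __name__ == "__main__":
    # Illustrative versions only, not taken from a real run.
    recorded = {"numpy": "1.26.4", "pandas": "2.2.0"}
    current = {"numpy": "2.0.0", "pandas": "2.2.0", "scipy": "1.13.0"}
    print(diff_packages(recorded, current))
    print(first_divergence(b"id,value\n1,0.5\n", b"id,value\n1,0.6\n"))
```

For large processed files, comparing in chunks or line by line would likely be more practical than loading both outputs into memory.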
Additional Context
No response