Output Contract
The public benchmark output contract is tracked in
configs/contracts/public_output_contract.json. The same contract is exposed from
lrdbench.output_contract.PUBLIC_OUTPUT_CONTRACT for tests and downstream tooling.
The current contract version is 1.0.0.
Run Root
Each run writes artefacts under:
<report.export_root>/<run_id>/
For public-small suites this is usually reports/public_small/<run_id>/. For public-medium
suites it is usually reports/public_medium/<run_id>/.
Required Files
Every reported run should include these summary artefacts:
- tables/run_summary.csv
- tables/per_stratum_metrics.csv
- tables/leaderboard.csv
- tables/estimator_metadata.csv
- tables/failures.csv
- tables/failure_map.csv
- tables/uncertainty_calibration.csv
- tables/benchmark_uncertainty.csv
- tables/estimator_disagreement.csv
- tables/scale_window_sensitivity.csv
- html/report.html
- manifest/environment.json
- artefacts/artefact_index.csv
Every raw result store should include:
- raw/records.csv
- raw/estimates.csv
- raw/metrics.csv
- raw/artefacts.csv
raw/leaderboards.csv is present when leaderboard rows are generated. tables/stress_metrics.csv
is present for stress-test reports. Figures and LaTeX tables are present only when requested and
available for the run.
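A downstream check for the always-required files could be sketched as below. The helper name `missing_required_files` is hypothetical and not part of lrdbench, and the file lists are copied from this page rather than read from the JSON contract, which remains the authority.

```python
from pathlib import Path

# Copied from the lists above (assumption: kept in sync by hand;
# the JSON contract is the authoritative source).
REQUIRED_SUMMARY_FILES = [
    "tables/run_summary.csv",
    "tables/per_stratum_metrics.csv",
    "tables/leaderboard.csv",
    "tables/estimator_metadata.csv",
    "tables/failures.csv",
    "tables/failure_map.csv",
    "tables/uncertainty_calibration.csv",
    "tables/benchmark_uncertainty.csv",
    "tables/estimator_disagreement.csv",
    "tables/scale_window_sensitivity.csv",
    "html/report.html",
    "manifest/environment.json",
    "artefacts/artefact_index.csv",
]
REQUIRED_RAW_FILES = [
    "raw/records.csv",
    "raw/estimates.csv",
    "raw/metrics.csv",
    "raw/artefacts.csv",
]


def missing_required_files(run_dir):
    """Return the required relative paths that are not files under run_dir."""
    root = Path(run_dir)
    required = REQUIRED_SUMMARY_FILES + REQUIRED_RAW_FILES
    return [rel for rel in required if not (root / rel).is_file()]
```

Conditional files such as raw/leaderboards.csv and tables/stress_metrics.csv are deliberately left out of the lists, since they are only present for some runs.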
Required Columns
The machine-readable JSON contract lists required columns for each stable CSV. Downstream checks
should treat these as a minimum set: extra metric__* and stratum__* columns are expected in
leaderboard, failure-map, and failure-summary tables when manifests or strata change.
Core examples:
| File | Required columns |
|---|---|
| tables/run_summary.csv | run_id, manifest_id, benchmark_name, mode |
| tables/per_stratum_metrics.csv | estimator_name, metric_name, value, stratum_json, metadata_json |
| tables/leaderboard.csv | estimator_name, rank, score |
| raw/metrics.csv | scope, record_id, estimator_name, metric_name, value, stratum_json, metadata_json |
| artefacts/artefact_index.csv | artefact_id, run_id, artefact_type, format, path, hash, created_at, depends_on_json |
Use the JSON contract as the authority when building automated output checks.
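The "minimum set" rule can be illustrated with a small header check. This is a sketch only: `missing_columns` is a hypothetical helper, and the required column list would normally come from the JSON contract rather than being hard-coded.

```python
import csv
from pathlib import Path


def missing_columns(csv_path, required):
    """Return required column names absent from the CSV header.

    Extra columns (for example metric__* or stratum__*) are ignored,
    so the required list is treated as a minimum set.
    """
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])  # empty file -> no columns
    present = set(header)
    return [col for col in required if col not in present]
```

For example, `missing_columns("tables/leaderboard.csv", ["estimator_name", "rank", "score"])` returns an empty list for a conforming leaderboard, however many additional metric columns the file carries.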
Validation Command
After generating a report, validate the run directory against the public output contract:
lrdbench validate-output reports/public_small/<run_id>
The command checks required files and required CSV columns. It returns exit code 0 for a valid
output directory. When validation fails, it returns exit code 2 and emits one error=... line per
contract violation.
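In a CI script, the command can be wrapped so that violations are collected as a list. The sketch below assumes `lrdbench` is on the PATH. This page does not say whether the error=... lines go to stdout or stderr, so both streams are scanned. `extract_errors` and `validate_run` are hypothetical names, not part of lrdbench.

```python
import subprocess

ERROR_PREFIX = "error="


def extract_errors(output):
    """Return the text after 'error=' for each matching line of output."""
    errors = []
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith(ERROR_PREFIX):
            errors.append(stripped[len(ERROR_PREFIX):])
    return errors


def validate_run(run_dir):
    """Run the validator and return (exit_code, list_of_error_messages)."""
    proc = subprocess.run(
        ["lrdbench", "validate-output", str(run_dir)],
        capture_output=True,
        text=True,
    )
    # Assumption: the stream used for error lines is unknown, so check both.
    errors = extract_errors(proc.stdout) + extract_errors(proc.stderr)
    return proc.returncode, errors
```

A caller would treat exit code 0 as success and report the collected messages when the exit code is 2.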
Artefact Index
artefacts/artefact_index.csv records every exported report artefact known to the reporter. Its
rows include:
- a stable artefact_id within the run;
- the run_id;
- an artefact_type, such as metric_export, leaderboard_export, figure, environment_snapshot, html_report, or artefact_index;
- the file format;
- the artefact path;
- optional hash, created_at, and depends_on_json metadata.
The raw result store mirrors this information in raw/artefacts.csv.
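Reading the index back might look like the following sketch. It assumes that depends_on_json, when non-empty, holds a JSON value such as a list of artefact ids; that encoding is an assumption, not something this page specifies. `artefacts_by_type` is a hypothetical helper.

```python
import csv
import json
from collections import defaultdict


def artefacts_by_type(index_path):
    """Group artefact index rows by artefact_type.

    Each row gains a 'depends_on' key holding the decoded depends_on_json
    value, or an empty list when the optional field is blank or missing.
    """
    grouped = defaultdict(list)
    with open(index_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            raw = (row.get("depends_on_json") or "").strip()
            row["depends_on"] = json.loads(raw) if raw else []
            grouped[row["artefact_type"]].append(row)
    return dict(grouped)
```

Grouping by artefact_type makes it straightforward to, for instance, list every figure path in a run without scanning the directory tree.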