Public Medium Outputs

This page records reference output shapes for the tracked public_medium_* suites. Generated reports are intentionally ignored by Git; these notes provide clean-clone comparison targets for release-candidate users.

All runs below were produced from an installed release wheel on 2026-04-26 with:

lrdbench run <suite-name>
lrdbench validate-output reports/public_medium/<run_id>

The row shapes are expected to remain valid under the 1.0.0 output contract.

| Suite | Run ID | Per-stratum rows | Benchmark uncertainty rows | Disagreement rows | Failure rows | Leaderboard rows | Suite-specific rows |
| --- | --- | --- | --- | --- | --- | --- | --- |
| public_medium_canonical_ground_truth | 12eda89e-8c35-481b-9746-2350273abedb | 459 | 170 | 363 | 128 | 5 | 0 sensitivity |
| public_medium_stress_contamination | 054d5baa-0b98-4fb8-9ce1-a27c8691c0f8 | 2631 | 880 | 1575 | 616 | 4 | 1872 stress |
| public_medium_null_false_positive | d2d92952-a246-4965-9ab2-67c66a855516 | 167 | 32 | 371 | 44 | 4 | 0 sensitivity |
| public_medium_sensitivity_disagreement | 1cd2ceb3-d970-4d35-8583-f1cb22421008 | 616 | 105 | 925 | 294 | 9 | 150 sensitivity |

The row counts above exclude CSV header lines. Numeric values or row counts may differ between package versions; any such difference should be intentional and, when it affects a public surface, recorded in the changelog and the output contract.
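
To compare a local run against counts that exclude the header line, a small helper along these lines may be useful. It is a minimal sketch, not part of lrdbench: the function name `count_data_rows` is hypothetical, and it assumes each export has exactly one header line and one record per line (no quoted fields containing newlines).

```shell
# Hypothetical helper (not an lrdbench command): count the data rows in a
# CSV export, skipping the single header line and any blank lines.
# Assumption: one record per line, i.e. no quoted fields spanning lines.
count_data_rows() {
  awk 'NR > 1 && NF { n++ } END { print n + 0 }' "$1"
}
```

Calling `count_data_rows` on each CSV in a run directory and comparing the result with the matching table column gives a quick check; which export file corresponds to which column is not specified here and would need to be confirmed against the output contract.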

Verification Commands

Run medium suites one at a time:

lrdbench list-suites
lrdbench run public_medium_canonical_ground_truth
lrdbench validate-output reports/public_medium/<run_id>
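
Running the suites one at a time can be scripted as a simple sequential loop. The sketch below is illustrative only: the function name `run_suites_sequentially` and the `LRDBENCH` override variable are hypothetical, and it assumes `lrdbench run` exits with a non-zero status when a suite fails. Validation is left as a manual step because the run ID has to be read from each run's output.

```shell
# Hypothetical wrapper: run each named suite in sequence and stop at the
# first failure. LRDBENCH may be overridden to point at a specific
# executable; it defaults to the `lrdbench` command on PATH.
LRDBENCH="${LRDBENCH:-lrdbench}"

run_suites_sequentially() {
  for suite in "$@"; do
    # Assumption: a non-zero exit status signals a failed run.
    "$LRDBENCH" run "$suite" || return 1
  done
}
```

For example, `run_suites_sequentially public_medium_canonical_ground_truth public_medium_null_false_positive` would run the two suites back to back, after which each run can be checked with `lrdbench validate-output reports/public_medium/<run_id>`.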

The stress-contamination and sensitivity/disagreement suites are the slowest medium checks. For reproducibility comparisons, keep the generated manifest/environment.json and artefacts/artefact_index.csv beside the raw CSV exports.
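
Before archiving or comparing a run, it can help to confirm that both provenance files are still present. The following is a minimal sketch under one assumption: that `manifest/environment.json` and `artefacts/artefact_index.csv` are paths relative to the run directory. The function name `check_provenance` is hypothetical.

```shell
# Hypothetical check: report any provenance file missing from a run
# directory. Assumption: both paths are relative to the run directory.
check_provenance() {
  run_dir=$1
  status=0
  for rel in manifest/environment.json artefacts/artefact_index.csv; do
    if [ ! -f "$run_dir/$rel" ]; then
      printf 'missing: %s\n' "$run_dir/$rel" >&2
      status=1
    fi
  done
  return "$status"
}
```

The function returns 0 when both files exist and 1 otherwise, so it can gate a copy or comparison step, for example `check_provenance reports/public_medium/<run_id> && echo ok`.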