Visualization: ROC/PR/PRG + Annotation Effort Saved#

Inputs#

  • Script: scripts/visualize_full_curves.py

  • Input: run folders under outputs/ that contain config.yaml, Score-*.csv, and Ranking-<issue>.csv files.
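The discovery step described above can be sketched as follows. This is an illustrative helper, not code from scripts/visualize_full_curves.py; the function name and base-directory default are assumptions, but the required files (config.yaml, Score-*.csv, Ranking-*.csv) match the input contract stated here.

```python
from pathlib import Path

def find_complete_runs(base_dir="outputs"):
    """Yield run folders under base_dir that contain config.yaml plus at
    least one Score-*.csv and one Ranking-*.csv (illustrative sketch)."""
    for run in sorted(Path(base_dir).iterdir()):
        if not run.is_dir():
            continue
        has_config = (run / "config.yaml").exists()
        has_scores = any(run.glob("Score-*.csv"))
        has_rankings = any(run.glob("Ranking-*.csv"))
        if has_config and has_scores and has_rankings:
            yield run
```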

Generating Rankings#

  • Rankings are saved automatically by the CLI for runs created after 2025-09-14.

  • Each run writes Ranking-near_duplicates.csv, Ranking-off_topic_samples.csv, and Ranking-label_errors.csv containing a target column of 0/1 values ordered by the method’s ranking.
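From such a ranked 0/1 target column, curve-style metrics follow directly. Below is a minimal sketch of cumulative recall and one common definition of annotation effort saved (effort a random annotator would spend to reach recall r, minus the fraction of the data the ranking needs); the script's exact formula is not stated here and may differ.

```python
def cumulative_recall(targets):
    """Given a 0/1 target column ordered by the method's ranking,
    return recall after inspecting the top-k items, for each k."""
    total = sum(targets)
    recall, hits = [], 0
    for t in targets:
        hits += t
        recall.append(hits / total)
    return recall

def effort_saved(targets, r=0.8):
    """Effort saved at recall level r: under random inspection, reaching
    recall r takes roughly fraction r of the data; subtract the fraction
    the ranking actually needs. One common definition, used for illustration."""
    rec = cumulative_recall(targets)
    n = len(targets)
    k = next(i + 1 for i, v in enumerate(rec) if v >= r)
    return r - k / n
```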

Create Figures#

  • Example (BEATS at alpha 0.05): python scripts/visualize_full_curves.py --base-dir outputs --alpha 0.05 --model BEATS

Batch for All Models and Alphas#

  • Generate every figure the repo has results for: python scripts/visualize_all_curves.py --base-dir outputs

    • Optional: --format pdf makes the batch script save only PDFs; note that the per-combination script still always creates PNGs alongside the PDF of the effort plot.

What It Produces#

  • For every combination that has results for all three issue types, it saves:

    • curves_<MODEL>_alpha<ALPHA>_ND-<...>_OT-<...>_LE-<...>.png/.pdf (ROC, PR, PRG, Effort Saved)

    • Annotation_Effort_Saving_<MODEL>_alpha<ALPHA>_....pdf/.png (single panel)

Notes#

  • Only groups with results for all three issue types are visualized.

  • If older runs are missing ranking CSVs, re-run the experiments to emit them.

  • If multiple variants exist per issue, the script prioritizes combined variants when present (e.g., off_topic_combined over off_topic_noise/external/corrupted, combined_duplicates over other duplicate corruptions).
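The prioritization rule above amounts to picking the first available name from an ordered list. A minimal sketch, assuming the variant names below (the slash shorthand in the note is expanded here as a guess; the script's actual names and ordering may differ):

```python
# Assumed priority order for off-topic variants, inferred from the note
# above; combined variants win when present.
OT_PRIORITY = [
    "off_topic_combined",
    "off_topic_noise",
    "off_topic_external",
    "off_topic_corrupted",
]

def pick_variant(available, priority):
    """Return the highest-priority variant present in `available`, else None."""
    for name in priority:
        if name in available:
            return name
    return None
```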