Search-quality baseline: prose / conceptual (Phase 1.5, v1.2.0 candidate)

Aggregate

Metric                              Value
Queries                             15
Any relevant in top-3 (headline)    4 / 15 (26.7%)
Any relevant in top-5               6 / 15 (40.0%)
Mean P@3                            0.1111
Mean P@5                            0.0933

At 4/15 (26.7%) on the headline metric, this is the second-worst class baseline, after acronym (4/22, 18%). For prose, this number is an upper bound on the methodology problem and a lower bound on the ranker problem; the truth lies somewhere in between.
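
For reference, the aggregate metrics above could be computed along the following lines. This is a minimal sketch, not the evaluation harness itself: it assumes each query's results are reduced to a ranked list of booleans (relevant or not), and that P@k divides by k even when fewer than k results are returned. The function names and the `toy_runs` data are hypothetical and are not the baseline's actual judgments.

```python
from typing import Dict, Sequence


def precision_at_k(relevance: Sequence[bool], k: int) -> float:
    """Fraction of the top-k slots that hold a relevant result (denominator is k)."""
    return sum(relevance[:k]) / k


def hit_at_k(relevance: Sequence[bool], k: int) -> bool:
    """True if at least one relevant result appears in the top k."""
    return any(relevance[:k])


def aggregate(runs: Sequence[Sequence[bool]], ks: Sequence[int] = (3, 5)) -> Dict[str, float]:
    """Summarize per-query relevance lists into the metrics shown in the table."""
    n = len(runs)
    summary: Dict[str, float] = {"queries": n}
    for k in ks:
        # Count of queries with any relevant result in the top k.
        summary[f"any_relevant_top{k}"] = sum(hit_at_k(r, k) for r in runs)
        # Mean of per-query precision at k.
        summary[f"mean_p@{k}"] = sum(precision_at_k(r, k) for r in runs) / n
    return summary


# Hypothetical toy data: three queries, five ranked results each.
toy_runs = [
    [False, True, False, False, False],   # relevant result at rank 2
    [False, False, False, False, False],  # nothing relevant
    [False, False, False, True, False],   # relevant result at rank 4
]

if __name__ == "__main__":
    for name, value in aggregate(toy_runs).items():
        print(f"{name}: {value}")
```

If P@k is defined this way, a mean P@3 of 0.1111 over 15 queries corresponds to 5 relevant results across all 45 top-3 slots, and a mean P@5 of 0.0933 corresponds to 7 across 75 top-5 slots.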