What This Audit Measures Vs What It Doesn't · Search-quality baseline: prose / conceptual (Phase 1.5, v1.2.0 candidate)

Measures:

Strict programmatic-ground-truth match rate on top-3 (26.7%)
That the ranker often surfaces pages only tangentially related to the question (visible in the miss listing)
The methodology limit for class G specifically
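
As an illustration of what a strict top-3 match rate against a regex ground truth could look like, here is a minimal Python sketch. The function name, the query IDs, the page paths, and the patterns are all hypothetical and not taken from the audit harness; the toy data does not reproduce the 26.7% figure.

```python
import re


def top_k_match_rate(results, patterns, k=3):
    """Return the fraction of queries with a ground-truth match in the top k.

    results:  dict mapping a query ID to its ranked list of page identifiers
    patterns: dict mapping a query ID to the regex that defines a "correct" page
    """
    if not results:
        return 0.0
    hits = 0
    for query, ranked_pages in results.items():
        pattern = re.compile(patterns[query])
        # Strict scoring: a query counts only if one of its first k pages matches.
        if any(pattern.search(page) for page in ranked_pages[:k]):
            hits += 1
    return hits / len(results)


# Hypothetical toy data (assumed for illustration only).
results = {
    "q1": ["docs/actors", "docs/tasks", "docs/sendable"],
    "q2": ["docs/views", "docs/layout", "docs/state", "docs/binding"],
    "q3": ["docs/codable", "docs/json", "docs/url"],
}
patterns = {
    "q1": r"sendable$",  # matched at rank 3: hit
    "q2": r"binding$",   # matched only at rank 4: miss at k=3
    "q3": r"keypath$",   # never matched: miss
}

rate = top_k_match_rate(results, patterns, k=3)
print(f"{rate:.1%}")  # prints 33.3%
```

Under this kind of scoring, a page that is relevant but does not match the pattern counts as a miss, which is why the figure says nothing about human-judged relevance.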

Does not measure:

Whether the surfaced pages, taken together, would be useful for an LLM agent constructing a Swift code answer
Whether human-judged relevance differs materially from the regex-based ground truth
Whether a broadened regex would substantially change the result (worth checking as a follow-up if the test is rerun)