Search-quality baseline: v1.2.0 candidate `search.db`

How to Use This Baseline

When evaluating a future ranking change (BM25F weight tweak, new tokenizer, schema change), re-run the same 50-query corpus against both the unchanged binary/DB and the changed binary/DB, using the paired-comparison mode of `/tmp/cupertino-search-eval.py` as written, and report the delta against this baseline.

A drop in MRR below this baseline's 0.9467, or in the P@1 perfect count below 46/50, needs an explanation before the change ships. A regression in any of the 4 sub-perfect cases above is also worth flagging, since those are where the current ranker is fragile.
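
The check described above can be sketched in a few lines of Python. This is an illustration only, not the logic of `/tmp/cupertino-search-eval.py`: the function names, the input shape (one 1-based rank of the first relevant result per query, or `None` when nothing relevant was returned), and the tolerance are all assumptions. The only values taken from this baseline are 0.9467 and 46/50.

```python
from typing import Dict, Optional, Sequence

# Values quoted in this baseline document.
BASELINE_MRR = 0.9467
BASELINE_P1_HITS = 46  # out of 50 queries


def mrr(ranks: Sequence[Optional[int]]) -> float:
    """Mean reciprocal rank; a query with no relevant hit (None) contributes 0."""
    return sum(1.0 / r for r in ranks if r) / len(ranks)


def p_at_1_hits(ranks: Sequence[Optional[int]]) -> int:
    """Number of queries whose first relevant result is at rank 1."""
    return sum(1 for r in ranks if r == 1)


def regression_report(ranks: Sequence[Optional[int]], tol: float = 5e-5) -> Dict[str, object]:
    """Compare a run against the baseline.

    `tol` is an assumed allowance for the baseline MRR having been
    rounded to four decimal places.
    """
    run_mrr = mrr(ranks)
    run_hits = p_at_1_hits(ranks)
    return {
        "mrr": run_mrr,
        "mrr_delta": run_mrr - BASELINE_MRR,
        "p1_hits": run_hits,
        "p1_delta": run_hits - BASELINE_P1_HITS,
        "regressed": run_mrr < BASELINE_MRR - tol or run_hits < BASELINE_P1_HITS,
    }


if __name__ == "__main__":
    # Made-up ranks for illustration; these are NOT the real per-query results.
    example_ranks = [1] * 46 + [2, 3, 4, 4]
    print(regression_report(example_ranks))
```

Reporting both deltas, rather than a single pass/fail flag, keeps small MRR movements visible even when the P@1 count is unchanged. A per-query diff would still be needed to catch regressions in the individual sub-perfect cases.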