What This Baseline Says About the Ranker · Search-quality baseline: prose / conceptual (Phase 1.5, v1.2.0 candidate)

Cupertino's BM25F + RRF configuration is tuned for canonical-lookup and symbol-identifier queries (classes A and D), where it excels (92% P@1, 100% rank-1 fragment recall). Prose multi-word queries lean on the content column, whose BM25F weight of 1.0 is the smallest in the schema. That is by design: it keeps long-content matches from drowning out canonical-lookup results.
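To make the effect of a small content weight concrete, here is a minimal sketch of BM25F-style field weighting. It is not cupertino's implementation: the `title` weight, the `k1` value, the function name `bm25f_score`, and both sample documents are assumptions made for illustration, and IDF and length normalization are left out to keep it short. Only the content weight of 1.0 comes from the text above.

```python
from collections import Counter

# Hypothetical weight vector: only content = 1.0 comes from the text above;
# the title weight is an illustrative assumption.
WEIGHTS = {"title": 10.0, "content": 1.0}
K1 = 1.2  # a common BM25 saturation default, assumed here


def bm25f_score(query, doc, weights=WEIGHTS, k1=K1):
    """Simplified BM25F: weight per-field term frequencies, then saturate.

    IDF and length normalization are omitted, so this only shows how the
    field weights shift the ranking.
    """
    score = 0.0
    for term in query.lower().split():
        # Combine the term's frequency across fields, scaled by field weight.
        weighted_tf = sum(
            weight * Counter(doc.get(field, "").lower().split())[term]
            for field, weight in weights.items()
        )
        # Saturate so repeated matches give diminishing returns.
        score += weighted_tf / (k1 + weighted_tf)
    return score


# Made-up documents: a canonical page whose title is the symbol, and a long
# page that only mentions the symbol repeatedly in its body.
canonical = {"title": "FooSession", "content": "An object that coordinates transfers."}
long_page = {"title": "Networking overview", "content": "foosession " * 5 + "appears throughout this guide"}

query = "FooSession"
print(bm25f_score(query, canonical) > bm25f_score(query, long_page))
```

Under these assumed weights, a single title match outscores five body matches, which is the behavior the low content weight is meant to protect. The same mechanism is what leaves prose queries, which match mostly in the body, with less ranking signal.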

The cost of that design choice is that prose queries rank the conceptual results they are looking for lower than a prose-focused consumer would want. This is a known trade-off baked into the BM25F weight vector, not a bug. For a prose-focused consumer, the right fix would be to expose alternate BM25F weight profiles (for example, a --profile prose flag that raises the content weight) rather than re-tune the default weights and regress the canonical-lookup case.
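The profile idea above could be sketched roughly as follows. This is a hypothetical illustration, not an existing cupertino feature: the profile names, the `title` and `symbol` columns, every weight other than content = 1.0, and the helper `weights_for` are all assumptions.

```python
import argparse

# Hypothetical profiles. Only "content weight 1.0" comes from the text above;
# the column names and all other numbers are illustrative assumptions.
PROFILES = {
    "canonical": {"title": 10.0, "symbol": 8.0, "content": 1.0},
    "prose": {"title": 10.0, "symbol": 8.0, "content": 4.0},
}


def build_parser():
    parser = argparse.ArgumentParser(prog="search-sketch")
    parser.add_argument("query")
    # The default stays on the canonical-lookup weights, so existing
    # callers see no change unless they opt in.
    parser.add_argument("--profile", choices=sorted(PROFILES), default="canonical")
    return parser


def weights_for(argv):
    """Return the BM25F weight vector selected by the command-line arguments."""
    args = build_parser().parse_args(argv)
    return dict(PROFILES[args.profile])  # copy so callers cannot mutate the table


print(weights_for(["--profile", "prose", "how do I animate a view"]))
```

Keeping the default profile untouched is the point of the design: a prose-focused consumer opts in to a higher content weight, while the canonical-lookup case keeps the weights it was measured with.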

The current weighting is consistent with cupertino's stated AI-agent-grounding purpose: agents in coding sessions issue more canonical-lookup queries than prose queries, so the current trade-off is the right one.