| Metric | Value |
|---|---|
| Synonyms tested | 22 |
| Canonical framework at top-1 | 4 / 22 (18.2%) |
| Canonical in top-5 | 11 / 22 |
| Canonical in top-10 | 13 / 22 |
| Canonical missing from top-10 | 9 / 22 |
| MRR | 0.2562 |
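
As a rough illustration of how top-k counts and MRR like those in the table can be derived from per-query ranks, here is a minimal sketch. The function name `summarize`, the `example_ranks` list, and the convention that a canonical result missing from the top-10 contributes 0 to MRR are all assumptions for illustration; the source does not state the exact scoring code or the per-query ranks.

```python
from typing import Optional, Sequence

def summarize(ranks: Sequence[Optional[int]], ks=(1, 5, 10)):
    """Return (top-k hit counts, MRR) for 1-based ranks.

    A rank of None means the canonical result was not found in the
    inspected window; it is assumed to contribute 0 to MRR.
    """
    n = len(ranks)
    hits = {k: sum(1 for r in ranks if r is not None and r <= k) for k in ks}
    mrr = sum(1.0 / r for r in ranks if r is not None) / n
    return hits, mrr

# Hypothetical ranks for five queries (NOT the data behind the table).
example_ranks = [1, 3, None, 7, 2]
hits, mrr = summarize(example_ranks)
print(hits)           # {1: 1, 5: 3, 10: 4}
print(round(mrr, 4))  # 0.3952
```

Under this convention, queries with no canonical hit pull MRR down as strongly as possible, which may be why MRR sits close to the top-1 rate in the table.
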

Binomial test (top-1 == canonical vs. chance 0.5): k = 4 of n = 22, one-sided p = 0.9996 (upper tail, P(X ≥ 4)). The null hypothesis (the synonyms-mechanism has no effect beyond chance) is not rejected; if anything, the data are consistent with the synonyms-mechanism producing worse top-1 results than the 0.5 chance baseline.
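
The binomial figure can be checked with exact tail sums. The sketch below uses only the standard library; the helper names `binom_tail_ge` and `binom_tail_le` are made up for illustration, and the lower-tail value is not reported in this document but is computed here to show what "worse than chance" would look like under the same 0.5 null.

```python
from math import comb

def binom_tail_ge(k: int, n: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def binom_tail_le(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(0, k + 1))

k, n, p0 = 4, 22, 0.5  # values taken from the test description above

p_upper = binom_tail_ge(k, n, p0)  # one-sided test for "better than chance"
p_lower = binom_tail_le(k, n, p0)  # one-sided test for "worse than chance"

print(f"{p_upper:.4f}")  # 0.9996
print(f"{p_lower:.4f}")  # 0.0022
```

The two tails overlap at X = 4, so they sum to slightly more than 1. Whether 0.5 is the right chance rate for a top-1 hit is a separate modelling choice that this sketch takes as given.
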

Compare with the other class baselines on the same DB:

| Class | Top-1 hit rate |
|---|---|
| A canonical lookup (search-quality-baseline) | 46/50 (92%) |
| E deprecation-aware (search-quality-deprecation) | 30/30 (100%) |
| F cross-source canonical (search-quality-crosssource) | 19/19 conditional (100%) |
| D CamelCase fragment (search-quality-fragment) | 20/20 (100%) at any-match |
| C acronym / synonym (this doc) | 4/22 (18%) |

This is the worst-performing class baseline by a wide margin: every other class scores 92% or higher, against 18% here.