Skip to content

feat_query_normalization_tuning: Query normalization as a tunable, opt-in query-time parameter #403

@SoundMindsAI

Description

@SoundMindsAI

Why

  • Problem: Relevance pipelines run in stages (query understanding → retrieval → ranking → re-ranking). RelyLoop tunes the ranking stage; the query-understanding stage — where light pre-query string rewriting (case-folding, whitespace trimming, contraction expansion) often hides the largest wins — is invisible to the loop. Operators tune it by hand because RelyLoop has no parameter representing it; query_text passes through verbatim (backend/app/adapters/elastic.py:547, backend/app/adapters/solr.py:1108).
  • Outcome: A template that opts in by declaring query_normalizer as a Categorical param gets the Optuna loop deciding empirically — on the operator's judgment set — whether lowercasing, trimming, or contraction expansion improves nDCG/MAP/MRR. The winning normalizer travels in the proposal's config_diff and surfaces in the PR body as a copy-pasteable Python/JS snippet under a new "Operator-side requirement" section, so production parity is achievable without extending the apply path.
  • Non-goal (preserved): Analyzer / index-mapping changes remain a permanent non-goal per umbrella spec §4. This feature touches only the query string before it reaches the engine — no cluster write, no schema change, no analyzer modification.

Status

  • Stage: PLAN
  • Priority: (see idea file)

Definition of done

Spec defines AC-1: Normalizer library — pure functions, AC-2: Router validates query_normalizer reservation via validate_normalizer_reservation, AC-3: ElasticAdapter.render applies the chosen normalizer, AC-4: SolrAdapter.render applies the chosen normalizer, AC-5: PR body — section appears when query_normalizer is in config_diff, AC-6: PR body — section omitted when query_normalizer is absent, AC-7: PR body — none choice renders without a snippet, AC-8: Digest panel advisory — ES analyzer overlap predicate, AC-9: Digest panel advisory — hidden for Solr, AC-10: Digest panel advisory — hidden for none, AC-11: Frontend enum source-of-truth, AC-12: Snippet round-trip — runtime ≡ embedded (semantic equality), AC-13: End-to-end — operator runs a normalizer-tuning study against the live stack. Each must pass before merge.

Artifacts

How to execute

The folder has both feature_spec.md and implementation_plan.md — both cross-model reviewed. Ready to ship:

/impl-execute docs/00_overview/planned_features/02_mvp2/feat_query_normalization_tuning/implementation_plan.md --all

--all runs the full story sequence end-to-end with per-story verification gates, phase-gate cross-model reviews via GPT-5.5, test coverage audit, push, PR creation, CI watch, Gemini Code Assist adjudication, and final cross-model review. The PR is opened but NOT merged — you merge it manually after review.

Notes

This issue is part of the MVP2 backlog issue-coverage sweep (2026-06-02) — every active MVP2 folder should have a tracking issue so external contributors can discover the work without grep-ing the planned-features tree. If you pick this up, drop a comment so others don't duplicate; if you find the linked idea/spec stale, run /idea-preflight first to refresh it.

Metadata

Metadata

Assignees

No one assigned

    Labels

    mvp2MVP2 backlog itemready-to-executeHas approved spec + impl plan; ready for /impl-executetype/featureFeature — new product capability

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions