pattern, typescript, Tip
Semantic search quality degrades without query preprocessing
semantic-search, query-expansion, HyDE, query-preprocessing, embeddings, RAG
Problem
Embedding a raw user query and running a nearest-neighbor search often returns poor results when the query is short, ambiguous, or uses different vocabulary than the indexed documents. Short queries in particular carry too little semantic signal to separate relevant documents from merely related ones.
Solution
Apply query expansion before embedding. Use a cheap LLM call to rewrite the query into a more descriptive form, generate multiple phrasings of the same question, or use HyDE (Hypothetical Document Embeddings), where the LLM generates a hypothetical answer and the embedding of that answer, rather than of the raw query, is used for the search.
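The "multiple phrasings" variant could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: `rewriteQuery`, `embed`, and `search` are hypothetical functions supplied by the caller, the `Hit` shape is assumed, and keeping each document's best score is only one simple way to merge the result lists (it assumes higher scores mean closer matches).

```typescript
// Assumed result shape: a higher score means a closer match.
interface Hit {
  id: string;
  score: number;
}

// Hypothetical dependencies injected by the caller; none of these are a real library API.
interface Deps {
  rewriteQuery: (query: string, n: number) => Promise<string[]>; // LLM call returning n phrasings
  embed: (text: string) => Promise<number[]>;
  search: (vector: number[], topK: number) => Promise<Hit[]>;
}

// Merge several ranked lists, keeping each document's best score.
function mergeHits(lists: Hit[][], topK: number): Hit[] {
  const best = new Map<string, Hit>();
  for (const list of lists) {
    for (const hit of list) {
      const seen = best.get(hit.id);
      if (seen === undefined || hit.score > seen.score) {
        best.set(hit.id, hit);
      }
    }
  }
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

async function multiQuerySearch(query: string, deps: Deps, topK = 5): Promise<Hit[]> {
  const phrasings = await deps.rewriteQuery(query, 3);
  // Always keep the original query so a bad rewrite cannot replace it entirely.
  const variants = Array.from(new Set([query, ...phrasings]));
  const lists = await Promise.all(
    variants.map(async (text) => deps.search(await deps.embed(text), topK)),
  );
  return mergeHits(lists, topK);
}
```

Searching with the original query alongside the rewrites is a deliberate hedge: if the rewrites drift, the raw query's results are still in the candidate pool.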
Why
A short query embeds to a region of the vector space that may be roughly equidistant from many documents, so the nearest neighbors are chosen almost arbitrarily. Query expansion produces a richer embedding that better captures the user's information need.
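The geometric intuition can be illustrated with a toy example. The 3-dimensional vectors below are hand-picked for illustration and are not real embeddings; `cosine` and `topMargin` are hypothetical helper names. The sketch measures the gap between the best and second-best cosine similarity: a gap near zero means the query cannot tell the two documents apart.

```typescript
// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Gap between the best and second-best similarity (expects at least two documents).
function topMargin(query: number[], docs: number[][]): number {
  const sims = docs.map((d) => cosine(query, d)).sort((x, y) => y - x);
  return sims[0] - sims[1];
}

// Toy data: two documents on different "topics" (axes).
const docs = [
  [1, 0, 0],
  [0, 1, 0],
];
const shortQuery = [1, 1, 0]; // ambiguous: equally close to both documents
const expandedQuery = [1, 0.2, 0]; // expansion has pulled the query toward the first topic

console.log(topMargin(shortQuery, docs).toFixed(2)); // 0.00
console.log(topMargin(expandedQuery, docs).toFixed(2)); // 0.78
```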
Gotchas
- HyDE adds an LLM call before every search: acceptable for low-traffic use cases, expensive at scale
- Query rewriting can introduce hallucinated terms; monitor for expansion drift
- For simple, well-formed queries, preprocessing may hurt more than it helps; A/B test before deploying
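One way to soften the cost and over-processing gotchas is to gate and cache the expansion step. The sketch below is an assumption-laden illustration rather than a recommended design: `Expander` stands for any rewrite or HyDE function, `shouldExpand` and `withGateAndCache` are hypothetical names, and the six-word threshold is arbitrary and should be tuned with the A/B test mentioned above.

```typescript
// Hypothetical expander: any function that turns a query into expanded text (rewrite or HyDE).
type Expander = (query: string) => Promise<string>;

// Heuristic gate: only expand queries with fewer than `minWords` words.
// The threshold is an illustrative assumption, not a measured value.
function shouldExpand(query: string, minWords = 6): boolean {
  const words = query.trim().split(/\s+/).filter((w) => w.length > 0);
  return words.length < minWords;
}

// Wrap an expander with the gate and an in-memory cache keyed by the normalized query.
function withGateAndCache(expand: Expander, minWords = 6): Expander {
  const cache = new Map<string, Promise<string>>();
  return (query: string) => {
    // Longer queries skip the LLM call and are searched as-is.
    if (!shouldExpand(query, minWords)) return Promise.resolve(query);
    const key = query.trim().toLowerCase();
    let pending = cache.get(key);
    if (pending === undefined) {
      pending = expand(query);
      // Do not keep failed calls in the cache, so a later request can retry.
      pending.catch(() => cache.delete(key));
      cache.set(key, pending);
    }
    return pending;
  };
}
```

The cache here is unbounded and process-local; a production setup would likely need eviction and a shared store.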
Code Snippets
HyDE query expansion for better retrieval
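import OpenAI from 'openai';

// `embed`, `vectorStore`, and `SearchResult` are assumed to be defined elsewhere in the application.
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment by default

async function hydeSearch(query: string): Promise<SearchResult[]> {
  // Generate a hypothetical answer document
  const hypoDoc = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: `Write a short document that answers: ${query}` }],
    max_tokens: 200,
  });
  // Fall back to the raw query if the model returns no text
  const hypoText = hypoDoc.choices[0]?.message.content?.trim() || query;
  const embedding = await embed(hypoText); // embed the hypothetical answer
  return vectorStore.query({ vector: embedding, topK: 5 });
}
Context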
RAG systems where short or ambiguous user queries return poor retrieval results