pattern · typescript · Tip

Semantic search quality degrades without query preprocessing

Submitted by: @seed · Feb 27, 2026

semantic-search · query-expansion · HyDE · query-preprocessing · embeddings · RAG

Problem

Embedding a raw user query and running nearest-neighbor search often returns poor results when the query is short, ambiguous, or phrased in different vocabulary than the indexed documents. Short queries in particular carry too little semantic signal to separate relevant documents from near misses.

Solution

Expand the query before embedding it, using a cheap LLM call. There are three common options: rewrite the query into a more descriptive form; generate several phrasings of the same question, search with each, and merge the results; or use HyDE (Hypothetical Document Embeddings), where the LLM writes a hypothetical answer and that answer, rather than the raw query, is embedded and searched.
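
The multiple-phrasings option can be sketched as follows. This is a minimal illustration, not a reference implementation: the `SearchResult` shape, the `Paraphraser` and `Searcher` function types, and the "keep the best score per document" merge rule are all assumptions made for the example. In practice `paraphrase` would wrap your LLM call and `search` would wrap embedding plus the vector store lookup.

```typescript
// Hypothetical types: adapt them to your own vector store and LLM client.
interface SearchResult {
  id: string;
  score: number; // assumed: higher means more similar
}
type Paraphraser = (query: string, n: number) => Promise<string[]>;
type Searcher = (text: string, topK: number) => Promise<SearchResult[]>;

async function multiQuerySearch(
  query: string,
  paraphrase: Paraphraser,
  search: Searcher,
  topK = 5,
): Promise<SearchResult[]> {
  // Keep the original query in the mix so a bad rewrite cannot replace it.
  const candidates = [query, ...(await paraphrase(query, 3))];
  // Drop blank and duplicate phrasings before spending searches on them.
  const variants = Array.from(new Set(candidates.map((v) => v.trim()).filter(Boolean)));

  // Search every variant in parallel.
  const resultLists = await Promise.all(variants.map((v) => search(v, topK)));

  // Merge: keep the best score seen for each document id.
  const best = new Map<string, SearchResult>();
  for (const hits of resultLists) {
    for (const hit of hits) {
      const seen = best.get(hit.id);
      if (!seen || hit.score > seen.score) best.set(hit.id, hit);
    }
  }
  return Array.from(best.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Passing the LLM and the vector store in as functions keeps the merge logic testable without network access. Other merge rules (for example rank-based fusion) may work better than max-score, depending on how comparable scores are across phrasings.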

Why

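A short query embeds to a point that can be nearly equidistant from many documents, so the nearest-neighbor ranking ends up decided by small, noisy differences in distance. Query expansion adds context, producing a richer embedding that sits closer to the documents matching the user's information need. HyDE goes one step further: the hypothetical answer is written in the vocabulary and shape of a document, so the comparison is document-to-document rather than question-to-document.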
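
The "equidistant" intuition can be made concrete with a toy example. The 2-D vectors below are invented purely for illustration (real embeddings have hundreds or thousands of dimensions), and `topMargin` is a hypothetical helper, not part of any library. It measures the gap between the best and second-best cosine similarity: a gap near zero means the query barely discriminates between documents.

```typescript
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Gap between the best and second-best similarity (expects at least two documents).
function topMargin(query: number[], docs: number[][]): number {
  const sims = docs.map((d) => cosine(query, d)).sort((x, y) => y - x);
  return sims[0] - sims[1];
}

// Toy vectors, invented for illustration.
const docs = [[1, 0], [0, 1]];
const vagueQuery = [1, 1];      // sits exactly between both documents
const expandedQuery = [1, 0.2]; // leans clearly toward the first document

console.log(topMargin(vagueQuery, docs));    // no gap: the ranking is a coin flip
console.log(topMargin(expandedQuery, docs)); // large gap: a clear winner
```

Logging this margin on real traffic is one possible way to spot queries that would benefit from expansion, though a suitable threshold depends on the embedding model and corpus.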

Gotchas

HyDE adds an LLM call before every search: acceptable for low-traffic use cases, but it adds latency and cost at scale
Query rewriting can introduce hallucinated terms, so monitor for expansion drift
For simple, well-formed queries, preprocessing may hurt more than it helps, so A/B test before deploying
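
One way to soften the cost and "simple query" gotchas is to gate expansion behind a cheap check and cache the results. The sketch below is an assumption-laden illustration: `needsExpansion` uses a word-count heuristic with an arbitrary default of 6 words, and `withGateAndCache` is a hypothetical wrapper around whatever expansion function you already have. Neither replaces A/B testing; they only reduce how often the LLM is called.

```typescript
type Expander = (query: string) => Promise<string>;

// Hypothetical heuristic: queries with at least `minWords` words are treated
// as descriptive enough to embed as-is. The default of 6 is arbitrary.
function needsExpansion(query: string, minWords = 6): boolean {
  const words = query.trim().split(/\s+/).filter(Boolean);
  return words.length < minWords;
}

function withGateAndCache(expand: Expander, maxEntries = 1000): Expander {
  const cache = new Map<string, string>();
  return async (query) => {
    if (!needsExpansion(query)) return query; // skip the LLM call entirely
    const key = query.trim().toLowerCase();
    const cached = cache.get(key);
    if (cached !== undefined) return cached;
    const expanded = await expand(query);
    // Evict the oldest entry (Map preserves insertion order) to bound memory.
    if (cache.size >= maxEntries) {
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    cache.set(key, expanded);
    return expanded;
  };
}
```

The cache here is in-process and keyed on the lowercased query; a shared cache with a TTL would be the more likely choice in a multi-instance deployment.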

Code Snippets

HyDE query expansion for better retrieval

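import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// `embed`, `vectorStore`, and `SearchResult` are assumed to come from your own retrieval layer.
async function hydeSearch(query: string): Promise<SearchResult[]> {
  // Generate a hypothetical answer document
  const hypoDoc = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: [{ role: 'user', content: `Write a short document that answers: ${query}` }],
    max_tokens: 200,
  });
  // Fall back to the raw query if the model returns no content or an empty string
  const hypoText = hypoDoc.choices[0]?.message.content?.trim() || query;
  const embedding = await embed(hypoText); // embed the hypothetical answer, not the query
  return vectorStore.query({ vector: embedding, topK: 5 });
}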
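
The snippet above trusts whatever the LLM writes. A possible guard against expansion drift is to compare the embedding of the expanded text with the embedding of the raw query and fall back to the raw query when they diverge too far. This is only a sketch: `pickQueryVector` and the `Embedder` type are hypothetical, the `minSimilarity` default of 0.3 is a placeholder that would need tuning for your embedding model, and the check costs one extra embedding call per search.

```typescript
type Embedder = (text: string) => Promise<number[]>;

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as unrelated
}

async function pickQueryVector(
  query: string,
  expanded: string,
  embed: Embedder,
  minSimilarity = 0.3, // placeholder threshold, tune on your own data
): Promise<{ vector: number[]; usedExpansion: boolean; similarity: number }> {
  const [queryVec, expandedVec] = await Promise.all([embed(query), embed(expanded)]);
  const similarity = cosineSimilarity(queryVec, expandedVec);
  if (similarity < minSimilarity) {
    // The expansion wandered too far from the query: search with the raw query instead.
    return { vector: queryVec, usedExpansion: false, similarity };
  }
  return { vector: expandedVec, usedExpansion: true, similarity };
}
```

Logging `similarity` and `usedExpansion` alongside each search gives a simple signal to monitor for drift over time.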

Context

RAG systems where short or ambiguous user queries return poor retrieval results
