
feat: add QMD_EMBED_MODEL env var for multilingual embedding support#273

Open
daocoding wants to merge 1 commit into tobi:main from daocoding:feature/configurable-embed-model

Conversation

@daocoding

Problem

The default embeddinggemma-300M embedding model is English-centric and produces poor vector representations for CJK (Chinese, Japanese, Korean) text. Users working with multilingual document collections have no way to switch to a better embedding model without modifying source code.

Solution

Add QMD_EMBED_MODEL environment variable to override the default embedding model at runtime.

Changes

  • src/llm.ts: DEFAULT_EMBED_MODEL now reads from QMD_EMBED_MODEL env var (falls back to embeddinggemma-300M for backward compatibility)
  • src/llm.ts: getDefaultLlamaCpp() passes QMD_EMBED_MODEL to LlamaCpp config when set
  • src/llm.ts: Add isQwen3EmbeddingModel() helper to detect Qwen3-Embedding model family
  • src/llm.ts: formatQueryForEmbedding() and formatDocForEmbedding() auto-detect model family and apply the correct prompt format:
  • embeddinggemma: nomic-style `task: search result | query: ...` (unchanged)
    • Qwen3-Embedding: task-instruction format `Instruct: ...\nQuery: ...`
  • src/store.ts: pass model URI to format functions for consistency between indexing and query time
  • README.md: document QMD_EMBED_MODEL with usage example
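The detection-and-format logic above can be sketched roughly as follows. The function names (`isQwen3EmbeddingModel`, `formatQueryForEmbedding`, `formatDocForEmbedding`) come from the PR; the exact prompt strings and the document-side gemma prefix are illustrative assumptions, not the PR's verbatim code.

```typescript
// Sketch of the model-family detection described in the PR.
// Assumption: a substring match on the model URI is sufficient.
const DEFAULT_EMBED_MODEL: string =
  process.env.QMD_EMBED_MODEL ?? "embeddinggemma-300M";

function isQwen3EmbeddingModel(modelUri: string): boolean {
  // Matches e.g. "hf:Qwen/Qwen3-Embedding-0.6B-GGUF/qwen3-embedding-0.6b-q8_0.gguf"
  return /qwen3-embedding/i.test(modelUri);
}

function formatQueryForEmbedding(query: string, modelUri: string): string {
  if (isQwen3EmbeddingModel(modelUri)) {
    // Qwen3 task-instruction format (illustrative instruction text)
    return `Instruct: Given a search query, retrieve relevant documents\nQuery: ${query}`;
  }
  // embeddinggemma keeps the nomic-style prefix (unchanged default path)
  return `task: search result | query: ${query}`;
}

function formatDocForEmbedding(doc: string, modelUri: string): string {
  // Assumption: Qwen3 embeds documents bare; gemma uses a nomic-style doc prefix
  return isQwen3EmbeddingModel(modelUri) ? doc : `title: none | text: ${doc}`;
}
```

Passing the model URI into both format functions (the store.ts change) is what keeps index-time and query-time formatting in lockstep when the env var changes.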

Usage

```sh
# Qwen3-Embedding-0.6B is multilingual (119 languages), MTEB top-ranked at its size
export QMD_EMBED_MODEL="hf:Qwen/Qwen3-Embedding-0.6B-GGUF/qwen3-embedding-0.6b-q8_0.gguf"

# Re-embed after switching models
qmd embed -f
```

Notes

  • Fully backward-compatible — no change when QMD_EMBED_MODEL is unset
  • The reranker (Qwen3-Reranker-0.6B) already ships with QMD; this PR aligns the embedding side for users who want a consistent Qwen3 pipeline
  • Only format-detection logic is added; behavior for the default model is unchanged

The default embeddinggemma-300M model is English-centric and produces
poor embeddings for CJK (Chinese, Japanese, Korean) text. This change
allows overriding the embedding model via the QMD_EMBED_MODEL environment
variable.

Changes:
- DEFAULT_EMBED_MODEL now reads from QMD_EMBED_MODEL env var (fallback to
  embeddinggemma-300M for backward compatibility)
- getDefaultLlamaCpp() passes QMD_EMBED_MODEL to LlamaCpp config when set
- formatQueryForEmbedding() and formatDocForEmbedding() detect Qwen3-Embedding
  models and apply the correct prompt format (Qwen3 uses task-instruction
  format; embeddinggemma uses nomic-style prefix format)
- store.ts: pass model URI to format functions so format selection is
  consistent between indexing and query time
- README: document QMD_EMBED_MODEL with Qwen3-Embedding example

Recommended multilingual model:
  QMD_EMBED_MODEL=hf:Qwen/Qwen3-Embedding-0.6B-GGUF/qwen3-embedding-0.6b-q8_0.gguf

After changing the model, run: qmd embed -f
