Experiment

harrier-oss-v1

Search, retrieval, and semantic understanding are at the core of most AI-powered applications, and the quality of your text embeddings determines how well those experiences work across languages and domains. That’s why we’re excited to introduce harrier-oss-v1, a new family of open-source multilingual text embedding models from Microsoft. Harrier uses a decoder-only architecture and produces dense text embeddings by taking the hidden state of the last token (last-token pooling) and applying L2 normalization. This design performs well across a wide range of downstream tasks, including retrieval, clustering, semantic similarity, classification, bitext mining, and reranking.
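To make last-token pooling and L2 normalization concrete, here is a minimal sketch in plain Python. It is not Harrier's implementation: the toy `hidden_states` and `attention_mask` values stand in for a real model's output, the helper names are hypothetical, and the code assumes right padding (real tokens first, padding last).

```python
import math
from typing import List

Vector = List[float]


def last_token_pool(hidden_states: List[List[Vector]],
                    attention_mask: List[List[int]]) -> List[Vector]:
    """Return, for each sequence, the hidden state of its last real token.

    Assumes right padding, so the last real token sits at index
    (number of 1s in the mask) - 1.
    """
    pooled = []
    for states, mask in zip(hidden_states, attention_mask):
        last_index = sum(mask) - 1
        pooled.append(states[last_index])
    return pooled


def l2_normalize(vector: Vector) -> Vector:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a: Vector, b: Vector) -> float:
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy batch: 2 sequences, 3 token positions, 2 hidden dimensions.
# The second sequence has one padding position at the end.
hidden_states = [
    [[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]],
    [[1.0, 1.0], [6.0, 8.0], [9.0, 9.0]],
]
attention_mask = [
    [1, 1, 1],
    [1, 1, 0],
]

pooled = last_token_pool(hidden_states, attention_mask)
embeddings = [l2_normalize(v) for v in pooled]
similarity = dot(embeddings[0], embeddings[1])
```

Because the embeddings are unit length, a plain dot product gives their cosine similarity, which is why normalized embeddings are convenient for retrieval and clustering. In this toy batch the two pooled vectors point in the same direction, so their similarity is approximately 1.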

With support for 94 languages, including Arabic, Chinese, Japanese, Korean, Hindi, Indonesian, and dozens of European languages, Harrier is purpose-built for global applications. Because it’s instruction-tuned, you can customize embedding behavior for different scenarios by prepending a one-sentence natural language instruction to your query, with no fine-tuning required. Whether you’re building multilingual RAG pipelines, cross-lingual document search, or semantic similarity features, Harrier gives you a production-ready embedding model that scales from edge to enterprise.
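The instruction-prepending step can be sketched as a small string helper. The exact prompt template is not stated here, so the `Instruct: ... Query: ...` layout below is an assumption borrowed from the convention used by some other instruction-tuned embedding models; check the model card for the format Harrier expects. The function name `build_query` and the example instructions are hypothetical.

```python
def build_query(instruction: str, query: str) -> str:
    """Prepend a one-sentence task instruction to a query.

    The template is an assumption, not Harrier's documented format.
    """
    return f"Instruct: {instruction.strip()}\nQuery: {query.strip()}"


# Hypothetical instructions for two different scenarios.
retrieval_instruction = (
    "Given a web search query, retrieve relevant passages that answer the query."
)
similarity_instruction = "Retrieve semantically similar text."

# The same user text yields different model inputs depending on the task.
retrieval_input = build_query(retrieval_instruction, "how do solar panels work")
similarity_input = build_query(similarity_instruction, "how do solar panels work")
```

The resulting strings would be passed to the embedding model in place of the raw query. The same query text can be steered toward different tasks by swapping the instruction, with no change to the model weights.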