Perplexity open-sources embedding models that match Google and Alibaba at a fraction of the memory cost
By Jonathan Kemper
Publication Date: 2026-02-28 10:41:00

AI search engine Perplexity is introducing two new text embedding models designed to match or beat Google’s and Alibaba’s offerings at a fraction of the usual memory cost. Both models are open source.

Before a language model can answer a search query, it needs to find the right documents among billions of web pages. That first filtering step is handled by embedding models, which translate queries and documents into numerical vectors so semantic similarity becomes something you can calculate. The quality of these embeddings directly determines what gets passed on to ranking models and, ultimately, to the language model generating the answer.
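The retrieval step described above can be sketched in a few lines. The vectors below are toy stand-ins for what a real embedding model would produce; an actual system would embed millions of documents with a model like pplx-embed-v1 and use an approximate nearest-neighbor index rather than a brute-force loop.

```python
import math

# Toy vectors standing in for real embedding-model outputs. In practice an
# embedding model maps each text to a high-dimensional vector (hundreds to
# thousands of dimensions); three dimensions are used here for readability.
query_vec = [0.9, 0.1, 0.2]
doc_vecs = {
    "doc_a": [0.8, 0.2, 0.1],  # semantically close to the query
    "doc_b": [0.1, 0.9, 0.3],  # unrelated topic
}

def cosine(a, b):
    """Cosine similarity: higher means the texts are more semantically similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Rank documents by similarity to the query; the top results are what gets
# passed on to ranking models and, finally, to the answering language model.
ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
print(ranked)  # doc_a ranks first
```

The key property is that similarity becomes a cheap arithmetic operation over precomputed vectors, so the expensive model runs once per document at indexing time, not once per query.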

Perplexity has now released two embedding models, pplx-embed-v1 and pplx-embed-context-v1. The first handles classic dense text retrieval, while the second also embeds passages in the context of their surrounding document, which helps disambiguate passages that are unclear on their own. Both come in 0.6 billion and 4 billion parameter versions.

Perplexity’s embedding models hit similar scores to Qwen3 and Gemini on the MTEB benchmark but can store significantly more pages per gigabyte thanks to quantization. | Image: Perplexity
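The "pages per gigabyte" advantage from quantization comes down to simple arithmetic: storing each vector component in 8 bits instead of a 32-bit float shrinks the index fourfold. The dimension below is an illustrative assumption, not Perplexity's actual configuration.

```python
# Back-of-the-envelope memory math for an embedding index. The vector
# dimension is assumed for illustration; the scaling argument holds for
# any dimension.
dim = 1024        # embedding dimension (assumed)
bytes_fp32 = 4    # full-precision 32-bit float per component
bytes_int8 = 1    # 8-bit quantized value per component

one_gib = 1024 ** 3
vectors_per_gib_fp32 = one_gib // (dim * bytes_fp32)
vectors_per_gib_int8 = one_gib // (dim * bytes_int8)

print(vectors_per_gib_fp32)  # 262144 vectors per GiB at float32
print(vectors_per_gib_int8)  # 1048576 vectors per GiB at int8 (4x more)
```

Quantization trades a small amount of retrieval accuracy for this storage saving, which is why benchmark parity at lower precision is the notable claim here.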

Bidirectional reading gives embeddings more context

According to the researchers, most leading embedding models are built on language models that process text only left to right: each word can "see" only what came before it. That works fine for text generation, but it’s a problem for understanding meaning, since a sentence’s…