Embedding Models for RAG

Embedding models convert text into vector embeddings, which can be used for tasks such as similarity search, clustering, and classification. In RAG, an embedding model converts the input text into an embedding, and that embedding is used to retrieve relevant (similar) documents from the document store.
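
The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the three-dimensional vectors and the document IDs are made up and stand in for the output of a real embedding model, and `retrieve` is a hypothetical helper that ranks documents by cosine similarity to the query.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, doc_vecs, k=2):
    """Return the IDs of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical embeddings; a real model would return hundreds of dimensions.
doc_vecs = {
    "doc-postgres": [0.9, 0.1, 0.0],
    "doc-cooking": [0.0, 0.2, 0.9],
    "doc-indexing": [0.7, 0.6, 0.1],
}
query_vec = [1.0, 0.2, 0.0]

top_docs = retrieve(query_vec, doc_vecs, k=2)
print(top_docs)
```

In a real system the document vectors would be computed once and stored, and the sorted scan would typically be replaced by a vector index in the document store.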

| Vendor(s) | Model | dimensions | max tokens | cost | MTEB avg score | similarity metric |
|---|---|---|---|---|---|---|
| OpenAI | text-embedding-3-small | 1536 (scales down) | 8191 | $0.02 / 1M tokens | 62.3 | cosine, dot product, L2 |
| OpenAI | text-embedding-3-large | 3072 (scales down) | 8191 | $0.13 / 1M tokens | 64.6 | cosine, dot product, L2 |
| Google | text-embedding-preview-0409 / text-embedding-0004 | 768 (scales down) | 2048 | $0.025 / 1M tokens in Vertex, free in Gemini | 66.31 | cosine, L2 |
| Fireworks | thenlper/gte-large | 1024 | 512 | $0.016 / 1M tokens | 63.23 | cosine |
| Fireworks | nomic-ai/nomic-embed-text-v1.5 | 768 (scales down) | 8192 | $0.008 / 1M tokens | 62.28 | cosine |
| DeepInfra | gte-large | 1024 | 512 | $0.010 / 1M tokens | 63.23 | cosine |
| Cohere | embed-english-v3.0 | 1024 | 512 | $0.10 / 1M tokens | 64.5 | cosine |
| Voyage | voyage-large-2-instruct | 1024 | 16000 | $0.12 / 1M tokens | 68.28 | cosine, dot product, L2 |
| Voyage | voyage-2 | 1024 | 4000 | $0.1 / 1M tokens | | cosine, dot product, L2 |
| Voyage | voyage-code-2 | 1536 | 16000 | $0.12 / 1M tokens | | cosine, dot product, L2 |
| Voyage | voyage-law-2 | 1024 | 16000 | $0.12 / 1M tokens | | cosine, dot product, L2 |
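
To make the cost column concrete, the sketch below estimates what it would cost to embed a corpus once. The per-million-token prices are copied from the table; the corpus size (10,000 documents at roughly 800 tokens each) and the helper name `embedding_cost` are assumptions made purely for illustration.

```python
# Prices in USD per 1M tokens, copied from the table above.
PRICE_PER_MILLION_TOKENS = {
    "text-embedding-3-small": 0.02,
    "text-embedding-3-large": 0.13,
    "voyage-large-2-instruct": 0.12,
}

def embedding_cost(model, total_tokens):
    """Estimated cost in USD to embed total_tokens tokens with the given model."""
    return PRICE_PER_MILLION_TOKENS[model] * total_tokens / 1_000_000

# Hypothetical corpus: 10,000 documents at about 800 tokens each.
corpus_tokens = 10_000 * 800

for model in PRICE_PER_MILLION_TOKENS:
    print(f"{model}: ${embedding_cost(model, corpus_tokens):.2f}")
```

The estimate covers a single pass over the corpus. Query embeddings and any re-embedding after a model change would add to it.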

Explanation of columns

  • Vendor(s): The vendor(s) that provide the model as a service.
  • Model: The name of the model.
  • dimensions: The number of dimensions in the vector embeddings that the model generates. "Scales down" marks models whose embeddings can be shortened to fewer dimensions.
  • max tokens: The maximum number of tokens that can be passed to the model in a single request.
  • cost: The cost of using the model, taken from the vendor's pricing page where available.
  • MTEB avg score: The Massive Text Embedding Benchmark (MTEB) average score. MTEB is a benchmark for evaluating the quality of embeddings across a range of tasks. The higher the score, the better the embeddings.
  • similarity metric: The similarity metric that the model authors recommend for use with the embeddings. Only metrics supported by pgvector are listed; some models may support additional metrics.
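
The three metrics named in the similarity metric column can be computed as in the sketch below. The two-dimensional vectors are made up for illustration. The comments note the pgvector operator that corresponds to each metric. The last lines check a general identity: for unit-length vectors, squared L2 distance equals 2 minus twice the cosine similarity, so all three metrics produce the same ranking once embeddings are normalized.

```python
import math

def dot(a, b):
    """Dot (inner) product. pgvector's <#> operator returns its negative."""
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean (L2) distance. pgvector operator: <->"""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine similarity. pgvector's <=> operator returns 1 minus this value."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

# Made-up vectors, normalized to unit length.
a = normalize([3.0, 4.0])
b = normalize([4.0, 3.0])

cos = cosine_similarity(a, b)
ip = dot(a, b)
l2 = l2_distance(a, b)

# For unit vectors, cosine similarity equals the dot product,
# and squared L2 distance equals 2 - 2 * cosine similarity.
print(math.isclose(cos, ip), math.isclose(l2 ** 2, 2 - 2 * cos))
```

When embeddings are not normalized, the metrics can rank results differently, which is one reason to use the metric the model authors recommend.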