Google Embedding Models

Note that Google has two AI API products: the Gemini API and Vertex AI.

The NodeJS SDK for the Gemini API is significantly nicer, so the example below uses it. However, Vertex AI has more embedding models: multi-lingual, multi-modal, image and video.

Available models

The example below is based on the new text-embedding-preview-0409 model (its Vertex AI name), called text-embedding-004 in the Gemini API.
It is the best text embedding model currently available from Google and ranks well on the MTEB benchmark.

| Model | Dimensions | Max tokens | Cost | MTEB avg score | Similarity metric |
| --- | --- | --- | --- | --- | --- |
| text-embedding-preview-0409 / text-embedding-004 | 768 (scales down) | 2048 | $0.025/1M tokens in Vertex AI, free in Gemini | 66.31 | cosine, L2 |

Usage

To use Google's embedding models, you need a Google Cloud project. The example below uses the Gemini API, so you will need the Generative Language API enabled, and an API key with permission to access it. You can get one by going to APIs & Services -> Credentials in the Google Cloud Console (you can also get an API key from Google's AI Studio).
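
If you prefer not to hard-code the key (as the example below does for brevity), here is a minimal sketch of reading it from an environment variable; the GOOGLE_API_KEY variable name is our convention here, not something the SDK requires:

// read the API key from the environment instead of hard-coding it
const GOOGLE_API_KEY = process.env.GOOGLE_API_KEY;
if (!GOOGLE_API_KEY) {
  throw new Error("set the GOOGLE_API_KEY environment variable first");
}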

Vertex AI has a separate API to enable, separate key permissions, separate pricing, and a different SDK (which we don't document here).

Installing dependencies

npm install @niledatabase/server @google/generative-ai

Generating embeddings with Google

const { GoogleGenerativeAI } = require("@google/generative-ai");
const GOOGLE_API_KEY = "my-google-api-key";
const model = "models/text-embedding-004"; // or your favorite Google model

const genAI = new GoogleGenerativeAI(GOOGLE_API_KEY);

// the model returns 768 dimensions by default; see "Scale down" below
// for requesting fewer dimensions per request
const embed = genAI.getGenerativeModel({ model: model });

const input_text =
  "The future belongs to those who believe in the beauty of their dreams.";

// Note: if you don't want to specify taskType and title, you can simply use:
// const doc_vec_resp = await embed.embedContent(input_text);
const doc_vec_resp = await embed.embedContent({
  content: { parts: [{ text: input_text }] },
  taskType: "RETRIEVAL_DOCUMENT", // for embeddings of documents stored in pg_vector
  // title is optional; it can be used with RETRIEVAL_DOCUMENT and may
  // improve the quality of the embeddings
  // title: ""
});

const question = "Who does the future belong to?";

const question_vec_resp = await embed.embedContent({
  content: { parts: [{ text: question }] },
  taskType: "RETRIEVAL_QUERY", //embeddings of questions used to search documents
});

// Google returns a response object with an embedding field that
// contains the embedding in its values field
const doc_vec = doc_vec_resp.embedding.values;
const question_vec = question_vec_resp.embedding.values;

Storing and retrieving the embeddings

// set up Nile as the vector store
const { Nile } = await import("@niledatabase/server");
const NILEDB_USER = "you need a nile user with access to a nile database";
const NILEDB_PASSWORD = "and a password for that user";
const nile = await Nile({
  user: NILEDB_USER,
  password: NILEDB_PASSWORD,
});

// create table to store vectors - vector size must match the model dimensions
await nile.db.query(
  "CREATE TABLE IF NOT EXISTS embeddings (embedding vector(768))"
);
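
// Optional: with a single row the sequential scan below is fine, but for
// larger tables pgvector supports approximate nearest-neighbor indexes.
// A hedged sketch, assuming the database runs a pgvector version with HNSW
// support; vector_cosine_ops matches the <=> (cosine distance) operator
// used in the search below.
await nile.db.query(
  "CREATE INDEX IF NOT EXISTS embeddings_idx ON embeddings USING hnsw (embedding vector_cosine_ops)"
);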

// store vector in a table
await nile.db.query("INSERT INTO embeddings (embedding) values ($1)", [
  JSON.stringify(doc_vec.map((v) => Number(v))),
]);

// search for similar vectors
let db_resp = await nile.db.query(
  "select embedding from embeddings order by embedding<=>$1 limit 1",
  [JSON.stringify(question_vec.map((v) => Number(v)))]
);

// Postgres returns an object with an array of rows; each column is a
// property of the row, and the vector is represented as a string
let similar_str = db_resp.rows[0].embedding;
let similar = similar_str
  .substring(1, similar_str.length - 1)
  .split(",")
  .map((v) => parseFloat(v));

// check that we got the same vector back (pgvector stores single-precision
// floats, so compare with a small tolerance)
let same = true;

for (let i = 0; i < similar.length; i++) {
  if (Math.abs(similar[i] - doc_vec[i]) > 0.000001) {
    same = false;
  }
}

console.log("got same vector? " + same);

Additional notes

Scale down

Google's text-embedding-004 model returns 768 dimensions by default, but you can scale it down to fewer dimensions. The older model, embedding-001, does not support scaling down.
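
To scale down, you request fewer dimensions per call. A minimal sketch, reusing the embed model and input_text from the example above, and assuming your version of @google/generative-ai exposes the Gemini API's outputDimensionality request field:

const small_doc_vec_resp = await embed.embedContent({
  content: { parts: [{ text: input_text }] },
  taskType: "RETRIEVAL_DOCUMENT",
  outputDimensionality: 256, // scale down from the default 768
});
const small_doc_vec = small_doc_vec_resp.embedding.values;
// note that the table must match: vector(256) instead of vector(768)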

Task types

Google's documentation about taskTypes is a bit confusing. Some documents say that taskType is only supported by the embedding-001 model, and others say that it works with text-embedding-004 as well. My experiments showed that taskType works with text-embedding-004, so I have included it in the example above. I assume there are typos in the docs.
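
For reference, the @google/generative-ai package also exports a TaskType enum, so you can avoid string literals. A small sketch, reusing embed and input_text from the example above:

const { TaskType } = require("@google/generative-ai");

const enum_vec_resp = await embed.embedContent({
  content: { parts: [{ text: input_text }] },
  taskType: TaskType.RETRIEVAL_DOCUMENT, // equivalent to "RETRIEVAL_DOCUMENT"
});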

Distance metrics

Google's documentation doesn't mention the distance metric used for similarity search, and doesn't say anything about normalization either. However, the MTEB benchmark records for Google's model show the use of cosine and L2 distance.
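
If you want to sanity-check this yourself, here is a small sketch that computes both metrics in plain JavaScript over the vectors from the example above. If Google returns normalized vectors, the two metrics will rank results identically:

// cosine distance: 1 - (a . b) / (|a| * |b|), as used by pgvector's <=>
function cosineDistance(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// L2 (Euclidean) distance, as used by pgvector's <-> operator
function l2Distance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += (a[i] - b[i]) ** 2;
  }
  return Math.sqrt(sum);
}

console.log("cosine distance: " + cosineDistance(question_vec, doc_vec));
console.log("L2 distance: " + l2Distance(question_vec, doc_vec));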