AWS Bedrock

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.

Using AWS Bedrock with Nile

AWS Bedrock's foundation models can be used with Nile to build B2B applications using RAG (Retrieval Augmented Generation) architectures.

Below, we show a simple example of how to use AWS Bedrock's embedding models with Nile. However, keep in mind that AWS Bedrock offers a wide range of models and capabilities. All of them can be used with Nile to build powerful AI-native applications, using similar patterns to the one shown below.

We'll walk you through the setup steps and then explain the code line by line. The entire script is available here.

Setting Up Nile

Once you've signed up for Nile, you'll be promoted to create your first database. Go ahead and do so. You'll be redirected to the "Query Editor" page of your new database. This is a good time to create the table we'll be using in this example:

create table todos (
    id uuid DEFAULT (gen_random_uuid()),
    tenant_id uuid,
    title varchar(256),
    embedding vector(1024), -- Must match the dimensionality of the embedding model we'll be using
    complete boolean);

Once you've created the table, you'll see it on the left-hand side of the screen. You'll also see the tenants table that is built-in to Nile.

Next, you'll want to pick up your database connection string: Navigate to the "Settings" page, select "Connections" and click "Generate credentials". Copy the connection string and keep it in a secure location.

To use Nile in your application, you'll also need to install Psycopg2, a Python library for interacting with Postgres. And since we'll be using vector embeddings, it helps to have pgvector's Python client installed as well. You can install it with the following command:

python -m pip install psycopg2-binary pgvector

Setting Up AWS Bedrock

The first thing you'll need to do is to request access to the models you'll be using. AWS model availability varies by region, so make sure you select the region where your application will be deployed and request access to models in that region.

Note that it takes some time for AWS to approve access requests, so make sure to do this well ahead of time. In this example, we'll be using Titan Text Embeddings V2 model from Amazon.

You'll also need to install boto3, which is AWS's Python SDK. You can install it with the following command:

python -m pip install boto3

The simplest way to use boto3 on your local machine is using AWS profile and credentials. If you already have them configured, thats great! Otherwise, you'll need to use AWS CLI to configure them. AWS Boto3 documentation explains how to do this as well as other ways you can authenticate.

Quickstart

Now that we have everything set up, we can start writing some code (or alternatively, you can download the entire script here and follow along.

First, we'll need to import the libraries we'll be using and setting up the Boto3 Bedrock client:

import boto3
import json
import psycopg2
from pgvector.psycopg2 import register_vector

# Create a Bedrock Runtime client in the AWS Region of your choice.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

model_id = "amazon.titan-embed-text-v2:0" # you can try other models as well, once you request access

Next, we'll set up the connection to the Nile database, register the pgvector client with the curson, and create a tenant who will own the todo items:

conn = psycopg2.connect('postgresql://user:password@us-west-2.db.thenile.dev:5432/mydb')
conn.set_session(autocommit=True)
cur = conn.cursor()
register_vector(cur)

cur.execute("insert into tenants (name) values ('first tenant') returning id;")
tenant_id = cur.fetchone()[0]

Now we'll generate embeddings for a few todo items and insert them into Nile:

todo_items = [
    "Center a div",
    "Implement RAG-based HR chatbot",
    "Add OKTA authentication to the app",
    "Write a blog post about RAG with Cohere and Nile",
    "Optimize a slow database query",
]

# Turn each todo item into a request to the model and convert the request to a JSON string.
# Amazing Titan Embeddings model doesn't accept batch requests, so we need to send one item at a time.
requests = [{"inputText": item} for item in todo_items]

# Call the model with each request and store the response in Nile
for item, request in zip(todo_items, requests):
    json_response = client.invoke_model(body=json.dumps(request), modelId=model_id)
    response = json.loads(json_response.get('body').read())
    cur.execute("INSERT INTO todos (tenant_id, title, embedding) VALUES (%s, %s, %s)", (tenant_id, item, response.get('embedding')))

Now we'll use Nile's RAG capabilities to retrieve todo items related to a given query:

question = "Is there any work left on authentication?"
question_embedding = client.invoke_model(body=json.dumps({"inputText": question}), modelId=model_id)
question_embedding = json.loads(question_embedding.get('body').read())

# Search for the question embedding in the database

cur.execute("set nile.tenant_id = %s", (tenant_id,))
cur.execute("SELECT title, complete FROM todos ORDER BY embedding <#> %s::vector LIMIT 1", (question_embedding.get('embedding'),))
print(cur.fetchone())

Run the script with the following command:

python cohere_nile_quickstart.py

And if everything went well, you should see the following output:

('Add OKTA authentication to the app')

Seems like a good answer to the question!

Full application

Coming soon!