Commit 2ab5cb8

Add Amazon Bedrock Text vectorizer (#248)
Add a Vectorizer for Amazon Bedrock, along with tests and doc updates.

Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>
1 parent f280c64 commit 2ab5cb8

9 files changed: +623 -59 lines changed

.github/workflows/run_tests.yml

Lines changed: 4 additions & 0 deletions
@@ -66,6 +66,8 @@ jobs:
           AZURE_OPENAI_ENDPOINT: ${{secrets.AZURE_OPENAI_ENDPOINT}}
           AZURE_OPENAI_DEPLOYMENT_NAME: ${{secrets.AZURE_OPENAI_DEPLOYMENT_NAME}}
           OPENAI_API_VERSION: ${{secrets.OPENAI_API_VERSION}}
+          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
         run: |
           poetry run test-cov
 
@@ -86,6 +88,8 @@ jobs:
           AZURE_OPENAI_ENDPOINT: ${{secrets.AZURE_OPENAI_ENDPOINT}}
           AZURE_OPENAI_DEPLOYMENT_NAME: ${{secrets.AZURE_OPENAI_DEPLOYMENT_NAME}}
           OPENAI_API_VERSION: ${{secrets.OPENAI_API_VERSION}}
+          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
         run: |
           cd docs/ && poetry run treon -v --exclude="./examples/openai_qna.ipynb"
 

conftest.py

Lines changed: 7 additions & 0 deletions
@@ -71,6 +71,13 @@ def gcp_location():
 def gcp_project_id():
     return os.getenv("GCP_PROJECT_ID")
 
+@pytest.fixture
+def aws_credentials():
+    return {
+        "aws_access_key_id": os.getenv("AWS_ACCESS_KEY_ID"),
+        "aws_secret_access_key": os.getenv("AWS_SECRET_ACCESS_KEY"),
+        "aws_region": os.getenv("AWS_REGION", "us-east-1")
+    }
 
 @pytest.fixture
 def sample_data():
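
Note on the `aws_credentials` fixture: `os.getenv` returns `None` when a variable is unset, so on a machine without the AWS secrets the fixture yields `None` for both keys while the region still falls back to `us-east-1`. A minimal sketch of a guard a test could apply before calling Bedrock follows. The helper names `load_aws_credentials` and `missing_credentials` are hypothetical and are not part of this commit; the dictionary keys mirror the fixture above.

```python
import os


def load_aws_credentials(env=os.environ):
    """Read AWS settings the same way the fixture does (hypothetical helper)."""
    return {
        "aws_access_key_id": env.get("AWS_ACCESS_KEY_ID"),
        "aws_secret_access_key": env.get("AWS_SECRET_ACCESS_KEY"),
        # The region is the only entry with a default value.
        "aws_region": env.get("AWS_REGION", "us-east-1"),
    }


def missing_credentials(creds):
    """Return the names of entries that are unset or empty."""
    return [name for name, value in creds.items() if not value]


# Example with an explicit mapping instead of the real environment.
creds = load_aws_credentials({"AWS_ACCESS_KEY_ID": "example-key-id"})
print(missing_credentials(creds))
```

Inside a test, a non-empty result from `missing_credentials(aws_credentials)` could be passed to `pytest.skip` so that runs without secrets skip the Bedrock tests instead of failing.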

docs/api/vectorizer.rst

Lines changed: 10 additions & 0 deletions
@@ -61,6 +61,16 @@ CohereTextVectorizer
    :show-inheritance:
    :members:
 
+BedrockTextVectorizer
+=====================
+
+.. _bedrocktextvectorizer_api:
+
+.. currentmodule:: redisvl.utils.vectorize.text.bedrock
+
+.. autoclass:: BedrockTextVectorizer
+   :show-inheritance:
+   :members:
 
 CustomTextVectorizer
 ====================

docs/user_guide/vectorizers_04.ipynb

Lines changed: 75 additions & 3 deletions
@@ -13,7 +13,8 @@
     "3. Vertex AI\n",
     "4. Cohere\n",
     "5. Mistral AI\n",
-    "6. Bringing your own vectorizer\n",
+    "6. Amazon Bedrock\n",
+    "7. Bringing your own vectorizer\n",
     "\n",
     "Before running this notebook, be sure to\n",
     "1. Have installed ``redisvl`` and have that environment active for this notebook.\n",
@@ -541,6 +542,76 @@
     "# print(test[:10])"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Amazon Bedrock\n",
+    "\n",
+    "Amazon Bedrock provides fully managed foundation models for text embeddings. Install the required dependencies:\n",
+    "\n",
+    "```bash\n",
+    "pip install 'redisvl[bedrock]' # Installs boto3\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Configure AWS credentials:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "import getpass\n",
+    "\n",
+    "if \"AWS_ACCESS_KEY_ID\" not in os.environ:\n",
+    "    os.environ[\"AWS_ACCESS_KEY_ID\"] = getpass.getpass(\"Enter AWS Access Key ID: \")\n",
+    "if \"AWS_SECRET_ACCESS_KEY\" not in os.environ:\n",
+    "    os.environ[\"AWS_SECRET_ACCESS_KEY\"] = getpass.getpass(\"Enter AWS Secret Key: \")\n",
+    "\n",
+    "os.environ[\"AWS_REGION\"] = \"us-east-1\" # Change as needed"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "#### Create embeddings:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from redisvl.utils.vectorize import BedrockTextVectorizer\n",
+    "\n",
+    "bedrock = BedrockTextVectorizer(\n",
+    "    model=\"amazon.titan-embed-text-v2:0\"\n",
+    ")\n",
+    "\n",
+    "# Single embedding\n",
+    "text = \"This is a test sentence.\"\n",
+    "embedding = bedrock.embed(text)\n",
+    "print(f\"Vector dimensions: {len(embedding)}\")\n",
+    "\n",
+    "# Multiple embeddings\n",
+    "sentences = [\n",
+    "    \"That is a happy dog\",\n",
+    "    \"That is a happy person\",\n",
+    "    \"Today is a sunny day\"\n",
+    "]\n",
+    "embeddings = bedrock.embed_many(sentences)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
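
Note on the new notebook cells: they call `embed` and `embed_many` but do not show the underlying request. For orientation, here is a minimal sketch of the kind of call a Titan text embedding makes through boto3's `bedrock-runtime` client (`invoke_model` with an `inputText` body, returning an `embedding` field). The helper `embed_text` and the `FakeBedrockClient` stand-in are hypothetical and exist only so the sketch runs offline; the vectorizer's actual internals are not visible in this part of the diff and may differ.

```python
import io
import json


def embed_text(client, text, model="amazon.titan-embed-text-v2:0"):
    """Request one embedding from a Bedrock-style client (hypothetical helper)."""
    response = client.invoke_model(
        modelId=model,
        body=json.dumps({"inputText": text}),
    )
    # The response body is a stream; read it and decode the JSON payload.
    payload = json.loads(response["body"].read())
    return payload["embedding"]


class FakeBedrockClient:
    """Offline stand-in for boto3.client("bedrock-runtime") (assumption)."""

    def invoke_model(self, modelId, body):
        text = json.loads(body)["inputText"]
        # Return a tiny made-up vector; a real model returns many more values.
        vector = [float(len(text)), 0.0, 1.0]
        return {"body": io.BytesIO(json.dumps({"embedding": vector}).encode())}


embedding = embed_text(FakeBedrockClient(), "This is a test sentence.")
print(f"Vector dimensions: {len(embedding)}")
```

With real credentials, the fake client would be replaced by `boto3.client("bedrock-runtime", region_name=...)`.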
@@ -691,7 +762,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": null,
    "metadata": {},
    "outputs": [
     {
@@ -710,9 +781,10 @@
    "source": [
     "# load expects an iterable of dictionaries where\n",
     "# the vector is stored as a bytes buffer\n",
+    "from redisvl.redis.utils import array_to_buffer\n",
     "\n",
     "data = [{\"text\": t,\n",
-    "         \"embedding\": v}\n",
+    "         \"embedding\": array_to_buffer(v, dtype=\"float32\")}\n",
     "        for t, v in zip(sentences, embeddings)]\n",
     "\n",
     "index.load(data)"
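
Note on the last hunk: it wraps each vector in `array_to_buffer(v, dtype="float32")` because, as the cell comment says, `load` expects the vector stored as a bytes buffer. A minimal sketch of that conversion using only the standard library follows, assuming little-endian 32-bit floats. The function names are hypothetical, and this is not the implementation of `array_to_buffer` itself, which this diff does not show.

```python
import struct


def to_float32_buffer(vector):
    """Pack a list of floats into raw float32 bytes (hypothetical helper)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def from_float32_buffer(buffer):
    """Unpack raw float32 bytes back into a list of floats."""
    count = len(buffer) // 4  # each float32 occupies 4 bytes
    return list(struct.unpack(f"<{count}f", buffer))


vector = [0.5, -1.25, 3.0]
buffer = to_float32_buffer(vector)
print(len(buffer), from_float32_buffer(buffer))
```

The buffer length is always four bytes per dimension, which is a quick sanity check when the index schema declares a `float32` vector field.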
