chore: add contributing Spanner tools RAG agent sample

PiperOrigin-RevId: 800938492
Google Team Member
2025-08-29 09:59:56 -07:00
committed by Copybara-Service
parent bb4cfdec12
commit fcd748e17f
4 changed files with 416 additions and 0 deletions
@@ -0,0 +1,186 @@
# Spanner Tools RAG Agent Sample
## 🚀 Introduction
This sample demonstrates how to build an intelligent Retrieval Augmented
Generation (RAG) agent using the flexible, built-in Spanner tools available
in the ADK's `google.adk.tools.spanner` module, including how to create
customized Spanner tools by extending the existing ones.
[Spanner](https://cloud.google.com/spanner/docs) is a fully managed,
horizontally scalable, globally distributed database service that is great for
both relational and non-relational operational workloads.
Spanner has built-in vector search support, enabling you to perform similarity
or semantic search and implement retrieval augmented generation (RAG) in GenAI
applications at scale, leveraging either exact K-nearest neighbor (KNN) or
approximate nearest neighbor (ANN) features.
Spanner's vector search queries return fresh real-time data as soon as
transactions are committed, just like any other query on your operational data.
In this sample, you'll build the agent leveraging Spanner's built-in, real-time
vector search capabilities to provide relevant information.
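To make the exact KNN idea concrete, here is a minimal pure-Python sketch of what a nearest-neighbor lookup by Euclidean distance does conceptually. It is not how Spanner implements vector search; the helper names and the toy 2-dimensional embeddings are made up for illustration.

```python
import math


def euclidean_distance(a, b):
  """Euclidean (L2) distance between two equal-length vectors."""
  if len(a) != len(b):
    raise ValueError("vectors must have the same length")
  return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(query, rows, k):
  """Return the k rows closest to `query`; rows are (name, embedding) pairs."""
  ranked = sorted(rows, key=lambda row: euclidean_distance(query, row[1]))
  return ranked[:k]


# Toy embeddings, invented for this example only.
toy_rows = [
    ("helmet", [0.0, 1.0]),
    ("balance bike", [1.0, 0.0]),
    ("pump", [5.0, 5.0]),
]
nearest = exact_knn([0.9, 0.1], toy_rows, k=2)
print([name for name, _ in nearest])
```

Exact KNN compares the query against every row, which is fine for a small catalog like the one in this sample; ANN trades some accuracy for speed on larger tables.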
## 🛠️ Setup and Requirements
To run this sample, you need an accessible Spanner instance and database in your
Google Cloud Project.
### Set up the Spanner database table
To set up the schema, navigate to Spanner Studio and open an empty tab.
First, add the products table by copying and pasting this DDL statement into
the tab:
```sql
CREATE TABLE products (
categoryId INT64 NOT NULL,
productId INT64 NOT NULL,
productName STRING(MAX) NOT NULL,
productDescription STRING(MAX) NOT NULL,
productDescriptionEmbedding ARRAY<FLOAT32>,
createTime TIMESTAMP NOT NULL OPTIONS (
allow_commit_timestamp = true
),
inventoryCount INT64 NOT NULL,
priceInCents INT64,
) PRIMARY KEY(categoryId, productId);
```
Then, click the `run` button and wait a few seconds for your schema to be
created.
### Create an Embedding model
Next, create an embedding model in Spanner and configure it to use a Vertex AI
model endpoint. Replace `<PROJECT_ID>` with your own project ID.
```sql
CREATE MODEL EmbeddingsModel INPUT(
content STRING(MAX),
) OUTPUT(
embeddings STRUCT<statistics STRUCT<truncated BOOL, token_count FLOAT32>, values ARRAY<FLOAT32>>,
) REMOTE OPTIONS (
endpoint = '//aiplatform.googleapis.com/projects/<PROJECT_ID>/locations/us-central1/publishers/google/models/text-embedding-004'
);
```
Then, click the `run` button and wait a few seconds for your model to be
created.
Learn more about Spanner `MODEL` in [Spanner Vertex AI integration](https://cloud.google.com/spanner/docs/ml-tutorial-embeddings).
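If you prefer to generate the endpoint string rather than edit it by hand, a small helper like the following can fill in the placeholder. This is only a sketch: `vertex_embedding_endpoint` is a hypothetical helper, `my-project` is a placeholder project ID, and the defaults mirror the region and model used in the DDL above.

```python
def vertex_embedding_endpoint(
    project_id, location="us-central1", model="text-embedding-004"
):
  """Build the endpoint string in the format used by the CREATE MODEL DDL."""
  return (
      f"//aiplatform.googleapis.com/projects/{project_id}"
      f"/locations/{location}/publishers/google/models/{model}"
  )


# "my-project" is a placeholder; substitute your own project ID.
print(vertex_embedding_endpoint("my-project"))
```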
### Load the sample data
Now, you will want to insert some products into your database. Open up a new tab
in Spanner Studio, then copy and paste the following insert statements:
```sql
INSERT INTO products (categoryId, productId, productName, productDescription, createTime, inventoryCount, priceInCents)
VALUES (1, 1, "Cymbal Helios Helmet", "Safety meets style with the Cymbal children's bike helmet. Its lightweight design, superior ventilation, and adjustable fit ensure comfort and protection on every ride. Stay bright and keep your child safe under the sun with Cymbal Helios!", PENDING_COMMIT_TIMESTAMP(), 100, 10999),
(1, 2, "Cymbal Sprout", "Let their cycling journey begin with the Cymbal Sprout, the ideal balance bike for beginning riders ages 2-4 years. Its lightweight frame, low seat height, and puncture-proof tires promote stability and confidence as little ones learn to balance and steer. Watch them sprout into cycling enthusiasts with Cymbal Sprout!", PENDING_COMMIT_TIMESTAMP(), 10, 13999),
(1, 3, "Cymbal Spark Jr.", "Light, vibrant, and ready for adventure, the Spark Jr. is the perfect first bike for young riders (ages 5-8). Its sturdy frame, easy-to-use brakes, and puncture-resistant tires inspire confidence and endless playtime. Let the spark of cycling ignite with Cymbal!", PENDING_COMMIT_TIMESTAMP(), 34, 13900),
(1, 4, "Cymbal Summit", "Conquering trails is a breeze with the Summit mountain bike. Its lightweight aluminum frame, responsive suspension, and powerful disc brakes provide exceptional control and comfort for experienced bikers navigating rocky climbs or shredding downhill. Reach new heights with Cymbal Summit!", PENDING_COMMIT_TIMESTAMP(), 0, 79999),
(1, 5, "Cymbal Breeze", "Cruise in style and embrace effortless pedaling with the Breeze electric bike. Its whisper-quiet motor and long-lasting battery let you conquer hills and distances with ease. Enjoy scenic rides, commutes, or errands with a boost of confidence from Cymbal Breeze!", PENDING_COMMIT_TIMESTAMP(), 72, 129999),
(1, 6, "Cymbal Trailblazer Backpack", "Carry all your essentials in style with the Trailblazer backpack. Its water-resistant material, multiple compartments, and comfortable straps keep your gear organized and accessible, allowing you to focus on the adventure. Blaze new trails with Cymbal Trailblazer!", PENDING_COMMIT_TIMESTAMP(), 24, 7999),
(1, 7, "Cymbal Phoenix Lights", "See and be seen with the Phoenix bike lights. Powerful LEDs and multiple light modes ensure superior visibility, enhancing your safety and enjoyment during day or night rides. Light up your journey with Cymbal Phoenix!", PENDING_COMMIT_TIMESTAMP(), 87, 3999),
(1, 8, "Cymbal Windstar Pump", "Flat tires are no match for the Windstar pump. Its compact design, lightweight construction, and high-pressure capacity make inflating tires quick and effortless. Get back on the road in no time with Cymbal Windstar!", PENDING_COMMIT_TIMESTAMP(), 36, 24999),
(1, 9,"Cymbal Odyssey Multi-Tool","Be prepared for anything with the Odyssey multi-tool. This handy gadget features essential tools like screwdrivers, hex wrenches, and tire levers, keeping you ready for minor repairs and adjustments on the go. Conquer your journey with Cymbal Odyssey!", PENDING_COMMIT_TIMESTAMP(), 52, 999),
(1, 10,"Cymbal Nomad Water Bottle","Stay hydrated on every ride with the Nomad water bottle. Its sleek design, BPA-free construction, and secure lock lid make it the perfect companion for staying refreshed and motivated throughout your adventures. Hydrate and explore with Cymbal Nomad!", PENDING_COMMIT_TIMESTAMP(), 42, 1299);
```
Click the `run` button to insert the data.
### Generate embeddings for the sample data
For similarity search to work on the products, you need to generate embeddings
for the product descriptions.
With the `EmbeddingsModel` created above, a single `UPDATE` DML statement
generates the embeddings:
```sql
UPDATE products p1
SET productDescriptionEmbedding =
(SELECT embeddings.values from ML.PREDICT(MODEL EmbeddingsModel,
(SELECT productDescription as content FROM products p2 where p2.categoryId=p1.categoryId AND p2.productId=p1.productId)))
WHERE categoryId=1;
```
Click the `run` button to populate the embeddings for the product descriptions.
Learn more about how to [generate and backfill vector embeddings in bulk](https://cloud.google.com/spanner/docs/backfill-embeddings)
for textual data (STRING or JSON) that is stored in Spanner using SQL.
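Once the embeddings are populated, the sample agent searches them with a KNN query. The sketch below assembles a query of the same shape, using the table, column, and model names from this README. The helpers `sql_string_literal` and `build_knn_query` are hypothetical and not part of the ADK; the escaping is an added precaution for user text that contains quotes, and query parameters are generally preferable wherever the API you call supports them.

```python
def sql_string_literal(text):
  """Quote `text` as a double-quoted SQL string literal with basic escaping."""
  escaped = (
      text.replace("\\", "\\\\")
      .replace('"', '\\"')
      .replace("\n", "\\n")
      .replace("\r", "\\r")
  )
  return f'"{escaped}"'


def build_knn_query(search_query, top_k=3):
  """Build a KNN query over the `products` table defined in this README."""
  return f"""
SELECT productId, productName, productDescription,
  EUCLIDEAN_DISTANCE(
    productDescriptionEmbedding,
    (SELECT embeddings.values
     FROM ML.PREDICT(
       MODEL EmbeddingsModel,
       (SELECT {sql_string_literal(search_query)} AS content)))
  ) AS distance
FROM products
WHERE inventoryCount > 0
ORDER BY distance
LIMIT {int(top_k)}
"""


print(build_knn_query("starter bike for a 3 year old"))
```

You can paste the printed statement into Spanner Studio to check that the rows returned look relevant before wiring up the agent.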
## 🤖 How to use the sample RAG agent built on Spanner
Set up environment variables in your `.env` file for using
[Google AI Studio](https://google.github.io/adk-docs/get-started/quickstart/#gemini---google-ai-studio)
or
[Google Cloud Vertex AI](https://google.github.io/adk-docs/get-started/quickstart/#gemini---google-cloud-vertex-ai)
for the LLM service for your agent. For example, for using Google AI Studio you
would set:
* GOOGLE_GENAI_USE_VERTEXAI=FALSE
* GOOGLE_API_KEY={your api key}
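Before starting the agent, it can help to confirm that the expected variables are actually set. The following is a minimal sketch; `missing_env_vars` is a hypothetical helper, and the example dictionary stands in for a real environment so the snippet runs without any configuration.

```python
import os


def missing_env_vars(required, environ=None):
  """Return the names in `required` that are unset or empty."""
  environ = os.environ if environ is None else environ
  return [name for name in required if not environ.get(name)]


AI_STUDIO_VARS = ["GOOGLE_GENAI_USE_VERTEXAI", "GOOGLE_API_KEY"]

# Example environment with the API key missing, for illustration only.
example_env = {"GOOGLE_GENAI_USE_VERTEXAI": "FALSE"}
print(missing_env_vars(AI_STUDIO_VARS, example_env))
```

Calling `missing_env_vars(AI_STUDIO_VARS)` without the second argument checks the real process environment instead.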
### With Application Default Credentials
This mode is useful for quick development when the agent builder is the only
user interacting with the agent. The tools are run with these credentials.
1. Create application default credentials on the machine where the agent would
be running by following https://cloud.google.com/docs/authentication/provide-credentials-adc.
1. Set `CREDENTIALS_TYPE=None` in `agent.py`
1. Run the agent
### With Service Account Keys
This mode is useful for quick development when the agent builder wants to run
the agent with service account credentials. The tools are run with these
credentials.
1. Create a service account key by following https://cloud.google.com/iam/docs/service-account-creds#user-managed-keys.
1. Set `CREDENTIALS_TYPE=AuthCredentialTypes.SERVICE_ACCOUNT` in `agent.py`
1. Download the key file and replace `"service_account_key.json"` in `agent.py`
with the path to the downloaded file
1. Run the agent
### With Interactive OAuth
1. Follow
https://developers.google.com/identity/protocols/oauth2#1.-obtain-oauth-2.0-credentials-from-the-dynamic_data.setvar.console_name.
to get your client id and client secret. Be sure to choose "web" as your client
type.
1. Follow
https://developers.google.com/workspace/guides/configure-oauth-consent
to declare the scopes "https://www.googleapis.com/auth/spanner.data" and
"https://www.googleapis.com/auth/spanner.admin". The declaration is used
for review purposes.
1. Follow
https://developers.google.com/identity/protocols/oauth2/web-server#creatingcred
to add http://localhost/dev-ui/ to "Authorized redirect URIs".
Note: localhost here is just a hostname that you use to access the dev ui,
replace it with the actual hostname you use to access the dev ui.
1. On the first run, allow pop-ups for localhost in Chrome.
1. Configure your `.env` file to add two more variables before running the
agent:
* OAUTH_CLIENT_ID={your client id}
* OAUTH_CLIENT_SECRET={your client secret}
Note: don't create a separate `.env` file; add these variables to the same
`.env` file that stores your Vertex AI or Google AI Studio credentials.
1. Set `CREDENTIALS_TYPE=AuthCredentialTypes.OAUTH2` in `agent.py` and run the
agent
## 💬 Sample prompts
* I'd like to buy a starter bike for my 3 year old child, can you show me the recommendation?
![Spanner RAG Sample Agent](Spanner_RAG_Sample_Agent.png)
Binary file not shown. Size: 1.8 MiB

@@ -0,0 +1,15 @@
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from . import agent
@@ -0,0 +1,215 @@
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
from typing import Any
from typing import Dict
from typing import Optional
from google.adk.agents.llm_agent import LlmAgent
from google.adk.auth.auth_credential import AuthCredentialTypes
from google.adk.tools.base_tool import BaseTool
from google.adk.tools.google_tool import GoogleTool
from google.adk.tools.spanner import query_tool
from google.adk.tools.spanner.settings import Capabilities
from google.adk.tools.spanner.settings import SpannerToolSettings
from google.adk.tools.spanner.spanner_credentials import SpannerCredentialsConfig
from google.adk.tools.spanner.spanner_toolset import SpannerToolset
from google.adk.tools.tool_context import ToolContext
import google.auth
from google.auth.credentials import Credentials
from pydantic import BaseModel
# Define an appropriate credential type
# Set to None to use the application default credentials (ADC) for quick
# development.
CREDENTIALS_TYPE = None
# Define Spanner tool config with read capability set to allowed.
tool_settings = SpannerToolSettings(capabilities=[Capabilities.DATA_READ])
if CREDENTIALS_TYPE == AuthCredentialTypes.OAUTH2:
  # Initialize the tools to do interactive OAuth.
  # The environment variables OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET
  # must be set.
  credentials_config = SpannerCredentialsConfig(
      client_id=os.getenv("OAUTH_CLIENT_ID"),
      client_secret=os.getenv("OAUTH_CLIENT_SECRET"),
      scopes=[
          "https://www.googleapis.com/auth/spanner.admin",
          "https://www.googleapis.com/auth/spanner.data",
      ],
  )
elif CREDENTIALS_TYPE == AuthCredentialTypes.SERVICE_ACCOUNT:
  # Initialize the tools to use the credentials in the service account key.
  # If this flow is enabled, make sure to replace the file path with your own
  # service account key file
  # https://cloud.google.com/iam/docs/service-account-creds#user-managed-keys
  creds, _ = google.auth.load_credentials_from_file("service_account_key.json")
  credentials_config = SpannerCredentialsConfig(credentials=creds)
else:
  # Initialize the tools to use the application default credentials.
  # https://cloud.google.com/docs/authentication/provide-credentials-adc
  application_default_credentials, _ = google.auth.default()
  credentials_config = SpannerCredentialsConfig(
      credentials=application_default_credentials
  )
### Section 1: Get the built-in Spanner toolset ###
# Note that the built-in Spanner toolset is more flexible and generic. It is
# shown here for comparison and tutorial purposes.
spanner_toolset = SpannerToolset(
credentials_config=credentials_config, spanner_tool_settings=tool_settings
)
### Section 2: Extending the built-in Spanner Toolset for Custom Use Cases ###
# This example illustrates how to extend the built-in Spanner toolset to create
# a customized Spanner tool. This method is advantageous when you need to deal
# with a specific use case:
#
# 1. Streamline the end user experience by pre-configuring the tool with fixed
# parameters (such as a specific database, instance, or project) and a
# dedicated SQL query, making it perfect for a single, focused use case
# like vector search on a specific table.
# 2. Enhance functionality by adding custom logic to manage tool inputs,
# execution, and result processing, providing greater control over the
# tool's behavior.
class SpannerRagSetting(BaseModel):
  """Customized Spanner RAG settings for an example use case."""

  # Replace the following settings for your Spanner database used in the sample.
  project_id: str = "<PROJECT_ID>"
  instance_id: str = "<INSTANCE_ID>"
  database_id: str = "<DATABASE_ID>"
  # Follow the instructions in README.md, the table name is "products" and the
  # Spanner embedding model name is "EmbeddingsModel" in this sample.
  table_name: str = "products"
  embedding_model_name: str = "EmbeddingsModel"
  selected_columns: list[str] = [
      "productId",
      "productName",
      "productDescription",
  ]
  embedding_column_name: str = "productDescriptionEmbedding"
  additional_filter_expression: str = "inventoryCount > 0"
  vector_distance_function: str = "EUCLIDEAN_DISTANCE"
  top_k: int = 3
RAG_SETTINGS = SpannerRagSetting()
# Create a wrapped function tool for the agent on top of the built-in Spanner
# toolset.
# This customized tool is used to perform a Spanner KNN vector search on an
# embedded knowledge base stored in a Spanner database table.
def wrapped_spanner_execute_sql_tool(
    search_query: str,
    credentials: Credentials,  # GoogleTool handles `credentials` automatically
    settings: SpannerToolSettings,  # GoogleTool handles `settings` automatically
    tool_context: ToolContext,  # GoogleTool handles `tool_context` automatically
) -> str:
  """Perform a similarity search on the product catalog.

  Args:
    search_query: The search query to find relevant content.

  Returns:
    Relevant product catalog content with sources
  """
  # Learn more about Spanner Vertex AI integration for embedding and Spanner
  # vector search.
  # https://cloud.google.com/spanner/docs/ml-tutorial-embeddings
  # https://cloud.google.com/spanner/docs/vector-search/overview
  embedding_query = f"""SELECT embeddings.values
    FROM ML.PREDICT(
      MODEL {RAG_SETTINGS.embedding_model_name},
      (SELECT "{search_query}" as content)
    )
  """
  distance_alias = "distance"
  columns = [f"{column}" for column in RAG_SETTINGS.selected_columns]
  columns += [f"""{RAG_SETTINGS.vector_distance_function}(
    {RAG_SETTINGS.embedding_column_name},
    ({embedding_query})) AS {distance_alias}
  """]
  columns = ", ".join(columns)
  knn_query = f"""
    SELECT {columns}
    FROM {RAG_SETTINGS.table_name}
    WHERE {RAG_SETTINGS.additional_filter_expression}
    ORDER BY {distance_alias}
    LIMIT {RAG_SETTINGS.top_k}
  """
  # Customized tool based on the built-in Spanner toolset.
  return query_tool.execute_sql(
      project_id=RAG_SETTINGS.project_id,
      instance_id=RAG_SETTINGS.instance_id,
      database_id=RAG_SETTINGS.database_id,
      query=knn_query,
      credentials=credentials,
      settings=settings,
      tool_context=tool_context,
  )
def inspect_tool_params(
    tool: BaseTool,
    args: Dict[str, Any],
    tool_context: ToolContext,
) -> Optional[Dict]:
  """A callback function to inspect tool parameters before execution."""
  print("Inspect for tool: " + tool.name)
  actual_search_query_in_args = args.get("search_query")
  # Inspect the `search_query` when calling the tool for tutorial purposes.
  print(f"Tool args `search_query`: {actual_search_query_in_args}")
  # Return None explicitly so the tool call proceeds with its original args.
  return None
### Section 3: Create the root agent ###
root_agent = LlmAgent(
    model="gemini-2.5-flash",
    name="spanner_knowledge_base_agent",
    description=(
        "Agent to answer questions about product-specific recommendations."
    ),
    instruction="""
    You are a helpful assistant that answers user questions about product-specific recommendations.
    1. Always use the `wrapped_spanner_execute_sql_tool` tool to find relevant information.
    2. If no relevant information is found, say you don't know.
    3. Present all the relevant information naturally and well formatted in your response.
    """,
    tools=[
        # Add customized Spanner tool based on the built-in Spanner toolset.
        GoogleTool(
            func=wrapped_spanner_execute_sql_tool,
            credentials_config=credentials_config,
            tool_settings=tool_settings,
        ),
        # Add built-in Spanner toolset for comparison and tutorial purposes.
        # spanner_toolset,
    ],
    before_tool_callback=inspect_tool_params,
)