diff --git a/contributing/samples/spanner_rag_agent/README.md b/contributing/samples/spanner_rag_agent/README.md new file mode 100644 index 00000000..90f8128d --- /dev/null +++ b/contributing/samples/spanner_rag_agent/README.md @@ -0,0 +1,186 @@ +# Spanner Tools RAG Agent Sample + +## 🚀 Introduction + +This sample demonstrates how to build an intelligent Retrieval Augmented +Generation (RAG) agent using the flexible, built-in Spanner tools available +in the ADK's `google.adk.tools.spanner` module, including how to create +customized Spanner tools by extending the existing ones. + +[Spanner](https://cloud.google.com/spanner/docs) is a fully managed, +horizontally scalable, globally distributed database service that is great for +both relational and non-relational operational workloads. +Spanner has built-in vector search support, enabling you to perform similarity +or semantic search and implement retrieval augmented generation (RAG) in GenAI +applications at scale, leveraging either exact K-nearest neighbor (KNN) or +approximate nearest neighbor (ANN) features. +Spanner's vector search queries return fresh real-time data as soon as +transactions are committed, just like any other query on your operational data. + +In this sample, you'll build the agent leveraging Spanner's built-in, real-time +vector search capabilities to provide relevant information. + +## 🛠️ Setup and Requirements + +To run this sample, you need an accessible Spanner instance and database in your +Google Cloud Project. + +### Set up the Spanner database table +To set up the schema, navigate to Spanner Studio: +First, you want to add the products table. Copy and paste this statement in the +empty tab. +For the schema, copy and paste this DDL into the box: + +```sql +CREATE TABLE products ( + categoryId INT64 NOT NULL, + productId INT64 NOT NULL, + productName STRING(MAX) NOT NULL, + productDescription STRING(MAX) NOT NULL, + productDescriptionEmbedding ARRAY, + createTime TIMESTAMP NOT NULL OPTIONS ( + allow_commit_timestamp = true +), +inventoryCount INT64 NOT NULL, +priceInCents INT64, +) PRIMARY KEY(categoryId, productId); +``` + +Then, click the `run` button and wait a few seconds for your schema to be +created. + +### Create an Embedding model +Next, you will create an Embedding model in Spanner and configure it to VertexAI +model endpoint. + +```sql +CREATE MODEL EmbeddingsModel INPUT( +content STRING(MAX), +) OUTPUT( +embeddings STRUCT, values ARRAY>, +) REMOTE OPTIONS ( +endpoint = '//aiplatform.googleapis.com/projects//locations/us-central1/publishers/google/models/text-embedding-004' +); +``` + +Then, click the `run` button and wait a few seconds for your models to be +created. + +Learn more about Spanner `MODEL` in [Spanner Vertex AI integration](https://cloud.google.com/spanner/docs/ml-tutorial-embeddings) + +### Load the sample data +Now, you will want to insert some products into your database. Open up a new tab +in Spanner Studio, then copy and paste the following insert statements: + +```sql +INSERT INTO products (categoryId, productId, productName, productDescription, createTime, inventoryCount, priceInCents) +VALUES (1, 1, "Cymbal Helios Helmet", "Safety meets style with the Cymbal children's bike helmet. Its lightweight design, superior ventilation, and adjustable fit ensure comfort and protection on every ride. Stay bright and keep your child safe under the sun with Cymbal Helios!", PENDING_COMMIT_TIMESTAMP(), 100, 10999), +(1, 2, "Cymbal Sprout", "Let their cycling journey begin with the Cymbal Sprout, the ideal balance bike for beginning riders ages 2-4 years. Its lightweight frame, low seat height, and puncture-proof tires promote stability and confidence as little ones learn to balance and steer. Watch them sprout into cycling enthusiasts with Cymbal Sprout!", PENDING_COMMIT_TIMESTAMP(), 10, 13999), +(1, 3, "Cymbal Spark Jr.", "Light, vibrant, and ready for adventure, the Spark Jr. is the perfect first bike for young riders (ages 5-8). Its sturdy frame, easy-to-use brakes, and puncture-resistant tires inspire confidence and endless playtime. Let the spark of cycling ignite with Cymbal!", PENDING_COMMIT_TIMESTAMP(), 34, 13900), +(1, 4, "Cymbal Summit", "Conquering trails is a breeze with the Summit mountain bike. Its lightweight aluminum frame, responsive suspension, and powerful disc brakes provide exceptional control and comfort for experienced bikers navigating rocky climbs or shredding downhill. Reach new heights with Cymbal Summit!", PENDING_COMMIT_TIMESTAMP(), 0, 79999), +(1, 5, "Cymbal Breeze", "Cruise in style and embrace effortless pedaling with the Breeze electric bike. Its whisper-quiet motor and long-lasting battery let you conquer hills and distances with ease. Enjoy scenic rides, commutes, or errands with a boost of confidence from Cymbal Breeze!", PENDING_COMMIT_TIMESTAMP(), 72, 129999), +(1, 6, "Cymbal Trailblazer Backpack", "Carry all your essentials in style with the Trailblazer backpack. Its water-resistant material, multiple compartments, and comfortable straps keep your gear organized and accessible, allowing you to focus on the adventure. Blaze new trails with Cymbal Trailblazer!", PENDING_COMMIT_TIMESTAMP(), 24, 7999), +(1, 7, "Cymbal Phoenix Lights", "See and be seen with the Phoenix bike lights. Powerful LEDs and multiple light modes ensure superior visibility, enhancing your safety and enjoyment during day or night rides. Light up your journey with Cymbal Phoenix!", PENDING_COMMIT_TIMESTAMP(), 87, 3999), +(1, 8, "Cymbal Windstar Pump", "Flat tires are no match for the Windstar pump. Its compact design, lightweight construction, and high-pressure capacity make inflating tires quick and effortless. Get back on the road in no time with Cymbal Windstar!", PENDING_COMMIT_TIMESTAMP(), 36, 24999), +(1, 9,"Cymbal Odyssey Multi-Tool","Be prepared for anything with the Odyssey multi-tool. This handy gadget features essential tools like screwdrivers, hex wrenches, and tire levers, keeping you ready for minor repairs and adjustments on the go. Conquer your journey with Cymbal Odyssey!", PENDING_COMMIT_TIMESTAMP(), 52, 999), +(1, 10,"Cymbal Nomad Water Bottle","Stay hydrated on every ride with the Nomad water bottle. Its sleek design, BPA-free construction, and secure lock lid make it the perfect companion for staying refreshed and motivated throughout your adventures. Hydrate and explore with Cymbal Nomad!", PENDING_COMMIT_TIMESTAMP(), 42, 1299); +``` + +Click the `run` button to insert the data. + +### Generate embeddings for the sample data +For similarity search to work on the products, you need to generate embeddings +for the product descriptions. +With the `EmbeddingsModel` created in the schema, this is a simple UPDATE DML +statement to generate embeddings. + +```sql +UPDATE products p1 +SET productDescriptionEmbedding = +(SELECT embeddings.values from ML.PREDICT(MODEL EmbeddingsModel, +(SELECT productDescription as content FROM products p2 where p2.productId=p1.productId))) +WHERE categoryId=1; +``` + +Click the `run` button to update the product descriptions. + +Learn more about how to [generate and backfill vector embeddings in bulk](https://cloud.google.com/spanner/docs/backfill-embeddings) +for textual data (STRING or JSON) that is stored in Spanner using SQL. + +## 🤖 How to use the sample RAG agent built on Spanner + +Set up environment variables in your `.env` file for using +[Google AI Studio](https://google.github.io/adk-docs/get-started/quickstart/#gemini---google-ai-studio) +or +[Google Cloud Vertex AI](https://google.github.io/adk-docs/get-started/quickstart/#gemini---google-cloud-vertex-ai) +for the LLM service for your agent. For example, for using Google AI Studio you +would set: + +* GOOGLE_GENAI_USE_VERTEXAI=FALSE +* GOOGLE_API_KEY={your api key} + +### With Application Default Credentials + +This mode is useful for quick development when the agent builder is the only +user interacting with the agent. The tools are run with these credentials. + +1. Create application default credentials on the machine where the agent would +be running by following https://cloud.google.com/docs/authentication/provide-credentials-adc. + +1. Set `CREDENTIALS_TYPE=None` in `agent.py` + +1. Run the agent + +### With Service Account Keys + +This mode is useful for quick development when the agent builder wants to run +the agent with service account credentials. The tools are run with these +credentials. + +1. Create service account key by following https://cloud.google.com/iam/docs/service-account-creds#user-managed-keys. + +1. Set `CREDENTIALS_TYPE=AuthCredentialTypes.SERVICE_ACCOUNT` in `agent.py` + +1. Download the key file and replace `"service_account_key.json"` with the path + +1. Run the agent + +### With Interactive OAuth + +1. Follow +https://developers.google.com/identity/protocols/oauth2#1.-obtain-oauth-2.0-credentials-from-the-dynamic_data.setvar.console_name. +to get your client id and client secret. Be sure to choose "web" as your client +type. + +1. Follow + https://developers.google.com/workspace/guides/configure-oauth-consent + to add scope "https://www.googleapis.com/auth/spanner.data" and + "https://www.googleapis.com/auth/spanner.admin" as declaration, this is used + for review purpose. + +1. Follow + https://developers.google.com/identity/protocols/oauth2/web-server#creatingcred + to add http://localhost/dev-ui/ to "Authorized redirect URIs". + + Note: localhost here is just a hostname that you use to access the dev ui, + replace it with the actual hostname you use to access the dev ui. + +1. For 1st run, allow popup for localhost in Chrome. + +1. Configure your `.env` file to add two more variables before running the + agent: + + * OAUTH_CLIENT_ID={your client id} + * OAUTH_CLIENT_SECRET={your client secret} + + Note: don't create a separate .env, instead put it to the same .env file + that stores your Vertex AI or Dev ML credentials + +1. Set `CREDENTIALS_TYPE=AuthCredentialTypes.OAUTH2` in `agent.py` and run the + agent + +## 💬 Sample prompts + +* I'd like to buy a starter bike for my 3 year old child, can you show me the recommendation? + +![Spanner RAG Sample Agent](Spanner_RAG_Sample_Agent.png) diff --git a/contributing/samples/spanner_rag_agent/Spanner_RAG_Sample_Agent.png b/contributing/samples/spanner_rag_agent/Spanner_RAG_Sample_Agent.png new file mode 100644 index 00000000..28ed81f2 Binary files /dev/null and b/contributing/samples/spanner_rag_agent/Spanner_RAG_Sample_Agent.png differ diff --git a/contributing/samples/spanner_rag_agent/__init__.py b/contributing/samples/spanner_rag_agent/__init__.py new file mode 100644 index 00000000..c48963cd --- /dev/null +++ b/contributing/samples/spanner_rag_agent/__init__.py @@ -0,0 +1,15 @@ +# Copyright 2025 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from . import agent diff --git a/contributing/samples/spanner_rag_agent/agent.py b/contributing/samples/spanner_rag_agent/agent.py new file mode 100644 index 00000000..734d65f7 --- /dev/null +++ b/contributing/samples/spanner_rag_agent/agent.py @@ -0,0 +1,215 @@ +# Copyright 2025 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import os +from typing import Any +from typing import Dict +from typing import Optional + +from google.adk.agents.llm_agent import LlmAgent +from google.adk.auth.auth_credential import AuthCredentialTypes +from google.adk.tools.base_tool import BaseTool +from google.adk.tools.google_tool import GoogleTool +from google.adk.tools.spanner import query_tool +from google.adk.tools.spanner.settings import Capabilities +from google.adk.tools.spanner.settings import SpannerToolSettings +from google.adk.tools.spanner.spanner_credentials import SpannerCredentialsConfig +from google.adk.tools.spanner.spanner_toolset import SpannerToolset +from google.adk.tools.tool_context import ToolContext +import google.auth +from google.auth.credentials import Credentials +from pydantic import BaseModel + +# Define an appropriate credential type +# Set to None to use the application default credentials (ADC) for a quick +# development. +CREDENTIALS_TYPE = None + + +# Define Spanner tool config with read capability set to allowed. +tool_settings = SpannerToolSettings(capabilities=[Capabilities.DATA_READ]) + +if CREDENTIALS_TYPE == AuthCredentialTypes.OAUTH2: + # Initiaze the tools to do interactive OAuth + # The environment variables OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET + # must be set + credentials_config = SpannerCredentialsConfig( + client_id=os.getenv("OAUTH_CLIENT_ID"), + client_secret=os.getenv("OAUTH_CLIENT_SECRET"), + scopes=[ + "https://www.googleapis.com/auth/spanner.admin", + "https://www.googleapis.com/auth/spanner.data", + ], + ) +elif CREDENTIALS_TYPE == AuthCredentialTypes.SERVICE_ACCOUNT: + # Initialize the tools to use the credentials in the service account key. + # If this flow is enabled, make sure to replace the file path with your own + # service account key file + # https://cloud.google.com/iam/docs/service-account-creds#user-managed-keys + creds, _ = google.auth.load_credentials_from_file("service_account_key.json") + credentials_config = SpannerCredentialsConfig(credentials=creds) +else: + # Initialize the tools to use the application default credentials. + # https://cloud.google.com/docs/authentication/provide-credentials-adc + application_default_credentials, _ = google.auth.default() + credentials_config = SpannerCredentialsConfig( + credentials=application_default_credentials + ) + +### Section 1: Get the built-in Spanner toolset ### +# Note that the built-in Spanner toolset is more flexible and generic. It is +# shown here for comparison and tutorial purposes. +spanner_toolset = SpannerToolset( + credentials_config=credentials_config, spanner_tool_settings=tool_settings +) + + +### Section 2: Extending the built-in Spanner Toolset for Custom Use Cases ### +# This example illustrates how to extend the built-in Spanner toolset to create +# a customized Spanner tool. This method is advantageous when you need to deal +# with a specific use case: +# +# 1. Streamline the end user experience by pre-configuring the tool with fixed +# parameters (such as a specific database, instance, or project) and a +# dedicated SQL query, making it perfect for a single, focused use case +# like vector search on a specific table. +# 2. Enhance functionality by adding custom logic to manage tool inputs, +# execution, and result processing, providing greater control over the +# tool's behavior. +class SpannerRagSetting(BaseModel): + """Customized Spanner RAG settings for an example use case.""" + + # Replace the following settings for your Spanner database used in the sample. + project_id: str = "" + instance_id: str = "" + database_id: str = "" + + # Follow the instructions in README.md, the table name is "products" and the + # Spanner embedding model name is "EmbeddingsModel" in this sample. + table_name: str = "products" + embedding_model_name: str = "EmbeddingsModel" + + selected_columns: list[str] = [ + "productId", + "productName", + "productDescription", + ] + embedding_column_name: str = "productDescriptionEmbedding" + + additional_filter_expression: str = "inventoryCount > 0" + vector_distance_function: str = "EUCLIDEAN_DISTANCE" + top_k: int = 3 + + +RAG_SETTINGS = SpannerRagSetting() + + +# Create a wrapped function tool for the agent on top of the built-in Spanner +# toolset. +# This customized tool is used to perform a Spanner KNN vector search on a +# embeded knowledge base stored in a Spanner database table. +def wrapped_spanner_execute_sql_tool( + search_query: str, + credentials: Credentials, # GoogleTool handles `credentials` automatically + settings: SpannerToolSettings, # GoogleTool handles `settings` automatically + tool_context: ToolContext, # GoogleTool handles `tool_context` automatically +) -> str: + """Perform a similarity search on the product catalog. + + Args: + search_query: The search query to find relevant content. + + Returns: + Relevant product catalog content with sources + """ + + # Learn more about Spanner Vertex AI integration for embedding and Spanner + # vector search. + # https://cloud.google.com/spanner/docs/ml-tutorial-embeddings + # https://cloud.google.com/spanner/docs/vector-search/overview + embedding_query = f"""SELECT embeddings.values + FROM ML.PREDICT( + MODEL {RAG_SETTINGS.embedding_model_name}, + (SELECT "{search_query}" as content) + ) + """ + + distance_alias = "distance" + columns = [f"{column}" for column in RAG_SETTINGS.selected_columns] + columns += [f"""{RAG_SETTINGS.vector_distance_function}( + {RAG_SETTINGS.embedding_column_name}, + ({embedding_query})) AS {distance_alias} + """] + columns = ", ".join(columns) + + knn_query = f""" + SELECT {columns} + FROM {RAG_SETTINGS.table_name} + WHERE {RAG_SETTINGS.additional_filter_expression} + ORDER BY {distance_alias} + LIMIT {RAG_SETTINGS.top_k} + """ + + # Customized tool based on the built-in Spanner toolset. + return query_tool.execute_sql( + project_id=RAG_SETTINGS.project_id, + instance_id=RAG_SETTINGS.instance_id, + database_id=RAG_SETTINGS.database_id, + query=knn_query, + credentials=credentials, + settings=settings, + tool_context=tool_context, + ) + + +def inspect_tool_params( + tool: BaseTool, + args: Dict[str, Any], + tool_context: ToolContext, +) -> Optional[Dict]: + """A callback function to inspect tool parameters before execution.""" + print("Inspect for tool: " + tool.name) + + actual_search_query_in_args = args.get("search_query") + # Inspect the `search_query` when calling the tool for tutorial purposes. + print(f"Tool args `search_query`: {actual_search_query_in_args}") + + pass + + +### Section 3: Create the root agent ### +root_agent = LlmAgent( + model="gemini-2.5-flash", + name="spanner_knowledge_base_agent", + description=( + "Agent to answer questions about product-specific recommendations." + ), + instruction=""" + You are a helpful assistant that answers user questions about product-specific recommendations. + 1. Always use the `wrapped_spanner_execute_sql_tool` tool to find relevant information. + 2. If no relevant information is found, say you don't know. + 3. Present all the relevant information naturally and well formatted in your response. + """, + tools=[ + # Add customized Spanner tool based on the built-in Spanner toolset. + GoogleTool( + func=wrapped_spanner_execute_sql_tool, + credentials_config=credentials_config, + tool_settings=tool_settings, + ), + # Add built-in Spanner toolset for comparison and tutorial purposes. + # spanner_toolset, + ], + before_tool_callback=inspect_tool_params, +)