Computer Use Agent

This directory contains a computer use agent that can operate a browser to complete user tasks. The agent uses Playwright to control a Chromium browser and can interact with web pages by taking screenshots, clicking, typing, and navigating.

This agent demonstrates how to use ComputerUseToolset.

Overview

The computer use agent consists of:

  • agent.py: Main agent configuration using Google's gemini-2.5-computer-use-preview-10-2025 model
  • playwright.py: Playwright-based computer implementation for browser automation
  • requirements.txt: Python dependencies

Setup

1. Install Python Dependencies

Install the required Python packages from the requirements file (run this from the project root):

uv pip install -r contributing/samples/computer_use/requirements.txt

2. Install Playwright Dependencies

Install Playwright's system dependencies for Chromium:

playwright install-deps chromium

3. Install Chromium Browser

Install the Chromium browser for Playwright:

playwright install chromium
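
After the three setup steps, a quick check can confirm that the Playwright package and command-line tool are visible to your Python environment. The following is a minimal sketch using only the standard library; the function name check_setup is hypothetical and not part of this sample. It does not verify that the Chromium browser itself was downloaded.

```python
import importlib.util
import shutil


def check_setup() -> dict[str, bool]:
    """Report which pieces of the Playwright setup are visible from Python."""
    return {
        # True when the `playwright` Python package can be imported.
        "playwright_package": importlib.util.find_spec("playwright") is not None,
        # True when the `playwright` command-line tool is on PATH.
        "playwright_cli": shutil.which("playwright") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_setup().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If either entry reports missing, revisit step 1 and make sure the same virtual environment is active.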

Usage

Running the Agent

To start the computer use agent, run the following command from the project root:

adk web contributing/samples

This starts the ADK web interface, where you can select and interact with the computer_use agent.

Example Queries

Once the agent is running, you can send queries like:

find me a flight from SF to Hawaii on next Monday, coming back on next Friday. start by navigating directly to flights.google.com

The agent will:

  1. Open a browser window
  2. Navigate to the specified website
  3. Interact with the page elements to complete your task
  4. Provide updates on its progress
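
The steps above amount to an observe-and-act loop: look at the page, pick an action, apply it, and report progress. The sketch below illustrates that pattern only; it is not how ADK or ComputerUseToolset is implemented. FakeBrowser, scripted_model, and run_task are hypothetical names, and the "model" is a hard-coded rule standing in for the real one.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a browser; records actions instead of performing them."""

    url: str = "about:blank"
    log: list[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real browser would return image bytes; a text description is enough here.
        return f"page at {self.url}"

    def apply(self, action: dict) -> None:
        if action["type"] == "navigate":
            self.url = action["url"]
        self.log.append(action["type"])


def scripted_model(screenshot: str, task: str) -> dict:
    """Hypothetical policy standing in for the model: choose the next action."""
    if "about:blank" in screenshot:
        return {"type": "navigate", "url": "https://flights.google.com"}
    return {"type": "done"}


def run_task(task: str, browser: FakeBrowser, max_steps: int = 10) -> list[str]:
    """Observe, choose an action, apply it, and repeat until done or out of steps."""
    updates = []
    for step in range(1, max_steps + 1):
        observation = browser.screenshot()
        action = scripted_model(observation, task)
        if action["type"] == "done":
            updates.append(f"step {step}: task finished")
            break
        browser.apply(action)
        updates.append(f"step {step}: {action['type']}")
    return updates
```

The max_steps cap is a design choice for the sketch: it keeps a confused policy from looping forever.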

Other Example Tasks

  • Book hotel reservations
  • Search for products online
  • Fill out forms
  • Navigate complex websites
  • Research information across multiple pages

Technical Details

  • Model: Uses Google's gemini-2.5-computer-use-preview-10-2025 model for computer use capabilities
  • Browser: Automated Chromium browser via Playwright
  • Screen Size: Configured for 600x800 resolution
  • Tools: Uses ComputerUseToolset for screen capture, clicking, typing, and scrolling
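
To get a feel for what the browser layer does, the sketch below opens a page in Chromium at a fixed viewport and captures a screenshot with Playwright's synchronous API. It is an illustration, not the code in playwright.py. The helper names parse_resolution and capture are hypothetical, and reading "600x800" as width by height is an assumption.

```python
def parse_resolution(text: str) -> dict[str, int]:
    """Turn a string such as "600x800" into a Playwright-style viewport dict.

    Assumption: the first number is the width and the second is the height.
    """
    width, height = (int(part) for part in text.lower().split("x"))
    return {"width": width, "height": height}


def capture(url: str, resolution: str = "600x800") -> bytes:
    """Open a page in Chromium at the given viewport and return a PNG screenshot."""
    # Imported lazily so parse_resolution works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport=parse_resolution(resolution))
        page.goto(url)
        png = page.screenshot()  # PNG image as bytes
        browser.close()
    return png
```

Calling capture("https://example.com") requires the setup steps above to have completed, since it launches the Chromium build that Playwright installed.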

Troubleshooting

If you encounter issues:

  1. Playwright not found: Make sure you've run both playwright install-deps chromium and playwright install chromium
  2. Dependencies missing: Verify all packages from requirements.txt are installed
  3. Browser crashes: Check that your system supports Chromium and has sufficient resources
  4. Permission errors: Ensure your user has permission to run browser automation tools

Notes

  • The agent operates in a controlled browser environment
  • Screenshots are taken to help the agent understand the current state
  • The agent will provide updates on its actions as it works
  • Complex tasks can take some time to complete, so allow the agent to finish before sending a new query