Computer Use Agent
This directory contains a computer use agent that can operate a browser to complete user tasks. The agent uses Playwright to control a Chromium browser and can interact with web pages by taking screenshots, clicking, typing, and navigating.
This agent demonstrates how to use ComputerUseToolset.
Overview
The computer use agent consists of:
- agent.py: Main agent configuration using Google's gemini-2.5-computer-use-preview-10-2025 model
- playwright.py: Playwright-based computer implementation for browser automation
- requirements.txt: Python dependencies
Setup
1. Install Python Dependencies
Install the required Python packages from the requirements file:
uv pip install -r internal/samples/computer_use/requirements.txt
2. Install Playwright Dependencies
Install Playwright's system dependencies for Chromium:
playwright install-deps chromium
3. Install Chromium Browser
Install the Chromium browser for Playwright:
playwright install chromium
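The three setup steps can also be scripted. The following is a minimal sketch and not part of the sample itself: the helper names setup_commands and run_setup are hypothetical, and it assumes that uv is on your PATH and that the script is run from the project root.

```python
import subprocess

# Path taken from step 1 above; adjust it if your checkout differs.
REQUIREMENTS = "internal/samples/computer_use/requirements.txt"


def setup_commands(requirements=REQUIREMENTS):
    """Return the three setup commands from this README, in order."""
    return [
        ["uv", "pip", "install", "-r", requirements],
        ["playwright", "install-deps", "chromium"],
        ["playwright", "install", "chromium"],
    ]


def run_setup(commands, runner=subprocess.run):
    """Run each command in order; check=True stops at the first failure."""
    for command in commands:
        print("running:", " ".join(command))
        runner(command, check=True)
```

Calling run_setup(setup_commands()) executes the steps for real. Passing a different runner (for example, one that only records the commands) gives a dry run of the same sequence.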
Usage
Running the Agent
To start the computer use agent, run the following command from the project root:
adk web internal/samples
This will start the ADK web interface where you can interact with the computer_use agent.
Example Queries
Once the agent is running, you can send queries like:
find me a flight from SF to Hawaii on next Monday, coming back on next Friday. start by navigating directly to flights.google.com
The agent will:
- Open a browser window
- Navigate to the specified website
- Interact with the page elements to complete your task
- Provide updates on its progress
Other Example Tasks
- Book hotel reservations
- Search for products online
- Fill out forms
- Navigate complex websites
- Research information across multiple pages
Technical Details
- Model: Uses Google's gemini-2.5-computer-use-preview-10-2025 model for computer use capabilities
- Browser: Automated Chromium browser via Playwright
- Screen Size: Configured for 600x800 resolution
- Tools: Uses ComputerUseToolset for screen capture, clicking, typing, and scrolling
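To make the listed actions concrete, here is a rough sketch of the underlying Playwright calls for screen capture, clicking, typing, and scrolling. It is not the sample's playwright.py and does not use ComputerUseToolset. The function names are hypothetical, and it assumes that "600x800" means width by height.

```python
# Assumption: "600x800" is read as width x height.
SCREEN_SIZE = (600, 800)


def viewport(screen_size=SCREEN_SIZE):
    """Convert a (width, height) pair to Playwright's viewport dict."""
    width, height = screen_size
    return {"width": width, "height": height}


def screenshot_click_type(url, x, y, text):
    """Open url, click at (x, y), type text, scroll, and return a PNG screenshot."""
    # Imported lazily so this module loads even where Playwright is absent.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_context(viewport=viewport()).new_page()
        page.goto(url)
        page.mouse.click(x, y)    # click at pixel coordinates
        page.keyboard.type(text)  # type into whatever element has focus
        page.mouse.wheel(0, 300)  # scroll down by 300 pixels
        png = page.screenshot()   # PNG image as bytes
        browser.close()
        return png
```

In the actual agent, the model chooses which of these actions to perform based on the latest screenshot; the sketch only shows what a single round of such actions might look like.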
Troubleshooting
If you encounter issues:
- Playwright not found: Make sure you've run both playwright install-deps chromium and playwright install chromium
- Dependencies missing: Verify all packages from requirements.txt are installed
- Browser crashes: Check that your system supports Chromium and has sufficient resources
- Permission errors: Ensure your user has permission to run browser automation tools
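Some of the checks above can be automated. The sketch below uses only the Python standard library. The function names are hypothetical, and the check is deliberately shallow: it reports whether the playwright package and the playwright and adk commands can be found, but it does not confirm that the Chromium browser itself was downloaded.

```python
import importlib.util
import shutil


def module_available(name):
    """True if the named top-level module can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def check_environment():
    """Collect simple yes/no findings that map onto the checklist above."""
    return {
        "playwright_package": module_available("playwright"),
        "playwright_cli": shutil.which("playwright") is not None,
        "adk_cli": shutil.which("adk") is not None,
    }


def report(findings):
    """Format findings as one line per check."""
    return [f"{name}: {'ok' if ok else 'MISSING'}" for name, ok in findings.items()]
```

Printing "\n".join(report(check_environment())) gives a quick summary to attach when asking for help.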
Notes
- The agent operates in a controlled browser environment
- Screenshots are taken to help the agent understand the current state
- The agent will provide updates on its actions as it works
- Be patient as complex tasks may take some time to complete