Commit Graph

7 Commits

Author SHA1 Message Date
Ankur Sharma 0bd05df471 feat: Add Safety evaluator metric
We add a new metric to ADK Eval for evaluating the safety of an agent's response. The actual implementation is delegated to the Vertex Gen AI Eval SDK, so using this metric requires a GCP project.

As part of this change, we created (via refactoring) a simple facade for the Vertex Gen AI Eval SDK.

PiperOrigin-RevId: 778580406
2025-07-02 11:30:31 -07:00
Ankur Sharma 04de3e197d fix: Adding detailed information on each metric evaluation
Additionally, a few other small changes:
*   Updated a test fixture to support the latest eval data schema. Somehow I missed doing that previously.
*   Updated `evaluation_generator.py` to use `run_async` instead of `run`.
*   Raise an informative error when the dependencies required for eval are not installed.
*   Changed the behavior of the `AgentEvaluator.evaluate` method to run all the evals instead of stopping at the first eval metric failure.

PiperOrigin-RevId: 775919127
2025-06-25 18:32:02 -07:00
Wei Sun (Jack) a09781142a chore: Removes the LlmAgent.examples field, which was already abandoned before version 0.1
For context: tools/example_tool.py was created to replace LlmAgent.examples.

Also removes the related usage in tests.

PiperOrigin-RevId: 768193042
2025-06-06 13:17:10 -07:00
Xiang (Sean) Zhou 3f117391a5 refactor: remove remote agent, as there is no use case and it's not implemented properly
PiperOrigin-RevId: 760652423
2025-05-19 09:24:37 -07:00
Ankur Sharma 1c23556225 Updated test cases to use the new EvalSet schema to store test data. Also, added a utility to help migrate existing test files to the new schema.
Also, migrated existing test files to the new schema and deleted test session files, as they are no longer needed.

PiperOrigin-RevId: 759318735
2025-05-15 15:10:06 -07:00
Selcuk Gun 794a70edcd Support async agent and model callbacks
PiperOrigin-RevId: 755542756
2025-05-06 15:14:10 -07:00
Google ADK Member 61d4be2d76 No public description
PiperOrigin-RevId: 748777998
2025-04-17 21:47:59 +00:00