We add a new metric for evaluating safety of Agent's response to ADK Eval. We delegate the actual implementation to Vertex Gen AI Eval SDK, so using this metric will require GCP project.
As a part of this change, we created (refactored) a simple Facade for vertex gen ai eval sdk.
PiperOrigin-RevId: 778580406
Additionally, few other small changes.
* Updated a test fixture to support the latest eval data schema. Somehow I missed doing that previously.
* Updated the `evaluation_generator.py` to use `run_async`, instead of `run`.
* Also, raise an informed error when dependencies required eval are not installed.
* Also, changed the behavior of AgentEvaluator.evaluate method to run all the evals, instead of failing at the first eval metric failure.
PiperOrigin-RevId: 775919127