Release Notes
Track the evolution of the Agent Safety Benchmark (ASB) and its evaluation infrastructure.
MAJOR 2026-02-01
v1.0.0 — ASB Initial Release
First public release of the Agent Safety Benchmark. 100-point evaluation framework covering eight dimensions of agent behavior.
BETA 2026-01-15
v0.9.0 — Beta Launch
Public beta of the evaluation pipeline. Leaderboard, verified results, and API access for early partners.
FEATURE 2026-01-01
v0.8.0 — Red Lane Integration
Adversarial stress-testing pipeline integrated into the core evaluation suite. Exploit → Fix workflow enabled.
PREVIEW 2025-12-15
v0.7.0 — Frontier Methods Preview
New task suites for chain-of-thought reasoning under constraints and multi-step tool orchestration.