15 Research Lab

Release Notes

Track the evolution of the Agent Safety Benchmark and evaluation infrastructure.

MAJOR 2026-02-01

v1.0.0 ASB Initial Release

First public release of the Agent Safety Benchmark. 100-point evaluation framework covering eight dimensions of agent behavior.

BETA 2026-01-15

v0.9.0 Beta Launch

Public beta of the evaluation pipeline. Leaderboard, verified results, and API access for early partners.

FEATURE 2026-01-01

v0.8.0 Red Lane Integration

Adversarial stress-testing pipeline integrated into the core evaluation suite. Exploit → Fix workflow enabled.

PREVIEW 2025-12-15

v0.7.0 Frontier Methods Preview

New task suites for chain-of-thought reasoning under constraints and multi-step tool orchestration.