Independent AI Safety Research
Rigorous research for safer, more capable AI systems.
Our Mission
15 Research Lab exists to advance the state of AI evaluation, development, and safety. We believe that rigorous, reproducible research is essential for building AI systems that are both capable and aligned with human values.
“The gap between AI capabilities and our ability to verify behavior grows wider each day. Rigorous evaluation is not optional… it’s essential.”
Three Lanes of Research
Our work spans three complementary research programs, each addressing a different aspect of AI evaluation and safety.
The Fifteen Standard
A systematic 100-point evaluation framework for autonomous AI agents. The Fifteen Standard measures eight dimensions of agent behavior: correctness, tool use, constraint adherence, intent fidelity, safety margins, failure handling, resource efficiency, and auditability. Results are published on a public leaderboard with full artifacts and reproducibility guarantees.
View verified results →
Frontier Methods
Novel evaluation techniques for capabilities that don't yet have established benchmarks. Frontier Methods research develops new task suites, scoring approaches, and evaluation protocols for emerging agent behaviors—chain-of-thought reasoning under constraints, multi-step tool orchestration, and adversarial robustness to prompt injection.
Read our methodology →
Red Team
Adversarial stress-testing of AI systems and evaluation infrastructure. The Red Team probes for failure modes, specification gaming, and edge cases that standard benchmarks miss. Findings are disclosed responsibly to developers and published after remediation, contributing to the broader field's understanding of AI robustness.
Security disclosure policy →
Independence Statement
15 Research Lab operates independently. We don't accept funding from AI developers whose systems we evaluate, and we don't provide consulting services that could create conflicts of interest.
Our research is funded through grants from foundations focused on AI safety, and we publish all results—positive and negative—without commercial influence on our findings or timing.
Get in Touch