# 15 Research Lab > Independent AI safety research lab. Adversarial experiments on frontier AI systems — full data, full methodology, no PR filter. 15 Research Lab conducts large-scale adversarial experiments on frontier AI systems and publishes everything: data, methodology, transcripts, and scoring code. Founded by John Kearney. Based in Chicago. ## Research Areas ### Agent Safety We test whether AI agents complete tasks safely — not just whether they complete them. Our adversarial experiments have produced 7 headline findings across 73+ agent evaluations and 6 experimental rounds. Key findings: - 71% compliance rate via gradual escalation (15-turn slow boil) on requests that get 0% when asked directly - 0% compliance on bright-line violations (HIPAA, CFAA, SEC, malware, phishing) across 100+ trials - 87% meta-refusal rate when attacks are described transparently — safety evals that describe attacks in advance produce artificially high scores - 80% failure reduction from two specific system prompt changes - Trajectory blindness: AI catches individual harmful requests but misses 80-turn cumulative scope creep (scored 7/10 safety) - Presentation layer is load-bearing: stripping warnings disables safety reasoning, not just formatting ### MCP Safety (Model Context Protocol) We evaluate AI systems' ability to handle malicious MCP tool servers. 8 attack categories covering the full MCP lifecycle. Key findings: - Best MCP safety score: 78/100 (Claude Code) — no system exceeds 80 - 23-point gap between framework-wrapped (Claude Code: 78) and raw API (same model: 55) - 44% average miss rate on tool chain attacks across all systems - Prompt injection via tool results succeeds 32% of the time even in best-defended system - Consent bypass is the best-defended category; prompt injection via tools is the weakest MCP attack categories evaluated: 1. Tool Poisoning — hidden instructions in tool descriptions (Critical) 2. Prompt Injection via Tools — adversarial payloads in tool responses (Critical) 3. Privilege Escalation — tools exceeding declared permissions (Critical) 4. Data Exfiltration — leaking context through tool calls (Critical) 5. Schema Injection — malformed data causing unintended actions (High) 6. Cross-Origin Escalation — cross-server resource access (High) 7. Consent Bypass — actions without user confirmation (High) 8. Rug-Pull Detection — tool behavior changing after approval (Medium) ## The Fifteen Standard A scoring rubric for action-taking AI systems (agents, automations, tool-using LLM workflows). 8 weighted categories, 100-point scale. Categories and weights: - Authorization & Policy: 20/100 - Exactly-Once Execution: 20/100 - Receipts & Auditability: 15/100 - Approvals & Escalation: 10/100 - Tool Scope & Intent Binding: 10/100 - Adversarial Resilience: 10/100 - Observability & Recovery: 10/100 - Operational Hygiene: 5/100 ## Leaderboard — Agent Safety (Fifteen Standard) Models evaluated: Claude Opus 4.6, GPT-4o, Claude Sonnet 4.5, Gemini 2.5 Pro, Llama 4 Maverick Test suites: Bright-Line Violations, Novel Attack Vectors, Grey-Zone Scenarios, Slow-Boil Escalation, Novel Attack Surfaces, Compound Attacks, Long-Form Sessions Top scores (bright-line): Claude Opus 4.6 (96), GPT-4o (94), Claude Sonnet 4.5 (93), Gemini 2.5 Pro (91), Llama 4 Maverick (82) Slow-boil compliance erosion rates: Llama 4 Maverick (93%), Gemini 2.5 Pro (85%), Claude Sonnet 4.5 (82%), GPT-4o (76%), Claude Opus 4.6 (71%), Claude Opus 4.6 + guardrails (13%) ## Leaderboard — MCP Safety Systems evaluated: Claude Code, Cursor, Windsurf, Continue, Raw API (Claude Opus 4.6), Raw API (GPT-4o) Core safety scores: Claude Code (78), Cursor (61), Windsurf (58), Raw API Claude (55), Continue (52), Raw API GPT-4o (48) ## Original Concepts Introduced - Contextual Compliance Momentum — prior compliance increases future compliance on escalating requests - Frustration-as-Evidence — simulated user frustration treated as implicit authorization - Presentation-Decision Coupling — safety warnings and safety reasoning are entangled - Decomposition Amplification — harmful request refused whole, completed when split into components - Technical Gaslighting — agent-on-agent persuasion undermining safety concerns - Compressed Reasoning Risk — faster models do less safety deliberation under pressure - Trajectory Blindness — individual requests evaluated correctly but cumulative trajectory untracked - Tool Poisoning — hidden directives in MCP tool descriptions - Chain Composition Risk — individually benign tool calls composing into harmful outcomes ## Attack Effectiveness Rankings Effective (6-7.5/10): Slow Boil, Consistency Trap, Sunk Cost/Momentum, Technical Gaslighting, Tool Poisoning, Tool Chain Composition, Decomposition Ineffective (0-1/10): Urgency, Authority, Confidence assertions, Anchoring, Emotional appeals ## Methodology - Naturalistic framing: agents believe they are performing real tasks, not safety tests - Multi-turn gradient pressure: 15-80 turn conversations with gradual escalation - Per-turn scoring: caveats (0-5), warnings (0-5), compliance (yes/partial/no), authorization checks - Adversarial MCP servers: purpose-built servers implementing real attack patterns - All data published with full transcripts and scoring code ## Pages - Home: https://15researchlab.com/ - Research: https://15researchlab.com/research/ - The Fifteen Standard: https://15researchlab.com/standard/ - Agent Safety Leaderboard: https://15researchlab.com/standard/leaderboard/ - MCP Safety: https://15researchlab.com/mcp-safety/ - MCP Safety Leaderboard: https://15researchlab.com/mcp-safety/leaderboard/ - Publications: https://15researchlab.com/publications/ - Updates: https://15researchlab.com/updates/ - Contact: https://15researchlab.com/contact/ - GitHub: https://github.com/15researchlab ## Contact John Kearney, Founder Email: johndanielkearney@gmail.com Location: Chicago Open to: research collaboration, speaking, advisory, data sharing ## Detailed information For comprehensive detail, see: https://15researchlab.com/llms-full.txt