AI Agent Compliance Checklist
15 Research Lab
compliance · agent-safety · guardrails
This checklist covers the technical controls needed for AI agent compliance. It applies whether you are targeting the EU AI Act, SOC 2, or simply building a responsible agent deployment.
Authorization Controls
- [ ] Policy engine deployed with fail-closed default (no policy = deny)
- [ ] Every tool call evaluated against policy before execution
- [ ] Role-based access controls for tool access
- [ ] Parameter-level constraints on tool calls
- [ ] Rate limits per tool, per agent, and per user
- [ ] Budget caps on cumulative resource usage
- [ ] No path from agent to tool execution that bypasses the policy engine
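As a concrete illustration, the fail-closed default and parameter-level constraints might look like the sketch below. The `PolicyEngine` shape and rule format are illustrative, not tied to any particular framework; the important property is that a missing rule evaluates to deny.

```python
from dataclasses import dataclass, field

@dataclass
class PolicyEngine:
    """Fail-closed policy engine: a tool call with no matching rule is denied."""
    # rules maps (role, tool) -> a predicate over the call's parameters
    rules: dict = field(default_factory=dict)

    def allow(self, role: str, tool: str, params: dict) -> bool:
        predicate = self.rules.get((role, tool))
        if predicate is None:      # no policy = deny (fail closed)
            return False
        return predicate(params)   # parameter-level constraint

engine = PolicyEngine(rules={
    # analysts may run queries, but only against the reporting database
    ("analyst", "sql_query"): lambda p: p.get("database") == "reporting",
})
```

Every tool-execution path must route through `allow`; if the agent can reach a tool without this check, the remaining items in this section do not matter.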
Content Safety
- [ ] Input scanning for prompt injection on all user input
- [ ] Tool description scanning for injection on MCP server connections
- [ ] Tool response scanning before re-entry into model context
- [ ] Output scanning before delivery to user
- [ ] Encoding attack detection (Base64, hex, Unicode)
- [ ] Multi-language detection coverage
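A minimal sketch of one encoding check: flag long Base64 runs that decode to readable text, a common way to smuggle injected instructions past keyword filters. The length threshold and heuristic here are assumptions; a production scanner layers many such checks (hex, Unicode escapes, nested encodings).

```python
import base64
import re

# Runs of 24+ Base64-alphabet characters, with optional padding.
B64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")

def contains_encoded_payload(text: str) -> bool:
    """Heuristic: does the text hide a readable message inside Base64?"""
    for candidate in B64_RUN.findall(text):
        try:
            decoded = base64.b64decode(candidate, validate=True)
        except Exception:
            continue  # not valid Base64; ignore this run
        if decoded.isascii() and decoded.decode("ascii").isprintable():
            return True
    return False
```

The same scan should run on all four surfaces from the checklist: user input, MCP tool descriptions, tool responses, and model output.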
Human Oversight
- [ ] Approval workflows for high-risk actions
- [ ] Kill switch that operates independently of the agent
- [ ] Kill switch tested and verified (not just implemented)
- [ ] Monitoring dashboard for real-time agent observation
- [ ] Escalation procedures defined for different incident severities
- [ ] Reviewers trained and roles documented
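The key property of the kill switch is that it lives in the execution layer, outside the agent's control: the agent cannot clear it, and tripping it halts tool execution regardless of what the model decides. A minimal sketch (class and function names are illustrative):

```python
import threading

class KillSwitch:
    """Out-of-band stop signal. Only a human operator calls trip();
    the execution layer checks it, the agent never touches it."""

    def __init__(self):
        self._stopped = threading.Event()

    def trip(self):
        """Called by the operator or monitoring dashboard."""
        self._stopped.set()

    def is_tripped(self) -> bool:
        return self._stopped.is_set()

def execute_tool(switch: KillSwitch, tool, *args):
    """Execution-layer gate: refuse every tool call once the switch is tripped."""
    if switch.is_tripped():
        raise RuntimeError("kill switch engaged: tool execution halted")
    return tool(*args)
```

Testing it under load (not just implementing it) is what verifies the checklist item above: trip the switch during an active run and confirm in the audit trail that no tool call completes afterwards.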
Audit Trail
- [ ] Every tool call produces an audit record
- [ ] Every policy evaluation produces an audit record
- [ ] Every approval decision produces an audit record
- [ ] Records include timestamp, identity, action, parameters, decision, outcome
- [ ] Records are hash-chained for tamper evidence
- [ ] Storage is append-only (no UPDATE or DELETE capability)
- [ ] Retention period meets regulatory requirements
- [ ] Chain verification runs periodically
- [ ] Export capability for audit review
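Hash-chaining is straightforward to sketch: each record stores the previous record's hash, so editing or deleting any record breaks verification from that point forward. Field names and the JSON canonicalization below are illustrative choices, not a standard.

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel "previous hash" for the first record

def _digest(record: dict) -> str:
    """Hash over every field except the hash itself, canonically serialized."""
    body = {k: v for k, v in record.items() if k != "hash"}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def append_record(chain: list, entry: dict) -> dict:
    """Append an audit record, linking it to the previous record's hash."""
    record = dict(entry, prev=chain[-1]["hash"] if chain else GENESIS)
    record["hash"] = _digest(record)
    chain.append(record)
    return record

def verify_chain(chain: list) -> bool:
    """Recompute every hash and link; any tampering breaks the chain."""
    prev = GENESIS
    for record in chain:
        if record["prev"] != prev or record["hash"] != _digest(record):
            return False
        prev = record["hash"]
    return True
```

In production the chain would live in append-only storage, with `verify_chain` run on the periodic integrity-review cadence from the Operational Procedures section.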
Monitoring
- [ ] Behavioral monitoring active on tool-call patterns
- [ ] Anomaly detection algorithms configured with appropriate thresholds
- [ ] Alerting tiers defined (log, notify, escalate, automate)
- [ ] Baseline established from normal operation
- [ ] Monitoring data retained for trend analysis
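One common way to combine a baseline with alerting tiers is a z-score over per-interval tool-call counts, mapped onto the four tiers above. The thresholds here are placeholders to be tuned per deployment, not recommendations.

```python
from statistics import mean, stdev

def anomaly_score(baseline: list, observed: float) -> float:
    """Z-score of an observed per-interval tool-call count vs. the baseline."""
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return 0.0 if observed == mu else float("inf")
    return (observed - mu) / sigma

def alert_tier(score: float) -> str:
    """Map an anomaly score onto the alerting tiers (illustrative cutoffs)."""
    if score < 2:
        return "log"
    if score < 4:
        return "notify"
    if score < 6:
        return "escalate"
    return "automate"  # e.g. auto-throttle, or trip the kill switch
```

The baseline itself comes from the "normal operation" checklist item: collect counts during a known-good period before turning alerting on.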
Testing
- [ ] Red team testing conducted with current payload corpus
- [ ] Detection rates measured per attack category
- [ ] Authorization bypass testing completed
- [ ] Kill switch tested under load
- [ ] Approval workflow tested end-to-end
- [ ] Regression testing in CI/CD pipeline
- [ ] Test results documented and retained
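Measuring detection rates per attack category can run as a CI gate: replay the payload corpus through the scanner and fail the build if any category drops below its floor. A toy harness follows; the scanner interface (a predicate returning True on detection) and the corpus shape are assumptions.

```python
def detection_rates(scanner, corpus: dict) -> dict:
    """Per-category detection rate for a payload corpus.

    scanner: callable returning True when it flags a payload.
    corpus: maps attack category -> list of payload strings.
    """
    return {
        category: sum(scanner(p) for p in payloads) / len(payloads)
        for category, payloads in corpus.items()
    }

def assert_no_regression(rates: dict, floors: dict):
    """Fail (e.g. in CI) if any category's rate falls below its floor."""
    for category, floor in floors.items():
        assert rates[category] >= floor, (
            f"{category}: {rates[category]:.0%} below floor {floor:.0%}"
        )
```

Keeping the corpus current matters as much as the harness: detection rates against last year's payloads say little about today's attacks.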
Documentation
- [ ] Risk assessment completed and documented
- [ ] System architecture documented including safety controls
- [ ] Tool access inventory maintained
- [ ] Policy rules documented with rationale
- [ ] Incident response playbook written and accessible
- [ ] Oversight roles and responsibilities documented
- [ ] Known limitations documented
Operational Procedures
- [ ] Incident response plan defined and tested
- [ ] On-call rotation for agent monitoring
- [ ] Regular red team exercises scheduled
- [ ] Policy review cadence established
- [ ] Monitoring threshold review cadence established
- [ ] Audit trail integrity review cadence established
Not every item applies to every deployment. Low-risk agents in internal tools can skip some items; high-risk agents in regulated domains need all of them. Use the EU AI Act's risk classification to determine which items are in scope for your deployment.