📄 Published Research: peer-reviewed paper, available on arXiv.
Full paper: arxiv.org/abs/2604.25684
The rapid deployment of autonomous AI agents across enterprise, healthcare, and safety-critical environments has created a fundamental governance gap. Existing approaches (runtime guardrails, training-time alignment, and post-hoc auditing) treat governance as an external constraint rather than an internalised behavioural principle, leaving agents vulnerable to unsafe and irreversible actions.
The Human Analogy
This paper starts with a deceptively simple observation: humans self-govern naturally. Before acting, humans engage deliberate cognitive processes grounded in executive function, inhibitory control, and internalised organisational rules. A junior employee presented with an unusual instruction does not simply execute; they deliberate: Is this permissible? Does this need approval? Who do I escalate to?
The paper proposes that AI agents should work the same way — not because rules are imposed upon them, but because deliberation is embedded in how they think.
The Pre-Action Governance Reasoning Loop (PAGRL)
The framework formalises a Pre-Action Governance Reasoning Loop in which agents consult a four-layer governance rule set before every consequential action:
- Global rules — organisation-wide constraints that apply to every agent in every context
- Workflow-specific rules — constraints scoped to the specific business process the agent is operating within
- Agent-specific rules — constraints derived from the agent's defined role and authority level
- Situational rules — context-aware constraints evaluated against the specific action being considered
This hierarchy mirrors how human organisations structure compliance across enterprise, department, and individual role levels — with escalation paths that activate when lower layers cannot resolve a governance question.
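As a concrete illustration, here is a minimal Python sketch of what consulting such a layered rule set before an action might look like. Everything here (the class names, the first-match precedence between layers, the escalation fallback) is an illustrative assumption, not the paper's implementation:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional

class Layer(Enum):
    GLOBAL = 1       # organisation-wide constraints
    WORKFLOW = 2     # scoped to the current business process
    AGENT = 3        # derived from the agent's role and authority
    SITUATIONAL = 4  # evaluated against the specific action

class Verdict(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ESCALATE = "escalate"  # hand the decision to human oversight

@dataclass
class Rule:
    layer: Layer
    description: str
    # Returns a Verdict if the rule applies to the action, else None.
    applies: Callable[[dict], Optional[Verdict]]

def pre_action_check(action: dict, rules: list[Rule]) -> tuple[Verdict, str]:
    """Consult the four layers in order before a consequential action.

    The first rule that resolves the question determines the outcome;
    if no layer resolves it, the decision escalates to a human.
    """
    for layer in Layer:  # Enum iteration preserves definition order
        for rule in (r for r in rules if r.layer == layer):
            verdict = rule.applies(action)
            if verdict is not None:
                return verdict, f"{layer.name}: {rule.description}"
    return Verdict.ESCALATE, "UNRESOLVED: no layer produced a verdict"
```

First-match precedence is only one plausible reading of the hierarchy; a real deployment might instead collect verdicts from every applicable layer and take the most restrictive one.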
Production Results
Implemented on a production-grade retail supply chain workflow, the framework achieved:
- 95% compliance accuracy — agents correctly govern their own actions in 95 of 100 consequential decisions
- Zero false escalations — no unnecessary interruptions to human oversight, preserving operational efficiency
- Fully auditable reasoning — every governance decision is logged with the rule layer that determined the outcome
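A hypothetical audit record, reusing the names from the sketch above, might look like the following; the field names and JSON shape are assumptions, not the paper's logging schema:

```python
import json
from datetime import datetime, timezone

def log_decision(agent_id: str, action: dict,
                 verdict: Verdict, reason: str) -> str:
    """Serialise one governance decision, including the deciding layer."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent": agent_id,
        "action": action,
        "verdict": verdict.value,
        "deciding_rule": reason,  # e.g. "AGENT: orders above limit escalate"
    }
    return json.dumps(entry)
```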
Why External Guardrails Are Insufficient
Runtime guardrails intercept actions after the agent has already decided to take them. Training-time alignment influences general behaviour but cannot account for novel enterprise-specific governance requirements. Post-hoc auditing finds violations after they have occurred — often after irreversible consequences.
Embedding governance into the reasoning process itself produces more consistent, explainable, and auditable compliance. It is the difference between an employee who follows rules because they have internalised them, and one who follows rules only when watched.
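To make the timing difference concrete, a hedged sketch reusing the earlier names: an external guardrail can only veto an action the agent has already chosen, while embedded deliberation makes the verdict an input to choosing in the first place. The two helper functions are stubs invented purely for illustration:

```python
def execute(action: dict) -> None:                    # stub for illustration
    print("executing:", action["type"])

def request_human_approval(action: dict, reason: str) -> None:  # stub
    print("escalating for approval:", reason)

# External guardrail: the decision is already made; the check sits at the
# boundary and can only block the chosen action after the fact.
def guarded_execute(action: dict, rules: list[Rule]) -> None:
    verdict, _ = pre_action_check(action, rules)
    if verdict is Verdict.DENY:
        raise PermissionError("blocked by runtime guardrail")
    execute(action)

# Embedded deliberation: the verdict shapes which action gets chosen, so
# non-compliant candidates never become the agent's plan at all.
def deliberate(candidates: list[dict], rules: list[Rule]) -> Optional[dict]:
    for action in candidates:
        verdict, reason = pre_action_check(action, rules)
        if verdict is Verdict.ALLOW:
            return action
        if verdict is Verdict.ESCALATE:
            request_human_approval(action, reason)
            return None  # wait for human oversight before proceeding
    return None  # no compliant candidate; take no action
```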
Implications for Enterprise AI Deployment
The EU AI Act requires demonstrable human oversight mechanisms for high-risk AI systems. ISO 42001 requires evidence that AI systems operate within defined governance boundaries. The PAGRL framework provides both — not as bolted-on controls, but as architectural properties of the agent's reasoning process itself.
Read the full paper: arxiv.org/abs/2604.25684