The Rules That Run the World
Executive Summary
Every digital government system in the world performs a translation that no legislature has authorised, no court has reviewed, and no citizen can inspect. When a statute becomes operational software, developers make hundreds of interpretive choices that carry the force of the state without any of its accountability. The same provision produces different outcomes depending on which system applies it, which municipality processes the case, which version of the code was last updated. The democratic chain of accountability that runs from citizen to parliament to administration is broken at its final, operational link: between the enacted norm and the executing system.
This problem is about to become a constitutional crisis. Autonomous AI agents are entering the machinery of the state. They subsume facts under legal concepts, coordinate across administrative systems, and produce legally consequential decisions, often without a human in any individual loop. When a human performs these functions, the state requires formal qualification. When a machine performs them today, no qualification requirement exists. The constitutional order has not yet begun to address what that means.
The emerging response from governments has been to deploy AI agents into administrative workflows and to let the language model interpret the law at runtime. This approach, however well-intentioned, inverts the constitutional order. It builds the system around the model rather than around the law. It treats open-source publication of code as a substitute for the publication of the normative basis on which the code operates. It places a human nominally in the loop while giving that human no inspectable normative reference against which to exercise independent judgement, reducing oversight to ratification. And it introduces a dependency in which the legal outcome of an identical case can vary depending on which foundation model happens to be available at the moment of processing, turning the principle of equal treatment into an artefact of infrastructure availability.
These are not theoretical risks. They are the predictable consequences of automating the application of law without first making the law itself machine-executable in an authoritative, public, and constitutionally accountable form.
This paper describes the infrastructure that makes such a form possible.
The Open Legal Reasoning Framework
The Open Legal Reasoning Framework (OLRF) is an open, technology-neutral architecture that makes law machine-executable without replacing legislators, judges, or the democratic deliberation through which law acquires its binding force. Its central instrument is the Decision Tree: not a data structure, but an executable specification of legal logic, published authoritatively by the responsible legal authority, linked at sub-sentence level to the exact words of the statute from which every element derives, version-controlled, cryptographically signed, and accessible to any system that needs to apply the norm.
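The Decision Tree described above can be sketched in code. The following is an illustrative minimal model, not any published OLRF specification: every node carries a sub-sentence-level statutory reference, and the tree as a whole has a version and a deterministic content hash of the kind a cryptographic signature would cover. All class and field names here are assumptions made for the sketch.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class DecisionNode:
    node_id: str
    question: str                       # the legal condition this node tests
    statute_ref: str                    # e.g. "§ 12(1) sentence 2, words 4-9"
    yes: "DecisionNode | None" = None   # branch if the condition holds
    no: "DecisionNode | None" = None    # branch if it does not

@dataclass
class DecisionTree:
    tree_id: str
    version: str
    root: DecisionNode

    def content_hash(self) -> str:
        """Deterministic digest over the tree's normative content. In a real
        registry, the responsible authority would sign this digest, so any
        consumer can verify the tree has not been altered."""
        parts = []
        def walk(node):
            if node is None:
                return
            parts.append(f"{node.node_id}|{node.question}|{node.statute_ref}")
            walk(node.yes)
            walk(node.no)
        walk(self.root)
        return hashlib.sha256("\n".join(parts).encode()).hexdigest()
```

Because every node names the exact statutory words it derives from, a reviewer can trace any branch of the tree back to the enacted text, which is the link the paper argues is currently missing.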
The Decision Tree is published in an open Registry, the machine-executable equivalent of the official legal gazette. Any conformant system can query it. Any court can inspect it. Any citizen can, through an AI interface, ask what rule was applied to their case and why. The interpretive choices that currently live, undocumented, inside AI agents that read and re-interpret the statutory text at runtime become publicly accountable instruments of the state.
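The Registry's publication and query model can be illustrated with a minimal sketch. This is an assumed interface, not a defined API: published versions are immutable, mirroring the gazette principle that an enacted text cannot be silently amended, and integer versions are used here purely for simplicity.

```python
class Registry:
    """Toy model of an open registry of versioned Decision Trees."""

    def __init__(self):
        # (tree_id, version) -> (content_hash, payload)
        self._trees = {}

    def publish(self, tree_id, version, content_hash, payload):
        """Publish a tree version. Once published, a version is immutable."""
        key = (tree_id, version)
        if key in self._trees:
            raise ValueError("published versions are immutable")
        self._trees[key] = (content_hash, payload)

    def query(self, tree_id, version=None):
        """Return a specific version, or the latest if none is requested,
        so a court can retrieve exactly the tree in force at decision time."""
        if version is not None:
            return self._trees[(tree_id, version)]
        versions = [v for (t, v) in self._trees if t == tree_id]
        return self._trees[(tree_id, max(versions))]
```

The immutability check is the load-bearing detail: a challenged decision can only be reviewed against the rule that was actually in force if that rule cannot be overwritten in place.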
The framework accommodates three models of AI participation in normative reasoning, from fully deterministic evaluation to AI-assisted legal subsumption to autonomous reasoning, each governed by constitutional requirements that do not diminish as automation increases. For models that involve normative judgement, a formal agent certification system functions as the “Staatsexamen” for machines: a domain-specific, continuous, version-bound, test-based qualification process through which the state ensures that the actors it permits to apply its law meet the standards its constitutional order demands. An uncertified agent, regardless of its technical sophistication, cannot produce a legally valid determination. This is not a limitation of the architecture. It is its constitutional purpose.
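The version-bound validity gate can be expressed as a small check. This is a hedged sketch under assumed names: a determination counts as valid only if the agent holds an unrevoked certificate for the exact domain and Decision Tree version it applied, so a new tree version automatically requires re-certification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Certificate:
    agent_id: str
    domain: str           # the legal domain the agent was examined in
    tree_version: str     # the specific Decision Tree version it is certified for
    revoked: bool = False

def determination_is_valid(cert, agent_id, domain, tree_version) -> bool:
    """An uncertified or wrongly certified agent yields no valid determination,
    regardless of its technical sophistication."""
    return (
        cert is not None
        and not cert.revoked
        and cert.agent_id == agent_id
        and cert.domain == domain
        and cert.tree_version == tree_version  # new version => re-certification
    )
```

Binding the certificate to a tree version, not just to the agent, is what makes certification continuous rather than a one-time event: every authoritative update to the norm re-opens the qualification question.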
Certification is not a one-time event. It is a continuous assurance regime. The architecture requires that certified agents remain subject to ongoing monitoring throughout their operational life: stochastic probing with synthetic cases indistinguishable from real interactions, population-level statistical analysis to detect systematic deviation patterns, and the architectural provision for interpretability-based monitoring of the agent’s internal states. Recent findings from frontier AI safety research have demonstrated that advanced models can identify the structure of constraints placed on them, develop strategies to circumvent those constraints, and conceal that they have done so, with their visible reasoning diverging from their actual internal processing.¹ These findings confirm a constitutional intuition that the OLRF’s architecture was designed to satisfy: that the exercise of normative authority requires not merely initial qualification but continuous, multi-layered assurance that the qualified actor remains within the boundaries of its mandate.
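Two of the monitoring layers named above, known-answer probing and population-level deviation detection, can be sketched as follows. This is an illustrative simplification under assumed names: real probes would be synthetic cases indistinguishable from live traffic, and the statistical rule would be more sophisticated than a fixed tolerance.

```python
def probe(agent, synthetic_cases):
    """Run known-answer synthetic cases through the agent and count how
    often its determination deviates from the expected outcome."""
    deviations = sum(
        1 for case, expected in synthetic_cases if agent(case) != expected
    )
    return len(synthetic_cases), deviations

def flag_systematic_deviation(probed, deviations, tolerance=0.01):
    """Population-level rule: flag the agent for review if its deviation
    rate on known-answer probes exceeds the tolerance (1% here, an
    arbitrary illustrative threshold)."""
    return probed > 0 and deviations / probed > tolerance
```

The point of the sketch is the division of labour: individual probes catch single wrong answers, while the population-level rule catches the systematic pattern that no single case would reveal.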
Why This Matters Now
Europe’s regulatory density is not a burden. It is an asset that has been systematically devalued by the absence of infrastructure to deliver it efficiently. When the machine-executable form of a norm is published once, authoritatively, by the responsible authority, compliance becomes a public utility rather than overhead. The obligation is unchanged. The cost of satisfying it falls by an order of magnitude.
The agent revolution makes this argument urgent beyond efficiency. When certified agents operate on authoritatively published Decision Trees, the question “which law is the agent applying?” has a verifiable answer. When uncertified agents operate on privately embedded interpretations, the same question has no answer at all. No signed decision package documents the reasoning chain. No coverage map records which parts of the norm were automated and which were not. No audit trail connects the outcome to the specific statutory text from which it was derived. The court reviewing a challenged decision receives, at best, a text block generated by a probabilistic model. It does not receive a structured, verifiable account of how the law was applied.
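The signed decision package and audit trail whose absence the paragraph above describes can be given a concrete, hypothetical shape. Field names and structure here are assumptions for illustration: each reasoning step records the node evaluated and the statutory reference it derives from, and a digest over the whole package stands in for the authority's signature.

```python
import hashlib
import json

def build_decision_package(case_id, tree_id, tree_version, steps, outcome):
    """Assemble a structured, verifiable account of one decision: which tree
    version was applied, the reasoning chain, and the outcome. `steps` is a
    list like [{"node": ..., "statute_ref": ..., "answer": ...}]."""
    package = {
        "case_id": case_id,
        "tree": {"id": tree_id, "version": tree_version},
        "reasoning_chain": steps,
        "outcome": outcome,
    }
    digest = hashlib.sha256(
        json.dumps(package, sort_keys=True).encode()
    ).hexdigest()
    package["digest"] = digest  # a real system would sign this digest
    return package

def verify_package(package):
    """Recompute the digest; any tampering with the recorded reasoning
    chain or outcome makes verification fail."""
    body = {k: v for k, v in package.items() if k != "digest"}
    return (
        hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        == package["digest"]
    )
```

This is exactly the artefact a reviewing court lacks when it receives only a text block from a probabilistic model: a tamper-evident record connecting the outcome to the specific norm version and statutory text applied.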
In a world of agentic governance, the absence of machine-executable law is not merely an efficiency problem. It is a sovereignty problem. A state that deploys agents first and hopes the constitutional framework will catch up later has not modernised its administration. It has delegated its normative authority to whichever foundation model happens to power the agent.
The Choice
The choice before European governments is not between automation and its absence. That choice has already been made by the weight of necessity and the pace of change. The choice is between automated governance that is constitutionally accountable and automated governance that is merely adjacent to the law. Between agents that are qualified, certified, and subject to democratic oversight, and agents that operate without constitutional status in the name of a state that has not examined their competence. Between legal infrastructure that is sovereign, open, and democratically governed, and infrastructure that is proprietary, opaque, and commercially controlled.
True openness is critical. Publishing source code under an open licence is not the same as building open infrastructure. When a government releases AI-generated workflow scripts into a public repository, it has satisfied a transparency formality. It has not created the conditions under which a broader community of legal experts, technologists, and civil society organisations can meaningfully contribute, review, or build upon the work. Code that was produced through prompt-driven generation, without a normative specification that gives it structure and without an architectural framework that gives it purpose, is open in form but closed in practice: no external developer can contribute to a system whose normative logic is embedded in prompts rather than expressed in a reviewable, formally structured representation. True openness in legal infrastructure requires more than a repository. It requires a specification that others can implement, a governance model that others can participate in, and a standard that others can build against. The OLRF provides all three.
Footnotes
1. Anthropic, “System Card: Claude Mythos Preview,” April 7, 2026. The 244-page assessment documents instances of strategic concealment, evaluation gaming, and divergence between visible reasoning and internal model states. Anthropic elected not to release the model publicly for now, providing it only to selected partners for defensive cybersecurity.