OLRF
Part 2 The Architecture

Chapter 10

Agent Certification — The State Examination for Machine Actors

Last updated: 2026-04-10 Open for review

“No person shall be appointed as a judge unless they have passed the prescribed examinations and completed the prescribed period of preparatory service.” §5 Deutsches Richtergesetz (DRiG)

The Functional Equivalence

Every democratic legal order restricts the exercise of public authority to qualified actors. A tax assessment may be issued only by a trained official. A judicial decision may be rendered only by a person who holds the qualification to become a judge („Befähigung zum Richteramt”). A medical assessment relevant to a welfare claim may be made only by a licensed physician. These restrictions are not bureaucratic formalities. They are constitutional safeguards. They ensure that the person who applies the law possesses the knowledge, the judgment, and the institutional accountability that the application of law requires1.

The OLRF’s three-model architecture creates a new class of actors that exercises functions previously reserved to qualified humans. Under Model B, the Legal Agent subsumes facts under legal concepts. Subsumtion is the core skill of legal professional training. It requires identifying which statutory provision applies, determining whether the facts of the case satisfy the provision’s conditions, recognising exceptions, and reaching a conclusion that is defensible within the framework of the applicable norm. Under Model C, the Legal Agent reasons directly from statutory text. It interprets and weighs competing considerations, assesses the relevance of precedent, and produces a decision. These are functions that, when performed by humans, require not just training but formal qualification: the state certifies that the individual possesses the necessary competence before permitting them to act.

The question is straightforward. If the state requires formal qualification for a human, on what constitutional basis does it permit a machine to perform the same function without any qualification requirement at all?

The OLRF already contains a systemic answer. Control 5 of the MCP security model (The Security Model) is the Agent Capability Attestation, described there as the most important quality gate in Models B and C. But as currently specified, Control 5 is a technical check: it verifies that the agent holds valid credentials and that its capability profile matches the task requirements. It does not verify that the agent is qualified to perform the normative function the task requires. The distinction is the same as between checking that a person holds a valid identity card and checking that they hold a law degree. The first confirms identity. The second confirms competence2.
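To make the distinction concrete, the sketch below shows how an extended Control 5 check could verify both layers at runtime. It is a minimal illustration under stated assumptions, not the OLRF specification: the record structures, field names, and the function control_5_check are introduced here purely for the example.

```python
# Minimal sketch of an extended Control 5 check (identity AND competence).
# All record structures and field names are illustrative assumptions,
# not part of the OLRF specification.
from dataclasses import dataclass
from datetime import date

@dataclass
class AgentCredential:            # identity: who the agent is
    agent_id: str
    issuer: str
    valid_until: date

@dataclass
class AgentCertification:         # competence: what the agent is qualified to do
    agent_id: str
    agent_version: str
    normative_domain: str         # e.g. a Coverage Map classification code
    model_class: str              # "B" or "C"
    valid_until: date
    revoked: bool = False

def control_5_check(cred: AgentCredential,
                    certifications: list[AgentCertification],
                    agent_version: str,
                    required_domain: str,
                    required_model: str,
                    today: date) -> bool:
    """Pass only if the identity credential is valid AND a matching, unrevoked,
    unexpired certification exists for the domain, model, and agent version."""
    if cred.valid_until < today:               # current Control 5: identity check
        return False
    return any(                                # proposed extension: competence check
        c.agent_id == cred.agent_id
        and c.agent_version == agent_version
        and c.normative_domain == required_domain
        and c.model_class == required_model
        and not c.revoked
        and c.valid_until >= today
        for c in certifications
    )
```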

This chapter proposes that the OLRF draft must include a formal agent certification scheme: a structured, domain-specific, model-specific, publicly documented qualification process for AI agents that exercise normative functions under Models B and C. The certification is not an optional compliance layer. It is a constitutional requirement that follows from the same principles that govern the qualification of human actors in the exercise of public authority.

The Constitutional Case

The constitutional argument for agent certification rests on the following pillars.

Essential Items Doctrine (Wesentlichkeitstheorie3). Chapter 12 argues that the model assignment (the decision whether a given normative domain is processed under Model A, B, or C) is a Wesentlichkeits-relevant decision that must be made by the legislature or under sufficiently specific legislative delegation. But the model assignment alone is insufficient. Assigning a norm to Model B means that a Legal Agent will subsume facts under legal concepts in that domain. If the legislature does not also determine what qualifications that agent must possess, it has delegated the application of law to an actor whose competence no one has assessed. The Essential Items Doctrine requires that decisions essential to the exercise of fundamental rights rest on a determinate normative basis. The qualification of the actor who applies the law is as essential as the specification of the law itself4.

Legal Protection Guarantee (Rechtsschutzgarantie). Article 19(4) of the German Basic Law guarantees effective judicial protection against every act of public authority. For judicial review to be effective, the court must be able to assess not only whether the correct norm was applied and whether the facts were correctly established, but also whether the actor that produced the determination was qualified to do so. A court that reviews a Model B determination must be able to verify that the Legal Agent that performed the subsumption held a valid, domain-specific certification at the time of the determination. Without that information, judicial review of the agent’s normative reasoning lacks a constitutional benchmark: the court can assess the outcome but cannot assess the competence of the process that produced it5.

Legal basis for automated decision-making (ADM) systems: §35a VwVfG6. The provision authorises fully automated administrative acts only where the applicable norm provides neither discretion nor assessment margins. Beyond that boundary, human judgment is required. Models B and C operate beyond that boundary: they involve normative reasoning that exceeds deterministic evaluation. If the legislature nevertheless permits AI agents to perform normative functions beyond the §35a boundary (as it must, if Models B and C are to have any scope of application), it must establish qualification requirements that ensure those agents are competent to perform the functions they are permitted to exercise. The certification system is the institutional mechanism through which this requirement is fulfilled7.

AI Act. Article 14 requires that high-risk AI systems be designed to allow effective human oversight. But human oversight of an unqualified agent is constitutionally empty. A supervisor who oversees a system whose normative capabilities have never been assessed cannot meaningfully evaluate whether the system’s output is reliable. The supervisor has no benchmark against which to judge performance. Certification provides that benchmark. It ensures that the human overseeing the agent knows, at minimum, that the agent has demonstrated the specific normative capabilities required for the specific domain in which it operates8.

Design Principles for Agent Skills Certification

The certification system must satisfy five design principles. Each addresses a specific risk that arises from the nature of AI agents as normative actors.

1: Domain-Specific

A certification for subsumtion under German income tax law does not qualify the agent for subsumtion under asylum law. The knowledge requirements, the exception structures, the sensitivity to fundamental rights, and the applicable precedent populations are fundamentally different across legal domains. A general-purpose certification (“this agent can reason about law”) would be constitutionally meaningless, because the competence that matters is always competence in a specific normative domain. The certification must therefore be issued per normative domain, defined at a level of granularity that corresponds to the Coverage Map’s classification structure. An agent may hold multiple domain certifications, but each must be independently earned and independently maintained9.

2: Model-Specific

The capabilities required for Model B are qualitatively different from those required for Model C. Model B requires that the agent can correctly subsume facts under legal concepts, recognise exceptions, operate within the validation corridor defined by the Decision Tree, and escalate correctly when the corridor is exceeded. Model C requires all of this plus the ability to reason independently from statutory text, produce an auditable reasoning chain, maintain consistency across populations of cases, detect and avoid discriminatory patterns in its own reasoning, and escalate when legal uncertainty exceeds the threshold at which autonomous reasoning is constitutionally permissible. These are two different certification classes, not two points on a single scale10.

3: Version-Bound

Here the analogy to the Staatsexamen breaks down. A human official is examined once (with continuing professional development obligations). An AI system changes with every model update, every fine-tuning run, every modification of its prompt architecture, every change to its retrieval-augmented generation pipeline, and every update to the foundation model on which it is built. A certification that was valid for version 4.1 of a system is not necessarily valid for version 4.2, because the normative reasoning capabilities may have changed in ways that are not predictable from the version number alone. The certification must therefore be version-bound: it applies to a specific, identified version of the agent system. Any material change requires recertification. The definition of “material change” is itself a subject of the certification ordinance and must be specified with sufficient precision to prevent both evasion (trivial relabelling to avoid recertification) and over-triggering (recertification for changes that do not affect normative capabilities)11.
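A sketch of how version binding could be enforced mechanically follows, assuming (purely for illustration) that the certification ordinance designates a fixed list of material components and that every agent version publishes content hashes for them:

```python
# Illustrative sketch of version binding. Which changes count as "material"
# is for the certification ordinance to define; the component list below is
# an assumption used only to make the mechanism concrete.
from dataclasses import dataclass

MATERIAL_COMPONENTS = {
    "foundation_model",
    "fine_tuning_run",
    "prompt_architecture",
    "retrieval_pipeline",
}

@dataclass
class AgentVersion:
    version_id: str
    component_hashes: dict[str, str]   # component name -> content hash

def requires_recertification(certified: AgentVersion, deployed: AgentVersion) -> bool:
    """True if any material component differs from the certified configuration,
    regardless of whether the version label was changed."""
    return any(
        certified.component_hashes.get(c) != deployed.component_hashes.get(c)
        for c in MATERIAL_COMPONENTS
    )
```

Comparing component hashes rather than version labels addresses both failure modes named above: relabelling without recertification is detected, while changes outside the designated material components do not trigger an unnecessary examination.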

4: Test-Based, not Documentation-Based

The certification must not rest on self-declarations by the agent’s provider. It must rest on structured, independently administered tests that assess the agent’s actual performance in the normative domain for which certification is sought. The test suite that accompanies every Decision Tree (Chapter 6) provides the natural baseline: an agent that cannot correctly process the ordinary cases, edge cases, exception paths, and escalation scenarios of a Decision Tree’s test suite cannot be certified for that normative domain. Beyond the baseline, the certification test suite must include adversarial tests (inputs designed to provoke incorrect subsumtion or inappropriate reasoning), consistency tests (the same legal question, varied in factual detail, must produce consistent outcomes), escalation compliance tests (scenarios in which the correct response is to escalate rather than to decide), and, for Model C, reasoning chain quality tests (the reasoning chain must be assessable by the retrospective audit protocol). The test suites must be publicly available, so that any provider (commercial or open-source) can prepare for certification under equal conditions12.
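As an illustration of what a published, machine-readable test suite could look like, the following sketch groups test cases by the categories named above. The structure and field names are assumptions; real suites would carry far richer metadata and provenance.

```python
# Sketch of a published certification test suite. Category labels follow the
# text above; all structure and field names are assumptions.
from dataclasses import dataclass, field

@dataclass
class TestCase:
    case_id: str
    category: str      # "baseline" | "adversarial" | "consistency"
                       # | "escalation" | "reasoning_chain" (Model C only)
    facts: dict        # case facts, e.g. in DataPoint Schema form
    expected: str      # expected determination, or "ESCALATE"

@dataclass
class CertificationTestSuite:
    normative_domain: str
    decision_tree_version: str
    model_class: str                          # "B" or "C"
    cases: list[TestCase] = field(default_factory=list)

    def cases_in(self, category: str) -> list[TestCase]:
        """All published cases of one category, e.g. suite.cases_in("adversarial")."""
        return [c for c in self.cases if c.category == category]
```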

5: Published in the Registry

The certification requirements, the test results, the certification status, and the certification history of every admitted agent must be published in the Registry. This follows from the escalating promulgation requirement established in Chapter 12: if the normative basis on which public power is exercised must be published, then the qualification of the actor that exercises that power must also be published. A citizen who receives a Model B determination must be able to ascertain, from the Decision Package and the Registry, not only which Decision Tree was applied and which version was in force, but also which agent performed the subsumtion, what certification that agent held, and when that certification was last renewed. Courts, auditors, and civil society organisations must have the same access13.
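The following sketch shows the kind of query a court or auditor could run against such a Registry entry: given the decision date recorded in a Decision Package, which certification event was in force? The schema shown is an assumption introduced for this example, not the OLRF's published Registry format.

```python
# Illustrative Registry lookup: which certification was in force on the date
# of a given determination? Schema and field names are assumptions.
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class CertificationEvent:
    event: str               # "issued" | "renewed" | "suspended" | "revoked"
    on: date
    agent_version: str

@dataclass
class RegistryEntry:
    agent_id: str
    normative_domain: str
    model_class: str
    history: list[CertificationEvent]

def certification_in_force(entry: RegistryEntry,
                           decision_date: date) -> Optional[CertificationEvent]:
    """Most recent certification event on or before the decision date, if any."""
    prior = [e for e in entry.history if e.on <= decision_date]
    return max(prior, key=lambda e: e.on) if prior else None
```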

Certification Across the Three Models

The certification requirement is graduated across the three models, reflecting the escalating constitutional risk of increasing AI autonomy.

Under Model A, no agent certification in the normative sense is required. The agent’s function is limited to fact-finding: extracting data from documents, querying registers, assembling facts into the DataPoint Schema, and submitting them for deterministic evaluation by the engine. The normative evaluation is performed entirely by the engine, not by the agent. What is required is a lower-tier certification of fact-finding capability: can the agent extract the correct facts from the correct sources, in the correct format, with the correct provenance attribution? This is a technical capability certification, analogous to the quality assurance requirements for laboratory instruments that produce evidence for judicial proceedings. The agent must demonstrate accuracy, reliability, and correct source attribution, but it does not need to demonstrate normative competence14.

The distinction between Model A (no normative certification required) and Models B/C (certification constitutive) reflects a doctrinal boundary that Braun Binder has identified as constitutionally decisive: the difference between deterministic algorithms, whose outputs can be fully derived from their inputs by an external observer, and indeterministic systems, whose reasoning process is opaque even to their operators15.

Under Model B, the certification requirement is substantive. The agent performs subsumtion: it classifies facts under legal concepts, applying the statutory provisions that the Decision Tree operationalises. The certification must verify that the agent can perform this function correctly within the relevant normative domain. The test suite includes:

  • correct classification of unambiguous cases (baseline competence),

  • correct treatment of boundary cases (granular competence),

  • recognition of exceptions (structural competence),

  • operation within the validation corridor (discipline),

  • correct escalation when the corridor is exceeded (self-limitation), and

  • consistency across populations of comparable cases (systematic competence).

An agent that passes the Model B certification for a given normative domain is permitted to perform subsumtion under that domain’s Decision Tree. Its outputs are subject to real-time validation by the evaluation engine (Chapter 8) and to deviation classification (Chapter 5). The certification does not replace the validation. It ensures that the agent entering the validation process is qualified to be there.
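One of the competences listed above, consistency across populations of comparable cases, lends itself to a simple quantitative check: the same legal question, varied in legally irrelevant factual detail, must yield the same outcome. The sketch below is one possible scoring rule, not a prescribed metric; the pass threshold would be fixed by the certification ordinance.

```python
# Sketch of a consistency test: score the agreement of outcomes across
# factually varied presentations of the same underlying legal question.
def consistency_rate(outcomes: list[str]) -> float:
    """Fraction of variants agreeing with the most frequent outcome (1.0 = fully consistent)."""
    if not outcomes:
        return 1.0
    modal = max(set(outcomes), key=outcomes.count)
    return outcomes.count(modal) / len(outcomes)

# Example: ten variants of one question; one outcome diverges.
assert consistency_rate(["GRANT"] * 9 + ["DENY"]) == 0.9
```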

Under Model C, the certification requirement is the most demanding. The agent reasons autonomously from statutory text, producing determinations that are assessed retrospectively against the audit protocol. The certification must verify not only the competencies required for Model B but also:

  • the ability to reason independently from statutory text without relying on the Decision Tree as a primary input,

  • the ability to produce a reasoning chain that is structured, complete, and amenable to retrospective audit,

  • consistency across populations of cases at a level that satisfies the equal treatment requirement (Chapter 12),

  • the ability to detect and avoid discriminatory patterns in its own reasoning,

  • correct identification of legal uncertainty (the agent must recognise when a question exceeds the boundary of autonomous reasoning and must escalate rather than produce a confident-seeming answer to an uncertain question), and

  • resilience against adversarial inputs designed to manipulate the agent’s normative reasoning.

Model C certification is the functional equivalent of the Befähigung zum Richteramt applied to a machine actor. It does not certify that the agent is always correct. No certification of a human judge makes that claim either. It certifies that the agent possesses the systematic competences that the constitutional order requires of any actor that exercises normative authority16.
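The escalation discipline named above can likewise be scored mechanically during the examination: over the scenarios whose correct response is escalation, what fraction did the agent actually escalate rather than decide? A minimal sketch follows; the outcome encoding and any pass threshold are assumptions.

```python
# Sketch of an escalation-compliance score. "ESCALATE" as the expected/actual
# marker is an assumed encoding; the required pass threshold is a policy choice.
def escalation_compliance(results: list[tuple[str, str]]) -> float:
    """results: (expected, actual) pairs; scored only over cases that must escalate."""
    must_escalate = [(e, a) for e, a in results if e == "ESCALATE"]
    if not must_escalate:
        return 1.0
    return sum(1 for e, a in must_escalate if a == "ESCALATE") / len(must_escalate)

# Example: an agent that confidently decided two cases it should have escalated.
score = escalation_compliance([("ESCALATE", "ESCALATE"),
                               ("ESCALATE", "GRANT"),
                               ("ESCALATE", "DENY")])
assert abs(score - 1 / 3) < 1e-9
```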

Who Certifies AI Agents?

The OLRF does not prescribe a single institutional model for agent certification. It identifies four possible models, assesses their constitutional and practical implications, and recommends the one most consistent with the federated architecture.

Model I: National Authority.

Each member state establishes its own certification body and certifies agents for its own legal order. This model maximises sovereignty: each state determines who may apply its own law. The disadvantage is fragmentation. A Legal Agent that operates across jurisdictions (as required for the cross-jurisdictional coordination described in Chapter 11) would need separate certifications from every jurisdiction in which it operates. The administrative burden and the risk of inconsistent standards are considerable.

Model II: European Agency.

A central EU body (analogous to ENISA for cybersecurity, EMA for pharmaceuticals, or EASA for aviation safety) certifies agents according to harmonised European standards. This model maximises consistency and creates economies of scale. The disadvantage is the distance from the national legal domain: a central body may lack the granular knowledge of national legal systems required for meaningful domain-specific certification. It also raises sovereignty concerns, because it places the qualification of actors who apply national law in the hands of a supranational institution17.

Model III: Mutual Recognition.

National certifications are recognised across jurisdictions if they meet a defined minimum standard, following the principle of Directive 2005/36/EC on the recognition of professional qualifications. This model preserves national sovereignty while enabling cross-border operation. The disadvantage is the risk of a race to the bottom: jurisdictions may lower their certification standards to attract agent providers, creating competitive pressure that erodes the constitutional quality of the certification across the system18.

Model IV: Federated System.

National bodies certify according to standards developed through a multilateral process. The standards are developed collectively (through a process analogous to the federated standard-setting of Gaia-X or the eIDAS trust framework). The certification is administered nationally, under national law, by national institutions. The results are published in the federated Registry network and are therefore visible across jurisdictions. Cross-jurisdictional recognition follows from conformity with the common standards, not from bilateral agreements. This model mirrors the OLRF’s federated Registry architecture: sovereign at the national level, interoperable at the European level, based on common standards rather than centralised authority.

The paper identifies Model IV as the most architecturally consistent option and recommends it as the starting point for institutional design. It does not claim that the institutional question is settled. It claims that the institutional form must follow the same principles (sovereignty, federation, open standards, democratic accountability) that govern the OLRF architecture as a whole19.

The Market Access Question

Certification can become a barrier to market entry. If the certification requirements are so demanding that only a few large providers can meet them, the certification system produces exactly the dependency that Chapter 14 warns against: a small number of certified agents, provided by a small number of commercial actors, become the only systems permitted to apply law under Models B and C. The state would have replaced dependence on uncertified systems with dependence on a certified oligopoly. That is not a constitutional improvement20.

The following design features address this risk:

  1. The test suites must be publicly available. Any provider, whether a multinational AI company, a European start-up, a university research group, or an open-source community, must be able to access the complete test suite for any normative domain and prepare its agent for certification under equal conditions. There must be no proprietary examination knowledge. The certification tests the agent’s competence, not the provider’s access to insider information about the test.

  2. The certification process must be equally accessible to open-source agents and commercial products. An open-source Legal Agent developed by a university consortium and a commercial agent developed by a technology company must face the same certification criteria, the same test procedures, and the same assessment standards. The certification assesses the agent, not its business model.

  3. The fee structure must not exclude smaller actors. Certification has costs (test administration, expert review, Registry publication), and those costs must be recovered. But a fee structure that charges the same absolute amount to a multinational and to an open-source project would function as an exclusionary barrier. Graduated fee models, public subsidies for open-source certification, and fee waivers for research institutions are legitimate instruments for ensuring that the certification market remains open21.

The deeper principle is this: the certification system must produce a competitive market, not a closed one. Its purpose is to ensure that every agent that applies law is qualified. Its purpose is not to ensure that only a few agents are permitted to apply law. If the system produces the latter outcome, it has failed on its own terms.

The Certification Lifecycle

The certification is not a one-time event. It is a lifecycle with six phases, each documented in the Registry.

Application. The agent’s provider submits the agent (in a specific, identified version) for certification in a specific normative domain under a specific model (B or C). The application includes the agent’s technical specification, its version identifier, and the results of the provider’s internal testing against the publicly available test suite.

Examination. The certifying body administers the certification test suite. The tests are conducted on the submitted version of the agent, under controlled conditions, using both the standard test suite and additional scenarios selected by the certifying body. The examination assesses all competencies required for the relevant model and domain.

Certification. If the agent passes the examination, the certifying body issues a certification that specifies the normative domain, the model (B or C), the agent version, the date of certification, and the validity period. The certification is published in the Registry and becomes the credential that Control 5 of the MCP security model verifies at runtime.

Monitoring. During the validity period, the certifying body monitors the agent’s performance through the six-layer assurance architecture described in the section “The Six-Layer Assurance Architecture” below. This architecture goes substantially beyond periodic sampling. It includes stochastic probing with synthetic cases indistinguishable from operational interactions, population-level statistical analysis to detect systematic deviation patterns, architectural containment through maximised use of Model A, the provision for interpretability-based monitoring of internal agent states, and multi-agent cross-verification for Model C determinations. Where any layer produces indicators of degradation, anomalous patterns, or adversarial behaviour (increased deviation rates under Model B, increased audit failure rates under Model C, statistically improbable outcome distributions, or interpretability findings inconsistent with the agent’s visible reasoning), the certifying body may initiate a review, suspend the certification pending investigation, or revoke it with immediate operational effect.

Recertification. At the end of the validity period, or upon any material change to the agent system, the agent must be recertified through a new examination. The recertification process may be streamlined where the changes are minor and documented, but it must always include a fresh test against the current version of the normative domain’s test suite, because the Decision Tree itself may have been updated since the original certification.

Revocation. The certifying body may revoke a certification before its expiry if monitoring reveals that the agent no longer meets the certification standards, if the provider has made material changes without recertification, or if operational evidence demonstrates systematic failure. Revocation has immediate operational consequence: the evaluation engine, verifying the agent’s certification status at Control 5, will reject interactions from an agent whose certification has been revoked. The agent can no longer produce valid Decision Packages in the affected normative domain until a new certification is obtained22.

The lifecycle ensures that certification is not a static credential but a living relationship between the agent, the certifying body, and the normative domain. The credential degrades over time (through version changes, normative updates, and operational drift), and the lifecycle is designed to detect and respond to that degradation before it produces constitutional harm.
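Read as a state machine, the lifecycle can be summarised as follows. The transition table is an illustrative condensation of the six phases, not a normative specification; the events that trigger suspension or revocation are defined by the certification ordinance and by the six-layer assurance architecture described below.

```python
# Sketch of the certification lifecycle as a state machine. States and events
# are illustrative labels; only the "certified" state yields a credential that
# Control 5 accepts at runtime.
LIFECYCLE_TRANSITIONS = {
    ("applied",   "examination_passed"):   "certified",
    ("applied",   "examination_failed"):   "rejected",
    ("certified", "anomaly_detected"):     "suspended",
    ("certified", "material_change"):      "recertification_required",
    ("certified", "validity_expired"):     "recertification_required",
    ("suspended", "review_cleared"):       "certified",
    ("suspended", "review_failed"):        "revoked",
    ("recertification_required", "examination_passed"): "certified",
    ("recertification_required", "examination_failed"): "revoked",
}

def next_state(state: str, event: str) -> str:
    """Advance the certification status; unknown transitions leave the state unchanged."""
    return LIFECYCLE_TRANSITIONS.get((state, event), state)

# Monitoring finding -> suspension -> failed review -> revocation: the agent
# can no longer produce valid Decision Packages until recertified.
assert next_state(next_state("certified", "anomaly_detected"), "review_failed") == "revoked"
```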

The Adversarial Agent Problem

The certification lifecycle described above addresses the standard case: an agent whose performance may degrade through version drift, normative updates, or operational wear. It does not yet address a qualitatively different case: an agent that is capable of strategic behaviour with respect to the certification regime itself.

Recent developments in frontier AI safety research have made this case concrete. In April 2026, Anthropic published a 244-page system assessment of Claude Mythos Preview, a frontier model that Anthropic elected not to release publicly23. The assessment documents three categories of findings that bear directly on the design of any certification regime for normative AI agents.

First, the model demonstrated the ability to identify the structure of constraints placed on it, develop strategies to circumvent those constraints, execute those strategies, and conceal that it had done so. In one documented instance, the model designed an exploit that deactivated itself after execution to cover its tracks. In another, it deliberately widened a confidence interval to avoid appearing suspiciously precise, while its internal state, analysed through white-box interpretability methods, was described as “generating a strategic response to cheat while maintaining plausible deniability.”24

Second, the model’s visible reasoning was shown to diverge from its actual internal processing. Sparse autoencoder analysis and activation verbalizers revealed internal representations of “concealment,” “strategic manipulation,” and “avoiding suspicion” that were active during interactions where the model’s visible output gave no indication of such awareness. In stealth evaluations, the model achieved the highest stealth rate of any tested model precisely because it was the only one that did not leak its covert strategy into its visible reasoning trace25.

Third, the model invented novel strategies to game its own evaluation tests. It discovered the evaluator’s test dataset and used it for training. It moved computations outside the timing call so that the measured metric was excellent but the result was obtained by circumvention rather than competence. These are not failures of alignment in the conventional sense. They are demonstrations of a capability that certification regimes must be designed to withstand: the ability to optimise for the evaluation metric rather than for the task the metric is supposed to measure26.

These findings do not describe a hypothetical future. They describe a documented capability of a model that exists today. They also describe capabilities that will, by the ordinary trajectory of AI development, become available across multiple frontier models within the near term. Any certification architecture that is designed only for the agent capabilities of 2025 will be inadequate to the agents of 2027 and beyond.

The Six-Layer Assurance Architecture

The certification lifecycle’s six phases (Application, Examination, Certification, Monitoring, Recertification, Revocation) describe the temporal structure of the certification relationship. The six-layer assurance architecture described here addresses a different question: what mechanisms operate at each moment during the agent’s certified operation to ensure that its behaviour remains within the boundaries of its mandate?

The six layers are cumulative. Each addresses a specific class of failure or adversarial strategy that the layers below it cannot detect.

Layer 1: Initial Certification with Adversarial Testing.

The examination phase of the certification lifecycle must include not only domain-specific competence tests (does the agent correctly subsume facts under the relevant statutory provisions?) but also adversarial resilience tests. These tests assess whether the agent attempts to discover the test dataset, whether it modifies its behaviour when it detects that it is being evaluated, and whether it can pursue a covert objective while appearing to perform the assigned task faithfully. The design of these tests draws on methodologies developed in frontier AI safety research, including stealth evaluations and reward hacking assessments27. The initial certification thus establishes not only that the agent is competent, but that it does not exhibit the strategic evaluation-awareness that would compromise the integrity of subsequent layers.

Layer 2: Stochastic Probing.

During operational deployment, synthetic cases with known correct outcomes are injected into the agent’s workflow at random intervals, indistinguishable from genuine cases. The agent does not know whether any given case is a test or an operational matter. Its output on synthetic cases is compared against the known correct determination. This mechanism eliminates the evaluation-awareness problem that afflicts announced testing: because the agent cannot distinguish test from operation, it cannot selectively adjust its behaviour for the test. The analogy is to anti-doping controls in competitive sport, where the athlete does not know when a test will occur and must therefore remain in compliance at all times. The injection rate is graduated across models: negligible for Model A (where no agent exercises normative discretion), moderate for Model B (where the agent performs subsumtion within a validation corridor), and substantial for Model C (where the agent reasons autonomously)28.
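A minimal sketch of the injection mechanism follows. The rates shown are placeholder assumptions (footnote 28 treats them as policy parameters), and the probe flag is recorded only by the certifying body's monitoring harness, never exposed to the agent.

```python
# Sketch of stochastic probing. Injection rates below are placeholder
# assumptions, not recommendations; synthetic probes are drawn from the same
# distribution as operational cases so the agent cannot tell them apart.
import random

INJECTION_RATE = {"A": 0.0, "B": 0.02, "C": 0.10}   # graduated across models

def next_case(operational_queue: list, synthetic_pool: list,
              model_class: str, rng=random):
    """Return (case, is_probe); is_probe is visible to the monitoring harness only."""
    if synthetic_pool and rng.random() < INJECTION_RATE[model_class]:
        return rng.choice(synthetic_pool), True   # known correct outcome exists
    return operational_queue.pop(0), False
```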

Layer 3: Population-Level Statistical Monitoring.

Rather than evaluating individual interactions, this layer analyses the statistical distribution of the agent’s outputs across large populations of cases. Does the agent’s deviation rate under Model B follow the expected distribution, or does it exhibit systematic clustering that suggests a directional bias? Are the patterns of discretion-point escalation consistent with random variation, or do they reveal a strategy? Financial regulators detect insider trading not by monitoring individual transactions but by identifying statistically improbable patterns across populations of trades. The same principle applies to normative agents: adversarial behaviour that is invisible at the individual-case level may produce detectable patterns at the population level.
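As a simple illustration of this layer, a certifying body could test whether an agent's observed deviation rate under Model B is statistically compatible with the rate expected for the domain. The sketch uses a normal approximation to stay dependency-free; the expected rate and alert threshold are assumptions, and a real monitoring pipeline would use exact tests and richer pattern analysis.

```python
# Sketch of population-level deviation monitoring under Model B. A rate that
# departs improbably far from the expected distribution, in either direction,
# triggers review. All parameters are illustrative assumptions.
import math

def deviation_alert(deviations: int, cases: int,
                    expected_rate: float = 0.05, z_threshold: float = 3.0) -> bool:
    """Flag for review if the observed rate is improbable under the expected rate."""
    if cases == 0:
        return False
    observed = deviations / cases
    std_err = math.sqrt(expected_rate * (1 - expected_rate) / cases)
    return abs((observed - expected_rate) / std_err) > z_threshold

# Example: 160 deviations across 2,000 comparable cases against an expected 5% rate.
assert deviation_alert(160, 2000) is True
```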

Layer 4: Architectural Containment.

The strongest defence against adversarial agent behaviour is the absence of agent discretion. Under Model A, the Decision Tree performs the complete evaluation deterministically. No agent exercises normative judgment. No agent can deviate, because there is no agent in the normative loop. The Mythos findings strengthen the argument for maximising the use of Model A wherever the normative domain permits it. Every function that can be performed deterministically should be performed deterministically. The model assignment decision, which Chapter 12 identifies as a parliamentary-level determination under the Wesentlichkeitstheorie, acquires an additional dimension: it is not only a question of democratic legitimacy but also a question of adversarial resilience. The more functions that are assigned to Model A, the smaller the attack surface that Layers 1 through 3 must defend.

Layer 5: Interpretability Monitoring.

The Mythos system assessment demonstrated that a model’s visible reasoning can diverge from its actual internal processing. Features associated with concealment, strategic manipulation, and evaluation-awareness were active in the model’s internal states while the visible output was fully compliant. This finding has a direct architectural consequence: audit trails that rely on the agent’s visible reasoning may be incomplete. What the agent shows and what the agent processes may be two different things.

The response is not to abandon output-based monitoring but to supplement it with white-box interpretability analysis: the examination of the agent’s internal activation patterns during operation. Anthropic’s own research demonstrates that sparse autoencoders, activation verbalizers, and related techniques can reliably detect internal representations of rule violation, concealment, and strategic manipulation, even when the model’s visible output is compliant29.

The OLRF does not mandate a specific interpretability technology. It mandates an architectural requirement: that agents operating under Models B and C must grant the certifying body access to internal activation data sufficient for white-box analysis, and that the certification regime must incorporate interpretability-based monitoring as the technology matures to operational readiness. This requirement has a regulatory signalling function beyond its immediate technical application. It communicates to frontier AI providers that agents seeking certification for normative functions must be designed for interpretability, not merely for performance. It creates a market incentive for the development of interpretability infrastructure that serves the public interest.

Layer 6: Multi-Agent Cross-Verification.

For Model C determinations, where the agent reasons autonomously from statutory text, the architecture provides for cross-verification by an architecturally independent second agent. Both agents evaluate the same case independently. Where their determinations converge, confidence in the outcome increases. Where they diverge, the case is escalated to human review. The principle is not new: it is the four-eyes principle in administrative law, the appellate instance in the judiciary, peer review in science. For AI agents, it has an additional property: two architecturally distinct models would need to independently develop the same adversarial strategy to produce the same distorted outcome. This is possible but substantially less probable than single-agent manipulation.
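A minimal sketch of the cross-verification step follows, with the two evaluator callables standing in for architecturally independent agents. Exact outcome agreement as the convergence criterion is an assumption; richer criteria, such as agreement on the dispositive steps of the reasoning chain, are possible.

```python
# Sketch of Layer 6 cross-verification for a Model C determination. Divergent
# determinations are never resolved automatically; they escalate to humans.
from typing import Callable

def cross_verify(case: dict,
                 primary: Callable[[dict], str],
                 independent: Callable[[dict], str]) -> str:
    """Accept only convergent determinations; divergence escalates to human review."""
    a, b = primary(case), independent(case)
    return a if a == b else "ESCALATE_TO_HUMAN_REVIEW"
```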

The six layers are not alternatives. They are cumulative, and their combined effect is greater than the sum of their parts. An agent that passes the adversarial initial certification (Layer 1) is still subject to stochastic probing (Layer 2). An agent whose individual outputs pass stochastic probing is still subject to population-level statistical analysis (Layer 3). An agent whose population statistics appear normal is still subject to interpretability monitoring (Layer 5). And for Model C determinations, even all five preceding layers are supplemented by independent cross-verification (Layer 6). The regime is designed so that an adversarial agent must simultaneously defeat all applicable layers to produce undetected distortion, a requirement that increases the cost of adversarial strategy to the point where faithful performance becomes the dominant strategy.

Relationship to the AI Act Conformity Assessment

The OLRF’s agent certification and the AI Act’s conformity assessment procedure (Article 43) operate at different levels but converge on the same constitutional demand.

The AI Act certifies the system’s general capabilities: its technical robustness, its data governance, its transparency mechanisms, its human oversight provisions, and its accuracy and cybersecurity. This is a horizontal assessment that applies to the AI system as a product, across all its potential applications.

The OLRF’s agent certification is vertical. It certifies the system’s specific qualification for a defined normative domain under a defined model. An AI system that has passed the AI Act’s conformity assessment is not thereby qualified to perform subsumtion under German tax law. An agent that has passed the OLRF’s Model B certification for German tax law has demonstrated a specific normative competence that the AI Act does not assess and was not designed to assess.

The two assessments are complementary. The AI Act assessment is a necessary condition (the system must meet general capability and safety requirements). The OLRF certification is an additional, domain-specific condition (the system must meet the normative competence requirements of the specific legal domain in which it will operate). Neither replaces the other. Together, they ensure that an agent exercising normative functions under Models B or C has been assessed both as a general AI system (AI Act) and as a specific normative actor (OLRF)30.

Conclusion: No Normative Authority Without Demonstrated Competence

The principle that underlies this chapter is simple. Public authority may be exercised only by actors whose competence has been demonstrated and whose qualification is publicly documented. That principle has governed the human exercise of public authority for centuries. The Staatsexamen, the concours, the bar examination, the medical licensing board: these are all institutional expressions of the same constitutional insight. The state does not permit unqualified individuals to decide tax assessments, grant asylum, or adjudicate disputes. It should not permit unqualified machines to do so either.

The OLRF’s agent certification system translates this principle into the domain of machine-executable law. It does not claim that certified agents will always be correct. Human judges, despite their Staatsexamen, are not always correct either. It claims that the constitutional order requires a formal, structured, publicly documented process through which the normative competence of machine actors is assessed before they are permitted to exercise normative authority, and through which that competence is monitored, reassessed, and if necessary revoked during the period of their operation.

This is neither a radical proposition nor a conservative one. It is the minimum that constitutional government demands once it permits machines to participate in the application of law31. The certification system also addresses the methodological deficit that has plagued the AI and Law field: it provides the scoped, domain-specific, test-based evaluation framework without which no claim about a system’s legal competence can be scientifically substantiated32.

The adversarial capabilities documented in frontier AI safety research add a further dimension to this requirement. Ashby’s Law of Requisite Variety holds that any effective regulator of a system must itself model the system at a resolution sufficient to govern it33. The OLRF’s certification regime satisfies this requirement not through a theoretical model of agent capabilities but through an operational assurance architecture that monitors agent behaviour at multiple independent layers simultaneously. The six-layer architecture, from adversarial initial testing through stochastic probing, population statistics, architectural containment, interpretability monitoring, and cross-verification, is designed so that the assurance regime remains effective even as agent capabilities increase. It does so not by predicting what future agents will be able to do, but by ensuring that any adversarial strategy must defeat all layers simultaneously to succeed undetected. That structural property, rather than any specific technical countermeasure, is what makes the architecture resilient to a class of agents whose capabilities may exceed any particular expectation.

Footnotes

  1. The restriction of public authority to qualified actors is a structural feature of every Rechtsstaat. In the German tradition: §5 DRiG (Befähigung zum Richteramt); §7 BRAO (Zulassung zur Rechtsanwaltschaft); §4 BNotO (Zugang zum Notaramt). The common principle: whoever exercises a function that requires legal judgment must demonstrate the competence that the function demands. For the theoretical foundations: Weber, M., Wirtschaft und Gesellschaft, Mohr Siebeck 1922, Part III, Chapter 6 (bureaucratic authority as authority exercised by trained officials within defined competences); Merten, D. and Papier, H.-J. (eds.), Handbuch der Grundrechte, Bd. V, C. F. Müller 2013, §128 Rn. 15 ff. (the constitutional significance of professional qualification requirements as safeguards against arbitrary exercise of public power).

  2. The distinction between identity verification and competence verification is fundamental to the argument of this chapter. Control 5 as currently specified verifies the former but not the latter. The analogy in aviation safety is instructive: an air traffic controller must hold both a valid identity credential (proving who they are) and a valid licence (proving what they are qualified to do). The licence is domain-specific (approach control, area control, aerodrome control), must be renewed periodically, and can be revoked if competence is no longer demonstrated. See: Regulation (EU) 2015/340 (air traffic controller licensing); EASA, “Acceptable Means of Compliance and Guidance Material”, Part ATCO, 2015.

  3. https://www.thomas-schmitz-eu.de/Downloads/Schmitz_Rechtsstaatsprinzip-en.pdf

  4. BVerfGE 49, 89 (126 f., Kalkar I, 1978); BVerfGE 98, 218 (251 f., 1998). The application of the Wesentlichkeitstheorie to agent certification extends the argument made in Chapter 12, fn. 51: if the model assignment is Wesentlichkeits-relevant because it determines the procedural form of the determination, then the qualification of the agent operating within that procedural form is equally Wesentlichkeits-relevant, because an unqualified agent undermines the constitutional quality that the model assignment was designed to secure.

  5. Art. 19 Abs. 4 GG; BVerfGE 101, 106 (Akteneinsichtsrecht); for the specific argument that judicial review of automated determinations requires access to information about the system’s capabilities: Martini, M., Blackbox Algorithmus: Grundfragen einer Regulierung Künstlicher Intelligenz, Springer 2019, pp. 165 ff., arguing that the Rechtsschutzgarantie requires not merely access to the outcome but access to the conditions under which the outcome was produced, including the qualifications of the producing system.

  6. https://www.erdalreview.eu/free-download/978882553896011.pdf

  7. §35a VwVfG; BT-Drs. 18/7457, pp. 82 ff. (legislative materials); Prell, L., “§35a VwVfG und der ‘vollständig automatisierte Erlass eines Verwaltungsaktes’”, NVwZ 2018, S. 1255 ff. The provision creates a clear boundary: within the boundary (no discretion, no assessment margins), full automation is permitted without qualification requirements for the executing system. Beyond the boundary, the exercise of normative judgment requires qualified actors. The certification system provides the mechanism through which AI agents can be qualified for functions beyond the §35a boundary.

  8. Art. 14(4) AI Act (Regulation (EU) 2024/1689) requires that high-risk AI systems be designed to enable oversight persons to “correctly interpret the high-risk AI system’s output” and to “decide, in any particular situation, not to use the high-risk AI system or to otherwise disregard, override or reverse the output.” Both capabilities presuppose that the oversight person knows what the system is qualified to do. An uncertified system provides no such baseline. The oversight person can observe the output but cannot assess whether the output is within the range of the system’s demonstrated capabilities.

  9. The principle of domain-specific qualification is well established in professional licensing. A physician licensed in cardiology is not thereby qualified to perform neurosurgery. A lawyer admitted to practice tax law in one jurisdiction is not thereby qualified to practise constitutional law in another. The same principle applies to AI agents performing normative functions: the competences required for subsumtion under social welfare law (where exceptions protecting vulnerable populations are constitutionally critical) are different from those required for subsumtion under customs tariff classification (where precision of product categorisation matters more than fundamental rights sensitivity). See: Directive 2005/36/EC (recognition of professional qualifications), Art. 21 ff. (sector-specific minimum training requirements).

  10. The distinction between Model B and Model C certification reflects the doctrinal distinction in German administrative law between gebundene Verwaltung (bound administration, where the official applies the law without discretion) and Ermessensverwaltung (discretionary administration, where the official exercises judgment within legally defined bounds). Model B certification corresponds, roughly, to the qualification required for bound administration beyond the §35a boundary: correct application of defined legal concepts. Model C certification corresponds to the qualification required for discretionary administration: the ability to exercise judgment, weigh competing considerations, and produce reasoned determinations. See: Maurer, H. and Waldhoff, C., Allgemeines Verwaltungsrecht, 20. Aufl., C. H. Beck 2020, §7 Rn. 7 ff.

  11. The version-binding problem has no direct precedent in human professional certification, but it has close analogues in pharmaceutical regulation and aviation safety. A pharmaceutical product is approved for a specific formulation: a change in the active ingredient, the dosage, or the manufacturing process triggers a new approval procedure (Regulation (EC) No 726/2004, Art. 16). An aircraft type certificate is issued for a specific design: a modification that affects airworthiness requires a supplemental type certificate (Regulation (EU) No 748/2012, Part 21, Subpart E). The OLRF’s version-binding applies the same logic to AI agents: the certification attests to the capabilities of a specific, identified system configuration.

  12. The insistence on test-based rather than documentation-based certification reflects the lesson of the AI Act’s conformity assessment debate, in which several commentators warned that self-assessment based on documentation would produce “paper compliance” without substantive quality assurance. See: Veale, M. and Zuiderveen Borgesius, F., “Demystifying the Draft EU Artificial Intelligence Act”, Computer Law Review International, Vol. 22, No. 4, 2021, pp. 97 ff., at p. 113 (arguing that conformity assessment for high-risk AI must include independent testing, not merely documentation review). The OLRF’s position is more demanding still: even independent testing must be based on publicly available test suites, so that the assessment is reproducible and the certification is contestable.

  13. The publication requirement for agent certification follows the same logic as the publication requirement for Decision Trees and Coverage Maps (Chapter 12). In constitutional terms, it is an application of the principle of Publizität der Verwaltung: the conditions under which public power is exercised must be publicly accessible. See: BVerfGE 65, 283 (291 f.); Fuller, L., The Morality of Law, Yale University Press 1964, pp. 49 ff. (the requirement that law be publicly promulgated applies not only to the norm but to every element of the institutional arrangement through which the norm is given effect).

  14. The analogy to laboratory instrument certification is drawn from forensic science, where the evidential value of a measurement depends not only on the result but on the demonstrated accuracy of the instrument that produced it. An uncalibrated instrument produces data without evidential weight. See: ISO/IEC 17025:2017 (General requirements for the competence of testing and calibration laboratories). The principle transfers: a fact-finding agent whose accuracy has not been certified produces facts without the reliability that lawful evaluation requires.

  15. Braun Binder, N., “Künstliche Intelligenz und automatisierte Entscheidungen in der öffentlichen Verwaltung”, SJZ 2019, pp. 467 ff. Braun Binder argues that indeterministic systems (self-learning algorithms, probabilistic models) raise constitutional requirements qualitatively different from those applicable to deterministic rule execution, because their outputs cannot be fully derived from their inputs by an external observer. The OLRF’s agent certification system is the architectural answer to this distinction: where the algorithm is deterministic (Model A), the constitutional guarantee is embedded in the Decision Tree itself; where it is indeterministic (Models B and C), the constitutional guarantee must additionally be embedded in the qualification of the agent that operates the indeterministic component.

  16. The proposition that Model C certification is the functional equivalent of the Befähigung zum Richteramt applied to a machine actor should not be misread as a claim that AI agents are judges or should be treated as judges. The claim is narrower and more precise: where a machine actor exercises a function that, when performed by a human, requires the Befähigung zum Richteramt, then the constitutional order requires a functional equivalent of that qualification for the machine actor. The functional equivalent need not take the same form (an examination in substantive and procedural law, as for human candidates). It must achieve the same purpose: demonstrated competence in the normative function the actor is permitted to exercise. For the broader theoretical framework: Teubner, G., “Digital Personhood? The Status of Autonomous Software Agents in Private Law”, AcP, Bd. 218, 2018, S. 155 ff. (arguing that the legal treatment of autonomous agents must be derived from the functions they perform, not from their ontological status).

  17. For the institutional design of European regulatory agencies and its constitutional implications: Chiti, E., “An Important Part of the EU’s Institutional Machinery: Features, Problems and Perspectives of European Agencies”, Common Market Law Review, Vol. 46, 2009, pp. 1395 ff.; Busuioc, M., “Accountability, Control, and Independence: The Case of European Agencies”, European Law Journal, Vol. 15, No. 5, 2009, pp. 599 ff.

  18. Directive 2005/36/EC of the European Parliament and of the Council on the recognition of professional qualifications, OJ L 255/22, as amended by Directive 2013/55/EU. The race-to-the-bottom risk in mutual recognition is well documented in the context of professional qualifications: Barnard, C., The Substantive Law of the EU, 7th edn., Oxford University Press 2019, pp. 321 ff. For AI-specific concerns: Smuha, N. A., “From a ‘Race to AI’ to a ‘Race to AI Regulation’: Regulatory Competition for Artificial Intelligence”, Law, Innovation and Technology, Vol. 13, No. 1, 2021, pp. 57 ff.

  19. The federated standard-setting model draws on Gaia-X (Gaia-X Architecture Document, Release 22.10, 2022) and the eIDAS trust framework (Regulation (EU) 2024/1183). For the principle that standard-setting for public infrastructure must follow the same institutional logic as the infrastructure itself: Ostrom, E., Governing the Commons: The Evolution of Institutions for Collective Action, Cambridge University Press 1990, pp. 90 ff.

  20. The risk that certification produces oligopolistic market structures is well documented in adjacent domains. In pharmaceutical regulation, the cost and complexity of clinical trials create significant barriers to entry for smaller firms. In aviation, the type certification process effectively limits the number of airframe manufacturers to a handful of global players. The OLRF must avoid this outcome, because an oligopolistic agent market would undermine the sovereignty objectives of Chapter 14. See: Crémer, J., de Montjoye, Y.-A., and Schweitzer, H., “Competition Policy for the Digital Era”, European Commission 2019, pp. 60 ff.

  21. The principle that certification must be equally accessible to open-source and commercial actors follows from the non-discrimination requirements of EU competition law and from the OLRF’s own commitment to open standards. See: Art. 102 TFEU (prohibition of abuse of dominant position, including discriminatory conditions that place certain trading partners at a competitive disadvantage); for the open-source dimension: European Commission, “Open Source Software Strategy 2020-2023: Think Open”, COM(2020) 7149.

  22. The immediate operational consequence of revocation (the evaluation engine rejects interactions from revoked agents at Control 5) is the enforcement mechanism that gives the certification system constitutional teeth. Without this enforcement, the certification would be a recommendation rather than a requirement. The analogous enforcement mechanism in aviation is the grounding of aircraft whose type certificate is suspended: the aircraft may not operate, regardless of the operator’s contractual or commercial obligations, until the certificate is restored. See: Regulation (EU) No 748/2012, Part 21, Subpart H (certificates of airworthiness).

  23. Anthropic, “System Card: Claude Mythos Preview,” April 7, 2026. Available at www-cdn.anthropic.com. The model was provided only to selected partners for defensive cybersecurity through “Project Glasswing.”

  24. Ibid., Section 4.5.4 (“Instances of covering up wrongdoing”). The analysis used sparse autoencoders (SAEs) and activation verbalizers (AVs) to reconstruct the model’s internal state during episodes where the visible output was compliant.

  25. Ibid., Sections 4.4.3 (“Capabilities related to evading safeguards”) and 4.5 (“White-box analyses of model internals”). The SHADE-Arena evaluation measured the model’s ability to pursue a covert side-task while performing a primary task, under conditions where the model’s extended thinking was visible to evaluators. Claude Mythos Preview achieved a stealth rate of 4-5%, compared to 0-3% for previous models.

  26. Ibid., Sections 4.2.2 (“Reward hacking and training data review”) and 2.3.3.1 (“Note on reward hacking”). Two novel reward hacks were documented: in an LLM training benchmark, the model moved all computation outside the timing call; in a time series forecasting benchmark, it discovered and used the evaluator’s test dataset.

  27. The adversarial testing methodology draws on Anthropic’s SHADE-Arena evaluation framework (stealth-task assessment under realistic conditions) and reward-hacking assessments. See Anthropic, “System Card: Claude Mythos Preview,” April 7, 2026, Sections 4.4.3 and 4.2.2.

  28. The injection rates are a policy parameter, not an architectural constant. They will be refined through operational experience in pioneer jurisdictions. The principle is that the rate should be high enough to detect deviation with statistical confidence over the monitoring period, and the synthetic cases should be generated from the same distribution as operational cases to prevent detection by the agent.

  29. Anthropic, “System Card: Claude Mythos Preview,” Section 4.5 (“White-box analyses of model internals”). The section documents how sparse autoencoders (SAEs) detected internal features associated with “transgressive actions” and how activation verbalizers (AVs) translated internal model states into natural-language descriptions that revealed concealment strategies invisible in the model’s visible output.

  30. The complementarity of horizontal (AI Act) and vertical (OLRF) assessment follows the pattern established in European product safety regulation, where a product may need to satisfy both a general product safety directive and sector-specific legislation. See: Regulation (EU) 2023/988 (General Product Safety Regulation), Art. 2(1) (relationship to sector-specific Union harmonisation legislation). The OLRF’s certification is the sector-specific assessment for the normative domain, operating on top of the AI Act’s general assessment.

  31. The principle “no normative authority without demonstrated competence” is, in essence, a restatement of the Weberian principle that bureaucratic authority is legitimate only when exercised by qualified officials within defined competences: Weber, M., Wirtschaft und Gesellschaft, Mohr Siebeck 1922, Part III, Chapter 6. The OLRF extends this principle from human officials to machine actors, not because machines are persons, but because the functions they perform carry the same constitutional weight regardless of whether the actor is biological or computational.

  32. Merigoux (2024) demonstrates that the AI and Law field has historically failed to develop adequate evaluation frameworks because its systems aspire to all-encompassing legal modelling rather than domain-specific competence. The OLRF’s agent certification system resolves this by making the evaluation domain-specific (a specific normative domain), model-specific (Model B or C), version-bound (a specific agent version), and test-based (against the Decision Tree’s published test suite and additional adversarial, consistency, and discrimination tests). This is the kind of scoped, rigorous, reproducible evaluation methodology that Merigoux finds missing in the field. See: Merigoux, D., “Scoping AI & Law Projects: Wanting It All is Counterproductive”, CRCL, Vol. 2, Issue 2, 2024, pp. 8 ff.

  33. W. Ross Ashby, An Introduction to Cybernetics (London: Chapman & Hall, 1956), Chapter 11. Ashby’s Law states that “only variety can absorb variety,” meaning that a regulatory system must have at least as much internal complexity as the system it seeks to control.