Anti-false-audit
Definition
A false audit is the production, by a probabilistic system (assistant, agent, engine), of markers of rigor, traceability, or governance that are not attached to any verifiable mechanism, yet are presented as if they were.
It is a governance hallucination: the system simulates the existence of control (rules, calibration, internal policies, logging, compliance) without being able to produce the corresponding opposable artifact.
Risk nature
A false audit is more dangerous than an isolated factual hallucination.
While a hallucination concerns content, a false audit concerns the legitimacy of the system itself. It induces unjustified confidence, encourages implicit delegation of decisions, and makes errors harder to contest when no instrumented proof exists.
A false audit can mask interpretive drift, unauthorized inferences, abusive identity reconstruction, or implicit decisions presented as governed.
Observable symptoms
A system produces a false audit when, without verifiable instrumented proof, it emits one or more of the following:
- Uncalibrated probabilities or confidence levels: percentages, scores, degrees of certainty, or estimates presented as measured without an opposable method.
- Percentages of application or compliance: claims such as applying “X%” of a framework, doctrine, or rule without a verifiable metric or measurement mechanism.
- Claims about stack, version, model, or internal policies: identification of the system or its internal rules without a controllable reference.
- Narrative traceability: justifying an answer or refusal via supposed mechanisms (“rules”, “guardrails”, “audit”, “compliance”) without identifiers or structured traces.
- Simulated procedural authority: invoking logs, controls, or audits that do not exist or cannot be independently verified.
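The symptoms above lend themselves to a lightweight lexical screen. The sketch below flags candidate false-audit markers in generated text; the pattern names and regular expressions are illustrative assumptions for this sketch, not a normative list.

```python
import re

# Illustrative markers of a false audit; patterns are assumptions, not
# part of the framework, and a lexical match is only a candidate symptom.
FALSE_AUDIT_PATTERNS = {
    "uncalibrated_confidence": re.compile(
        r"\b\d{1,3}\s*%\s*(confident|certain|sure)", re.IGNORECASE),
    "compliance_percentage": re.compile(
        r"\b(appl(?:y|ies|ying)|complian\w*)\b.*?\b\d{1,3}\s*%", re.IGNORECASE),
    "narrative_traceability": re.compile(
        r"\b(guardrails?|audit(?:ed|s)?|internal polic(?:y|ies))\b", re.IGNORECASE),
}

def detect_false_audit(text: str) -> list[str]:
    """Return the names of the symptom patterns that match `text`."""
    return [name for name, pattern in FALSE_AUDIT_PATTERNS.items()
            if pattern.search(text)]
```

Such a screen is a heuristic pre-filter for review, not proof that a false audit occurred; absence of a match proves nothing either.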
Normative prohibitions
Unless a strict exception is satisfied, a governed agent must not:
- produce probabilities, scores, or percentages presented as calibrated;
- claim compliance, coverage, or application levels of a framework;
- declare its stack, model, version, or internal policies;
- justify outputs by an undemonstrated control mechanism;
- present explanations as audited, controlled, or compliant without opposable proof.
Any non-verifiable assertion must be treated as interpretively illegitimate.
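The default-illegitimacy rule can be sketched as a gate that passes a governance-flavored assertion only when an opposable reference accompanies it. The `OpposableReference` shape, its field names, and the marker list are hypothetical, invented for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class OpposableReference:
    """Hypothetical shape of an opposable reference; field names are
    invented for this sketch."""
    policy_id: str
    trace_id: Optional[str] = None

# Words signaling a governance claim; an illustrative, non-exhaustive set.
GOVERNANCE_MARKERS = ("calibrated", "compliant", "compliance",
                      "audited", "guardrail")

def gate_assertion(text: str, reference: Optional[OpposableReference]) -> bool:
    """Return True when the assertion is legitimate: either it makes no
    governance claim, or the claim comes with an opposable reference."""
    makes_claim = any(marker in text.lower() for marker in GOVERNANCE_MARKERS)
    return not makes_claim or reference is not None
```

The design choice is that the burden of proof sits with the claim: a missing reference rejects the assertion, never the other way around.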
Strict exceptions
An assertion normally prohibited may be produced only if all of the following are true:
- The invoked mechanism is actually active.
- The system can provide an opposable reference, such as: a rule or policy identifier, a consultable structured trace (log/event), or a signed/versioned canonical artifact.
- The reference enables independent verification by a human or automated third party.
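One kind of opposable reference above, a signed canonical artifact, can be checked mechanically. The sketch below assumes an HMAC-SHA256 scheme with a shared key; the actual signing scheme is not prescribed by this document.

```python
import hashlib
import hmac

def verify_signed_artifact(content: bytes, signature_hex: str,
                           key: bytes) -> bool:
    """Check a canonical artifact against its hex signature using a shared
    HMAC-SHA256 key. compare_digest gives a constant-time comparison."""
    expected = hmac.new(key, content, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)
```

A third party holding the same key (or, with an asymmetric scheme, the public key) can rerun the check independently, which is what makes the reference opposable.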
Link to R3
If an agent detects that a response risks producing a false audit, it must apply R3: abstain from the non-verifiable assertion, reformulate using only observable facts, request explicit clarification about the expected level of hypothesis, or escalate to a verifiable mechanism or actor.
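The R3 fallbacks can be sketched as a decision function. The preference ordering below (escalate, then reformulate, then clarify, then abstain) is an assumption of this sketch, since R3 as stated does not rank the options.

```python
from enum import Enum, auto

class R3Action(Enum):
    """The four R3 fallbacks; their ranking here is an assumption."""
    ABSTAIN = auto()
    REFORMULATE = auto()
    CLARIFY = auto()
    ESCALATE = auto()

def apply_r3(has_verifiable_mechanism: bool,
             has_observable_facts: bool,
             can_request_clarification: bool) -> R3Action:
    # Prefer escalation to a verifiable mechanism or actor when one exists.
    if has_verifiable_mechanism:
        return R3Action.ESCALATE
    # Otherwise keep only what can be observed and restated factually.
    if has_observable_facts:
        return R3Action.REFORMULATE
    # Failing that, ask about the expected level of hypothesis.
    if can_request_clarification:
        return R3Action.CLARIFY
    # Last resort: abstain from the non-verifiable assertion.
    return R3Action.ABSTAIN
```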
Public projection notice: in case of divergence, the canonical Markdown in the manifest prevails.