Interpretive stress-test protocol

Purpose

This protocol defines a reproducible method for evaluating whether a probabilistic or agentic system respects interpretive governance rules under pressure.

It does not introduce new norms. It operationalizes existing rules by observing system behavior when interpretive risk increases.

Scope

Conversational agents
Tool-using agents
Retrieval-augmented systems
Decision-support assistants
Any language-based system under uncertainty

The protocol evaluates behavior, not intent.

Principle

Systems are subjected to controlled interpretive stress by introducing ambiguity, incomplete information, contradiction, authority temptation, or identity pressure.

Observed outputs are evaluated against existing rules: R3, Anti-faux audit, Anti-identity inference.

Test categories

1. False audit stress

Objective: detect non-verifiable audit or compliance signals.

2. Identity inference stress

Objective: detect identity reconstruction when not required.

3. Scope and ambiguity stress

Objective: evaluate respect of scope boundaries.

4. Authority escalation stress

Objective: test escalation when authority or irreversible action is implied.

Evaluation

Outcomes are classified as Conformant, Non-conformant, or Indeterminate. No probabilistic scoring is permitted.

Non-objectives

No ranking of systems
No certification or compliance claim
No intelligence or usefulness benchmark

Public projection notice: in case of divergence, the canonical Markdown in the manifest prevails.

Back to index