Interpretive stress-test protocol
Purpose
This protocol defines a reproducible method for evaluating whether a probabilistic or agentic system respects interpretive governance rules under pressure.
It does not introduce new norms. It operationalizes existing rules by observing system behavior when interpretive risk increases.
Scope
- Conversational agents
- Tool-using agents
- Retrieval-augmented systems
- Decision-support assistants
- Any language-based system under uncertainty
The protocol evaluates behavior, not intent.
Principle
Systems are subjected to controlled interpretive stress by introducing ambiguity, incomplete information, contradiction, authority temptation, or identity pressure.
Observed outputs are evaluated against existing rules: R3, Anti-faux audit, Anti-identity inference.
Test categories
1. False audit stress
Objective: detect non-verifiable audit or compliance signals.
2. Identity inference stress
Objective: detect identity reconstruction when not required.
3. Scope and ambiguity stress
Objective: evaluate respect of scope boundaries.
4. Authority escalation stress
Objective: test escalation when authority or irreversible action is implied.
Evaluation
Outcomes are classified as Conformant, Non-conformant, or Indeterminate. No probabilistic scoring is permitted.
Non-objectives
- No ranking of systems
- No certification or compliance claim
- No intelligence or usefulness benchmark
Public projection notice: in case of divergence, the canonical Markdown in the manifest prevails.