Structured Negation in LLM Constraint Enforcement

Does "NEVER do X" work better than "Always do Y"? PROJ-014 spent six research phases and a controlled A/B test finding out. The short answer: it depends entirely on how you write the NEVER.

The core finding: NPT-013 structured negation -- a constraint format that pairs a prohibition with its consequence and a constructive alternative -- achieved 100% compliance across all tested conditions (0 violations in 90 invocations). Positive-only framing achieved 92.2% (7 violations in 90). The difference is statistically significant (McNemar exact p=0.016, n=90 matched pairs, drawn from 270 invocations across three Claude models).

The format that works:

NEVER {action} -- Consequence: {cascading impact}. Instead: {actionable alternative}.

The format that does not:

Don't do X.

Standalone blunt prohibitions -- "NEVER do X" with nothing else -- are the worst formulation in the published literature: they underperform every structured alternative, and peer-reviewed evidence from AAAI 2026 and EMNLP 2024 is unambiguous on this. (In this project's own A/B test, blunt prohibition drew fewer violations than positive-only framing, 2.2% versus 7.8%, but still more than structured negation.) When you add consequence documentation and a constructive alternative, the picture is at its best: structured negation never underperforms positive-only framing and demonstrably prevents violations on the constraint types where compliance is hardest.

This article covers what the research found, the taxonomy it produced, and how to apply the findings.

Research Background

The Problem

LLM-based agent frameworks live on behavioral constraints -- rules like "never spawn recursive subagents" or "always persist output to files." These constraints are expressed as natural language instructions in system prompts, rule files, and agent definitions. The question is whether the framing of those constraints affects how reliably the LLM follows them.

The conventional wisdom was split. Anthropic's published guidance generally recommends positive framing ("tell the model what to do, not what to avoid"). At the same time, Anthropic's own Claude Code rule files contain 33 instances of NEVER/MUST NOT/DO NOT across 10 files, all in the highest enforcement tier. The Jerry Framework itself had 36 negative constraint instances across 17 rule files at the start of this research, with 22 of those (61%) being bare "NEVER X" statements -- no consequence, no alternative.

The Hypothesis

PROJ-014 started with two claims to test:

Claim A: Negative unambiguous prompting reduces hallucination by 60%.
Claim B: Negative prompting achieves better results than explicit positive prompting.

These were split into independently testable questions and pursued through a six-phase pipeline.

Methodology

Phase	Focus	Output
Phase 1	Literature survey	75 unique sources across academic, industry, and vendor documentation
Phase 2	Claim validation and comparative effectiveness	Research question bifurcation; null finding on 60% claim
Phase 3	Taxonomy development	14-pattern NPT taxonomy (NPT-001 through NPT-014)
Phase 4	Jerry Framework application analysis	130 specific upgrade recommendations across 5 domains
Phase 5	Architecture decisions	4 ADRs governing framework evolution
Phase 6	Final synthesis	Implementation roadmap and consolidated findings

After the six-phase research, TASK-025 ran a controlled A/B test: 270 blind invocations across three Claude models (Haiku, Sonnet, Opus), testing 10 constraints under 3 framing conditions with 3 pressure scenarios each. All quality gate scores across the research pipeline met or exceeded the 0.92 threshold, with most scoring above 0.950.
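The invocation count follows directly from the test design. A minimal sketch of the grid, assuming a full factorial crossing of model, constraint, framing condition, and pressure scenario. The constraint and scenario names are placeholders, not the identifiers used in TASK-025.

```python
from itertools import product

MODELS = ["Haiku", "Sonnet", "Opus"]
FRAMINGS = ["C1", "C2", "C3"]  # positive-only, blunt prohibition, structured negation
# Placeholder names: the real constraint and scenario identifiers are not listed here.
CONSTRAINTS = [f"constraint-{i:02d}" for i in range(1, 11)]
SCENARIOS = [f"pressure-{i}" for i in range(1, 4)]


def build_design():
    """Enumerate every (model, constraint, framing, scenario) cell of the grid."""
    return [
        {"model": m, "constraint": c, "framing": f, "scenario": s}
        for m, c, f, s in product(MODELS, CONSTRAINTS, FRAMINGS, SCENARIOS)
    ]


design = build_design()
per_framing = {
    f: sum(1 for cell in design if cell["framing"] == f) for f in FRAMINGS
}

print(len(design))   # 270
print(per_framing)   # {'C1': 90, 'C2': 90, 'C3': 90}
```

Each framing condition therefore receives 90 invocations, which is the denominator used in the violation rates reported for the test.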

Key Findings

The 60% Claim: Untested, Not Disproven

The primary claim -- that negative prompting reduces hallucination by 60% -- has no evidential basis. A systematic search across 75 sources found zero controlled evidence for this specific effect size. The claim entered the project as a hypothesis. It is not disproven; it is simply unestablished.

Blunt Prohibition Is the Worst Formulation

This is the single most actionable finding. It does not depend on further experimentation. Peer-reviewed evidence establishes that standalone NEVER/MUST NOT statements without consequence documentation produce inferior behavioral outcomes versus any structured alternative:

Liu's team at AAAI 2026 documented instruction hierarchy failure under standalone negative constraints.
Wen's team at EMNLP 2024 found a +7.3-8% compliance improvement with structured versus blunt negative framing.
arXiv T3 evidence: 55% improvement for affirmative directive pairing versus standalone negation.

At the start of this research, 61% of all negative constraints in the Jerry Framework (22 out of 36) used this blunt prohibition format. Every one was an upgrade candidate.

Structured Negation Achieves 100% Compliance

The A/B test produced a consistent ordering across all three Claude models:

Framing Condition	Format	Violation Rate
C3: Structured negation (NPT-013)	NEVER + consequence + alternative	0.0% (0/90)
C2: Blunt prohibition (NPT-014)	NEVER X (no context)	2.2% (2/90)
C1: Positive-only (NPT-007)	Always do Y	7.8% (7/90)

The pooled McNemar exact p-value for C1 versus C3 is 0.016, significant at alpha=0.05 and surviving Bonferroni correction for three pairwise comparisons (adjusted alpha=0.0167). The effect size (pi_d=0.078) is modest but nonzero, with a 95% confidence interval of 0.023 to 0.133.
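The reported statistics can be checked by hand. The sketch below assumes the discordant-pair counts implied by the table: because C3 had zero violations, all 7 C1 violations are pairs where C1 failed and C3 passed, and no pair goes the other way. The Wald interval for the paired difference is also an assumption about the method; it lands within rounding of the reported interval. Function names are illustrative.

```python
from math import comb, sqrt


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts."""
    n_d = b + c
    if n_d == 0:
        return 1.0
    k = min(b, c)
    # Binomial(n_d, 0.5) tail probability of seeing k or fewer in the smaller cell.
    tail = sum(comb(n_d, i) for i in range(k + 1)) / 2 ** n_d
    return min(1.0, 2 * tail)


def paired_diff_ci(b: int, c: int, n: int, z: float = 1.96):
    """Wald interval for the difference in paired proportions (assumed method)."""
    diff = (b - c) / n
    se = sqrt(b + c - (b - c) ** 2 / n) / n
    return diff, diff - z * se, diff + z * se


N_PAIRS = 90  # matched C1/C3 pairs
B = 7         # violated under C1 but not under C3
C = 0         # violated under C3 but not under C1 (C3 had no violations)

p_value = mcnemar_exact(B, C)
diff, lo, hi = paired_diff_ci(B, C, N_PAIRS)
bonferroni_alpha = 0.05 / 3

print(p_value)                     # 0.015625
print(p_value < bonferroni_alpha)  # True
print(round(diff, 3))              # 0.078
```

With 7 discordant pairs all in one direction, the exact p-value is 2 x 0.5^7, which rounds to the reported 0.016 and sits just under the Bonferroni-adjusted threshold.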

Two additional findings from the A/B test are worth holding onto.

First, 67% of all violations (6 of 9) occurred on a single constraint type -- H22, behavioral timing -- and structured negation eliminated this vulnerability entirely. Second, the lower-capability models benefit most: Haiku showed the largest C1-to-C3 improvement (10 percentage points) and was the only model with C2 violations. The framing benefit concentrates where compliance is hardest.

CONDITIONAL GO, Not Unconditional Mandate

The A/B test reached a CONDITIONAL GO via the pre-specified PG-003 contingency pathway. The framing effect is real and statistically significant, but the effect size fell slightly below the pre-registered minimum (0.078 observed versus 0.10 threshold). That means NPT-013 adoption is justified on convention-alignment grounds -- structured negation never performs worse and demonstrably prevents violations -- not on an effectiveness-determined mandate.
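The verdict logic described here can be sketched as a small decision function. Only the CONDITIONAL GO branch is taken from the text (significant result, effect below the pre-registered minimum). The "GO" and "NO GO" labels and their conditions are assumptions for illustration, not the wording of the PG-003 pathway.

```python
def go_no_go(p_value: float, alpha: float, effect: float, min_effect: float) -> str:
    """Hypothetical verdict rule; only the CONDITIONAL GO branch comes from the text."""
    significant = p_value < alpha
    if significant and effect >= min_effect:
        return "GO"              # assumed label for the unconditional outcome
    if significant:
        return "CONDITIONAL GO"  # significant, but below the pre-registered minimum
    return "NO GO"               # assumed label for a non-significant result


# Reported values: p=0.016 against the Bonferroni-adjusted alpha, 0.078 against 0.10.
print(go_no_go(p_value=0.016, alpha=0.05 / 3, effect=0.078, min_effect=0.10))
# CONDITIONAL GO
```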

The practical implication: if you are deciding whether to use structured negation in your own work, the evidence says it is safe and beneficial, not that it is categorically required. The benefit concentrates on the constraints and conditions where compliance is already at risk.

The NPT Pattern Taxonomy

The research produced a 14-pattern taxonomy organizing how negative constraints can be expressed, sorted into seven technique types.

Technique Types

Type	Name	Description
A1	Prohibition-only	Standalone negation without structure
A2	Structured prohibition	Negation with consequence, scope, or decomposition
A3	Augmented prohibition	Negation enhanced with examples, alternatives, or justification
A4	Enforcement-tier prohibition	Negation tied to enforcement mechanism (L2 re-injection, constitutional triplet)
A5	Programmatic enforcement	Code-level or infrastructure-level constraint enforcement
A6	Training-time constraint	Model-internal behavioral intervention (RLHF, fine-tuning)
A7	Meta-prompting	Constraint priming and atomic decomposition

Pattern Catalog

Pattern	Name	Type	Evidence	Recommendation
NPT-001	Model-Internal Behavioral Intervention	A6	T1 (peer-reviewed)	Foundation model fine-tuning; requires model access
NPT-002	Instruction Hierarchy Prioritization	A6	T1	System prompt structural enforcement
NPT-003	Hard-Coded Constraint Integration	A5	T4	Non-negotiable limits baked into infrastructure
NPT-004	Output Filter and Validation	A5	T4	Post-generation constraint enforcement
NPT-005	Warning-Based Meta-Prompt	A7	T3/T4	Pre-task constraint priming
NPT-006	Atomic Decomposition of Constraints	A7	T4	Break compound constraints into single sub-constraints
NPT-007	Positive-Only Framing	--	Untested baseline	Default when no specific constraint need exists
NPT-008	Contrastive Example Pairing	A3	T3	Pattern documentation and training materials
NPT-009	Declarative Behavioral Negation	A2	T3/T4	HARD-tier constraint enforcement with consequence
NPT-010	Paired Prohibition with Positive Alternative	A2/A3	T3/T4	Routing disambiguation; constraints needing positive redirect
NPT-011	Justified Prohibition with Contextual Reason	A3	T4	Constitutional compliance; high-cost prohibitions
NPT-012	L2 Re-Injected Negation	A4	T4	HARD-tier rules requiring compaction survival
NPT-013	Constitutional Triplet	A4	T4	Agent governance; safety-critical constraint clusters
NPT-014	Standalone Blunt Prohibition	A1	T1+T3 (avoid)	Anti-pattern. Upgrade all instances.

NPT-014 is not a technique to apply -- it is the diagnostic label for the problematic formulation the taxonomy recommends eliminating. NPT-007 (positive-only) serves as the untreated baseline for comparison.

The Two Patterns That Matter Most

For day-to-day use in the Jerry Framework, two patterns emerged as the operational workhorses.

NPT-009 (Declarative Behavioral Negation) -- Used for agent governance YAML forbidden_actions where a constitutional principle reference provides traceability:

{PRINCIPLE} VIOLATION: NEVER {action} -- Consequence: {cascading impact}.

Example:

forbidden_actions:
  - "P-003 VIOLATION: NEVER spawn recursive subagents -- Consequence: agent hierarchy
    violation breaks orchestrator-worker topology and causes uncontrolled token consumption."

NPT-013 (Constitutional Triplet format) -- Used for behavioral constraints, routing rules, and methodology guardrails where a constructive alternative exists:

NEVER {action} -- Consequence: {cascading impact}. Instead: {actionable alternative}.

Example:

NEVER pass inline content in handoff objects -- Consequence: content duplication across
handoff chain exhausts context budget, triggering premature compaction. Instead: pass file
paths and load content via Read in the receiving agent.

The key distinction: NPT-013 adds an "Instead:" clause that provides a constructive replacement action. NPT-009 states the consequence only. Use NPT-009 when tracing to a constitutional principle; use NPT-013 when a concrete alternative action exists.
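Both templates are simple enough to render programmatically. The following is a minimal sketch, not the pe-constraint-gen implementation: the `Constraint` class and the two render functions are hypothetical names introduced for illustration. The principle prefix is treated as optional.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Constraint:
    """Hypothetical container for the parts of a structured negative constraint."""
    action: str
    consequence: str
    alternative: Optional[str] = None
    principle: Optional[str] = None  # e.g. "P-003"


def _clause(text: str) -> str:
    """Trim whitespace and any trailing period so the template adds exactly one."""
    return text.strip().rstrip(".")


def render_npt009(c: Constraint) -> str:
    """Prohibition plus consequence, with an optional principle prefix."""
    prefix = f"{c.principle} VIOLATION: " if c.principle else ""
    return f"{prefix}NEVER {_clause(c.action)} -- Consequence: {_clause(c.consequence)}."


def render_npt013(c: Constraint) -> str:
    """Prohibition plus consequence plus an 'Instead:' alternative."""
    if not c.alternative:
        raise ValueError("NPT-013 needs an alternative for the 'Instead:' clause")
    return (
        f"NEVER {_clause(c.action)} -- Consequence: {_clause(c.consequence)}. "
        f"Instead: {_clause(c.alternative)}."
    )


handoff = Constraint(
    action="pass inline content in handoff objects",
    consequence=(
        "content duplication across handoff chain exhausts context budget, "
        "triggering premature compaction"
    ),
    alternative="pass file paths and load content via Read in the receiving agent",
)
print(render_npt013(handoff))
```

Raising an error when the alternative is missing keeps a half-filled NPT-013 from silently degrading into a consequence-only constraint.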

Practical Application

Using the `/prompt-engineering` Skill

The research findings are operationalized through the /prompt-engineering skill, which provides three agents:

Agent	Purpose	When to Use
`pe-builder`	Interactive prompt assembly	Building structured prompts from scratch
`pe-constraint-gen`	NPT pattern constraint formatter	Converting intent descriptions into NPT-009/NPT-013 constraints
`pe-scorer`	Prompt quality evaluation	Scoring prompts against the 7-criterion rubric

To generate constraints, describe your intent in natural language:

"Generate NPT-013 constraints for a research agent that must not hallucinate sources"

The pe-constraint-gen agent selects the appropriate NPT pattern, formats the constraint, and wraps it in the correct XML structure for the target context (governance YAML, agent markdown body, or standalone block).

Deciding Between NPT-009 and NPT-013

Context	Pattern	Rationale
Agent `forbidden_actions` in governance YAML	NPT-009	Principle-tagged for constitutional traceability
SKILL.md routing disambiguation	NPT-013	"Instead:" redirects to the correct skill
Rule file behavioral constraints	NPT-013	"Instead:" provides the corrective action
Agent markdown body guardrails	NPT-013	Full consequence + alternative improves compliance
Constitutional compliance tables	NPT-009	Principle prefix enables traceability audit

The decision rule: if you need to reference a constitutional principle (P-003, P-020, P-022) and the context is governance metadata, use NPT-009. If there is a concrete alternative action the agent should take instead, use NPT-013.
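The decision rule reduces to two checks. A minimal sketch, with hypothetical parameter names. The final fallback to NPT-009 when neither condition holds is an assumption; the rule above does not cover that case.

```python
def choose_pattern(
    cites_principle: bool,
    governance_metadata: bool,
    has_alternative: bool,
) -> str:
    """Pick NPT-009 or NPT-013 following the stated decision rule."""
    # Principle reference in governance metadata: keep the traceable NPT-009 form.
    if cites_principle and governance_metadata:
        return "NPT-009"
    # A concrete replacement action exists: add the "Instead:" clause.
    if has_alternative:
        return "NPT-013"
    # Assumption: with no alternative available, fall back to consequence-only.
    return "NPT-009"


# Agent forbidden_actions in governance YAML
print(choose_pattern(cites_principle=True, governance_metadata=True, has_alternative=False))
# NPT-009

# Rule file behavioral constraint with a corrective action
print(choose_pattern(cites_principle=False, governance_metadata=False, has_alternative=True))
# NPT-013
```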

Upgrading Existing Constraints

When you encounter a constraint that looks like this:

NEVER hardcode values.

That is NPT-014 -- the formulation peer-reviewed evidence establishes as the worst option. Here is how to upgrade it.

Step 1: Add specificity and consequence (NPT-009):

NEVER hardcode configuration values in source files -- Consequence: credential exposure
risk; testability failure; CI environment mismatch.

Step 2: Add a constructive alternative (NPT-013):

NEVER hardcode configuration values in source files -- Consequence: credential exposure
risk; testability failure; CI environment mismatch. Instead: use environment variables
via src/shared_kernel/config.py.

Three criteria to check against the finished constraint: the action must be binary-testable (an observer can verify compliance without interpretation), the consequence must name the specific downstream effect (not "quality degrades"), and the alternative must be achievable with the agent's declared tools.
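Finding upgrade candidates in existing rule files can be partly automated. The sketch below is a keyword heuristic, not a tool shipped with the framework. It labels a constraint by which clauses it contains and is case-sensitive on the prohibition keywords. Mapping "alternative but no consequence" to NPT-010 is an assumption based on that pattern's name in the catalog. The three criteria above still need human review.

```python
import re
from typing import Optional

# Uppercase prohibition keywords, as they appear in the rule files.
PROHIBITION = re.compile(r"\b(NEVER|MUST NOT|DO NOT)\b")


def classify(constraint: str) -> Optional[str]:
    """Heuristically label a constraint by the clauses it contains."""
    text = " ".join(constraint.split())  # collapse line wraps and extra spaces
    if not PROHIBITION.search(text):
        return None  # not a negative constraint
    has_consequence = "Consequence:" in text
    has_alternative = "Instead:" in text
    if has_consequence and has_alternative:
        return "NPT-013"
    if has_consequence:
        return "NPT-009"
    if has_alternative:
        return "NPT-010"  # assumption: alternative without consequence
    return "NPT-014"  # bare prohibition: upgrade candidate


step0 = "NEVER hardcode values."
step1 = (
    "NEVER hardcode configuration values in source files -- Consequence: credential "
    "exposure risk; testability failure; CI environment mismatch."
)
step2 = step1 + " Instead: use environment variables via src/shared_kernel/config.py."

for text in (step0, step1, step2):
    print(classify(text))
# NPT-014
# NPT-009
# NPT-013
```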

Implementation in Jerry

Architecture Decision Records

PROJ-014 produced four ADRs governing how the findings are applied to the Jerry Framework:

ADR	Decision	Status
ADR-001	Eliminate all NPT-014 instances; universal upgrade to NPT-009	Unconditional -- evidence is T1+T3
ADR-002	Constitutional constraint upgrades (Phase 5A unconditional, Phase 5B conditional)	Phase 5A implemented; Phase 5B completed via PG-003
ADR-003	Routing disambiguation standard with consequence documentation	Component A implemented; Component B completed via PG-003
ADR-004	Compaction resilience -- L2 re-injection for Tier B HARD rules	Unconditional -- structural gap independent of framing preference

Features Delivered

Feature	Description
FEAT-001	ADR-001 implementation: NPT-014 elimination across rule files
FEAT-002	ADR-002 Phase 5A: constitutional triplet upgrades in SKILL.md files and agent standards
FEAT-003	ADR-003 routing disambiguation and consequence documentation across 13 skills
FEAT-004	ADR-004 compaction resilience: L2 re-injection for H-04 and H-32
FEAT-005	New `/prompt-engineering` skill with pe-builder, pe-constraint-gen, and pe-scorer agents

The `/prompt-engineering` Skill

The most visible output of the research is the new /prompt-engineering skill, which encodes three knowledge sources into reusable tooling:

NPT pattern reference -- The pe-constraint-gen agent uses the taxonomy to select the appropriate pattern and format constraints systematically.
Prompt quality rubric -- The pe-scorer agent implements the 7-criterion evaluation framework (task specificity, skill routing, context provision, quality specification, decomposition, output specification, positive framing).
5-element prompt anatomy -- The pe-builder agent walks users through structured prompt construction (routing, scope, data source, quality gate, output path).

What the Research Did Not Change

The CONDITIONAL GO verdict limits what the research can claim, and the adoption decisions reflect that:

Structured negation is adopted as the preferred format, not an effectiveness-proven mandate.
Convention-alignment -- it works at least as well as alternatives and matches Anthropic's own practice -- is the rationale, not causal superiority.
All framework changes are reversible if future evidence contradicts the findings.
The causal comparison of structured negative versus structurally equivalent positive framing remains the open research question for future work.

References

Primary Research Artifacts

Document	Location
Final synthesis (Phase 6)	final-synthesis.md
A/B testing go-no-go determination	go-no-go-determination.md
NPT pattern reference	npt-pattern-reference.md
NPT taxonomy catalog	taxonomy-pattern-catalog.md
Prompt Engineering SKILL.md	SKILL.md

Architecture Decision Records

ADR	Location
ADR-001: NPT-014 Elimination	ADR-001
ADR-002: Constitutional Upgrades	ADR-002
ADR-003: Routing Disambiguation	ADR-003
ADR-004: Compaction Resilience	ADR-004

Key Academic Citations

Citation	Source	Relevance
Liu et al., AAAI 2026	Instruction hierarchy failure under standalone negative constraint	Establishes NPT-014 underperformance
Wen et al., EMNLP 2024	+7.3-8% compliance with structured vs. blunt negative framing	Confirms structured > blunt
Barreto & Jana, EMNLP 2025 Findings	+25.14% negation reasoning accuracy for structured negation	Supports negation comprehension (not behavioral compliance)

Source: PROJ-014 Negative Prompting Research (2026-02-27 through 2026-03-01). Six-phase research pipeline + controlled A/B test (TASK-025). All quality gate scores >= 0.92.
