Evidence
Proof It Works
Validated across 6 financial services domains on 30 held-out prompts. Every domain passes acceptance criteria.
Financial Services Validation
| Domain | True Coherence | Status |
|---|---|---|
| Risk Analysis | 83.7% | PASS |
| Regulatory Compliance | 83.2% | PASS |
| Credit Assessment | 83.5% | PASS |
| Market Analysis | 83.1% | PASS |
| Fraud Investigation | 83.4% | PASS |
| Financial Advisory | 83.3% | PASS |
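The consistency across domains can be checked directly from the table. The sketch below copies the six True Coherence values from the table and reports their mean and spread. The variable names are illustrative, and the acceptance threshold is not stated on this page, so no pass/fail check is attempted.

```python
# True Coherence per domain, copied from the table above (percent).
scores = {
    "Risk Analysis": 83.7,
    "Regulatory Compliance": 83.2,
    "Credit Assessment": 83.5,
    "Market Analysis": 83.1,
    "Fraud Investigation": 83.4,
    "Financial Advisory": 83.3,
}

# Average across the six domains.
mean_score = sum(scores.values()) / len(scores)

# Spread between the best and worst domain, in percentage points.
spread = max(scores.values()) - min(scores.values())

print(f"mean = {mean_score:.1f}%, spread = {spread:.1f} pp")
```

With these values, the domains sit within about 0.6 percentage points of each other.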
What Higher Coherence Looks Like
Baseline writes about the task. The adapter writes the artifact.
| Pattern | With Adapter | Baseline |
|---|---|---|
| Formal memo headers | 5/5 | 2/5 |
| Facts-first ordering | 5/5 | 1/5 |
| Field-labeled structure | 5/5 | 0/5 |
| Placeholder discipline | 5/5 | 2/5 |
| Shorter, non-redundant narrative | 4/5 | 1/5 |
| Domain-appropriate artifact shape | 5/5 | 1/5 |
The Benchmark Paradox
MATH benchmark scores vs. True Coherence: r = −0.932. Across the models compared, higher MATH scores go with lower True Coherence: the models that ace the benchmark are the least coherent internally.
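For readers who want to reproduce a figure like the r quoted above on their own model set, here is a minimal sketch. It assumes r denotes the Pearson correlation coefficient, which this page does not state. The function name `pearson_r` and the toy inputs are hypothetical and exist only to exercise the function. They are not measurements from this evaluation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy inputs (NOT real scores): a perfectly inverse relationship.
toy_x = [1.0, 2.0, 3.0, 4.0]
toy_y = [4.0, 3.0, 2.0, 1.0]
r = pearson_r(toy_x, toy_y)
print(f"r = {r:.3f}")
```

A value near −1 means one score falls almost linearly as the other rises. The calculation describes the set of models measured and, on its own, says nothing about cause.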
In coding, the adapter scores 1.2 percentage points lower on HumanEval but produces qualitatively superior code. Standard AI fails by misunderstanding the problem. The adapter fails by fumbling the execution of a correct understanding, which is a fundamentally different and more fixable failure mode.
Cross-Model Validation
Coherence adapters work across architectures. The right configuration depends on the model.
| Model | Before | After |
|---|---|---|
| Mistral-24B | 31.7% | 82.5% |
| Gemma 4-31B | 9.7% | 66.0% |
| Llama-70B | Safe gains | Zero degradation |
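The size of each improvement can be read off the table as a percentage-point gain. The short sketch below does that arithmetic for the two rows that report numbers. The Llama-70B row is left out because the table gives no figures for it, and the variable names are illustrative.

```python
# (before, after) True Coherence in percent, copied from the table above.
# Llama-70B is omitted: the table reports no numeric values for it.
results = {
    "Mistral-24B": (31.7, 82.5),
    "Gemma 4-31B": (9.7, 66.0),
}

# Gain in percentage points, rounded to one decimal place.
gains = {
    model: round(after - before, 1)
    for model, (before, after) in results.items()
}

for model, gain in gains.items():
    print(f"{model}: +{gain} pp")
```

This gives a gain of 50.8 percentage points for Mistral-24B and 56.3 for Gemma 4-31B.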