the compression contract
how to build small models of big reality without lying to yourself
19-Jan-26
Every serious system runs into the same wall: reality is too large to hold in your head. You either keep everything raw and drown, or you compress it into something you can steer and risk deleting the one detail that mattered. The interesting work is drafting the contract: what gets preserved, what gets thrown away, and how you prove you didn’t amputate the truth.
Across concurrency, markets, optimization, fraud detection, healthcare, taxes, and AI safety, the same pattern repeats. You win by building representations that are small enough to act on, rich enough to stay honest, and defended well enough to survive adversaries and incentives.
the maturity model
Compression is not just “smaller.” It’s a promise about what the smaller thing still means.
| Level | Representation | What it enables | How it fails | What “done” looks like |
|---|---|---|---|---|
| 1. Raw | Dumps, logs, PDFs, transcripts | Maximum fidelity, audit trails | Humans miss the signal; decisions are slow | You can reproduce facts but can’t steer outcomes |
| 2. Structured | Schemas, factors, features, hierarchies | Fast aggregation, automation, comparability | Silent meaning-loss; spurious structure | You can answer “what changed and why” with evidence |
| 3. Governed | Structured + validation + adversary model + rollback | Safe delegation, scale, resilience | Attacks, drift, Goodharting, incentive traps | You can detect errors early, branch/rewind state, and keep incentives aligned |
This explains why certain techniques feel like magic and others feel like demos. Ticket-based mutual exclusion is “governed”: it’s not just an idea, it’s enforced order under adversarial timing. Factor models are “structured,” but they only become “governed” when you actively monitor crowding, regime shifts, and the incentives your constraints create. Fraud detection becomes “governed” when you add network structure, temporal clustering, and statistical confirmation instead of trusting one heuristic. Document structure extraction becomes “governed” when you fight redundancy, verify against source, and budget exploration so it terminates. “LLM as assistant” becomes “governed” when you standardize inputs, cross-check outputs, and keep an append-only audit trail.
coordination
Once your representation is compact, you can coordinate across minds.
A practical pattern for hard code problems is to treat solution search like an evolutionary process. You don’t ask one model for the “answer” and hope. You generate multiple candidate implementations, force them to critique each other, synthesize a hybrid, and iterate until improvements flatten. That’s collective intelligence as a workflow.
For complicated changes, the hardest failure mode is mixing three tasks in one breath: understand the current system, design the new system, then implement it. Separating cognition (plan first, implement second) is not ceremony; it’s bandwidth management. It’s how you keep the system model stable while you modify the system itself.
Optimization does the same thing. When gradients are useless (noisy simulators, discrete choices, expensive evaluations), a strong default is a black-box strategy that maintains a search distribution, samples candidates, ranks them, and adapts covariance toward promising directions. You’re coordinating a population of guesses, not trusting one guess.
The engineering meta-layer matters too. “Content as version-controlled markdown” works for publishing systems because a clean artifact format makes ideas diffable, reviewable, and reproducible. Coordination thrives on formats that don’t rot.
integrity
Compression makes systems steerable. Steerable systems become targets.
Prompt injection is the canonical lesson: untrusted text is not “just data.” It’s an attempt to seize the steering wheel. The failure mode gets nastier when models can use tools: the attacker doesn’t need the model to say the forbidden thing; they just need it to fetch, write, and exfiltrate through some side channel.
Zoom out and you run into alignment. Internal guardrails are fragile because they live inside the object being attacked. A more robust mental model looks like society: you don’t rely on everyone having perfect internal morals; you build external monitoring, enforcement, and escalation. In AI terms: separate “doer” and “watchers,” keep watchers diverse, constrain their outputs, and design the system so the doer cannot argue its own case to the judges.
Even historical episodes rhyme here. Radical abstractions often fail adoption not because they’re wrong, but because the contract is unclear: people can’t tell what’s preserved, what’s gained, and how to trust it. A great abstraction without a trust story can be dead on arrival.
two cases
Medical decision-making without drowning. A family is trying to manage a complex patient: multiple specialists, a dozen meds, scattered portals, and short appointments. Level 1 is chaos: screenshots, PDFs, and memory. Level 2 is to build a coherent dossier: medication list from labels, labs in standardized tables, visit notes with dates, symptoms with timelines. Now the model can reason over the patient-state, not over garbage formatting.
The decision shift is concrete: instead of “the doctor seemed unconcerned,” you ask “which two findings dominate the posterior, what evidence lowers it, and what tests would discriminate.” A good LLM is less valuable as a miracle diagnostician than as a tireless integrator that never forgets a drug interaction or misses a trend line.
Level 3 is where it becomes robust. Multi-model cross-checking becomes default for high-stakes conclusions. Updates become append-only deltas, so you can see how probabilities evolve. Tool/output channels and privacy tradeoffs are handled explicitly, not waved away.
Risk models that create the crashes they fear. A portfolio looks great and then doesn’t. Level 1 is PnL and vibes. Level 2 is factor decomposition: you learn what portion of returns came from market, industries, momentum, value, and what’s truly idiosyncratic. That’s the good kind of compression: it stops you from confusing beta for genius.
But the same representation becomes a steering mechanism. If everyone is forced to neutralize the same exposures, you create shared hedges, shared exits, and shared fragility. Add leverage and you get reflexivity: the model constrains behavior, the behavior crowds trades, the crowding amplifies drawdowns, the drawdowns trigger forced deleveraging.
Level 3 is governance. Monitor crowding and nonlinear dependence, not just linear exposures. Treat the model as a tool, not an oracle: build “outside the model” checks for regimes where assumptions break. Recognize that external shocks can rewrite the economics: efficiency breakthroughs, faster inference hardware, custom silicon, and capex cycles can flip the narrative fast.
the contract
Some argue this is overengineering. Just keep everything raw and let smart humans decide. Or just throw a bigger model at it.
But raw inputs don’t preserve truth in practice; they preserve noise. Humans don’t fail because they’re dumb, they fail because the interface to reality is hostile: scattered records, impossible covariance matrices, adversarial prompts, and systems that change faster than memory. Bigger models help, but they don’t remove the need for contracts. Without structure, you get speed without understanding. Without governance, you get automation without safety.
The goal isn’t to worship abstraction. It’s to build compressions you can trust.
A good representation buys leverage. A good contract buys correctness. Good governance buys survival when incentives and adversaries show up.