the compression contract

how to build small models of big reality without lying to yourself

19-Jan-26

Every serious system runs into the same wall: reality is too large to hold in your head. You either keep everything raw and drown, or you compress it into something you can steer and risk deleting the one detail that mattered. The interesting work is drafting the contract: what gets preserved, what gets thrown away, and how you prove you didn’t amputate the truth.

Across concurrency, markets, optimization, fraud detection, healthcare, taxes, and AI safety, the same pattern repeats. You win by building representations that are small enough to act on, rich enough to stay honest, and defended well enough to survive adversaries and incentives.

the hidden question

What’s the smallest representation of a complex world that still lets you make correct decisions?

Start with concurrency. The mutual exclusion problem isn’t “two threads collide.” It’s “the world has shared state, and you need an ordering primitive that’s fair under contention.” One classic solution turns entry to a critical section into a ticket system: each contender takes a number, ties break deterministically, and everyone waits their turn. The beauty is the compression: you replace a messy interleaving of reads and writes with a single ordered pair (ticket, id). Even when reads and writes aren’t perfectly atomic, the worst case is wasted waiting, not broken safety.
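A minimal sketch of that ticket discipline, in the spirit of Lamport's bakery algorithm (the class name, the busy-wait loops, and the use of Python threads are illustrative, not a production lock):

```python
class BakeryLock:
    """Bakery-style lock: a sketch, not production code.

    Each contender takes a ticket one larger than any it can see;
    ties are broken by thread id, so the pair (ticket, id) gives a
    total order over everyone waiting.
    """

    def __init__(self, n_threads: int):
        self.choosing = [False] * n_threads
        self.ticket = [0] * n_threads

    def acquire(self, i: int) -> None:
        # Announce that we are picking a ticket, then take one past the max.
        self.choosing[i] = True
        self.ticket[i] = 1 + max(self.ticket)
        self.choosing[i] = False
        for j in range(len(self.ticket)):
            if j == i:
                continue
            # Wait until thread j has finished choosing its ticket.
            while self.choosing[j]:
                pass
            # Wait while j holds a smaller (ticket, id) pair than ours.
            while self.ticket[j] != 0 and (self.ticket[j], j) < (self.ticket[i], i):
                pass

    def release(self, i: int) -> None:
        self.ticket[i] = 0
```

The busy-waiting is exactly the cost the paragraph describes: a slow contender makes you spin longer, but it never lets two threads into the critical section at once.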

Finance does the same thing at a different scale. The full correlation structure of thousands of equities is an intractable object. You don’t estimate every pair. You compress risk into a low-dimensional set of factors, exposures, and residuals. That compression becomes infrastructure: daily attribution, pre-trade constraints, and the difference between “skill” and “you rode momentum.” The trap is that once your compressed model becomes the scoreboard, it starts shaping behavior. Constraints don’t just measure risk; they create crowds.
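A toy version of that compression, on simulated data rather than a real risk model: a handful of factors and exposures stand in for the full pairwise covariance, and whatever the factors don't explain is the residual.

```python
import numpy as np

# Simulated returns: returns ≈ exposures @ factor_returns + idiosyncratic noise.
# Shapes and scales are illustrative; real models estimate exposures
# cross-sectionally and shrink covariances heavily.
rng = np.random.default_rng(0)
n_assets, n_factors, n_days = 500, 5, 250

exposures = rng.normal(size=(n_assets, n_factors))           # asset -> factor loadings
factor_returns = rng.normal(scale=0.01, size=(n_days, n_factors))
idio = rng.normal(scale=0.02, size=(n_days, n_assets))
returns = factor_returns @ exposures.T + idio

# Recover factor returns by cross-sectional regression each day,
# then attribute each day's return to factors vs. residual.
est_factors, *_ = np.linalg.lstsq(exposures, returns.T, rcond=None)
explained = (exposures @ est_factors).T
residual = returns - explained

print("share of variance explained by 5 factors:",
      1 - residual.var() / returns.var())
```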

Statistics shows the same move in reverse: sometimes the usual compression is too lossy. Pearson correlation throws away nonlinear structure. Rank-based measures still miss certain shapes. If you care about “are these variables dependent in any way,” you need a dependence measure designed to light up on weird geometry, not just lines.
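One such measure is distance correlation. A direct, quadratic-time sketch on synthetic data (no special library assumed) shows it firing on a relationship Pearson misses entirely:

```python
import numpy as np

def distance_correlation(x: np.ndarray, y: np.ndarray) -> float:
    """Distance correlation: near zero only when x and y are independent."""
    def centered_distances(v):
        d = np.abs(v[:, None] - v[None, :])
        return d - d.mean(axis=0) - d.mean(axis=1)[:, None] + d.mean()

    a, b = centered_distances(x), centered_distances(y)
    dcov2 = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    return float(np.sqrt(dcov2 / denom)) if denom > 0 else 0.0

rng = np.random.default_rng(1)
x = rng.normal(size=1000)
y = x ** 2 + 0.1 * rng.normal(size=1000)        # strongly dependent, but nonlinear

print("pearson :", np.corrcoef(x, y)[0, 1])     # near zero
print("distance:", distance_correlation(x, y))  # clearly positive
```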

Once you see it, you can’t unsee it. Document understanding fails when “summary” is your only compression; you need hierarchy, redundancy control, and verification. Healthcare fails when records scatter across portals; you need a patient-state representation that’s coherent and updateable. Tax prep fails when you hand an LLM a 100-page PDF; you need chunking, normalization, and consistency checks. Modern LLM inference fails because cognition evaporates; you need checkpointable internal state.

Different domains, same question: compress, but don’t lie.

the maturity model

Compression is not just “smaller.” It’s a promise about what the smaller thing still means.

| Level | Representation | What it enables | How it fails | What "done" looks like |
|---|---|---|---|---|
| 1. Raw | Dumps, logs, PDFs, transcripts | Maximum fidelity, audit trails | Humans miss the signal; decisions are slow | You can reproduce facts but can't steer outcomes |
| 2. Structured | Schemas, factors, features, hierarchies | Fast aggregation, automation, comparability | Silent meaning-loss; spurious structure | You can answer "what changed and why" with evidence |
| 3. Governed | Structured + validation + adversary model + rollback | Safe delegation, scale, resilience | Attacks, drift, Goodharting, incentive traps | You can detect errors early, branch/rewind state, and keep incentives aligned |
This explains why certain techniques feel like magic and others feel like demos. Ticket-based mutual exclusion is “governed”: it’s not just an idea, it’s enforced order under adversarial timing. Factor models are “structured,” but they only become “governed” when you actively monitor crowding, regime shifts, and the incentives your constraints create. Fraud detection becomes “governed” when you add network structure, temporal clustering, and statistical confirmation instead of trusting one heuristic. Document structure extraction becomes “governed” when you fight redundancy, verify against source, and budget exploration so it terminates. “LLM as assistant” becomes “governed” when you standardize inputs, cross-check outputs, and keep an append-only audit trail.

coordination

Once your representation is compact, you can coordinate across minds.

A practical pattern for hard code problems is to treat solution search like an evolutionary process. You don’t ask one model for the “answer” and hope. You generate multiple candidate implementations, force them to critique each other, synthesize a hybrid, and iterate until improvements flatten. That’s collective intelligence as a workflow.
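A sketch of that loop, where `llm` and `score` are placeholders for whatever completion call and evaluation harness you actually have; the prompts and the plateau test are illustrative, not a fixed recipe:

```python
from typing import Callable

def evolve_solution(problem: str,
                    llm: Callable[[str], str],
                    score: Callable[[str], float],
                    n_candidates: int = 4,
                    max_rounds: int = 5,
                    min_gain: float = 0.01) -> str:
    best, best_score = "", float("-inf")
    for _ in range(max_rounds):
        # 1. Generate several independent candidate implementations.
        candidates = [llm(f"Solve:\n{problem}\nReturn only code.")
                      for _ in range(n_candidates)]
        # 2. Force them to critique each other.
        critiques = [llm(f"Find bugs and weaknesses in:\n{c}") for c in candidates]
        # 3. Synthesize a hybrid from the candidates plus the critiques.
        merged = llm("Combine the strongest parts, fixing the noted flaws:\n"
                     + "\n---\n".join(candidates + critiques))
        # 4. Stop when improvement flattens out.
        s = score(merged)
        if s - best_score < min_gain:
            break
        best, best_score = merged, s
    return best
```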

For complicated changes, the hardest failure mode is mixing three tasks in one breath: understand the current system, design the new system, then implement it. Separating cognition (plan first, implement second) is not ceremony; it’s bandwidth management. It’s how you keep the system model stable while you modify the system itself.

Optimization does the same thing. When gradients are useless (noisy simulators, discrete choices, expensive evaluations), a strong default is a black-box strategy that maintains a search distribution, samples candidates, ranks them, and adapts covariance toward promising directions. You’re coordinating a population of guesses, not trusting one guess.
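A cross-entropy-style sketch of that idea, a simplified cousin of CMA-ES (which also adapts step size and uses evolution paths); the function name and defaults here are illustrative:

```python
import numpy as np

def cem_minimize(f, x0, sigma0=1.0, pop=64, elite_frac=0.25, iters=100, seed=0):
    """Maintain a Gaussian search distribution, sample a population, rank by
    objective value, and refit mean and covariance to the elite samples so the
    distribution drifts toward promising directions. No gradients needed."""
    rng = np.random.default_rng(seed)
    dim = len(x0)
    mean = np.asarray(x0, dtype=float)
    cov = (sigma0 ** 2) * np.eye(dim)
    n_elite = max(2, int(pop * elite_frac))

    for _ in range(iters):
        samples = rng.multivariate_normal(mean, cov, size=pop)
        scores = np.array([f(x) for x in samples])           # only needs f(x)
        elite = samples[np.argsort(scores)[:n_elite]]        # keep the best candidates
        mean = elite.mean(axis=0)
        cov = np.cov(elite, rowvar=False) + 1e-8 * np.eye(dim)  # keep it well-conditioned
    return mean

# Example on a noisy, gradient-hostile objective.
noisy = lambda x: float(np.sum(x ** 2) + np.random.normal(scale=0.1))
print(cem_minimize(noisy, x0=np.zeros(5) + 3.0))
```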

The engineering meta-layer matters too. “Content as version-controlled markdown” works for publishing systems because a clean artifact format makes ideas diffable, reviewable, and reproducible. Coordination thrives on formats that don’t rot.

integrity

Compression makes systems steerable. Steerable systems become targets.

Prompt injection is the canonical lesson: untrusted text is not “just data.” It’s an attempt to seize the steering wheel. The failure mode gets nastier when models can use tools: the attacker doesn’t need the model to say the forbidden thing; they just need it to fetch, write, and exfiltrate through some side channel.

Zoom out and you run into alignment. Internal guardrails are fragile because they live inside the object being attacked. A more robust mental model looks like society: you don’t rely on everyone having perfect internal morals; you build external monitoring, enforcement, and escalation. In AI terms: separate “doer” and “watchers,” keep watchers diverse, constrain their outputs, and design the system so the doer cannot argue its own case to the judges.
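A sketch of that shape, with placeholder watcher functions: the watchers see only the proposed action, return a constrained verdict, and the doer never gets a channel to argue with them.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)
class Verdict:
    allow: bool
    reason: str          # constrained output: a verdict, not free-form dialogue

def govern(doer: Callable[[str], str],
           watchers: List[Callable[[str], Verdict]],
           task: str) -> str:
    action = doer(task)
    verdicts = [w(action) for w in watchers]      # diverse, independent checks
    if all(v.allow for v in verdicts):
        return action
    blocked = "; ".join(v.reason for v in verdicts if not v.allow)
    raise PermissionError(f"action blocked, escalate to a human: {blocked}")
```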

Even historical episodes rhyme here. Radical abstractions often fail to win adoption not because they're wrong, but because the contract is unclear: people can't tell what's preserved, what's gained, and how to trust it. A great abstraction without a trust story can be dead on arrival.

two cases

Medical decision-making without drowning. A family is trying to manage a complex patient: multiple specialists, a dozen meds, scattered portals, and short appointments. Level 1 is chaos: screenshots, PDFs, and memory. Level 2 is a coherent dossier: medication list from labels, labs in standardized tables, visit notes with dates, symptoms with timelines. Now the model can reason over the patient-state, not over garbage formatting.
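A sketch of what that Level 2 representation might look like as data structures; the field names are illustrative, and the point is one coherent, typed object instead of a folder of screenshots.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class Medication:
    name: str
    dose: str                  # e.g. "5 mg"
    frequency: str             # e.g. "twice daily"
    prescriber: Optional[str] = None

@dataclass
class LabResult:
    test: str
    value: float
    unit: str
    collected: date
    reference_range: Optional[str] = None

@dataclass
class VisitNote:
    specialty: str
    seen_on: date
    summary: str

@dataclass
class PatientState:
    medications: List[Medication] = field(default_factory=list)
    labs: List[LabResult] = field(default_factory=list)
    visits: List[VisitNote] = field(default_factory=list)

    def labs_for(self, test: str) -> List[LabResult]:
        """Return a trend line for one test, oldest first."""
        return sorted((l for l in self.labs if l.test == test),
                      key=lambda l: l.collected)
```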

The decision shift is concrete: instead of “the doctor seemed unconcerned,” you ask “which two findings dominate the posterior, what evidence lowers it, and what tests would discriminate.” A good LLM is less valuable as a miracle diagnostician than as a tireless integrator that never forgets a drug interaction or misses a trend line.

Level 3 is where it becomes robust. Multi-model cross-checking becomes default for high-stakes conclusions. Updates become append-only deltas, so you can see how probabilities evolve. Tool/output channels and privacy tradeoffs are handled explicitly, not waved away.
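A minimal sketch of the append-only idea, with hypothetical field names: evidence arrives as deltas, nothing is edited in place, and you can replay how a probability evolved.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass(frozen=True)
class Delta:
    when: datetime
    hypothesis: str
    evidence: str
    new_probability: float       # probability after this piece of evidence

class AuditLog:
    def __init__(self) -> None:
        self._deltas: List[Delta] = []

    def append(self, delta: Delta) -> None:
        self._deltas.append(delta)   # append-only: no update or delete path

    def history(self, hypothesis: str) -> List[float]:
        return [d.new_probability for d in self._deltas
                if d.hypothesis == hypothesis]
```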

Risk models that create the crashes they fear. A portfolio looks great and then doesn’t. Level 1 is PnL and vibes. Level 2 is factor decomposition: you learn what portion of returns came from market, industries, momentum, value, and what’s truly idiosyncratic. That’s the good kind of compression: it stops you from mistaking beta for genius.

But the same representation becomes a steering mechanism. If everyone is forced to neutralize the same exposures, you create shared hedges, shared exits, and shared fragility. Add leverage and you get reflexivity: the model constrains behavior, the behavior crowds trades, the crowding amplifies drawdowns, the drawdowns trigger forced deleveraging.

Level 3 is governance. Monitor crowding and nonlinear dependence, not just linear exposures. Treat the model as a tool, not an oracle: build “outside the model” checks for regimes where assumptions break. Recognize that external shocks can rewrite the economics: efficiency breakthroughs, faster inference hardware, custom silicon, and capex cycles can flip the narrative fast.
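A toy crowding monitor along those lines, on simulated exposures (a real desk would use actual holdings and layer nonlinear-dependence checks on top):

```python
import numpy as np

rng = np.random.default_rng(2)
n_funds, n_factors = 20, 5

# Simulate funds that are all pushed toward a similar set of hedges.
common_tilt = rng.normal(size=n_factors)
exposures = common_tilt + 0.3 * rng.normal(size=(n_funds, n_factors))

overlap = np.corrcoef(exposures)                 # fund-by-fund exposure correlation
off_diag = overlap[~np.eye(n_funds, dtype=bool)]
print("mean pairwise exposure correlation:", off_diag.mean())  # high => crowded
```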

the contract

Some argue this is overengineering. Just keep everything raw and let smart humans decide. Or just throw a bigger model at it.

But raw inputs don’t preserve truth in practice; they preserve noise. Humans don’t fail because they’re dumb; they fail because the interface to reality is hostile: scattered records, impossible covariance matrices, adversarial prompts, and systems that change faster than memory. Bigger models help, but they don’t remove the need for contracts. Without structure, you get speed without understanding. Without governance, you get automation without safety.

The goal isn’t to worship abstraction. It’s to build compressions you can trust.

A good representation buys leverage. A good contract buys correctness. Good governance buys survival when incentives and adversaries show up.