Small models, big reasoning

Frontier models are extraordinary, but they are not always the right tool. Sometimes you want something that runs locally, cheaply, privately, and fast enough to sit inside a loop. The problem: small models are worse at reasoning. So I went looking for how much of that gap comes from the model, and how much from the shape of the prompt around it.

Turns out: a lot of it is the shape.

What I built

A library of 16 reasoning strategies, each lifted from a paper and boiled down to something you can actually run: chain of thought, self-consistency, decomposition, step-back prompting, and others. Then an auto-router that looks at the task and picks the strategy family that fits, instead of forcing one strategy on everything.
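
To make "boiled down to something you can actually run" concrete, here is a minimal sketch of two of those strategies as plain functions. None of this is the library's actual code: the `Model` type, the function names, and the prompt wording are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable

# Hypothetical model interface (an assumption): prompt in, completion out.
Model = Callable[[str], str]


def chain_of_thought(model: Model, question: str) -> str:
    """Ask for step-by-step reasoning and return the text after 'Answer:'."""
    prompt = f"{question}\nThink step by step, then finish with 'Answer: <answer>'."
    completion = model(prompt)
    # Keep only what follows the last 'Answer:' marker; if the marker is
    # missing, fall back to the whole completion.
    return completion.rsplit("Answer:", 1)[-1].strip()


def self_consistency(model: Model, question: str, samples: int = 5) -> str:
    """Sample several chain-of-thought answers and return the most common one."""
    answers = [chain_of_thought(model, question) for _ in range(samples)]
    return Counter(answers).most_common(1)[0][0]
```

Self-consistency only helps if the model samples with some randomness, so in practice `model` would wrap a local model called with a nonzero temperature. It also costs `samples` times as many calls, which is part of why one strategy should not be forced on every task.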

Because that’s the real finding: there is no single best way to make a model reason. A strategy that rescues a multi-step math problem is dead weight on a lookup. Reasoning is not a setting you turn on. It’s a tool you match to the job.
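
A rough sketch of what "match the tool to the job" can look like in code. The post does not say how the router decides, so the keyword heuristics below are purely an assumption, and `route` and the strategy names it returns are hypothetical. A real router might use a small classifier or ask the model itself.

```python
import re


def route(task: str) -> str:
    """Pick a strategy family from surface features of the task.

    Illustrative heuristic only: the rules and the returned names are
    placeholders, not the actual router.
    """
    text = task.lower()

    # Several numbers or arithmetic wording suggest multi-step math,
    # where step-by-step reasoning tends to pay off.
    has_numbers = len(re.findall(r"\d+", text)) >= 2
    if has_numbers or re.search(r"\b(how many|total|calculate|solve)\b", text):
        return "chain_of_thought"

    # Open-ended planning or comparison tasks get broken into sub-questions.
    if re.search(r"\b(plan|steps|compare|design)\b", text):
        return "decomposition"

    # Simple lookups get no scaffold at all: extra reasoning is dead weight.
    return "direct"
```

The useful part is the last branch. A router earns its keep as much by declining to add scaffolding to a lookup as by adding it to a hard problem.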

What it taught me

Structure beats scale, up to a point. A small model with the right scaffold often beats a bigger one with none. The scaffold is cheap; the parameters aren’t.
Routing is the interesting problem. Picking the method is harder and more valuable than any single method.
This is plumbing. Nobody sees a router. It just makes everything downstream smarter, which is exactly the kind of thing I like to own.

Local, small, well-routed models are going to matter more than the current discourse suggests. I’d rather be early on that.