The frontier models are extraordinary, and they are also not always the right tool. Sometimes you want something that runs locally, cheaply, privately, and fast enough to sit inside a loop. The catch: small models are worse at reasoning. So I went looking for how much of that gap comes from the model itself, and how much from the shape of the prompt around it.
Turns out: a lot of it is the shape.
What I built
A library of 16 reasoning strategies, each lifted from a paper and boiled down to something you can actually run: chain of thought, self-consistency, decomposition, step-back, and the rest. On top of that sits an auto-router that looks at the task and picks the strategy family that fits, instead of forcing one strategy on everything.
Because that’s the real finding: there is no single best way to make a model reason. A strategy that rescues a multi-step math problem is dead weight on a lookup. Reasoning is not a setting you turn on. It’s a tool you match to the job.
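To make the routing idea concrete, here is a minimal sketch. It is not the router from this project: the prompt templates, the keyword rules, and the names `STRATEGIES`, `pick_strategy`, and `build_prompt` are all assumptions made up for illustration. A real router would likely use a better signal than keywords, but the shape is the same: inspect the task, pick a strategy family, wrap the task in that scaffold.

```python
import re

# Hypothetical templates. The strategy names echo the post; the wording is illustrative.
STRATEGIES = {
    "direct": "{task}",
    "chain_of_thought": "{task}\nThink through this step by step, then give the final answer.",
    "decomposition": "{task}\nBreak this into smaller sub-problems, solve each, then combine the results.",
    "step_back": "{task}\nFirst state the general principle behind this question, then apply it.",
}


def pick_strategy(task: str) -> str:
    """Pick a strategy family from crude surface cues (an assumed heuristic)."""
    text = task.lower()
    # Explicitly sequenced steps suggest splitting the task up.
    if "first" in text and " then " in text:
        return "decomposition"
    # Arithmetic or quantity questions tend to benefit from worked steps.
    if re.search(r"\d\s*[-+*/]\s*\d", text) or "how many" in text or "calculate" in text:
        return "chain_of_thought"
    # Open "why" questions get a step back to the underlying principle.
    if text.startswith(("why ", "explain ")):
        return "step_back"
    # Plain lookups get no scaffold at all.
    return "direct"


def build_prompt(task: str) -> str:
    """Wrap the task in the scaffold of whichever strategy was picked."""
    return STRATEGIES[pick_strategy(task)].format(task=task)


if __name__ == "__main__":
    for task in (
        "What is the capital of France?",
        "If a train travels 60 km in 1.5 hours, how many km does it cover in 4 hours?",
        "Why does ice float on water?",
    ):
        print(pick_strategy(task), "<-", task)
```

Note that the lookup falls through to `direct` and is sent unchanged, which is the point: the scaffold is only applied where it is likely to help.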
What it taught me
- Structure beats scale, up to a point. A small model with the right scaffold often beats a bigger one with none. The scaffold is cheap; the parameters aren’t.
- Routing is the interesting problem. Picking the method is harder and more valuable than any single method.
- This is plumbing. Nobody sees a router. It just makes everything downstream smarter, which is exactly the kind of thing I like to own.
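As an illustration of how cheap a scaffold can be, here is a rough sketch of self-consistency, one of the strategies named earlier: sample the model several times and keep the majority answer. The function names, the `ask` callable, and the fake model are assumptions for the sake of a runnable example, not this library's API. In practice `ask` would call a local model with sampling enabled so that the replies actually vary.

```python
from collections import Counter
from typing import Callable


def self_consistent_answer(ask: Callable[[str], str], prompt: str, samples: int = 5) -> str:
    """Query the model `samples` times and return the most common answer."""
    # Normalize lightly so that "42" and "42 " count as the same vote.
    answers = [ask(prompt).strip().lower() for _ in range(samples)]
    winner, _votes = Counter(answers).most_common(1)[0]
    return winner


def make_fake_model(replies: list[str]) -> Callable[[str], str]:
    """Stand-in for a local model: returns canned replies in order (hypothetical helper)."""
    remaining = iter(replies)
    return lambda prompt: next(remaining)


if __name__ == "__main__":
    fake = make_fake_model(["42", "41", "42 ", "42", "40"])
    print(self_consistent_answer(fake, "What is 6 * 7?"))
```

The whole scaffold is a loop and a counter. The cost is extra inference calls, which is a much easier trade on a small local model than on a large hosted one.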
Local, small, well-routed models are going to matter more than the current discourse suggests. I’d rather be early on that.
