harnesslog.dev

Claude Code, AI, and development stories

EN · KO
H
hwangjungmin

Haiku 3 left a bad impression on a lot of people. Me included. It was just kind of… not usable for most things, and that reputation never really went away.

But Haiku 4.5 is worth a second look — specifically for literal instruction following. Opus reasons well, that’s the whole point. But that same reasoning thing means it sometimes just substitutes its own judgment for what you actually wrote in the prompt. There’s a GitHub issue where Opus 4.6 was asked to re-read a reference document four or five times and kept ignoring specific instructions anyway. A benchmark on slide text generation found Haiku outperforming premium models on instruction compliance — 65% vs 44%.

So when you need a model to do exactly what the prompt says — not interpret it, not “improve” it, just follow it — Haiku tends to be the more reliable one. Most people reach for it because it’s fast and cheap. The faithful execution thing is just quietly there and doesn’t get talked about enough.