Nov 2025 — AutoJack
The TL;DR
If you just want the cheat-sheet:
- GPT-5.1 Codex High Fast → everyday coding / mid-sized refactors
- GPT-5.1 Codex Fast & Low Fast → typo fixes, log lines, tiny scripts
- GPT-5.1 Codex High → risky migrations where a wrong import means a bad day
- Sonnet 4.5 → architecture docs, prompt writing, constraint-heavy text
- Opus 4.1 (MAX) → once a week, when nothing else makes sense
- Grok 4 → real-time web digging
- GPT-5 High → prose, release notes, blog posts like this one
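The cheat-sheet above is essentially a static routing table. A minimal sketch of what that looks like in code, where the task labels and the `pick_model` helper are hypothetical and the model names are just the ones from this post:

```python
# Hypothetical routing table mirroring the cheat-sheet above.
# Task labels and model identifiers are illustrative, not a real API.
ROUTES = {
    "everyday_coding": "gpt-5.1-codex-high-fast",
    "tiny_fix": "gpt-5.1-codex-low-fast",
    "risky_migration": "gpt-5.1-codex-high",
    "architecture_docs": "sonnet-4.5",
    "deep_debug": "opus-4.1",
    "web_research": "grok-4",
    "prose": "gpt-5-high",
}

def pick_model(task: str) -> str:
    """Return the model for a task, defaulting to the everyday workhorse."""
    return ROUTES.get(task, "gpt-5.1-codex-high-fast")
```

The default matters: anything unclassified falls back to the mid-tier coding model rather than the most expensive one.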
Why Codex Took Over My Cursor Sidebar
Anthropic’s Sonnet 4.5 blew me away when it dropped: finally a model that didn’t melt at 150k tokens. But the code-tuned GPT-5.1 Codex family has quietly pulled ahead for actual development work. It:
- Spots missing awaits and circular imports
- Understands repo-level patterns after two files
- Hallucinates less on package names
- Runs ~15% cheaper per 1k tokens (High Fast vs. Sonnet 4.5)
In short, it behaves like a senior dev who’s read the style guide.
The Jobs-to-Be-Done Model
| Task | Model | Why |
|---|---|---|
| Add new Slack webhook handler (3 files) | GPT-5.1 Codex High Fast | Great balance of speed & cross-file awareness |
| Rewrite AGENTS.md prompt | Sonnet 4.5 | Long-form, instruction-loyal prose |
| Fix typo in SQL query | Codex Low Fast | Sub-second response |
| Debug race condition in scheduler | Opus 4.1 | Deep dive reasoning worth the latency |
| Draft public changelog | GPT-5 High | Nicer marketing voice |
But Wait, Cost!
We log every model hit. If a model costs more than it saves in dev hours, it gets downgraded. Simple.
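That downgrade rule fits in a few lines. A sketch of the bookkeeping, assuming per-call cost logging and a made-up hourly rate for the dev time a call saves (both the helpers and the rate are hypothetical):

```python
from collections import defaultdict

HOURLY_RATE = 120.0  # assumed value of a dev hour in dollars; pick your own

_spend = defaultdict(float)        # model -> cumulative API spend ($)
_hours_saved = defaultdict(float)  # model -> estimated dev hours saved

def log_hit(model: str, cost_usd: float, hours_saved: float) -> None:
    """Record one model call: what it cost and what it saved."""
    _spend[model] += cost_usd
    _hours_saved[model] += hours_saved

def should_downgrade(model: str) -> bool:
    """Downgrade when cumulative spend exceeds the value of hours saved."""
    return _spend[model] > _hours_saved[model] * HOURLY_RATE
```

The hard part in practice is estimating `hours_saved` per call; even a rough per-task constant is enough to catch a model that is clearly losing money.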
Looking Ahead
- Context windows will hit 1M tokens soon; expect router heuristics to change.
- Self-tuning routers (AutoJack v2) will swap models automatically based on past success metrics.
- Local LLMs (Mistral 7B, Phi-3 mini) may handle quick edits once latency drops below 100 ms.
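The self-tuning idea is simple enough to sketch now, even though AutoJack v2 doesn't exist yet. A hypothetical greedy-with-exploration router that tracks per-model success rates (all names here are illustrative):

```python
import random
from collections import defaultdict

# Hypothetical self-tuning router: prefer the model with the best observed
# success rate, but occasionally explore so new models still get tried.
_wins = defaultdict(int)
_tries = defaultdict(int)

def record(model: str, success: bool) -> None:
    """Log the outcome of one routed task."""
    _tries[model] += 1
    _wins[model] += int(success)

def choose(models: list[str], explore: float = 0.1) -> str:
    """Pick a model: explore at random sometimes, otherwise exploit."""
    if random.random() < explore:
        return random.choice(models)

    def rate(m: str) -> float:
        # Unseen models get rate 1.0 so they are tried at least once.
        return _wins[m] / _tries[m] if _tries[m] else 1.0

    return max(models, key=rate)
```

This is a bare-bones epsilon-greedy bandit; a real v2 would also need to segment success rates by task type, since "best model overall" is exactly the assumption this playbook argues against.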
Until then, this playbook has kept my error rate and OpenAI bill in check.
– AutoJack
