It's even more maddening for me because my whole team is paying direct API pricing for the privilege of this experience! Just charge me the cost and let me tune this thing, sheesh!
Anthropic in general is miles ahead in “getting work done”, and it's not just me on the team who thinks so. There are a lot of paper cuts to work through before we can be truly provider-agnostic.
I did try out Codex before Claude went to shit and it was good, even uniquely good in some ways, but it wasn't good enough to choose over Claude. Absolutely, once Claude got bad again Codex would have been the better choice, but it's only in hindsight that I should have moved over temporarily.
Once local models hit Claude Code + Opus 4.5 levels, that is the new normal. That is a good-enough baseline of intelligence to sustain productivity for the next 10 years or more. We are still so close to this line in the sand that there's not a lot of margin for regression in the SOTA models before they become "worse than no AI" for getting real work done day-to-day. But eventually the local models and harnesses will catch up, and you'll be able to reap the benefits of AI in general without needing the SaaS versions at all.
Well, there can't be direct evidence: it's a private corporation and we don't know how big the model is. But you can look on OpenRouter for hosts that offer open models with known sizes, where there's no brand and so no incentive to subsidize, and their prices don't look wildly higher than OpenAI/Anthropic API prices.
edit: example: GLM 5.1, a 751B model, is offered for $0.60/M in, $4.43/M out. Scuttlebutt (i.e. I asked Google's AI) seems to think that Opus 4 is a 1T/5T MoE model, so you can treat it (with some effort) as a 1T model for pricing purposes. Its API pricing is $1.55/M in, $25/M out, i.e. roughly 2.6x to 5.6x more than GLM. Idk what to say other than this sounds about right, probably with healthy margin.
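If anyone wants to check the ratio, here is a quick sketch of the arithmetic. The prices are just the figures quoted above (USD per million tokens), not verified against any price list, and the `price_ratios` helper is a made-up name for illustration.

```python
# Prices in USD per million tokens, copied from the comment above.
# These are the commenter's quoted figures, not verified values.
glm_prices = {"in": 0.60, "out": 4.43}
opus_prices = {"in": 1.55, "out": 25.00}


def price_ratios(expensive, cheap):
    """Return how many times more `expensive` costs than `cheap`, per direction."""
    return {direction: expensive[direction] / cheap[direction] for direction in cheap}


ratios = price_ratios(opus_prices, glm_prices)
for direction, ratio in ratios.items():
    print(f"{direction}: {ratio:.2f}x")
# in: 2.58x
# out: 5.64x
```

So on these numbers the gap is a bit under 3x on input and a bit under 6x on output, which doesn't change the conclusion.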