I tried it for two tasks using Claude Code, on max effort.
1. Web platform: I asked it to analyse a feature for creating reports and to come up with a better solution and better UX. It did great, I would say on par with Sonnet 4.6 or even Opus considering the thinking and explanation.
2. Mac app with some basic functionality: it did well from a functional perspective, but then I used Opus 4.7 to evaluate it and suggest improvements, and I noticed it had missed many vital points in the design system and usability.
I think it’s a leap, I haven’t used a model this capable that is not OpenAI or Anthropic
I do mind, since I enjoy speaking freely without concern of my opinions being linked to my employment. I assure you companies like this exist. Profiting off of inference is not the hard part, it's frontier training that is prohibitively expensive. You're free to disregard my commentary if you want, of course.
> Profiting off of inference is not the hard part, it's frontier training that is prohibitively expensive.
And given that Anthropic does both, it must make up its training costs by selling inference. jp57 was pretty clearly talking about Anthropic's flat-rate plans, rather than the flat-rate plans of companies that get to skip the most expensive part of the process.
I understand that very well, yes. The point I'm making is that I don't think Anthropic or OpenAI would have ever gotten significant traction if they didn't have flat-rate plans, because flat-rate plans themselves are not inherently predatory or part of the enshittification slope but actually extremely UX-friendly. Perhaps in another timeline, if their product was actually valuable enough to pay this price for, they could have simply provided a $50 plan as the standard level to provide enough margin to account for training costs as well. But as I see it DeepSeek is an existential threat to them, and they are now stuck between a rock and a hard place, because their product is devalued by its existence and if the frontier labs were to gate access with $50 plans they would get their lunch eaten even more quickly. It turns out there are downsides to burning inconceivably large stacks of other people's money.
> The point I'm making is that I don't think Anthropic or OpenAI would have ever gotten significant traction if they didn't have flat-rate plans...
That seems likely. If people had to pay their share of the actual all-in cost of the service (rather than having it be subsidized by investors with extremely deep pockets and a small handful of corporate customers), very, very few regular people would use it.
The point that 'jp57' pretty explicitly made [0] is that flat-rate plans that don't cover the all-in cost of providing the plans tend to result in those plans getting worse and worse and worse, as economic realities assert themselves. If the flat-rate plans that you are aware of actually cover the cost of providing the service, then you're discussing an entirely different situation that's entirely inapplicable to the discussion about Anthropic's pricing and degrading level of service.
[0] ...which is one that's understood by people who have been in pretty much any industry for more than a few years...
The crux of my argument is that there is a timeline where people would've paid the all-in cost of the service, with margin, as a flat-rate sub. The $20 rate was not sustainable when factoring in training costs but if not for DeepSeek they could have simply raised the prices rather than *gestures broadly* whatever the fuck is going on at Anthropic now, with a new PR fumble every three days. If the Chinese models didn't exist, people would've groaned but would likely still pay $40 or $50 for an LLM subscription.
You misdirected my quoted statement to assert a position I did not take. When I talk about flat-rate subs being a good UX, I am not talking about a subsidized rate. My position is that people will pay more for a flat-rate sub than they are willing to pay through per-token billing. That is, a consumer who would only pay an average of $10/mo if they used the API will voluntarily pay $20/mo for a sub, because even though it's a worse value, the latter is a tremendously friendlier user experience. When I say that flat-rate subs are necessary for traction, I mean that solely from a user experience perspective, not "subsidized usage is necessary for traction".
There’s also the “prepaid” alternative, especially if you’re skittish about budgets. You top up your account with $10, and when you are about to run over (maybe by setting an alert at around $8), you can add an extra $5 to make it to the end without interruption.
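For what it's worth, here is a minimal sketch of that prepaid flow in Python. Everything in it is hypothetical: the `PrepaidAccount` class, its field names, and the amounts are made up for illustration and do not correspond to any provider's actual billing API. It assumes the alert fires on total spend (the "$8" above) rather than on remaining balance.

```python
from dataclasses import dataclass


@dataclass
class PrepaidAccount:
    """Hypothetical prepaid credit account; amounts are in integer cents."""
    balance_cents: int = 0
    spent_cents: int = 0
    alert_at_spent_cents: int = 800  # assumed alert threshold: $8 of spend

    def top_up(self, cents: int) -> None:
        # Add prepaid credit to the account.
        self.balance_cents += cents

    def charge(self, cents: int) -> bool:
        """Deduct usage; return True once total spend has reached the alert threshold."""
        if cents > self.balance_cents:
            raise RuntimeError("prepaid balance exhausted; top up to continue")
        self.balance_cents -= cents
        self.spent_cents += cents
        return self.spent_cents >= self.alert_at_spent_cents


account = PrepaidAccount()
account.top_up(1000)              # start the month with $10
alerted = account.charge(820)     # usage so far crosses the $8 alert
if alerted:
    account.top_up(500)           # add $5 to finish without interruption
remaining = account.balance_cents
```

The point of the design is that spend can never exceed what was paid in, so the worst case is an interruption, not a surprise bill.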
> You misdirected my quoted statement to assert a position I did not take.
Nope. You're reading way too much into what I'm saying, rather than reading the words I'm writing.
> When I say that flat-rate subs are necessary for traction, I mean that solely from a user experience perspective, not "subsidized usage is necessary for traction".
Sure. I never claimed that you said "subsidized usage is necessary for traction". It's "just" that your broader point is not relevant to the topic under discussion, which is Anthropic's financial situation. That's why I said:
> If the flat-rate plans that you are aware of actually cover the cost of providing the service, then you're discussing an entirely different situation that's entirely inapplicable to the discussion about Anthropic's pricing and degrading level of service.
That situation doesn't describe what's going on with Anthropic and OpenAI, so subsidized usage absolutely is necessary for "traction" for them. Roughly no regular folks would pay the all-in cost for the service they provide.
It is also unclear to me how much real debt they carry. They have famously been signing many deals: RAM, datacenters, maybe nuclear power plants (I no longer know what is a joke and what is not). They must be carrying hundreds of billions in paper debt obligations, which is tough to pay back at $20B revenue.
When they put 10B in, they got weird tiered revenue shares and other rights. That has been simplified to 27% of OpenAI today. I don't know what their 10B would have been worth before dilution in later rounds.
Urs used to talk (internally) about not publishing "industry-enabling papers", which is why most Google infrastructure papers described something that had already been turned off, or was already in the process of being replaced by the next system (GFS, Vitess, etc). The things that did get published were either not considered key advantages, things that other companies simply cannot do, things that other companies wouldn't bother doing, or experiments that never worked at all. There were exceptions of course. But it led to a public perception of the Google stack involving mostly technologies that were long dead or were never adopted.
"Attention Is All You Need" was a very very different thing and I also wonder if they are glad they published it. But I imagine if they hadn't, the motivation for researchers to leave Google would have been even larger.
It makes every bit as much sense as investing in Snap while still operating their own social network product. Seems to have worked out fine (for Google, not Snap).
Google makes a competing product to Claude's main product? So competing, in fact, that they have to ban Googlers from using Claude in order to get enough dogfooders.
I used pro via API (DeepSeek API not OpenRouter) with Claude Code, and the planning, visual solution, understanding was fantastic.
I would say I wouldn't have noticed this wasn't Opus 4.6. What I asked for was a look at a recently implemented feature and how it could be improved. It consumed 3.3 million tokens and created a much better flow.
It had a bug related to the API when I started the implementation though, which I suppose is something they didn't catch when making their API compatible with CC.
There are carve-outs to allow for governments to make exceptions, but that's beside the point.
If the government were to hold itself to account, it would fine itself some amount N, and pay itself N using your taxes. All the paperwork and legal action involved would also waste other finite resources that could be used for something else.
Speaking pragmatically, there's no point trying to hold the government itself to its own laws. The only time citizens do hold the government accountable, it's always done in the form of hangings, or the guillotine in France's case.
what would be the point of the government fining itself though?
Now that I'm thinking of it, it would create the need for an extra gaggle of bureaucrats to oversee the process, so I suppose someone might see a point to it ...
You may think you're funny or something, but boy do I have news for you.
There absolutely are fines for French administrations. And, knowing the French tax system, they've probably found a way to levy VAT and some other taxes on top of those fines.