While this sounds generous (and in some ways it is), it does not address the general point that GP is making. That is, the systematic disadvantage which large parts of humanity have w.r.t. access to the tools. You could say they can't drive a Lamborghini either, but that also doesn't solve the problem.
An aside: It was a very nice gesture and completely unexpected by me, so even if it doesn't work out, it made my day. I personally believe that kind gestures have a lot of power.
Back on topic: There is a real danger of the gap between rich and poor universities significantly widening in all fields if the rich can afford Pro level models, or even hardware that can run their own comparable models, and this being fiscally inaccessible to the rest.
One can sweep this under the rug by blaming education funding, but this just shoots down all discussion. Even if the GDP of a country goes up by a lot -- such as Poland's -- it takes time before any benefit trickles down to the education budget, and with some governments it might never do so.
I believe Microsoft et al do have the most power here to boost affordable access to AI for researchers on a large scale; the fact that they cut some too expensive models (Opus, 5.5) from their academic benefits package is a grim omen. I do realize they would like universities to pay them also, and ultimately the universities should do that -- but then we are back at the institutional level of the problem.
It's a problem of the individual institutions and countries. The budget required for AI tools currently is negligible compared to other university expenses. We don't need to call everything a systemic disadvantage when the disadvantaged (at the institution level) have agency here.
Can you tell me what is the budget necessary to supply AI tools capable of substantial research assistance to all academic staff at a university?
You seem to have a good estimate in your head; I definitely do not.
From personal experience, ChatGPT 5.5 (the Plus tier) is excellent for programming tasks and also for various teaching related tasks but I have not observed the research benefits that Tim Gowers has when I asked it questions in my area of expertise. So the costs are definitely higher than a few dozen $ a month per PhD/professor.
You might be right that universities should immediately spring into action and demand funding for research level AI resources and hardware. One thing you might be mistaken in is that public universities are unfortunately very inflexible institutions; one reason for this is that they have a large internal leadership structure AND they are funded by the state, so even if the entire university agrees on something, the funding is at the whim of the ministry of education and thus the current political leadership.
> Can you tell me what is the budget necessary to supply AI tools capable of substantial research assistance to all academic staff at a university?
I think the GP meant that *if the tools provide substantial benefit* to staff, their costs can be compared to salaries and other large expenses of the university. The $100/month subscription costs less than your office space.
Which is good, since public money is tax money, so it better be spent wisely and not just thrown at the latest hype without thinking properly about it. It's a feature that public spending moves slowly, we should all be thankful for it.
> The budget required for AI tools currently is negligible compared to other university expenses.
Is it? Do you have any idea what the salary of a mid-tier university researcher in an Eastern European country is? Or in Africa or South-East Asia? With SOTA LLM pricing you easily get into the same order of magnitude, so essentially labour cost would double for researchers at such universities. Not "negligible" at all.
I feel like this is one of the most advantaged times in history in terms of regular citizens having access to cutting edge tools.
Looking online, it seems like the low-end estimate might be $30k a year for such math researchers? And ChatGPT Pro or whatever you want will run $100 a month, and should be coverable by grants. I’m quite sure MATLAB alone cost more in the past.
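To put rough numbers on the disagreement above, here is a back-of-the-envelope sketch. The $100/month and $30k/year figures are the ones quoted in this thread; the lower salaries and the function name `subscription_share` are hypothetical, chosen only to show how the ratio moves, and are not real salary data.

```python
def subscription_share(monthly_fee_usd: float, annual_salary_usd: float) -> float:
    """Return the yearly subscription cost as a fraction of annual salary."""
    return (monthly_fee_usd * 12) / annual_salary_usd


# $100/month and $30k/year come from the thread;
# the two lower salaries are made-up values for comparison only.
for salary in (30_000, 12_000, 6_000):
    share = subscription_share(100, salary)
    print(f"${salary:,}/year salary: subscription is {share:.0%} of salary")
```

At $30k a year the subscription is 4% of salary; whether that counts as negligible depends on how far below that figure local salaries actually sit, and on whether one subscription is enough for research-level use.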
> While this sounds generous (and in some ways it is), it does not address the general point that GP is making. That is, the systematic disadvantage which large parts of humanity have w.r.t. access to the tools. You could say they can't drive a Lamborghini either, but that also doesn't solve the problem.
This was also the case historically, when being at certain universities, with better professors, better scope of works available at the library, etc, would necessarily provide systematic advantage.
This is the reality of progress. It is always unevenly distributed.
I do think the open source side of model development is a substantial counter to the pessimism here.
I mean, I don't think OpenAI should be wading into the policies and practices of foreign institutions and governments. Look at all the blowback we see from the collision of Anthropic or OpenAI and the US government.
At present, the tools are available to whoever wants to buy them. It's not OpenAI's fault that the parent commenter's government and/or institution's policies haven't been updated to allow for their purchase and use.
I'd argue that the OpenAI dude's/dudette's level of generosity is appropriate given the circumstances.
You know what, I'm ashamed that I didn't think of this. I'll sponsor three months. Email in my hn profile. I don't understand the math in the article, but I'd love to help you make progress in it.
I will leave the contact up for a bit longer if people want to get in touch and share their experience with the research gap of the models -- or anything, really -- but I do not think there is any need of further support. Like I said elsewhere, the offer of support made my day and the gesture is enough.
This model is great at long horizon tasks, and Codex now has heartbeats, so it can keep checking on things. Give it your hardest problem that would take hours with verifiable constraints, you will see how good this is:)
It's genuinely so great at long horizon tasks! GPT-5.5 solved many long-horizon frontier challenges, for the first time for an AI model we've tested, in our internal evals at Canva :) Congrats on the launch!
That's what I've been heads down, HUNGRY, working on, looking for investors and founding engineers pst: https://heymanniceidea.com (disclaimer: I am not associated with heymanniceidea.com)
HN is owned by a startup accelerator and venture capital firm. They do growth hacking on the front page. And you probably know that since your throwaway account is several years old.
Interesting, I just had Opus convert a 35k LOC Java game to C++ overnight (a root agent that orchestrated and delegated to sub-agents), and I woke up and it's done and works.
What plan are you on? I'm starting to wonder if they're dynamically adjusting reasoning based on plan or something.
I'm on max 5x and noticed this too. I don't use built-in subagents but rather full Claude session that orchestrates other full claude sessions. Worker agents that receive tasks now stop midway, they ask for permission to continue. My "heartbeat" is basically "status. One line" message sent to the orchestrator.
Opus 4.6 worker agents never asked for permission to continue, and when heartbeat was sent to orchestrator, it just knew what to do (checked on subagents etc). Now it just says that it waits for me to confirm something.
Weird. I don't have this behavior, although I did with codex and 5.4 haha. I bet the providers are playing with settings underneath and different users are routed to different deployments, or they're secretly routing us to different models under load.
I'll be launching the game in a couple months so you can see it then :)
There are bugs, it doesn't work perfectly, but that's just part of testing and refinement at this point.
My initial prompt was just: "let's work on converting this java game to c++ using panda3d. you're a panda3d c++ expert. you will be the agent that owns the project, creating the plan, and the delegating each step to sub-agents that create each system in the correct order."
it created like 17 different tasks and sub agents and opus 4.7 orchestrated it. I did personally validate which rendering engine would be good for the project etc first.
Is there any task that actually doesn't require human intervention in between, even if it's just to set stuff up?
Like, I will get Opus to make me an app, but it will stop partway because I need to set up the DB and plug in the API keys, and Opus really can't do that on its own yet.
> in a highly controlled and predictable environment
Why this constraint? A common sentiment I see online (sorry to group you in) is "[tool] will be capable, actually, but only in a context that trivializes its usefulness."
I think modern post-training like RLVR + inference-time output token scaling can _probably_ scale so the agents can solve any computable task, even when placed in noisy or misconfigured environments. But it won't be economical for a long while. But it already seems largely capable of that today.
As if 3 different preview versions of the same model is not confusing enough, the last two dates are 05-06 and 06-05. They could have held off for a day:)
Since those days are ambiguous anyway, they would have had to hold off until the 13th.
In Canada, a third of the dates we see are British, and another third are American, so it’s really confusing. Thankfully y-m-d is now a legal format and seems to be gaining ground.
But it's not clear how to interpret the date code: 05-06 could be 5th June or 6th May; same story for 06-05. Very confusing due to American-style date formatting. Version numbers are at least sequential, with a bigger number being a later version.
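The ambiguity is easy to check mechanically. Below is a minimal sketch that lists every calendar date a two-number code could mean under both the month-day and day-month readings. The function name `possible_dates` and the year 2025 are assumptions for illustration; the thread does not state the year.

```python
from datetime import date


def possible_dates(code: str, year: int) -> set[date]:
    """Return every date an 'NN-NN' code could mean (read as MM-DD or DD-MM)."""
    a, b = (int(part) for part in code.split("-"))
    candidates = set()
    for month, day in ((a, b), (b, a)):
        try:
            candidates.add(date(year, month, day))
        except ValueError:
            pass  # this reading is not a valid month/day combination
    return candidates


# The year is a placeholder; only the month/day ambiguity matters here.
for code in ("05-06", "06-05", "06-13"):
    readings = sorted(possible_dates(code, 2025))
    print(code, "->", [d.isoformat() for d in readings])
```

Both 05-06 and 06-05 come back with the same two readings (May 6 and June 5), while any code with a number above 12 has only one. A code like 06-06 is also unambiguous, since both readings coincide.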
probably thinking of mapping the API routes to folder structure? like customer/transactions/{id} mapped to similar folder structure with a file containing the source for the API call
I have a deviated septum, but the doctors in the US don't think it's medically necessary to do a septoplasty. I also have sleep apnea, so come to think of it, I should get this done.
Not sure on the accuracy or if it applies to you, but prior to getting the septoplasty my ENT did mention it might not make a big difference if overweight or obese and so often doesn’t recommend it in those cases.
For comparison, I used some of those nasal strips before surgery and the post-surgery results were very comparable, so that might be an inexpensive option to try? I also recall someone mentioning some sort of nose plug thing from Amazon you could get that does the same thing.
there's no way you guys would get the same comp though? like not even close. MSFT irrespective of how much it wants you is not going to honor the 10X increase in valuation from the upcoming $90 billion valuation.
Even if they do, you will miss on the upside. MSFT stock is not going to 10X but OpenAI's might.
> Real MSFT stock beats a theoretical could-have-been with a 10x upside.
As someone who has done startups, I highly agree with this statement.
But a lot of people doing startups for the first time (including my younger self) don't understand the headache of private market equity, and thus do not apply the correct discount.
As they say, a bird in the hand is worth 2 in the bush.
> there's no way you guys would get the same comp though?
Nadella was directly involved and he's way smarter than that. Comping one of the best ML teams in the world correctly is child's play compared to undoing Ballmer's open source mess.
Uh, what? Are you claiming that if a company experiences a 10x (!) increase in valuation, that the dilution fully destroys that upside and the average developer experiences no increase in their comp? That is not even vaguely close to true in my experience.
> Are you claiming that if a company experiences a 10x (!) increase in valuation, that the dilution fully destroys that upside and the average developer experiences no increase in their comp?
Not the person you are replying to, but it depends on how diluted it gets. If they print 9x of currently outstanding shares as the value goes 10x, it would result in those original shares being worth exactly the same as before the 10x jump.
But I agree with you overall, in terms of the actual reality. I don’t think any company with half a brain would do that.
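The arithmetic behind the 9x example can be sketched as follows. The function name, the starting valuation of 100, and the 1% stake are arbitrary illustrative values, not figures from any real company.

```python
def value_of_original_stake(
    old_valuation: float,
    growth: float,
    new_shares_multiple: float,
    stake_fraction: float,
) -> float:
    """Value of a pre-dilution stake after the company grows and issues shares.

    new_shares_multiple is the number of newly issued shares as a multiple of
    the shares outstanding before (9 means nine new shares per existing one).
    """
    new_valuation = old_valuation * growth
    # The same number of shares is now a smaller slice of a larger share count.
    diluted_fraction = stake_fraction / (1 + new_shares_multiple)
    return new_valuation * diluted_fraction


before = 100 * 0.01  # a 1% stake in a company valued at 100 (arbitrary units)
full_dilution = value_of_original_stake(100, 10, 9, 0.01)
modest_dilution = value_of_original_stake(100, 10, 0.25, 0.01)
print(before, full_dilution, modest_dilution)
```

Issuing 9x the outstanding shares during a 10x jump leaves the stake worth what it was before, while a 25% increase in share count (a hypothetical, more typical-looking figure) leaves it worth about 8x as much.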
I have been the "regular people" at many companies that have increased in valuation. Dilution has never erased the gains of even much-more-modest increases in valuation. This sounds like paranoid fantasy to me.
How much do you really think that valuation would stand once OpenAI doesn’t get the massive subsidies from MS on compute? OpenAI couldn’t even be a going concern.
I guess B2B kind of makes sense. Like, most companies' data is already on their cloud, so a wrapper to answer questions on their data seems pretty useful. But I see that they want this to be a company's knowledge base chatbot, which kind of doesn't make sense given most companies use MSFT/GOOGL products for conversations + knowledge management?
Quality of output is at same level as GPT.
The biggest issues for us:
1. text-bison limited to 1024 output tokens.
2. We ask for JSON as the output format, but many times the result is not valid JSON (a trailing `,` after the last element, a missing `}` after an element, etc.). We have to write our own parsing code in the end to work around these JSON format issues.
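A minimal sketch of the kind of workaround described in item 2, assuming the failures are limited to trailing commas and brackets left unclosed at the end of the output. The function name `parse_lenient_json` is hypothetical and this is not the commenter's actual parsing code; a `}` missing in the middle of the document is not handled here.

```python
import json
import re


def parse_lenient_json(text: str):
    """Try strict parsing first, then retry after two simple repairs."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Repair 1: drop a comma that directly precedes a closing bracket.
    # Caveat: this naive regex would also alter such a sequence inside a string.
    repaired = re.sub(r",\s*([}\]])", r"\1", text.strip())

    # Repair 2: append closers for brackets that are still open at the end.
    stack, in_string, escaped = [], False, False
    for ch in repaired:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    repaired = repaired.rstrip().rstrip(",") + "".join(reversed(stack))

    return json.loads(repaired)  # still raises if the repairs were not enough


print(parse_lenient_json('{"a": [1, 2,], "b": 3,}'))
print(parse_lenient_json('{"a": {"b": 1}'))
```

Strict parsing is tried first so that valid output is never touched; anything the two repairs cannot fix still raises `json.JSONDecodeError`, which makes it easy to fall back to re-prompting the model.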