Hacker News | AlexCoventry's comments

Sounds like bitterness and resignation, to me.

Right? It's exactly how we train AIs, after all. It's not like this is mysterious.


How is this one a submarine? It is not even PR.

It's potentially PR (it hit the front page of HN, and probably not organically) for Trump, Oracle or SAP, just from reading the first few paragraphs.

I've been using them for decades without issue, FWIW.

You are either a scientific anomaly or a single data point. Or both.

Well, then I am another anomaly and data point. I've been using ear plugs for 30 years, at least 95% of nights. Never once had an ear infection.

And by induction, earplugs don't give anyone an ear infection. QED!

I am also a scientific anomaly. Seems there are a lot of us!

You're a cohort now!

I have found it depends on how comfortable the earplugs are. If they feel uncomfortable in the ear, there's a good chance I'll get an infection or inflammation in the next few days.

Me too: 20 years without a single ear infection, and without a single day without ear plugs.

He describes in detail how curl is software-engineered to within an inch of its life. Do you really think most code is that highly polished?

You don't think AI is going to be able to understand things and apply its ability to formulate solutions better than you, in the near future?

In 2000 I learned about this old technology called "neural networks".

AI really depends on long winters and rare breakthroughs. Deep neural networks were the most recent breakthrough.

The iterations you currently see are just adding more storage, but the fundamental neural-network structure doesn't change.

I'm confident AGI will not be achieved by the LLM architecture, and when the next AI breakthrough comes is anyone's guess. But if you take history into account, it will take a while.


Yes, same. From the late 90s through the early aughts, I was taught over and over again that neural networks were a dead-end concept and would never amount to anything.

Just like all the preceding AI booms, this one will hit its maximal point, the hype train will fizzle, the best parts will just become "normal", and then a couple of decades later something new will come to push the boundary again.


No, I don't. Do you? If so, why? Extrapolation from guesswork?

Yeah, I have an allergic reaction to tiktok being mixed up in any serious intellectual pursuit. :-)

Bandwidth is the killer, in distributed LLM training.
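A back-of-envelope sketch of why bandwidth dominates: in naive data-parallel training, every worker has to exchange a full gradient copy each step. The model size, precision, and link speeds below are illustrative assumptions, not measurements of any real system.

```python
# Back-of-envelope: per-step gradient traffic in naive data-parallel training.
# All numbers are illustrative assumptions, not benchmarks.

def sync_time_seconds(params: float, bytes_per_param: float, link_gbps: float) -> float:
    """Time to ship one full gradient copy over a link,
    ignoring overlap, compression, and all-reduce tricks."""
    grad_bytes = params * bytes_per_param
    link_bytes_per_s = link_gbps * 1e9 / 8
    return grad_bytes / link_bytes_per_s

params = 7e9   # assume a 7B-parameter model
fp16 = 2.0     # bytes per gradient value in half precision

home = sync_time_seconds(params, fp16, 1.0)        # 1 Gbps consumer link: minutes per step
fabric = sync_time_seconds(params, fp16, 900 * 8)  # ~900 GB/s datacenter-class fabric: milliseconds
```

The three-orders-of-magnitude gap between the consumer link and the datacenter fabric is the point: over the open internet, the sync time swamps the compute time per step.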

What’s the rush?

It depends on the purpose of the model. AFAIK, LLMs aren't particularly capable at researching answers, relying more on having 'truth' baked into their weights, so if it takes 12 months to train a crowd-trained LLM, it'll be 12 months behind the times.

How serious a risk is poisoned weights?

Can we leverage the cryptobros into using LLM training as a proof of work?


What? I use Qwen 3.5 35B-A3B and it definitely knows how and when to do web searches to fill in gaps in its knowledge.

Does Qwen3.5 know it needs to do this because the API in question has had loads of churn and much of its training data is on obsolete versions, or do you need to prompt it? How well does it handle having an API reference with sample code in its context window?

Having an LLM use a web search tool isn't the same thing as researching a topic, IMO, because it's so ephemeral and needs constant reinforcement. LLMs aren't learning machines, they're static ones.
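The "ephemeral" point can be sketched as a toy tool-use loop: the search result is injected into the context window for one exchange and never written back into the weights, so the next conversation starts from scratch. Every name here is a hypothetical stand-in, not any real model or API.

```python
# Toy sketch of ephemeral tool use: search results live only in the
# context of one exchange; the model's weights are never updated.
# All names are hypothetical, for illustration only.

def fake_search(query: str) -> str:
    """Stand-in for a real web-search tool."""
    return f"[top results for: {query}]"

def answer(question: str, weights_are_stale: bool) -> str:
    context = [question]
    if weights_are_stale:                      # model decides its baked-in knowledge is outdated
        context.append(fake_search(question))  # result goes into the context window only
    return " | ".join(context)                 # weights untouched; the next chat forgets all this
```

Contrast with actual learning: nothing in this loop persists, so the same gap has to be re-filled by a fresh search every session.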


How many facts change over time to create obsolete data? Unless you’re researching current events, I contend it’s a moot point.

You only need to train a range of small models in order to establish a plausible scaling law, IMO.
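One way to sketch that idea: train a few small models, fit a power law L(N) = a * N**(-b) in log-log space, and extrapolate. The runs below are synthetic points generated from an assumed exponent, purely to illustrate the fitting step.

```python
import math

# Sketch: recover a scaling-law exponent from a handful of small runs.
# The "runs" are synthetic, generated from an assumed law with b = 0.076,
# so this only illustrates the fit, not any real measurement.

runs = [(1e7, 10 * (1e7) ** -0.076),   # (param count, final loss)
        (1e8, 10 * (1e8) ** -0.076),
        (1e9, 10 * (1e9) ** -0.076)]

# A power law is a straight line in log-log space; fit its slope by least squares.
xs = [math.log(n) for n, _ in runs]
ys = [math.log(l) for _, l in runs]
k = len(runs)
slope = (k * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)) / \
        (k * sum(x * x for x in xs) - sum(xs) ** 2)
b = -slope  # recovered exponent
```

With real (noisy) runs you'd want more points and error bars, but the shape of the procedure is the same: small models pin down the line, and the large model is read off the extrapolation.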

I don't think this is giving up. He's getting inside information on how Claude works, and a huge stream of Claude usage data. This will all inform future grok development, IMO.
