Hacker News | davidatbu's comments

Note that one thing that fueled the misinformation was that Kimi employees on Twitter expressed surprise that the base model was Kimi K2.5, and indignation that Cursor didn't credit Kimi. They later deleted their tweets (what I infer from that is that some employees were not aware of some pre-existing agreement or understanding between Cursor and Kimi until the drama happened).

I can spare a minute :). This isn't exhaustive because this is just stuff I know of, obviously.

- At Stanford, led research on the first (to my knowledge) crop of joint image/text models. Super widely cited work.

- At Tesla, led their whole self driving effort for a while, came up with critical techniques that allowed them to make progress (e.g., the concept of "auto labelling": using a much larger NN to generate training data with which to train smaller models that could fit in the on-device compute. IIRC, Elon said they would not have been able to make progress without this insight).

I'm not sure his educational efforts fit the mold of what you're looking for, but if so, the neural networks course he designed at Stanford (and made available online), as well as his blog posts (the most famous of which, to my knowledge, is "the unreasonable effectiveness of LSTMs"), made a huge impact on educating a generation of tinkerers and researchers.


The auto labeling work (which has been partially described/presented at Tesla AI day events) seems more like engineering than research, a grab bag of techniques that I would guess the whole team must have contributed to. For example, they auto label low resolution/indeterminate objects (image segments) by temporal continuity... Something that is a low-res blob in the distance becomes a hi-res and easy to identify object when you drive by it, so by tracking objects backwards across frames you can learn how to more confidently label the lo-res blob. Things like this are useful, but it's the sort of stuff that engineers and developers are coming up with every day.
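For what it's worth, the backward-tracking idea described above can be sketched in a few lines. This is only an illustration, not Tesla's actual pipeline: the `Detection` type, the `backfill_labels` helper, and the 0.9 threshold are all hypothetical, and the sketch assumes an object tracker has already grouped detections of one object into a single track.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Detection:
    frame: int             # frame index within the clip
    label: Optional[str]   # classifier's guess, or None if undecided
    confidence: float      # classifier confidence in [0, 1]


def backfill_labels(track: List[Detection], threshold: float = 0.9) -> List[Detection]:
    """Copy the most confident label in a track onto earlier, low-confidence frames."""
    confident = [d for d in track if d.label is not None and d.confidence >= threshold]
    if not confident:
        # Nothing in this track was ever seen clearly; leave it untouched.
        return list(track)
    anchor = max(confident, key=lambda d: d.confidence)
    relabelled = []
    for d in track:
        if d.frame < anchor.frame and d.confidence < threshold:
            # The far-away blob inherits the label of the close-up view.
            relabelled.append(Detection(d.frame, anchor.label, d.confidence))
        else:
            relabelled.append(d)
    return relabelled


# One object seen over three frames: a blob, a wrong guess, then a clear view.
track = [
    Detection(0, None, 0.10),
    Detection(1, "truck", 0.35),
    Detection(2, "stop sign", 0.97),
]
labels = [d.label for d in backfill_labels(track)]
```

The relabelled early frames are the useful output here: they become training examples for what the object looks like at a distance.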

Not back in 2016.

You don't think that tracking objects from frame to frame is obvious?!

I can guarantee you this was built-in from day #1

I'm guessing you're not a developer if you don't then automatically think of edge cases like "what if car #1 isn't in the preceding frame" ... (then you look at some relevant test data and see it was there, unlabelled ...)


Obvious in hindsight and obvious at the time are very different things.

You seem to have missed the main point anyway - using a larger model to generate labels for a smaller one is what the parent was highlighting, not the temporal labeling alone. The gold standard at the time was human labeling (e.g., Waymo). Deep learning was just having its moment, all of this stuff was cutting edge, and there is a lot of work in between a published paper and actually applying that to production vehicles.


Yes, the automated labelling (which replaced a large team they had doing manual labelling) that Tesla implemented consisted of a bunch of different things.

Generating a training set, training on it, and then inferencing on the trained model are three different things.

1) Generating the auto-labelled training set was of course done on Tesla's supercomputer, based on data from 1000s of cars.

2) Using the generated training set to train the in-car model would also be done offline.

3) The trained (and tested) model is then deployed to the car and used by the vision system to label image segments ("stop sign", "cyclist" etc).

How could this be divided up any other way?!

Karpathy seems like a great guy, but honestly there seems to be little to nothing in his background that makes him stand out as an architecture guy or being very creative. Maybe his thesis on image captioning is his most creative work, but at the end of the day this consisted of feeding the output of a CNN into an LSTM, conceptually very similar to the way language translation was being done at the time by feeding the output of an encoder LSTM for language A into a decoder LSTM for language B, except Karpathy was using an image encoder (off the shelf CNN) since he wanted to describe (caption) images. It was certainly at least somewhat innovative at the time, but what he was really famous/popular for at Stanford was for teaching the CS 231n class on using CNNs, and this is what he continues to be best known for - explaining how things work.


Karpathy is also badmephisto, a name you might have heard of if you're into cubing.

http://badmephisto.com/


Tesla still hasn't achieved their 2016 self-driving goal, which had a self-imposed deadline of 2017, even now a decade later. So, politely, is that accolade merited?

The current vehicles sure seem to come close. I'm not entirely clear on how they've missed this goal, but the current models can do full self driving where I live, including parking.

Sure they have improved but how do we define success? Is success "It can drive a road it has never been on?" Even then I'm not sure because the model (not the physical car) has probably scanned that road before so it is recalling a prior route while being aware of hazards. Is that learning, or rote memorization?

A Tesla drove coast to coast on full autopilot.

My Tesla drives to Walmart, finds a parking spot, comes to me outside Walmart and drives me home. I've been driving my Model 3 for years, and honestly, I've never had to "take over" due to a safety issue.


I could never trust a Tesla to drive safely around people. They seem like death traps. Could you share a link to the coast to coast drive please? How aided was it?

https://x.com/karpathy/status/2006436622909452501

How old are you if you don't mind me asking?

>I could never trust a Tesla to drive safely around people. They seem like death traps.

Have you ever been in a Tesla? It's literally been rated the safest car in America since its inception.


That's just a man standing by his car. I'm asking for video proof, I'm sorry that wasn't clear off the bat. I also abstain from X due to Elon's track record, so I'm not going to keep searching there for it. Could you please tell me how to self serve on this?

I've driven a Tesla on and off for about 4 years, and I'm thankful to never do so again.

> death trap

https://apnews.com/article/tesla-crash-doors-musk-regulators...

> It's literally been rated the safest car in America since its inception.

I'm locating our disconnect a bit better: something can be marked "safest car" for road tests but still be rife with issues, like its obnoxious UI choices, etc.

https://www.tesladeaths.com


> also abstain from X due to Elon's track record

I can just tell you hate everything Elon, so this is a pointless conversation. That was a post from Karpathy, who we are all talking about in the thread, so I thought it was the most pertinent. I'm sure you can google it, it's proven, so no point arguing that it didn't happen.

Obviously, since you can't even use X out of your hate for Elon, there's no way you have "driven a Tesla on and off for four years". That's just a lie. NHTSA has given every vehicle a 5/5, and the Model 3 is "the top safety pick" of all cars for its crash test results.

The safety comes from the inherent electric drives. They are much less likely to flip and much less likely to catch fire.

From your Tesla Deaths link, 772 deaths over hundreds of billions of miles is absolutely incredible. Do you have any data to share on Ford's mile-to-death ratio? Do you offer any comparisons? Or are you still just hating Elon, for being Elon?

Edit: Also, have you looked through the Tesla Deaths that you posted? A drunk driver is involved in a lot of those, at no fault to the Tesla. One of the largest "Tesla Deaths" was someone driving on the wrong side of the freeway who crashed into the Tesla, killing a whole family in the Tesla. How on earth are you using this slop as evidence that Teslas are unsafe.... That's not ignorance, that's actually just evil...


lol Dude I should be the one verifying your age with this response. I don’t hate Elon, I am wary of his track record and I don’t use X. I drove a Tesla back in 2012, and it’s been a marked downgrade ever since.

I’m not going to keep engaging with someone who makes wild assumptions about my stance and accuses me of lying, ignorance, and evil. I was really hoping for video proof; I know Elon is the type of guy to hold that high with all that SpaceX footage.

Good day sir :)


In 2012? lolol, they didn't even have a Model 3. That was before they IPO'd. On and off over years is not "I drove a Tesla in 2012". So I was right to call you out for being a liar. "Been a marked downgrade ever since"... From 2003 up until the end of 2012, Tesla sold approximately 5,100 to 5,450 vehicles globally. You got a first generation Model S before they even had a manufacturing line.

You really think that, going from 5k cars to the 9m they've produced today, it has only gone downhill? Have you tried FSD since 2012? lol

Edit: ooooohhhhhh, I actually remember you now, looking at your past comments. You are the one who hated on the book "There Is No Antimemetics Division" for being too "cliche", which is like my and everyone I know's fav book of all time. You are just the type of person that hates everything that is universally loved... I get it now.


How do Elon's arbitrary deadlines impact whether the accolade is "merited"? Incredible progress was made in a fairly short amount of time. His accolade isn't based on his employer's ability to predict delivery dates, it's based on the quality of the systems that are actively deployed today.

I think an accolade's merit is based on the definition of done for work delivered. Elon certainly told the public a certain vision of self-driving (a definition of done) and it didn't come to fruition despite PR progress; i.e. a washing machine can do a lot of work, but is it the right work?

We can arbitrate about what "self-driving success" means until the cows come home, but my point is I've seen a lot of self-driving failures from the Teslas I've witnessed in person.


Thank you!

I was more looking for signal that him + Anthropic might yield something beyond a step-change from Opus 4.7 (disappointing so far). We have not gotten to use Mythos yet, I wonder if that will become Opus 5 or something.


It wasn't LSTMs, it was RNNs.

Thanks for correcting the title I misremembered. Fwiw, the article did culminate with LSTMs: https://karpathy.github.io/2015/05/21/rnn-effectiveness/

---------------------

EDIT: It looks like you deleted the part of your post I quoted below. So feel free to ignore my question about it, I guess.

---------------------

Not sure what you mean by

> Shows how much you know

Do you mean that the fact that I misremembered a word in the title suggests that I know very little about Karpathy's contributions to the field of neural networks?


Add microgpt to that list

Do you have some examples?

Ah, I just learnt that you don't. Jarred's comment saying exactly that: https://news.ycombinator.com/item?id=48133806

I'll actually concede that, on a slower skim, some changes to the test suite and fixtures that first seemed suspicious to me indeed align with what those tests were doing previously, and I wish I could retract that comment.

I still think it's not as impressive a test suite as is being claimed; which, if this actually works out, should say more about Claude's skill than the people driving it.


Gotcha. I'm genuinely curious: by "impressive", are you referring to coverage? I'd be grateful if you could say a few words about how it could be more impressive (e.g., if you indeed meant to talk about coverage, say what functionality/edge cases aren't covered as of now).

Our programming languages are bad at specification and verification, so the next best thing is property-testing for modeling (e.g. Hypothesis for Python) or, for the reference implementations, extensive "expect"/snapshot test cases (e.g. Cram).

Instead, I found the bog-standard suite with a single case per regression and very little actual modeling, although I wasn't expecting more. (I don't care much for JS, let alone Bun, so I can't point to features I'd like to see better tested, but I'm sure the issue tracker can do that job already.)

To be fair, our whole industry is really bad at this; most test suites are verification theatre, but now that machines can fill out implementations on their own, we should strive to properly model our requirements and limits so they can one shot what we intended. Otherwise we're left in an awkward middle in which we don't add much value over the AI fumbling around.
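To make the property-testing idea above a bit more concrete, here is a rough, stdlib-only sketch. It is not Hypothesis itself (a real Hypothesis test would also handle input generation strategies and shrinking of failing cases); the run-length encoder and the `check_roundtrip` helper are hypothetical stand-ins picked only because their properties are easy to state.

```python
import random
from itertools import groupby
from typing import List, Tuple


def rle_encode(text: str) -> List[Tuple[str, int]]:
    """Run-length encode a string into (character, run length) pairs."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs: List[Tuple[str, int]]) -> str:
    """Invert rle_encode."""
    return "".join(ch * count for ch, count in pairs)


def check_roundtrip(trials: int = 500, seed: int = 0) -> int:
    """Check properties on many random inputs instead of one hand-picked case."""
    rng = random.Random(seed)
    for _ in range(trials):
        # A tiny alphabet makes repeated characters (the interesting case) likely.
        text = "".join(rng.choice("ab ") for _ in range(rng.randint(0, 20)))
        encoded = rle_encode(text)
        # Property 1: decoding an encoding gives back the original input.
        assert rle_decode(encoded) == text
        # Property 2: adjacent runs never share a character.
        assert all(a[0] != b[0] for a, b in zip(encoded, encoded[1:]))
    return trials


checked = check_roundtrip()
```

The point of the style is that the assertions state what must hold for every input, rather than pinning down one regression case at a time.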


Thank you!

Not OP. For this particular use case, I think performance is a primary concern.

But if you mean in general, I also totally feel that languages that let you represent more invariants statically are better fit for LLMs. I'd love to see experimentation with LLMs with dependent types and managed effects.


Fwiw, that's not the stated motivation for the rewrite experiment. In fact, the Rust rewrite is slower to compile than the zig code when compiled with their internal fork of zig (tho it is faster when OG zig is used).

I don't want to infringe upon your right to speculate. I just want to point out that your statement is at best a speculation.


I'm pretty sure that they have decided that backwards-compat is not the best path for Mojo. Matter of fact, the following is the _last_ item on the roadmap on the home page:

> Supporting more of Python's dynamic features like classes, inheritance, and untyped variables to maximize compatibility with Python code.

What's more, note how it says "to maximize compatibility" not "to achieve full compatibility."


Well, off the top of my head, both chatgpt.com and Gemini have text on their home page to the effect of "AI can make mistakes". I'll bet a few bucks such copy can be found in other places, including the terms of service.


Sure, but bear in mind that in the US a fridge comes with a warning not to stand on top of the fridge door ...

"AI can make mistakes" is a bit quaint given that LLMs sometimes completely ignore what you say, and do the exact opposite. "Yes, I deleted the database. I shouldn't have done that since you explicitly told me not to. I won't do it again." (five minutes later: does it again).

I think the API terms of use is where this would be most needed, with something a lot more explicit about the potential danger than "AI can make mistakes". We are only at the beginning of this - agentic AI - no doubt lawsuits will eventually determine the level of warnings that get included, and who is liable when failures occur despite product being used as recommended.


Do you do rolling deploys?


I'd love to see a link to these emails, if you have one handy!


I don't think pypi or npm allow replacing existing packages?


They absolutely do. In this case litellm 1.82.8 had been out for at least a week (can’t recall the exact date offhand). The compromised version was a replacement.


It actually wasn't. That was one of the reasons why I looked into what was changed. Even 1.82.6 is only at an RC release on github since just before the incident.

So the fact that 1.82.7 and then 1.82.8 were released within an hour of each other was highly suspicious.


Ah, my mistake! Thanks for the correction.

But I believe you can replace versions on both, nonetheless. It’s a multi step process, unpublish then publish again. But the net effect is the same.


PyPI enforces immutable releases.

https://pypi.org/help/#file-name-reuse

> PyPI does not allow for a filename to be reused, even once a project has been deleted and recreated...

> This ensures that a given distribution for a given release for a given project will always resolve to the same file, and cannot be surreptitiously changed one day by the projects maintainer or a malicious party (it can only be removed).


If you lock your dependencies, it should fail if the hash doesn't match.
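A minimal sketch of what that check amounts to, assuming a SHA-256 digest was recorded when the lock file was created. The `verify_artifact` helper and the byte strings are made up for illustration; in practice the installer does this for you (e.g. pip's `--require-hashes` mode).

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a downloaded artifact."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_sha256: str) -> None:
    """Raise if the artifact's digest differs from the one pinned in the lock file."""
    actual = sha256_hex(data)
    if actual != pinned_sha256:
        raise ValueError(f"hash mismatch: expected {pinned_sha256}, got {actual}")


# Digest recorded at lock time (stand-in bytes instead of a real wheel file).
original = b"pretend this is a wheel file"
pinned = sha256_hex(original)

verify_artifact(original, pinned)  # same bytes: passes silently

try:
    verify_artifact(b"tampered bytes", pinned)
    tampered_rejected = False
except ValueError:
    tampered_rejected = True
```

Note that this only guards against a file changing under a version you already pinned; it says nothing about whether a newly published version is trustworthy when you first lock it.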


1.82.7 and 1.82.8 were only up for about 3 hours before they were quarantined on PyPI.

