Hacker News | Den_VR's comments

Does the hammer lie to you that everything is a nail?

Can a voltmeter _lie_ to you?

EEs are expected to know when their measurements are wrong. And Professional Engineers are legally accountable for the consequences of such mistakes.


If a hammer had a chat interface that said everything was a nail then the answer would be yes, the hammer lies to you about everything being a nail.

That wasn’t the question though? A hammer doesn’t have a chat interface, that’s the point.


If someone believes a hammer when it tells them such things, they should probably have some sort of a caretaker assigned to help them through life.

If hammer companies were suddenly the most valuable international companies, and spent millions on ad campaigns and lobbying about trusting the hammer interface, then you could assume a large number of people might trust the hammer interface.

Still, it's a tool.

Even if your tool learns to talk and to make decisions, it's still a tool, not a person. You're the person and the one responsible for the decisions you make based on your tools.

Going back from the analogy, the problem is that we conflated software <engineers> with "coders". A lot of people thought their job was to create code; we gave them a tool to generate a lot of code fast, and they truly think that "more code" = "more good".


A hammer usually doesn't have the power to persuade people.

> it's still a tool, not a person.

Tell that to the CEOs who have replaced all of their yes-men with yes-chatbots.


Where are the ad campaigns telling me to trust LLMs?

I don’t use an adblocker, do read traditional dead tree newspapers and do get exposed to satellite tv channels.

I don’t think I’ve ever seen anyone anywhere telling me how reliable LLMs are.

Pretty sure this tech sells itself to consumers, enterprise sales are what they’ve always been.


Literally saw a video ad the other day which went like "I've always been cautious using Google's AI because it sometimes gets things wrong, but this time, it got it right!"

So now you're pivoting away from the caretaker proposal? I thought it had potential but I don't know how you'd fund it.

> I thought it had potential but I don't know how you'd fund it.

The same way we fund other social services here in Europe. If an individual is incapable of caring for themselves, the state is expected to care for them.


If I had a hammer robot that I told to go hammer some nails in a birdhouse and it goes "Sure, I'm on it!" then it nails a cat to the wall and says "Here's your new complete birdhouse, it's perfect in every way and will make everyone jealous", then yes, that is a tooling issue.

The question wasn’t about a hammer robot, it was about a hammer.

That's not a good analogy then. What benefit is provided by a hammer that just tells the operator (who has eyes and can see) that there is a nail under it (and, I assume, tells them to swing)?

Yes, a voltmeter can lie to you.

Full disclosure: I do high voltage testing for a living.


It can misread, but meters cannot actively generate an incorrect output based on user expectations.

…yet!

Enshittification knows no bounds

If software engineering wants to progress past being an "art" and be considered an engineering discipline, then it should adopt methods and practices from engineering. First and foremost, one of the universal methodologies is root cause analysis of faults, and redundancy to avoid them. For example, the FAA requires two pilots for airliners, and each system is built redundantly, so if an engineer misses a bolt or rivet, the plane won't crash. Intersections are designed such that there is a forcing function[0] on the behaviour of the motorists to prevent faults. Or, to take your tool analogy, nail guns are designed to be pressed against something with a decent amount of pressure before you can fire them.

All of these systems are designed around the core idea of "a human acting irrationally or improperly is not at fault" and, furthermore, that a human can have a bad day and still avoid a mistake. They all steer someone around a possible fault. Hell, the reason why we divide the road into lanes is itself a forcing function to avoid traffic collisions!

So, where is the forcing function in large language models? What part of a large language model prevents gross misuse by laymen?

I can think of examples here and there, maybe. OpenAI had to add guard rails to stop people from poisoning themselves with botulism and boron, etc. But the problem here is that the LLM is probabilistic, so there's really no guarantee that those guard rails will hold. I seem to remember a paper from a few months back, posted here, that showed AI guardrails cannot be proven to work consistently. In that context, LLMs cannot be considered "safe" or "reliable" enough for use. Eddie Burback has a very, very good video showing an absolute worst-case result of this[1], posted here last year. Even then, off the top of my head, Angela Collier has a really, really good video demonstrating that there's an absolute plethora of people who have succumbed, in large ways or small, to the bullshit AI can spew[2].

I feel like if most developers were actually serious about being an engineering discipline, like we claim, then we wouldn't have all jumped on the LLM bandwagon until they'd been properly tested and had a certain level of reliability. Instead there's a sizable chunk of people saying they've stopped coding by hand entirely, and aren't even reviewing the code! i.e. they've thrown out a forcing function that existed to prevent erroneous PRs being committed! And for some bizarre reason, after about two decades of people talking about type safety and how we need formal verification to reduce error, everyone seems to be throwing "reduction of error" out the window!

[0]: https://en.wikipedia.org/wiki/Behavior-shaping_constraint (if you're curious about the term)

[1]: https://www.youtube.com/watch?v=VRjgNgJms3Q

[2]: https://www.youtube.com/watch?v=7pqF90rstZQ


> I feel like if most developers were actually serious about being an engineering discipline, like we claim, then we wouldn't have all jumped on the LLM bandwagon until they'd been properly tested and had a certain level of reliability

Development can’t be a “serious” engineering discipline because the economics of tech companies doesn’t allow for it. But this has a lot less to do with developers, and significantly more to do with the severe pressure company executives are putting on everyone to use AI, no matter what.

But let’s be honest, many companies have adopted things like root cause analysis and blameless postmortems to deal with infrastructure reliability and reducing incidents. Making systems resilient to human mistakes, making it impossible for the typo to blow up a database, etc. are considered best practices at most places I’ve worked. On the product side, I think it’s absolutely normal to make it hard for a user to take an action that would seriously mess up their account.

The core problem happens when your product idea (say, social media) has vast negative externalities which the company isn’t forced to deal with economically. Whereas in other engineering disciplines, many things are actually safety related and you could get sued over. I’m imagining pretty much anything a structural engineer or electrical engineer works on could seriously hurt or kill someone if a bad enough mistake was made.

That just doesn’t apply to software. There is a lot of “life & death” software, but it’s more niche. The reality is that 90% of what the tech industry works on is not capable of physically harming humans, and it’s not really possible to sue over the potential negative consequences of… a dev tooling startup? It’s a very, very different industry than those other engineering disciplines work in.

But, software engineering has actually been extremely successful at minimizing risk from software defects. The worst software-level mistake I could realistically make would… crash my own program. It likely wouldn’t even crash the operating system, since it’s isolated. That lack of trust in what other people might do is codified everywhere in software. On an iPhone, I’m downloading apps written by tens of thousands of other engineers, at essentially no risk to myself at all.


> Can a voltmeter _lie_ to you?

Hell fucking yes it can?


When used according to its datasheet/user manual, how?

Cheaper voltmeters will lie about RMS values when not reading a pure sine wave.
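A quick way to see this is to simulate an average-responding meter, the design used in most cheap multimeters. This is an illustrative sketch, not a model of any specific meter; the 1.11 scale factor is the form factor of a sine wave:

```python
import math

def true_rms(samples):
    # True RMS: square, mean, root -- correct for any waveform.
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def average_responding_reading(samples):
    # Cheap meters rectify and average, then scale by ~1.11 (the form
    # factor of a sine wave) to display an "RMS" number. That factor
    # is only correct for pure sine waves.
    rectified_mean = sum(abs(s) for s in samples) / len(samples)
    return rectified_mean * (math.pi / (2 * math.sqrt(2)))

n = 10_000
sine = [math.sin(2 * math.pi * i / n) for i in range(n)]
square = [1.0 if i < n // 2 else -1.0 for i in range(n)]

# On a sine wave the two agree; on a square wave the cheap meter
# reads about 11% high (it shows ~1.11 V for a true 1.00 V RMS).
```

So the meter "lies" exactly as advertised: the number on the display is only an RMS value under an assumption the operator is expected to know about.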

When their precision mismatches their accuracy (or your expectations as driven by their design), just like with any other metrology tool.

Now you might say: "but the datasheet will give you the tolerances, and the manual will tell you to mind it!"

And yes, that's true. Just like how LLM providers also do: they tell you that outputs may be arbitrarily wrong, and that you should always check for mistakes.

Is this bullshit? Yes. So are metrology tools that have a mismatching precision and accuracy, need calibration, and have designs that fail to make you mind either of these, sending you to reading duty instead. Which just so happens to be a whole lot of them.

It is also absolutely not bullshit of course, because it is a fundamental limitation, just like those properties are for metrology devices. LLMs produce arbitrary natural language. Short of becoming able to perfectly read and predict the users' mind, they'll never be able to make any hard assurances, ever.

Defective devices also exist, and so do incorrect / incomplete documentation.


That's the difference - they have well defined and bounded errors. LLMs do not.

Which is a notably different argument than whether they can lie to you.

Why are we skipping over the miscalibrated, defective, or ill-documented devices bit though? Those also all have arbitrary error.


> are [we] beginning to attribute too little mind to humans.

I don’t think this way of thinking started with LLMs. Does Systems Based Thinking also attribute too little mind to humans?


Agreed. I think we, as humans, like to think in terms of various metaphors when it comes to how we perceive ourselves in the world ( for example, "I am not some sort of automaton/robot" when objecting to some boss way back when ).

South Korea?

For Microsoft, free isn’t free… It puts them in a position of advantage. However, I still agree this is abusing goodwill and is rather disgraceful.

I still think of The Unreasonable Effectiveness of Recurrent Neural Networks and related writings.

http://karpathy.github.io/2015/05/21/rnn-effectiveness/


Fun to revisit no doubt, the comments make it even better.

> SuckCocker 7 years ago - "in short: SKYNET is not far away. Be proud to be a part of it!"


:<


:')


>_>


\o/


> metabolic pathways that span multiple species are common to the point that trying to isolate a given species’ contribution can miss the effect entirely.

What does this mean?


So, a metabolic pathway is the set of steps by which an organism converts one molecule into another - this can be by splitting a molecule into pieces, by adding or removing an atom or small group of atoms, or by combining two different molecules into a larger or more complex one. By way of a very, very simple pathway, your body breaks down ethanol (alcohol, C2H5OH) by first removing a hydrogen (and causing the oxygen to double-bond to the carbon) to create Acetaldehyde, CH3CH=O, and then oxidizing that by swapping the H remaining on the second carbon for an OH to create Acetic Acid, the primary component in vinegar. So, when we say your body metabolizes ethanol into acetic acid, we're talking about a two step metabolic pathway.

Bacteria can stash intermediate pathway results outside of their cell wall for various reasons (sometimes the chemical environment is more amenable outside the cell than inside, sometimes buildup of the intermediates can disrupt other processes, sometimes that's just how it happens - biology is weird), and very often what you'll see is that a multi-step metabolic pathway can span across multiple different organisms - so, species 1 takes up a starting material, performs a handful of modifications, and then excrete the results outside the cell wall, and then another species will take up that substance and perform additional modifications on it, and this can run through several species before reaching the terminal state in the pathway (including the first species again). This works because each bacteria can have different enzymes and different internal chemistry which can affect how easy or likely a reaction is.

Nitrogen fixing is a notable example of this - it's not just one species in the roots of legumes responsible for taking N2 and converting it into ammonia; there are 6 or 7 that take part in that pathway.


I think author is saying that you ingest compound A, microbe 1 eats A and secretes B, microbe 2 eats B and releases C. C happens to do <positive thing>. You could imagine parallel pathways where maybe microbe 2 only works if it is in the presence of microbe 3.

Meaning everything is a mess to try and disentangle.
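The relay described above can be sketched as a toy pipeline. The species and compound names here are hypothetical placeholders, not real organisms or metabolites:

```python
# Each step: (species, substrate it takes up, product it secretes).
pathway = [
    ("microbe_1", "A", "B"),  # microbe 1 eats A, secretes B
    ("microbe_2", "B", "C"),  # microbe 2 eats B, releases C
]

def run_pathway(compound, steps):
    for species, substrate, product in steps:
        if compound == substrate:
            compound = product  # secreted, taken up by the next species
    return compound

# The full community completes the conversion A -> C...
assert run_pathway("A", pathway) == "C"
# ...but remove microbe_1 and "A" is never converted at all, which is
# why isolating a single species can miss the effect entirely.
assert run_pathway("A", pathway[1:]) == "A"
```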


Perhaps somehow related to founding protestants fleeing catholic persecution. It’s the sort of thing that will leave the world blind.


The Puritans, what we generally mean when we say 'founding protestants,' weren't fleeing persecution from Catholics.

In fact, they weren't fleeing persecution at all! They were living in the (relatively) religiously tolerant Netherlands. They left the Netherlands because they weren't succeeding in business there. They came to North America essentially as economic migrants.


I think the Pilgrims lived in the Netherlands for about 10 years, in a refugee town run by Catholics. But they were a minority on the Mayflower. I don’t think any spoke Dutch.


Posted from Maryland?


Too bad the internet inside Iran was shut down by the IRGC terrorist regime.


I know your ipv4 address, 127.0.0.1. :)

There’s something to be said for human-readable addresses. I’m a little nostalgic for how the .hack world was envisioned, where servers had address names like Hidden Forbidden Holy Ground.

If roughly 10 million words exist, then allowing any three words in order creates a space for 10^21 addresses… five words and you’re close to ipv6 address space, six words and there’s more combinations than ipv6 addresses.
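A quick sanity check of that arithmetic (the ~10 million word lexicon is the hypothetical figure above; real dictionaries are far smaller):

```python
WORDS = 10_000_000        # hypothetical lexicon size from the comment
IPV4 = 2 ** 32            # ~4.3e9 addresses
IPV6 = 2 ** 128           # ~3.4e38 addresses

three_words = WORDS ** 3  # 1e21, dwarfs the IPv4 space
five_words = WORDS ** 5   # 1e35, approaching IPv6 scale
six_words = WORDS ** 6    # 1e42, more combinations than IPv6

assert three_words == 10 ** 21 > IPV4
assert five_words < IPV6 < six_words
```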


I also know your IPv6 address, ::1

Even easier.


Scrabble's dictionary is 250-280k words in the UK edition - wouldn't want to go too much beyond that, I suspect. Where'd 10 mil come from?


Across the world's roughly 7,000 languages, including non-Latin alphabets…

