Hacker Newsnew | past | comments | ask | show | jobs | submit | bitexploder's commentslogin

They can map things like this. They are amazing translation layers. As long as it is a shape of problem or data they are trained on they can translate. The DSL they made up is shaped like some other data format they know for that latent space. It seems amazing, and it is, but it is also a core feature of how LLMs work. The problem is it works until it doesn’t. Fuzzy can only get you so far before it decoheres without rigor.

It is worse because the signal is buried in the noise.

When the noise floor is higher than the signal, it's all noise.

Vibe coded apps with barely no tests, invariants, etc. No wonder it turns into spaghetti. You can always refactor code, force agents to write small modular pieces and files. Good engineering is good engineering whether an agent or human wrote the code. Take time to force agents to refactor, explore choices. Humans must at least understand and drive architecture at this point still. Agents can help and do recon amazingly and provide suggestions.

I can’t understand this. The first thing I do with new agent driven project is set up quality checks. Linters, test frameworks, static analysis, etc… Whatever I would expect a developer to do, I would expect an agent to do. All implementation has to go through build success and mixed agent reviews before moving on. I might not do this with initial research/throwaway prototype, but once I know what direction to go and expect code to go to production it is vital to set guard rails.

Generated tests... I mean... listen to yourself.

I can generate a lot of tests amounting to assert(true). Yeah, LLM generated tests aren't quite that simplistic, but are you checking that all the tests actually make sense and test anything useful? If no, those tests are useless. If yes, I don't actually believe you.

It's the typical 10 line diff getting scrutinized to death, 1000 line diff: Instant LGTM.

Pay attention to YOUR OWN incentives.


> The first thing I do with new agent driven project is set up quality checks. Linters, test frameworks, static analysis, etc

I do this too, but then I sit and observe how agent gets very creative by going around all of these layers just to get to the finish line faster.

Say, for example, if I needlessly pass a mutable reference and the linter screams at me, I know it's either linter is wrong in this case, or I should listen to it and change the signature. If I make the lazy choice, I will be dissatisfied with myself, I might even get scolded, or even fired if I keep making lazy choices.

LLM doesn't get these feelings.

LLM will almost always go for silencing it because it prevents it from reaching the 'reward'. If you put guardrails so that LLM isn't allowed to silence anything, then you get things like 'ok, I'll just do foo.accessed = 1 to satisfy the linter'.

Same story with tests. Who decides when it's the test that should be changed/deleted or the implementation?


> Same story with tests. Who decides when it's the test that should be changed/deleted or the implementation?

Claude is remarkably good at figuring this is out. I asked it to look at a failing test in a large and messy Python codebase. It found the root cause and then asked whether the failure was either a regression or an insufficiently specified test, performed its own investigation, and found that the test harness was missing mocks that were exposed by the bug fix.

It has become amazingly good at investigating.


If you point it at a specific thing and ask a specific question, yes, it will figure it out.

But I never have "fix this test" as a task. What happens when you task it with a feature implementation and test breaks in the middle of the session? It will not behave the same way.


You have to not "stress" the agents out over testing. If a gate is no failing tests they cheat. If the gate is triage failing tests, quantify risk of failing test, prioritize in next work cycles... agents behave amazingly better at cheating tests.

I think they would not be LLMs then.

Agreed. It feels like LLMs are just a piece of the whole final solution towards AGI. I do foresee possibly seeing "LLM flavored AGI" where it does all those things, via tool calling, RAG and other techniques. The real AGI in my eyes will be more than just an LLM though.

I think the idea is to find things true to you to genuinely compliment?

The idea is to have genuine compassion without any agenda, actually. Or on a deeper level, just acknowledge people exist, and let them know that their existence is noticed.

Nothing more, nothing less.


There is no such thing as irresponsible disclosure. Thanks though.

I feel at least partially responsible. I would often instruct agents to "stop being a goblin". I really enjoyed this story too, though.

We do not have the complete picture.

Staring does something interesting. It does slowly reduce brain waves, but it is harder to hit theta with eyes open. And it works very differently initially. With eyes closed meditation where we, say, follow the breath we use the salience network to slowly chill the DMN for a bit. When you stare at a wall the salience network is what deactivates letting the DMN rip to try and figure what predictions are useful. But it runs out of steam and slowly the state converges with a traditional meditative state. With one important difference: your visual field is still active. Traditional meditation lets you hit theta brain waves. Eyes open is harder to hit theta with but you can definitely hit alpha waves.

So I agree it is meditation, but its quality and mechanism is interesting and different by a bit. It does make me wonder. When we traditionally meditate we grow the salience network (physically). Wall staring trains the brain to simply not seek attention in the first place. Wall staring doesn't strengthen the Salience Network's ability to act as a manager. It recalibrates the Salience Network's threshold for alarm. It trains the dACC to stop firing when nothing is happening.

So both are useful. And provide different neural wiring and myelination.


I'm curious what is meant by

> ...recalibrates the Salience Network's threshold for alarm.

Superficial googling reveals superficial information about the SN.

And more specifically, i'm curious what sort of physiological signals could verify recalibration.


Basic a high priority stimulus arrives. It is actually fascinating how this get sorted out, but the salience network say, gets a signal like, user stubbed toe, massive spike of cortisol along with pain signals filtering in from that part of the brain that manages them. When your brain gets bored it is effectively lowering its threshold to trigger "do something" <time wasters, phone scroll, etc.>. The brain is just wired up to constantly process predictive activity. By staring at a whole it stops registering stillness as a thing to activate this process of hunting for stimulative activity. This effectively lets your brain be more calm with no stimulation as it has learned this is not a "threat" state.

Being a modern human is hard. We were not really built for our life post industrial revolution. We evolved to always be ready for threats. The fact that our brains have adapted so well to modern life is amazing and why we have gotten to where we have as a species vs. others. However, it has costs. Our brains our wired to run the DMN loop non stop. So you can do two things. With traditional meditation you make your salience network stronger. Every time your DMN interrupts your meditation and you flex your salience network muscle so to speak you are training it to shift back to the lower DMN activity state. And with wall staring you are changing the brains calibration of what no stimulation means. Both contribute synergistically based on my understanding. (contribute to being less distractible).


I am bluer than 78%. Colors. How do they work.


    Blue his house
    With a blue little window
    And a blue Corvette
    And everything is blue for him
    And himself and everybody around
    'Cause he ain't got nobody to listen (to listen)
https://www.youtube.com/watch?v=BinWA0EenDY


It’s 01:30 in the night you cannot just drop lyrics like that, I’ll have the song stuck in my head for hours.. :(

For this, you just lost The Game.


The only effective ear bleach for that is, say, German Weimar Republic era cabaret Orchester; eg: https://www.youtube.com/watch?v=0bPh9CitHJs


You bastard.



The game mentioned


Thanks for the earworm.


I'm bluer than 98% apparently. For me, turquoise is green. I didn't realize that's not normal.

If I'm off on a detail like that, then...uh oh.


blue than 90%, same verdict with turqouise, though what I call turquoise is bluer than what is shown


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: