Yeah, a lot of "it doesn't matter how the code looks" convos seem to be ignoring that we know what happens over time when you just make tactical the-tests-still-pass changes over and over and over again. Slowly some of those tests get corrupted without anyone noticing. And you never had the ENTIRE spec (and all the edge-case but user-relied-on things) covered anyway. And then new development gets way harder.
An LLM-generated TLA+ model can be verified for certain things in a way that LLM-generated code can't. It's infamously hard to exhaustively unit-test concurrency.
Whether or not you're modeling the right things or verifying the right things, of course... that's always left as an exercise for the user. ;)
(How to prove the implementation code is guaranteed to match the spec is a trick I haven't seen generalized yet, either.)
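To make the concurrency point concrete, here's a toy Python sketch (purely illustrative, and obviously not TLA+) of the kind of lost-update race that a unit test will usually pass right over:

    import threading

    # Classic lost-update race: the read and the write on `counter` are
    # separate steps, so two threads can read the same value and one
    # increment disappears. Run this a handful of times in a test suite
    # and it will often pass anyway -- that's the whole problem.
    counter = 0

    def increment(times):
        global counter
        for _ in range(times):
            current = counter      # read
            counter = current + 1  # write; another thread may have run in between

    threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Expected 200000; intermittently less. A model checker like TLC can
    # explore the interleavings exhaustively; a test run only samples a few.
    print(counter)

A TLA+ model of the same counter would let TLC enumerate every interleaving and hand you the exact trace that loses an update, which no finite number of test runs can promise.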
First counterexample that comes to mind: Rails vs. 90s networked/shared line-of-business CRUD app development was a 10x factor. It also enabled a lot of internal tools that wouldn't have been worth building without it.
But after people's expectations adjusted it was just back on the treadmill.
I don't think we've found a new steady-state yet, but I have some gut feeling guesses about where it's going to be.
90% of my experience has always been dealing with large-ish corporate systems.
I am in Europe, so YMMV even when talking about corporate instead of smaller scale projects.
In my experience stuff like Rails had negligible impact in my field because companies would always require solid backing from some big-name vendor (MS, Oracle, IBM, Sun back in the day, or even SAP).
So most if not all of the smaller silver bullets did not even make a blip on the radar... and stuff like Java or .NET, while definitely better than C or COBOL, did not really deliver in terms of productivity boost (in part because, as noted in the message I'm replying to, expectations kept growing at the same pace).
>90% of my experience has always been dealing with large-ish corporate systems. I am in Europe, so YMMV even when talking about corporate instead of smaller scale projects.
>In my experience stuff like Rails had negligible impact in my field because companies would always require solid backing from some big-name vendor (MS, Oracle, IBM, Sun back in the day, or even SAP).
>So most if not all of the smaller silver bullets did not even make a blip on the radar... and stuff like Java or .NET, while definitely better than C or COBOL, did not really deliver in terms of productivity boost (in part because, as noted in the message I'm replying to, expectations kept growing at the same pace).
I've always been in small-to-midsized US corporate, where Oracle etc. were generally "no way are we gonna spend that much"... but if someone could hack a decent thing together and run it on a spare server, that got traction 10-20 years ago.
I'm curious if those large corporations are more homegrown-code-by-AI-friendly than they would've been towards a homegrown Rails app? A lot of the same potential problems exist.
In particular if that steady state requires a 4-to-40GB blob of binary code to be installed, or an internet connection to an AI SaaS provider and a credit card.
I remember when coding was free as in beer and freedom!
It's okay. In the near future people will not own a computer of significant capability at all, so it's not like they'd even be able to develop without a cloud connection in the first place.
And, for the most part, they will be okay with this. Gen Alpha or Beta will tell each other how morally wrong it is to expect for everyone to own a computer or even a smartphone, citing the environmental impacts, slaves mining coltan, and the toxic emotional effects upon people who grew up in the Social Media Dark Ages. Much like present-day Hackernews feels about personal automobile ownership.
Ah, Fails. "Before we made x improvement, the app had to restart 400x a day, now it's only 10x!"
For all the complaining we do about "enshittification", we (Hackernews, the broader industry, whatever), are perfectly willing to pay the price of stability and performance to get a little development speed. That's one prong of how enshittification happens. "I can make compromises in the quality of my product because time to market is the one thing, the only thing that matters in this move-fast-break-things economy—and pass my savings on to the customer (in the form of hidden costs)!"
An experiment I'd love to do, but which isn't actually possible anymore, is run GPT 3.5 or the original 4 API release through a modern "agentic" harness for a task like this.
I think 3.5 would probably need more frequent intervention than a lot of harnesses give. But I bet 4 could do a simple JSON API one-shot with the right harness. Just back then I had to manually be the harness.
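For anyone who hasn't written one: the harness part really is just a loop. A rough Python sketch, where llm_complete is a hypothetical stand-in for whatever chat-completion API you'd call, and the JSON command convention is made up for illustration:

    import json
    import subprocess

    def llm_complete(messages):
        # Hypothetical stand-in for your chat-completion API of choice.
        raise NotImplementedError

    def run_agent(task, max_steps=20):
        # Minimal "be the harness" loop: give the model the task, run the
        # command it proposes, feed the output back, repeat until it stops.
        messages = [
            {"role": "system", "content": (
                'You are a coding agent. Reply with JSON: {"cmd": "<shell command>"} '
                'to act, or {"done": true} when finished.'
            )},
            {"role": "user", "content": task},
        ]
        for _ in range(max_steps):
            reply = json.loads(llm_complete(messages))
            if reply.get("done"):
                return
            result = subprocess.run(reply["cmd"], shell=True, capture_output=True,
                                    text=True, timeout=120)
            messages.append({"role": "assistant", "content": json.dumps(reply)})
            messages.append({"role": "user", "content": result.stdout + result.stderr})

Back then, "manually being the harness" meant reading each proposed command and pasting the output back by hand; modern harnesses mostly automate that loop and add tool schemas on top.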
The hardest thing about software engineering has always been that your intent often has to be decided on the fly once you get into complicated edge cases, weird-or-legacy-business requirements, or things that the spec literally has no answers for.
Letting the tool figure out your assumed intent on those things is a double-edged sword. It's better than you never even thinking of them, but it can also mean subtly broken contracts that test coverage missed (since nobody has full combinatoric coverage, or the patience to run it), or just further steps into a messy codebase that will cost ever more tokens to change safely.
Every day I do something where the LLM writes it ten times faster than I would, with twice the test coverage.
And every day I do something else where the LLM output is off enough that I end up spending the same amount of time on it as if I'd done it by hand. It wrote a nice race condition bug in a race I was trying to fix today, but it was pretty easy for me to spot at least.
And once a week or so I ask for something really ambitious that would save days or even weeks, but 90% of the time it's half-baked or goes in weird directions early and would leave the codebase a mess in a way that would make future changes trickier. These generally suggest that I don't understand the problem well enough yet.
But the interesting things are:
1) many of the things it saves 90% of the time on are saving 5+ hours
2) many of the things I have to rework only cost me 2+ hours
3) even the things that I throw away get me to the 'oh, we don't understand this problem well enough to make the right decisions here yet' conclusion way faster than just starting in on the project without assistance would
This. There is definitely a ratio. A year ago, it was 50/50. It still felt better because, in my mind, the hard things it did fast while I sipped coffee outweighed the negatives.
Now that ratio is swinging way over in the LLMs' favor.
Having the humans document the code seems backward (maybe that's not what they're doing, but "make everything ready for AI" sounds manual). And hopefully there aren't that many scary surprises that humans need to manually document.
One of the best parts of LLMs is that you can use them to bootstrap your documentation, or scan for outdated things, etc, far more quickly than ever before.
Don't just throw a mountain at it and ask it to get it right, but use a targeted process to identify inconsistencies, duplicates, etc, and then resolve those.
And then you have better onboarding material for the next human OR LLM...
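To sketch what a "targeted process" could look like (ask_llm is a hypothetical helper here, and the doc/source naming convention is invented for illustration): walk the docs one file at a time, pair each with the code it claims to describe, and ask for specific discrepancies rather than a rewrite:

    from pathlib import Path

    def ask_llm(prompt):
        # Hypothetical helper wrapping whatever LLM API you use.
        raise NotImplementedError

    def audit_docs(doc_dir, src_dir):
        # One doc file per request, paired with the source it documents,
        # asking only for concrete inconsistencies -- not a rewrite.
        for doc in sorted(Path(doc_dir).glob("*.md")):
            src = Path(src_dir) / (doc.stem + ".py")  # invented naming convention
            if not src.exists():
                print(f"{doc.name}: no matching source file; possibly stale")
                continue
            report = ask_llm(
                "List statements in this doc that contradict the code, plus "
                "public names in the code the doc never mentions. Quote both.\n\n"
                f"DOC:\n{doc.read_text()}\n\nCODE:\n{src.read_text()}"
            )
            print(f"{doc.name}:\n{report}\n")

The point of the narrow question is that you get a reviewable diff-sized report per file, instead of a wholesale regeneration you then have to re-verify.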
Oddly enough, asking an AI to add docs to a classfile explaining "what it does, why it needs to exist, and what uses it" is a great way to include some of the "why". I know it's not ALL the why, but it does a pretty good job of finding the reasons that someone new to the code wouldn't be aware of.
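For instance, asking for that three-part header on a class might yield something shaped like this (all names invented purely to illustrate the shape):

    class RetryPolicy:
        """What it does: decides whether and when a failed call is retried.

        Why it exists: centralizes backoff rules so each client doesn't
        reimplement (and subtly disagree on) retry behavior.

        What uses it: the HTTP client wrappers and the job-queue workers.
        """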
Somewhat by definition, AI-generated docs would only include information that could be obtained elsewhere in the codebase. That can be valuable, but far more valuable is the information you got from debugging all the failed alternative designs that were never committed, etc. Information and context that goes beyond the actual code being read.
My experience is that most people fail to capture this ultra-valuable documentation, but AI never can.
> Having the humans document the code seems backward (maybe that's not what they're doing, but "make everything ready for AI" sounds manual).
No, that's forward. Any documentation an AI can make, another AI can regenerate. If an LLM didn't write the code, it shouldn't document it either. You don't want to bake in slop to throw off the next LLM (or person).
> Has canned fruit actually lost popularity? Or did the grocery stores decide that the shelf space had a higher profit margin pushing something else?
> The last couple of times I tried to get canned fruit for a recipe I had to actively hunt for the particular cans of fruit I needed (I needed to hit 3 different grocery stores).
> I haven't tracked peaches recently, but I can tell you that canned apricots have been a bit thin on the ground for at least a couple of years.
Grocery stores with canned fruit being harder to find is entirely consistent with it being less popular. Pushing you to go to another store for something is bad if you're a grocery store; that's a great way to drive off customers. There's a lot of shelf space at my local grocery stores still dedicated to fairly redundant products or high numbers of extra copies of items, so I don't think canned fruit is being pushed out because something else is way more profitable. (My local stores have much larger selections of canned beans than canned peaches, for instance.)
I think it's just generational trends. Generally health-conscious consumers these days are more skeptical of canned vs fresh, and non-health-conscious have more junk food options than ever. It's also gotten easier to source fresh fruit across seasons than thirty or forty years ago, further squeezing canned options.
> It's really weird, I'm seeing across the board that people who never believed in them before are suddenly all into good software eng practices (starting with writing a spec) because of AI.
> It's kind of fascinating that we never were willing to do these things for humans but now that AI needs it ... we are all in. A bit depressing in the sense that I think mostly the reason we're happy to do it for AI is that we perceive it will benefit us personally rather than some abstract future human.
I don't think that's the reason.
I think it's because they take time, and few people were willing to put in time for "maybe it'll make writing the actual code faster" gains when the code was going to take a few times longer to write itself.
You also can get faster feedback to iterate on your spec now, which improves the probability of it helping future-you.
So combine that with the fact that LLMs are more likely to get lost if you don't spec stuff in advance, and the value of up-front work is higher (whereas a human is more likely to land on the right track, just more slowly than otherwise, making the value harder to quantify).
The problem with up front spec writing was that it created this long, very high risk period where there was no code, no prototype, no hard limits.
So you're arguing back and forth about exactly which fields should be collected where and meanwhile the world is changing around you. The person who insisted on an absolutely minimal signup page has changed jobs, the new hire prioritizes complete data even if it means more signup bounces.
Weeks and months are going by and there's nothing to click on and still nothing to tether the project to reality or limit the scope of debate. High functioning teams can avoid this, but they don't always control who inserts themselves into the conversation.
And of course as soon as coding starts, the spec is out of date, unless you absolutely nailed it, which I've never seen happen in 15 years. Once a user sees the software and gives feedback, it probably needs to be rewritten. Maintaining the spec (pre-LLM) was not that much less work than maintaining the software itself. All this time the world around the software is changing and the spec needs to be updated to reflect that.
And while the spec helps in understanding the system, ultimately you still need to understand both the spec and the code.
But now the time between spec and working code is greatly reduced and spec updating can be automated to an extent. The cost is greatly reduced and the benefits have increased, so people like specs now.
What I hated is when you come into an old project. The spec says one thing and the code contradicts it. WTF do you do? I felt more like an archeologist than a programmer.
Yeah, I think a lot of pushback against best practices is basic cost/benefit; I like writing documentation, but I'm also often a bit depressed that nobody will actually read it in as much detail as I wrote it. But LLMs do, or at least can.
Actually there's a lot of projection there too; I don't read documentation in detail. And nowadays, I point an LLM at documentation so that it can find the details I would otherwise skip over.
The destruction of the millennial attention span is real, and it's worse in the younger generations, lmao.
Well it's also just that you have a list of 20 features to add, and if it works, you want to ship it, and someone might even get mad if you spend a day dawdling on best practices and documentation and so on. Corporate cultures generally don't have the same long term thinking about reusability and legibility and fault-tolerance that an individual coder may have about the code they want to write once and forget. (Neither do LLMs, for that matter).