AI-assisted coding will change software engineering: hard truths

newsletter.pragmaticengineer.com

91 points by pseudolus 6 months ago

> It's becoming a pattern: Teams use AI to rapidly build impressive demos. The happy path works beautifully. Investors and social networks are wowed. But when real users start clicking around? That's when things fall apart.

Oh no. We will have to endure another sort of AI slop infesting the web? It's bad enough as it is. Most smaller web sites are already broken in tiny ways. Who hasn't had to break out the browser debugger just get past some web sites broken order page?

Sloppy reviews, images, chat bots, and phishing are everywhere now. In this brave new world someone no computer experience tinkering a home with can produced a beautiful looking web site that's broken it 1000's of ways, we are going to be overrun with this crap. And they are going to be harvesting login email addresses and passwords.

It's going to be a rough decade.

raxxor 6 months ago

> Last year, we saw that about 75% of developers use some kind of AI tool for software engineering
I heavily doubt that if I look left and right. They are prompting them in a browser, but few have adopted AI in their toolchain yet.
> “Software engineers are getting closer to finding out if AI really can make them jobless”
It might indeed partially work if people are able to describe a problem and prompt an AI. Do they know how to describe a problem and prompt an AI, rhetorical question?
In practice is does indeed allow other engineers not in software to churn out some scripts and they mostly are of good or at least decent quality. But for complex work or integration, the AIs just lack any context that is needed for the specific problem to be solved. Still, AIs do help the developers themselves.

godelski 6 months ago

My personal belief is just that AI assistants feel faster, not that they actually are. I'm sure they are in some specific circumstances but on average. They feel faster because your work is different, you put in different energy. I don't want to say easier (probably is) because just context switching has similar effects. People are really not reliable self evaluators. It's always the top comment on any post about some psychological study but never for these AI things. Truth is it's hard to find objective measurements. Lines of code, commits, things shipped, etc aren't strong metrics.

But I think the opening of the article is important, it asks why products aren't getting better [0]. We all feel this, right? There's so much low hanging fruit that could make our lives less frustrating but is never done because it isn't flashy. Like Apple, you got all that AI but you can't use a regex to merge calendars? Google, you can't allow a test task to be made in the past (extra helpful when it repeats). Wikipedia still uses the m address and you go to the mobile site from desktop if you don't manually remove? I could go on and on but I think we're just in the wrong headspace.

[0] imo products are getting worse, but that decline started before GPT

Zacharias030 6 months ago

I‘m not sure I believe that when just yesterday, I ran a bunch of data analysis, simulation, and visualization based on a single csv and produced 5-10 decent matplotlib plots in a 90 minute back and forth between OpenAI canvas, vscode, python and jupyter. I didn’t believe some of my results and then discovered some problems of the dataset itself, so there was some „real work“ done in those 90mins.
I can say with certainty that I wouldn’t know how to wield matplotlib and pandas with such fluency in an hour, even though I am perfectly able to read the implementation and query for some relevant intermediate results to check my mental model.
Granted this is not the world’s most complex problem, but that is a good example of the domains where these tools are incredibly useful and productive already (I didn’t have to consult the docs even once). So in a way I think of LLMs as very good interfaces to the docs :)
I often feel that the UI aspect of new technologies is underappreciated. All our computers (even grep) are turing complete. This means software engineering is fundamentally a discipline of building better user interfaces that allow us to do whatever we want more easily.
I am always curious how other people experience these things as so useless :)
- specproc 6 months ago
  
  Most of my work is data analysis, and it's the area I find LLMs are most difficult to use, and potentially dangerous!
  I really worry when I think of how less experienced colleagues and Excel power-users might approach a pandas/matplotlib workflow with an LLM.
  Very normal example from the other week. I've got a big csv of housing data, poking around I spot a bunch of very weird things in the prices: cheap places where I'd expect high prices, a strange bicameral distribution. I spend a few hours thinking about it before taking it up with the colleague who'd provided the data. There's a bug in the scraper that's messed up a bunch of values!
  It'd be super-easy to naively run a simple LLM-driven analysis and miss this sort of thing.
  When you're "building things" with AI, feedback is immediate. You press a button, it doesn't work as expected, the charts you can quickly run up don't give the same signals.
  
  godelski 6 months ago
  
  I worry about the same thing. Even with traditional tools people mess it up all the time. It's in the same way too, treating your algorithms like black boxes. To really do data analysis you have to understand your data and understand what you're algorithms are doing/what they mean. Too often people just naively plug in numbers and take a result. Unfortunately it's not that easy. I work "AI" just accelerates this problem and brings it to more domains
- godelski 6 months ago
  
  It's hard to judge without knowing. Yes, when something is rather routine (i.e. there are many examples, especially in similar formats) LLMs will almost always write this code with little to no issues. But outside that things get far harder to evaluate, especially without seeing it.
  > I often feel that the UI aspect of new technologies is underappreciated.
  I do too. But I should also mention that the human language interface is not a great UI. I mean how often do you miscommunication with your boss?
  > other people experience these things as so useless
  Who said useless?
- in-pursuit 6 months ago
  
  The one use case I’ve found LLMs excel at is using a new library. Even if it gets a lot of the API wrong, figuring out most of the setup / boilerplate is useful.
MrMcCall 6 months ago

> imo products are getting worse, but that decline started before GPT
Yes, and it's not going to change until the motivations and methodologies that drive the corporation change. All these LLMs are just reinforcing their MBA mentality, which is to try to do more with less, which just ends up being more enshittification.
The money people only care about one thing, and one thing only. I say that the only real qualification for being a manager is to be willing to prioritize money over people. I've had exactly one good manager in my career, because he knew his role and respected that I knew and was passionate about mine while respecting his.
A main problem with all these huge software companies is that they are too busy chasing new features while letting long-existing bugs remain the bane of the users' experience. There's no "lets fix the current issues before we embark on new projects" mentality. I literally encountered this exact mentality in a dumb little marketing company 25yo. Different time, very small company, same dumbasses running the show.
When large software tries to be adapted beyond its initial design specs, it will eventually collapse under its own weight. At some point, a fundamental redesign must occur to prevent change grinding to a halt.
- godelski 6 months ago
  
  > There's no "lets fix the current issues before we embark on new projects" mentality.
  Yeah I think this ends up being just monopoly behavior. Like a few times a year I drive through The Bay and every single time am left wondering "How is Google Maps so bad here?" Things like it'll tell me the exit number but numbers are small and names are big (give BOTH. Different strokes for different folks?) or how it'll tell me to get into one lane and then want me to make an impossible turn from that lane. There's a freeway interchange where for years it has told me to use either lane and the exit is literally two lanes with each one going a different direction...
  It comes off as feeling like no one is dogfooding their software. Which to me says something really really bad. Either no one internally is using the software they are building (so don't believe in it), no one is paying attention, and/or internal reports are flat out ignored. This should kill any business that doesn't have a monopoly.
  It is crazy to me that they push new features with the justification of better user experience while there's a lot of stuff that can be done to make user experience better for cheaper, faster, and is far more impactful.
  Sure, there's cool things like when I jump on WiFi my friend's iPhone asks them if they want to share the password with me, but are we really so fucking dumb that we think we know what is happening on each other's screens simultaneously as ours? And that "whoops, I backed out" or "whoops, I clicked off the dialogue" and then you can't repeat the process so I got to manually type in the damned thing anyways? Or just today, I was listening to music with my partner, turned on airplane mode, it turned off bluetooth, I turned it back on, and then I had to click "share" again and she had to... again pair her headphones. Even though she's in my family (they still haven't figured out that I don't care that her airpods are following me and that my airpods in my pocket are not in fact lost). There's so much of this fucking shit that every day normal people (aka my tech illiterate parents and family) complain about (and me!). I think a problem is that we also dismiss "complainers" on software teams. You need at least one grumpy fucker. The one that's not saying things are impossible, but the ones that are saying things are broken AND trying to fix them. We have too many people that are just putting their fingers in their ears and pretending problems don't exist.
  And don't get me started on ML (being a ML researcher myself). I don't understand how this community can confidently claim to control "AGI" (or even Stupid AI (SAI)) when details or issues are not just ignored, but we gaslight people who bring them up. Our fucking job is to recognize limits so we can fucking fix them...
  
  MrMcCall 6 months ago
  
  > You need at least one grumpy fucker.
  "Now we're two." --Batman
  
  godelski 6 months ago
  
  Not just grumpy, but grumpy who wants to fix things. Criticism >> complaints.
  Just want to make sure we're on the same page ;)

aorona 6 months ago

I have been using LLMs (chapt-gpt, perplexity, claude) for development for over a year. It is helpful for summary explanations of concepts and boilerplate for frameworks and library APIs. But it makes errors within those consistently.

Its a great tool and saves a great deal of time, but I have yet to go beyond generating snippets I have to vet, typically finding a made up library API call or misunderstanding of my natural language prompt.

I find it hard to pare down these LLM evangelizing articles into take aways that improve my day to day.

nurettin 6 months ago

I know it is in the nature of probabilistic neural network outputs, but it almost feels like these commercial models are built to make those mistakes (making up functions/parameters) and it is all a big conspiracy to hide the real useful stuff from general public.
I started giving the models api docs and headers before interacting with them and it seems to work a lot better.

gazchop 6 months ago

Different take: AI writes a lot of mediocre code for us so we don't have to and we're impressed at that.

But that's not the problem we need to solve. All our programming languages are verbose and stupid. It takes too much effort to solve the problems we do in them.

smokel 6 months ago

I think it's even worse.
We are solving a lot of mediocre problems and customers are impressed at that. But that's not the problem we need to solve. There are wars going on, and the climate is changing in undesirable ways, and we have no clue how to organize people, or how we could put modern technology to good use.
- Spivak 6 months ago
  
  I suppose but I also have to eat and the skill set I have is technomancy.
  Division of labor is fine, I think it's a fairly unique attitude in tech that we have do something (tm) with our skills. No one asks the accountant to get out there and fix climate change.
- ravenstine 6 months ago
  
  > We are solving a lot of mediocre problems and customers are impressed at that.
  And even that's not true much of the time.
  We are solving a lot of mediocre problems that product owners and shareholders are impressed at.
- MrMcCall 6 months ago
  
  > we have no clue how to organize people, or how we could put modern technology to good use
  But I do, friend, I really do.
  It all starts with compassion; profit-motive being prioritized over compassion is the source of all this world's problems, from strife to pollution to fascism to oppression of minorities or other ethnic/religious groups.
  Both political parties here in America are on the take from various moneyed groups, though the Dems have less disdain for the poor, to be certain. Their prioritization of those interests damages how our government behaves, how it should serve the people of both our country and the world at large.
  And look at the fools who are about to run America now, as well as the majority of the commentariat around here, burning up the Earth for their "coins" and "near-worthless" LLMs, worshipping dangerous fools like Musk.
  If you want to be a part of the solution, connect yourself with our Creator and learn how to become more compassionate, for EVERY SINGLE ONE of our problems is solely and completely due to a lack of compassion.
  So, tech is great to help people but can be (and is being) used to oppress others (e.g. rent collusion), amplify mis- and disinformation, cheat consumers, and so on and so on.
  Steel yourself in compassion for all innocent human beings, i.e. the ones not oppressing others and destroying the Earth.
  
  bulatb 6 months ago
  
  "If everybody just" is a description of a problem, not a solution. The problem is that everybody will not "just," no matter what could happen if they did.
  
  MrMcCall 6 months ago
  
  An alcoholic must stop drinking, or they will die from it.
  The solution is to stop drinking.
  How to succeed in making that solution happen is a separate problem, but the first step in solving any and every problem is identifying its root causes.
  For our world, not understanding that compassion must be the fundamental factor under consideration is the primary problem. Once that essential fact is understood, we can then haggle out the solution, but not before then.
  With love, all things are possible, if we so choose.
  
  smokel 6 months ago
  
  Hm, I sense a very positive attitude, so I'm eager to hear more.
  I do have some doubts though -- concepts such as compassion and love have been around since, and even before, Christ and Buddha. I don't necessarily see these concepts as being a robust solution though. Bad things keep popping up again and again, and not everyone is equally well off.
- hackable_sand 6 months ago
  
  Those are all solved problems. Being scared and acting a fool is not a solution.
- namaria 6 months ago
  
  Shuffling numbers in networked personal computer devices has diminishing returns and we're hitting them. LLMs are a hail mary, throwing compute at the wall to see what sticks. No, Silicon Valley, the world doesn't need more compute. It has ceased being a limiting factor several years ago, and the consumer experience has been worsening for quite a while. Just for an example, consuming media on a smart tv using a streaming app is a vastly inferior experience to just using bluray discs.
davidclark 6 months ago

Have you written a computer programming language? Calling every single one “verbose and stupid” seems like a Chesterton’s Fence issue.
- gazchop 6 months ago
  
  Yes I spent a good decade writing domain specific languages and some time maintaining a commercial compiler and code generator for complex state machines.
  We can do a lot better. More time solving problems, less time satisfying what is effectively a trite extrapolation of theoretical languages from the 60s
drowsspa 6 months ago

Maybe it's informed by non-existing formal training, but even before AIs I kept feeling we're stuck in a local valley when it comes to programming languages.
- skydhash 6 months ago
  
  I wouldn't say so. Once you learn a bit more about programming languages, you realize the problem is mostly about communication, not the tools itself. No matter how good the language is, if the writer and the reader is not fluent in it, you'd have a hard time communicating ideas. My very subjective benchmark is that you ought to be capable to express the overview of your implementation in plain English before even going to code (except if you're prototyping). I've met people that are only gluing snippets together and have never given a thought about the overall design.
  
  agentultra 6 months ago
  
  Spoken language is often insufficiently precise for describing complex systems.
  I get tired of folks drawing boxes and arrows and waving their hands about their designs. I ask for proof that it works they way they say it does and I get nothing.
  People seem to like GenAI tools because they can avoid doing the work and get the results. It's like having a genie or a monkey's paw... with similar consequences.
  If you don't understand the concurrency problems before you're not going to understand them by staring hard at the code that a chatbot generated.
  
  cle 6 months ago
  
  We're just arguing about abstractions. Do people need to always understand "under the hood" of abstractions? Obviously not, most of us don't know how ADC works or how CPU instructions are pipelined or how light pulses end up as 1's and 0's in a buffer somewhere, but requests.get(...) is a genie that we can still use.
  
  namaria 6 months ago
  
  Abstractions are not a dichotomy between "forget everything bellow the level in which you wish to operate" and "I need to be able to follow the path of each electron". Which abstractions are used and how you traverse them are relevant discussions that a lot of professionals seek to avoid by using framework, jargon and box diagrams.
  
  cle 6 months ago
  
  We are agreeing, I'm arguing against that dichotomy from the other direction—you don’t always need to know everything under-the-hood. You also can’t always ignore everything under-the-hood.
  Sometimes you can get the job done by using a tool like a genie, and that’s awesome (it’s the goal of abstractions IMO). Sometimes you can’t.
  You don't always need exhaustive "proof that it works" other than running the thing and looking at the output.
  If you can get the job done with an LLM without understanding the code it wrote, then that's awesome. I've seen it multiple times with non-technical people who just want to do some small thing with a Python script. They solve their problem, send their report/email, and move on.
  
  skydhash 6 months ago
  
  > If you can get the job done with an LLM without understanding the code it wrote, then that's awesome.
  I don’t think anyone would find fault with this argument. People who are wary of LLMs, including myself, are in fact wary of lazyness. Getting the (current?) job done is only a small part of building a system. The fact is you have to maintain it or extract the embodied knowledge later, and that’s where no thought have not been given.
  In the Tidy First? book by Kent Beck, he explains that the value of software is both in its current behavior and the future possibilities that the structure provides. If there’s no future to worry about, LLMs may be a valuable tool.
  
  namaria 6 months ago
  
  Let's not conflate running one off scripts with software development. If LLMs can help people do something with Python that would be just a lot of boring clicking around or whatnot, great.
  The conversation in this thread was about how LLMs will change software engineering as a profession tho...
WillAdams 6 months ago

Isn't that why a frequent approach to a complex problem is to create a custom language/Domain Specific Language (DSL) for that problem space?
Or where possible, the selection of an extant DSL?
godelski 6 months ago

I agree with one part but to call coding languages verbose is odd. If it's too verbose, pick another language. You can always write asm.
asdff 6 months ago

You’d think ai models would be good enough to write code in pure binary. Python is for humans. Take the human out there’s no point in it.
- raincole 6 months ago
  
  It's possible that one day they'll be. But to me it's very obvious why LLMs are better at writing human-readable code than binary code.
  The hype around LLMs come from the fact that you can command them in natural language. To make it possible, LLMs have to be trained mostly on natural language corpus. And human-readable code is closer to natural language than binary code.
- probably_wrong 6 months ago
  
  With all due respect, I think that's a terrible idea for a couple reasons.
  First, a model that produces binary code would be impossible to vet. This is already an issue with "trusting trust" in compilers [1], and would be made worse in the case of companies that constantly tweak their black-box models however they like. At least my compiler can work offline.
  But then there's the idea of programming as "turn data into data" that forgets that software has to interface with humans too. If I ask my LLM to give me sentencing guidelines that are "fair" given a set of parameters, no one on Earth should trust a binary they cannot check themselves. In fact, those affected may even have a constitutional right to source code.
  [1] https://dl.acm.org/doi/10.1145/358198.358210
  
  asdff 6 months ago
  
  Imo you wouldn’t even have to check the binary. Only that the output is what you expect and the performance is reasonable. No different than say agriculture where we measure success of a new varietal on output and resource use and have a very dim understanding of the inner workings that might have led to those observables.
  Or to take it further, you hire a consultant to produce some work in a given amount of time. You accept the work over time, you don’t feel the need to ask the consultant to turn over all their notes or to plug their brain into a mind reader machine to vet the source. They could indeed be lying and take whatever they learned about your company elsewhere, but we have no protection over that possibility lacking a mind reading instrument today. And society also doesn’t collapse today given that risk.
- gazchop 6 months ago
  
  You think something which gets 70% of stuff right on a very good day when the planets are all aligned can write assembly that works? I have news for you.
  
  asdff 6 months ago
  
  I think it sure can. Randomly modify a binary function over a million potential permutations and measure the distribution of output, subset from the distribution the binary function(s) that output ahead of your threshold. Should give you a fast function that does what you want with absolutely indescribable code, which isn’t an issue if it does what you want with reasonable performance.
- koe123 6 months ago
  
  I suppose the issue is a data problem, there being relatively little high quality data explaining how things should be solved in binary. As such making the learning mapping between prompt (english) and good solution (binary) difficult.
  
  asdff 6 months ago
  
  You don’t even need a training set imo just enough permutations to get a distribution of putative functions and outputs.
  
  ninetyninenine 6 months ago
  
  The data is easily generated by compiling the code.
  
  0xCMP 6 months ago
  
  But compiled code loses a lot of the "extra" data. Also these are "language" models so I would be surprised if training on binaries was much more efficient versus writing in some kind of language.
  Besides, how do you even check the result now without running untrusted code? Every run of the model you need to reverse-engineer the binary?

ilrwbwrkhv 6 months ago

I heavily started using windscribe and cursor editor. Mind you, I have 10 years of experience.

But lately I have cut down their usage and gone back to writing stuff by hand.

Because it is so easy to apply the changes the AI suggests but there's this subtle shift over time in the code base towards a non-optimal architecture.

And it becomes really hard to get out of.

The place where I still use AI quite a lot is autocomplete but of the intelligent kind like if I am returning a different string for each enum, all of that gets autocompleted really fast.

So line completion models like what JetBrains provides for free is I think the right balance. Supermaven also works well.

8s2ngy 6 months ago

I am in the same boat with you. I have also come to dislike having AI code generator inside my editor. Recently, I started looking into learning Rust seriously and decided to go with RustRover (the rust IDE from JetBrains) and I find that single-line AI autocomplete complements existing IDE features nicely. It works really well with the workflow I have come to prefer: describe the structure of the data using enums and structs, and then handle them using pattern matching and write unit tests for them.

ilaksh 6 months ago

The problem with this is that it's supposedly about predicting the future but actually bases everything on LLM's current capabilities.

It's incredible that people still haven't figured out or won't accept or plan for technology to continue to improve. Especially given how obvious the rapid improvement in this area has been.

Having said that the article seems to accurately reflect what it's like using the current tools.

But how could anyone reasonably expect the situation to be similar 3-5 years down the line?

If they just didn't frame it as a prediction then it would make sense.

intended 6 months ago

To be fair - the gap between present capabilities and future capabilities is the issue.
Having a system that is accurate a random number of times, is very different than a system that is predictable. There’s just too many use cases where GenAI looks like a great fit, but then you have to have the mental overhead of figuring out if this is the time you lose the lottery.
It’s just… the kind of mental overhead I wanted a machine to take away from me, not add to my life.
PaulDavisThe1st 6 months ago

> It's incredible that people still haven't figured out or won't accept or plan for technology to continue to improve. Especially given how obvious the rapid improvement in this area has been.
If you wrote this about aeroplanes in the mid-1960s, despite the previous 40-50 years of rapid improvement, you'd be wrong.
The same would be true for bicycles, and automobiles - despite decades of incremental and noteworthy-for-specific-context improvements, there's been no fundamental changes in these (and many other technologies) that reflects the early progress. Yes, modern cars are safer, more comfortable, more fuel efficient (sometimes), but they get nothing done that wasn't possible with a car from the mid-1950s.
Why would you assume that the recent past of LLMs provides an outline of what the future of AI in general (not just LLMs) is going to be?
- dwaltrip 6 months ago
  
  The question is, are current LLMs more like wooden biplanes or early passenger jets?
  It feels like we are closet to the wooden biplane era, imo.
  
  PaulDavisThe1st 6 months ago
  
  Or are they more like a video of flying that we're watching and saying with glee "we're flying!" ?
swiftcoder 6 months ago

> But how could anyone reasonably expect the situation to be similar 3-5 years down the line?
That just depends how optimistic you are about rate of improvement. Folks who think we are on the cusp of AGI predict that improvement will be exponential. Folks who think that Moores law will keep up with training costs predict improvement that is somewhat linear.
On the other hand, folks who are looking at the rate of change of current LLMs think we're running out of training data, predict that improvements are more likely to be logarithmic, and we're already in the flattening section of the curve...
- godelski 6 months ago
  
  I believe growth is exponential. The growth of complexity. The author mentions how it can get you 70% of the way there, well Pareto is a bitch. Once the details matter more the difficulty grows evidentially (well power series).
  What people fail to realize is despite what we've done is incredibly and no easy task, it's "the easy part". And progress is unfortunately constantly like this. Because as we progress we are effectively solving higher order solutions. Solve your Taylor series problems. If you've done up to 4th order and still need 5th, everything previously is (usually) "easy" in comparison. Getting to the 6th is no different, and so on. And of course it gets harder, we made progress! The naive part is to think the future will be as easy as the past looks in retrospect, not as hard as the current state looked a priori.
  
  intended 6 months ago
  
  “Pareto is a bitch”, is great. Basically a short summary of the problems with production GenAI.
- PaulDavisThe1st 6 months ago
  
  > Folks who think we are on the cusp of AGI predict that improvement will be exponential.
  They think more than that. They think that the mechanisms embodied in LLMs only require exponential improvements to reach AGI.
  Lots of very smart people don't agree with that at all.
  
  MrMcCall 6 months ago
  
  I'd say no really "very smart people" agree with that idiocy in the first place. Most people just believe what they want to believe, regardless of the truth, and what they want to believe is usually primarily due to what they figure they'll get out of those signals they send.
  People's confidence is no indicator of their intelligence. See Dunning-Kruger's landmark study for how human nature in modern engineering corps manifests itself in two diametric ways, depending on one's humility and actual hard work.
  A major problem is that when someone has the expertise to actually know the truth, the fools of the world deride them mercilessly. See Eugene Parker for a perfect example.
  Most people are simply far too stupid to know how stupid they are, and trying to tell them how stupid they are makes them really, really angry.
  
  PaulDavisThe1st 6 months ago
  
  > Dunning-Kruger
  There are some solid, though not uncontested, arguments that the D-K study itself is fatally flawed.
  
  MrMcCall 6 months ago
  
  But the conclusions are bang-on correct vis a vis human achievement.
  We can choose to be humble craftspeople that know we can always improve some more, especially as a software engineer; such humility naturally leads to undervaluing our expertise.
  And then there are the people who are in the job for the money or social perks and that's all they ever really wanted, so, because they already have the status/money they desired and that's "good enough" for them, they overvalue their expertise.
  I'm not a study designer but as a student of human nature, the results they reported are just absolutely correct. Humility and hard work are both causitive and correlated with excellence, whether the person is a physicist, mathematician, engineer, musician, programmer, or just a regular old human being.
  I mean, look at the lying, know-nothing, big-talking turd-sandwich we just elected President, my friend. And then look at the people who elected him, and know those fools are just a bunch of rubes who think they're "real smart".
  It is said that it is far more difficult to convince a person that they've been fooled than it is to fool them in the first place. The low-expertise folks think they're fooling people with their confident self-assurance.
  It is possible to achieve the level where one knows that one knows, and the idiots will attack such a person out of their lack of humility. Eugene Parker dealt with that, as do I, my friend, but I know that I know. And my advantage is that I am always ready to learn more. Always.
- ilaksh 6 months ago
  
  There are many curves. It's a series of sigmoids. There are many innovations to keep up with the expected constant performance increases and demand.
  They started running out of data, then curated it, then spent more time on inference. There are many small and large innovations that continue to improve performance as each improvement gets implemented and maximized and the gains flatten out and then increase again.
  Some are using giant SRAM chips. They will then connect them with optics. When that doesn't keep up with demand, probably memristors or something will be scaled to be the next paradigm.
  Look at the massive improvements made by the DeepSeek team to the open source SOTA recently.
rileymat2 6 months ago

> But how could anyone reasonably expect the situation to be similar 3-5 years down the line?
We can’t, that’s why we are talking about the current capabilities. I remember using Dragon Natural Dictation software before the year 2000, I was young and it blew my mind away with the possibilities. 27 years later it still has not lived up to my young imagination. (It has gotten a ton better, no doubt, and will continue to with more AI)
- ilaksh 6 months ago
  
  AssemblyAI's most recent models are incredible. I think you aren't accurately assessing leading edge STT or how your young self would judge this capability level.
- tessierashpool 6 months ago
  
  spoiler alert: this always happens with new technologies, to some extent. the possibilities that materialize are always an odd subset of the possibilities that people envision when the tech is new. you can get better at predicting that subset by studying economics or the humanities, but randomness also plays a crucial role.
  
  mooreds 6 months ago
  
  Or it could be because low hanging fruit is low hanging.
DanHulton 6 months ago

The situation certainly won't be similar 3-5 years down the line, but it also won't necessarily be different in the way that people are predicting.
In the section about the 70% problem, the author writes:
> The good news? This gap will likely narrow as tools improve.
This is not a fact, it's a prediction, an article of faith. It's a common prediction, that these tools will only get better over time, but that's not guaranteed! I think it's likely as well, provided you define "improve" incredibly pedantically, but the unspoken part of this prediction is that the tools will improve _significantly,_ and _that_ part is one I have doubts about.
Honestly, I think it's pretty likely that we've just about hit the local maximum of our current techniques - training newer, bigger models is fantastically expensive and doesn't seem to have the same jumps in capability as earlier generations did, and stringing together a bunch of models in an agent produces frankly only modest improvements for the cost increase.
I've written about "the 70% problem" before (not by name, though), and my biggest worry there is that you require experience and good judgement to be able to use these tools effectively, and these tools by their nature erode that good judgement and deny you the experience. Juniors won't have to work through "the tough parts" of programming and build up the skills required to understand when an LLM is leading them astray, and even experienced programmers can lose familiarity with their own codebases as they rely on LLMs to provide the "understanding." Think of all the times where you haven't had to interact with a service for months, how long it takes to warm back up to it -- what about when that's _most_ of your code, because LLMs have been preventing you from needing to gain any deep understanding about it?
What happens if, in 3-5 years, LLMs don't get significantly better, but we've stopped producing properly capable Intermediate and Senior developers, and even started atrophying the skills of existing Seniors?
(Couple this with LLM-using developers who can move mountains in a day to kickstart new projects and get promoted, but leave behind a trashfire of a codebase for more "classically" experienced developers to clean up and maintain at a much slower pace. That's a problem that's _always_ been in the industry, just magnified due to the power of LLMs.)
- noobermin 6 months ago
  
  I work in a certain niche. The older I get and the longer I get, I realise I have to use less crutches and abstractions because intimate knowledge of my code is more important than supposed speed.
  LLMs lead everyone in the opposite path and that to me seems to ruin everything.
- MrMcCall 6 months ago
  
  [flagged]
  
  danielbln 6 months ago
  
  Predicting the next word correctly in the right context requires tremendous, vast amounts of knowledge; about language, about the world, about code. You're trying to be clever while being reductive, but you're also wrong.
  Every programming project starts with a brand new problem to solve? That's obviously false. The vast majority of programming projects are not only composed of smaller and simpler problems (of which most are not novel in any way), they aren't often even novel to begin with.
  These models are far from infallible (and frequently quite fallible), but if you can't even see the smallest sliver of utility for using LLMs for coding, I question your ability to judge technology.
  
  DanHulton 6 months ago
  
  Not to get into a pissing war, but I'll judge ya right back. =)
  There's a phrase from aviation that I keep thinking about when I think about LLMs - "behind the airplane."
  When you're "behind the airplane," you're in a situation where the airplane is doing something you don't intimately understand and you're reacting to it. It is an _incredibly_ dangerous place to be, and is indicative of you being in an emergency situation and over your head. You want to be "ahead of the airplane," where you understand intimately what is happening, and you are making forward-thinking plans about what you expect to do in the coming seconds, minutes, and hours.
  When I'm coding, I generally feel "ahead of the code." I'm thinking ahead about the architecture and design of what it is I want to accomplish, refactors are done with a high level of understanding and for a well-understood reason, etc. When coding with an LLM, I feel like I'm "behind the code" - I'm being given code that I now have to evaluate for purpose. I don't intimately know what it's doing, why it's written that way, the broader effects it will have or how to integrate it into the larger system.
  When you are "behind the airplane," that's when mistakes happen and planes crash. I think that metaphor extends very well to coding here, as well. =)
  
  danielbln 6 months ago
  
  This is not how I use LLMs for coding. I request small snippets or edits, and most of my time is spent conversing and evaluating solution paths. Once I do request and accept code, it is heavily typed and evaluated by the IDE, it is checked against tests and the overall architecture and data flow is still coming from me, I merely off-shore the minute implementation details. And here lies the crux: there are so many different ways of using these tools, some will make the plane crash, some won't and it's all kind of in flux. That said, even the staunchest critic should see that there is _something_ useful there.
CharlesW 6 months ago

> The problem with this is that it's supposedly about predicting the future but actually bases everything on LLM's current capabilities.
Is it possible you missed some of it? The entire last half is dedicated to predicting the future based on an agentic future that's (as the author notes) "a big unknown" as we have this conversation.
- ilaksh 6 months ago
  
  It's not the entire last half. He does mention agents, but inaccurately says that Devin is the only agentic software engineering tool.
  But he does not override his initial premise of doubting that things will change significantly from the current uses in the future.
  It's not really unknown, there are several popular tools, and we can expect them to be more reliable and useful as better models continue to be rolled out.
  Also if you are writing about the future then you should look beyond next year.
  Everyone should anticipate that there is a strong possibility that we continue on a similar trajectory of improvement.
  Therefore it is NOT reasonable to expect that things won't change dramatically in software engineering, because the trajectory is very rapid progress.
  Within the next 3, 5, maybe 10 years, many existing jobs including software engineering may be replaced by AI. That's the direction we are headed and any article should take the possibility seriously by now.
logicchains 6 months ago

>It's incredible that people still haven't figured out or won't accept or plan for technology to continue to improve
Because it's incredibly difficult for people to accept that the career they put years into and love might not exist in 5-10 years. At the current pace, in a few years we'll have something smarter than o3 while no more expensive than o1. Then all it would take is someone to find a nice way to rig it up with short-term memory and wrap it in something like Devin, and then companies would be able to hire the equivalent of a top 1% remote dev for less than 10% of a current dev salary.

leovingi 6 months ago

> The irony? AI tools might actually enable this renaissance. By handling the routine coding tasks, they free up developers to focus on what matters most: creating software that truly serves and delights users.

How very naive. All productivity and efficiency gains will be utilized to push out an ever-increasing stream of new features because that is what drives sales and that is what the business needs.

It is the exact same reason why widening a highway does not actually reduce traffic congestion.

jakozaur 6 months ago

Echos with my experience. LLMs work great if you micromanage them aggressively, but the moment I put too much trust, it backfires terribly.

verteu 6 months ago

Yeah, but that's also true for a mediocre (human) SWE...
edit: My point was that, since mediocre SWEs make up a large proportion of the workforce, an LLM that performs at the level of "mediocre human" will still have massive implications for the labor force.
- namaria 6 months ago
  
  People keep saying that, but would you work in a team full of "mediocre" professionals or have a social circle full of "mediocre" friends if you had a choice?
- downrightmike 6 months ago
  
  Ideally LLMs speed them up enough that companies will use the time savings to upskill them right? Right?

polishdude20 6 months ago

I've tried using Claude for the first time to help me write some wifi code for an ESP32. It gets like 90% of the way there and I can't get the last 10% to where it works how I need it. Every time I mention an issue with the code, it rewrites it and a new issue pops up. Then I me tiin that and it rewrites it with the original issue again. It's like it forgets the past mistakes it's made.

Also there's been multiple times where it just forgets a closing parenthesis in a function or tells me a function definition doesn't exist in the code even though it's literally right there.

Bjorkbat 6 months ago

Tangentially related, I get strong Metaverse/NFT vibes around predictions on agents.

Namely, a lot of predictions were made around NFTs that just didn’t make sense or were kind of dumb. My pet favorite was this notion that in the future you could bring your NFTs with you to different games and the like. You could buy a Batman NFT costume and have your guy wear it while playing metaverse World of Warcraft. They basically took Ready Player One and ran with it. Besides the fact that this is much harder to do than they could imagine, it’s also kind of a goofy idea.

I feel the same way with predictions made around AI agents. My pet favorite is the notion that we stop using the internet and delegate everything to our agents. Planning a trip? Let an AI agent handle things for you. Shopping? Likewise, let an agent handle your purchases. In the future ads won’t even be targeted at people, they’ll target agents instead, and pretty soon agents won’t even browse the internet but talk to other agents instead.

Is it feasible? I can’t say. I’m more interested in how goofy it all sounds. The notion that you no longer have buyer preferences while your agent gets served ads, or the notion of planning that trip to Rome or whatever and just entrusting the agent with the itinerary as if it won’t come up with unoriginal suggestions.

Work agents make more sense in general, but the sentiment remains.

coffeefirst 6 months ago

Yeah… as best I can tell, there is no such thing as an Agent. It’s a brilliant piece of marketing on top of “an automation with an LLM somewhere in the input interface.”
Which is kind of neat, but it’s being sold as Jarvis and it’s more like an ATM you can talk to.

LeicaLatte 6 months ago

Flow state code != production quality code. Not yet.

From personal experience, writing real-life production code is like running a marathon; it requires endurance and rigor. I've seen AI-generated code — it’s more like a treadmill run, fine for practice only. Unpredictable issues, hallucinations pop up all the time with AI code, I have to rely on my own skills to navigate and solve problems.

monkeydust 6 months ago

The unintended consequence of this is well captured in Bainbridge's piece from 1983 'Ironies of Automation'

https://www.sciencedirect.com/science/article/abs/pii/000510...

- The more tasks you automate the less practice people get in doing those tasks themselves and developing the experience in executing them.

- Yet, experience becomes more important as issues/exceptions occur (which they will)

- Ironically, when people are needed the most they are least prepared to step in because automation has taken over their day-to-day.

Net result might be a reduced supply of 'experience' but demand remains strong thus increasing the price of it.

muglug 6 months ago

The even harder truth not mentioned here is that existing tools have a hard time understanding large codebases with well-establishd internal patterns and libraries.

The article mostly talks about how AI tools can help with new things, but a large amount of software development is brownfield, not greenfield.

greenavocado 6 months ago

This is not a problem at all as long as you use very good typing because the local contract boundaries are what matter unless you use huge amounts of global state which everybody knows is a very very bad idea and has been demonized for decades

sega_sai 6 months ago

It is a somewhat interesting take.

What I am interested in as a person teaching a computing course, what is the best way to force people to understand/interact with the code coming from the LLM. I.e. when I give computing problems to students, it is often easy to put the problem in chatgpt and get an answer. In a very significant fraction of cases the code would be somewhat sensible and would pass the tests. In some cases the output would use the wrong approach or would fail the test, but not often enough to completely discourage cheating.

In the end this comes down to the question of what skills we want from people writing code with the help of the LLM, and how to test for those skills. (here I'm not talking about professional programmers, but scientists rather)

uludag 6 months ago

I was reading the book "How Big Things Get Done," which is immensely applicable to the field of software engineering. The book is about how and why big projects fail. The book mentioned that IT projects were among the worst in cost and time overrun. I see essentially a win-win situation for software developers:

Either,

AI will enhance the work of software engineering on a fundamental level, helping SWE projects to be delivered (more) on time and with high(er) quality (I can't state how amazing this would be)

things won't get significantly better, projects still can't reliably be delivered, software quality doesn't get better, etc. (the robots won't be taking our jobs)

It will be interesting to see which future we'll end up.

mooreds 6 months ago

Here's a related HN discussion about the 70% article that Addy wrote on the same topic:

https://news.ycombinator.com/item?id=42336553

agentultra 6 months ago

Here are some more hard truths to add to the pile.

> this kind of crawling and training is happening, regardless of whether it is ethical or not

Glad we've established that it's going to change our profession regardless of ethics.

> Software engineers are getting closer to finding out if AI really can make them jobless

The capital class is definitely interested in this. They would love to pay fewer of us or pay us less and still get the same results. The question in 2025 might be: why would I pay you if you're not using GenAI assistants? Bob over there accepts a lower salary and puts out more code than anyone else on this team!. They may not care what the answer is: profit is all that matters.

After all, they clearly don't care about the ethics of training these models, exploiting labor in countries with weak worker protections, soaking up fresh water during local droughts, etc. Why would they care about you and your work?

Personally I don't find that generating code is where I do most of my programming work. I spend more of my time thinking and making sure I'm working on the right thing and that I'm building it correctly for the intended purpose. For that I want tools that aid me in my thinking: model checkers, automated theorem provers, and better type systems, etc. I need to talk to people. I don't find reviewing generated code to be especially productive even though it feels like work.

I think code synthesis will be more useful. Being able to generate a working implementation from a precise specification in a higher-level language will be highly practical. There won't be a need to review the code generated once we trust the kernel since the code would be correct by construction and it can be proven how the generated code ties to the specification.

We can't use GenAI to do the synthesis and replace a kernel as we still haven't solved the "black box" problem of neural nets.

The problem I find with GenAI and programming is that human language is sufficiently vague for communicating with folks but too imprecise for programming.

I suspect that in a few years there could be a gold mine for consulting: fixing AI-generated "house of cards" code.

Hope we're all good with the coming wave of security errors, breaches, and general malfeasance that's coming with the wave of GenAI code. You think software today could be better? The current models have been trained on all the patterns that make it the way it is now. And they will generate more of it. We have to hope that "software engineers" can read enough code, fast enough, and catch those errors before they ship. Should be good times.

deadbabe 6 months ago

The big change: no more big universally used frameworks like React, Vue, etc.

Every company has its own little special framework crafted by an AI with its own nuances you need to learn, and your skills will no longer transfer from company to company. Gone will be the days when you can swap out a software engineer for a similar one with the same experience in a framework you use. Every engineer coming in has to start from zero and learn exactly how to work with your special paradigms, DSLs, etc.

ThrowawayR2 6 months ago

The LLM only "knows" what's in its training corpus so it's output quality is going to be comparatively crappy on private custom frameworks with only a small codebase to train on (relative to the entire corpus). Companies with private frameworks will therefore lose out.
If anything, the opposite is going to happen: LLMs will cause consolidation and software evolution to grind to a halt. Coders are going to gravitate to libraries and frameworks the LLM generates the best outputs for, leaving a few big winners and the rest going extinct. New language features and new frameworks will be slow to enter general usage because of the chicken and egg problem: there will be little or code in the LLM training corpus that uses them idiomatically until some humans write enough quality code using the new features/frameworks that can be absorbed into the corpus to meaningfully affect its output.
- deadbabe 6 months ago
  
  Except no companies will allow substantial improvements to enter the public corpus, because they will basically be giving away their competitive advantages that will quickly be gobbled up by other LLMs. Those improvements will represent larger investments since there will be fewer engineers who can write them by hand.
aorona 6 months ago

That would be a disaster for hiring and productivity I don't any upside to creating AI generated siloes from a business or developer perspective.
- deadbabe 6 months ago
  
  Disaster is coming. Tragedy of the commons.

siliconc0w 6 months ago

You can kinda see an agent that automates a 'best practices' AI-assisted workflow to iterates with AI to generate and run tests, optimize code, and feed in the right examples or signatures so it can generate code that properly uses existing APIs.

Maybe trying to use cheaper models first and then calling the more expensive models to iterate and get through tests or errors.

I haven't really seen anything like this so I imagine it's a lot harder than I'm imagining.

localghost3000 6 months ago

LLM’s have replaced Stack Overflow for me. Occasionally I can use them to write a simple bash script for me or some bit of terraform that I don’t feel like looking up. Useful but not exactly life changing.

What I would consider a game changer would be generating USEFUL unit and integration tests. Ideally that used the existing fixtures and utilities already in place. I’ve yet to see that happen even with code the LLM had just generated.

MrMcCall 6 months ago

"Predicting the next word" is never going to facilitate modifying complex code in any way that won't be a disaster. Creating sensible, useful, comprehensive tests requires a comprehensive understanding of the code -- there is no shortcut for any but the simplest of algorithms.
"Short cuts make long delays." --Tolkien
- localghost3000 6 months ago
  
  Just so. That’s kind of my point. In order for them to be the revolution the tech industry wants us to believe they are, they need to be able to handle necessary but laborious tasks like tests for us. I don’t think it’s possible with the current approaches however

m463 6 months ago

I think this might be like those static analysis tools.

people run their code through it, and it finds a LOT of problems, some of them serious.

and then you fix them and you feel good.

But after that, you have a bunch of problems that aren't real. You either ignore them or the tool starts creating more work exponentially.

I suspect AI will be like that. It will help you a bit, but don't get caught up in it because you'll spend time distracted doing AI things.

mtrovo 6 months ago

This is an amazing breakdown. I can't believe how quickly AI tools have become integral to our workflows. Completely agree with the idea that experienced developers will be even more valuable in the future.

intended 6 months ago

I found these ideas / analogies to be helpful to cut out the chatter for GenAI

1) Analogy - Using chat GPT to do code is like deciding to cross the amazon. You start moving, and half way through you realize the map is wrong. Now you are in the middle of the Amazon, without a map.

2) Reliable Matte Painting / Rough work -> I sketch, so matte painting is what GenAI reminds me of, quite a bit. It’s going to get you half way… somewhere, faster. You have to get to the end yourself.

It’s easier for me to assume that GenAI is going to be mostly correct 70% of the time, and never a 100% of the time. Build and use accordingly.

I’m tired about the chatter about the chatter about GenAI at this point.

bionhoward 6 months ago

the biggest reason not to use LLMs is the customer noncompete clauses where they’re learning from the conversations but you agree not to “develop models that compete” —

makes me think everyone using OpenAI, Anthropic, Gemini API, Mistral, Copilot, Perplexity, etc is dumb or addicted to short term benefits while totally ignoring the long term consequences of paying to train your own replacement while agreeing not to compete

bdangubic 6 months ago

smart people run LLM locally :)

bflesch 6 months ago

LLMs are an ad-free version of google search. If you use LLMs with this expectation, it'll be improvement of software engineering productivity.

bflesch 6 months ago

AI bros downvoting this argument every time because they can't handle the truth. For people who can't install ublock origin, LLMs are a god-sent. AI bros want to keep living in their bubble where they grift millions of dollars for a better google search.
- kbelder 6 months ago
  
  In fairness, a better Google search is worth millions of dollars.
  And I agree that the primary virtue of coding with current AI is that it just cuts time that is spent looking through multiple pages of junky search results. It's not necessarily better or more accurate, but it is far faster.
  I'd even say that AI coding would be far less appealing if Google hadn't let its search results go to crap over the last ten years.
- boshalfoshal 6 months ago
  
  I mean this is clearly an incorrect take? Google search cannot reason about its contents autonomously, nor synthesize out of sample data from the things it indexes.
  Google search is a query over user generated content. LLMs generate content. These are fundamentally different things.
  Even if ChatGPT was just "lossy google" it is a much better UX than searching documents that google returns, since it can just synthesize what I want to hear in a few paragraphs, whereas with Google search I would have to know what to search and synthesize the response I'm looking for after analyzing a few things. There is an argument to be made here that we are eroding the value of intelligence and thinking for ourselves, but as a _product_ its vastly better, imo.
  Whether or not LLMs actually lead to AGI is yet to be seen though. But I definitely think it has reasoning capabilities that you are underating in your comparison.
- bdangubic 6 months ago
  
  “AI bros” have learned how to use “AI” to automate 30-40-50..% of what they used to have to do manually while others are bitching about “AI bros” on online forums. who knows, knows. who doesn’t bitches online :)

dvas 6 months ago

There are many ways of thinking and reasoning about the profession and what it means to each and every one of us.

Some of the buckets:

* The builders, don't care how they get the result.

* The crafters, those who care how they get to the results (vim vs emacs), and folks who enjoy looking at tiny tiny details deep in the stack.

* The get-it-done people, using standard IDE tools, stick with them, and it's a strict 9-5.

...

And many with types, and subtypes of each ^^.

In my opinion, many people have a passion for making computers to do cool things. Along the way, some of us have fallen into the mentality of using a particular tool or way of doing things and get stuck in our ways.

I think it's another tool that you must know how to utilize and utilize in a way that does not atrophy your skills in the long run. I personally use it for learning and allowing me to get an in on a knowledge topic which I can then pull on and verify that the information is correct.

lincpa 6 months ago

[dead]