Survey: a third of senior developers say over half their code is AI-generated
Related: https://www.theregister.com/2025/08/28/older_developers_ai_c...
I have over 30 years of experience and recently used Claude Opus 4.1 (via browser and claude.ai) to generate an ECMA-335 and an LLVM code generator for a compiler, and a Qt adapter for the Mono soft debugging protocol. Each task resulted in 2-3kLOC of C++.
The Claude experience was mixed; there is a high probability that the system doesn't respond, or just briefly shows an overloaded message and does nothing. If it does generate code, I quickly run into an output limit and have to manually press "continue", and then the result often gets scrambled (i.e. the order of the generated code fragments gets mixed up, which requires another round with Claude to fix).
After this process, the resulting code then compiled immediately, which impressed me. But it is full of omissions and logical errors. I am still testing and correcting. All in all, I can't say at this point that Claude has really taken any work off my hands. In order to understand the code and assess the correctness of the intermediate results, I need to know exactly how to implement the problem myself. And you have to test everything in detail and do a lot of redesigning and correcting. Some implementations are just stubs, and even after several attempts, there was still no implementation.
In my opinion, what is currently available (via my $20 subscription) is impressive, but it neither replaces experience nor does it really save time.
So yes, now I'm one of the 30% of seniors who have used AI tools, but I didn't really benefit from them in these specific tasks. Not surprisingly, the original blog also states that nearly 30% of senior developers report "editing AI output enough to offset most of the time savings". So not really a success so far. But all in all I'm still impressed.
Instead I provided Claude with the source code of a transpiler to C (one file) which is known to work and uses the same IR that the new code generators were supposed to use.
This is a controlled experiment with a clear and complete input and clear expectations and specifications of the output. I don't think I would be able to clearly isolate the contributions and assess the performance of Claude when it has access to arbitrary parts of the source code.
That's not the part it saves me time in, it saves me time in looking up the documentation. Other than that, it might be slower, because the larger the code change is, the more time I need to spend reviewing, and past a point I just can't be bothered.
The best way I've found is to have it write small functions, and then I tell it to compose them together. That way, I know exactly what's happening in the code, and I can trust that it works correctly. Cursor is probably a better way to do that than Claude Code, though.
As you point out, with tests it's also the same with any AI tool available, but to get to that result I have to keep prompting it until it gives me the desired output, whereas I might be able to do it myself in 2-3 iterations.
Reading documentation always leaves me a little more knowledgeable than before, while prompting the LLM gives me no knowledge at all.
And I also have to decide which LLM would be good for the task at hand, and most of them will not be free (unless I use a local one, but that will also use the GPU and add an energy cost).
I may be nitpicking, but I see too many holes with this approach
I find it extremely useful as a smarter autocomplete, especially for the tedious work - changing function definitions, updating queries when DB schema changes, and writing http requests/api calls from vendor/library documentation.
None of the use cases you mention requires an LLM. They're already available as IDE functionality.
IntelliJ has LLM-based autocomplete, which I'm okay with, but it's still wrong too many times. It works extremely well with Rust. Their non-LLM autocomplete is also superb; it uses ML to suggest the closest relevant match, IIRC.
It also makes refactoring a breeze, I know what it's going to do exactly.
Also, it can handle database refactoring to a certain extent! And for that it does not require an LLM, so there's no nondeterministic behavior.
Also, the IDE has its own way of doing HTTP requests, and it's really nice! But I can use its live templates to autocomplete any boilerplate code. It only requires setting up once. No need to fiddle with prompts.
it saves me time in looking up the documentation
I have a Perplexity subscription which I heavily use for such purposes, just asking how something works or should be used, with a response that is right on point and comes with examples. Very useful indeed. Perplexity gives me access to Claude Sonnet 4 w/o Thinking, which I consider a great model, and it can also generate decent code. My intention was to find out how good the recent Claude Opus is in comparison and how much of my work I'm able to delegate. Personally I much prefer the user interface features, performance and availability of Perplexity to Claude.ai.
(It wasn't clear to me that I would be able to toggle out of accept changes mode, so I resisted for a loooooong time. But turns out it's just a toggle on/off and can be changed in real-time as it's chugging along. There's also a planning state but haven't looked into that yet)
It always asks before running commands unless you whitelist them. I have whitelisted running testsuites and linters, for example so it can iterate on those corners with minimal interaction. I have had to learn to let it go ahead and make small obvious mistakes rather than intervene immediately because the linters and tests will catch them and Claude will diagnose the failure and fix them at that point.
Anyway I took a small toy project and used that to get a feel for claude-code. In my experience using the /init command to create CLAUDE.md (or asking Claude to interview you to create it) is vital for consistent behavior.
I haven't had good "vibe" experiences yet. Mostly I know what I want to do and just basically delegate implementation. One thing that has worked well for me is to ask Claude to propose a few ways to improve or implement a feature. It has come up with a few things I hadn't thought of that way.
Anyway, claude-code was very good at slowly and incrementally earning my trust. I resisted trying it because I expected it would just run hog-wild doing bewildering things, but that's not what it does. It tends to be a bit of an asskisser in its communication style, in a way that would annoy me if it were a real person. But I've managed to look past that.
You can just let it work, see what’s uncommitted after it’s over, and get rid of the result if you don’t like it.
The problem is, it's like a very, very junior programmer that knows the framework well, but won't use it consistently and doesn't learn from mistakes AT ALL. And has amnesia. Fine for some trivial things, but anything more complicated the hand-holding becomes so involved you are better off doing it yourself. That way you internalise some of the solutions as well, which is nice because then you can debug it later! Now I have a huge PR that even I myself don't really grasp as much as I would want.
But for me the nail in the coffin was the terrible customer service. ymmv.
If you don't know what they are good at and how to use them of course you may end up with mixed results and yes, you may waste time.
That's a criticism I have also towards AI super enthusiasts (especially vibe coders, albeit you won't find much here), they often confuse the fact that LLMs often one shot 80% of the solutions with the idea that LLMs are 80% there, whereas the Pareto principle well applies to software development where it's the hardest 20% that's gonna prove difficult.
1) Your original post asks a lot, if not too much, out of the LLM; the expectation you have is too big, to the point that to get anywhere near decent results you need a super detailed prompt (if not several spec documents), and your conclusion stands true: it might be faster to just do it manually. That's the state of LLMs as of today. Your post neither hints at such detailed and laborious prompting nor seems to recognize that you asked too much of it, which suggests you are not yet very comfortable with the limitations of the tool. You're still exploring what it can and can't do. But that also implies you're not yet an expert.
2) The second indication that you're not yet as comfortable with the tools as you think you are is context management. 2-3k LOC of code is way too much. It's a massive amount of output to hope to get good results from (this also ties in with the quality of the prompt, the guidelines and code practices provided, etc, etc).
3) Neither 1 nor 2 is a criticism of your conclusions or opinions; if anything, they confirm your point that LLMs are not there yet. But what I disagree with is the rush to conclude from your experience that AI coding provides zero net benefit. That I don't share. Instead of settling on what it could do (help with planning, writing a spec file, writing unit tests, providing the more boilerplate-y parts of the code) and using the LLM to reduce friction (and thus provide a net benefit), you essentially asked it to replace you and found out the obvious: that LLMs cannot take care of non-trivial business logic yet, and even when they can, the results are nowhere near satisfactory. But that doesn't mean that AI-assisted coding is useless and the net benefit is zero or negative; it only becomes so when the expectations on the tool are too big and the amount of information provided is either too small to get consistent results or so large that context becomes an issue.
Your conclusion from that is "but they are doing it wrong", while also claiming they are saying things they didn't say (0 net benefits, useless, etc).
Do you see how that might undermine your point? You feel they haven't taken the time to understand the tools, but did you actually read what they wrote?
Even if you know better than they do how much they know, isn't the tool still inadequate for power use when it is so easy to misuse?
Too much tweaking and adapting of users to the needs of the tool (vs. the other way around) and there is little point in using them (which is a bit of the sickness of modern-day computing: 'with computers you can solve problems lightning fast that you wouldn't have without them').
Prior to LLMs, my impression was "high learning curve, high results" was a pretty popular sweet-spot with a large portion of the tech crowd. It seems weird how much LLMs seem to be an exception to this.
Something I've found useful with Claude Code is that it works a lot better if I give it many small tasks to perform to eventually get the big thing done, rather than just dumping the big thing in its lap. You can do this interactively (prompt, output, prompt, output, prompt, output...) or by writing a big markdown file with the steps to build it laid out.
The fact that it works is amazing, but I'm less convinced that it's enhancing my productivity.
(I think the real productivity boost for me is if I still write the code and have the assistant write test coverage based on diffs, which is trivial to prompt for with good results)
I still do the part of my job that I have experience in, analyzing the need, and use the AI like an assistant to write small libraries or pieces of code. That way, errors have less chance to appear. Then I glue it all together.
For me the time trade-off is best like that. If I have to describe the whole thing, I'm not far from having done it myself, so there's no point for me.
Important: I work alone, not in a team, so maybe that has an impact on my perspective.
Then we can give someone that entire string of prompts as a repeatable recipe.
They certainly weren't a time-saver right away, but they became one after some time giving them a real shot. I tested their + my limits on small projects: working out how to get them to do the whole project, figuring out when they stop working and why, figuring out which technology they work best with, figuring out the right size of problems to give them, figuring out how to recognize when I'm asking them something they can't do well and ask for something different instead, and noticing when I've guided them into creating code that they can't actually continue to be successful with.
I started last December in Cursor's agentic mode and have been in Claude Code since probably March or April. It's definitely been a huge boost all year for side projects - but only in the last couple of months have I been having success in a large codebase.
Even with all this experience I don't know that I would really be able to get much value out of the chat interface for them. They need to be proposing changes I can just hit accept or reject on (this is how both Claude Code and Cursor work btw - you don't have to allow it to write to any file you don't want or execute any command you don't want).
I'm pretty radical on this topic, but for me cognitive load is good: you are making your neurons work and keeping synapses in place where they matter (at least for your job). I totally accept writing down docs or howtos to make some future action easier and reduce that cognitive load, but using an AI agent IMO is like going mountain biking on an electric bike.
Yes, you keep seeing the wonderful vistas but you are not really training your legs.
I know how to nail a nail, I've nailed so many nails that I can't remember them all.
My job is to build a house efficiently, not nail nails. Anything that can make me more efficient at it is a net positive.
Now I've saved 2 hours in the framing process by using a nail gun, I have 2 extra hours to do things that need my experience. Maybe spot the contractor using a nail plate in the wrong way or help the apprentice on their hammering technique.
Caveat: In the EU, an e-bike REQUIRES some physical effort any time for the motor to run. Throttles are illegal.
Ironically, e-bikes, at least in the EU, are having the exact opposite effect. More people that don't normally ride bikes are using e-bikes to get about.
At least in Germany people rather joke that the moment e-bikes became popular, people began to realize that they suddenly became too unathletic to be capable of pedaling a bicycle. I know of no person who uses an e-bike who did not ride an ordinary bicycle before.
In the EU, an e-bike REQUIRES some physical effort any time for the motor to run.
The motor must shut off when 25 km/h is reached - which is basically the speed that a trained cyclist can easily attain. So because of this red tape stuff, e-bikes are considered to be useless and expensive by cyclists who are not couch potatoes.
So you get both the wonderful views (building the house or delivering the software) but also you get improved health (keeping your mind trained on both high level thinking and low level implementation vs high level only).
The vast majority of developers aren't summitting beautiful mountains of code, but are instead sifting through endless corporate slop.
We might say that using a hammer constantly will develop your muscles more, but in carpentry there is still plenty of manual work that will develop your muscles anyway.
The trades destroy human bodies over time and lead to awful health outcomes.
Most developers will and should take any opportunity to reduce cognitive load, and will instead spend their limited cognitive abilities on things that matter: family, sport, art, literature, civics.
Very few developers are vocational. If that is you and your job is your identity, then that's good for you. But don't fall into the trap of thinking that's a normal or desirable situation for others.
The vast majority of developers aren't summitting beautiful mountains
I'm not sure you're approaching this metaphor the right way. The point is that coding manually is great cognitive exercise which keeps the mind sharp for doing the beautiful stuff.
The trades destroy human bodies over time and leads to awful health outcomes.
Again, you're maybe being too literal and missing the point. No one is destroying their minds by coding. Exercise is good.
I'm not sure you're approaching this metaphor the right way. The point is that coding manually is great cognitive exercise which keeps the mind sharp for doing the beautiful stuff.
No, I'm challenging the metaphor. Working the trades isn't exercise - it's a grind that wears people out.
Again, you're maybe being too literal and missing the point. No one is destroying their minds by coding. Exercise is good.
We actually have good evidence that the effects of heavy cognitive load are detrimental to both the brain and mental health. We know that overwork and stress are extremely damaging to both.
So reducing cognitive load in the workplace is an unambiguous good, and protects the brain and mind for the important parts of life, which are not in front of a screen.
No, I'm challenging the metaphor. Working the trades isn't exercise - it's a grind that wears people out.
If your job is just grinding out code in a stressful and soul-crushing manner, the issue lies elsewhere. You will be soon either grinding out prompts to create software you don't even understand anymore or you will be replaced by an agent.
And in no way am I implying you are part of the issue.
I need to add a FooController to an existing application, to store FooModels to the database. The controller needs the basic CRUD endpoints, etc.
I can spend a day doing it (crushing my soul) or I can just tell any agentic LLM to do it and go do something that doesn't crush my soul, like talking with the customer about how the FooModels will be used after storing.
But it'll produce bad code!
No it doesn't. It knows _exactly_ how to do a basic CRUD HTTP API controller in C#. It's not an art form, it's just rote typing and adding Attributes to functions.
Because it's an Agentic LLM, it'll go look at another controller and copy its structure (which is not standard globally, but this project has specific static attributes in every call).
Then I review the code, maybe add a few comments for actual humans, commit and push.
My soul remains uncrushed, the client is happy because I delivered the feature on time, and I have half the day free for other smaller tasks that would otherwise become technical debt.
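To make the "rote boilerplate" point concrete, here is a rough sketch of the shape of such a controller - in Python/Flask rather than the C# the comment is about, with a hypothetical FooModel represented as plain dicts and an in-memory stand-in for the database - since the point is only how mechanical the CRUD endpoints are (decorators here play the role of the C# attributes):

    # Rough Flask stand-in for the C# CRUD controller described above;
    # "db" is an in-memory stand-in for the persistence layer.
    from flask import Flask, abort, jsonify, request

    app = Flask(__name__)
    db = {}
    next_id = 1

    @app.post("/foos")                      # decorator ~ a C# routing attribute
    def create_foo():
        global next_id
        foo = {"id": next_id, **request.get_json()}
        db[next_id] = foo
        next_id += 1
        return jsonify(foo), 201

    @app.get("/foos/<int:foo_id>")
    def read_foo(foo_id):
        foo = db.get(foo_id)
        if foo is None:
            abort(404)
        return jsonify(foo)

    @app.put("/foos/<int:foo_id>")
    def update_foo(foo_id):
        if foo_id not in db:
            abort(404)
        db[foo_id] = {"id": foo_id, **request.get_json()}
        return jsonify(db[foo_id])

    @app.delete("/foos/<int:foo_id>")
    def delete_foo(foo_id):
        db.pop(foo_id, None)
        return "", 204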
I can’t write this without feeling preachy, and I apologize for that. But I keep reading a profound lack of agency in comments like these.
If your job is just grinding out code in a stressful and soul-crushing manner, the issue lies elsewhere.
The vast majority of developers are in or near this category. Most software developers never write code outside of education or employment and would avoid doing so if an AI provided the opportunity. Any opportunity to reduce cognitive load is welcome
I think you don't recognise how much of an outlier you are in believing that your work improves your cognitive abilities.
We actually have good evidence that the effects of heavy cognitive load are detrimental to both the brain and mental health. We know that overwork and stress are extremely damaging to both.
I don't think this is fair either, you're comparing "overwork and stress" to "work." It's like saying we have evidence that extreme physical stress is detrimental ergo it's "unambiguously" healthier to drive than to walk.
Maybe you could share your good evidence so we can see if normal coding tasks would fall under the umbrella of overwork and stress?
This is so well-founded that I do not have to provide individual sources - it is the current global accepted reality. I wouldn't provide sources for the effect of CO2 emissions on the climate or gravity, either.
However, the opposite is not true. If you have evidence that routine coding itself improves adult brain health or cognitive ability, please share RCTs or large longitudinal studies showing net cognitive gains under typical workloads.
It's clear that you're more interested in "winning" than actually have a reasonable discussion so goodbye. I've had less frustrating exchanges with leprechauns.
And yet, you now want me to source individual studies on those effects in a HN thread? Yes, in this instance you are approaching flat-earth/climate-change-denial levels of discourse. Reducing cognitive load is an unambiguous good.
If you think routine coding itself improves brain health or cognitive ability, produce the studies showing as you demanded from me, because that is a controversial claim. Or you can crash out of the conversation.
With electric assist, I can get up faster and to the business of coming down the hill really fast.
We have ski-lifts for the exact same reason. People doing downhill skiing would make their legs, heart and lungs stronger in the process of walking up the hill with their skis. But who wants to do that, that's not the fun part.
And to step back from analogy-land.
I'm paid to solve problems, I'm good at architecture, I can design services, I can also write the code to do so. But in most cases the majority of the code I write is just boilerplate crap with minor differences.
With LLMs I can have them write the Terraform deployment code, write the C# CRUD Controller classes and data models and apply the Entity Framework migrations. It'll do all that and add all the fancy Swagger annotation while it's at it. It'll even whip up a Github build action for me.
Now I've saved a few days of mindless typing and I can get into solving the actual problem at hand, the one I'm paid to do. In reality I'm doing it _while_ I'm instructing the LLM to do the menial crap + reviewing the code it produced so I'm moving at twice the speed I would be normally.
"But can't you just..." nope, if every project was _exactly_ the same, I'd have a template already, there are just enough differences to not make it worth my time[1]. If I had infinite time and money, I could build a DSL to do this, but again, referring to[1] - there's no point =)
It's more cost efficient to pay the LLM tax to OpenAI, Anthropic or whatever and use it as a bespoke bootstrapping system for projects.
A nailgun isn’t automated in the way an LLM is, maybe if it moved itself around and fired nails where it thought they should go based on things it had seen in the past it would be a better comparison.
But I think of work as essentially two things - creative activity and toil. I simply use AI for toil, and let my brain focus on creativity and problem solving.
Writing my 100,000th for loop is not going to preserve my brain.
So yeah, 30%-50% seems right, but it's not like I lost any part of my job that I love.
So yeah, past a certain age, you'll be happy to reduce your mental load. No question about it. And I feel quite relieved when Claude writes this classic algorithm I understood long ago and don't want to re-activate in my brain. And I feel quite disappointed when Claude misses the point and I have to code review it...
If you don't wanna use AI, that's entirely up to you, but "you're gonna forget how to program if you use AI and then whatever are you going to do if the AI is down" reeks of motivated reasoning.
By using LLMs to do some of the stuff I have long gotten over, I have a bit more mental energy to tackle new problems I wouldn't have previously.
As well LLMs just aren't actually that competent yet, so it's not like devs are completely hands off. Since I barely do boilerplate as I work across legacy projects, there's no way Claude Code today is writing half my code.
I'm training smarter, and exercising better, instead of wasting all the workout/training time on warmups, as it were.
are you all aware that this basically means you are exercising your brain less day in and day out, and in the end you will forget how to do things?
IDEs did the same thing and we found other ways to exercise our brains. This one is really something unreasonable to worry about.
In the past, that much nitpicky detail just wouldn't have gotten done, my time would have been spent on actual features. But what I just described was a 30 minute background thing in claude code. Worked 95%, and needed just one reminder tweak to make it deployable.
The actual work I do is too deep in business knowledge to be AI coded directly, but I do use it to write tests to cover various edge cases, trace current usage of existing code, and so on. I also find AI code reviews really useful to catch 'dumb errors' - nil errors, type mismatches, style mismatch with existing code, and so on. It's in addition to human code reviews, but easy to run on every PR.
Don’t get me wrong, I care very deeply about the organization and maintainability of my code and I don’t use “agents”. I carefully build my code (and my infrastructure as code based architecture) piece by piece through prompting.
And I do have enough paranoia about losing my coding ability - and I have lost some because of LLMs - that I keep a year in savings to have time to practice coding for three months while looking for a job.
Also I tend to get more done at a time, it makes it easier to get started on "gruntwork" tasks that I would have procrastinated on. Which in turn can lead to burnout quite quickly.
I think in the end it's just as much "work", just a different kind of work and with more quantity as a result.
A far more interactive coding "agent" that makes sure it walks through every change it makes with you, and doesn't just rush through tasks. That helps team members come up to speed on a repository by working through it with them.
Strangely I've found myself more exhausted at the end of the week and I think it's because of the constant supervision necessary to stop Claude from colouring outside the lines when I don't watch it like a hawk.
Welcome to management. Computers and code are easy. People and people wannabes like LLMs are a pain.
I've found a huge boost from using AI to deal with APIs (databases, k8s, aws, ...) but less so on large codebases that needed conceptual improvements. But at worst, I'm getting more than a 10% benefit, just because the AIs can read files so quickly, answer questions and propose reasonable ideas.
it alleviates a lot of mental energy
For me, this is the biggest benefit of AI coding. And it's energy saved that I can use to focus on higher level problems e.g. architecture thereby increasing my productivity.
It breaks flow. It has no idea what my intention is, but very eagerly provides suggestions I have to stop and swat away.
it's so [great to have auto-complete] annoying to constantly [have to type] have tons of text dumped into your text area. Sometimes it looks plausibly right, but with subtle little issues. And you have to carefully analyze whatever it outputs for correctness (like constant code review).
There's literally no way I can see that resulting in better quality, so either that is not what is happening or we're in for a rude awakening at some point.
https://www.joelonsoftware.com/2000/04/06/things-you-should-... (read the bold text in the middle of the article)
These articles are 25 years old.
(a) don't know what you're doing and just approve everything you see or
(b) don't care how bad things get
And at this point it's not just a productivity booster, it's as essential as using a good IDE. I feel extremely uncomfortable and slow writing any code without auto-completion.
When the AI tab completion fills in full functions based on the function definition you have half typed, or completes a full test case the moment you start typing - mock data values and all - that just feels mind-reading magical.
But your approach sounds familiar to me. I find sometimes it may be slower and lower quality to use AI, but it requires less mental bandwidth from me, which is sometimes a worthwhile trade off.
because my corporate code base is a mess that doesn’t lend itself well to AI
What language? I picked up an old JS project that had several developers fail over weeks to upgrade to newer versions of react. But I got it done in a day by using AI to generate a ton of unit tests then loop an upgrade / test / build. Was 9 years out of date and it’s running in prod now with less errors than before.
Also upgraded rails 4 app to rails 8 over a few days.
Done other apps too. None of these are small. Found a few memory leaks in a C++ app that our senior “experts” who have spent 20 years doing c++ couldn’t find.
I'm pretty confident that I couldn't get it to implement a new element in a web browser. I'm talking about C++ in WebKit or Chromium, not a custom element in HTML/JS. Let's say browsers wanted a new data-table element that natively implemented a scrolling window, such that you registered events to supply elements and it asked for only the portion that was visible. I feel like those code bases are too complex for LLMs to add this (could certainly be wrong).
In any case, more concrete examples would help these discussions.
I can give a concrete example: I just asked ChatGPT (I should have asked something more integrated into my editor) to give me JavaScript to directly parse metadata out of an MP4 file in the browser. It gave me TypeScript but told me I should look at mp4box.
I spent 15-20 minutes getting an mp4box example set up (probably should have asked for that too), only to find that I think mp4box does not get me the data I wanted.
So I went back to ChatGPT and asked for JavaScript because I was hacking in something like jsfiddle. It gave me that and it worked!
I then said I wanted the title and comments meta data as that was missing from what it gave me. That worked too, first try.
I teach at an internship program and the main problem with interns since 2023 has been their over reliance on AI tools. I feel like I have to teach them to stop using AI for everything and think through the problem so that they don't get stuck.
Meanwhile many of the seniors around me are stuck in their ways, refusing to adopt interactive debuggers to replace their printf() debug habits, let alone AI tooling...
Meanwhile many of the seniors around me are stuck in their ways, refusing to adopt interactive debuggers to replace their printf() debug habits, let alone AI tooling...
When I was new to the business, I used interactive debugging a lot. The more experienced I got, the less I used it. printf() is surprisingly useful, especially if you upgrade it a little bit to a log-level aware framework. Then you can leave your debugging lines in the code and switch it on or off with loglevel = TRACE or INFO, something like that.
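A minimal sketch of that upgrade using Python's standard logging module (names are illustrative; Python has no TRACE level, so DEBUG plays that role here) - the debug lines stay in the code and one level switch turns them on or off:

    import logging

    log = logging.getLogger("orders")

    def settle(order_id, items):
        # These lines stay in the code permanently; they only emit
        # when the configured level allows it.
        log.debug("settling order %s with items %r", order_id, items)
        log.info("order %s settled", order_id)

    # One switch flips between quiet and full trace-style output:
    logging.basicConfig(level=logging.DEBUG)   # e.g. logging.WARNING in production
    settle(42, ["widget"])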
And obviously when you can't hook the debugger, logs are mandatory. Doesn't have to be one or the other.
Yes you need good logging also.
printf doesn't help with going up and down the call stack in the debugger to analyze the chain (you'd have to spam debug printfs all around where you expect this chain to happen to replace the debugger, which would waste time). The debugger is really powerful if you use it more than superficially.
you'd have to spam debug printfs all around where you expect this chain to happen to replace the debugger, which would waste time
It's not wasting time, it's narrowing in on the things you know you need to look for and hiding everything else. With a debugger you have to do this step mentally every time you look at the debugger output.
If you only care about some specific spot, then sure - printf is enough, but you also need to recompile every time you add a new one or change debug-related details, while the debugger lets you re-run things without recompilation. So if anything, the printf method can take more time.
Also, in a debugger you can reproduce printf using the REPL.
printf() is surprisingly useful, especially if you upgrade it a little bit to a log-level aware framework.
What do you mean by this? Do you mean using a logging framework instead of printf()?
https://rustc-dev-guide.rust-lang.org/tracing.html#i-dont-wa...
Properly debugging my stack is probably one of the first things I set up because I find it way less tedious. For example, if you have an issue in a huge Object or Array, will you actually print all the content, paste it somewhere else and search through the logs? And by the way, most debuggers also have the ability to set up log points anyway, without having to restart your program. Genuinely curious to know how writing extra lines and having to restart makes things easier.
Of course I'm not saying that I never debug with logs; sometimes it's required or even more efficient, but it's often my second choice.
After that I went to work for Google, building distributed software running across many machines in a datacenter, and I have no idea how you would hook up a debugger even if you wanted to. It's all logs all the time, there.
By the time that was over, I was thoroughly accustomed to logging, and attaching a debugger had come to seem like a nuisance. Since then I've mostly worked on compilers, or ML pipelines, or both: pure data-processing engines, with no interactivity. If I'm fixing a bug, I'm certainly also writing a regression test about it, which lends itself to a logging-based workflow. I don't mind popping into gdb if that's what would most directly answer my question, but that only happens a couple of times a year.
Printf gives you an entire trace or log you can glance at, giving you a bird's eye view of entire processes.
I guess it's not a massive surprise given it's an HN forum and a reasonable percentage of HN candidates are doing LLM/AI stuff in recent cohorts, but it still means I have to apply a very big filter every time I open an article and people wax lyrical about how amazing Claude-GPT-super-codez is and how it has made them twice the engineer they were yesterday at the bargain price of $200 a month...
May it all die in a fire very soon. Butlerian jihad now.
Interactive debuggers are a great way to waste a ton of time and get absolutely nowhere. They do have their uses but those are not all that common. The biggest usecase for me for GDB has been to inspect stacktraces, having a good mental model of the software you are working on is usually enough to tell you exactly what went wrong if you know where it went wrong.
Lots of people spend way too much time debugging code instead of thinking about it before writing.
Oh, and testing >> debugging.
As for tooling, I really love AI coding. My workflow is pasting interfaces in ChatGPT and then just copy pasting stuff back. I usually write the glue code by hand. I also define the test cases and have AI take over those laborious bits. I love solving problems and I genuinely hate typing :)
While I understand that <Enter model here> might produce the meaty bits as well, I believe that having a truck factor of basically 0 (since no one REALLY understands the code) is a recipe for disaster and, I dare say, for the long-term unmaintainability of a code base.
I feel that any team needs someone with that level of understanding to fix non-trivial issues.
However, by all means, I use the LLM to create all the scaffolding, test fixtures, ... because that is mental energy that I can use elsewhere.
If I … generate fairly exhaustive unit tests of a trivial function
… then you are not a senior software engineer
Java: https://docs.parasoft.com/display/JTEST20232/Creating+a+Para...
C# (nunit, but xunit has this too): https://docs.nunit.org/articles/nunit/technical-notes/usage/...
Python: https://docs.pytest.org/en/stable/example/parametrize.html
cpp: https://google.github.io/googletest/advanced.html
A belief that the ability of LLMs to generate parameterizations is intrinsically helpful to a degree which cannot be trivially achieved in most mainstream programming languages/test frameworks may be an indicator that an individual has not achieved a substantial depth of experience.
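For reference, the pytest flavour linked above boils down to something like this (parse_price is a hypothetical function under test); the "array of test cases" is just the decorator's argument:

    import pytest

    # Hypothetical function under test.
    def parse_price(text: str) -> int:
        """Convert a price string like "$1.99" to integer cents."""
        return round(float(text.lstrip("$")) * 100)

    @pytest.mark.parametrize(
        "text, cents",
        [
            ("$1.00", 100),
            ("$0.99", 99),
            ("19.95", 1995),
            ("$0", 0),
        ],
    )
    def test_parse_price(text, cents):
        assert parse_price(text) == cents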
If you can share a tool that can analyze a function and create a test for all corner cases in a popular language, I'm sure some people would be interested in that.
What you should have said is that some parameterized test generators do automated white box testing where they look at your code similar to a fuzzer and try to find the test cases automatically. Your first link is literally just setting up an array with test cases, which basically means you'd have to use an LLM to quickly produce the test cases anyway, which makes parameterized testing sound exceedingly pathetic.
https://learn.microsoft.com/en-us/visualstudio/test/generate...
IntelliTest explores your .NET code to generate test data and a suite of unit tests. For every statement in the code, a test input is generated that will execute that statement. A case analysis is performed for every conditional branch in the code. For example, if statements, assertions, and all operations that can throw exceptions are analyzed. This analysis is used to generate test data for a parameterized unit test for each of your methods, creating unit tests with high code coverage. Think of it as smart fuzz testing that trims down the inputs and test cases to what executes all your logic branches and checks for exceptions.
Maybe you should have brought that up earlier instead of acting smug and burying the lede? It's also pretty telling that you didn't elaborate this further and kept your comment short.
I thought people were generally competent within the areas they discuss and are aware of the tooling within their preferred ecosystem. I apologize if that is not the case.
I'm not sure what depth of experience has to do with any of this, since it is busy work that costs a lot of time. A form with 120 fields is a form with 120 fields. There is no way around coming up with the several dozens of test cases that you're going to test without filling out almost all of the fields, even the ones that are not relevant to the test itself, otherwise you're not really testing your application.
I do like integration tests, but I often tell people the art of modern software is to make reliable systems on top of unreliable components.
There is a dramatic difference between unreliable in the sense of S3 or other services and unreliable as in "we get different sets of logical outputs when we provide the same input to a LLM". In the first, you can prepare for what are logical outcomes -- network failures, durability loss, etc. In the latter, unless you know the total space of outputs for a LLM you cannot prepare. In the operational sense, LLMs are not a system component, they are a system builder. And a rather poor one, at that.
And the integration tests should 100% include times when the network flakes out and drops 1/2 of replies and corrupts msgs and the like.
Yeah, it's not that hard to include that in modern testing.
Unit tests are for validation of error paths. Unit tests can leverage mocks or fakes. Need 3 retries with exponential back-off? Use unit tests and fakes. Integration tests should use real components. Typically, integration tests are happy path and unit tests are error paths.
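A small sketch of that fakes-for-error-paths idea, with a hypothetical flaky client and retry wrapper (the back-off is injectable so the test doesn't actually sleep):

    import unittest

    class FlakyFake:
        """Fake dependency that fails a set number of times, then succeeds."""
        def __init__(self, failures):
            self.failures = failures
            self.calls = 0

        def fetch(self):
            self.calls += 1
            if self.calls <= self.failures:
                raise ConnectionError("simulated outage")
            return "ok"

    def fetch_with_retries(client, attempts=3, backoff=lambda n: None):
        # Real code would pass something like lambda n: time.sleep(2 ** n).
        for attempt in range(1, attempts + 1):
            try:
                return client.fetch()
            except ConnectionError:
                if attempt == attempts:
                    raise
                backoff(attempt)

    class RetryTests(unittest.TestCase):
        def test_succeeds_after_two_failures(self):
            fake = FlakyFake(failures=2)
            self.assertEqual(fetch_with_retries(fake), "ok")
            self.assertEqual(fake.calls, 3)

        def test_gives_up_after_three_attempts(self):
            fake = FlakyFake(failures=5)
            with self.assertRaises(ConnectionError):
                fetch_with_retries(fake)
            self.assertEqual(fake.calls, 3)

    if __name__ == "__main__":
        unittest.main()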
Making real components fail and having tests validate failure handling in a more complete environment jumps from integration testing to resilience or chaos testing. Being able to accurately validate backoffs and retries may diminish, but validating intermediate or ending state can be done with artifact monitoring via sinks.
There is unit-integration testing which fakes out as little as possible but still fakes out some edges. The difference being that the failures are introduced via fake vs managing actual system components. If you connect to a real db on unit-integration tests, you typically wouldn't kill the db or use Comcast to slow the network artificially. That would be reserved for the next layer in the test pyramid.
Because my tests shouldn't fail when a 3rd party dependency is down.
Because I want to be able to fake failure conditions from my dependencies.
Because unit tests have value and mocks make unit tests fast and useful.
Even my integration tests have some mocks in them, especially for any services that have usage based pricing.
But in general I'm going to mock out things that I want to simulate failure states for, and since I'm paranoid, I generally want to simulate failure states for everything.
End to End tests are where everything is real.
Because if part of my tests involve calling an OpenAI endpoint, I don't want to pay .01 cent every time I run my tests.
This is a good time to think to yourself: do I need these dependencies? Can I replace them with something that doesn't expose vendor risk?
These are very real questions that large enterprises grapple with. In general (but not always), orgs that view technology as the product (or product under test) will view the costs of either testing or inhousing technology as acceptable, and cost centers will not.
But in general I'm going to mock out things that I want to simulate failure states for, and since I'm paranoid, I generally want to simulate failure states for everything.
This can be achieved with an instrumented version of the service itself.
This is a good time to think to yourself: do I need these dependencies? Can I replace them with something that doesn't expose vendor risk?
Given that my current projects all revolve solely around using LLMs to do things, yes I need them.
The entire purpose of the code is to call into LLMs and do something useful with the output. That said I need to gracefully handle failures, handle OpenAI giving me back trash results (forgetting fields even though they are marked required in the schema, etc), or just the occasional service outage.
Also integration tests only make sense once I have an entire system to integrate. Unit tests let me know that the file I just wrote works.
test fixtures
I'm curious- how does the AI know what you want?
""" Given these files: file1, file2, ... (these are pulled entirely into the LLM context)
Create a test fixture by creating a type that implements the trait A and should use an in memory SQLite DB, another one that implements Trait B, ...
"
Of course there is a bit of back and forth, but I find that using Interfaces/Traits/ABCs extensively makes LLMs perform better at these tasks (but I believe it is a nice side-effect of having more testable code to begin with).
However, wiring with IoC frameworks is a bit hit and miss, to be honest, so I often still have to do those parts manually.
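As an illustration of the kind of fixture such a prompt asks for - the Repository interface below is hypothetical, not from the poster's code base - a Python ABC backed by an in-memory SQLite DB might look like:

    import sqlite3
    from abc import ABC, abstractmethod

    class Repository(ABC):
        """Hypothetical interface the production code depends on."""
        @abstractmethod
        def add(self, name: str) -> int: ...
        @abstractmethod
        def get(self, item_id: int) -> str | None: ...

    class SqliteTestRepository(Repository):
        """Test fixture: same interface, backed by an in-memory SQLite DB."""
        def __init__(self):
            self.conn = sqlite3.connect(":memory:")
            self.conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")

        def add(self, name: str) -> int:
            cur = self.conn.execute("INSERT INTO items (name) VALUES (?)", (name,))
            return cur.lastrowid

        def get(self, item_id: int) -> str | None:
            row = self.conn.execute(
                "SELECT name FROM items WHERE id = ?", (item_id,)
            ).fetchone()
            return row[0] if row else None

    repo = SqliteTestRepository()   # drop-in for the real repository in tests
    assert repo.get(repo.add("widget")) == "widget"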
Found myself having 3-4 different sites open for documentation, context switching between 3 different libraries. It was a lot to take in.
So I said, why not give AI a whirl. It helped me a lot! And since then I have published at least 6 different projects with the help of AI.
It refactors stuff for me, it writes boilerplate for me, most importantly it's great at context switching between different topics. My work is pretty broadly around DevOps, automation, system integration, so the topics can be very wide range.
So no I don't mind it at all, but I'm not old. The most important lesson I learned is that you never trust the AI. I can't tell you how often it has hallucinated things for me. It makes up entire libraries or modules that don't even exist.
It's a very good tool if you already know the topic you have it work on.
But it also hit me that I might be training my replacement. Every time I correct its mistakes I "teach" the database how to become a better AI and eventually it won't even need me. Thankfully I'm very old and will have retired by then.
invalidating refresh keys after single use
That's called refresh token rotation and is a valid security practice.
Not sure why Google doesn't do this, but Atlassian does.
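A minimal sketch of the rotation idea (in-memory store, hypothetical names; a real system would persist hashed tokens with expiry): every refresh consumes the presented token and issues a new one, so a stolen refresh token works at most once:

    import secrets

    # token -> user_id
    active_refresh_tokens = {}

    def issue_refresh_token(user_id):
        token = secrets.token_urlsafe(32)
        active_refresh_tokens[token] = user_id
        return token

    def rotate(presented_token):
        # pop() makes the token single-use: a second presentation fails.
        user_id = active_refresh_tokens.pop(presented_token, None)
        if user_id is None:
            # Reuse of an already-consumed token is a theft signal; real
            # implementations often revoke the whole token family here.
            raise PermissionError("unknown or already-used refresh token")
        return issue_refresh_token(user_id), user_id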
But trying to use it like “please write this entire feature for me” (what vibe coding is supposed to mean) is the wrong way to handle the tool IMO. It turns into a specification problem.
I've also been close to astonished at the capability LLMs have to draw conclusions from very large complex codebases. For example I wanted to understand the details of a distributed replication mechanism in a project that is enormous. Pre-LLM I'd spent a couple of days crawling through the code using grep and perhaps IDE tools, making notes on paper. I'd probably have to run the code or instrument it with logging then look at the results in a test deployment. But I've found I can ask the LLM to take a look at the p2p code and tell me how it works. Then ask it how the peer set is managed. I can ask it if all reachable peers are known at all nodes. It's almost better than me at this, and it's what I've done for a living for 30 years. Certainly it's very good for very low cost and effort. While it's chugging I can think about higher order things.
I say all this as a massive AI skeptic dating back to the 1980s.
I tend to begin asking for an empty application with the characteristics I want (CLI, has subcommands, ...) then I ask it to add a simple feature.
That makes sense, as you're breaking the task into smaller achievable tasks. But it takes an already experienced developer to think like this.
Instead, a lot of people in the hype train are pretending an AI can work an idea to production from a "CEO level" of detail – that probably ain't happening.
you're breaking the task into smaller achievable tasks.
this is the part that I would describe as engineering in the first place. This is the part that separates a script kiddie or someone who "knows" one language and can be somewhat dangerous with it, from someone who commands a $200k/year salary, and it is the important part
and so far there is no indication that language models can do this part at. all.
for someone who CAN do the part of breaking down a problem into smaller abstractions, though, some of these models can save you a little time, sometimes, in cases where it's less effort to type an explanation to the problem than it is to type the code directly..
which is to say.. sometimes.
It is very useful to be able to ask basic questions about the code that I am working on, without having to read through dozens of other source files. It frees up a lot of time to actually get stuff done.
This, IMHO, is the critical point and why a lot of “deep” development work doesn’t benefit much from the current generation of AI tools.
Last week, I was dealing with some temporal data. I often find working in this area a little frustrating because you spend so much time dealing with the inherent traps and edge cases, so using an AI code generator is superficially attractive. However, the vast majority of my time wasn’t spent writing code, it was getting my head around what the various representations of certain time-based events in this system actually mean and what should happen when they interact. I probably wrote about 100 test cases next, each covering a distinct real world scenario, and working out how to parameterise them so the coverage was exhaustive for certain tricky interactions also required a bit of thought. Finally, I wrote the implementation of this algorithm that had a lot of essential complexity, which means code with lots of conditionals that needs to be crystal clear about why things are being done in a certain order and decisions made a certain way, so anyone reading it later has a fighting chance of understanding it. Which of those three stages would current AI tools really have helped with?
I find AI code generators can be quite helpful for low-level boilerplate stuff, where the required behaviour is obvious and the details tend to be a specific database schema or remote API spec. No doubt some applications consist almost entirely of this kind of code, and I can easily believe that people working on those find AI coding tools much more effective than I typically do. But as 'manoDev says in the parent comment, deeper work is often a specification problem. The valuable part is often figuring out the what and the why rather than the how, and so far that isn’t something AI has been very good at.
Feels like a similar situation to self driving where companies want to insist that you should be fully aware and ready to take over in an instant when things go wrong. That's just not how your brain works. You either want to fully disengage, or be actively doing the work.
The AI tools can just spit out function names and tools I don't know off the top of my head, and the only way to check they are correct is to go look up the documentation, and at that point I've just done the hard work I wanted to avoid.
This is exactly my experience, but I guess generating code with deprecated methods is useful for some people.
I thought vibe coding meant very little direct interaction with the code, mostly telling the LLM what you want and iterating using the LLM. Which is fun and worth trying, but probably not a valid professional tool.
E.g. one tool packages a debug build of an iOS simulator app with various metadata and uploads it to a specified location.
Another tool spits out my team's github velocity metrics.
These were relatively small scripting apps, that yes, I code reviewed and checked for security issues.
I don't see why this wouldn't be a valid professional tool? It's working well, saves me time, is fun, and safe (assuming proper code review, and LLM tool usage).
With these little scripts it creates it's actually pretty quick to validate their safety and efficacy. They're like validating NP problems.
This is complicated by the fact that some people use “vibe coding” to mean any kind of LLM-assisted coding.
And then, more people saw these critics using "vibe coding" to refer to all LLM code creation, and naturally understood it to mean exactly that. Which means the recent articles we've seen about how good vibe coding starts with a requirements file, then tests that fail, then tests that pass, etc.
Like so many terms that started out being used pejoratively, vibe coding got reclaimed. And it just sounds cool.
Also because we don't really have any other good memorable term for describing code built entirely with LLM's from the ground up, separate from mere autocomplete AI or using LLM's to work on established codebases.
I’m willing to vibe code a spike project. That is to say, I want to see how well some new tool or library works, so I’ll tell the LLM to build a proof of concept, and then I’ll study that and see how I feel about it. Then I throw it away and build the real version with more care and attention.
From Karpathy's original post I understood it to be what you're describing. It is getting confusing.
One imagines Leadership won't be so pleased after the inevitable price hike (which, given the margins software uses, is going to be in the 1-3 thousands a day) and the hype wears off enough for them to realize they're spending a full salary automating a partial FTE.
But, by the looks of things, models will be more efficient by then and a cheaper-to-run model will produce comparable output
So far there's negative evidence of this. Things are getting more expensive for similar outputs.
The current feature that I'm working on required 100 messages to finalize, and I would say the context window was around 35k-50k per "chat completion". My model of choice is Gemini 2.5 Flash, which has an input cost of $0.30/1M. Compare this to Sonnet, which is $3.00/1M.
If the person was properly designing and instructing the LLM to build something advanced correctly, I can see the bill being quite high. I personally don't think you need to use Sonnet 99% of the time, but if somebody else is willing to pay the bill, why not.
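Rough arithmetic on those numbers, assuming ~50k input tokens per completion: 100 messages × 50k ≈ 5M input tokens, which is about $1.50 at Flash's $0.30/1M versus about $15 at Sonnet's $3.00/1M (output tokens billed on top).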
I'm using Zed as my editor, and maybe 18 months ago I upgraded my system. I didn't miss the AI autocomplete at the time, so I didn't bother to set it up again. However, around two weeks ago I figured I'd give it another go.
I set up GitHub Copilot in Zed and... it's horrible. It seems like most of its suggestions are completely misguided and incorrect, usually just duplicating the code immediately above or below the cursor location while updating the name of a single identifier to match some perceived pattern. Not remotely close to what I'd consider useful; I'm definitely a faster & better programmer without it.
I also tried setting up some local models on ollama: I kept getting random tokens inserted that seemed to be markup from the model output that Zed didn't know how to parse. (On mobile rn, will post sample output when I am back at work if I remember to.)
Is paying Anthropic an arm and a leg for the privilege of granting them first-party access to train on my user data really the competitive move as a modern developer?
P.S. I know Zed has their own AI (and it seems like it should be really good!), but when they first introduced it, I tried it out and it immediately consumed the entire free tier's worth of credits in just a few minutes of normal coding: suggestions are proactively generated and count against your account credit even if not accepted, so I didn't really feel like I'd gotten a good sense of the tool by the time the trial ran out. Even if it's really good, it burns through credits extremely fast.
If you want a perfect investment imho, get Supermaven. Its autocomplete is 99% perfect.
suggestions are proactively generated and count against your account credit even if not accepted
This is no longer the case. They only count if you accept them.
Ime their model is not great, but better than copilot (ime copilot is just too slow and disruptive).
Also the options you have for tab-complete, afaik, are Zed/Zeta, GitHub Copilot, Supermaven, and local models. I don't think there are other providers right now; glad if I am wrong on that.
For large pieces of work, I will iterate with CC to generate a feature spec. It's usually pretty good at getting you most of the way there first shot and then either have it tweak things or manually do so.
Implementation is having CC first generate a plan, and iterating with it on the plan - a bit like mentoring a junior, except CC won't remember anything after a little while. Once you get the plan in place, then CC is generally pretty good at getting through code and tests, etc. You'll still have to review it after for all the reasons others have mentioned, but in my experience, it'll get through it way faster than I would on my own.
To parallelize some of the work, I often have Visual Studio Code open to monitor what's happening while it's working so I can redirect early if necessary. It also allows me to get a head start on the code review.
I will admit that I spent a lot of time iterating on my way of working to get to where I am, and I don't feel at all done (CC has workflows and subagents to help with common tasks that I haven't fully explored yet). I think the big thing is that tools like CC allow us to work in new ways but we need to shift our mindset and invest time in learning how to use these tools.
* brainstorm all the ideas, get Claude to write docs + code for all of them, and then throw away the code
* ask it to develop architecture and design principles based on the contents of those docs
* get it to write a concise config spec doc that incorporates all the features, respects the architecture and design as appropriate
* iterate over that for a while until I get it into a state I like
* ask it to write an implementation plan for the config spec
* babysit it as I ask it to implement phase by phase of the implementation plan while adhering to the config spec
It’s a bit slower than what I’d hoped for originally, but it’s a lot better in terms of end result and gives me more opportunity to verify tests, tweak the implementation, briefly segue or explore enhancements, etc.
To get that to work half-decent, you have to take on a PM/Tech-lead role, you're no longer a senior engineer.
But you’re saying it can be half-decent?
The problem is that about 75% of HN commenters have their identities tightly wound up in being a (genuflect) senior engineer and putting down PM/tech-lead type roles.
They’ll do anything to avoid losing that identity including writing non-stop about how bad AI code is. There’s an Upton Sinclair quote that fits the situation quite nicely.
I'd agree that 75% you speak of is generally hostile to the mere concept of PMs, but that's usually from a misapplication of PMs as proxy-bosses for absentee product owners/directors who don't want to talk to nerds - flow interruptions, beancounting perceived as useless, pointless ceremonies, even more pointless(er) meetings etc, and the further defiling of the definition of "agile".
But a deep conceptual product and roadmap understanding that helps one steer Claude Code is invaluable for both devs and PMs, and I don't think most of that 75% would begrudge that quality in a PM
But I’ve come full circle and have gone back to hand coding after a couple years of fighting LLMs. I’m tired of coaxing their style and fixing their bugs - some of which are just really dumb and some are devious.
Artisanal hand craft for me!
Usually it isn't, though - I just want to pump out code changes ASAP (but not sooner).
It’s just not worth it anymore for anything that is part of an actual product.
Occasionally I will still churn out little scripts or methods from scratch that are low risk - but anything that gets to prod is pretty much hand coded again.
https://github.com/BeehiveInnovations/zen-mcp-server/blob/ma...
It basically uses multiple different LLMs from different providers to debate a change or code review. Opus 4.1, Gemini 2.5 Pro, and GPT-5 all have a go at it before it writes out plans or makes changes.
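That isn't zen-mcp-server's actual code, of course; the core idea (several models independently reviewing a change before anything is written) can be sketched in a few lines of Python. Here ask_model is a placeholder you would wire to the respective provider SDKs, and the model names are just labels:

    # Toy sketch of a multi-model "debate" over a proposed change.
    def ask_model(model: str, prompt: str) -> str:
        # Stand-in for real provider calls (Anthropic, Google, OpenAI, ...).
        raise NotImplementedError(f"wire {model} to its provider SDK")

    def debate(diff: str, models=("opus-4.1", "gemini-2.5-pro", "gpt-5")) -> str:
        # Round 1: each model reviews the diff independently.
        reviews = {m: ask_model(m, f"Review this diff for bugs and design issues:\n{diff}")
                   for m in models}
        # Round 2: each model sees the others' reviews and gives a final verdict.
        combined = "\n\n".join(f"[{m}]\n{r}" for m, r in reviews.items())
        return "\n\n".join(
            ask_model(m, f"Given these reviews:\n{combined}\nWhat must change before merging?")
            for m in models)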
along with improving my skills in vim, this approach has made me significantly more productive and has made my code much simpler compared to when i was using LLM code generation tools.
there is no shortcut around hard work.
there is no shortcut to thoroughly interrogating the constraints of the software problem at hand.
developers who rely on LLM code generation are poor coworkers because they don't understand what they've written and why they've written it.
Older devs are not letting the AI do everything for them. Assuming they're like me, the planning is mostly done by a human, while the coding is largely done by the AI, but in small sections with the human giving specific instructions.
Then there's debugging, which I don't really trust the AI to do very well. Too many times I've seen it miss the real problem, then try to rewrite large sections of the code unnecessarily. I do most of the debugging myself, with some assistance from the AI.
I've largely settled on the opposite. AI has become very good at planning what to do and explaining it in plain English, but its command of programming languages still leaves a lot to be desired.
And that remains markedly better than when AI makes bad choices while writing code. Those are much harder to catch and require poring over the code with a fine-tooth comb, to the point that you may as well have just written it yourself, negating all the potential benefits of using it to generate code in the first place.
I always have more work than I can handle. Part of my job is deciding what not to do and what to drop because it hasn't got the right priority. Me spending an afternoon on a thing that is fun but not valuable is usually a bad use of my time. With LLMs, I'm taking on a few more of the things that I previously wouldn't have. That started with a few hobby projects that I'm now doing that I previously wasn't. And it's creeping into work as well. LLMs struggle on larger code bases. But less so with recent model releases.
LLMs are tools. If you master your tools, you become more productive.
LLMs are unpredictable tools which change all the time, let’s not pretend otherwise. You can’t “master” them in the same way as previous tools. You can learn some tricks to trick them to be closer to what you want, and that’s about it.
Imagine if every time you did the exact same movement to hammer a nail, you had to check your work. Maybe this time it hammered it in perfectly, or maybe it smashed your finger, or maybe it only went half-way through. You could never develop muscle memory for such a tool. You could use it, sure, but never master it.
A third? I would expect at least a majority based on the headline and tone of the article... Isn't this saying 66% are down on vibe coding?
I feel the benchmark is some engineer who shoots out perfect work the first time they hit the keyboard.
While I'll say it got me started, it wasn't a snap of the fingers and a quick debug to get something done. It took me quite a while to figure out why something seemed to work but really didn't (the LLM used command-line commands where Bash doesn't interpret the results the same way).
If it's something I know, I probably won't use an LLM (as it doesn't do my style). If it's something I don't know, I might use it to get me started, but I expect that's all I'll use it for.
If you find it is quicker not to use it then you might hate it, but I think it is probably better in some cases and worse in other cases.
Anything that makes development faster or easier is going to be welcomed by a good developer.
I strongly disagree. Struggling with a problem creates expertise. Struggle is slow, and it's hard. Good developers welcome it.
Struggling with a problem creates expertise. Struggle is slow, and it's hard. Good developers welcome it.
There is significant evidence that shows mixed results for struggle-based learning - it’s highly individualized and has to be calibrated carefully: https://consensus.app/search/challenge-based-learning-outcom...
Anybody who has developed software should understand the value of struggling with a difficult problem. I'm obviously not talking about classroom exercises where the problem sets are expected to match a given skill level or cultivate a specific skill set, so the very idea of individualized, calibrated learning is irrelevant.
As a teacher I'm also 100% uninterested in highly individualized, calibrated challenges for what I teach -- or for what I do professionally. The people who need those highly individualized, wildly different, more gently graduated increases in difficulty, for general problem solving or for the study of any area of programming or computer science, simply should not become engineers.
I think we'll find a middle ground though. I just think it hasn't happened yet. I'm cautiously optimistic.
("Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize." - https://news.ycombinator.com/newsguidelines.html)
They surveyed 791 developers (:D) and "a third of senior developers" do that. That's... generously, what... 20 people?
It's amazing how everyone can massage numbers when they're trying to sell something.
And of course, it's an article based on a source article based on a survey (by a single company), with the source article written by a "content marketing manager", and the raw data of the survey isn't released/published, only a marketing summary of what the results (supposedly) were. Very trustworthy.
as long as we manage trust, ethics, and transparency
Trust, ethics and transparency are at all-time lows in every corner of every capitalist industry imo
I don't know how anyone could look at the current state of things and think AI isn't being positioned to gut employment across many industries
Naive, childish and short-sighted mentality imo
survey of 791 developers
We have got to stop. In a universe of well over 25 million programmers, a sample of 791 is not significant enough to justify such headlines.
We’ve got to do better than this, whatever this is.
From another perspective: we've deduced a lot of things about how atoms work without any given experiment inspecting more than an insignificant fraction of all atoms.
TL;DR: The population size (25e6 total devs, 1e80 atoms in observable universe) is almost entirely irrelevant to hypothesis testing.
But statistically speaking, at a 95% confidence level you'd be within a +/- 3.5% margin of error given the 791 sample size, irrespective of whether the population is 30k or 30M.
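For what it's worth, that +/- 3.5% is the standard worst-case margin of error with p = 0.5, and the population size barely matters once it is much larger than the sample; a quick check in Python:

    # Worst-case 95% margin of error for a proportion: MOE = z * sqrt(p*(1-p)/n)
    import math

    n = 791
    moe = 1.96 * math.sqrt(0.5 * 0.5 / n)
    print(f"{moe:.3f}")  # ~0.035, i.e. about +/- 3.5 percentage points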
I feel no shame in doing the latter. I've also learned enough about LLMs to know how to write that CLAUDE.md so it sticks to best practices. YMMV.
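What "best practices" means will differ per repo, so purely as an illustration (not anyone's actual file), a minimal CLAUDE.md might look like:

    # CLAUDE.md (illustrative example; adapt to your project)
    - Run the linter and the full test suite before declaring any task done.
    - Prefer small, pure functions; do not add dependencies without asking.
    - Never touch generated files, migrations, or vendored code.
    - Match the existing style; do not reformat lines unrelated to the change.
    - When unsure about a requirement, ask instead of guessing.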
As we have all observed, the models get things wrong, and if you’re wrong 5% of the time, then ten edits in you’re at 60-40. So you need to run them in a loop where they’re constantly sanity-checking themselves: linting, styling, typing, and testing. In other words, calling tools in a loop. Agents are so much better than any other approach it’s comical, precisely because they’re scaffolding that lets models self-correct.
This is likely somewhat domain-specific; I can’t imagine the models are that great at domains they haven’t seen much code in, so they probably suck at HFT infrastructure, for example, though they are decent at reading docs by this point. There’s also a lot of skill in setting up the right documentation, testing structure, interfaces, etc. to make the agents more reliable and productive (fringe benefit: your LLM-wielding colleagues actually write docs now, even if they’re full of em-dashes and emoji). You also need to be willing to let it write a bunch of code, look at it, work out why it’s structurally deficient, throw it away, and build the structure you want to guide it - but the typing is essentially free, so that’s tractable. Don’t view it as bad code; view it as a useful null result.
But if you’re not using Claude Code or Codex or Roo or relatives, you’re living in an entirely different world to the people who have gone messianic about these things.
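As a rough, hypothetical illustration of that "tools in a loop" scaffolding (the check commands and the ask_model stub are assumptions; swap in whatever your project and provider actually use):

    # Minimal sketch of an agent loop: run the project's own gates and feed
    # failures back to the model until the checks pass or we give up.
    import subprocess

    CHECKS = [["ruff", "check", "."], ["mypy", "."], ["pytest", "-q"]]  # assumed gates

    def failing_output() -> str:
        for cmd in CHECKS:
            result = subprocess.run(cmd, capture_output=True, text=True)
            if result.returncode != 0:
                return f"{' '.join(cmd)} failed:\n{result.stdout}\n{result.stderr}"
        return ""  # everything passed

    def ask_model(prompt: str) -> None:
        # Placeholder: in a real agent this call would propose and apply edits.
        raise NotImplementedError("wire this to your agent or LLM of choice")

    for _ in range(5):  # bounded so it cannot spin forever
        errors = failing_output()
        if not errors:
            break
        ask_model(f"The checks failed. Fix the code.\n{errors}")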
I'm sure you're correct that a lot of this comes down to workflow and domain - in mine, the overhead of prompting, reviewing, and correcting usually outweighs the benefits. Of course, as always, this is just one asshole's opinion!
I'm not a coder but a sysadmin. 35 years or so. I'm conversant with Perl, Python (nods to C), BASIC, shell, PowerShell, and AutoIt (et al.)
I muck about with CAD - OpenSCAD, FreeCAD, and 3D printing.
I'm not a senior developer - I pay them.
LLMs are handy in the same way I still have my slide rules and calculators (OK kids I use a calc app) but I do still have my slide rules.
ChatGPT does quite well with the basics for a simple OpenSCAD effort, but it invents functions within libraries. That is to be expected - it's a next-token prediction function and not a real AI.
I find it handy for basics, very basic.
For me, success with LLM-assisted coding comes when I have a clear idea of what I want to accomplish and can express it clearly in a prompt. The relevant key business and technical concerns come into play, including complexities like balancing somewhat conflicting shorter and longer term concerns.
Juniors are probably all going to have to be learning this kind of stuff at an accelerated rate now (we don't need em cranking out REST endpoints or whatever anymore), but at this point this takes a senior perspective and senior skills.
Anyone can get an LLM and an agentic tool to crank out code now. But you really need that perspective to get them to crank out code that does something useful.
However, now they try to sell subscriptions to LLMs.
Tabnine has been in the scene since at least 2018.
Even when I am building tools that heavily utilize modern AI, I haven’t found it. Recently, I disabled the AI-powered code completion in my IDE because I found that the cognitive load required to evaluate the suggestions it provided was greater and more time consuming than just writing the code I was already going to write anyways.
I don’t know if this is an experience thing or not - I mainly work on a tech stack I have over a decade of experience in - but I just don’t see it.
Others have suggested generating tests with AI, but I find that horrifying. Tests are the one thing in your codebase you should be the most anal about accuracy on.
a) Answer "a lot". This answer supports the notion that developers are a dying breed and soon everyone will be able to vibe code their own personal software. Which at this point is obviously false. This answer is detrimental to my job.
b) Answer "not much". Then this gets interpreted as "he's an old fart, he can't learn new things, we should be thinking of retiring him". Which is (hopefully) false. Which is - again - detrimental to my job.
Personally for me it's 20 years of experience / Python / Copilot / 85% positive experience.
Senior developers were also more likely to say they invest time fixing AI-generated code.
That quote is the key - even if 1/3 of senior devs are pushing mostly AI-driven code, they are checking it first. And while the survey did not cover it, I suspect they are using their experience to decide which areas of a codebase are commonplace enough that AI can handle it vs. which areas are unique and require coding without AI.
People are learning when to use AI as a helpful tool and when not to.
I just can't fathom shipping a big percentage of work using LLMs.
I actually made an Ask HN about it just today https://news.ycombinator.com/item?id=45091607 but for some reason the HN algorithm never even showed it on the Ask page :/
In the last 6 months, when I have had an assignment that involved coding, AI has generated 100% of my code. I just described the abstractions I wanted and reusable modules/classes I needed and built on it.
This also: AI slowing down completion time by 19% - https://arxiv.org/abs/2507.09089
Not sure which one is true
Maybe I can potentially believe at one point AI code was used to create a starting point for 50% of PRs…
Also, green coding? That's new to me. I guess we'll see optional carbon offset purchasing in our subs soon.
https://publichealthpolicyjournal.com/mit-study-finds-artifi...
Checks out
https://www.fastly.com/products/ai
https://www.fastly.com/products/fastly-ai-bot-management
https://www.fastly.com/documentation/guides/compute/about-th...
If I don't know how to structure functions around a problem, I will also use the LLM, but I am asking it to write zero code in this case. I am just having a conversation about what would be good paths to consider.