i'm not sure i buy the long-term "*90% productivity*" claims for complex, legacy enterprise systems, but for the boilerplate, libraries, build-tools, and refactoring? the gain is gigantic. all the time-consuming, nerve-wracking stuff is mostly taken care of.
you start off checking every diff like a hawk, expecting it to break things, but honestly, soon you see it's not necessary most of the time. you just keep your IDE open and feed the "analyze code" output back into it. in java, telling it to "add checkstyle, run mvn verify and repair" works well enough that you can actually go grab a coffee instead of fighting linter warnings.
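to make the "run mvn verify and repair" loop concrete, here is a minimal sketch of what the tooling is roughly doing while you get that coffee. `request_fix` is a hypothetical hook standing in for whatever hands the build log to your assistant, and `max_rounds` is an assumed safety limit, not something any particular tool defines.

```python
import subprocess
from typing import Callable, Tuple

BuildResult = Tuple[bool, str]  # (succeeded, combined build output)


def run_mvn_verify() -> BuildResult:
    """Run `mvn verify` in the current directory and capture its output."""
    proc = subprocess.run(["mvn", "verify"], capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


def verify_and_repair(
    run_build: Callable[[], BuildResult],
    request_fix: Callable[[str], None],
    max_rounds: int = 5,
) -> bool:
    """Build, hand failures to the assistant, and repeat for up to max_rounds repairs."""
    for _ in range(max_rounds):
        ok, output = run_build()
        if ok:
            return True
        # Hypothetical hook: pass the failing log back to the coding assistant.
        request_fix(output)
    # Check whether the last repair attempt actually fixed the build.
    ok, _ = run_build()
    return ok
```

in real use you would pass `run_mvn_verify` as `run_build`; keeping the runner injectable means the loop can be tried out without maven installed.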
the theory is that what remains is just the logic and ideas. we'll see how that holds up when the architecture gets genuinely tangled. but for now, letting it branch off, create boilerplate, and write a simple test while you just iterate on the spec works shockingly well. you only write source code when it's too annoying to write down the spec in plain english.
it raises the real question: if your competitor Y just fired 90% of their developers to save a buck, would you blindly follow suit? or would you keep your team, use this massive leverage, and just *dwarf* Y with a vastly better product?
If it is writing both the code and the tests then you're going to find that its tests are remarkable: they just work. At least until you deploy to a live state and start testing for yourself. Then you'll notice that it's mostly only testing the exact code that it wrote; it's not confrontational or trying to find errors, and it already assumes that it's going to work. It won't ever come up with the majority of breaking cases that a developer will by itself; you will need to guide it. Also, while fixing those, the odds of introducing other breaking changes are decent, and after enough prompts you are going to lose coherency no matter what you do.
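A small illustration of the difference between a test that mirrors the implementation and one that goes looking for trouble. The `average` helper and the chosen inputs are made up for the example; they only stand in for the kind of code and tests being described.

```python
def average(values):
    """Arithmetic mean; the kind of helper that gets written and tested in one go."""
    return sum(values) / len(values)


def confirmatory_test():
    # Mirrors the implementation: it passes, but proves very little.
    assert average([2, 4, 6]) == 4


def adversarial_cases():
    """Feed in inputs a sceptical reviewer would try; return the ones that blow up."""
    failures = []
    for case in ([], [1, None], ["1", "2"]):
        try:
            average(case)
        except Exception as exc:
            failures.append((case, type(exc).__name__))
    return failures
```

The confirmatory test passes, while all three adversarial inputs raise (an empty list divides by zero, the other two hit a `TypeError`). Whether those should be errors or handled gracefully is a spec decision, which is exactly the part you have to bring yourself.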
It definitely makes a lot of boilerplate code easier, but what you don't notice is that it's just moving the hard-to-find problems into hidden new areas. That fancy code that it wrote maybe doesn't take the lower-level building blocks, such as database optimization, into account. Even for a simple application, a half-decent developer can create something that will run quite a bit faster. If you start bringing these problems to it then it might be able to optimize them, but the amount of time that's going to take is non-negligible.
It takes developers time to sit on code, learn it along with the problem space, and work out how to tie them together effectively. If you take that away there is no learning; you're just the monkey copy-pasting the produced output from the black box and hoping that you get a result that works. Even worse, every step you take doesn't bring you any closer to the solution; it's pretty much random.
So what is it good for? It can read, "understand", translate, write and explain things to a sufficient degree much faster than us humans. But if you are (at the moment) trusting it at anything past the method level for code then you're just shooting yourself in the foot; you're just not feeling the pain until later. In a day you can have it generate, for example, a whole website, backend, db etc. for your new business idea, but that's not a "product". It might as well be a promotional video that you throw away once you've used it to impress the investors. For now that might still work, but people are already catching on and beginning to wise up.
Because it does.
> I still don't see ANY proof that it doesn't generate a total unmaintainable unsecure mess, that since you didn't develop, you don't know how to fix.
I wouldn't know since I've never tried but I'd imagine that Claude Code would indeed generate a half-baked Next.js monstrosity if one-shot and left to its own devices. Being the learned software engineer I am, however, I provide it plenty of context about architecture and conventions in a bootstrapped codebase and it (mostly) obeys them. It still makes mistakes frequently but it's not an exaggeration to say that I can give it a list of fields with validation rules and query patterns and it'll build me CRUD pages in a fraction of the time it'd take me to do so.
I can also give it a list of sundry small improvements to make and it'll do the same, e.g. I can iterate on domain stuff while it fixes a bunch of tiny UX bugs. It's great.
I'm not sure what your circumstances are but even if it's not true for you, it's true for many other people.
People online with identical views to them all assure me that they're all highly skilled though.
Meanwhile I've been experimenting with using AI for shopping and all of them so far are horrendous. Can't handle basic queries without tripping over themselves.
This is an interesting choice for a first experiment. I wouldn't personally base AI's utility for all other things on its utility for shopping.
Most people don't really understand coding, but shopping is a far simpler task and so it's easier to see how and where it fails (i.e. with even mildly complex instructions).
On the tech side I see it saving some time with stuff like mock data creation, writing boilerplate, etc. You still have to review it like it's a junior. You still have to think about the requirements and design to provide a detailed understanding to them (AI or junior).
I don't think either of these will provide 90% productivity gains. Maybe 25-50% depending on the job.
We are very much in need of an actual way to measure real economic impact of AI-assisted coding, over both shorter and longer time horizons.
There's been an absolute rash of vibecoded startups. Are we seeing better success rates or sales across the industry?
That's the same false argument that the religious have offered for their beliefs and was debunked by Bertrand Russell's teapot argument: https://en.wikipedia.org/wiki/Russell%27s_teapot
not talking about toys or vibecoded crap no one uses.
Nobody is.
Perhaps nobody cares to “convince you” and “win you over”, because…why? Why do we all have to spoon feed this one to you while you kick and scream every step of the way?
If you don’t believe it, so be it.
If you use it correctly, you can get better quality, more maintainable code than 75% of devs will turn in on a PR. The “one weird trick” seems to be to specify, specify, specify. First you use the LLM to help you write a spec (or document one, if it’s pre-existing). Make sure the spec is correct and matches the user story and edge cases. The LLM is good at helping here too. Then break down separations of concerns, APIs, and interfaces. Have it build a dependency graph. After each step, have it reevaluate the entire stack to make sure it is clear, clean, and self-consistent.
Every step of this is basically the AI doing the whole thing, just with guidance and feedback.
Once you’ve got the documentation needed to build an actual plan for implementation, have it do that. Each step, you go back as far as relevant to reevaluate. Compare the spec to the implementation plan, close the circle. Then have it write the bones, all the files and interfaces, without actual implementations. Then have it reevaluate the dependency graph and the plan and the file structure together. Then start implementing the plan, building testing jigs along the way.
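To give a feel for the “dependency graph” and “bones” steps, here is a minimal sketch. The module names and the `Storage` interface are invented for illustration; the only real API used is `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from abc import ABC, abstractmethod
from graphlib import TopologicalSorter

# Hypothetical modules, each mapped to the modules it depends on.
DEPENDENCIES = {
    "models": set(),
    "storage": {"models"},
    "api": {"models", "storage"},
    "cli": {"api"},
}


def implementation_order(dependencies):
    """Return module names with every dependency ahead of its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


class Storage(ABC):
    """The 'bones': an interface with no implementation yet."""

    @abstractmethod
    def save(self, key: str, value: str) -> None:
        """Persist a value under a key."""

    @abstractmethod
    def load(self, key: str) -> str:
        """Return the value stored under a key."""
```

For this graph the order comes out as models, storage, api, cli, and a circular dependency would raise `graphlib.CycleError`, which is a cheap way to catch a plan that doesn’t hang together before any implementation exists.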
You just build software the way you used to, but you use the LLM to do most of the work along the way. Every so often, you’ll run into something that doesn’t pass the smell test and you’ll give it a nudge in the right direction.
Think of it as a junior dev that graduated top of every class ever, and types 1000wpm.
Even after all of that, I’m turning out better code, better documentation, and better products, and doing what used to take 2 devs a month, in 3 or 4 days on my own.
On the app development side of our business, the productivity gain is also strong. I can’t really speak to code quality there, but I can say we get updates in hours instead of days, and there are fewer bugs in the implementations. They say the code is better documented and easier to follow, because they’re not under pressure to ship hacky prototype code as if it were production.
On the current project, our team size is 1/2 the size it would have been last year, and we are moving about 4x as fast. What doesn’t seem to scale for us is size. If we doubled our team size I think the gains would be very small compared to the costs. Velocity seems to be throttled more by external factors.
I really don’t understand where people are coming from saying it doesn’t work. I’m not sure if it’s because they haven’t tried a real workflow, or maybe tried it at all, or they are definitely “holding it wrong.” It works. But you still need seasoned engineers to manage it and catch the occasional bad judgment or deviation from the intention.
If you just let it, it will definitely go off the rails and you’ll end up with a twisted mess that no one can debug. But use a system of writing the code incrementally through a specification-evaluation loop as you descend the abstraction from idea to implementation, and you’ll end up winning.
As a side note, and this is a little strange and I might be wrong: I have the AI keep a journal about its observations and general impressions, sort of the “meta” without the technical details. I frame this to it as a continuation of “awareness” for new sessions. I also have a short set of “onboarding” documents that describe the vision, ethos, and goals of the project. I have it read the journal and the onboarding docs at the beginning of each session and try to work with it as a “collaborator” rather than a tool. At the end of the day, I remind it to update its journal of reflections about the day’s work.
I find that the work produced is much less prone to going off the rails or taking shortcuts when I have this in the context. By reading the journal I also get ideas on where and how to do a better job of steering and nudging to get better results; it’s like a review system for my prompting. The onboarding docs seem to help keep the model working towards the big picture? Idk.
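For anyone who wants to try the journal-and-onboarding routine, here is a rough sketch of the plumbing, assuming the onboarding docs are Markdown files in one folder and the journal is a single text file. The file layout, headings, and function names are all assumptions for illustration, not a description of any specific tool.

```python
from datetime import date
from pathlib import Path


def build_session_context(onboarding_dir: Path, journal_path: Path) -> str:
    """Concatenate onboarding docs and the journal into one start-of-session preamble."""
    parts = []
    for doc in sorted(onboarding_dir.glob("*.md")):
        parts.append(f"## Onboarding: {doc.name}\n{doc.read_text(encoding='utf-8')}")
    if journal_path.exists():
        parts.append(f"## Journal\n{journal_path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)


def append_journal_entry(journal_path: Path, reflection: str, day: date) -> None:
    """Append the end-of-day reflection under a dated heading."""
    with journal_path.open("a", encoding="utf-8") as fh:
        fh.write(f"\n### {day.isoformat()}\n{reflection.strip()}\n")
```

The string from `build_session_context` goes in at the start of a session, and `append_journal_entry` records whatever reflection the model writes at the end of the day.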
This system only seems to work with some models. GPT5 for example doesn’t seem to benefit and sometimes gets into a very creepy vibe.
Ruby on Rails and its imitators blew away tons of boilerplate. Despite some hype at the time about a productivity revolution, it didn’t _really_ change that much.
> , libraries, build-tools,
Unsure what you mean by this; what bearing do our friends the magic robots have on these?
> and refactoring
Again, IntelliJ did not really cause a productivity revolution by making refactoring trivial about 20 years ago. Also, refactoring is kind of a solved problem, due to IntelliJ et al; what’s an LLM getting you there that decent deterministic tooling doesn’t?
Smart organizations will not just deliver better products but will likely start products that they were hesitant to start before, because the cost of starting is a lot closer to zero. Smart engineering leadership will steer developers toward delivering value rather than self-serving, endless iterations of tooling enhancements, etc.
If I was a CTO and my competitor Y fired 90% of their devs, I'd try to secure funding to hire their top talent and retain them. The vitriol alone could fuel some interesting creations and when competitor Y realizes things later, their top talent will have moved on.
>> Smart organizations will not just deliver better products but likely start products [...]
This is not the 90s anymore when low hanging fruit was everywhere ready to be picked. We have everything under the sun now and more.
The problem with bullshit apps is not that they took you 5 months to build. What you build now in 5 minutes is still bullshit. Most of the remaining work is bullshit jobs: spinning up useless "features" and frameworks that nobody needs and shoving them down the throats of customers who never asked for them. Now it's possible to dig holes and fill them back in (do pointless work) at a much improved pace thanks to AI.
Another way of increasing profit is to simply reduce your headcount by 90% while keeping the same output.*
Hence, I think some companies will keep downsizing. Some companies will hire. It depends a lot.
*Assuming 90% productivity increase.
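A quick sanity check on that footnote, since the numbers depend on what "90% productivity increase" means. The sketch below assumes it means each remaining dev produces 90% more output, and that total output must stay constant; both function names are just for illustration.

```python
def headcount_cut_for_gain(gain: float) -> float:
    """Fraction of staff that can go if each remaining dev produces (1 + gain)x output."""
    return 1 - 1 / (1 + gain)


def gain_needed_for_cut(cut: float) -> float:
    """Per-dev output increase needed to keep total output after cutting this fraction."""
    return 1 / (1 - cut) - 1
```

Under that reading, a 90% gain supports a headcount cut of about 47%, not 90%. Cutting 90% of staff at constant output needs each remaining dev to produce 10x, i.e. a 900% increase. If "90% productivity" instead means each task takes 90% less effort, that is the 10x case and the 90% cut does follow.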
Is it the same with tech? Facebook has 3 billion monthly active users. No amount of tech will bring that up to 6 billion. If you were to double the amount of time someone spends on Facebook, or double the ads they see or double the click through rate, what does that really mean?
I think that it's more along the lines of "do you fire people" instead of just "do you fire devs". Fewer devs means less of a need for PMs, so they can be let go as well, and maybe with the rise of AI assisted design tools, you don't need as many UX people, so you let some of them go as well.
As for building better products, I feel like that's a completely different topic than using AI for productivity gains, but only because at the end of the day you need buy in from upper management in order to build the features/redo existing features/both that will make the product better. I should also mention I'm viewing this from the position of someone who works at an established company and not a startup, so it may differ.
Seniors can adjust, but e.g. junior frontend-only devs might be doomed in both situations, as they might not be able to contribute enough to business-critical features to justify their costs, and most frontend-related tasks will be taken over by the "10x" seniors.
If it is a big company the answer is and will always be: whatever makes the stock price rise the most.
Remember sometimes the most productive thing to have is not money or people but time with your ideas.
CTO is rewriting company platform (by himself with AI) and is convinced it's 100x productivity. But when you step back and look at the broader picture, he's rewriting what something like Rails, .NET, or Spring gave us 15-20 years ago? It's just in languages and code styles he is (only) familiar with. That's not 100x for the business, sorry...
you hire more if you are growing and have new ideas you just never had the chance to implement, because they were not practical or feasible at that level of tech (non-assisted humans clicking code and taking sick leaves)
Terafab is suddenly making so much sense!