# AI Code Generation: Architecture, Guardrails, and Legacy Strategy

**Podcast:** alphalist.CTO Podcast - For CTOs and Technical Leaders
**Published:** 2026-04-23

## Transcript

Hello friends, this is the Alphalist podcast.
I am your host, Tobi.
The goal of the Alphalist podcast is to empower CTOs with the info and insight they need to make the best decisions for their company.
We do this by hosting top thought leaders and picking their brains for insights into technical leadership and tech trends.
If you believe in the power of accumulated knowledge to accelerate growth, make sure to subscribe to this podcast.
Plus, if you're an experienced CTO, you will love the discussion happening in our Slack space where over 600 CTOs are sharing insights or visit one of our events.
Just go to alphalist.com to apply.
Welcome to the Alphalist podcast.
I'm your host, Tobi.
And today with me is a, let's say, consultancy legend and software architecture legend.
And we're here to talk about the topic, or rather, the title for this episode will be "AI Writes Code, Who Architects the Consequences?"
And today with me in the studio is Neal Ford.
Neal is a software architect, author, speaker, and one of the clearest thinkers in modern software architecture.
Many of you know him from his work on evolutionary architecture, fitness functions, and architecture trade-offs.
Neal spent more than 20 years, is that correct, Neal?
That's correct.
At ThoughtWorks, a consultancy that is also legendary.
And now he's working as an independent consultant, instructor, and author.
Great to have you here, Neal.
Great to be here.
Great to have a chat with you today.
Maybe before we jump into the topic, we start a little earlier.
I always do that.
I mean, before all the books and architecture fame, what is your original nerd path?
Like, how did you get into it, how did you win the fascination, or how did computers win you?
Like, how did that happen?
It was very much a sort of a love affair, because I actually started out in journalism.
That's where the writing comes from, because I started as a writer.
But then I didn't like how subjective writing was.
And I talked to someone at the time, and it was very good advice because they were warning that journalism maybe wasn't as great a career path as it was, you know, back in the mid part of the century.
And so I had always been good at math.
I got a two-year degree in physics, and then went to university for mechanical engineering, and I hated it.
It was awful.
It was like pulling teeth.
I did not like that at all.
And so when I hit that wall, I looked back and said, okay, what classes do I really like?
Because one of the, I guess, advantages slash disadvantages I had was I was paying my own way through university so I could decide what I liked and didn't like.
So I looked back and said, what classes do I really, really like?
And it was all the programming classes that were really fascinating.
Partly it was really good teachers, but partly it was just fascinating to me as a subject matter.
So that's when I switched over to computer science, got my degree in computer science, and then joined a consulting company.
Two years later I put my writing to work, writing my first book, which came out in 1995 and is about this tool called Delphi, which we'd written courseware about and then subsequently wrote a book about.
So that's where the writing comes from is the journalism bit.
So I think that's an advantage I have over a lot of people who started and stayed purely in the computer science world.
They never learned to write a complete sentence or learned to avoid passive voice.
And hate writing, right?
I know so many of them.
Exactly.
It's a whole different thing.
And so I was able to take advantage of that brief stint in journalism and apply it to computer science.
Okay, okay.
And then you said you didn't like the subjective parts of journalism.
And now also the whole software industry is changing a lot.
And maybe a little bit of that seeps back in through AI-generated code.
How do you like that?
And how do you see that as a challenge?
There's a big difference between subjective and non-deterministic.
Yeah.
Subjective still has a rational thought process behind it.
It's just I don't like your rational thought process.
I think it's biased in some way or another.
I think it's subjective, subject to something.
I think that's where subjective comes from.
Whereas the thing we're dealing with now, which is really annoying to me, much more so than a lot of other people that seem to be putting up with it, is the non-determinism.
And just the outright lying that agents will frequently do to you as you're working with them.
It's like, yeah, everything's great.
It's like, well, what about this?
Like, oh, yeah, you're right.
That's a train wreck.
Thanks a lot.
Yeah.
So I think that's an ongoing struggle because it's not something that's necessarily solvable because it's a feature, not a bug.
Part of that is why it works.
And you can't squeeze all that out of it and still have it work the way it does.
And so I think it's just something we're going to have to learn to live with.
Trust, but verify.
Everybody sells AI as a huge productivity boost, and you seem to see it partly as an architectural disruption.
What are people missing from your perspective, and how do you see through?
So let me first make this distinction that we made in one of our books, several books ago, and we keep making this distinction as we talk about software architecture.
As a software architect, let's talk about structural design as a software architect for a second and the role of a software architect.
One of the things we have to worry about is what we lump into what we call behavior.
This is the motivation for writing a piece of software.
I've got a problem.
I'm going to write some software to solve that problem.
That's the behavior that we want out of that system.
But the other thing you have to worry about and design for are capabilities.
That's where the "-ility" thing comes from.
Security, scalability, elasticity, responsiveness, all the "-ilities" of software architecture.
For non-trivial systems, you also have to take specific design steps to enable those things as well.
And one of the things that's driving me crazy about the AI stuff that is, you know, all the rage right now is that everybody focuses heavily on behavior and ignores the capabilities part, which is great for little demos and that sort of stuff.
So a perfect example of this, I was talking to one of my former colleagues.
We're doing a talk together at this upcoming Archive AI conference in Austin about how to strategize about AI.
She's a data scientist and has built this magnificent ecosystem where you can take a data flow diagram and give it to agents, and they'll produce Python code and do all this analysis.
It's like, wow, that's really amazing what you built.
And I said, how scalable is that?
And she looked like a deer in headlights because she'd never thought about that.
And that's the thing that architects have to think about.
It's like, hey, your demo is great.
Now it needs to work for 100,000 people.
How does that work?
That's the architecture part of that decision.
And it's not that agents are necessarily bad at architecture.
They just don't know about it because you have to tell them.
And so we're doing this class now called How to Teach Your Agents About Architecture.
What we're emphasizing is that you have to specify both those things, just like a human architect would when talking to human developers.
You have to specify both those concerns before you can be successful and build actual real software and not toys and demos and that sort of stuff.
Sure, but there's also one, I think, interesting aspect about demos and MVPs: solutions in the past have often been over-architected, right?
I remember one sentence from the former CTO of Pipedrive, Sergei Anikin, who said scalability problems are nice problems to have, and that really burned into my brain.
And this is also true, right?
But that's a common anti-pattern.
So one of the things that I did when I was at ThoughtWorks is I helped work on 32 editions of the ThoughtWorks technology radar.
And one of the things that we put on our radar at one point was web scale envy, because we kept running into these companies who were tying themselves in knots to try to build these ultra-scalable systems.
It's like, well, how many users do you have?
And they're like, well, 50 or 60.
It's like, what are you doing?
So, I mean, there's no excuse for over-engineering things.
What we're seeing now is, I don't think, as much over-engineering.
For the first time ever in software, there's a new "-ility" that we can start talking about now: ephemerality.
How ephemeral is this code?
If it's code that I'm just building right now that's going to be a small part of something I'm going to throw away in a week, I don't care about code quality or architecture or security and a bunch of other stuff.
But if I'm building a foundation for something that I want to build on for a decade, I got to care about those things.
I got to care about structure and components and dependencies and all that sort of stuff.
And the reason ephemerality is interesting now is that, always in the past, building software was a constant amount of effort.
But now you can actually have poorly created software built really, really quickly.
If you don't care about the quality, then you can just build it.
We saw little bits of this in the first big boom that I participated in, which was the RAD, Rapid Application Development, revolution, which gave us Visual Basic and Delphi.
My first book was about Delphi, this rapid application development tool.
And the goal for a lot of people was to take inexperienced people who understood computers and have them crank out business applications.
And you could, to a point, but then you'd run into scaling and security and all the things we're talking about, building things that are great as very simple systems.
But as you keep trying to build on them, they start collapsing under their own weight because there's not good architectural structure underneath.
And so we're seeing the same sort of thing now.
But the difference now, of course, is you can produce bad code so quickly now.
It's shocking.
And I think it's going to be shocking.
You sometimes produce good code very quickly, right?
Absolutely.
It's getting better all the time, in fact.
But let's talk about one of the fundamental problems that we keep bringing up.
And this goes back to the Dreyfus scale of knowledge acquisition, which is a very fancy thing.
It was created for the nursing industry, actually, in the United States back several decades ago.
Because nursing is a very complex profession.
It's not something you can go to class and learn.
It's a combination of on-the-job learning and education, much like software development is.
And so the Dreyfus scale is applicable to a lot of things.
So let's talk briefly about the Dreyfus scale.
The bottom of the scale is novice.
And novices only know how to apply recipes.
And if a recipe breaks, they don't know what to do.
Advanced beginners have done recipes a lot.
They don't understand a lot about it, but they've done enough recipes so that if a recipe breaks, they don't understand why, but they can say, well, you know, that recipe was similar to this other one.
Let's take these steps from this other one and see if I can get that to work here and kind of improvise their way to a solution.
The next level up is competent, understands how things work, at least one abstraction level below where you're working all the time.
Proficient is good enough to write recipes for advanced beginners and novices.
And then for experts, it's so ingrained that they can't explain how it works anymore because it's just part of their muscle memory.
The reason the scale is interesting is that for all the work they've done, generative AI agents are still advanced beginners.
They are very, very good at finding recipes to apply, but they don't reason about why they're applying the recipe.
I saw just today, a big study came out from, I believe it was Apple, that it's very easy still to fool all the LLMs with a math word problem that says something like, Bob has five kiwis, Sally has six.
When they were harvested, five of the kiwis are smaller than the other ones.
How many do they both have?
All the LLMs fall for the smaller thing and subtract five Kiwis, whereas a human reading that would say, oh, that's irrelevant information.
The LLM's not reasoning through it like you would a math problem.
It's pattern matching and looking for recipes to solve problems.
And that's the thing that they're trying to push past, but it's going to be very difficult because, I mean, what they're getting better and better at is finding recipes that are applicable, but therein lies the danger as well.
And this goes back to what I was talking about before.
It's a feature, not a bug that we have to deal with.
So let's say that you give an agent a task to do, and part of it is a fairly complex set of unit tests that it has to pass before it can say it's successful.
And one of them persistently won't pass.
One of the recipes that it can find is to just remove all the assertions from the unit test, and hey, it's successful.
Or replace the assertion that's in there with assert true: it's successful, and I can move on to my next test.
A human wouldn't do that because they know that's wrong.
Well, some would maybe do that.
Well, I mean, that's an ongoing debate as to whether you can trust a human to do that or not.
But an agent will do that as a legitimate recipe to solve a problem.
So what that means is, and this goes back to my history, one of the things that we founded and created the terminology for (this is actually my co-author, Dr. Rebecca Parsons) is this idea of architectural fitness functions: the ability to write code like unit tests, but for architecture characteristics.
And a lot of the fitness functions that you can write, like "ensure that every unit test has an assertion in it," are overly pedantic for humans, but necessary for agentic generation, at least for now.
There are lots of legitimate recipes that solve problems but do not produce acceptable software, and we have to put legitimate guardrails around those things before we can trust what agents are doing.
That's the stuff I'm doing.
So it's basically the rebirth of the fitness function, do we want to call it like that?
It's not a rebirth, it's what they've always been.
Yeah.
We're just realizing how critically important they are.
Besides tests, right?
So fitness functions are always broader than just tests because if I'm checking scalability or responsiveness, I have to have monitors to do that.
Those are still fitness functions.
So the technical definition of fitness function is any mechanism that provides an objective integrity assessment of some architecture characteristic or some combination of architecture characteristics.
So it can be monitors.
Chaos engineering is a great example of fitness functions because they're stress testing your ecosystem at runtime.
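To make the "monitors are fitness functions too" point concrete, here is a small Python sketch of a responsiveness check. It is an assumption-laden illustration: the function names, the choice of the 95th percentile, and the latency budget are all hypothetical, and real monitoring would read samples from a metrics system rather than a list.

```python
def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Ceiling of pct% of the sample count, done in integers to avoid float drift.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def responsiveness_ok(latencies_ms: list[float], budget_ms: float, pct: int = 95) -> bool:
    """Objective pass/fail assessment of one architecture characteristic."""
    return percentile(latencies_ms, pct) <= budget_ms


if __name__ == "__main__":
    # Hypothetical latency samples in milliseconds: 10, 20, ..., 200.
    observed = [10.0 * i for i in range(1, 21)]
    print(percentile(observed, 95), responsiveness_ok(observed, budget_ms=200.0))
```

The point of the sketch is only that the assessment is objective and repeatable, which is what the definition above asks for.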
What we're doing now, though: there are actually two levels of fitness functions that we are now talking about with agents.
One of them is wired into the agent itself.
So an example of this is, okay, I want agents to build code, but I want them to build code that I can build on later, not throwaway code, not ephemeral code, long-term code that, for example, I want to open source or something like that.
So when you generate that code, I want you to build it within these components.
This is part of what an architect designs is what components I want, what dependencies I want between those components.
I need to wire that into the agent before it starts producing code, as a constraint.
And the way you do this for all of these architecture-level things is we would put these in at a higher level than the regular agentic generation: skills in Claude Code, or root-level context for LLMs.
The higher the level, the stronger the system prompt is at constraining it to say, always create within these components.
But you also need the concrete fitness functions written in Java or .NET or some language like that, running as part of your deployment pipeline or your continuous integration, to verify that the LLM hasn't cheated on you.
And the fitness function will be human-written or machine-written?
We machine-write those because what we do is use the LLM as an interpolator.
Yeah.
Not an interpreter, not a compiler, not a transpiler, but an interpolator.
So what we do is, so the book we're working on right now is called Architecture as Code.
And we're creating this pseudocode called Architecture Definition Language.
And that's what we wire into the agents.
And there are compelling reasons why you want a pseudocode there, not just a bunch of prose.
And we have evidence to back this up with some case studies.
I can talk more about that in a second if you want.
But we also take that ADL and use the LLM to create concrete fitness functions in Java or .NET or Python or TypeScript or whatever language our target is.
Now, you have to verify the output to make sure that it's actually correct, but that gets wired into your continuous integration as a concrete fitness function.
So our mantra is for any non-deterministic code generation, you need deterministic tests as guardrails to make sure that it's producing the right stuff.
But you can have a...
What you do by putting it in both places is you cut out several churn cycles.
If you only have it as an end-of-the-process test, then your agents will build something, reject it, build it again, reject it.
Having it wired in makes it a lot more efficient and much more likely to be successful.
So it's like on the outer layer, basically, to double check that.
And then you implement the fitness functions first and then make them pass later?
We can do that.
That's actually fitness function-driven architecture.
So there are examples of that.
Okay, I need to have this capability always to be true.
Let's define that first and make sure that every change that happens in our ecosystem matches that capability.
I think it's much more realistic now than in the past because we can generate code so quickly and verify it against some sort of concrete definition to say, you know, is it doing what we want it to do or not?
The advantage of using the LLM as an interpolator like that, we actually realized by experience.
So here's an example of the kind of constraints you might want to build.
Let's say I'm building a layered architecture and I have a persistence layer, and it's the only one that should be able to talk to the database.
That gives me better performance, better separation of concerns, et cetera.
So I want to write a fitness function that says only the persistence layer can talk to the database.
The original version of our ADL was very specific: don't allow anything to instantiate this class or this class or this class.
But we generalized it and said, don't let any data access happen.
And when you interpolate that in Java, it picks up the classes in Java that allow you to connect to a database.
But when you interpolate that in the .NET world, in NetArchTest, it picks up the interfaces in .NET that allow you to connect to a database and restricts those individually per platform.
And so there's an advantage of keeping it sort of general in the pseudocode and then letting the LLM, the second L, the L part of the LLM, actually interpolate that into concrete interfaces or classes for the particular platform you're writing verifications for.
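A concrete version of the "only the persistence layer can talk to the database" rule might look like the following Python sketch. This is not the ADL from the conversation and not output from any real tool: the set of data-access modules, the function names, and the convention that a layer is a segment of the dotted module path are all assumptions made for illustration. It plays the role that the platform-specific list of classes or interfaces plays in the Java and .NET cases.

```python
import ast

# Assumption: the modules that count as "data access" on this platform.
DATA_ACCESS_MODULES = {"sqlite3", "psycopg2", "sqlalchemy"}


def data_access_imports(source: str) -> set[str]:
    """Return the data-access modules imported anywhere in the given source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in DATA_ACCESS_MODULES:
                found.add(top_level)
    return found


def layer_violations(modules: dict[str, str], allowed_layer: str = "persistence") -> dict[str, set[str]]:
    """Map each offending module path to the data-access modules it imports.

    `modules` maps a dotted module path (e.g. "app.web.handlers") to its source.
    """
    violations = {}
    for path, source in modules.items():
        if allowed_layer in path.split("."):
            continue  # the persistence layer is allowed to touch the database
        hits = data_access_imports(source)
        if hits:
            violations[path] = hits
    return violations


if __name__ == "__main__":
    # Hypothetical code base: the web layer reaches past persistence.
    code_base = {
        "app.persistence.orders_repo": "import sqlite3\n",
        "app.service.orders": "from app.persistence import orders_repo\n",
        "app.web.handlers": "import sqlite3\nimport json\n",
    }
    print(layer_violations(code_base))
```

Keeping the rule general ("no data access outside persistence") and the module list separate mirrors the design choice described above: the rule stays the same, and only the platform-specific list changes.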
You also mentioned that you're often in touch with legacy-heavy systems and code bases.
Would that be the first guardrails that you would set if you come into one of those bigger legacy setups?
Or is that the basic principles that you set initially?
Well, yeah, the same principle works for legacy code bases.
In fact, one of the common ones, we actually gave it a name in the Evolutionary Architecture book: we called it a fidelity fitness function.
I'm trying to replace this old thing with this new thing.
I need a fitness function that compares the output of the old thing to the new thing for each case that can come up.
The problem with those always is, okay, I'm trying to replace some sort of AS400 or mainframe or something, and we don't even understand how the code works anymore inside the thing.
You can build these fidelity fitness functions that say, okay, for this business function, given this input, does it produce the same output?
Now, of course, you can actually use LLMs to figure out some of what that code is doing.
But again, you need deterministic guardrails to say, okay, is it actually producing the same output as what we had before?
The real trick there is figuring out what are the slices that I can get deterministic results from?
What's the narrowest scope I can get to validate that and replace it and then iterate and repeat over and over?
But we've actually done a lot of that kind of work for gradual replacement of legacy systems, by building fitness functions that validate that it's still producing the same results or desirable results.
So that the overall or intermediate outputs are basically the same.
And what we typically do is first you call the old one and then run the new one to verify it against it.
But the old one is slow and awful, which is why you're trying to get rid of it.
And so at some point you switch around and call the new one, but then also run the old one in the background to see, is it still agreeing with that?
You know, at some point you decide, you know, I want to switch over to the...
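The fidelity fitness function and the old-then-new switchover described above could be sketched in Python roughly as follows. Everything named here is hypothetical: the fee functions stand in for one narrow slice of business behavior, and the shadow call runs inline for simplicity, whereas the conversation describes running the other system in the background.

```python
from typing import Any, Callable, Iterable


def fidelity_mismatches(
    legacy: Callable[[Any], Any],
    replacement: Callable[[Any], Any],
    cases: Iterable[Any],
) -> list[tuple[Any, Any, Any]]:
    """Return (input, legacy_output, replacement_output) for every disagreement."""
    mismatches = []
    for case in cases:
        expected, actual = legacy(case), replacement(case)
        if expected != actual:
            mismatches.append((case, expected, actual))
    return mismatches


def serve_with_shadow(primary, shadow, request, on_mismatch):
    """Answer from `primary`, but also run `shadow` and report disagreements."""
    result = primary(request)
    try:
        shadow_result = shadow(request)
    except Exception as exc:  # a crash in the shadow must never affect the caller
        on_mismatch(request, result, exc)
    else:
        if shadow_result != result:
            on_mismatch(request, result, shadow_result)
    return result


# Hypothetical slice: a licence renewal fee in cents, by applicant age.
def legacy_fee(age: int) -> int:
    return 2500 if age < 65 else 1500


def new_fee(age: int) -> int:
    return 2500 if age <= 65 else 1500  # deliberate boundary bug for the demo


if __name__ == "__main__":
    print(fidelity_mismatches(legacy_fee, new_fee, [18, 64, 65, 80]))
```

Early in a migration the legacy system would be `primary` and the new one `shadow`; swapping the two arguments gives the later phase where the new system answers and the old one is only consulted for comparison.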
So that means that, from your perspective, you still need to know the business whenever you look at an old code base?
Or how do you see that?
Like, AI dissolves a lot of the knowledge silos that were there, and the knowledge islands, right?
And this is the big chance if you have something really legacy that no one knows of, right?
How would you treat that?
Well, the danger there, that's where you really need business analysts.
I mean, I can tell you the architecture of the system.
I can tell you capabilities and that kind of stuff.
But I can't necessarily validate, is this capturing the business the way we really want it?
So I'm from the southern part of the U.S.
And we have an expression that I'll whip out here, which is paving the cow path.
You don't want to pave the cow path when you could build an actual road.
So, you know, who knows if that old system, if it made any sense or not, or whether it was some terrible system that everybody hated using.
So you don't necessarily want to pave the cow path.
You want to understand what it was doing from a business capability standpoint.
I mean, now we have a unique opportunity, because always in the past it was, you know, person-decades' worth of effort to reproduce the behavior of something that already exists.
And now it's one person with agents in months, probably.
And so, you know, the timescale of replacing a legacy system has vastly constricted.
And in fact, that's the thing that AI is best for is build me something that already works this way.
Because it can validate the output.
What it's terrible at is creative stuff.
Build me something that's never been built before.
It can't.
Because they can't find recipes for that because there are no recipes for that because it's never been built before.
But, hey, there are a lot of, you know, one of the things that we find in the U.S., every state has their own, like, medical records and, you know, driver's license.
So there are 50 of those and they're almost all horrible legacy AS400 or mainframe systems.
So there's a rich ecosystem of expertise there and recipes for solving, you know, statewide driver's licenses.
It's been solved 50 times, at least, in the U.S.
So that's an easy kind of code for agents to attack.
So you would say what you need, at least if you have such a system in front of you, is some sort of deterministic output, be it a database that the system writes to, so you can verify the output later on if you re-engineer it?
That's one way to build a fidelity fitness function.
So part of what complicates this is virtually always the old AS400 system did not use domain-driven design because domain-driven design didn't exist when they built that AS400 system in COBOL.
And we'd really like the new system to adhere to modern design principles like domain-driven design.
So it's almost never a one-for-one replacement.
It's not just a restructuring, as we call it; it's a migration as well, because we're changing the partitioning.
We're changing it from a technically partitioned architecture to a domain-partitioned architecture.
You're probably building things like microservices instead of copybooks and COBOL and stuff like that.
And so you're making fundamental changes to the internal structure and just trying to preserve the external behavior.
And so that's where the sort of black-box, almost, testing comes in for this narrow scope of the application.
Can I get the new one to give me the same answer as the old one?
I've actually been involved in a whole bunch of legacy projects that didn't build that verification layer.
And the hazard is, and I've seen this happen a couple of times, is you get the new thing built finally.
And they produce different answers.
Well, and they produce different answers.
And it's like, why are they different?
It's like, we don't know, but entire markets are built on the answer that the old one produces.
So we got to figure out a way to produce that answer because it's kind of important to the business that that's the answer.
But I think if it's about driver's licenses, you always have some deterministic output that you can validate against, right?
And those are the easy kind of things which make them perfect candidates for this kind of agentic generation.
Any kind of simple software, which used to take almost as much time as complex software.
You know, user interfaces take an enormous amount of effort to create.
Having automated systems to generate more of the very simple parts of software development, I think that's where the great productivity gain comes from for experienced software developers.
But one of the things that I note is that, so it's a magnifier.
AI is a magnifier.
Let's say, and I don't know what the number is, let's say it's 10 times.
What's 10 times zero?
Zero.
That's the problem.
If you give agents to people who are not experienced software developers and tell them, use this to produce code, they don't know if it's any good or not.
Hey, it seems to work, sort of.
You know, the funny case that I keep bringing up is right after the new year, somebody said, yeah, you know, last year I vibe-coded this cool little fitness app for myself and it was awesome.
But it stopped working in 2026 because it hard-coded the year 2025 in the app everywhere.
It's like, you don't know if it's any good or not if you don't know anything about software.
So it is a multiplier, but the more you start with, the higher a multiplier it is.
And so really, really experienced software developers and architects who understand why systems have capabilities that they have, I think are very valuable commodities now because they are the ones who can build those guardrails to say, you know, can we get agents to produce code that we can trust within these constraints?
I'm working with someone now who's built this really cool consensus algorithm where we've got a problem.
Let's say we're doing... the case study we're sort of playing with is automated grading of essays.
So what we want is not just a single grader, but we want a bunch of agents.
We've got a grammar agent.
We've got a, you know, "did you copy this from AI?" agent.
We've got a content agent.
We've got, you know, grade level agent.
And they have to reach a consensus, I'll say at least 80% before we say, yep, it's graded, true.
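The conversation does not say how the agents' results are combined, so the following Python sketch is purely an assumption: each agent is taken to propose a grade, and the essay counts as graded only when the share of agents backing the most common grade reaches the threshold. The agent names follow the ones listed above; the function name and the letter grades are hypothetical.

```python
from collections import Counter
from typing import Optional


def consensus(votes: dict[str, str], min_percent: int = 80) -> Optional[str]:
    """Return the agreed grade, or None if agreement is below min_percent.

    `votes` maps an agent name to the grade it proposes.
    """
    if not votes:
        return None
    grade, count = Counter(votes.values()).most_common(1)[0]
    # Integer comparison avoids floating-point edge cases right at the threshold.
    return grade if count * 100 >= min_percent * len(votes) else None


if __name__ == "__main__":
    votes = {"grammar": "B", "ai_copy_check": "B", "content": "B", "grade_level": "C"}
    print(consensus(votes))  # three of four agree, which is 75%
```

One consequence worth noticing under these assumptions: with exactly four agents, an 80% threshold can only be met when all four agree, since three of four is 75%.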
A friend of ours built this system just sort of, you know, as you would focusing on the behavior.
But then we started, now he's thinking about open sourcing it because it's a very useful thing.
Then we started looking at the quality of the code that was produced and it was not good.
But we wired in some architectural constraints around code quality into the agents and it produced noticeably, measurably better quality code.
This goes back to what I was saying before.
Agents don't produce bad code on purpose.
They produce bad code when they're not guided to produce better code.
And that's one of the things that we have to get better at.
I think as the hype is dying down some and the demoware flash is wearing off, people will start realizing that you need to be a lot more diligent about the way you design things and focus on both behavior and capabilities.
So agents are not Lego bricks.
They are not.
And in fact, so this is one of the most dangerous things, this juxtaposition out in the world of agents and microservices as the target for agentic generation.
This is the thing that drives me crazy.
And this happens for all the booms, but it's been really acute for this one, where as soon as a boom shows up, everybody just chucks everything they learned about software development out the window because it's a brand new world and nothing applies anymore.
So, every time you hear people talk about agents and code generation, they always talk about microservices because it's convenient because micro is small.
And so, okay, that seems really reasonable.
Okay, but that's great.
But what they're ignoring is that when you actually build microservice architectures, microservices are very small on purpose.
And to make them do useful stuff, they have to collaborate with each other within workflows.
And now you have to start worrying about the code that you generated now collaborating with each other within distributed architectures.
And this is where the stuff I've been thinking about in software architecture for a long time comes in because, okay, let's say microservices are the target that we're going to generate for our agents for some sort of behavior.
But you have to worry about five different kinds of coupling to be successful with that.
Static coupling, which are the dependencies between that thing, because if you create two microservices that depend on the same thing, you don't have microservices anymore because you change that shared thing and you have to change both of them.
You're breaking the philosophical rule of microservices there.
Or dynamic coupling: if you have services that were created by agents that synchronously call each other, you can create these clogs, these responsiveness clogs, in your architecture because you're blocking and waiting.
And now it's starting to impact the overall responsiveness of your system.
So there are five different things you have to think about in terms of analyzing it from an architectural standpoint.
And you can have agents build proper microservices, but you can't just say, oh, I need it to do this behavior and assume that it's going to work nicely when you start composing and aggregating those things and making them work in a real scalable system that needs responsiveness and all the things that we value in architecture.
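Two of the coupling concerns mentioned above, shared static dependencies and chains of synchronous calls, can be checked mechanically once the dependency and call graphs are written down. The Python sketch below is only an illustration under assumptions: the service names, the graph representation, and both function names are hypothetical, and it says nothing about the other kinds of coupling.

```python
from collections import defaultdict


def shared_dependencies(deps: dict[str, set[str]]) -> dict[str, set[str]]:
    """Static coupling: dependencies used by more than one service."""
    users = defaultdict(set)
    for service, service_deps in deps.items():
        for dep in service_deps:
            users[dep].add(service)
    return {dep: services for dep, services in users.items() if len(services) > 1}


def longest_sync_chain(calls: dict[str, set[str]], start: str, seen: tuple = ()) -> int:
    """Dynamic coupling: number of services in the longest blocking call chain."""
    if start in seen:
        raise ValueError(f"synchronous call cycle through {start!r}")
    downstream = calls.get(start, set())
    return 1 + max(
        (longest_sync_chain(calls, nxt, seen + (start,)) for nxt in downstream),
        default=0,
    )


if __name__ == "__main__":
    # Hypothetical generated services and what each one depends on.
    deps = {
        "orders": {"orders_db", "customer_lib"},
        "billing": {"billing_db", "customer_lib"},
    }
    # Hypothetical synchronous calls between services.
    calls = {"checkout": {"pricing"}, "pricing": {"inventory"}}
    print(shared_dependencies(deps))
    print(longest_sync_chain(calls, "checkout"))
```

A pipeline could fail the build when `shared_dependencies` is non-empty or when a chain exceeds an agreed length; the thresholds themselves are an architectural decision, not something the sketch can supply.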
What are your thoughts on the overall change of the landscape for software engineers and architects?
How will this impact the world long term?
I mean, now, if you read the news, then it's kind of scary for many.
Do you think this is entirely true?
Or what are your thoughts?
Is it more of a positive picture?
The way that I view this is we're seeing a new level of modularity appear.
So when I first started, I was writing in programming languages that you had to manually do memory management.
And it was a pain.
It was error-prone.
And then a language like Java came along, and suddenly I didn't have to worry about that anymore.
The Java runtime is still choosing a garbage collector, and it's choosing it by intelligent mechanisms, but I don't worry about that anymore.
That's just an abstraction that's melted away from me, and I don't care about that anymore.
And that's a lot of what agents are, is just a new level of modularity where there's some level of modularity where I actually don't care what's inside that as long as it passes the unit and functional test and it matches my fitness functions.
But I also have to care about the ephemerality of that code.
Because if I'm trying to build a long-term project on top of this, I need to crack it open and make sure that it has good quality code inside it.
And so what we're seeing is a new level of modularity.
I think this is actually really good for experienced developers and architects, because zero times 10 is still zero, five times 10 is 50, and 10 times 10 is 100.
The more experienced you are, the more powerful these tools are, because now you can create a metaphorical army to do busy work for you.
Knowing what busy work to do is the critical thing.
If you give agents to beginners, they don't know what to do with them, but you give them to an experienced architect and they can say, oh, this is what we need.
This is how to constrain it, et cetera.
So I think what this is going to do is force the...
Architecture, for sure, is going to become much more important because as we generate more code, it needs to be able to adhere to good architecture practices, etc.
And I think it's going to be good for professionals.
And I think it may compress the market somewhat.
But, you know, when rapid application development came along, the attempt was to take a lot of people who really weren't qualified as developers and turn them into developers.
There's probably going to be the same attempt here.
Same for no-code, same for other.
It's basically what we're chasing for years, right?
It's the same hype cycle over and over again.
You just keep repeating over and over again.
This is the exact same thing that's going to happen here.
I believe that it's never been more critical for developers who are experienced now to understand good software engineering and architecture fundamentals.
Understand what metrics are available to assess code.
Understand what good engineering practices look like.
Think about how do you govern things like I have agents that are using version control.
How do I make sure they don't cheat on the monorepo?
I just fixed that problem in an agentic project that I was working on just recently, where the agents were cheating on the monorepo because, oh, there's some code I need.
So we built constraints into it to stop it from doing that.
And so knowing that that exists and how to do that, I think it's going to become important.
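The conversation does not say how those constraints were built. As one possible shape for such a constraint, here is a small sketch of a fitness function that rejects code in one service if it imports another service's package. The monorepo layout, the package names, and the generated snippet are all assumptions made up for this example.

```python
import ast

# Hypothetical monorepo layout: each service owns one top-level package.
SERVICE_PACKAGES = {"orders", "billing", "shipping"}


def cross_service_imports(service, source):
    """List other services' packages imported by `source`, owned by `service`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import, e.g. "from billing.tax import x"
            roots = [node.module.split(".")[0]]
        else:
            continue
        found.update(r for r in roots if r in SERVICE_PACKAGES and r != service)
    return sorted(found)


# Invented example of agent-generated code inside the "orders" service.
GENERATED = """
import json
from billing.tax import compute_tax
from orders.models import Order
"""

if __name__ == "__main__":
    violations = cross_service_imports("orders", GENERATED)
    if violations:
        print("orders reaches into:", ", ".join(violations))
```

Wired into a pre-merge check, a rule like this gives the agent a deterministic failure to react to instead of relying on a prompt instruction it may ignore.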
But I think it's after some very rocky, bad case studies that will show up over the next few months.
I think ultimately it's going to be really nice because it's going to allow experienced people to build useful software much faster.
We've needed that ever since the first software developers showed a business person a report and said, you know, I could take this data and produce a report from it.
And since that day, the supply and demand for software has been massively skewed, which is why we keep having all these productivity booms, because we need more software.
We need more software.
Maybe now we finally have a mechanism that can start catching up with the supply and demand.
Of course, the other problem we have is every time we catch up with the demand, we change platforms entirely and build new things like mobile devices and have to start over.
But maybe this will help us, empower us to start catching up with some of the out-of-control demand that business people seem to have for software and what it does.
And two thoughts.
Aren't software developers now basically a bit supposed to become business people, first thought?
And second one: didn't many software developers really now focus on business, sorry, on busy work for a very long time?
And like, is there a way back for them?
Well, I think the busy work's gone for good because agents can certainly do the busy work.
So that's really nice.
We still need business analysts because this goes back to, you know, one of the first, the original selling points of COBOL was finally we can get rid of those pesky, annoying developers and just let the business people write the code exactly the way they want it.
But then they realized business people don't want to write COBOL code.
Programmers want to write COBOL code.
The exact same thing is true of programmers.
Programmers don't want to become business analysts.
They want to code.
They don't want to learn the business insight.
That's a business analyst job.
I think what we're going to see is a lot tighter collaboration between developers and business analysts to craft the specifications because knowing how to put the specification together is still a programmer skill of making it as unambiguous as possible.
Every pronoun you put in there is an opportunity for the LLM to hallucinate something.
So you've got to be very precise about that.
But you still need a business analyst to understand the business deeply because trying to do both those things is less efficient than specializing one to the other.
So I don't think that it's going to eliminate that role.
I think what it's going to do is a lot of the things.
So, you know, we used to have a role on projects that was sometimes, semi-facetiously, a punishment for something, which was to manage the build files and stuff like that, because we're always making changes.
But now agents can do that kind of busy work.
And so I think that's the kind of automated busy work that we'll see more of, and it will allow developers to spend more time focusing on behavior and capabilities and less on the sort of plumbing that's required to actually build software.
And what do you think, for CTOs, my audience, what do you think really matters for them in the near future?
What is, from your perspective, the important capability to also build besides building with AI?
What do you think is getting more and more important?
Understanding what good engineering looks like.
Understanding what good capabilities look like.
Too many organizations, particularly now, are focusing way too much on behavior, not enough on capabilities.
And capability is the thing that gives you long-term sustainability for a piece of software.
If you're building something you're going to throw away in a year, that's fine.
But if you're trying to build something that's going to last for five or ten years, you need to pay attention to the structure, the dependencies, the communication, the granularity, all of those things that come along with architecture.
So I think that's becoming more and more important now as we start thinking about ephemerality.
I think that's going to become the big decision for CTOs is what is the ephemerality of this project?
We never had to worry about that before, but now it's the key first question you have to ask is how long is this going to last?
Well, didn't we have to worry about that?
But just pretty much like late when, like in traditional projects, like it just took longer, right?
Until you reached a certain point.
Well, that's exactly right.
We worried about it, but it was on the order of years.
Now it's on the order of weeks and months.
Or days, yeah.
I mean, it's literally compressed to that point where, you know, I don't really care if that code lasts for six months.
Whereas, you know, in the past, it would have taken two years to write that code.
So I think that's where the equation has changed a lot.
This, in theory, is a great chance for good people, right, to accelerate to that degree.
But like acceleration, as you mentioned, can also go into the other direction, right?
Like you can accelerate into the wrong direction, right?
Well, that's always the problem with measuring velocity is you may be going in the wrong direction, but making great time.
Yeah, yeah.
Numbers look great.
Like lines of code is up.
Exactly.
Yeah.
And nothing's working.
So, you know, that's always the danger.
How do you see the, I mean, you basically already expressed it that you think it's very important to have like a human in the loop when it comes to like looking at change and looking at changes, especially the change that has been written by an agentic system.
Do you think pull requests will survive?
They will.
I mean, agents will probably use them as much as anything else.
But I think they will survive because it is a very compact way to see what change has been made to a code base.
Particularly if you're having agents rebuild something or change something in an existing code base, you may communicate through pull requests because I want to see what's there.
One of the mistakes I've seen made is people saying to their agents, okay, don't change any existing code, but then add this new behavior.
Well, you're just building technical debt if you do that, because if you can't refactor code as you make changes to it, that's a bad long-term outcome.
You need to be able to make fundamental changes to things, not just bolt new things onto it.
So I think we're going to see more and more of that kind of checks on what agents are building.
But potentially right now we're confusing things, as many people I know are basically using pull requests to validate with Codex what Claude wrote, right?
Like to basically hook in like a Codex on pull request level and then basically accept everything, like accept the comments from Codex and change the stuff, right?
Like let Claude change it again and then like let Claude commit and change the PR.
So it will fundamentally go back to, like, real human review after you went through everything that can be done with skills, most likely, right?
That or it depends on how ephemeral the code is.
If I don't really care what's inside there, then I don't care.
Well, if you don't know.
But part of ephemeral code is it matches the tests that I have in place for behavior and for capabilities.
At what point do you not care anymore?
So we don't care about the garbage collector because it is deterministic.
There's a deterministic algorithm, so we don't worry that it doesn't work.
But at some point, if we regenerate something with an agent, it might not work anymore because it may have changed something.
So we need these kinds of fitness functions and unit tests that are deterministic to act as guardrails on change.
So I believe, actually, the role of developers is going to change drastically into primarily writing tests, deterministic tests that validate what the agents have created.
Whether they interpolate those using AI or not, you still need to run deterministic tests and not trust.
Trust but verify.
Trust but verify, which also applies to the garbage collector, right?
Like for everyone who has written Java code in the past and like famous last words would also be, I trust the garbage collector to work.
I mean, it does, but you sure have to do it.
A key distinction, though, because the garbage collector in Java is a great example of a leaky abstraction.
It's a very, very good one because it almost never leaks, but it is possible to get it to leak because there is a mechanism running down there.
But, you know, we don't even think about that very much.
Agentic created code is not a leaky abstraction.
It's non-deterministic.
If you regenerate code from an existing codebase to change one thing, it may change a bunch of stuff.
And so that's why you need deterministic checks on this, much more than you would need on just a leaky abstraction.
You still need checks on that, but there's no chance that the garbage collector is suddenly going to start consuming or producing garbage versus eliminating it.
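To make the idea of deterministic checks on regenerated code concrete, here is a small sketch of pinned input/output cases that are run against whatever implementation the agent currently produces. Both function versions and the pinned values are invented for illustration; the second version stands in for a regeneration that silently changed rounding behavior.

```python
# Pinned cases written by a human: ((net_cents, rate_basis_points), expected).
# Integer arithmetic keeps the check fully deterministic.
PINNED_CASES = [
    ((100, 1900), 119),
    ((250, 700), 268),
    ((0, 1900), 0),
]


def gross_v1(net_cents, rate_bp):
    """Original behavior: add tax, rounding half up to the nearest cent."""
    return (net_cents * (10000 + rate_bp) + 5000) // 10000


def gross_v2(net_cents, rate_bp):
    """Hypothetical regenerated version that now truncates instead of rounding."""
    return net_cents * (10000 + rate_bp) // 10000


def failing_cases(implementation, cases):
    """Return (args, expected, actual) for every pinned case that no longer holds."""
    failures = []
    for args, expected in cases:
        actual = implementation(*args)
        if actual != expected:
            failures.append((args, expected, actual))
    return failures


if __name__ == "__main__":
    print("v1 failures:", failing_cases(gross_v1, PINNED_CASES))
    print("v2 failures:", failing_cases(gross_v2, PINNED_CASES))
```

The regenerated version agrees on two of the three cases, which is why a quick glance would miss the change and a pinned test would not.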
Right, right.
That's what you can rely on at least, yeah.
Cool.
Yeah, thanks a lot.
Many very, very helpful tips and ideas.
And yeah, I'd be curious to also like really understand what kind of problems you tackle in your day-to-day.
How many legacy code bases have you reviewed or rewritten in the last 12 months, like how many code bases are you in touch with and how much did the companies really win through like adding an LLM and agentic to the equation?
So directly involved with, I mean, I've been advising a lot of projects, because at least at ThoughtWorks in North America, that's a lot of what they're working on.
And in fact, they've built some very specialized tools to do that.
So they built this internal tool called CodeConcise.
This is not any sort of trade secret.
This is part of a platform they're building called AI Works, and it is optimized for doing this kind of what I'm calling COBOL whispering, taking a legacy system and re-engineering it and figuring out and producing modern code and diagrams and that sort of stuff.
This is a great, and in fact, ThoughtWorks is not the only company doing this.
Every major consulting company I know is building one of these because it's an obvious, obvious low-hanging fruit.
You've got this massive code base, which is basically a big giant pattern matching exercise.
Look, I've got this agentic AI stuff, which is fantastic at pattern matching.
Let's let it go at it and do this.
Let me roast your code base, right?
Exactly.
So one of the early obvious wins, and in fact, the only place I think we've got provable massive productivity gains is re-engineering legacy code bases into modern systems.
Yeah, I agree.
Yeah.
I agree, and I had like a few times at dinners with friends where I said, like, if I had a consulting company, this is what I would focus on, right?
Like there are so many crazy legacy systems, like in larger companies and German Mittelstand companies.
It's everywhere and it's crazy.
Well, what happens is, and I always talk about this in the context of agile engineering, is let's say you're a big insurance company and all you care about is minimizing the cost of your software.
And that's great as long as all the other insurance companies agree to do the same thing.
Software is overhead.
Until somebody shows up where software is strategic, and now suddenly you're all in deep trouble.
And that happens for every market at some point.
Somebody shows up and disrupts it through some sort of technology.
So you see this in all these vertical markets all over the place.
You're right.
That is a rich vein of consulting right now.
And guess what?
Every major consulting company on earth is going after that rich vein of consulting right now because it's the obvious thing to do.
And the tools are greatly facilitating it.
So I think that's the better use for it right now.
At least the more proven way to get good outcomes than, you know, vibe coding the next startup.
Yeah, yeah, yeah, yeah.
So add MCP servers, rebuild your software.
And that's the message.
Cool.
Yeah, thanks a lot for the discussion.
I just have one like outro question for you.
Imagine I have a little secret here for you, which like in my legacy system rebuilding machine, I kind of accidentally built a time machine.
And imagine we now have the chance to travel back in your life to your time at Georgia State when you were deep in languages, compilers, and very early technical ambition.
And you've got one minute or a few sentences to whisper into your younger self's ears.
What would it be?
I would probably tell them something about opera or symphony or something like that because whatever I would whisper in my ear would fundamentally change where I am now, which is bad because there's no idea.
I've seen enough science fiction movies to know that that can turn out bad.
So, I mean, you know, what would you tell yourself from a technology standpoint?
I mean, learn more about AI.
I mean, I took AI courses.
I mean, you know, it's...
It's much more interesting to kind of encounter it as it comes.
It could also be a waste of time, right?
If you do that too early and then you look at all the foundation that is now worthless because it was also, in a way, an area that was heavily disrupted.
A great example of this is I spent the very early part of my career using this language called Clipper.
I was probably one of the world's biggest experts on Clipper.
I haven't touched Clipper in 25 or 30 years.
I wished I could take all the stuff I knew about Clipper and flush it out of my brain.
But if I told my younger self, oh, forget about Clipper, you don't need that, I never would have gotten from Clipper to the next thing, which was Java.
Or actually, Delphi was the next thing, and that led to Java.
And so it's hard to reverse engineer a path like that because you start forking it, and then who knows what's going to happen.
So your message is then like, let life surprise you, basically.
I'm actually quite happy with when I decided to go to computers, I had two choices, computer information systems or computer science.
Computer information systems was all the business stuff like COBOL and databases, and computer science was all the stuff like AI and automata theory and compilers.
And I thought that sounded cooler.
And so I went into computer science and that has been a fantastic choice because I actually understood how language models worked when they came out and I had, you know, bases and compilers.
And so I understand several abstraction levels down that I would not have gotten.
And I learned enough about databases and all that other stuff as part of my professional career, but I never would have taken a deep dive into machine learning and automata theory if I hadn't taken computer science.
That would be something that I would have gone back and patted myself on the back for.
It's a good move of choosing the one that sounded more interesting versus the one that sounded more practical because it turned out really good in the end.
It's good that you still understand punch cards, right?
Exactly.
It's fascinating if you look back to that time and how it really all evolved from the C64 to now and how many abstraction layers were added.
That's incredible.
as it's non-deterministic, right?
But it's really incredible and also sometimes a bit...
Yeah, a bit tiring because it's just repeating, right?
Like, it's just repeating, repeating, repeating, repeating.
And let's see what comes down the road in five years.
You definitely see the same hype cycles repeating and the same sort of, you know, side effects and blowbacks.
This one's been particularly intense, but the AI bubble stuff is starting to come apart now, finally.
Let's see, right?
Let's see.
Scary things ahead in a way, but also, yeah, many people invested into this.
Hard to really trust the narrative that you hear every day.
But part of the problem we're running into here is that all of the AI companies are running the Silicon Valley business model, which means giving you amazing features at way below their cost, in the hopes of getting you hooked so they can crank the price up.
Recently, OpenAI turned off their video generating tool, Sora, I think it was called, because it was costing them something like a million dollars extra every time somebody used the thing.
Because they were giving away these amazing capabilities at so below cost, they finally had to shut it down because even the venture capital couldn't keep up with it.
At some point, the price is going to go way up on these things.
So now there's going to be a pressure about how much can we do locally versus with the big language models and tokens.
And so that's going to be an interesting thing that is going to erupt in this space.
Right now, they're sprinting as hard as they can to get you addicted to capabilities so that when the price cranks up, you can't live without it.
Which is also history repeating, right?
I mean, looking back at e-commerce, like, hey, we just want to get you addicted.
That is the Silicon Valley business model.
And they're running it exactly to the letter.
Yeah, we're Datadog.
Here are the credits, right?
Absolutely.
But that's the advantage.
Both of us have been around long enough to see several of these waves pass through, and you see the same patterns in every wave of this kind of business.
But on the other side, it's also quite potentially beneficial for the user to early jump into the waves because, yes, at first, lunch is for free.
But here's the problem: identifying fads versus trends.
Because if you jump onto a fad, you waste a lot of time and effort and it fades away.
So you want to be early on trends, but not fads.
And that's always easy in hindsight, not always easy in foresight.
Yeah, you're right.
In hindsight, everything's easy.
Exactly.
Thanks a lot for the discussion, which is quite positive.
And I think also, from your perspective, the outlook is rather positive, right?
There are so many legacy systems to take care of.
There's so much stuff to deal with and so much adoption work being done, if I would sum it up.
I'm not a doomsayer whatsoever.
This is a new tool.
And when used correctly, it's a fantastic enabler.
But just like so many hype things, it's being used recklessly and foolishly by too many organizations.
So I'm anxious for the adults to show up and start constraining this and actually making it useful.
Cool.
Let's do that together.
Thanks a lot.
My pleasure.
Have a great day and I hope to see you soon.
Thanks.
Bye.
Thank you for listening to the Alphalist podcast.
If you liked this episode, share it with friends.
I'm sure they love it too.
Make sure to subscribe so you can hear deep insights into technical leadership and technology trends as they become available.
Also, please tell us if there is a topic you would like to hear more about or a technical leader whose brain you would like us to pick.
Alphalist is all about helping CTOs get access to the insights they need to make the best decisions for their company.
Please send us suggestions to cto at alphalist.com.
Send me a message on LinkedIn or Twitter.
After all, the more knowledge we bring to CTOs, the more growth we see in tech.
Or as we say on Alphalist, accumulated knowledge to accelerate growth.
See you in the next episode.
