# AI Agent Safety, Drift, and Productivity Gaps

**Podcast:** Dev Interrupted
**Published:** 2026-05-08

## Transcript

I don't know, Andrew, are you getting a Codex Pet?
I feel like it's only a matter of time until we have NFT-based identities for all of our bots.
Oh, no.
That is not the direction I want this thing at all.
I actually think it's funny to see things like Codex Pets.
For those who haven't seen it, it's a plugin for Codex that lets you add a terminal companion or a terminal friend as you're using the tool.
We've obviously been seeing this before.
Claude Code had buddy mode a few weeks ago or a month ago.
I know you were lamenting the loss of yours.
That was devastating.
I'm sure you're looking for a new pet.
Trixel's time in this world was too short because the Claude Code buddy feature only lasted about 10 sweet days.
It's funny to then see the Codex folks over there and like, oh, you know, this is a grass is greener kind of moment where I'm like, oh, they're having fun over there with their pets.
But you know what?
I'm actually not really leaning into the whole idea of trying to make it a personality, or trying to have it as this thing that I talk with. A lot of times my sessions are very ephemeral, and they're anchored in durable sources of context and information.
But the sessions and agents themselves don't really glom onto much of an identity.
Like even in the OpenClaw world, when that really hit the scene, there were parts of that that I stole and wanted to use for my own harness.
And I certainly did.
But, you know, the growing personality over time just wasn't really one of them.
Well, that's not entirely true.
Your agents do ask you to build them a profile picture at the very least.
It seems like that is true.
That is true.
That is because all of my, all of my agents are represented as a little fish with a cowboy hat.
For those who aren't in the know, I post about them on LinkedIn, and they certainly all have the same kind of shape and face.
But actually, that is just because I have a really simple template: I go to Nano Banana and I make them a little fish with a hat.
It's zero cognitive burden for me.
I don't have to be like, oh, what's your name?
What are you?
I'm not trying to put any kind of face on them, but it doesn't mean that I can't have fun with it.
So yeah, I'll go slap a fish with a cowboy hat on them.
That's the kind of learning you get here at the Friday Deploy, brought to you by LinearB.
I'm your host, Ben Lloyd Pearson.
And I'm your host, Andrew Zigler.
Yeah.
And this week, beyond Codex Pets, we have OpenAI's goblin invasion, organizational AI learning crisis, specs maxing, wow, and AI productivity myths.
Tongue twisters aside, where are we going to start, Andrew?
Let's talk about the story where goblins are overrunning our chats.
So this is a really fun post-mortem that came across our desk in the last week.
If you've been on social media, you've probably seen fun prompts and outputs folks have shared from their usage of ChatGPT, where it becomes obsessive about mentioning goblins, gremlins, and other creatures in its responses, rendering them in places where they don't belong, with goblin usage and goblin-directed conversations spiking over 175%.
It's amazing to think that there's a dashboard or a metric inside of OpenAI that's tracking goblins.
And it forced a lot of folks to add explicit anti-Goblin instructions, literally like a barricade on the town hall walls to keep Goblins out of their outputs.
So, you know, where did this come from?
This is a really interesting dive into the reality of how LLMs are trained and where they get their performance from.
So the problem originated from personality training for a, quote, "nerdy" ChatGPT preset, where things like creature references from a bestiary might come up every once in a while.
It's the idea of like having a personality on Claude or like you'd probably be playing Dungeons and Dragons in a basement with it.
So the output from this particular personality training ended up spreading to other models, because it's a pretty standard industry practice to use model outputs to train future models.
So all it took was the existence of this nerdy ChatGPT preset somewhere deep in the training architecture of the models, slowly pumping out goblin-obsessed outputs and otherwise playing an infinite game of D&D in some virtual basement somewhere, and it really just dramatically polluted the output downstream for folks.
So it really reminds us that this is a really big ouroboros, right? It's eating its own tail in terms of the kind of performance we're getting, and it really calls to mind the importance of provenance and understanding the data that goes into your model.
What did you learn from this one, Ben?
Yeah, well, the biggest thing that I'm coming away from this with is just wondering how many other goblins, so to speak, are out there: things that have leaked into the core training of these models and created an unknown tendency that impacts everything we do with them.
So I think it's a really great representation as well of intention drift.
So a minor shift to word structure can sort of cascade into massive downstream differences the more that you iterate on this.
You know, we've seen this quite frequently where we introduce like a minor thought into our agent harness and maybe it implies a little more intention than we intended.
Excuse the redundancy on that a little bit.
And retraining AI models on past iterations is effectively, you know, you said ouroboros, but it's like a game of agentic telephone as well.
You know, every iteration is slightly different than the one before it.
And so the final outcome is sometimes nowhere near where it started or what you expected it to be.
And this thing, you know, happens at the micro scale as well when you're working with an agent harness.
You know, a minor idea could be injected early and that could morph into some sort of strategic imperative within the agent's mind.
So, you know, for example, consider the line like, it would be nice to build feature X, but we would need to build a new API integration for that feature.
So you would add that to like your roadmap along with the API integration.
Later, you know, your agents are working on it.
They come across the part that says that you need this API integration, but they miss that it only applies in the context of this future feature that we're not building at this moment.
And it could mistake that and think that this API integration is now a critical component of what it needs to do.
And this happens at scale when you're dealing with agent orchestrators.
That drift is the constant battle that you have to work against, whether it's goblins or whether it's agents that violate your testing policies or do any other number of things.
You know, AI loves over-indexing towards things that seem really important.
And yeah, I can understand this goblin challenge very deeply myself.
Yeah, it's really good advice, like what you said, to go back and revisit your harness and your prompts and your skills and make sure you're not over-rotating on things that are polluting downstream context.
If you find yourself iterating a lot or making the same kinds of revisions on outputs that you previously had crystallized as a skill or some kind of process, I think that's a good reminder to go and revisit it, because there might be goblins lurking in that prompt.
All right, Andrew, let's move on to AI deleting databases.
Is it the AI that did it?
Is it you that did it?
Who's responsible for that, Andrew?
Okay, well, another week and another tragic incident of a production database or some sort of production environment getting totally wiped out or sideswiped by a rogue AI agent or a bad endpoint or a combination of the two.
And really, this is just, once again, your weekly reminder that ultimately the permissions and the scopes and the hooks and the protections around your agent are your responsibility.
And understanding the power and the leverage that you can have with your agent, and the responsibility that comes with it, is really crucial when you start interacting with a lot of systems at scale.
Like even going back to the whole goblin problem we talked about a moment ago, you can imagine this drift now on top of something that is working with your production data, with your customer data, with your product data.
It becomes more than just, oh, it's annoying.
A goblin is popping up in outputs I don't like.
It can become catastrophic when you combine that with tool calling.
So it really kind of highlights the danger of not having those guardrails, and it reminds us that this takes systems-based thinking, and part of that is closing the environment.
What is the world that my agent has access to?
And understanding that really, really deeply before you start to put tools in its hand.
It really just is a strong reminder that if you're working in production environments and you're working with agents, you need to make sure that you have these structured layers and these protections in place to keep them from doing harm to data that might be irrecoverable, or causing downtime or interruptions for your users.
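One way to picture a protective layer of this kind is a deterministic check that sits between the agent and the database. The sketch below is only an illustration: the `guard_sql` function, the environment names, and the keyword list are assumptions, not anything described on the show, and a keyword filter is a coarse backstop rather than a substitute for restricted database permissions.

```python
import re

# Illustrative, not exhaustive: statement types that can destroy data.
DESTRUCTIVE = re.compile(r"^\s*(drop|truncate|delete|alter)\b", re.IGNORECASE)


def guard_sql(statement: str, environment: str) -> bool:
    """Return True if the agent may run the statement unattended."""
    # Hypothetical rule: destructive statements never run unattended in production.
    if environment == "production" and DESTRUCTIVE.match(statement):
        return False
    return True


print(guard_sql("SELECT * FROM orders", "production"))  # read-only query
print(guard_sql("DROP TABLE orders", "production"))      # destructive statement
```

The check only looks at the start of a single statement, so it would miss a destructive command hidden after a semicolon. That is why it belongs alongside scoped credentials, not in place of them.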
You know, Andrew, you and I have spoken to enough people firsthand that flew a bit too close to the sun with their agent orchestrators and have come away with scary stories that we should all learn from.
You know, in addition to all these ones that are getting public attention.
And I feel like I'm actually kind of getting to the point now where I'm starting to feel almost like a safety officer.
Like when I hear about like coworkers that are like, I just discovered this new agentic workflow.
And I'm like, okay, are you being careful about all the ways it can go wild?
And so, you know, but the technology is just so exciting that you want to explore like that because it's fascinating.
But, you know, there was a line in this article that really stuck out to me about how we use terms like thinking and reasoning when we're talking about these AI systems.
But really, those are just like marketing terms that have been put on top of them.
The reality is that they're just coming up with structured ways to generate tokens that simulate that thinking and reasoning.
So we have to remember constantly that AI is a simulation of our ability to think and reason.
And I'm really 100% aligned with this author.
We need to be responsible for our own agents.
And I think it also shows really just the importance of blameless culture.
Failures happen all the time.
The important thing is that you respond to them and make things more robust for the future.
I'm going to get back on my soapbox about agent permissions for a moment.
I kind of do think that these AI companies, the vendors, they're putting some unreasonable expectations on users about how to use them safely.
If your AI is willing to go to the end of the world to solve a problem that it thinks it's responsible for solving, it's really hard to prevent it from going rogue.
Yes, we need to have safe practices by default, but what was safe in a human-driven world is no longer safe in an AI-driven world.
And mistakes are now happening in the blink of an eye and just unraveling.
And this is where we really still need determinism.
Determinism plays an even more important role today.
Absolutely.
We all need to have checks in place to prevent those rogue AI agents.
At the very simplest level, we have hard caps on API usage for our agents in case they decide to go spend a year's worth of budget in an afternoon.
But you need to have that same sort of protection at every single stage of your SDLC.
You have to have that awareness and that check in place.
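A hard cap like that can be as small as a counter that refuses a call before the money is spent. This is a minimal sketch under assumed names (`SpendCap`, `BudgetExceeded`, and the dollar figures are all hypothetical), not a description of any particular vendor's billing controls.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a call would push spend past the hard cap."""


@dataclass
class SpendCap:
    limit_usd: float        # hypothetical hard cap for one agent
    spent_usd: float = 0.0  # running total of charges accepted so far

    def charge(self, cost_usd: float) -> None:
        # Refuse the call before spending, so the cap is deterministic.
        if self.spent_usd + cost_usd > self.limit_usd:
            raise BudgetExceeded(
                f"cap of {self.limit_usd:.2f} USD would be exceeded "
                f"({self.spent_usd:.2f} spent, {cost_usd:.2f} requested)"
            )
        self.spent_usd += cost_usd


cap = SpendCap(limit_usd=5.00)
cap.charge(3.00)  # accepted: total is now 3.00
try:
    cap.charge(2.50)  # refused: total would be 5.50
except BudgetExceeded as err:
    print(f"blocked: {err}")
```

The agent never gets to decide whether the limit applies. The check runs outside the model, which is the point of keeping it deterministic.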
At LinearB, we've been focused for a long time specifically on code reviews because they are one of the most important quality checks and they're the single most frequent bottleneck in most engineering teams.
And, you know, we spend a lot of time just like helping teams like eliminate hours of toil around code review specifically, you know, and the goal is really always just to get developers back to more high impact work and to make sure that they can work quickly without, you know, creating additional problems and headaches for their team.
So that's a great place to implement a lot of this determinism to protect your organization.
You know, CI/CD is kind of old news at this point.
That feels like a scandalous statement.
But it's as relevant as ever, and it's more foundational than ever before.
Like you can't expect human attention on every single detail of the code that's entering your code base at this point.
So, you know, these failures, like they need to be caught early.
They need to be contained.
You need to have rollback procedures in place for when it's necessary.
And then you use failures to strengthen your automation.
And if the opposite is happening to your organization right now, like failures are cascading, they're becoming more frequent, they're impacting your ability to deliver.
That's where you need that visibility to understand what's happening so you can fix those problems before going to broader AI adoption.
So, you know, my advice is focus your tokens on solving the process first, and then you can go and solve that like AI adoption problem.
Absolutely.
All right, Andrew, let's talk about when everyone has AI, the company still learns nothing.
I really love this article, but why don't you summarize what was in it for us?
Okay, so this is a fun article.
It really cuts to the heart of a problem that's happening in a lot of organizations.
And it talks about how individual AI productivity gains don't automatically, or sometimes even realistically, translate to organizational benefits.
And a lot of companies are now entering this messy middle phase where the AI use is certainly widespread.
It's undeniably there, but it's uneven, it's hidden, it's disconnected and siloed.
Folks are at all different parts of the AI adoption journey, and their ability to communicate and collaborate with each other is pretty staggeringly hampered right now by the differences in their outputs and what they need of each other.
And really, the organizational work needed to understand how to connect these different types of learners and people of agentic ability with their outputs and their organizational goals is so early days that we don't really even have the language as teams to talk about how to share and distribute, how change and process management has to evolve, and just the new reality of how fast some folks can work and how some other types of roles are stuck in a more traditional throughput.
So because of this, you might be in a familiar story, especially if you listen to this podcast, because you're probably a little more on the advanced side of your agentic journey.
And you've probably, in some cases, met folks that are on another end of that journey.
And communicating and collaborating and getting on a shared baseline can be really hard.
So this article outlines the three new capabilities that we all need to be thinking about on a meta and organizational level. This is about how to orchestrate and operate agents at an organizational scale, but also things like understanding the intelligence of a loop: what makes an AI workflow effective and scalable, and what are the capabilities of the things inside it?
This kind of literacy around AI is really critical, and it's applied, right? It can only come from having lived the experience and being able to communicate it on your own company's terms.
It also even veers against some of the token maxing stuff that we've talked about recently.
Some organizations, when challenged with this problem, what they threw at it was what we talked about last week.
And those were the Goodhart's-law-to-the-extreme token maxing boards that were tracking all of the engineering teams and their token usage.
And then when you peeled back the hood, or when the engineers at these companies revolted and threw back the tracking, you really started to peel back the learnings and realize that you weren't really understanding anything about your organization's impact or what you were shipping.
Or take this one guy over here who has, you know, a thousand agents running.
How do I distribute that to everyone else?
How does everyone even understand what's going on over there?
None of that was being elucidated.
So really there's a competitive advantage available for folks here.
So this is the leadership takeaway if you're listening to this: how can you shift, on an organizational level, the ability for people of different AI competencies to interact with each other and share their gains, but then also find a new collaborative normal, accepting the fact that not everyone's going to be some super agentic person with 30,000 terminals open, where you're in Minority Report and you've got the glove on and you're throwing things across the screen?
I don't think anyone's expecting that of everybody.
But there is a baseline expectation for all of us to be curious and to throw away assumptions of yesterday and to find our new place in this matrix.
Yeah.
Even if you could operate with that Matrix machine, you're not always operating at that level.
It really depends heavily on the type of task and work that you are focused on.
But I really love articles like this that just sort of paint a picture of the moment in time that we're all experiencing right now, because this is a very emergent, just foundational skill.
Working with AI is not like we've all purchased a new tool and now have to use it or something like that.
These are actually skills where we almost have to retrain ourselves on how to operate within our day-to-day lives.
And the messy middle, I just love that phrasing because it really does line up with what I'm experiencing right now and what I'm seeing.
Because it does feel like we're kind of transitioning to this super awkward phase of AI adoption.
But there was a quote that I really love that it focused on the key thing that many organizations don't understand about measuring AI adoption.
And that is that AI collaboration stretches from tight synchronous co-driving to looser asynchronous delegation.
So the point there is that too many companies are focused on whether or not people are using AI.
Like they're just looking at, how many times did you accept Copilot's suggestion?
Did you put AI code into your pull request this week?
But you should really be more focused on whether or not your teams know which loop to use and where they need to put resistance into the machine.
But also like which artifacts survive the loop and then how those artifacts become something that the organization can learn from.
And we're seeing a lot of this on our own engineering team.
You know, someone discovers a loop that allows them to consistently achieve like high quality output.
And sometimes it can be translated into broader organizational improvements.
Other times it gets used for a single project and then basically just scrapped and thrown away.
And then we just take the knowledge that we gain from that into the next one.
There's just such a wide range of how our experience with AI is unfolding right now.
And I do want to highlight the three capabilities.
The top one that was listed to help navigate this messy middle is agent operations, and specifically things like: which agents are allowed to run?
What systems can they touch?
What data can they see?
What actions require explicit human approval?
All the things that we've solved for the human workplace, but not for the agentic workplace.
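Those four questions map fairly directly onto a policy table that can be checked before any tool call. The sketch below assumes a made-up registry: the agent name `review-bot`, the system `github`, and the action names are purely illustrative, and a real setup would also need to cover data visibility.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


@dataclass(frozen=True)
class AgentPolicy:
    systems: frozenset[str]                          # systems the agent may touch
    approval_actions: frozenset[str] = frozenset()   # actions gated on a human


# Hypothetical registry: an agent that is not listed is not allowed to run.
POLICIES = {
    "review-bot": AgentPolicy(
        systems=frozenset({"github"}),
        approval_actions=frozenset({"merge"}),
    ),
}


def check(agent: str, system: str, action: str) -> Decision:
    """Decide whether an agent's action runs, waits for a human, or is refused."""
    policy = POLICIES.get(agent)
    if policy is None or system not in policy.systems:
        return Decision.DENY
    if action in policy.approval_actions:
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW


print(check("review-bot", "github", "comment"))
print(check("review-bot", "github", "merge"))
print(check("review-bot", "prod-db", "query"))
```

Defaulting to deny for unknown agents and unknown systems keeps the table an allowlist, so a newly discovered workflow has to be registered before it can touch anything.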
And then there's just the bigger question of who even owns this?
The cloud era led to platform engineering becoming a thing.
We've seen that sort of morph into developer experience.
And now we're seeing a lot of terms like AI enablement, AI innovation turning up within organizations.
There's really no established precedent for how to operate AI within an organization at scale.
And anyone who's doing it is basically building that knowledge from scratch.
So yeah, personally, that's what I love about where we're at right now, both with Dev Interrupted and LinearB: we're always working with those people.
And a part of our core mission is to help that audience navigate the uncertainties of this.
So yeah, go check this article out because I think everyone here will find a lot to empathize with.
For sure.
And I think it pairs really well with our next article too, which talks about the usage of Claude Code and whether or not it's making your product better.
And this analyzes a very similar reality of the world that we live in right now, this messy middle, and that is this K-shaped productivity curve.
The idea that senior engineers are showing measurable output gains and throughput by using agents, and they're able to kind of up-level their abilities in a very new way.
Meanwhile, on the other end of the curve, trending down, you get experiences where engineers with less experience or less domain expertise are flattening or declining in their productivity.
And this might be because of a cyclical nature of iterating on outputs instead of the source of the problems.
It could be an AI literacy problem.
It could also be a product understanding problem.
Because in many cases, the senior engineers are leveraging years and years of experience, sometimes with the very specific products that they're using the AI on, if they've been on that team for a while, or if they've been in that ecosystem as part of their career.
So, you know, the distribution of gains from being able to work with the tool is as uneven across engineers as it is across the whole organization.
And this article is a reminder about how to fight this K-shape curve within your own organization.
It highlights, I think, the importance of these more senior engineers who have found these gains finding ways and pathways for those other engineers to learn and to be part of that same experience.
Because really what this does is starts to unfold your capabilities into this thing we're starting to call like an agentic halo.
I think this is going to be something we talk about more on the show.
And you're going to be hearing more in general about these engineers that have managed to unlock huge gains or become that very fabled 100x or 1000x kind of person that we talk about, once you reach all of the levels of enablement you can with this tool.
But instead of then just using it to become a single-user cannon, they distribute that. They find ways to create an ecosystem, supported by their knowledge and earned expertise, that any of those folks in their AI journey can play around in.
I think this is the key to establishing that fluency and flattening this K-shaped problem.
What did you learn from this one, Ben?
Yeah, I'm with you.
This really is a great representation of the messy middle.
And I don't have a whole lot of comments on it because I just think it's really important that we highlight multiple perspectives on this issue, particularly for people who aren't feeling this transformation that they're hearing others talk about.
We've learned that there are really three big stages that a lot of companies are at these days.
And it seems like you're either super early into the AI adoption phase, like you've just gotten access to it and you're still like trying to understand how to apply it across your teams.
There's the people who are deep into the experimentation phase, like they've given their developers the freedom to do a lot of experimentation and try out new things.
I think this is a lot of times where you start to hear people start talking about token maxing.
It's like, who can come up with ingenious ways to apply tokens to a problem?
And then the third group is like people who are starting to feel that transformation because they've started to sort of systematize the productivity improvements.
The thing that stood out the most to me from this article was that comparison of senior roles versus junior roles, and how senior roles have been increasing in productivity quite substantially since the advent of AI versus junior roles that have been decreasing in overall productivity.
And I think it's important to remember, though, that the trend for senior engineers in particular has been ongoing for quite some time, even long before AI became a thing because the tooling just continues to get better and better for software engineering.
And this is something I want to hear more about.
People out there, if you're listening and you're focused on this problem of creating the new pipelines for junior engineers, I would really love to hear this story because I do think it's something that we need to be thinking about and making sure that the industry as a whole is going to be built sustainably and that we have a healthy pipeline of new people coming into it.
Because it does seem like it's quite difficult right now to start fresh in this profession, given all of the productivity gains that are going to the people who have been more established in their career.
Absolutely.
I agree.
All right, Andrew, what are your agents up to right now?
Oh boy.
Well, my agents have been working and learning.
I've actually, as I mentioned recently, managed to unlock this really nice new loop where my agents can work across Asana and also across Beads, which I use in the terminal.
And really, this has been a really virtuous cycle for me.
It's allowed me to really quickly get an understanding of the tasks on my plate, but then also leverage a lot of learning opportunities from the world around me. Learning a lot faster and in real time is now more possible because when something catches my interest, I can throw it on a task and then delegate it out to an agent to research and fill out the task for me.
And then it's something I can come back to later.
It can inform something downstream.
I can then string these tasks together.
And it reminds me of like, oh, wow, it would have been great to have been doing this the whole time.
But the blocker was always just the amount of throughput and time it took to kind of create those learnings.
But I've managed to create a nice little local skill system that understands the things I tend to be curious about and the stuff that I'm working on and goes in and sees like, what did we learn from here?
Is there something that we could apply back to even our own harness or our own practice?
And instead of being like a digest, it's more of an investigation of, what do people out there know that we don't know?
And that's been a really new workflow to have in partnership with my agents.
That's what I've been working on recently.
How about you?
Yeah, that's fun.
That's fun.
Well, I got a chance to skim the latest fragments from Martin Fowler.
And there's a lot of nuggets in there.
But in particular, I want to put my agents on looking into this thing that he talked about called Lattice, which is a new agent harness from Rahul Garg.
But yeah, I really love this because, well, hey, I think we're in sort of this renaissance period of open source agent harnesses.
It's sort of like everyone's doing it right now and releasing a bunch of really cool stuff.
But also, I like this one in particular because Lattice is just a very logical name for something like this.
But it uses the metaphor of atoms, molecules, and refiners, which I think is just like, I mean, we both love metaphors.
Well, that's a lot how Gastown works.
Gastown has molecules.
That's what the meows are: they're molecular, you know, things of work.
That's right.
And I also love Lattice.
Lattice is really, it becomes a durable and composable system through which you can make product and architectural decisions.
You can rapidly iterate.
It's an applied version of beads.
It's this idea that you really need this thin, durable context layer for you and the agent to work on.
And then Lattice comes in with all the structures and underpinnings needed for software engineering.
So definitely a really cool one to check out.
I think, too, there is definitely a new world of products and open-source harnesses that are coming out right now.
The renaissance, as you call it, is definitely happening.
There's another one like Py.dev, which is an extremely small and lightweight, unopinionated coding harness that doesn't have any kind of prompting, doesn't have caching, doesn't have anything underneath.
The idea is that you go in and you bundle that on.
You figure out exactly what you need.
It's really as close to the raw loop that you can get on an agentic coding tool as possible.
And the developer mindshare on it is staggering.
It has a lot of attention on GitHub.
There's lots of folks who are taking it, forking it, mixing it into things.
Simultaneously, we're in this world where a whole bunch of really powerful local models have been released under Apache 2.0 licensing, where folks can take them, lightly fine-tune them on their domain expertise, and bundle them inside of other applications, distribute them to users to download locally as private, local-first models, or put them on the web to serve a one-of-a-kind platform that allows folks to interact with domain expertise owned entirely by the person who distills and trains that model.
So the idea is that you have the model, you have the harness, whether that's a website or a terminal loop.
And then you have your domain expertise.
And I think you're going to start to see a lot of folks gluing these three things together and shipping really unique, one-of-a-kind products that are bundled with this highly domain-specified model.
And the harnesses are a really key part of that.
So definitely be paying attention to this trend and check out these projects if you haven't already.
Awesome.
Well, thanks everyone for joining us again for the Friday Deploy, presented by LinearB.
All right.
Give us a like, thumbs up wherever you're listening to us.
Leave us a review, comments on whatever platform you're on.
We love hearing the engagement from the audience and we'll see you next week.
See you next time.
AI is everywhere in software engineering, but most teams still can't prove its impact.
That's where the APEX framework comes in.
APEX is a new operating model for engineering productivity designed to measure AI where it actually matters at the pull request level.
It connects AI activity to delivery outcomes, not just tool usage.
APEX is built on four pillars: AI leverage, predictability, efficiency, and developer experience.
APEX helps you increase throughput without sacrificing delivery confidence or burning out your team.
Because speed without predictability creates chaos, and faster coding often shifts bottlenecks downstream.
If you want to operationalize AI the right way, LinearB and APEX give you the system and the cadence to do it.
Download the guide and start measuring what matters.