# The Future of Version Control in the Age of AI Agents

**Podcast:** AI + a16z
**Published:** 2026-04-08

## Transcript

If you ask almost any software developer, when you do code review, do you really read the whole PR?
Like, do you go through every line and think it through?
Do you pull it down and test it out and then leave the good feedback on each line?
Agents are very good at that, right?
If something goes wrong, that's very human.
It's actually the thing that's never been good in software development is interteam communication.
And so it's a very interesting UX problem set that I think nobody's really thought through, really, even now.
What does that tool look like in a way that is easy to use and easy to learn?
Software developers that would be the best producers of product in the near future are the ones who can communicate, the ones who can write, the ones who can describe.
That is, I think the next superpower.
And I think if we could talk to each other in more real time about what we're doing, that's a lot of overhead.
That is not a problem that agents have.
And for 20 years, almost nothing has changed.
Now coding agents are the fastest growing users of command line tools, an entirely new persona.
They struggle with interactive rebasing.
They run status after every command.
The assumptions baked into Git's interface no longer hold for humans or machines.
The question is whether the tool underpinning nearly all modern software can adapt or whether something new has to replace it.
Matt Bornstein, general partner at a16z, speaks with Scott Chacon, co-founder of GitHub and CEO of GitButler.
We are here today with Scott Chacon, CEO of GitButler, former co-founder of GitHub.
Thank you very much for being here today.
Thanks for having me on.
You are a major driving force behind GitHub.
You've literally written a book on Git.
You could be doing anything in the world with your life right now.
What's brought you back into Startup Land?
It's interesting.
I feel like if you ask any sort of repeat founders, they probably have similar answers, right?
This is the most fun thing to do.
So when I started GitHub, it was a real sort of slog to learn. Okay, like, it's stressful and it's difficult and stuff, but when you get something working, it's so satisfying and it's so much fun to build and grow and create something that you want to see exist in the world.
So I'm sure I'll be doing this when I'm 90.
Do you think there's kind of unfinished business for you in version control?
Or what kind of attracts you back to the same space that you know so well?
Yeah, I mean, I did a language learning startup post-GitHub because I was trying to learn French at the time.
And I think this is the other thing that other founders do is they leave and then they think they can solve any problem.
And I couldn't solve that problem.
I did try very hard.
But did you successfully learn French though?
I did not.
I successfully learned German because I wanted to start from scratch, and so that was what I dogfooded the product with.
And so my German's not bad now.
And I married a German after that and live in Berlin now, so it did definitely change the course of the time.
I had a long-term ROI on that.
Yeah, 100%.
Totally worth it, even though the company didn't quite work out.
But when I went to go look for something else to do, after a very short stint of doing some woodworking, like I think most of us do at some point when we have some time off, I started building some stuff and realized that the tooling for Git hasn't changed since I left, right?
Since really I started GitHub or wrote the first edition of the book.
Like, I was approached by Apress to write a third edition of the book, and I was like, why?
It hasn't, it's exactly the same.
Nobody's gonna care about updating it with a handful of new commands or capabilities it has.
So it became an interesting problem set.
What would I want this to look like if I could just sort of scrap the porcelain user interface and have a tool that not only did what Git does better, right, or easier or something, but like rethinks it a little bit and said, you know, if we had started from scratch, learning everything we had learned in 2008, right, or 2005, if I'd gotten involved in the Git project and could come in and say, maybe it should work this way, maybe these are the things that it should do for us.
You know, I set out to kind of build that because I thought it'd be a really interesting, fun thing to do, especially from my background.
Is there truth to the story that there was sort of tension between the Git core committer team and the GitHub founding team early on?
Right?
Because on the face of it, it makes some sense that these teams wouldn't have exactly the same objectives.
Yeah, I think they didn't think we were very smart because we couldn't write C code, right?
And there was a grudging respect over time because so many projects kind of ended up moving to GitHub.
But you built, like, the foundational piece of the entire dev stack, so that earned you a little bit of credibility.
I think they only like it because it's fast.
Like, I don't think they liked anything else.
If you listen, like, Linus has talked about us, right?
Where they're like, well, like he moved his tree, there was some outage or something, and he moved his tree to GitHub.
And he's like, they're a good host.
I hate PRs and I hate issues.
I hate everything else they have, but use them as a host if you want to, right?
And so I think that's kind of the general that was kind of the general.
I had friends on the core teams and stuff, and I still hang out and go to the Git Merge conferences and stuff like that.
But I think we always try to be supportive and stay out of their way.
Like, it's one of the interesting things, and we might talk about this more later, how hands-off everybody is to Git itself, right?
And so, like it's a very designed by committee type thing because it's an open source project where whatever seems to be a relatively good idea comes in, but there's not sort of a drive to say, here's what the product should look like, right?
And so I think over time it's just kind of become a Frankenstein where it does lots of things very fast and very well, but it's not designed, right?
It doesn't have sort of overall sort of an arc of taste.
And so that's kind of where I wanted to come in. I mean, this was the root of GitButler: we don't want to rewrite the whole stack, right?
We don't want to rewrite how it stores data or how it transmits data, the wire protocols, or anything like that.
That's all very solid, it's very good, it's very smart.
It's just the user interface that we want to inject some taste and say, here's a way that we think people are trying to use Git and make it easy to do, right?
So the world is moving towards sort of agentic coding or AI-assisted coding now.
This is obviously a very different set of ergonomics compared to a human writing all code by hand.
It sounds like you're making the argument that Git wasn't even sort of optimally configured for humans before, right?
Right.
No one kind of really driving taste of the developer interface.
Now with agents, it's sort of this compounding problem.
What do you think will happen?
And, like, just expand on this point of what you think needs to change, what needs to stay the same.
What's interesting about the Git project is that they started with essentially Unix philosophy.
And so I think the listeners that are too young may never have heard of the Unix philosophy, but when you write tools in a Unix sort of environment, you want to kind of pipe the output of one into the input of another so you can chain stuff, right?
And so it's actually kind of funny now seeing how agents work because they use all of these old Unix tools.
They're using, like, sed and grep and stuff like that.
And a lot of developers may never have heard of it.
They may learn it from their agent, just seeing these things running, and they're like, you want to run sed?
I guess so.
Like, you know, what does that do?
And so it's very good for that type of thing of saying, okay, I want to do this thing, I want to pipe it into something else, and then have it take the output of that, and then I can kind of do this set of filters and stuff.
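As a rough illustration of the pipe-and-filter style described here, the following Python sketch chains two small filters the way a shell pipeline would. The functions `grep_lines` and `sed_substitute` are hypothetical stand-ins written for this example; they only imitate the spirit of grep and sed and are not wrappers around the real tools.

```python
import re
from typing import Iterable, Iterator


def grep_lines(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that match the regular expression (grep-like)."""
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line


def sed_substitute(pattern: str, replacement: str, lines: Iterable[str]) -> Iterator[str]:
    """Replace the first match on each line (like sed 's/pattern/replacement/')."""
    regex = re.compile(pattern)
    for line in lines:
        yield regex.sub(replacement, line, count=1)


# Made-up input standing in for the output of some earlier command.
log = [
    "fix: handle empty input",
    "feat: add --json output",
    "fix: typo in help text",
]

# Similar in spirit to: cat log | grep '^fix' | sed 's/^fix: //'
pipeline = sed_substitute(r"^fix: ", "", grep_lines(r"^fix", log))
result = list(pipeline)
print(result)
```

Each stage only knows about lines of text coming in and lines going out, which is what makes the stages easy to recombine.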
And so the original sort of git plumbing commands, like all of the commands, like Linus and the original team were like, we're just gonna do things that do all of these very basic things, and then you can write Perl scripts to wrap all of them and do whatever you want.
So they had no, I don't even think they had an intention of writing a user interface to it, right?
Or making it easy to use.
It was completely orthogonal to their goals, right?
They were like, whatever you want it to be, here's the tooling that does it well.
And we've solved a lot of the hard problems, the APIs or whatever.
You write the interface you want.
The hard problems are, like, the storage layer, data references? Like, what are some of the hard problems?
Delta compression algorithms, right, wire transfer protocols, like, all of the sort of how the trees are read fast or written fast or stored in a format that's small or can be transmitted.
I see.
So if you sort of think of the like Git history as sort of a fairly complex tree with kind of data attached, like you need sort of an efficient way to represent this.
Right.
How to move branches around, like what branches even are.
Like, all of it was so much different than the way Subversion, CVS, RCS worked. They were all kind of the next step of a philosophy of how to store data, right?
Git completely changed that, right?
They just thought of it more as tarballs rather than as a series of patch files, like a sort of delta series.
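One way to picture the snapshot idea: conceptually, Git identifies each stored file version by a hash of its full contents rather than by a patch against the previous version. The sketch below computes blob IDs the way SHA-1 Git repositories do (a `blob <size>` header, a NUL byte, then the content). The `snapshot` helper and the sample files are simplified assumptions for illustration and do not reproduce Git's real tree format.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def snapshot(files: dict[str, bytes]) -> dict[str, str]:
    """Map each path to the ID of its full content (a simplified 'tree')."""
    return {path: blob_id(data) for path, data in sorted(files.items())}


# Two made-up versions of a tiny project.
v1 = snapshot({"README": b"hello\n", "main.py": b"print('hi')\n"})
v2 = snapshot({"README": b"hello\n", "main.py": b"print('hello')\n"})

# Unchanged files keep the same ID, so a new snapshot can reuse the stored object.
unchanged = [path for path in v1 if v1[path] == v2[path]]
print(unchanged)
```

Because identical content always hashes to the same ID, taking a full snapshot on every commit does not mean storing every file again.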
So they kind of rewrote it and then had people just do their own Perl scripts, right?
Not thinking a lot of people would be using it, just sort of the Linux core team or whatever.
This guy Pasky, he wrote these Perl scripts that gave it a unified interface.
So then the lazier people that came along that didn't want to write all of that stuff themselves just started using that, and it became popular enough that they just pulled it into core and they're like, if you want a user interface, here you go, right?
Here's the porcelain for it. And that's the CLI tool, basically.
That is the CLI.
And then most of those commands haven't changed massively.
Like those original core ones from 2005, 2006, right?
Didn't really change a lot.
And they rewrote them from Perl into C.
They used to just actually send the Perl scripts to everybody.
That's another thing that kids listening to this podcast may not be familiar with: Perl.
Perl 6 is coming out any day now.
So yeah, that was kind of how this started.
And the other big philosophy of Git that I really do appreciate, but has added to this problem set is that they always wanted to be backwards compatible, right?
And so anything that existed before, they won't take out.
And, like, they would wait for a major version, take out a handful of things, but for the most part, everything works exactly the same way that it did.
When I wrote the first version of Pro Git, which was like 2009, right?
There's almost nothing in that book that doesn't work still.
I've just added more stuff that they, you know, more commands or whatever.
So, but yeah, if you start with a Unix philosophy, then what you end up with is sort of a middle ground of something a computer can use, but maybe not super well, and something a human can use, right?
So if you run `git branch`, it's just a list of branches, right?
There's no user interface on it by default.
And you can add some stuff that makes it slightly more usable.
But the point is they kind of need to solve both of these problems with one interface, which is, I need a computer to be able to do this, and I also want a human to be able to sort of interpret this.
And so how do we bridge that gap, right?
And so it's kind of not great for humans, and not all of the commands are particularly good for computers, but they'll do `--porcelain` in some of them if they want to do that, right?
Like `git blame`, for example.
Um I think that's kind of where the problem set happened.
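To make the machine-readable side concrete, here is a minimal sketch of parsing the text that `git status --porcelain` prints in its version 1 format, where each line is two status characters, a space, and a path. The sample output is made up, and the parser deliberately ignores renames (`old -> new`) and quoted paths, so treat it as an illustration rather than a complete implementation.

```python
from typing import NamedTuple


class StatusEntry(NamedTuple):
    index_status: str     # first column: state of the staged copy
    worktree_status: str  # second column: state of the working-tree copy
    path: str


def parse_porcelain_v1(output: str) -> list[StatusEntry]:
    """Parse the text printed by `git status --porcelain` (version 1 format)."""
    entries = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        # Two status characters, one space, then the path.
        entries.append(StatusEntry(line[0], line[1], line[3:]))
    return entries


# Made-up sample: one modified file, one staged addition, one untracked file.
sample = " M src/main.rs\nA  docs/cli.md\n?? notes.txt\n"
entries = parse_porcelain_v1(sample)
untracked = [e.path for e in entries if e.index_status == "?"]
print(untracked)
```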
And now it's been so long and they want to stay backwards compatible.
And so it's really difficult to go in and be like, okay, Git 3.0 is going to be a complete rewrite of the user interface that takes all these things into account.
Like, because also there's nobody really running the project in that way, to have some vision and say, okay, this is what's going to happen.
And so I felt like this is a good opportunity to try that and to be able to put that in, as long as it's drop-in. Like, with GitButler, you can go back and forth between Git and GitButler.
Like, it's not Jujutsu, where you have to have some co-located thing that's a really different way of doing stuff.
And that was kind of a design decision.
And so what do you think, like, the machine-optimized version of Git looks like versus the human-optimized version of Git?
Like, now that, to your point, you have the flexibility, you can kind of do both.
Right.
So actually it's been really interesting.
So we started as a GUI, and the idea there was that I never used a GUI, right?
I've used Git for 20 years now.
Boy, that sounds like a long time.
I've used Git for a long time now.
And I've never used a GUI.
I've never found it valuable because it kind of just wraps Git commands, and it's generally faster for me to just run the Git commands.
And most people that I know, I mean, we did a survey at some point, like 80% of people still use CLI for Git stuff, even though there are GUIs that exist.
Because you know, I can click a button or I can run the command, it's pretty fast to run the command.
But the GUIs don't add a lot of functionality that is hard to do in the CLIs, right?
If you know how to do them.
So that was kind of where we wanted to go, starting with this GUI and do some drag and drop stuff and have multiple branches and, you know, rebase by just dragging a file from one commit to another commit, that type of stuff.
Which, I have to admit, I'm not sure I've ever successfully rebased anything.
So maybe I'm the target, you know, for like a GUI on that.
I have to admit I've messed up rebasing recently.
And I literally wrote the book on it.
So it's very error-prone, right?
And not really automatable.
So, like, agents can't do `rebase -i`, right?
Like, they can't reword commits very easily, they can't squash commits together or whatever, right?
Like, you have to drop into an editor and do stuff and then have it keep running.
And so there's modalities that I think nobody's particularly well served by.
What we ended up doing somewhat recently, in the last six months or so, is create a CLI, and that became really interesting to us because, you know, an agent can't use a GUI, which is really why we started going down that path.
Um, but people are using TUIs now for stuff.
So now we have a TUI, we have a CLI, we have a GUI, they all operate on the same data structures, and we can optimize each one for what's good for that.
But even in the case of the CLI, we can do a TUI that's sort of interactive, right?
And like very fast to do a bunch of interesting stuff.
You can do a CLI where you can run whatever you want and get nice human sort of output that we know a human's going to read.
So do hints or something like that, right?
That you wouldn't do if you're piping into another command.
Or you can do `--json` and get the same data but in JSON, right?
So that's very easily scriptable now.
I can pull it into Python and, you know, jq and take stuff out of it or whatever.
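A minimal sketch of the "same data, two renderings" idea follows. The `demo-branches` tool, its data, and its output wording are all hypothetical and are not GitButler's actual CLI; the point is only that one `--json` switch selects between hinted text for a person and structured output for a script.

```python
import argparse
import json


def render(branches: list[dict], as_json: bool) -> str:
    """Render the same branch data for a person or for a script."""
    if as_json:
        return json.dumps({"branches": branches}, indent=2)
    lines = [
        f"{'*' if b['current'] else ' '} {b['name']} ({b['commits']} commits)"
        for b in branches
    ]
    # Hints help a person at a terminal; a script consuming the output would not want them.
    lines.append("hint: pass --json for machine-readable output")
    return "\n".join(lines)


def main(argv: list[str]) -> str:
    parser = argparse.ArgumentParser(prog="demo-branches")  # hypothetical tool name
    parser.add_argument("--json", action="store_true", help="emit JSON instead of text")
    args = parser.parse_args(argv)
    # Hard-coded sample data standing in for a real repository query.
    branches = [
        {"name": "main", "current": True, "commits": 12},
        {"name": "fix-typo", "current": False, "commits": 1},
    ]
    return render(branches, args.json)


print(main([]))
print(main(["--json"]))
```

A script would then read the JSON form with `json.loads` and pick out the one field it needs instead of scraping the text.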
And we've been talking about doing, like, a `--markdown` where it gives you the same information, but very specifically for agents, right?
Because that's what's good for injecting into context or whatever. Just one simple example: we have some agent loops, some eval loops, that we run through to see, is an agent good at using our tool? Like, people used to talk about personas, right?
Like an agent is now a persona, but it's a very, very different persona.
It's a very hard-to-target user.
It's hard to empathize with, right?
Like, whatever I guess an agent's gonna be good at or want to do is not always what it really wants to be good at or do.
Like, we did `--json`, and it turns out that the agents actually like just getting the human output, because with JSON they would kind of compensate by piping it through jq or writing Python scripts to get the one piece of data that they want out of it, right?
And then they would immediately run the status command.
And so we added a `--status-after` to all the mutating commands because we're like, you're going to run this next, so we might as well just give you that as the output, right?
Yeah, yeah.
And so like that's stuff you would never do for scripting, you would never do for Unix Philosophy, you'd never do for humans, really, right?
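The "give the agent the status it was about to ask for" idea can be sketched with a toy model. `ToyRepo` and its output strings are hypothetical and written only for this example; the sketch shows a mutating command that can optionally append the resulting status so the caller does not need a second round trip.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepo:
    """Hypothetical in-memory stand-in for a repository."""
    uncommitted: list[str] = field(default_factory=list)
    commits: list[str] = field(default_factory=list)

    def status(self) -> str:
        changed = ", ".join(self.uncommitted) or "none"
        return f"commits: {len(self.commits)}; uncommitted: {changed}"

    def commit(self, message: str, status_after: bool = False) -> str:
        """Record a commit; optionally append the status an agent would request next."""
        self.commits.append(message)
        self.uncommitted.clear()
        output = f"committed: {message}"
        if status_after:
            output += "\n" + self.status()
        return output


repo = ToyRepo(uncommitted=["cli.py"])
print(repo.commit("add json flag", status_after=True))
```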
Oh, I can't.
Um and so, but agents really want it, and and so we have to think about them as a persona of the user of the CLI.
But I can have it using those, and I can open up the GUI and see what it's doing and, you know, help it out. Like, it's kind of interesting to have these very persona-focused interfaces for what each one is good or not good at, right, in order to accomplish the goal that they have.
That's really interesting.
So because the agents or the underlying models have sort of a flexible input schema, it's actually better for your tool output schema to be kind of flexible, or to kind of dump all the information that they need.
Yeah, I mean, we've been talking a lot about this `--markdown` sort of output format because we thought it would love JSON and it doesn't like JSON that much.
And so, like, how do you optimize for what an agent really wants, right?
Because we can put in stuff like, guess what I think the agent's next step is going to be, and give it some extra context that we wouldn't give a human, right?
Because we're like, if you want to do this next, run this.
If you want to do this next, run this, right?
And then it can kind of help lead whatever the next steps that we think it's gonna be in that particular case, right?
And so it's a very interesting UX problem set that I think nobody's really thought through, really, even now, right?
Like, even most CLIs don't have `--json` or whatever, right?
Even ones that have been developed fairly recently. Like, we're all learning this now, right?
And it's not easy to see what it likes.
You really have to dig in. You have to start asking it, like, look through the last, you know, 50 tool calls that you did with the GitButler stuff. What did you struggle with, right?
Like what had errors, what did not do what you expected it to do, and like weirdly, it will kind of tell us, right?
And we can kind of work on the skill files and figure it out.
But it's a new, it's a new era of trying to figure out usability.
That's so interesting.
You know, I'm picturing the consultants showing up with their slide decks saying, you know, here's your persona, you know, agent Alice, right?
You know, here's your persona.
Yeah, exactly.
You know, at least actually, I guess you don't need the consultants because you can just ask the models directly, right?
Like they're available everywhere all the time.
So I think that's actually a really interesting observation.
Can you go through some of the reasons that GitButler, especially the CLI, is a better fit for agentic workflows than plain Git?
I mean, for one, it's because the input and output are built specifically for that, right?
So we get to look at what it's trying to do and try to give it feature flags that it's trying to run, right?
Or that it asks for, or is trying to get around by writing some Python script. And then I can see what the Python script does and be like, okay, here's a new sort of flag that you can give that command so that that's the only output that comes out, right?
And sort of optimize for what it's trying to like we can see what it's trying to do.
And I find that really interesting.
The other thing that GitButler specifically is really good at, one of the early design decisions that I found really valuable and powerful, is doing parallel branches.
So a lot of people are using worktrees now to do a lot of stuff in parallel so the agents don't step on each other.
I mean, humans have dealt with this for a long time, right?
Like, you're working on some feature, and then you notice a bug that you want to fix, and you have to decide, do I stash everything I'm doing, fix the bug, open a branch, you know, push it to that, and then go back to my other branch and unstash?
Stashing always felt a little hacky, didn't it?
But I mean, it's an artifact of the fact that you can only be on one branch at one time.
Right.
There's one HEAD, there's one index, there's one working directory that you can deal with at a time.
Um, and the data model just doesn't really support that very well.
And so we built stuff on top of that where we essentially have like a hidden mega merge type thing, and you can have multiple branches, and you can take stuff that you've done and assign it to different branches and commit it to different branches orthogonally, and then it doesn't matter what order you merge these branches in, but it's using one working directory.
And so what's really nice about doing multi-agent work, which not everybody is doing, I think only some people that are really trying to push the boundaries of, you know, using a lot of agents constantly.
But one thing that's nice about doing that instead of worktrees is that the agents can see each other's work, right?
And so it's almost like they're kind of not pair programming, but they're using one working directory.
So if one agent modifies a file and then the other agent tries to, it notices that it's been modified.
Yeah, and it can pull how it's been modified and then add on top of that, right?
And not create conflicts.
We even experimented with having, like, a communication channel between agents. So we have three agents running at one time, give them a little chat channel, and they can talk to each other about what they're doing.
Hey, I'm editing this file now and stuff.
And it was super cool.
I wanted to ship it so badly, I'm like, this is awesome.
This is so cool.
We had a little TUI, you could see them talking to each other and stuff, and I'm like, this is amazing.
And we put it through, and very sadly, we were all devastated.
It does not help.
Right.
Like, they will see that something else is happening.
I see.
They'll figure out why, right?
Like it'll be like, looks like somebody's working on this some other feature because they added this stuff.
So I'm gonna leave that alone and I'll put my my changes somewhere else.
And it's faster, right?
Because they don't have the overhead of the communication.
So interesting.
Unfortunately, we're not shipping that, because I really wanted to be able to show that off. It's very fun to see.
But just to sort of parse what you're saying a little bit: if I have, you know, five agents operating on the same repo at the same time, same code base, the worktree solution is basically making five copies of the working directory with some sort of smart, you know, storage optimization.
So basically five separate copies.
The idea behind parallel branches with GitButler is that everybody's operating on the same code base directly at the same time.
And they're surprisingly good at not stepping on each other.
And they can't make merge conflicts because they all have the same files to edit, right?
That's super interesting.
But when they're done with their loop, right?
Or when they have an agent stop and they try to commit stuff, if they have our skill, then they'll look at it and they'll be like, okay, I'm just gonna create my own branch.
Like, each one of them can work in their own branches and they can commit their stuff into their own branches.
And now it's sophisticated enough where if one agent really wants to edit something that another agent did, it can see that it's locked and it can stack the branch instead, right?
Oh wow.
And so now it doesn't really matter if they're stepping on each other or not.
They can kind of figure out.
My co-founder Kiril today was showing me this thing where the two agents, you know, they were both trying to sort of vie for the same file and edit it in a way that wasn't really compatible.
And so one stacked its branch on top of the other one, and then they kept working and they kept committing, but they commit to their part of the stack, right?
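The stacking behavior described here can be pictured with a toy model. Everything below (`Branch`, `Workspace`, the rule for when to stack) is a hypothetical simplification invented for illustration and is not GitButler's data model: several branches share one working directory, and when an agent commits a change to a file that another branch already owns, its branch is stacked on top of the owner instead of conflicting.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Branch:
    name: str
    base: Optional[str] = None  # name of the branch this one is stacked on, if any
    files: set[str] = field(default_factory=set)


class Workspace:
    """Hypothetical model: one working directory, many branches applied at once."""

    def __init__(self) -> None:
        self.branches: dict[str, Branch] = {}

    def owner_of(self, path: str) -> Optional[str]:
        """Return the first branch that already has changes to this file, if any."""
        for branch in self.branches.values():
            if path in branch.files:
                return branch.name
        return None

    def commit(self, branch_name: str, path: str) -> Branch:
        """Commit a file change, stacking the branch if another branch owns the file."""
        branch = self.branches.setdefault(branch_name, Branch(branch_name))
        owner = self.owner_of(path)
        if owner is not None and owner != branch_name and branch.base is None:
            branch.base = owner  # the file is "locked": build on top of its owner
        branch.files.add(path)
        return branch


ws = Workspace()
ws.commit("agent-a", "api.py")            # agent A touches api.py first
ws.commit("agent-b", "docs.md")           # agent B works independently
stacked = ws.commit("agent-b", "api.py")  # agent B now needs A's file
print(stacked.base)
```

In this toy version, independent branches keep `base` as `None`, which corresponds to the branches that could be merged in any order.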
Oh, that's cool.
And so that's the type of thing that's really, really difficult to do with worktrees or, like, you can't do it with Git, right?
Like, just plain Git doesn't allow for that very easily.
It doesn't really allow for it at all, right?
Like, you can't do rebasing and sort of amending commits or moving commits down the stack or squashing stuff. It's just not possible.
So part of it is that we have that solution, but the other part is we're really trying to make the tools accessible for agents in the first place, right?
That's super interesting.
So I still get the sort of logical isolation of the branch, but the agents can kind of see each other's work. And, you know, it kind of makes sense, right?
It's like if you completely isolate them each in their own working directory, they by definition don't have any awareness of what the others are doing.
And I mean, there's other ways of getting around that as everybody, you know, finds out, right?
Like, if two worktrees create merge conflicts, they won't figure it out, and then they get a merge conflict, you know, on GitHub when the second person merges or whatever.
And then you can get the agent to pull it down and fix it and try it again or something like that.
So it is doable, but it's kind of nicer to just not have them in the first place and be able to review everything from a high-level standpoint and be like, okay, all of these have done what I want, and I have these two stacked branches and then two independent branches, and all three of these stacks can be merged in any order, and we're really gonna end up with my working directory now, right?
That that's kind of where it started, right?
And so this functionality is all built into the CLI now, and so different types of agents can use this just by calling out to the CLI, basically.
Right, right.
That's very cool.
So, what do you think happens to GitHub in this world, right?
Like, Git clearly is still the backbone everyone's building on.
You're even extending sort of Git.
You're not, like, building a whole new thing.
But GitHub in some ways seems less relevant than it was before.
I'm curious what you think will happen next.
I mean, I think it depends on GitHub, right?
I think it's actually mostly framed as what's the next GitHub, because I think people feel like GitHub's not going to be able to pivot fast enough to keep up.
I mean, it's a rocket, it's a behemoth, right?
That is both its advantage and disadvantage. Its advantage is it has everyone in the world using it.
And so whatever it does, it does at scale, which is awesome, right?
It has the obvious capability of being able to introduce something to most of the agentic users in the world, no matter if they're using Copilot or not.
Um the question is, do they care enough to do that or do they have the vision to do that?
Do they know?
I mean, the other question is, do any of us even know what that should be, right?
Like, I feel like we're in an interesting sort of Cambrian explosion of workflows now where it's like, who knows, right?
And to put time and resources into doing that in a way that's very hard to pivot around is really difficult.
For a startup, you know, we don't have the audience in the same way, clearly, but we can, you know, mess around more and kind of follow things faster and try to figure it out, because, I mean, it changes every month now.
Like I was saying, the tools that we're writing, we think they work, they don't work, we can ditch them, right?
Like, it's fine.
So it's kind of an idea of, you know, what are people going to more or less settle on, and then how can you give that to the most people or capitalize on that to some degree.
Like I found GitHub's evolution really interesting because people ask, what is the next GitHub?
And I find that interesting to look at from the point of view of what was before GitHub.
Like, what was GitHub the next one of, right?
Yeah, yeah, yeah.
And it wasn't really anything.
Like, there was nothing like a GitHub before GitHub.
And so I think whatever is the next GitHub is going to be the same problem set, right?
It's not going to look like GitHub, right?
Like, I think GitHub will be more like a SourceForge or something that maybe you could say was kind of before GitHub, but it didn't really have collaboration tools, right?
It wasn't about sharing patches. Or, like, Trac, you know, issue tracking systems, was kind of before GitHub.
And the entire programming community changed so fast.
And I think GitHub took advantage of that and was able to grow and provide tooling that nobody had, right?
And that nobody was really thinking about, in a short enough time span, and get that audience.
And I think there definitely probably will be something.
The question is, can GitHub do that, the way that, you know, SourceForge couldn't, or Google Code or whatever was kind of before us?
Or is there gonna be a startup, you know, like us or like somebody else?
You know, I'm sure there's still a lot of people trying to do this type of thing right now because they see the need.
It's just that nobody's agreed on what the solution is, right?
Like where where to go.
And it doesn't have to be perfect, it just has to be good enough that people don't want to write it themselves and are like, okay, I'll give you some money so that I can use that instead of reinventing that particular wheel.
You mentioned before that the original Git developers didn't like the GitHub primitives, you know, PRs and issues and all this.
Do you think now the primitives need to be reevaluated?
Do we need issues, or is there some, like, agent-native issue format that, you know, is better?
I think we've needed it for a long time.
And it's just too much friction for too little value, really, to change. Like, I would really prefer patch-based review rather than branch-based review, right?
Like PRs made things really easy, but you get a lot of commit slop, right?
Like there's a lot of oops sort of things added on the end because it's the branch that matters in the review context, and it's the branch that matters in the merge context.
And so the commit message doesn't, like, nobody really looks at them.
And so Hale once posted an old commit log, and like half the messages were like, ah, this is not working.
And they're like, yes, I made it work.
And yeah, I mean, it doesn't provide value that way, right?
Like in the mailing list days, it did because that was the way that you reviewed stuff.
You had to have a good commit message because that was your PR description, right?
Now it's the PR description, which is not kept in Git.
But people don't care that much, right?
Just once it's merged and out there, who cares, right?
There's not a lot of sort of, you know, code archaeology or whatever that people really depend on for that.
And so I think review changes a lot because we're gonna, you know, I mean, just to go back to the problem with the primitives right now, PRs.
If you ask, I think almost any software developer, like when you do code review, do you really read the whole PR, right?
Like, do you go through every line and think it through?
Do you pull it down and test it out and then leave the good feedback on each line or whatever, or say this is working or not working, or do you give it a cursory glance and say, yep, fine, right?
Like it doesn't look badly broken, or like, you know, you're introducing API keys or whatever, right?
Like if you can pull it down and compile it and run it and test it, I feel like that's almost better review.
If something doesn't work right, fine, go to that piece of the code and do it.
Now agents can do this for us in a night or can augment us in a very nice way.
I think review, it would be nicer if it was patch-based and it was local and you could actually run stuff, right?
Or your agent can run stuff and then give you sort of a short list and then you can look at that.
And so I think a sort of centralized online URL for a PR is something that was not the greatest.
It was a better tool.
I think it was better than, you know, Trac or, you know, looking at patch files or whatever.
Certainly, but it's unfortunate that it hasn't evolved more, right?
I think there was still a lot of room to grow in that space, and it hasn't changed that much.
And I think there's definitely room for that.
So the question is, like, we even talked internally about trying to do our own forge, right?
Or our own review system, and we actually shipped one and then kind of took it back off because it wasn't solving the problems that we really needed.
But I a hundred percent think that, and I'm sure we'll do other approaches to the review system and other people will as well.
I'm really fascinated where it goes.
Like, what do we really need as code writers, whatever that means in the near future.
Yeah.
Like, what do we really want to accomplish from review of code by whoever does it, right?
And what does that tool look like in a way that is easy to use and easy to learn, right?
And it's a very unsolved problem right now that is for me incredibly fun to tinker around with, right?
Like I have lots of opinions here and I want to try out lots of stuff and I want to see what feels right to me as a software developer, but there's lots of possible answers to it.
But I definitely think it's an unsolved problem.
I don't think the primitives are the right thing anymore.
Yeah, it's sort of interesting, right?
Because people are reading the code less.
I mean, there's even sort of an extreme view that, you know, JavaScript is the new assembly, right?
Where it's like, you know, it's there and, you know, something compiles down into it, but it's like not really, you know.
So it's like if we're not writing JavaScript directly, if we're not reading JavaScript directly, should code review actually be code review or should it be prompt review or something else?
So right now, at the state of, you know, agents and models, the way that we sort of approach stuff, I think, is somewhat triagey, where, you know, if it's a red wristband, like if something goes wrong, it's bad, right?
Then that's very human.
Like, we look at stuff, we hand-write stuff, like it's very important for us to essentially get the APIs right.
Like, we know if we call these APIs, it will do the right thing and it will give us the right errors.
Or, like, we have high confidence that this is good code.
There are sort of other levels of triage where it's not that important, right?
If they're just calling those APIs and it's a UX problem, if we add a feature flag for an agent, like, we're fine vibe coding that, right?
And just kind of, like, testing it, having it write some tests, look at the tests, be like, okay, that seems fine, right?
Like that seems to work.
Run it through the eval loop.
It doesn't, you know, break anything.
It makes things 10% faster, whatever.
Ship it, right?
Like, so we do kind of have this middle ground, but I feel like, even internally at GitButler and with a lot of companies that are really trying to push the boundaries of this and use LLMs a lot, it's becoming much more of a, you know, writing problem.
Like can you do a good write-up?
And not every team is good at that.
Like not every software developer is good at that.
Like, can you communicate? You know, I feel like a lot of developers, especially the ones who think that they're very smart or are legitimately very smart, feel like they don't have to describe what they're doing, right?
Like it can live in their head and it's fine.
I think almost the software developers that will be the best producers of product in the near future are the ones who can communicate, right?
The ones who can write, right?
The ones who can describe, and that is, I think, the next superpower.
And I think you're right, in the collaboration of how we write code, what's more important is the spec, right?
Like, what is the write-up that we want to be true, and then we can give it to, you know, the implementation details probably become less and less important as the agents get better.
So all of us who, you know, were attracted to engineering because we could deal with machines instead of people are now finding out that engineering is, like, an actual human discipline after all.
I clearly have a huge bias because we started this saying I wrote a technical book.
So I like the idea of I'm gonna be the best programmer because I can write technically, right.
Yeah, I mean, we've all had this thought, you know, like there are 10x or 100x programmers out there.
All of us have probably thought at some point that we were one of them, which, like, definitionally is challenging.
And so we've all had this thought of like, oh yeah, I don't need to, like, explain what I'm doing.
I'm right.
But the point you're sort of making is, like, now right is not sort of as objective as it was, or there's just so much activity going on that kind of managing state across the team is maybe the most important thing.
Yeah.
Yeah, or consensus, right?
Yeah.
Like the why rather than the how becomes, I think, more and more valuable as the how becomes cheaper.
What do you think will happen with coding agents in general next?
Like what do you think is the big hill to climb for them?
You know, just because you guys are pretty deep in this.
As they get better, I think it's the problem set of what to work on next, right?
Like I think we're all kind of afraid of just giving it access to Linear and being like, do it, right?
Like just solve all of them. And, I mean, the other problem with that is that most Linear tickets, most tickets, like, there's no write-ups, right?
Like it's not a database of write-ups. It's actually very hard to sit down and write up, you know, this is how I want the entire product to look in every interesting and important aspect, right?
I think what becomes problematic is you build something, and you have this downtime where you're using it and trying to figure it out. You know, when I'm doing stuff right now, and again, I'm the CEO, I'm not writing the important code, but doing proofs of concept or coming up with ideas or whatever, I'm spending most of my time testing it and writing up the next thing, and there's a lot of wasted cycles, right? I don't know if there's a phrase for this, but you're not spending enough tokens, right?
Like you're not having enough things working at the same time, and so, you're the German speaker.
You should have a word, you know, a compound word.
The token shot deployed up yeah the fear of not consuming enough tokens.
I think the problem set becomes training a team to be good at writing, right?
And training it, like, figuring out what that coordination looks like, to decide which of these series of write-ups describes the product that we want to have, and probably less so how exactly it is implemented, right?
But that makes things really, like, the constraint moves not to can I produce the code, but can we agree on what we want, right?
And that's, I mean, it's difficult in some ways, it's easier in some ways.
Like, I've been working on this metadata system that we might talk about at some point, but like, append transcripts to commits or branches or whatever, right, and Git can't really do that very well.
And so we're trying to figure that out, and so I've been spending a lot of time writing a proof of concept and a spec, and it's most of the time on the spec, right?
And like weeks, or like, just so much time on the spec.
And then every time I have a decision, I just make it, build it, and then I try it out and then I go back to the spec and I fix the spec and I tell it, okay, do it again, right?
And so that's really nice because I don't have to spend all my time implementing it and then seeing what's wrong, and I don't have to just have a write-up that I'm trying to convince you to read and agree with me or come up with problems or whatever.
I can have show and tell all the time, right? Like I can always show you something and be like, is this what we want to productionize or not?
And we have a really good idea that it is or it isn't, and I find that a really interesting, powerful thing.
So yeah, I'm really curious kind of where things fall down in the long run, but right now, I mean, I'm using AI to help me write specifications so that we can by hand implement a lot of it if necessary, right?
But we know we're implementing the right thing because we have a very good idea of what it feels like.
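As a rough illustration of the "append transcripts to commits" idea mentioned above: Git's built-in notes feature can attach text to an existing commit without rewriting it. This is only a minimal sketch of that existing mechanism, not the metadata system being described here, and it does not address the scaling limits raised in the conversation. The namespace name `agent-transcripts` and the function names are hypothetical.

```python
import subprocess
from typing import List

# Hypothetical notes namespace; Git stores it under refs/notes/agent-transcripts.
NOTES_REF = "agent-transcripts"


def notes_add_command(commit: str, transcript_path: str, ref: str = NOTES_REF) -> List[str]:
    """Build the `git notes` command that attaches a file's contents to a commit."""
    # -f replaces any note already on that commit; -F reads the note body from a file.
    return ["git", "notes", "--ref", ref, "add", "-f", "-F", transcript_path, commit]


def notes_show_command(commit: str, ref: str = NOTES_REF) -> List[str]:
    """Build the `git notes` command that prints the note attached to a commit."""
    return ["git", "notes", "--ref", ref, "show", commit]


def attach_transcript(repo_dir: str, commit: str, transcript_path: str) -> None:
    """Attach a transcript file to a commit (path is absolute or relative to repo_dir)."""
    subprocess.run(notes_add_command(commit, transcript_path), cwd=repo_dir, check=True)


def read_transcript(repo_dir: str, commit: str) -> str:
    """Return the transcript text previously attached to a commit."""
    result = subprocess.run(
        notes_show_command(commit),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Notes live in their own ref, so they are not pushed or fetched by default and have to be shared explicitly, which is one reason this kind of metadata is awkward in plain Git.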
That's really interesting.
So, you know, not only will AI coding models and coding agents continue to improve, but it sounds like you're making the case that we're not maximally taking advantage of the agents that exist now, especially in a team setting.
I mean, there's so many, you know, things you could do, right?
Like our agents could talk to each other.
Yeah, like it's actually the thing that's never been good in software development, is interteam communication.
Like if you're working on some project and it's modifying a file and I'm working on a project and I'm modifying the same file, neither of us know that until, I mean, we kind of lost that with centralized version control to a degree as well.
But I think it's obviously much better, the advantages that we get, but we've lost some aspects of coordination where it'd be nicer to know that rather than you merge first, and so I have a hundred percent of the work, right?
Like if we could talk to each other in more real time about what we're doing, but that's a lot of overhead for, you know, but agents are very good at that, right?
Like that is not a problem that agents have.
Like they can take their downtime and can, like, talk to the rest of your team's agents and be like, what do I need to look out for, or, you know, like, tell Scott about, so that he can be aware of this, right?
When he's working on his feature or the next iteration of this or whatever.
Like I feel like that almost would be a better use of sort of the cycle downtime than trying to run five or twenty of them at the same time, right?
Which just becomes hard to manage and hard to figure out, like, is this doing the right thing or going the right direction?
I feel like it's almost more interesting to have it help you write, or suggest things, or, you know, be sort of a helper that way, talk to the rest of your team, figure out what the world of things is in that project that could affect you and give you that information so you can make decisions, right?
Like that moves towards what I think would be a really interesting way of writing good product.
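The coordination gap described above (two people modifying the same file, with neither knowing until one of them merges) comes down to comparing sets of in-progress files. A minimal sketch of that check, assuming each teammate's agent can report which files its owner currently has modified; the data shape, names, and paths are all hypothetical:

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_overlaps(in_progress: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each file touched by more than one person to the people touching it."""
    owners = defaultdict(set)
    for person, files in in_progress.items():
        for path in files:
            owners[path].add(person)
    # Keep only files with at least two people working on them.
    return {path: sorted(people) for path, people in owners.items() if len(people) > 1}


# Hypothetical report of uncommitted or unmerged changes per teammate.
work = {
    "scott": {"src/review.rs", "src/metadata.rs"},
    "matt": {"src/metadata.rs", "README.md"},
}

for path, people in sorted(find_overlaps(work).items()):
    print(f"{path}: being modified by {', '.join(people)}")
```

The hard part in practice is gathering and sharing that report in near real time, which is the overhead the conversation suggests agents could absorb.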
So this is like the smart, responsible version of using agents instead of, like, the amphetamine-fueled, just, like, pound as many agents at the, you know, repos at the same time as I can.
Yeah, I mean, also I could be biased in that I've just never been able to keep enough of them running at the same time and then review stuff and then find useful data out of it.
Like I like to have a little bit more control and so I take things slower generally, but yeah, maybe some people are really good at that, right?
But in all of these scenarios, the metadata you're talking about and like appending chat transcripts to the changes and things like that.
I guess it feels like that gets increasingly important, right?
Because the, like, focus of sort of creative control or, like, engineering insight is at the metadata layer even more than in the code layer.
Yeah, I mean, it also becomes a big data problem.
Like, it's all text, but you'd be surprised how quickly that balloons, right?
If you're really trying to keep every transcript or even every, I mean, every prompt is okay.
It's kind of commit message-y length, but having, like, all the tool calls or whatever, or everything that the LLM was sort of thinking, it becomes a really, really, really big data problem very fast, like even on small projects.
So now you have to go into, like, you know, large file, large repository, like, you know, it's interesting.
Git has some things for being able to work with data of the size that, you know, Chrome uses or the Microsoft Office team uses, you know, stuff that Microsoft built into Git a long time ago that most people don't use or know about.
And so we're taking advantage of some of those primitives in Git to try to do a metadata system that can scale relatively well without having to worry about it too much.
But yeah, so there's a lot of interesting problem sets that are coming up with just trying to keep everything, right?
Like, version control is all about, you know, well, change management, I guess, change control and history.
Like you want to be able to rewind, you want to have save points, and you have to have this balance of how much data do I store in this, you know, is there a cost-benefit sort of analysis there of, I need all that data, because then it's too much data, you can't find anything you want, right?
Or, yeah, it has to be indexed or searchable in some way, presumably.
The first version of GitButler that we did, we did like a CRDT-based version where it was just constantly recording every sort of file buffer save, right?
Like everything you ever change so you could just take a timeline and scrub it back to like any point in your working directory.
That's amazing.
And it was awesome.
But, and it actually wasn't even that much data, but it was just the user interface, it was too much information, right?
Like, it's not common that I want to go back 17 minutes, right?
And so we kind of scrapped it because it was just too much complexity for humans.
It'd almost be interesting to re-implement some of that and have agents take a look.
Like, if you want to get the entire working directory back to what it was, you know, 27 minutes ago, here you go, right?
Like here's a tool to do that.
But I think it's still, you know, you have to figure out usability versus what's possible.
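The "scrub the working directory back to any point" idea above can be sketched with a plain append-only log of saves. This is an assumption-laden toy, not the CRDT-based design mentioned in the conversation: it keeps one full path-to-content snapshot per save and looks up the latest snapshot at or before a given time. The class and method names are hypothetical.

```python
import bisect
from typing import Dict, List


class SaveTimeline:
    """Toy timeline of file saves that can be queried at any past moment."""

    def __init__(self) -> None:
        self._times: List[float] = []              # save timestamps, ascending
        self._snapshots: List[Dict[str, str]] = []  # working directory after each save
        self._current: Dict[str, str] = {}

    def record_save(self, timestamp: float, path: str, content: str) -> None:
        """Record that `path` was saved with `content` at `timestamp`."""
        if self._times and timestamp < self._times[-1]:
            raise ValueError("saves must be recorded in time order")
        # Build a new snapshot dict so earlier snapshots are never mutated.
        self._current = {**self._current, path: content}
        self._times.append(timestamp)
        self._snapshots.append(self._current)

    def state_at(self, timestamp: float) -> Dict[str, str]:
        """Return the working directory as of the last save at or before `timestamp`."""
        index = bisect.bisect_right(self._times, timestamp)
        if index == 0:
            return {}  # nothing had been saved yet
        return dict(self._snapshots[index - 1])


# Timestamps are in minutes for readability.
timeline = SaveTimeline()
timeline.record_save(0, "main.rs", "fn main() {}")
timeline.record_save(10, "main.rs", "fn main() { run(); }")
timeline.record_save(20, "lib.rs", "pub fn run() {}")

now = 30
seventeen_minutes_ago = timeline.state_at(now - 17)
```

Copying the whole mapping on every save is wasteful for large trees; a real implementation would store deltas. As the conversation notes, the storage was not the hard part anyway: exposing this much history in a usable interface was.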
I watch some of the thinking logs for agents sometimes, just like out of curiosity.
And some of them are super unhinged, right?
It's like, I see the problem.
It's Bubba.
Oh, oh wait, no, oh God, I can't believe I missed that.
You know, it's like the agent starts to berate itself.
And you know, it's like, don't be too hard on yourself.
Yeah, but there's super valuable information in there, right?
It's just like it's a little bit like extracting it out of this, like, junior developer's kind of private freak out.
It's a very difficult problem set because it's so, you know, I don't know.
Like, I'll have conversations with other people on my team where, it's so subjective.
Like they think, you know, Codex is good at this, and I think Claude Code is good at exactly the same thing, and we have different reasons why on a similar problem set, right?
And we're like, okay, well, like what is it, right?
Like, why am I having that particular feeling?
Why does it seem to be acting that way for me?
And it's kind of difficult to quantify.
But yeah, I find it interesting thinking about, I used to do this for the language learning thing, trying to figure out, you know, what's the logical end of the tooling that this will eventually become, and then what's interesting or valuable at that point.
Like for language learning, you know, Google would come out with like an in-ear auto-translation thing or whatever.
And they're like, okay, yeah, a Babel fish, and they're like, okay, well, language learning is done, right?
Yeah, yeah.
And I was always like, that's a dumb argument because it's not the same, right?
To have a Babel fish.
Like both people have to have a Babel fish, for one, and it has to be very fast and very effective and have cultural context, that's very difficult.
And I like to think about, okay, like what's the logical sort of end point of this?
And for language learning, it was having a human translator, which, you know, I did a tour for GitHub in Japan at some point, and I had this translator walk around with me for, like, a week, right?
And every interaction was through the translator.
And she was amazing.
She was very good, but it's still not a good experience, right?
Like you're not gonna get married with that type of communication, right?
Or you're not gonna, you know, you're not gonna start a bit like I don't know.
There's a lot of things that you don't really it's not good, right?
But that is the best possible version of a Babel fish, right?
And so I always think about that because for this, it's kind of interesting.
Like, what is the logical extension of how good agents become, right?
Is it having the best engineer you've ever known that can stop time and work on something as long as they want and then start time again and now you have that solution, right?
Which is kind of what it is, right?
Like you have smarter and smarter and smarter people, or, you know, people that can write code that can do it faster and faster.
And so, but when you have that, what do you want to do with that?
Like, how do you manage that time?
How do you figure out, like, what tooling do you want to help you figure out what you want to build and what you're happy with at the end result?
And that's a very, very interesting question.
That we're, you know, we're getting closer to, but I still think we have a while before it's that good, right?
But it will get that good, I think, or very close to that.
And so, what do we do with that?
This is one of the big questions in the tech industry and the big question in the world right now.
I mean, what does the end state of this all look like?
I don't know, I don't claim to know the answer, but I'm glad people smarter than me are working on the problem.
Well, thank you very much, Scott, for doing this talk with us today.
Yeah, thanks for having me, Matt.
And yeah, for folks who are listening, try out GitButler.
Especially with the new command line tooling.
Super, super cool.
And for folks building startups, try out Andreessen Horowitz.
They're awesome.
Too kind. As a two-time Andreessen Horowitz founder.
Hopefully this one even more successful.
Thanks, Matt.
All right, cool.
Thanks, Scott.
Thanks for listening to this episode of the A16Z Podcast.
If you like this episode, be sure to like, comment, subscribe, leave us a rating or a review, and share it with your friends and family.
For more episodes, go to YouTube, Apple Podcasts, and Spotify.
Follow us on X at A16Z, and subscribe to our Substack at A16Z.substack.com.
Thanks again for listening, and I'll see you in the next episode.
As a reminder, the content here is for informational purposes only.
It should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any A16Z fund.
Please note that A16Z and its affiliates may also maintain investments in the companies discussed in this podcast.
For more details, including a link to our investments, please see A16Z.com/disclosures.