# The Evolution of Version Control in the Age of AI Agents

**Podcast:** a16z Podcast
**Published:** 2026-04-20

## Transcript

If you ask almost any software developer, when you do code review, do you really read the whole PR?
Like, do you go through every line and think it through?
Do you pull it down and test it out and then leave the good feedback on each line?
Agents are very good at that, right?
If something goes wrong, that's very human.
It's actually a thing that has never been good in software development is inter-team communication.
And so it's a very interesting UX problem set that I think nobody's really thought through, really, even now. What does that tool look like in a way that is easy to use and easy to learn?
Software developers that will be the best producers of product in the near future are the ones who can communicate, the ones who can write, the ones who can describe.
That is, I think, the next superpower.
And I think if we could talk to each other in more real time about what we're doing, that's a lot of overhead.
That is not a problem that agents have.
The most widely used developer tool in the world was never designed.
Git started as plumbing commands for the Linux kernel team.
Unix primitives meant to be wrapped in whatever scripts each developer preferred.
A volunteer wrote a unified interface.
It got pulled into core.
And for 20 years, almost nothing has changed.
Now coding agents are the fastest growing users of command line tools, an entirely new persona.
They struggle with interactive rebasing.
They run status after every command.
The assumptions baked into Git's interface no longer hold for humans or machines.
The question is whether the tool underpinning nearly all modern software can adapt or whether something new has to replace it.
Matt Bornstein, general partner at a16z, speaks with Scott Chacon, co-founder of GitHub and CEO of GitButler.
We are here today with Scott Chacon, CEO of GitButler, former co-founder of GitHub.
Thank you very much for being here.
Of course, thanks for having me on.
You are a major driving force behind GitHub.
You've literally written a book on Git.
You could be doing anything in the world with your life right now.
What's brought you back to startup land?
It's interesting.
I feel like if you ask any sort of repeat founders, they probably have similar answers, right?
This is the most fun thing to do.
So when I started at GitHub, it was a real sort of slog to learn. Okay, like, it's stressful and it's difficult and stuff, but when you get something working, it's so satisfying and it's so much fun to build and grow and create something that you want to see exist in the world.
So I'm sure I'll be doing this when I'm 90.
Do you think there's kind of unfinished business for you in version control or what kind of attracts you back to the same space that you know so well?
Yeah, I mean, I did a language learning startup post-GitHub because I was trying to learn French at the time.
And I think this is the other thing that other founders do is they leave and then they think they can solve any problem.
And I couldn't solve that problem.
I did try very hard.
But did you successfully learn French, though?
I did not.
I successfully learned German because I wanted to start from scratch.
And so that was what I dogfooded the product with.
And so my German's not bad now.
And I married a German after that and live in Berlin now.
So it did definitely change the course of my life.
Yeah, it's a long-term ROI on that.
Yeah, 100%.
Totally worth it, even though the company didn't quite work out.
But when I went to go look for something else to do after a very short stint of doing some woodworking, like I think most of us do at some point when we have some time off, I started building some stuff and realized that the tooling for Git hasn't changed since I left, right?
Since, really, I started GitHub or wrote the first edition of the book.
I was approached by Apress to write a third edition of the book and I was like, why?
It's exactly the same.
Nobody's going to care about updating it with a handful of new commands or capabilities it has.
So it became an interesting problem set.
What would I want this to look like if I could just sort of scrap the porcelain user interface and have a tool that not only did what Git does better, right, or easier or something, but, like, rethinks it a little bit and says, you know, if we had started from scratch, knowing everything we had learned, in 2008, right, or 2005, if I'd gotten involved in the Git project and could come in and say, maybe it should work this way, maybe these are the things that it should do for us.
You know, I set out to kind of build that because I thought it would be a really interesting, fun thing to do, especially from my background.
Is there truth to the story that there was sort of tension between the Git core committer team and the GitHub founding team early on, right?
Because on the face of it, it makes some sense that these teams wouldn't have exactly the same objectives.
Yeah, I think they didn't think we were very smart because we couldn't write C code, right?
And there was a grudging respect over time because so many projects kind of ended up moving to GitHub.
But I think they... you built, like, the foundational piece of the entire dev stack.
So that earned you like a little bit of credibility.
I think they only like it because it's fast.
Like I don't think they liked anything else.
If you listen, like, Linus has talked about us, right?
Where they're like, well, like he moved his tree.
There was some outage or something and he moved his tree to GitHub.
And he's like, they're a good host.
I hate PRs and I hate everything else they have, but abuse them as a host if you want to, right?
And so I think that was kind of the general... I had friends on the core teams and stuff, and I still hang out and go to the Git Merge conferences and stuff like that.
But I think we always try to be supportive and stay out of their way.
Like, it's one of the interesting things, like we might talk about this more later, but how hands off everybody is to Git itself, right?
And so, like, it's a very design-by-committee type thing because it's an open source project where whatever seems to be a relatively good idea comes in, but there's not sort of a drive to say, here's what the product should look like, right? And so I think over time it's just kind of become a Frankenstein where it does lots of things very fast and very well, but it's not designed, right? It doesn't have sort of an overall arc of taste.
And so that's kind of where I wanted to come in because there is a lot of, I mean, this was the root of GitButler, is we don't want to rewrite the whole stack, right?
We don't want to rewrite how it stores data or how it transmits data, the wire protocols or anything like that.
That's all very solid.
It's very good.
It's very smart.
It's just the user interface that we want to inject some taste and say, here's a way that we think people are trying to use Git and make it easy to do, right?
So the world is moving towards sort of agentic coding or AI-assisted coding now.
This is obviously a very different set of ergonomics compared to a human writing all code by hand.
It sounds like you're making the argument that Git wasn't even sort of optimally configured for humans before, right?
No one kind of really driving taste of the developer interface.
Now with agents, it's sort of this compounding problem.
What do you think will happen?
And like, just expand on this point of what you think needs to change, what needs to stay the same.
What's interesting about the Git project is that they started with essentially Unix philosophy.
And so I think some of the listeners that are too young may never have heard of the Unix philosophy, but like when you write tools in a Unix sort of environment, you want to kind of pipe the output of one into the input of another so you can chain stuff, right?
And so it's actually kind of funny now seeing how agents work because they use all of these old Unix tools.
They're using like sed and grep and stuff, right?
And like where a lot of developers may never have heard of it.
They may learn it from their agent and then just seeing these things running and they're like, you want to run sed?
I guess so.
Like, you know, what does that do? And so it's very good for that type of thing of saying, okay, I want to do this thing and pipe it into something else and then have it take the output of that, and then I can kind of do this set of filters and stuff. And so with the original sort of Git plumbing commands, like all of the commands, Linus and the original team were like, we're just going to do things that do all of these very basic things.
And then you can write Perl scripts to wrap all of them and do whatever you want.
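The pipe-and-filter style described here can be sketched in a few lines of Python. This is only an illustration of the idea, not anything Git ships: the helper names `grep`, `sed_substitute`, and `pipeline` are made up for the example and merely imitate what the real `grep` and `sed` tools do on a stream of lines.

```python
import re
from typing import Callable, Iterable, Iterator

# A filter takes a stream of lines and returns a new stream of lines.
Filter = Callable[[Iterable[str]], Iterator[str]]


def grep(pattern: str) -> Filter:
    """Keep only the lines that match the regex, like `grep PATTERN`."""
    regex = re.compile(pattern)

    def run(lines: Iterable[str]) -> Iterator[str]:
        return (line for line in lines if regex.search(line))

    return run


def sed_substitute(pattern: str, replacement: str) -> Filter:
    """Replace every match on each line, like `sed 's/PATTERN/REPL/g'`."""
    regex = re.compile(pattern)

    def run(lines: Iterable[str]) -> Iterator[str]:
        return (regex.sub(replacement, line) for line in lines)

    return run


def pipeline(lines: Iterable[str], *filters: Filter) -> list[str]:
    """Feed the output of each filter into the next one, like `a | b | c`."""
    for stage in filters:
        lines = stage(lines)
    return list(lines)


# Hand-written sample input standing in for the output of some listing command.
branches = ["main", "feature/login", "feature/search", "bugfix/typo"]
result = pipeline(branches, grep(r"^feature/"), sed_substitute(r"^feature/", ""))
print(result)  # ['login', 'search']
```

Each stage knows nothing about the others, which is the property that made it natural to leave the user interface to whatever wrapper scripts people wanted to write.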
So they had no, I don't even think they had an intention of writing a user interface to it, right?
Or making it easy to use.
It was completely orthogonal to their goals, right?
They were like, whatever you want it to be, here's the tooling that does it well.
And we've solved a lot of the hard problems, the APIs or whatever.
You write the interface you want.
The hard problems are like the storage layer, like what are some of the hard problems?
Delta compression algorithms, right?
Wire transfer protocols, like all of the sort of how the trees are read fast or written fast or stored in a format that's small or can be transmitted quickly.
I see.
So if you sort of think of the, like, Git history as sort of a fairly complex tree with kind of data attached, like you need sort of an efficient way to represent this.
Right, how to move branches around, like what branches even are. All of it was so much different than the way Subversion, CVS, RCS, they were all kind of sort of the next step of a philosophy of how to store data, right?
Git completely changed that, right?
They just thought of it more as tarballs rather than as like a series of patch files and deltas, like a sort of delta series.
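To make the snapshot idea concrete: in a repository using Git's default SHA-1 object format, each version of a file is stored as a whole "blob" whose ID is the SHA-1 of a short header plus the full content, rather than as a patch against the previous version (delta compression is applied later, when objects are packed). A minimal sketch, where the function name `git_blob_id` is made up for the example:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a file's content (SHA-1 repos)."""
    # Git hashes "blob <size>\0" followed by the raw bytes of the file.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


version_1 = b"hello world\n"
version_2 = b"hello world\nand a second line\n"

# Each version is addressed by its full content, not by a delta.
print(git_blob_id(version_1))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(git_blob_id(version_2) != git_blob_id(version_1))  # True
```

Because the ID depends only on content, two identical files anywhere in history share one stored object, which is part of what makes whole-snapshot storage affordable.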
So they kind of rewrote it and then had people just do their own Perl scripts, right?
Not thinking a lot of people would be using it and just sort of the Linux core team or whatever.
This guy, Pasky, he wrote these Perl scripts that gave it a unified...
So then the lazier people that came along that didn't want to write all of that stuff themselves just started using that, and it became popular enough that they just pulled it into core and they're like, if you want to use your interface, here you go, right?
Here's the porcelain for it.
And that's the CLI tool, basically.
That is the CLI.
And then most of those commands haven't changed massively.
Like those original core ones from 2005, 2006, right?
Didn't really change a lot.
And they rewrote them from Perl into C.
They used to just actually send the Perl scripts to everybody.
That's another thing that kids listening to this podcast may not be familiar with.
Perl.
Perl 6 is coming out any day now.
So, yeah, that was kind of how this started.
And the other big philosophy of Git that I really do appreciate but has added to this problem set is that they always wanted to be backwards compatible, right?
And so anything that existed before, they won't take out.
And, like, they would wait for a major version, take out a handful of things, but for the most part, everything works exactly the same way that it did when I wrote the first version of Pro Git, which was, like, 2009, right?
There's almost nothing in that book that doesn't work still.
I've just added more stuff, more commands or whatever.
But yeah, if you start with a Unix philosophy, then what you end up with is sort of a middle ground of something a computer can use, but maybe not super well, and something a human can use, right?
So if you run git branch, it's just a list of branches, right?
There's no user interface on it by default, and you can add some stuff that makes it slightly more usable, but the point is they kind of need to solve both of these problems with one interface, which is I need a computer to be able to do this, and I also want a human to be able to sort of interpret this.
And so how do we bridge that gap, right?
And so it's kind of not great for humans and not all of the ones are particularly good for computers, but like they'll do dash dash porcelain in some of them if they want to do that, right?
And so like blame, for example.
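As one illustration of the machine-facing side, `git status --porcelain` prints one entry per line: a two-character status code, a space, then the path. The sketch below parses that shape from a hand-written sample string rather than from a real repository; the names `StatusEntry` and `parse_porcelain_v1` are made up for the example, and it deliberately ignores renames (which add `old -> new`) and quoted paths.

```python
from typing import NamedTuple


class StatusEntry(NamedTuple):
    index: str      # first column: state of the staged copy
    worktree: str   # second column: state of the working-directory copy
    path: str


def parse_porcelain_v1(output: str) -> list[StatusEntry]:
    """Parse simple `git status --porcelain` lines of the form 'XY PATH'."""
    entries = []
    for line in output.splitlines():
        if not line:
            continue
        # Columns 0 and 1 are the status codes, column 2 is a space.
        entries.append(StatusEntry(line[0], line[1], line[3:]))
    return entries


# Hand-written sample, not captured from a real repository.
SAMPLE = " M README.md\n?? notes.txt\nA  src/app.py\n"

for entry in parse_porcelain_v1(SAMPLE):
    print(entry)
```

The format is stable and easy to split, but it is still positional text, which is the middle ground between human and machine output that the conversation is describing.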
And so I think that's kind of where the problems happen. And now it's been so long, and they want to stay backwards compatible, and so it's really difficult to go in and be like, okay, Git 3.0 is going to be a complete rewrite of the user interface that takes all these things into account. Because also there's nobody really running the project in that way, to have some vision and say, okay, this is what's going to happen. And so I felt like this is a good opportunity to try that and to be able to put that in, as long as it's drop-in.
And like with GitButler, you can go back and forth between Git and GitButler.
It's not Jujutsu where you have to have some co-located thing that's a really different way of doing stuff.
We want it to be a Git-compatible tool, right?
And that was kind of a design decision.
And so...
So what do you think like the machine optimized version of Git looks like versus the human optimized version of Git?
Like now that we, to your point of the flexibility, you can kind of do both.
Right.
So actually it's been really interesting.
So we started as a GUI and the idea there was that I never used a GUI, right?
I've used Git for 20 years now.
Boy, that sounds like a long time.
I've used Git for a long time now and I've never used a GUI.
I've never found it valuable because it kind of just wraps Git commands.
And it's generally faster for me to just run the Git commands.
And most people that I know, I mean, we did a survey at some point, like 80% of people still use CLI for Git stuff, even though there are GUIs that exist.
Because, you know, I can click a button or I can run the command.
It's pretty fast to run the command.
So, but the GUIs don't add a lot of functionality that is hard to do in the CLIs, right?
If you know how to do them.
So that was kind of where we wanted to go: start with this GUI and do some drag and drop stuff and have multiple branches and, you know, rebase by just dragging a file from one commit to another commit and that type of stuff.
I have to, like, admit, I'm not sure I've ever successfully rebased anything.
So maybe I'm the target, you know, for, like, a GUI on that.
I have to admit, I've messed up rebasing recently.
And I literally wrote the book on it.
So it's pretty, it's very error prone, right?
And not really automatable.
So like agents can't do rebase -i, right?
Like they can't reword commits very easily.
They can't squash commits together or whatever, right?
Like you have to drop into an editor and do stuff and then have it keep running.
And so there's modalities that I think nobody's particularly well served by.
What we ended up doing somewhat recently, last six months or so, is create a CLI, and that became really interesting to us because, you know, an agent can't use a GUI, which is really why we started going down that path. But people are using TUIs now for stuff. So now we have a TUI, we have a CLI, we have a GUI. They all operate on the same data structures and we can optimize each one for what's good for that. But even in the case of the CLI, we can do a TUI that's sort of interactive, right, and like very fast to do a bunch of interesting stuff.
You can do a CLI where you can run whatever you want and kind of have nice human sort of, you know, output that we know a human's going to read.
So do hints or something like that, right, that you wouldn't do if you're piping into another command.
Or you can do dash dash JSON and get the same data, but in JSON, right?
So that's very easily scriptable now.
I can pull it into Python and, you know, jq and take stuff out of it or whatever.
And we've been talking about doing like a dash dash markdown where it gives you the same information, but very specifically for agents, right?
Because that's what's good at kind of injecting into context or whatever.
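The idea of one data structure with several audience-specific renderings can be sketched as follows. Everything here is hypothetical: the `render` function, the `fmt` values, the field names, and the output layouts are invented for illustration and are not GitButler's actual flags or formats.

```python
import json


def render(branches: list[dict], fmt: str = "human") -> str:
    """Render the same branch data for a human, a script, or an agent."""
    if fmt == "json":
        # Scripts get structured data they can load directly.
        return json.dumps(branches, indent=2)
    if fmt == "markdown":
        # Agents get a compact table that drops into a context window.
        rows = ["| branch | commits ahead |", "| --- | --- |"]
        rows += [f"| {b['name']} | {b['ahead']} |" for b in branches]
        return "\n".join(rows)
    if fmt == "human":
        # Humans get aligned columns plus a hint a script would not want.
        lines = [f"{b['name']:<20} {b['ahead']} ahead" for b in branches]
        lines.append("hint: use fmt='json' when piping into another tool")
        return "\n".join(lines)
    raise ValueError(f"unknown format: {fmt}")


# Hand-written sample data.
data = [{"name": "feature/login", "ahead": 3}, {"name": "bugfix/typo", "ahead": 1}]

print(render(data, "human"))
print(render(data, "markdown"))
```

The point of keeping the data separate from its rendering is that a new persona, such as an agent, only costs one more branch in the renderer rather than a new tool.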
Like a very, just one simple example where we have some agent loops, some eval loops that we run through sort of all these things of, is an agent good at using our... Like, an agent is sort of, people used to talk about like personas, right?
Like an agent is now a persona.
It's a very, very different persona.
It's very hard to guess.
It's hard to empathize with, right?
Like it's hard, like whatever I guess an agent's going to be good at or want to do or whatever is not always what it really wants to be good at or do.
Like we did dash dash JSON and it turns out that the agents liked just actually getting the human output, because with JSON they would kind of compensate by piping it through jq or writing Python scripts to get the one piece of data that they want out of it, right?
And then they would immediately run the status command.
And so we added a dash dash status after to all the mutating commands because we're like, you're going to run this next, so we might as well just give you that as the output, right?
And so like that stuff you would never do for scripting, you would never do for Unix philosophy, you'd never do for humans really, right?
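A rough sketch of that pattern, under loudly stated assumptions: `ToyRepo`, the `with_status` decorator, and the `status_after` keyword are all invented for this example and only imitate the behavior described, where a state-changing command can return the follow-up status in the same response.

```python
import functools
from dataclasses import dataclass, field


def with_status(method):
    """Let a state-changing command optionally append the repo status."""

    @functools.wraps(method)
    def wrapper(self, *args, status_after: bool = False, **kwargs):
        result = method(self, *args, **kwargs)
        if status_after:
            # Saves the caller the extra round trip of asking for status.
            return f"{result}\n{self.status()}"
        return result

    return wrapper


@dataclass
class ToyRepo:
    """A stand-in for a repository; not a real Git or GitButler object."""

    staged: list[str] = field(default_factory=list)
    commits: list[str] = field(default_factory=list)

    def status(self) -> str:
        return f"{len(self.commits)} commit(s), {len(self.staged)} staged file(s)"

    @with_status
    def stage(self, path: str) -> str:
        self.staged.append(path)
        return f"staged {path}"

    @with_status
    def commit(self, message: str) -> str:
        self.commits.append(message)
        self.staged.clear()
        return f"committed: {message}"


repo = ToyRepo()
print(repo.stage("app.py", status_after=True))
print(repo.commit("Add app", status_after=True))
```

For an agent, each tool call costs a turn, so folding the predictable next call into the current response removes one round trip per change.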
And so, but agents really want it.
And so we have to think about them as a persona of the user of the CLI, but I can have it using those and I can open up the GUI and see what it's doing and, you know, help it out or like, it's kind of interesting to have these very, sort of persona focused interfaces for what each person is good or not good at, right, in order to accomplish the goal that they have.
That's really interesting.
So because the agents or the underlying models have sort of flexible input schema, it's actually better for your kind of tool output schema to be kind of flexible or kind of like dump all the information that they need.
Yeah, I mean, actually, we're even thinking of, like, we've been talking a lot about this dash dash markdown sort of output format because we thought it would love JSON and it doesn't like JSON that much.
And so, like, how do you optimize for what an agent really wants, right?
Because we can put in stuff like, guess what I think the agent's next step is going to be and give it some context, some extra context that we wouldn't give a human, right?
Because we're like, if you want to do this next, run this.
If you want to do this next, run this, right?
And then it can kind of help lead whatever the next steps that we think it's going to be in that particular case, right?
And so it's a very interesting UX problem set that I think nobody's really thought through really even now, right?
Like even most CLIs don't have dash dash JSON or whatever, right?
Even ones that have been developed fairly recently. Like, we're all learning this now, right? And it's not easy to see. Like, you really have to dig into it. You have to start asking it, like, look through the last, you know, 50 tool calls that you did, the GitButler stuff. Like, what did you struggle with, right? Like, what had errors? What did not do what you expected it to do? And like, weirdly, it will kind of tell us, right? And we can kind of work on the skill files and figure it out. But it's a new era of trying to figure out usability.
That's so interesting.
You know, I'm picturing the consultants showing up with their slide decks saying, you know, here's your persona, you know, Agent Alice, right?
Here's your persona.
Just a picture of a bot.
Yeah, exactly.
You know, at least, actually, I guess you don't need the consultants because you can just ask the models directly, right?
Like they're available everywhere all the time.
So I think that's actually a really interesting observation.
Can you go through some of the sort of reasons that GitButler, especially the CLI, is a better fit for agentic workflows than kind of plain Git?
I mean, for one, it's because the input and output are built specifically for that, right?
So we get to look at what it's trying to do and try to give it feature flags that it's trying to run, right?
Or that it asks for or is trying to get around by writing some Python script.
And then I can see what the Python script does and be like, okay, here's a new sort of flag that you can give that command so that that's the only output that comes out, right?
And sort of optimize for what it's trying to, like we can see what it's trying to do.
And I find that really interesting.
The other thing that GitButler specifically is really good at, one of the early design decisions that I found really valuable and powerful, is doing parallel branches.
So a lot of people are using kind of worktrees now to do a lot of stuff in parallel so the agents don't step on each other.
And, I mean, humans have dealt with this for a long time, right, of, like, you're working on some feature, and then you see a bug, you notice a bug, right, that you want to fix, and you have to decide, do I stash everything I'm doing, fix the bug, open a branch, you know, push it to that, and then go, like, you know, go back to my other branch and unstash.
Stashing always felt a little hacky, didn't it?
It's, but, I mean, it's an artifact of you can only be on one branch at one time.
There's one HEAD, there's one index, there's one working directory that you can deal with at a time.
And the data model just doesn't really support that very well.
And so we built stuff on top of that where we essentially have like a hidden mega merge type thing.
And you can have multiple branches and you can take stuff that you've done and assign it to different branches and commit it to different branches orthogonally.
And then it doesn't matter what order you merge these branches in, but it's using one working directory.
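The order-independence claim can be illustrated with a toy model. This is not how GitButler is implemented; it is a deliberately simplified sketch in which each branch is just the set of files it changed, and the names `apply_changes`, `base`, and `branches` are invented for the example. When the branches touch disjoint files, every merge order produces the same tree.

```python
from itertools import permutations


def apply_changes(tree: dict[str, str], changes: dict[str, str]) -> dict[str, str]:
    """Return a new tree with one branch's file changes applied on top."""
    return {**tree, **changes}


base = {"app.py": "v1", "util.py": "v1", "README.md": "v1"}

# Three branches carved out of one working directory, each owning different files.
branches = {
    "feature": {"app.py": "v2"},
    "bugfix": {"util.py": "v2"},
    "docs": {"README.md": "v2"},
}

outcomes = set()
for order in permutations(branches):
    tree = dict(base)
    for name in order:
        tree = apply_changes(tree, branches[name])
    outcomes.add(tuple(sorted(tree.items())))

print(len(outcomes))  # 1
```

If two branches did change the same file, the result would depend on order in this model, which is the situation where stacking one branch on top of the other becomes the natural answer.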
And so what's really nice about doing multi-agent work, which not everybody's doing, I think some of us are, like some people that are really trying to push the boundaries of, you know, using a lot of agents constantly.
But one thing that's nice about doing that instead of worktrees is that the agents can see each other's work, right?
And so it's almost like they're kind of, not pair programming, but they're using one working directory.
So if one agent modifies a file and then the other agent tries to, it notices that it's been modified and it can pull how it's been modified and then add on top of that, right?
And so, and not create conflicts.
And so they don't even, we even experimented with having like a communication between agents.
So we have three agents running at one time, give them a little chat channel and they can talk to each other about what they're doing.
Hey, I'm editing this file now and stuff.
And it was super cool.
I wanted to ship it so badly.
I'm like, this is awesome.
This is so, we had a little TUI, you could see them talking to each other and stuff.
And I'm like, this is amazing.
And we put it like through, like very sadly, we were all devastated.
It does not help, right?
Like they will see that something else is happening.
They'll figure out what, why, right?
Like it'll be like, looks like somebody's working on this, some other feature because they added this stuff.
So I'm going to leave that alone and I'll put my changes somewhere else.
And it's faster, right?
Because they don't have the overhead of the communication.
So unfortunately, we're not shipping that because I really wanted to be able to show that off because it's very fundamental.
But just to sort of parse what you're saying a little bit, like if I have, you know, five agents operating on the same repo at the same time, same code base, the worktree solution is basically making five copies of the working directory with some sort of smart, you know, storage optimization.
So basically five separate copies.
The idea behind parallel branches with GitButler is everybody's operating on the same code base directly at the same time.
And they're surprisingly good at not stepping on each other.
And they can't make merge conflicts because they all have the same files to edit, right?
That's super interesting.
And so that, but when they're done with their loop, right, or when they have an agent stop and they try to commit stuff, if they have our skill, then they'll look at it and they'll be like, okay, I'm just going to create my own branch.
Like each one of them can work in their own branches and they can commit their stuff into their own branches.
And now it's sophisticated enough where if one agent really wants to edit something that another agent did, it can see that it's locked and it can stack the branch instead, right?
And so now it doesn't really matter if they're stepping on each other or not.
They can kind of figure out.
My co-founder, Kirill, today was showing me this thing where the two agents had, you know, they were both trying to sort of vie for the same file and edit it in a way that wasn't really compatible.
And so one stacked their branch on top of the other one, and then they kept working and they kept committing, but they commit to their part of the stack, right?
Oh, that's cool.
And so, like, that's the type of thing that you can't, it's really, really difficult to do that with worktrees or really any other type of, like, you can't do it with Git.
Just plain Git doesn't really allow for it at all.
You can't do rebasing and sort of amending commits or moving commits down the stack or squashing stuff.
So it's just not possible.
So part of it is that we have that solution, but the other part is we're really trying to make the tools accessible for agents in the first place.
That's super interesting.
So I still get the sort of logical isolation of the branch, but the agents, like, can kind of see each other's work and, like, you know, it kind of makes sense, right?
It's like if you completely isolate them each in their own working directory, they, by definition, don't have any awareness of what the others are doing.
And I mean, there's other ways of getting around that as everybody, you know, finds out, right?
Like if two worktrees create merge conflicts, they won't figure it out.
And then they get a merge conflict, you know, on GitHub when, you know, the second person merges or whatever.
And then you can get the agent to pull it down and fix it and try it again or something like that.
So it is doable, but it's kind of nicer to just not have them in the first place and be able to kind of review everything sort of from a high-level standpoint and be like, okay, all of these have done what I want, and I have these two stacked branches and then two independent branches, and all three of these stacks can be merged in any order, and we're really going to end up with my working directory now, right?
That's kind of where it started, right?
And so this functionality is all built into the CLI now, and so different types of agents can use this just by calling out to the CLI, basically.
That's very cool.
So what do you think happens to GitHub in this world, right?
Like Git clearly is still the backbone everyone's building on.
You're even extending sort of Git.
You're not like building a whole thing.
But GitHub in some ways seems less relevant than it was before.
I'm curious what you think will happen next.
I mean, I think it depends on GitHub, right?
I think it's mostly, it's actually mostly framed as what's the next GitHub?
Because I think people feel like GitHub's not going to be sort of, you know, be able to pivot fast enough to keep up.
I mean, it's a rocket, it's a behemoth, right?
That is both its advantage and disadvantage, is that its advantage is it has everyone in the world using it, and so whatever it does, it does at scale, which is awesome, right?
It can, it has the obvious capability of being able to introduce something to most of the world, of all of the agentic users in the world, no matter if they're using Copilot or not.
The question is, do they care enough to do that or do they have the vision to do that?
Do they know?
The other question is, do any of us even know what that should be, right?
I feel like we're in an interesting sort of Cambrian explosion of workflows now where it's like, who knows, right?
And to put time and resources into doing that in a way that's very hard to pivot around is really difficult.
For a startup, we don't have the audience in the same way, clearly, but we can mess around more and kind of follow things faster and try to figure out, okay, this is... because, I mean, it changes every month now. Like I was saying, the tools that we're writing, we think work, they don't work, we can ditch them. There's no, whatever, right? Like, it's fine. So it's kind of an idea of, you know, what are people going to more or less settle on, and then how can you give that to the most people or capitalize on that to some degree. Like, I found GitHub's evolution really interesting because people ask, what is the next GitHub?
And I find that interesting to look at from the point of view of what was before GitHub.
Like, what was GitHub the next one of, right?
Yeah, yeah, yeah.
And it wasn't really anything.
Like, there was nothing like a GitHub before GitHub.
And so I think whatever is the next GitHub is going to be the same problem set, right?
It's not going to look like GitHub, right?
Like, I think SourceForge or something, maybe you could say, was kind of before GitHub, but like it didn't really have collaboration tools, right?
It wasn't about sharing patches. Or like Trac or, you know, issue tracking systems was kind of before GitHub.
So like, the entire programming community just changed fast.
And I think GitHub took advantage of that and was able to grow and provide tooling that nobody had, right?
And nobody was really thinking about, you know, in a short enough time span, and get that audience.
And I think there definitely probably will be something.
The question is, can GitHub do that, the way that, you know, SourceForge couldn't, or Google Code or whatever was kind of before us?
Or is there going to be a startup, you know, like us or like somebody else? You know, I'm sure there's a lot of people trying to do this type of thing right now because they see the need. It's just that nobody's agreed on what the solution is, right? Like, where to go. And it doesn't have to be perfect. It just has to be good enough that people don't want to write it themselves and are like, okay, I'll give you some money so that I can use that instead of reinventing that particular wheel.
You mentioned before that the original Git developers didn't like the GitHub primitives, you know, PRs and issues and all this.
Do you think now the primitives need to be reevaluated?
Do we need issues or like, is there some like agent native issue format that, you know, is better than what we have now?
I think we've needed it for a long time.
And it's just too much friction for too little value, really, to change to.
Like, I would really prefer patch-based review rather than branch-based review, right?
Like, PRs made things really easy, but you get a lot of commit slop, right?
Like, there's a lot of oops sort of things added on the end because it's the branch that matters in the review context and it's the branch that matters in the merge context.
And so the commit message doesn't make any, like nobody really looks at them.
And so Suhail once posted an old commit log and like half the messages were like, ah, this is not working.
They're like, yes, I made it work.
Yeah, I mean, it doesn't provide value that way, right?
Like in the mailing list days, it did because that was the way that you reviewed stuff.
You had to have a good commit message because that was your PR description, right?
Now it's the PR description, which is not kept in Git.
And so, but people don't care that much, right?
Just once it's merged and out there, who cares, right?
There's not a lot of sort of, you know, code archaeology or whatever that people really depend on for that.
And so I think review changes a lot.
I mean, just to go back to the problem with the primitives right now, PRs: if you ask almost any software developer, when you do code review, do you really read the whole PR?
Like, do you go through every line and think it through?
Do you pull it down and test it out and then leave the good feedback on each line, say this is working or not working?
Or do you give it a cursory glance and say, yep, fine, right?
Like, it doesn't look badly broken or like, you know, you're introducing, you know, API keys or whatever, right?
Like, if you can pull it down and compile it and run it and test it, I feel like that's almost a better review.
If something doesn't work right, fine, go to that piece of the code and do it.
Now agents can do this for us or can augment us in a very nice way.
I think review would be nicer if it was patch-based and it was local and you could actually run stuff, right?
Or your agent can run stuff and then give you sort of a short list and then you can look at that.
And so I think a sort of centralized online URL for a PR is something that was not the greatest.
It was better.
It was a better tool.
I think it was better than, you know, Trac or, you know, looking at patch files or whatever, certainly.
But it's unfortunate that it hasn't evolved more, right?
I think there was still a lot of room to grow in that space, and it hasn't changed that much.
And I think there's definitely room for that.
So the question is, like we even talked internally about trying to do like our own forge, right?
Or our own review system. We actually shipped one and then kind of took it back down because it wasn't solving the problems that we really, really needed solved.
But I'm sure we'll try other approaches to the review system, and other people will as well.
I'm really fascinated where it goes.
Like, what do we really need as code writers, whatever that means in the near future?
Like, what do we really want to accomplish from review of code by whoever does it, right?
And what does that tool look like in a way that is easy to use and easy to learn, right?
And it's a very unsolved problem right now that is, for me, incredibly fun to tinker around with.
Like, I have lots of opinions here and I want to try out lots of stuff and I want to see what feels right to me as a software developer, but...
There's lots of possible answers to it, but I definitely think it's an unsolved problem.
I don't think the primitives are the right thing anymore.
Yeah, it's sort of interesting, right?
Because people are reading the code less.
I mean, there's even sort of an extreme view that, you know, JavaScript is the new assembly, right?
Where it's there and something compiles down into it, but it's not really something you read.
So if we're not writing JavaScript directly, if we're not reading JavaScript directly, should code review actually be code review, or should it be prompt review or something else?
So right now, at the state of, you know, agents and models, the way that we sort of approach stuff, I think, is somewhat triagey, where, if it's a red wristband, if something goes wrong and it's bad, then that's very human.
We look at stuff, we handwrite stuff.
It's very important for us to essentially get the APIs right.
We know that if we call these APIs, they'll do the right thing and give us the right errors.
We have high confidence that this is good code.
There are other levels of triage where it's not that important, right?
If they're just calling those APIs and it's a UX problem, if we add a feature flag for an agent, like, we're fine vibe coding that, right?
And just kind of testing it, having it write some tests, looking at the tests, and being like, okay, that seems fine, right?
Like that seems to work.
Run it through the eval loop.
It didn't, you know, it doesn't break anything.
It makes things 10% faster, whatever.
Ship it, right?
So we do kind of have this middle ground.
But I feel like, even internally at GitButler and at a lot of companies that are really trying to push the boundaries of this and use LLMs a lot, it's becoming much more of a writing problem.
Can you do a good write-up?
And not every team is good at that.
Not every software developer is good at that.
Can you communicate?
I feel like a lot of developers, especially the ones who think that they're very smart or are legitimately very smart, feel like they don't have to describe what they're doing, right?
Like, it can live in their head and it's fine.
I think the software developers that will be the best producers of product in the near future are the ones who can communicate, right?
The ones who can write, right?
The ones who can describe.
And like that is, I think, the next superpower.
And I think you're right, the collaboration is less about how we write the code and more about what the spec is, right?
Like, what is the write-up that we want to be true?
And then we can hand that off, because the implementation details probably become less and less important as the agents get better.
So all of us who, you know, were attracted to engineering because we could deal with machines instead of people are now finding out that engineering is an actual human discipline after all.
I mean, I clearly have a huge bias because we started this saying I wrote a technical book.
So I like the idea of I'm going to be the best programmer because I can write technically, right?
Yeah, I mean, we've all had this thought, you know, that there are 10x or 100x programmers out there.
All of us have probably thought at some point that we were one of them, which, like, definitionally is challenging.
And so we've all had this thought of, like, oh yeah, I don't need to explain what I'm doing, I'm right.
But the point you're sort of making is that now "right" is not as objective as it was, or there's just so much activity going on that kind of managing state across the team is maybe the most important thing.
Yeah, yeah, or, yeah, consensus, right?
Like, the why rather than the how becomes, I think, more and more valuable as the how becomes cheaper.
What do you think will happen with coding agents in general next?
Like, what do you think is the big hill to climb for them?
You know, just because you guys are pretty deep in this.
As they get better, I think it's the problem set of what to work on next, right?
Like, I think we're all kind of afraid of just giving it access to Linear and being like, do it, right?
Like, just solve all of them.
And, I mean, the other problem with that is that most Linear tickets, most tickets in general, aren't write-ups, right?
Like, it's not a database of write-ups. It's actually very hard to sit down and write up, you know, this is how I want the entire product to look in every interesting and important aspect, right?
I think what becomes problematic is you build something, and then you have this downtime where you're using it and trying to figure it out.
When I'm doing stuff right now, and again, I'm the CEO, I'm not, you know, writing the important code, but doing proofs of concept or coming up with ideas or whatever, I'm spending most of my time, like, testing it, right?
And writing up the next thing. So there's a lot of wasted cycles, right?
Like I almost feel, I don't know if there's a phrase for this, but like you're not spending enough tokens, right?
Like you're not having enough things working at the same time.
And so...
Hey, you're the German speaker.
You should have a word for this, a compound word.
Yeah, I know.
Token shot if quite up.
Yeah, the fear of not consuming enough tokens.
I think the problem set becomes training a team to be good at writing, right?
And figuring out what that coordination looks like, to decide which of these series of write-ups describes the product that we want to have.
And probably less so how exactly it's implemented, right?
But that moves the constraint: it's no longer, can I produce the code, but, can we agree on what we want, right?
That's, I mean, it's difficult in some ways, it's easier in some ways.
Like, I've been working on this metadata system, we might talk about this at some point, where we want to append transcripts to commits or branches or whatever, right?
And Git can't really do that very well.
And so we're trying to figure that out.
And so I've been spending a lot of time writing a proof of concept and a spec, and it's most of the time on the spec, right?
Weeks, or like, just so much time on the spec.
And then every time I have a decision, I just make it, build it, and then I try it out.
And then I go back to the spec and I fix the spec and I tell it, okay, do it again, right?
And so that's really nice because I don't have to spend all my time implementing it and then seeing what's wrong.
And I don't have to just have a write-up that I'm trying to convince you to read and agree with, or find problems with, or whatever.
I can have show and tell all the time, right?
Like I can always show you something and be like, is this what we want to productionize or not?
And we have a really good idea that it is or it isn't.
And I find that a really interesting, powerful thing.
So I'm really curious kind of where things fall down in the long run.
But right now, I mean, I'm using AI to help me write specifications so that we can implement a lot of it by hand if necessary, right?
But we know we're implementing the right thing because we have a very good idea of what it feels like.
That's really interesting.
So, you know, so not only will AI coding models and coding agents continue to improve, but it sounds like you're making the case that we're not maximally taking advantage of the agents that exist now, especially in a team setting.
I mean, there's so many things you could do, right?
Like our agents could talk to each other.
It's actually a thing that has never been good in software development: inter-team communication.
Like, if you're working on some project and it's modifying a file, and I'm working on a project and I'm modifying the same file, neither of us knows that until... I mean, we kind of lost that with decentralized version control to a degree as well.
I think it's obviously much better, the advantages that we get, but we've lost some aspects of coordination, where it'd be nicer to know that rather than you merge first and so I have 100% of the work, right?
Like, if we could talk to each other in more real time about what we're doing. But that's a lot of overhead. But agents are very good at that, right?
That is not a problem that agents have.
Like, they can take their downtime and talk to the rest of your team's agents and be like, what do I need to look out for?
Or, you know, what should I tell Scott about so that he can be aware of this, right, when he's working on his feature or the next iteration of this or whatever.
Like, I feel like that almost would be a better use of sort of the cycle downtime than trying to run five or 20 of them at the same time, right, which just becomes hard to manage and hard to figure out, like, is this doing the right thing or going in the right direction?
I feel like it's almost more interesting to have it help you write, or suggest things, or, you know, be sort of a helper that way.
Talk to the rest of your team.
Figure out what the world of things is in that project that could affect you and give you that information so you can make decisions, right?
That moves towards what I think would be a really interesting way of writing good product.
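As an illustration of the coordination problem described here (two people or agents modifying the same file without knowing it), the following is a minimal Python sketch. It is not GitButler's implementation or any agent framework's API; the function name `find_overlaps`, the worker names, and the file paths are all hypothetical, and it assumes each agent can already report the set of files it is touching.

```python
from collections import defaultdict
from itertools import combinations


def find_overlaps(touched):
    """Map each pair of workers to the files both of them are modifying.

    `touched` is a hypothetical report: worker name -> set of file paths
    that worker currently has uncommitted or unmerged changes in.
    """
    workers_by_file = defaultdict(set)
    for worker, files in touched.items():
        for path in files:
            workers_by_file[path].add(worker)

    overlaps = defaultdict(list)
    for path, workers in sorted(workers_by_file.items()):
        # Every pair of workers sharing this file should hear about it.
        for first, second in combinations(sorted(workers), 2):
            overlaps[(first, second)].append(path)
    return dict(overlaps)


# Hypothetical example data, not taken from the conversation.
touched = {
    "matt-agent": {"src/auth.py", "src/api.py"},
    "scott-agent": {"src/api.py", "README.md"},
}
print(find_overlaps(touched))
```

In this sketch the only shared file is `src/api.py`, so that is the one thing each agent would surface to its human before anyone merges.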
So this is like the smart, responsible version of using agents, instead of the amphetamine-fueled, just pound as many agents at the repo at the same time as I can.
Yeah, I mean, also I could be biased in that I've just never been able to keep enough of them running at the same time and then review stuff and then find useful data out of it.
Like I like to have a little bit more control and so I take things slower generally.
But yeah, maybe some people are really good at that, right?
But in all of these scenarios, the metadata you're talking about and like appending chat transcripts to the changes and things like that, I guess it feels like that gets increasingly important, right?
Because the focus of sort of creative control or engineering insight is at the metadata layer even more than at the code layer.
Yeah, I mean, it also becomes a big data problem.
Like, you would be surprised, it's all text, but you'd be surprised how quickly that balloons, right?
If you're really trying to keep every transcript or...
I mean, every prompt is okay.
It's kind of commit message-y length.
But having all the tool calls or whatever, or everything that the LLM was thinking, it becomes a really, really, really big data problem very fast, even on small projects.
So now you have to go into large-file, large-repository territory.
It's interesting.
Git has some things for being able to work with data of the size that, you know, Chrome uses or the Microsoft Office team uses, stuff that Microsoft built into Git a long time ago that most people don't use or know about.
And so we're taking advantage of some of those primitives in Git to try to do a metadata system that can scale relatively well without having to worry about it too much.
But yeah, so there's a lot of interesting problem sets that are coming up with just trying to keep everything, right?
Like, version control is all about, you know, well, change management, I guess, change control and history.
You want to be able to rewind, you want to have save points, and you have to have this balance of how much data do I store in this?
There's a cost-benefit sort of analysis there of, do I need all that data, because then it's too much data and you can't find anything you want, right?
Yeah, it has to be indexed or searchable in some way, presumably.
The first version of GitButler that we did was, like, a CRDT-based version where it was just constantly recording every sort of file buffer save, right?
Like everything you ever changed so you could just take a timeline and scrub it back to like any point in your working directory.
That's amazing.
And it was awesome.
And it actually wasn't even that much data, but it was just the user interface.
It was too much information, right?
Like, it's not common that I want to go back 17 minutes, right?
And so we kind of scrapped it because it was just too much complexity for humans.
It'd almost be interesting to reimplement some of that and have agents take a look.
Like, if you want to get the entire working directory back to what it was, you know, 27 minutes ago, here you go, right?
Like, here's a tool to do that.
But I think it's still, you know, you have to figure out usability versus what's possible.
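To make the "scrub the working directory back to any point" idea concrete, here is a minimal Python sketch. It deliberately swaps the CRDT mentioned above for a plain append-only log of saves, so it is not how GitButler's first version worked; the class name `SaveTimeline`, the timestamps, and the file contents are hypothetical.

```python
import bisect


class SaveTimeline:
    """Append-only log of file saves that can be replayed to any moment."""

    def __init__(self):
        self._times = []   # save timestamps in seconds, kept in ascending order
        self._saves = []   # (path, content) recorded at the matching index

    def record(self, timestamp, path, content):
        # Saves must arrive in time order so the log stays sorted.
        if self._times and timestamp < self._times[-1]:
            raise ValueError("saves must be recorded in time order")
        self._times.append(timestamp)
        self._saves.append((path, content))

    def state_at(self, timestamp):
        """Return {path: content} as of `timestamp` (inclusive)."""
        end = bisect.bisect_right(self._times, timestamp)
        state = {}
        for path, content in self._saves[:end]:
            state[path] = content  # a later save overwrites an earlier one
        return state


# Hypothetical session: three saves over 25 minutes.
timeline = SaveTimeline()
timeline.record(0, "main.py", "print('v1')")
timeline.record(600, "main.py", "print('v2')")
timeline.record(1500, "util.py", "HELPER = 1")

# Scrub to minute 17 (1020 seconds): util.py has not been saved yet.
print(timeline.state_at(17 * 60))
```

Replaying the log is simple to build; as the conversation notes, the hard part is presenting that much history in a way a person (or an agent) can actually use.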
I watch some of the thinking logs for agents sometimes, just like out of curiosity.
And some of them are super unhinged, right?
It's like, I see the problem.
It's Bubba.
Oh, wait, no.
Oh, God, I can't believe I missed that.
You know, it's like the agent starts to berate itself.
Don't be too hard on yourself.
Yeah, but there's super valuable information in there, right?
It's just a little bit like extracting it out of this junior developer's kind of private freakout.
It's a very difficult problem set because it's so, you know, I don't know.
Like I'll have conversations with other people on my team where it's so subjective.
Like, they think, you know, Codex is good at this and I think Claude Code is good at exactly the same thing, and we have different reasons why on a similar problem set, right?
And we're like, okay, well, what is it, right?
Like, why am I having that particular feeling?
Why does it seem to be acting that way for me?
And it's kind of difficult to quantify.
But yeah, I find it interesting thinking about, I used to do this for the language learning thing, of trying to figure out, you know, what's the logical end of the tooling that this will eventually become?
And then what's interesting or valuable at that point?
Like, for language learning, you know, Google would come out with, like, an in-ear translation, auto-translation thing or whatever.
Babelfish.
Yeah, a Babelfish.
And they're like, okay, well, language learning is done, right?
Yeah, yeah.
And I was like, I was always like, it's a dumb argument because there's no, it's not the same, right, to have a Babelfish.
Like, both people have to have a Babelfish for one.
And it has to be very fast and very effective and have cultural context that's very difficult.
And I like to think about, okay, what's the logical sort of endpoint of this?
And for language learning, it was having a human translator, which, you know, I did a tour for GitHub in Japan at some point, and I had this translator walk around with me for a week, right?
And every interaction was through the translator.
And she was amazing.
She was very good.
But it's still not a good experience, right?
Like, you're not going to get married with that type of communication, right?
Or, you know, you're not going to start a business.
I don't know.
There's a lot of things that you don't really, it's not good, right?
But that is the best possible version of a Babelfish, right?
And so I always think about that because for this, it's kind of interesting.
Like, what is the logical extension of how good agents become, right?
Is it having the best engineer you've ever known that can stop time and work on something as long as they want and then start time again and now you have that solution, right?
Which is kind of what it is, right?
Like you have the smarter and smarter and smarter people or, you know, people that can write code that can do it faster and faster.
And so, but when you have that, what do you want to do with that?
Like, how do you manage that time?
Like, what tooling do you want to help you figure out what you want to build and what you're happy with as the end result?
And that's a very, very interesting question that we're, you know, we're getting closer to, but I still think we have a while before it's that good, right?
But it will get that good, I think, or very close to that.
And so what do we do with that?
This is one of the big questions in the tech industry and the big question in the world right now.
I mean, what does the end state of this all look like?
I don't claim to know the answer, but I'm glad people smarter than me are working on the problem.
Well, thank you very much, Scott, for doing this talk with us today.
Yeah, thanks for having me, Matt.
And yeah, for folks who are listening, try out GitButler, especially with the new command line tooling.
Super, super cool.
And for folks building startups, try out Andreessen Horowitz.
They're awesome.
Too kind.
As a two-time Andreessen Horowitz founder, hopefully this one is even more successful than the last.
Thanks, Matt.
As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any A16Z fund.
Please note that A16Z and its affiliates may also maintain investments in the companies discussed in this podcast.
For more details, including a link to our investments, please see a16z.com forward slash disclosures.