# Scaling Autonomous AI Agents for Business Leverage

**Podcast:** The Startup Ideas Podcast
**Published:** 2026-04-29

## Transcript

Howie Liu is an absolute legend.
I mean, this guy started Airtable, half a billion in revenue, a billion dollars in the bank, growing quarter after quarter.
So he's one of those people that when I want to know where is the world going, I call Howie.
This episode is structured into two parts.
First, where is the opportunity when it comes to AI agents?
I think that there's a trillion dollars up for grabs.
I don't know why more people aren't talking about it.
So I had him just give us the tips and tricks for how to use HyperAgent so that you can outperform 99.9% of people.
I got good news.
Howie is going to give you $1,000 of HyperAgent credits, no strings attached.
You just log into the account.
There's going to be $1,000 right there to go and build the business of your dreams.
The catch is that only the first 1,000 people to do it get the $1,000.
He's committing a million dollars.
How crazy is that?
Just writing a million dollar check of tokens to you, to the Startup Ideas podcast community, to play with HyperAgent, to automate some stuff, to do some research, to build their business.
So thanks, Howie.
You know, all I ask is that you like and comment on this video.
Show some love for Howie for doing such a cool thing.
We need more entrepreneurs, more builders.
And I'm stoked to see him support you all.
Thank you to Airtable for sponsoring this episode.
You guys are legends.
Enjoy the episode and have a creative day.
Feeling really lucky right now because we've got Howie.
He's the co-founder and CEO of Airtable.
And today we're going to talk about agents.
He's going to do a little show and tell of his new product that I've been using for the last few weeks.
But first, Howie, I haven't been sleeping very much, to be honest.
Yeah, exactly.
And I just need your reaction to just some things I've been thinking about.
This chart over here is by Sequoia.
In what domains are AI agents deployed?
You can see software engineering is at almost 50%, back office at 9%, marketing and copywriting 4%, sales and CRM 4.3% and down.
When you see this, what's your reaction?
I think two things.
One is, I think it absolutely reflects the under-penetration of AI in industries that clearly could already be disrupted or benefit with even today's AI capabilities, right?
If you took like frontier agents today and deployed them into every one of these categories, you should get to 100%.
And then two, I think even the higher numbers like software engineering is actually kind of an overestimate.
Meaning, you know, like as I think frontier developers and companies applying frontier agentic development practices are finding like, you know, the new model of software development is not even just like every engineer using AI autocomplete, like tab autocomplete, which like we all figured out like three years ago, right?
With even GitHub Copilot.
But it's now like, you don't even need the IDE, right?
Like the way I develop on Hyperagent is I have like 30 different Claude Code instances running in parallel.
And each one is coupled up to like a browser, fully autonomous.
It can go and like get other agents to comment on any PRs it creates.
And so like this modality shift of like, you know, no AI to like kind of what I would call gen one AI, which is like basically like AI augmentation for still like very human driven development workflows.
Andrej Karpathy talked about like, you know, in October, November is when he completely inverted from like mostly still human written code with AI augmentation to completely the opposite.
Right.
And that's what we've seen like the frontier companies leap into.
Like, I think even the 50% is an overestimate because the number of companies and even people who have switched into that new frontier mode is actually like, you know, definitely less than 50% of software engineering today.
Right.
So I think what we're actually seeing is like the frontier is advancing so quickly and many companies and many industries and many functions are barely catching up to like the three year ago state of the art, let alone like, you know, disrupting themselves and their company, you know, and their industry with the new state of the art.
Right.
I mean, another way to think about it is there's co-pilot territory.
These charts are from Sequoia, right?
There's co-pilot territory, there's autopilot territory.
How do you see, you look at this, right?
This is what Sequoia says, there's a trillion dollars up for grabs within agents.
But they're very different.
What's your reaction to this?
I mean, look, I think to me it's like these agents really reached a breakthrough, really, you know, call it like four or five months ago.
Right.
And I think developers felt this with Opus, you know, Opus 4.5 just kind of set a new high watermark of like, whoa, this thing for the first time, like really feels like a true software engineer that's able to work like on a task that would have taken a real human engineer, like maybe many hours, if not days.
It can go do it completely autonomously and it ships me a perfect clean PR that I can just review like a, you know, like a reviewer would, right?
And I think that that experience is going to be unlocked and already is unlockable across every single other domain, right?
Because we've kind of just reached this point where like the models are more than smart enough, right?
Like you talk to these models, even in like a more synchronous like chat interaction, not like an autonomous agent interaction.
And you can ask it the most advanced things, give it like really complicated subject matter content, right?
Like management consulting, you give it like, you know, kind of some really hard, meaty problems and the context thereof.
And it gives you really smart answers that truly are like expert level.
And so it's clear that the model intelligence is there.
The models are smart enough also to kind of coherently execute across multiple turns with lots of tools and context.
And so I think it's more of just a matter of how and how quickly we can deploy agents into every role in industry before we can like truly just almost do anything that humans could do in each of these functions with agents.
And I mean, the TAM for that is like not even a trillion.
It's like probably like the whole GDP of like all white collar labor, which is like obviously many tens of trillions, right?
Like in even like the Western hemisphere alone.
Right.
Which is sort of like, I don't understand how you're not, how people aren't motivated to create startups right now in that sense.
The person listening to this is like, yes, yes, Howie.
But it just feels like I can't think of a better time to be creating a startup than now.
Totally.
I think the weird thing is it's almost like using is believing.
It's really hard to fully grok the power here if you haven't actually gone and hands-on spent like at least a full weekend playing with agents.
Right.
Like, and that means more than just a superficial, like you did like some naive, like one shot thing, like, Hey, like, you know, who's going to win the next presidential election, like kind of question that you could have asked a chat bot.
Like I think people are not actually coming in and when they're doing light experimentation, they're not actually putting in an ambitious enough prompt or task in front of the frontier agents.
And they're still kind of using it like they use gen one chatbots.
And like until you actually experience the full power and autonomy of these frontier agents, you know, I think it's hard to fully extrapolate like what types of companies can be built now that weren't possible before, structurally.
How could you build like a multi-billion revenue business with one human and like hundreds of agents, right?
Like you have to use it to get it.
This is another chart I can't stop thinking about, which is the unit economics just absolutely crush.
When you look at a human person versus an AI agent and what it costs, you can create some serious gross margin businesses on top of this.
You know, a lot of people, like, complain about the cost per token of the frontier models, right?
So, like, Opus 4.6, now 4.7, clearly the most expensive model, right?
You know, and then, like, GPT 5.4, very good, still kind of expensive, even open source, like, you know, like, it's cheaper, but, like, it's not free, right?
And I think, like, people, you know, some people are struggling, I've seen, to, like, you know, adopt this mental model of, like, you know, in the old days of software, like a lot of stuff was free.
Like you could get like, I mean, even ChatGPT has a free version, right?
That you could just use however much you want.
You get a cheap, dumb model, but like you're not expending that many tokens because it's not actually doing like autonomous multi-turn work and expending like a billion tokens like every few days, right?
Like it's much more token cheap or token lean.
And I think that like we have to get over this hump of like, you know, anchoring our price expectations for AI on like traditional subscription software where it's like, oh my God, I have to pay like 20 bucks for like Netflix per month now, instead of like whatever it was, $12.99, before.
And instead think of this as like, yeah, like to your point, like how much would it have cost a human to do the thing?
Right.
Like, you know, if I wanted to go and create an entire marketing campaign, or actually, in my CEO role, it's funny: one of our recent board memos that I wrote and sent out to our entire board and kind of major investor list, you know, a lot of it was researched and crafted by Hyperagent, right?
Obviously with my instincts and context imbued into the agent, and of course I oversee it at the end.
But I got feedback from some of our best investors that that was the best memo I'd ever written.
And I'm like, yeah, because an agent did it.
And by the way, I got to do it in like 10 times less time.
And so like, even if it costs me, let's call it like $150 of tokens to generate that output.
Like think about the opportunity cost of my time.
And so I think that is a real reframe moment that's needed is let's think of this as like, what is the human equivalent time cost versus wow, $150, that sounds really expensive versus a $10 per month sub.
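As a rough illustration of that reframe, the arithmetic can be sketched in a few lines of Python. Only the $150 token cost and the "10 times less time" speedup come from the conversation; the 10 hours of unassisted work and the $500 hourly rate are hypothetical placeholders, not figures anyone quoted.

```python
def time_saved_value(human_hours: float, speedup: float, hourly_rate: float) -> float:
    """Dollar value of the hours saved when a task runs `speedup` times faster."""
    agent_assisted_hours = human_hours / speedup
    return (human_hours - agent_assisted_hours) * hourly_rate


def net_benefit(token_cost: float, human_hours: float, speedup: float, hourly_rate: float) -> float:
    """Value of the time saved minus what was spent on tokens."""
    return time_saved_value(human_hours, speedup, hourly_rate) - token_cost


TOKEN_COST = 150.0    # from the conversation: "$150 of tokens"
SPEEDUP = 10.0        # from the conversation: "10 times less time"
HUMAN_HOURS = 10.0    # assumption: hours the memo would take unassisted
HOURLY_RATE = 500.0   # assumption: what an hour of that person's time is worth

saved = time_saved_value(HUMAN_HOURS, SPEEDUP, HOURLY_RATE)
net = net_benefit(TOKEN_COST, HUMAN_HOURS, SPEEDUP, HOURLY_RATE)
print(f"time saved is worth ${saved:,.0f}; net of tokens: ${net:,.0f}")
```

Under those assumed inputs the token bill is a small fraction of the time it replaces, which is the comparison being argued for, as opposed to comparing $150 against a $10 monthly subscription.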
100%.
The way I always think about it is I anchor it around value.
What's the value I'm getting out of that?
The truth is with your board deck or whatever, it probably was the best because you had so much research support.
Yeah, totally.
Two more quick graphs and then I want to get into Hyperagent.
Percent of enterprise apps with embedded AI agents.
This is the fastest adoption curve in enterprise history.
When you see this, how do you react?
I am not surprised.
I think even this reflects the pace at which incumbents can even integrate AI into their products, right?
And I think even that is like stymied by like just incumbency and like, you know, kind of how seriously did enterprises, you know, enterprise apps or enterprise app makers or internal app teams like take this?
I think the real show of how profound this growth curve is, is like if you take the aggregate revenue created from zero of all the leading AI companies, right, or companies like doing AI things, like take OpenAI and Anthropic alone, right?
Let's just say they have a combined revenue probably of like 80 million plus, right?
Or 80 billion, sorry, plus right now, up from like basically zero a few years ago.
Like what in the history of software?
Like, has there ever been an industry where like any company, let alone like, or even an aggregate, like, you know, across all the companies, you got a category that went from zero to like, you know, 80 billion plus.
Right.
And that's not even including like all of the other AI providers, inference providers and like tooling, et cetera, like out there. Like, the revenue of, I think, the AI category is an even sharper curve.
I think that really reflects just how profound this lightning in a bottle is.
Totally.
And just from an opportunity perspective, it's like selling to these enterprises and helping them figure it out and just helping them transform is just a huge opportunity.
I think it's probably one of the bigger cash grabs in business history.
You know, there's kind of two angles, I think, you know, to create a very valuable business right now with AI as a wedge, right?
One is PLG.
And obviously we see a lot of these like PLG products.
I kind of put OpenClaw itself in this category because even though it's like not actually like a monetized business, like it is getting this massive amount of adoption, right?
And, you know, just the raw token consumption through OpenClaw is, I'm sure, in the many hundreds of millions, if not billions already, right?
And likewise, other products in the PLG genre.
So that's one way, just like let people use the AI thing that actually works, and you're going to get profound growth.
But the other is like to come in top-down Palantir style.
This is why OpenAI and Anthropic and like, you know, the big guys are also doing it.
There's new companies as well going after this opportunity, which is go pitch to every enterprise board and CEO.
Like we will fix your AI problem, pay us a massive check, like give us a hundred million dollar plus check and we will purportedly solve your problems for you.
Like that is an existential like risk mitigation that like every large company incumbent should be willing to pay.
Because frankly, like the CEO's choice is like, either I pay it and I risk wasting a hundred million dollars and maybe getting fired over it or like, I don't do anything with AI and I'm definitely getting fired over it.
So on a game theory level, it's like everybody's going to pay it right now.
Whether that actually results in like long term, substantial, structural, like, you know, kind of transformation to the business that probably could be run now with like five people, maybe instead of like 50,000 in some cases, that's a bigger question.
Yeah.
And this, you know, sort of speaks to my last point too, which is if you can help a company run a fleet of 20 agents doing customer intel, content production, competitive research, lead enrichment, all these different things.
This is the future of work in one image, right?
An agent command center, right?
So when you see this, your reaction?
I mean, look, that literally is a view in Hyperagent.
I feel like I'm looking at Hyperagent and I think this is the future, right?
We are building towards a world where, you know, it may not be that every company is like literally one person.
Right.
And we have a lot of like one person companies, you know, but I do think like every company will have a fleet of agents.
And, you know, what's interesting to me is actually that like, you know, agents are converging on like these purposeful, like they almost map to job roles that humans were playing.
Right.
And, you know, maybe it's a little bit like, why are hardware robots converging on a humanoid form factor?
Part of it is that a lot of the infrastructure of everything we have in our homes, on construction sites, in factories is built for human ergonomics.
So for the robots to effectively, you know, just kind of insert themselves seamlessly into the current infrastructure, they have to kind of have human-scale capabilities.
Right.
And so I think there's a kind of very similar phenomenon happening with agents.
I guess like five years ago, when people talked about superintelligence, I always imagined there's going to be just, like, the single omnipotent AI that just figures everything out and looks at everything all at once, like Everything Everywhere All at Once, right?
And I think now I'm more and more of the belief that there are going to be fundamental and always-present limitations on, like, context windows, for instance, right?
I just don't think we're ever going to get to a point where an AI model can have an infinite context window, right?
And I think there's like a physics to that, right?
Like you can just literally only have so much attention on like so much, you know, context at once.
And, you know, I think what that means is that like for the same reason why we partition humans into different roles and org structures so that not everyone in the company has to know everything and work on everything all at once, like I think the same is true for agents.
And so hence, like you get this like overview of agents that actually maps like to kind of intuitive human played roles really well.
And that's the really kind of interesting emerging phenomenon for me.
You know, I just recently like spent some time playing around with Paperclip, which was kind of fun because it literally creates the org chart metaphor.
But I think this is really exciting, right?
Where in a way it's both familiar, because we're not like just completely upending everything we knew about job functions and roles in the old world to the AI world.
And yet there is a rethink and reapplication of, okay, how do I play that content production role with an agent?
Right.
Well, I think we should get into hyperagent.
Let's do it.
Now's the time, right?
So for the listener, what is hyperagent?
Why are you building it?
And this is a show and tell podcast.
By the end of this episode, can you commit to giving all the sauce around how to use Hyperagent to build a business?
our take on like the Mac version of it.
Like we want it to just work to be secure.
It's cloud native.
Like, you know, you don't have to run a Mac mini. And perhaps most importantly, like, you know, Hyperagent is like applying a lot of the same design philosophy and like obsession with great UX that we applied to the no-code app category 10 years ago, but now to agents, right?
Meaning, like, apps are kind of complicated, right?
Like, you know, if you're a developer, even at that time, you could build a Rails app.
You had like a data layer, a logic layer, a view layer, but like it was kind of technical, right?
And we're very technical.
And the whole idea of Airtable was to distill that into a really intuitive experience.
In fact, we were very inspired by like the Macintosh, the GUI, like taking terminal-based command line computing and making it into something that like people could just grok immediately.
And so Hyperagent is really intended to be a very intuitive and visual way of using agents.
This is actually a task thread that I ran a little bit earlier.
This is actually one of your startup ideas, Greg, that we had Hyperagent work on.
Basically, the pitch was hyperlocal market reports for real estate agents generated from public data.
Basically, this agent went around and did research on the landscape of the market.
It ran a bunch of analysis.
It's got full coding capability.
It's got a full sandbox environment.
So it is running a full computer.
It's just one in the cloud, not your own computer.
And you can connect it to all your accounts if you want.
It can access your Slack and Granola and email.
It can send stuff if you want it to on your behalf or just pre-draft emails.
It's got already pre-configured ability to do things like pull from Twitter, use advanced tools like generate imagery or use Google Maps, et cetera.
But basically what happened was it went around and did all of this.
It researched the opportunity, right?
And then created this research brief.
And let me just show you what this one looks like.
This is kind of the business case for the idea you pitched, right?
I kind of love it because I actually think these, what I would call like medium sized markets, like it's not like a hundred billion dollar market, which is going to be super competitive and there's going to be massive incumbents going after it.
But I really love this idea of like the kind of like, maybe it's not micro, it's more like mini or medium market, like a couple billion TAM large, which is to say you can build a very lucrative business, even capturing like a double digit percent chunk of this, like you can make a few hundred million per year.
And yet like it's small enough to where really big guys are not coming after it, right?
So, you know, this agent created kind of a business case for it.
It found some really cool, like, user validation of the problem.
So it's like, you know, looked up Reddit, like, you know, and found like some real estate people who are actually saying, like, I need this product, right?
So it's kind of validating the market need.
Here's actually the current problem.
I didn't even know about this, but like, apparently, I guess there was some like legal thing that, you know, kind of changed, you know, kind of the dynamic of the market.
People don't want more software, like, you know, another tool with an interface and did like some competitive analysis.
Here's who else is out there.
And then kind of just put together the case for this, right?
But then, you know, better yet, like you don't just have to stop there, right?
You can go and like actually tell it to go and just build a V1 of the product.
So in this case, because Hyperagent has full coding capability, it just went ahead and like created a V1 of this product, right?
Which I think this will actually work.
Like, where do you farm?
Like, here's my report style.
It also looks really clean.
What's that?
Yeah, I mean, and honestly, a lot of this is just like, if you have a good frontier agent running a frontier model, i.e. like Opus, you know, 4.7 or GPT 5.4, like it just does a lot of this really well out of the box.
So any frontier agent powered by a frontier model should be able to create an app of this quality.
What's unique about Hyperagent is that it can do that perfectly well, but then kind of do that in the workflow of like, it's not just an app builder.
App building is just a feature now.
It's a commoditized feature.
And what it can actually do is like go and research the end-to-end of like, here's actually the business context of what I'm trying to do, and then build the app informed by it.
Right.
So it's more like Hyperagent is the founder in this case.
It's not just the developer, it's the founder.
One of the cool things I like about Hyperagent is like, it just comes out of the box with like really powerful tools.
So it has like, you know, Google Maps as a tool and it can actually go and like, let's say, I think I already did this, but like I wanted it to go and actually find like real Street View imagery of billboard locations.
So it knows how to use Street View to like find actual points of interest and then to take that image and use that as a reference seed image for like an AI image generation or video generation.
Right.
So like, I mean, another cool thing you can do with Hyperagent is you could tell it like, take this house.
And like, I want you to redesign the house using interior photos from Zillow or like the exterior shots.
And it will do that like really, really well, right?
So that's HyperAgent in a nutshell.
I can walk through some of the other stuff here.
You know, once you actually build like a lot of agents, then you get like this ability to start looking at like, well, what if I wanted to see, you know, not just my one agent, but an overview of all of my agents, right?
So this is not like a very built out account.
This would be like your first week of Hyperagent use.
But like literally that command center view that we talked about, like, you know, we want you to be able to create many different agents that each play a role.
Here's the content marketer.
Here's the market researcher.
Here's like the customer email responder. And like just manage and oversee an entire fleet of agents, constantly improve them, because we actually have this ability to go and like, you know, curate memory and skill improvements from every run that you do.
And then finally, to be able to deploy them into a team setting as well.
So if you wanted to take any of these agents and actually give it the ability to talk in Slack, right?
So I can actually say like, let me put this into Slack.
Let me have it always on, always listening, in fact.
And, you know, just sit there in my channels, listening to everything I'm talking about, my team's talking about.
And when I have something relevant to add to automatically chime in and then people can interact with me truly like I'm a, you know, I'm a virtual coworker.
Right.
And I think that's kind of part of the OpenClaw experience I've seen some of the power users achieve.
That's really quite magical.
Like your Slack coworkers are now agents in addition to humans and they're really smart and they have their own like expertise and context.
Like you get that with a single click out of any agent that you build in Hyperagent.
So you mentioned skills.
How do skills work on Hyperagent?
And how should people think about it?
Yeah, so skills are, I think, the most important concept or primitive in the frontier agents world.
Meaning the models are generally intelligent enough.
It's like, find Albert Einstein, who's obviously super smart in a general sense.
And he may not know how to solve problems in real estate.
But if you gave him, like, just the right, like, kind of briefing on, like, here's a playbook, here's a manual to learn everything you need to know to do this job in real estate, like, he's going to go and, like, figure it out pretty well, right?
And so what's really powerful about skills is, like, they're a really, really composable concept.
Like, you can interactively create skills.
So let's say I'm actually going to create, like, a new thread here.
Just keep this super clean.
Greg Isenberg like AI content.
Okay.
And so what's really powerful about this is like, no, don't create this.
Don't create this.
But, but worse enough that, you know, we don't take Greg's business.
But what's really cool about this is like, it's not going to just like go and like, like say, okay, like, you know, I'm just going to have a prompt that, you know, pretends to be Greg Isenberg.
It could actually go and like, you know, research how you actually do content.
So it's coming up with a plan.
The plan is like, I'm going to first go and like research your style, figure out like what platform I care about, like look at some of your actual posts and then distill all that into a skill that I can then pin to an agent or like just use on demand at any point, right?
So let's say, just for fun, like, what platforms do you want to post to?
Let's just say X for now.
We're going to have the skill only generate drafts, so it's not going to auto-post for you.
Is there any kind of content you want your Agent Isenberg to be focused on?
Yeah, let's do contrarian AI takes.
Okay, cool.
And then any topics beyond that, like solopreneur, bootstrap?
Okay, cool.
And then how do you want to use this agent if you end up using this agent?
Do you want to start with an idea?
Do you want it to just come up with ideas for you?
I don't want to do it anymore.
We'll go full autonomous, right?
Like someday we're going to have to see if like real Greg is actually just sitting at the pool all day.
It's just created the Greg avatar version of you and is doing everything on its own.
But okay, so now it's like going to go and like do some research about you and figure out like how to distill the perfect skill for Greg like into this skill.
How should people think about, you know, Hyperagent versus Perplexity Computer versus Manus versus OpenClaw itself?
Yeah.
So Codex, how do you see it?
So I think against Codex, you know, it's quite simple.
Like Hyperagent is a more general purpose agent platform, right?
I think against OpenClaw, like this is much more turnkey, ready to go, safe and secure by default, cloud native, like, you know, and I think just much more focus on like great UX, right?
With OpenClaw, like, when you actually have to go into configuration, or like you're trying to edit memories or do any kind of curation or configuration, it's, you know, quite raw, right?
It's like a very, you know, kind of raw product kind of feels like it's more for like very technical people who've become like expert at it.
I think Perplexity Computer and Manus are, like, the closest comps for Hyperagent.
The key difference is, like, one, you know, Hyperagent has more powerful tools out of the box.
And also, it has more focus on UX out of the box, right?
Like, you know, I've spent some time playing with both of those products.
I think they're great products.
And, like, you know, at their time, and, you know, or at least when Manus first came out, it was truly groundbreaking, right?
Like, it was the first kind of real, like, holy crap, like YOLO agent, like look at everything it did, kind of like before even OpenClaw, right?
Long before OpenClaw.
And so I think they were really kind of pioneers in this space.
With HyperAgent, like we've just taken a very UX focused approach.
So for people who like, you know, seeing visually and being able to like interact with the outputs and see more visually, like what the agent is doing, and have a more visual way of, you know, defining skills, deploying skills, creating agents, et cetera.
Hyperagent is just much more of like the Macintosh experience, right?
Versus the Linux.
I think secondarily, we've also kind of done a lot more to make Hyperagent immediately ready to run not just like one like agent.
Like I think the nominal experience for Manus and Perplexity Computer is still like, you use those products and you kind of have this like agent that's pretty awesome.
And, you know, you use it directly, right?
You can do that with Hyperagent.
That's exactly what we're doing here.
But it's also designed from day one with much more of like the scalability and deployability story in mind.
So meaning like once I have an agent that kind of works for me, I can now deploy it one click into my Slack channel.
And now everyone in my company can benefit from this agent just always on, like kind of chiming into conversations.
You know, they can ask it questions, it will respond.
You have the command center, that fleet view, where it's not just one agent; you can oversee your entire fleet of multiple agents.
And we even have things like, you know, the ability to oversee and curate like the learnings that keep making each agent better.
So like they kind of have this automatic self-improvement loop where over time they're accumulating not just new memories, but also like suggesting to you, hey, maybe you should add this additional skill or update or tweak the skill.
Or even like maybe you should go and actually try changing my agent system prompt or give me access to different tools so I can do this type of job better.
And best yet, like we actually have this concept of what we call rubrics, which is exactly what it sounds like.
It's like an eval rubric.
And what you can do with rubrics that's really powerful is actually like define what does good look like for a certain type of task, right?
So I could create one here that's like, what is a rubric for great Greg Isenberg content?
And what it basically does is I can then have a full eval loop where every time my agent runs, like once the Greg Isenberg skill is ready, I could say like, I'm creating the virtual Greg agent and I'm going to pin a rubric to that agent that then says every time Greg creates a piece of content, I want to score that content along the dimensions that you care about using a separate LLM as judge that fires off.
And then I can literally oversee like how well is my agent doing over time, right?
And if I want to double click in and inspect any one task run to see like how did it get scored, I can do so.
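As a rough illustration of the rubric idea described here, the sketch below scores one piece of content along weighted dimensions. Everything in it is hypothetical: the `Rubric` and `Dimension` classes, the dimension names, and `toy_judge`, which is a trivial string-checking stand-in for the separate LLM judge. It is not HyperAgent's actual API.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Dimension:
    name: str      # hypothetical dimension identifier
    weight: float  # relative importance in the overall score


@dataclass
class Rubric:
    title: str
    dimensions: list

    def score(self, output: str, judge: Callable[[str, Dimension], float]) -> dict:
        """Score one output; `judge` returns a value in [0, 1] per dimension."""
        per_dim = {d.name: judge(output, d) for d in self.dimensions}
        total_weight = sum(d.weight for d in self.dimensions)
        overall = sum(per_dim[d.name] * d.weight for d in self.dimensions) / total_weight
        return {"overall": round(overall, 3), "dimensions": per_dim}


def toy_judge(output: str, dim: Dimension) -> float:
    """Stand-in for an LLM judge: crude string checks, not a real model call."""
    text = output.strip()
    if dim.name == "hook_in_first_seven_words":
        first_sentence = re.split(r"[.!?\n]", text, maxsplit=1)[0]
        return 1.0 if len(first_sentence.split()) <= 7 else 0.0
    if dim.name == "no_long_blocks":
        return 1.0 if all(len(p.split()) <= 40 for p in text.split("\n\n")) else 0.0
    if dim.name == "no_generic_ending":
        return 0.0 if text.lower().rstrip("?!. ").endswith("what do you think") else 1.0
    raise ValueError(f"unknown dimension: {dim.name}")


rubric = Rubric(
    title="Great Greg-style content",
    dimensions=[
        Dimension("hook_in_first_seven_words", weight=2.0),
        Dimension("no_long_blocks", weight=1.0),
        Dimension("no_generic_ending", weight=1.0),
    ],
)

draft = "Most AI agents fail quietly.\n\nHere is why:\n1. No evals\n2. No feedback\n\nWhat do you think?"
result = rubric.score(draft, toy_judge)
print(result)  # the generic ending drags the weighted score down to 0.75
```

In a real setup the judge would be a call to a separate model, and each result would be stored per task run so the scores can be inspected later.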
So we basically have this complete, full loop.
It's not just that you get a day-one agent or thread experience that works really well out of the box.
And it's not just that you can curate agents, deploy them, and improve them over time.
It's that you have this complete observability layer and kind of this orchestration story, where you can actually just look at all of your agents running all the time and see how they're doing.
And so if I pinned the eval rubric to any one of these agents, I would see the trend line of how it's scoring.
It could then automatically suggest, hey, maybe I can reduce the model quality.
So I dropped from Opus to Sonnet and got a five times reduction in cost.
And the score didn't go down much, right?
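The trade-off just described, a cheaper model for a small drop in rubric score, can be written as a simple check. This is a minimal sketch under stated assumptions: `should_downgrade`, the 0.05 tolerance, the sample scores, and the cost units are all made up for illustration (the costs are chosen only to mirror the five times reduction mentioned) and are not real pricing or HyperAgent behavior.

```python
from statistics import mean


def should_downgrade(current_scores, candidate_scores,
                     current_cost, candidate_cost,
                     max_score_drop=0.05):
    """Suggest the cheaper model if the average rubric score barely moves.

    Scores are rubric scores in [0, 1]; costs are per-task costs in any
    consistent unit. `max_score_drop` is an assumed tolerance.
    """
    score_drop = mean(current_scores) - mean(candidate_scores)
    cost_ratio = current_cost / candidate_cost
    return {
        "score_drop": score_drop,
        "cost_ratio": cost_ratio,
        "downgrade": score_drop <= max_score_drop and cost_ratio > 1.0,
    }


# Hypothetical rubric scores from recent runs on each model tier.
larger_model_scores = [0.84, 0.86, 0.85]
smaller_model_scores = [0.82, 0.83, 0.81]

decision = should_downgrade(larger_model_scores, smaller_model_scores,
                            current_cost=5.0, candidate_cost=1.0)
print(decision)
```

The tolerance is the judgment call: set it too loose and quality erodes quietly, too tight and the cheaper model is never adopted.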
So just once people actually start running agents at scale, these kind of secondary capabilities become really critical because it's not just about, can I get one agent to do one thing?
But how do I oversee and run an entire business with many different agents and ensure consistent quality?
Which is a big deal because, for example, if you're using Manus, who is the judge around the output?
The judge is you, the human being.
It's not Opus 4.6, it's a human being.
So if you're trying to actually create what we were talking about before, which is an agent-first business, managing a ton of agents, realistically, you're not going to have the bandwidth to be looking at every single output at all stages, right?
Yeah.
It's kind of like management 101, right?
But applied to agents now, where it's like, as you scale up, if you're the CEO of a business, you just literally don't have time to go and look at every single thing that every single person in the company has done.
And so you need to create better automated checks and balances to oversee what the agents are doing, right?
And like inspect quality of work, right?
This would be like, if you actually had a giant army of human content creators, you would want some scalable way to detect whether they're posting good or bad content, right?
And then know, okay, we've got to tweak the guidelines for each of these people.
Okay, so now we have the Greg Isenberg contrarian draft skill.
And I'm gonna go ahead and save this skill.
And I'm going to try seeing like, OK, let's do a dry run.
It's going to scan today's AI news and trends and then create some contrarian drafts.
Right.
And the whole idea here is, look, it's probably going to do an OK job on the first effort here.
It did some research about you.
It has a lot of context about how you work, right?
And if I wanted to see more about this skill, I could actually open it up.
Here's when it should be used.
Here's the actual kind of skill contents.
Greg's voice is a smart friend at dinner saying the quiet part out loud.
Not a corporate communicator.
I would agree with that.
You know, you've been inside all these companies, blah, blah, blah, like doesn't mean be a jerk.
I think it's very astute, like you're loud, but like not annoying or like kind of rude.
And then actually, I'm curious if you agree with some of these stylistic things, right?
Like you got a hook in the first seven words.
You know, you don't want like long blocks of text, which I'm guilty of.
So I should take some of this Greg skill and apply it to myself.
You love ordered lists.
Never end with "what do you think," which is super generic.
So let's just say this is a pretty good v1, maybe it's 50% of the way there.
But the idea is that these skills should be evergreen, right?
It's not one and done.
The whole point is that every time I use this skill, it can improve, either automatically, with the LLM generating learnings and suggestions to improve itself, or because I am looking at the content and saying, oh, that's not quite right.
Like, here's why you got that wrong.
Like you can interactively tweak and improve the skills and performance of the agent over time.
So I think the challenge that a lot of people face is, they one-shot something.
It's not quite as profound as what they hoped for.
And they kind of give up.
Right.
And so my strong guidance and urging to folks, and I think this is very aligned to how you've thought about it, is don't give up after the first shot, right?
Because it's very, very clear that the agents are powerful enough to do almost anything you want them to do.
And the issue is not whether the agent is capable, or whether you should give up on it.
It's whether you are able to invest the kind of time and coaching and like curation to get it there.
And I think that like, it is well worth it, right?
If you get it there, it's obviously going to be so much leverage for you.
What's the value of having an always-on employee that just does the things that you care about, behind the scenes, at all times, and runs for trivial cost relative to the cost of hiring a new employee?
Well, it's like real life too, which is like, you know, when I first started playing tennis, I was bad at playing tennis.
And when I would go to play tennis, I almost didn't want to go because I was like, I'm bad at this.
But you go through the messy middle and you get better and better and over time, then you end up, wow, this is a lot of fun.
I think that once you get to the point where it's a lot of fun and it does feel like the outputs are really good, the truth is 99% of people are not putting in the work to get the great outputs.
This is the arbitrage.
It's for people to actually invest in spending time to optimize and get it to a place where it's high quality.
Absolutely.
One of the Benchmark partners sent out this memo.
It was basically a wake-up call to all of the portfolio companies to get with the program and really radically rethink how you operate your business immediately with AI.
The assumption is, you probably think you're doing some things with AI.
You have an AI center of excellence.
You have this AI feature, but it's not enough.
Right.
And the kind of parable that they ended with was, imagine there's two friends back in, call it, 2003.
And they're both going door to door selling knives, right?
Or some other in-person, offline product.
And one of them decides, every night and weekend, I'm going to spend 30 minutes trying this new Google AdWords thing to get some extra, supplementary leads for my business.
And one month, they grow a little bit of revenue from the SEO or the SEM thing.
Next month, they get a little bit more.
And the other person is like, this thing is awesome.
Like SEM is awesome and it's early, but I need to figure it out.
And so they stopped going door to door and selling knives at all.
And they just spend the next few months focused on, how do I get this entirely internet business to work, right in the early days of it?
For, you know, two months, they have zero revenue.
They're living off their savings.
But they slowly start to get this thing humming, right?
And they get really versed in the best SEO and SEM techniques.
And how do I create an e-commerce website that allows people to transact directly there, versus just giving them a number to call me?
And, you know, the end of the story is like, OK, like project forward like five years.
Where do you think each of those people is?
Right.
And like the obvious answer is the second person has probably built like one of the early multibillion dollar e-commerce businesses and just like carved off like the next Amazon.
Right.
And the other person is like probably still selling door to door, which is getting harder and harder.
That market is kind of shrinking.
And so I think it is one of those things where you kind of have to hit a reset moment.
And what feels like maybe experimentation, and not actually bringing home the bacon, actually is the most profound thing you can do to create real business leverage, not even in the two-year time frame, but maybe even in the six-month time frame.
And I'm curious, in your experience, when you see solopreneurs doing this, what is the average break-even point?
Literally either in terms of getting to the point where you can self-sustain a full-time business and that becomes your paycheck, or just even where it feels like it's starting to pan out.
I think that there's multiple milestones that people hit.
It's a game of confidence.
When you make your first internet dollar, no matter what it is, it rewires your brain.
If you can take an idea and make $1 from a stranger, just $1, it's going to rewire your brain.
Then I think once you get to $10K a month, just something about that number, for the most part, once you hit that, you're probably quitting your job.
You're probably going all in.
You're probably like, okay, there's something here and there's a path to something bigger.
I think that with respect to agent products and products like this, the mistake I think a lot of people make is they try it too sporadically.
So what I encourage people to do is to actually try the product every single day for a certain amount of time.
So commit to 30 days, 60 days, 90 days, some amount of time so that every single day, it's like in your calendar.
Literally, I have it in my calendar: 30 minutes here, 30 minutes there, right?
And that's what gets you to be a top 1% agent builder, right?
Because you make it a part of your workflow and then you end up seeing outsized returns because it compounds.
That makes sense.
I mean, it's kind of like, I'm not a writer, but I've heard from writer friends that the most important thing is not to wait for the one weekend where you're going to have the spurt of brilliance and write the whole screenplay or the whole book all in one go.
But it's like, you have to force yourself to write like some pages every single day, like no stops.
And like some of them are going to be crappy pages, but like the forced habit, like just gets you better and better and better.
And then it becomes like natural.
And so I could see that being very applicable and kind of like analogous here for the world of like getting agent savvy.
So do we have some tweets?
So, okay, let's look at this.
Let's see.
The consensus narratives are, oh, this is not loading for some reason.
The consensus narratives are getting louder.
Every medium post reads like the last one.
Okay, so here's one.
The 10K month AI entrepreneur boom is mostly content farm fiction.
They say 82% of US businesses have zero employees.
What do you think about this one?
I mean, what I like about it is, when I do tweets, because I'm a human being, largely there's no data.
I have a hot take.
What's cool about this is there's research.
The truth is people obviously want data associated with their tweets.
With a team of hyper-agents doing all the research for you and coming up with content ideas, now you have time.
This is kind of cool.
Is this true, that Med V is actually not a legitimate business?
I actually hadn't... I'd followed the first arc of that story, which is, oh my God, this thing is so massive.
But, I mean, it's a little bit of a letdown for the billion-dollar startup story.
But, you know, maybe there's a take on it that says, no, it's still possible for real, this guy just kind of gave us all a bad reputation.
Your AI agents didn't replace your VA, blah, blah.
These are all what I would call kernels for really great tweets.
The cool thing is I could give it feedback.
As an example, let's say I want to give you feedback on your skill.
What's one thing that you want to give it some feedback on?
I would say the tweets that tend to do well sound very friend-to-friend.
Do these all just feel a little too...
They're not colloquial.
Exactly.
These feel a little too formal or stiff or something.
Exactly.
That's something I would notice.
What we can do...
Like we would put this in the eval, right?
Yeah, you could do both.
So one is, you could immediately go and update the skill based on this feedback.
You could also have it immediately turn around a new draft of these tweets, right, to sound more colloquial.
And then finally, to your point, I could go and create a rubric that actually says, okay, here's the five dimensions I care about, and then auto-evaluate every future output, right?
So you kind of have a number of different options depending on how far you want to go right now.
If you just want to get your job done right now and you don't want to bother with a rubric, you don't have to, right?
But eventually, you get to the point where you want to set up a scalable system for this to just constantly work and get better and better.
And that's the point at which you would do a rubric, which is not that hard, actually.
You can either go in through the UI and build one, or you can actually, in this chat, say, help me build a rubric to score great Greg-style content, which I'll queue up for after it updates the skill.
And then it will go and help me create that rubric, save it, pin it to this agent or to this skill, and then automatically run it every future time I create content.
And is it possible to, for example, get an email every single day at 8 a.m. with some ideas?
Yeah, you absolutely can.
So the way to do that would be, in fact, you could just tell it in the thread, can you turn this into a recurring daily email at 8 a.m.?
And so then what it's going to do is say, I want to now save this thread into an agent, and the agent is going to be given a run schedule of, every day at 8 a.m., go and do this thing.
We're actually about to ship something that we're calling live mode, which is kind of inspired by the OpenClaw heartbeat behavior.
You could already have configured an agent to do this just by saying, I want it to poll every 30 minutes, but we're making it much more of a first-class thing within HyperAgent, where you can literally just click a button and turn any agent or any thread live.
And then the feeling is going to be, wow, this thing is just constantly on, looking at all of the new tweets out there, coming up with new ideas, and then pushing them to me via Telegram, over email, or in Slack whenever it comes up with new stuff.
So the UX or the mental model is meant to be, wow, this just becomes an always-on, 24/7 agent that pushes ideas to me, or can even go and preemptively draft and post content.
If you wanted it to go full YOLO, you could actually have it just go and tweet the content itself, right?
Good old full YOLO mode.
Yeah.
Yeah, I don't recommend full YOLO mode, just because, I mean, there's no need for something like this, right?
For X specifically, in order to win, if you can get one good tweet out every single day, that's all it is.
Yeah, no one, you know...
And that just means that you can batch these, you can schedule them out, but just look at it and make sure that it's high quality and meets your bar.
I think it's definitely worth it for this specifically.
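The two scheduling behaviors mentioned in this exchange, a daily 8 a.m. run and polling every 30 minutes, come down to simple time arithmetic. The sketch below is only an illustration: `next_daily_run` and `poll_times` are hypothetical helper names, and it uses naive local datetimes, so time zones and daylight saving are ignored.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, at: time = time(8, 0)) -> datetime:
    """Return the next occurrence of `at` strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has passed, use tomorrow
    return candidate


def poll_times(start: datetime, every: timedelta, count: int) -> list:
    """Return the next `count` heartbeat times after `start`."""
    return [start + i * every for i in range(1, count + 1)]


now = datetime(2026, 4, 29, 9, 15)
print(next_daily_run(now))                             # tomorrow at 08:00
print(poll_times(now, timedelta(minutes=30), count=3))  # 09:45, 10:15, 10:45
```

A scheduler would sleep until the returned time, run the agent, then compute the next slot again.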
Yeah, that's fair.
I think that content is a very hits-driven business, and so fewer high-quality hits is what matters.
But there are tons of use cases where, like, maybe for my own emails, right?
There are a subset of emails that are low stakes, where I want HyperAgent to not only automatically draft a reply, but, if it feels confident and it's not a sensitive kind of situation, just go ahead and respond to it.
Right.
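The policy described here (send automatically only when the email is low stakes, not sensitive, and the agent is confident) can be expressed as a small gate. This is a sketch under assumptions: the `DraftReply` fields, the `decide` function, and the 0.9 confidence threshold are hypothetical, and how an agent would estimate confidence or sensitivity is left out entirely.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SEND = "send"
    DRAFT_FOR_REVIEW = "draft_for_review"


@dataclass
class DraftReply:
    body: str
    confidence: float  # agent's self-reported confidence in [0, 1] (assumed field)
    sensitive: bool    # e.g. legal, HR, or pricing topics (assumed field)
    low_stakes: bool   # e.g. simple scheduling or acknowledgements (assumed field)


def decide(reply: DraftReply, min_confidence: float = 0.9) -> Action:
    """Auto-send only when all three conditions hold; otherwise keep a human in the loop."""
    if reply.low_stakes and not reply.sensitive and reply.confidence >= min_confidence:
        return Action.SEND
    return Action.DRAFT_FOR_REVIEW


scheduling = DraftReply("Tuesday at 2pm works.", confidence=0.95, sensitive=False, low_stakes=True)
contract = DraftReply("We accept the terms.", confidence=0.97, sensitive=True, low_stakes=False)
print(decide(scheduling))  # Action.SEND
print(decide(contract))    # Action.DRAFT_FOR_REVIEW
```

Defaulting to a draft whenever any condition fails keeps the failure mode cheap: the worst case is an extra review, not a wrong email sent.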
It could be simple inbound inquiries from internal folks saying, hey, when do you have time to meet?
It can just preemptively go ahead and suggest a time, right, or even pre-book it on my calendar.
Or customer emails that are innocuous, or where they're trying to give input on a feature.
It could just compile all that feedback for me as a report, but then respond with a smart, personalized acknowledgement to the user, or even ask for clarification.
And I think you all have a ton of connectors built into HyperAgent, right?
Yeah.
So what's really cool actually is that not only do we have a ton of connectors that just work out of the box, where you click a button and OAuth in the thread, right?
So maybe starting a new one, I could say, what's a tool that you want to use with HyperAgent?
It could be like Granola.
Notion maybe?
Okay, yeah.
Can I connect to Notion and pull in all my notes?
And so it will just, in the thread, say, hey, here's an OAuth link, connect to your Notion.
But arguably one of the most powerful parts is like, even for things that we don't have a connector to, like, let's say there's some like very obscure API that you're trying to work with, right?
You could basically have HyperAgent go and learn that API.
So actually, I'll say, actually, never mind on this.
Can you instead help me build an API integration to... what's some fairly new tool that you know of that has an API?
I'm assuming, well, do you have Linear built in here?
We do have a connection to Linear, but actually maybe Twilio could be a good example, right?
Like where you can't OAuth into Twilio, so it has to be an API skill.
And we may have a pre-built connector, but I'm going to have it build a custom skill regardless.
So, can you still help me build a custom skill to integrate with Twilio via API?
And so now what's going to happen is it can go and research the Twilio API docs, create a skill for itself to use the API, and then actually ask me to enter my credentials in a safe way and then be able to use the Twilio API fully.
So I think the powerful thing now is a frontier agent should be able to like literally do anything.
Right.
Like, but it's just a matter of like, you have to give it access to the right context and you have to like, you know, tell it like, Hey, like, yeah, you should build a skill for this.
So then it can do it every single future time effortlessly.
Let's say we want to do SMS and voice for now, maybe phone numbers.
We'll do API key auth, and, any specific workflows? I think, maybe, actually I want to build a voice and SMS service that can call restaurants for reservations or something, right?
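To make the custom API skill idea concrete, here is a minimal sketch of what the SMS part of such a skill might boil down to. It builds, but deliberately does not send, a request to Twilio's Messages resource, based on Twilio's public REST API as commonly documented (the `2010-04-01` base path, form-encoded `To`/`From`/`Body`, HTTP Basic auth); check the current Twilio docs before relying on it. The helper name `build_sms_request`, the environment variable names, and the placeholder credentials and phone numbers are assumptions, and this is not what HyperAgent generates.

```python
import base64
import os
from urllib.parse import urlencode
from urllib.request import Request

TWILIO_API_BASE = "https://api.twilio.com/2010-04-01"


def build_sms_request(account_sid: str, auth_token: str,
                      to: str, from_: str, body: str) -> Request:
    """Build (but do not send) a POST request that would create an SMS message."""
    url = f"{TWILIO_API_BASE}/Accounts/{account_sid}/Messages.json"
    data = urlencode({"To": to, "From": from_, "Body": body}).encode("utf-8")
    token = base64.b64encode(f"{account_sid}:{auth_token}".encode("utf-8")).decode("ascii")
    req = Request(url, data=data, method="POST")
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/x-www-form-urlencoded")
    return req


# Credentials come from the environment rather than being pasted into a prompt.
# The variable names and fallback values are placeholders for illustration.
sid = os.environ.get("TWILIO_ACCOUNT_SID", "ACXXXXXXXXXXXXXXXX")
secret = os.environ.get("TWILIO_AUTH_TOKEN", "placeholder-token")

req = build_sms_request(sid, secret, to="+15555550100", from_="+15555550199",
                        body="Table for two at 7pm?")
print(req.get_method(), req.full_url)
# Passing `req` to urllib.request.urlopen would actually send the message.
```

Keeping request construction separate from sending makes the skill easy to inspect and test without touching the network.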
If you're listening to this and you're not fired up about building a business right now, the fact that you can do this is crazy.
Say this is the first time someone is hearing about HyperAgent.
They want to get started, and, you know, what's a plan for them? What should they do?
How do they get started?
How do they get the most out of HyperAgent?
I think often the hardest thing to get over is not how to use the product.
Like I think, you know, our users have said like, wow, this product is like super intuitive.
Like I can usually just like ask the agent to figure something out and it goes and does it.
So it's not like I have to learn like a ton of new like configuration or UI or anything.
I think the hardest part is actually picking the right problem or the right business opportunity you want to try to attack with HyperAgent, which HyperAgent actually can help you brainstorm.
In fact, we just shipped a new, better onboarding flow, where instead of just landing you into a generic, kind of empty canvas where you have to just pick a new thread, and, you know, we have some templates and so on.
Now when you first land in, it's going to suggest, hey, do you want to connect me to all of your context?
So, connect me to your Gmail and to your Slack and to your Notion and Granola.
And what I'll offer to do is actually go and research you in your context.
So I want to read through a bunch of your past week's emails and Slacks and look at your past Granola meetings.
You know, of course, all that context is private to you.
But now HyperAgent is going to be able to suggest to you, hey, based on everything I've learned about you, here's some use cases that might be relevant to you.
So it seems like you're a VC.
Maybe you're like doing a lot of deal flow.
I could create an agent to just go and automatically like, you know, kind of summarize and do research on every investment pitch that you get.
Right.
So you can turn me on all the time.
I'll just run in the background and then ping you every single time you get an inbound pitch.
Or you can even have it learn the behavior to thread a private reply to any email that you get inbound from a founder.
Right.
So you get an inbound pitch.
HyperAgent, on behalf of you, sends you and only you a threaded reply within that email chain saying, hey, I researched this company.
I also summarized all the materials.
Here's what you should know about them.
Right.
But the whole idea is that HyperAgent itself can help you identify use cases, or you could come in just with a really broad prompt, like, kind of interested in building a solopreneur business.
I don't know.
I'm kind of interested in like real estate.
I want to pick one of Greg's, you know, kind of ideas that are open source, like help me plan this out.
Right.
And it will do a very good job of like going and running with you on that.
So I think the main thing is like, don't get stuck in the blank slate starting point problem.
Like just come in and like, you know, figure out some place to start.
Maybe it's your personal context.
Maybe you come in with an idea, but once you start getting into it, it just sucks you in even more, because you realize all of what you can do, and it's just so powerful.
You can't help but get better and better at it.
Last question before we head out.
I was just talking to someone on this podcast talking about Hermes Agent.
One of the things we were talking about is when you're picking one of these platforms, be it OpenClaw, HyperAgent, Codex, whatever, you're sort of investing in an ecosystem.
My question for you, Howie, is why should someone invest in the HyperAgent ecosystem?
Where do you see HyperAgent going over the next few years?
We have a lot of experience building great PLG products.
Obviously, Airtable itself is a PLG product that also scaled up to where real, serious businesses still run their major operations on it.
Whether it's really, really large, kind of Walmart-scale companies, or the OpenAIs of the world, but also, you know, we have really innovative, fast-moving SMBs.
Some of the fastest-growing companies, like Mercor, run a lot of stuff on Airtable.
And I think the experience that we have is of building a product that's both extremely low floor and intuitive, but then also has a very high ceiling and scales up even as you need to scale up the number of agents you have, how you deploy them, how you oversee them.
That's our commitment: we are going to be the best at giving you both a low floor and a high ceiling, especially as you want to actually run a serious business or operations with HyperAgent, right?
So I think that's going to be kind of unique, where I see the landscape fragmenting.
There's going to be really easy, fun prototyping tools and products that are easy to get started with, but then ultimately don't scale with you as you want to become a real, serious enterprise built around these agents.
And then conversely, there's going to be more heavy kind of agent builder products, right?
With configuration and controls and all that stuff, that are going to be better from a control plane standpoint, from a being-able-to-oversee-a-fleet-of-agents standpoint, but make the initial experience and the graduation path a lot more clunky, right?
Or just a really sharp wall to overcome.
So I think our commitment is this product is going to be the best combination of low floor and high ceiling.
And we're always going to have this obsession with great UX.
Like that's our DNA.
That's like what I obsess over.
And the only kind of company that I want to build is one that wins in a product category where the value of the software or the technology is very, very high, but the accessibility is really kind of the key differentiator that we win on, right?
So agents are going to be powerful.
We're not going to be the only powerful agent product out there.
I think frontier agents are all going to get better and smarter and faster and so on.
But what we can do is use really great product design, just like Apple did with computing, to make the powerful experience also really accessible.
Hyperagent is the most visual agent builder I've ever seen.
It reminds me of a desk.
I'm looking at my desk, it's a wood desk right now, and I have a paper over here and some scribbles over here and my iPad over there.
To me, that's what Hyperagent feels and looks like.
It feels like a desk that I'm visualizing.
So I think for people who connect like that, and I'm certainly one of those people, I think a lot of people are just going to be like, sign me up.
Some people use their computer through the terminal all day, every day.
Yeah.
Howie, some people, what they love doing is obsessing over tuning every single detail and stuff like that, and for those people, an OpenClaw might be for them, right?
But if you want more...
Yeah, but I believe that you don't have to sacrifice the tunability or the power.
And so one of our strong design philosophies here is that HyperAgent still does give you a lot of control.
You can go and tweak agent configuration if you want to.
If you want to choose the exact model and system prompt and tools and give it a lot of refinement, you can.
And you can go quite far in terms of curating memories.
We actually just shipped yesterday a kind of defrag tool for your memory, so that as you accumulate more and more memories across all these different agents, you have this really elegant way of defragging them, right?
We can auto-suggest related memories, clustered by both keyword as well as embedding similarity, so that we're actually understanding the content of the memories, and you can consolidate them.
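One way to picture clustering memories by both keyword and embedding similarity is the sketch below. It is an assumption-heavy illustration, not the product's implementation: the function names, the thresholds, the choice to link two memories when either signal is strong, and the tiny hand-written three-dimensional vectors (standing in for real embeddings) are all made up.

```python
import math
import re


def keyword_overlap(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two memory texts."""
    ta = set(re.findall(r"[a-z0-9]+", a.lower()))
    tb = set(re.findall(r"[a-z0-9]+", b.lower()))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def cosine(u, v) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cluster_memories(memories, keyword_threshold=0.5, embedding_threshold=0.9):
    """Group memory indices; two memories are linked if either signal is strong."""
    parent = list(range(len(memories)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(memories)):
        for j in range(i + 1, len(memories)):
            (text_i, emb_i), (text_j, emb_j) = memories[i], memories[j]
            if (keyword_overlap(text_i, text_j) >= keyword_threshold
                    or cosine(emb_i, emb_j) >= embedding_threshold):
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(memories)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# (text, toy embedding) pairs; real embeddings would come from a model.
memories = [
    ("Greg prefers short punchy tweets", [0.90, 0.10, 0.00]),
    ("Greg likes tweets that are short and punchy", [0.88, 0.15, 0.02]),
    ("Weekly report goes out on Monday", [0.00, 0.20, 0.95]),
    ("Send the report every Monday morning", [0.05, 0.25, 0.90]),
]
print(cluster_memories(memories))  # [[0, 1], [2, 3]]
```

Each resulting cluster is a candidate for consolidation into a single memory, with the user approving the merge.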
But, you know, we want to really serve both people who are power users, who want control over how the agent is set up so they can get maximum bleeding-edge performance.
But then also, you know, like you shouldn't have to do all that to get value out of the product.
So it really is about the range.
I think it's more just that if you are truly happy just doing it all yourself through a very low-level command line interface kind of experience, and you're okay not having the control plane, the deployability, the ability to oversee many agents and deploy them at scale and manage across a team, then maybe those people aren't going to appreciate HyperAgent as much.
Totally.
Well, I'm stoked to see how it evolves.
Thanks for doing a little show and tell.
You got me fired up.
I'll include links where to follow you, but also where to sign up to HyperAgent in the description, in the show notes.
And we're going to do a really generous credits giveaway for your listeners.
One of the benefits of launching HyperAgent within Airtable, which is a half-billion-revenue business: we're going to generate 100 million of free cash flow this year, and we have over a billion dollars on our balance sheet.
That's not to be pretentious about it, but we've built a good, growing, and profitable business with Airtable that allows us to be even more generous and liberal.
We just want to get people to really adopt HyperAgent and get value out of it.
And we want it to become the standard, right?
Like we want it to become like the iPhone.
So, you know, we're willing to be very, very generous.
We're not trying to make money and nickel-and-dime people on pricing.
In fact, we're giving away multipliers to your audience and early adopters, just straight-up cash that gets applied towards real model costs, including Opus, which now, as a lot of the OpenClaw community has gotten kind of sad about, you can't get subsidized credit for in OpenClaw.
But you can use Opus, you can get the frontier models, and you can get them much more cheaply, because we're willing to subsidize it through HyperAgent.
Well, this is a group of people listening to this who appreciate that, because this is a group of people who actually listen and actually go and build stuff.
Thanks for the love, Howie.
I love the solopreneur and small early stage startup and small business owner audience.
I think it is where more AI innovation is going to happen far faster than frankly within many large incumbent companies.
You just have the agility.
The only thing keeping you from going and deploying agents everywhere is just your willingness and putting in a little bit of time.
Right.
But we're already seeing in our early adoption base with HyperAgent that some of these small shops have become super sophisticated really, really fast, and are running their operations in a kind of game-changing way that, frankly, a 50,000-person company would not be able to do for a much, much, much longer time, right?
A company like that just has all kinds of reasons why it wouldn't be able to go and pivot on a dime.
So I think this is a really, really awesome audience.
And I kind of live to see entrepreneurs do awesome stuff, right?
So it's super exciting to be plugged into the community.
And I want to see your listener base generate, like, $100 billion, you know, kind of legit companies with less than five employees.
Thanks a lot, Howie.
I'll see you next time.
Awesome.
See you.