# Clem Delangue: China Leads Open Source, LLM Bubble Risks

**Podcast:** a16z Podcast
**Published:** 2026-05-22

## Transcript

The idea of restricting a technology like AI based on risks is just like, for example, you would say, okay, some people can punch other people, so let's tie down everybody's hands, right?
Because it's too dangerous.
Some people can punch, right?
But in reality, you don't want to do that because your hands are so useful.
The way you want to control it is untie everyone and then regulate or fight the bad actors.
So, for example, hacking, that creates cybersecurity risks.
It's illegal, right?
So you have to fight it, but not by preventing everyone from getting these capabilities.
Otherwise, you slow down progress, you create massive gaps in terms of controls, in terms of capabilities, and you create actually additional risks.
This episode originally aired on MTS.
Open source software built much of the modern internet.
Linux, Apache, Kubernetes, and even the transformer architecture behind ChatGPT all spread because researchers and developers could study, modify, and improve them in public.
But AI is increasingly moving in the opposite direction, with the most powerful models distributed behind closed APIs, controlled by a small number of companies.
At the same time, China has emerged as one of the biggest contributors to open-source AI, while debates around safety, regulation, and access are becoming more politically charged.
And now those same tensions are extending into robotics, where AI is beginning to move off the screen and into the physical world.
Theo Jaffe and Sofia Porcini speak with Clem Delangue, CEO at Hugging Face.
We are live here on MTS with Clément Delangue, who is the CEO of Hugging Face, which has been really an incredible resource for anyone who's interested in large language models, and especially open-weight large language models.
I've been a Hugging Face user for a while now.
So it's great to have you here.
Clem, thanks so much for coming on MTS.
Yeah, of course.
Thanks for having me.
Absolutely.
Okay, so you are a big proponent of open source.
First of all, you believe that open source is a very important thing for innovation and competition.
So can you compare and contrast sort of like the open source environments in the U.S. and China to start?
Yeah, so I mean, historically, the U.S. was super, super strong with open source, right?
That's kind of like what led to the current AI revolution, right?
Like the T in ChatGPT is actually coming from Transformer, which was open-sourced by Google.
Unfortunately, for the past few years, this trend has changed and things tended to kind of like close down in the US, with frontier labs more kind of like sharing their models behind closed-source APIs.
China saw the complete opposite movement.
They're the strongest open source contributors today.
If you ask most startups, most academia in the US that are using open source, they're usually using Chinese open source models, right?
You've probably heard of DeepSeek, of Qwen, of Kimi.
There are kind of like a bunch of companies and organizations in China contributing massively to the field of open source.
Great.
So you recently said we're in an LLM bubble.
What makes you think that?
Well, I was asked if we were in an AI bubble and I said we're probably not kind of like in a bubble for AI as a general field.
But I feel like if there's one specific domain of AI where there's so much investment that there's maybe a risk of overinvesting, it's large language models distributed behind APIs, right?
Like you see the building of crazy data centers for it.
And obviously you see a lot of revenue growth, but with kind of like uncertain margins and uncertain long-term sustainability and moat for it.
So if there's a bubble, it's probably an LLM bubble, but we'll see what happens in the next few months.
Well, you're a big proponent of open source, you know, as we all know.
But do you think that labs should ever restrict releasing their models in an open source way for safety reasons?
Like, yeah, in 2022, 23, it was way too early for that.
The models at the time were toys.
But now we have stuff like Claude Mythos, which supposedly can really assist people with cyber attacks.
We have models that are increasing pretty dramatically in bio capability, which could be even scarier.
So do you think companies should still be releasing their models as open source?
So the interesting thing is that we've had these conversations and this kind of like talking point for a while in AI, when we were early at Hugging Face, I think six, seven years ago.
At the time it was GPT-2 and there was already like a lot of people saying that it was too dangerous to release in open source at the time.
It was six, seven years ago when basically it was nothing more than just an auto-complete.
I think we've seen progressively that these were quite overblown.
And I think they're also overblown today, right?
And the point is that, you know, Mythos, I think when it was announced, was it like three weeks ago, a month ago?
It was crazy dangerous and now it's starting to be deployed kind of like everywhere, right?
I think they just got access to the first international organization in South Korea.
I think yesterday or something like that.
And probably in a few weeks or in a few months, everyone is going to be using Mythos and not kind of like destroy the world as a result.
So I think with the current models, it's safe to release beyond APIs.
It's safe to release in open source.
And it's actually the safest way because it gives everyone the capabilities to not only build the systems, but also build the protection systems.
So if we talk, for example, for cybersecurity, the biggest risk is that a few players have capabilities that other people don't have.
And so the attackers could have capabilities that the defenders don't have.
Whereas if you make it more open, actually, it's usually easier for the defenders to react and kind of make the whole system safer.
So that's kind of like what we see with each release, where there are always kind of like overblown concerns before, and then progressively just we all adapt and the benefits kind of like outweigh the risks.
Yeah, it feels like we'll still be dealing with this problem in like 50 years where somebody releases like some sort of like open source robotics, you know, robot or program or something.
And then everyone is like, no, you shouldn't have done that.
It's so risky.
And then we'll just adapt again.
It's kind of like the story of technology, you know, like, I mean, the idea of like restricting a technology like AI based on risks is just like, for example, you would say, okay, some people can punch other people, so let's tie down everybody's hands, right?
Because it's too dangerous.
Some people can punch, right?
But in reality, you don't want to do that because your hands are so useful.
They're creating so many good things in the world.
You need your hands.
The way you want to control it is untie everyone, give the freedom to everyone, and then regulate or fight the bad actors, right?
So, for example, you know, hacking, that creates cybersecurity risks.
I mean, it's illegal, right?
You have to make it illegal.
You have to fight it, but not by preventing everyone from getting these capabilities.
Because otherwise, you slow down progress, you create massive gaps in terms of controls, in terms of capabilities, and you create actually additional risks.
Well, right now on the topic of regulation, President Trump is in China, where he will be meeting with Xi Jinping over the next couple of days.
And they're going to be discussing, among other things, AI regulation and international AI agreements.
So what do you hope to get out of this in terms of open source?
Yeah, I mean, I'm excited to see conversations about open source AI.
Probably there's going to be some conversations about distillation, about collaborations between the two countries.
I hope both countries will be able to agree on fostering more transparency, more openness to kind of help more people access this technology.
I'm glad that Jensen hopped onto the plane and joined these conversations because I think he has a lot of the right perspectives on this topic to kind of like basically create more collaboration between countries and kind of like share progress.
Yeah, I'm curious about your robotics push.
So you guys launched LeRobot in 2024.
And you've talked about how robotics is the next frontier unlocked by AI and all of this stuff.
How do you sort of see this playing out and what is the role of open source?
Yes, I have two little robots behind me, two Reachy Minis.
We've shipped almost 10,000 of them all over the world.
So it's probably one of the most widely distributed robots of the year at this point.
I think what's really cool with robotics is that it enables very new use cases and better use cases for AI.
So for example, for the Reachy Mini, you have an app store.
Anyone can build apps.
So there's been over 300 apps that have been created for it already.
And when you see it in action, for example, with kids, right, empowering kids to interact with AI in a different way than looking at a laptop or looking at a phone, you realize that it's very empowering.
When you see a companion like the Reachy Mini on a kitchen table, looking around and helping you cook, you realize that it's enabling, empowering, creating new use cases that are just not possible just with a laptop and a phone.
That's why OpenAI and Sam Altman, for example, have talked a lot about their excitement about bringing new devices to market.
There's an important China-US component there because it's very likely that Chinese are going to dominate robotics, or at least they're already dominating.
And so on this topic too, it's really important that we build more in the US on this topic.
And we obviously have a lot of strength for it with the strength of the startup ecosystem in the US, the strength of the frontier models.
I hope to see a lot more in the coming months on the topic.
Hugging Face has been compared to GitHub a lot.
You know, the GitHub of AI.
But why wasn't GitHub the GitHub of AI?
It seems like they've kind of fumbled a lot of things in the AI realm.
So why do you think Hugging Face became sort of the go-to place for model developers to deploy models and not GitHub?
Yeah, I mean, I don't blame them.
They have a lot on their plates.
I think with the coding assistant, they've kind of been dealing with their own set of issues.
The reality is that hosting and sharing AI artifacts is quite different than hosting code.
So even if people have been calling us the GitHub of AI, I think it's two very different things.
For example, for us, the volume of files, of data that we're dealing with is much, much larger than what GitHub is doing.
For example, just last week, we added two petabytes of data to the platform.
As a matter of comparison, it's the equivalent of 500,000 two-hour movies that have been uploaded to Hugging Face just last week.
So you have a lot of structural differences, and we managed to build kind of like our infrastructure capabilities in a way that makes it just better for people who are building in AI to use Hugging Face to host their models, their datasets, both publicly but also privately.
We have a lot of private usage now.
So that's kind of like some of the reasons why we managed to do it, whereas GitHub focused on other things.
Totally.
Well, that's pretty cool.
We love Hugging Face.
And we really appreciate your early support of MTS and our drops.
So it was great to have you on today.
Clem, thanks so much for coming on MTS.
Thanks again for listening, and I'll see you in the next episode.
This information is for educational purposes only and is not a recommendation to buy, hold, or sell any investment or financial product.
This podcast has been produced by a third party and may include paid promotional advertisements, other company references, and individuals unaffiliated with a16z.
Such advertisements, companies, and individuals are not endorsed by AH Capital Management LLC, a16z, or any of its affiliates.
Information is from sources deemed reliable on the date of publication, but a16z does not guarantee its accuracy.