May 22, 2026 · a16z Podcast · 5 min read

Clem Delangue: China Leads Open Source, LLM Bubble Risks

Hugging Face CEO Clem Delangue analyzes China's dominance in open-source AI, warns of an LLM API bubble, and argues that open distribution enhances cybersecurity while robotics unlocks new commercial frontiers.


The takeaways

China Dominates Open Source AI Contributions. Chinese model families like DeepSeek and Qwen now lead global open-weight releases, leaving US startups and academia reliant on Chinese models for core innovation capabilities.
LLM API Investment Faces Bubble Risks. Massive capital flows into closed API data centers show uncertain margins and weak moats, signaling a potential correction in the large language model distribution layer.
Open Source Enhances Cybersecurity Defense. Distributing models openly empowers defenders to build protection systems, whereas restricting access creates capability gaps that attackers can exploit against lagging defenders.
Robotics Unlocks New AI Use Cases. Physical AI devices like the Reachy Mini robot enable app-based interactions and real-world utility, expanding AI beyond screens into kitchens, education, and domestic assistance.
Hugging Face Scales Beyond GitHub's Scope. AI artifact hosting requires distinct infrastructure handling petabytes of data weekly, differentiating Hugging Face's platform from traditional code repositories like GitHub.
Regulation Should Target Bad Actors, Not Tools. Restricting technology access stifles progress and creates control gaps; effective governance requires criminalizing misuse while ensuring widespread capability distribution for defense.

Clem Delangue, CEO of Hugging Face, identifies an inflection point in the AI economy: China has overtaken the US as the leading contributor to open-source AI. While US frontier labs increasingly keep their models behind closed APIs, Chinese organizations such as DeepSeek, Qwen, and Kimi are releasing high-capability open-weight models. As a result, US startups and academic groups that build on open source are usually building on Chinese models, which creates a strategic vulnerability. Delangue warns that the US must reverse its trend toward enclosure to preserve its competitive advantage and foster the transparency that global collaboration requires, particularly as geopolitical discussions about AI regulation intensify.

Economic Risks in LLM Distribution

Delangue cautions against a potential bubble within the large language model sector, specifically in API-based distribution. Massive capital expenditures on data centers are outpacing clear evidence of sustainable margins or durable moats for closed-API providers. While AI broadly remains robust, the concentration of investment in proprietary LLM infrastructure carries a significant risk of correction. Investors and operators must scrutinize the unit economics of API services and recognize that open-weight models are eroding the value proposition of closed systems by offering comparable performance with greater flexibility and lower costs.

Safety Through Capability Parity

Delangue considers safety-based arguments for restricting AI access fundamentally flawed. Limiting model distribution, he argues, creates dangerous capability asymmetries in which malicious actors may possess tools that defenders lack. Open-source release lets the broader ecosystem develop protection systems, detection mechanisms, and countermeasures quickly. Historical precedents, such as the overblown fears surrounding GPT-2, suggest that society adapts to new capabilities faster than predicted. Effective governance should focus on criminalizing misuse and targeting bad actors rather than stifling technological progress by denying access to defensive tools.

Robotics and Infrastructure Specialization

The convergence of AI and robotics represents the next major growth vector. Hugging Face's Reachy Mini robot, with nearly 10,000 units shipped and an expanding app ecosystem, illustrates how physical AI enables use cases impossible on screens, from education to domestic assistance. China is poised to dominate robotics hardware, which makes US investment in this sector urgent. Furthermore, Hugging Face's infrastructure, which ingested two petabytes of data in a single week, underscores that AI artifact hosting requires specialized capabilities distinct from traditional code repositories like GitHub. Companies must build or partner with platforms optimized for the scale and complexity of modern AI development.

Key insights

China has become the primary contributor to open-source AI, with US startups and academia increasingly dependent on Chinese models like DeepSeek and Qwen for development.

Geopolitics & Market Dynamics →

Impact: US organizations face supply chain risks and must adapt strategies to leverage Chinese open-source assets or risk falling behind in innovation velocity.

Investment in closed-API LLM infrastructure shows signs of a bubble due to uncertain margins, lack of durable moats, and overbuilding of data centers relative to sustainable revenue.

Investment & Economics →

Impact: Capital efficiency in AI will likely shift toward open-weight models and specialized applications, pressuring API-only providers to demonstrate clear unit economics.

Open-source distribution enhances overall system security by enabling defenders to build protection tools, whereas restrictions create capability gaps that attackers can exploit.

Risk Management & Security →

Impact: Enterprises should prioritize open-weight models to ensure access to defensive capabilities and avoid reliance on black-box systems that may lag in threat response.

Robotics is emerging as a critical AI frontier, with devices like the Reachy Mini enabling app-based interactions and new use cases in education and domestic environments.

Product Innovation →

Impact: Businesses should explore physical AI integrations to capture value beyond digital interfaces, while monitoring China's likely dominance in robotics hardware manufacturing.

AI artifact hosting requires specialized infrastructure capable of handling petabytes of data, distinct from traditional code repositories like GitHub.

Operations & Infrastructure →

Impact: Developers and enterprises must utilize platforms optimized for AI scale, as generic code hosting solutions cannot support the volume and complexity of modern model workflows.

Action items

Audit current AI model dependencies to identify reliance on Chinese open-source models and assess supply chain risks.

Impact: Mitigates geopolitical exposure and ensures continuity of innovation by diversifying model sources or building internal capabilities.

Evaluate the ROI of closed-API LLM subscriptions versus open-weight alternatives to optimize infrastructure spend.

Impact: Reduces costs and vendor lock-in while maintaining performance through flexible, self-hosted model deployments.

Invest in robotics and physical AI use cases to capture emerging markets beyond screen-based interactions.

Impact: Positions the organization to leverage new revenue streams and competitive advantages in the rapidly growing physical AI sector.

Adopt open-weight models to ensure access to defensive tools and maintain capability parity against potential threats.

Impact: Strengthens cybersecurity posture by enabling rapid development of protection systems and reducing dependency on restricted capabilities.

Quotes

“The idea of restricting a technology like AI based on risks is just like, for example, you would say, okay, some people can punch other people, so let's tie down everybody's hands, right?”

“If there's one specific domain of AI where there's so much investment that there's maybe a risk of overinvesting, it's large language models distributed behind APIs.”

“China... They're the strongest open source contributors today. If you ask most startups, most academia in the US that are using open source, they're usually using Chinese open source models.”