# AI Infrastructure Pivot: Enterprise Focus and Agentic Runtime Wars

**Podcast:** Last Week in AI
**Published:** 2026-03-26

## Transcript

Last Week in AI would like to thank ODSC AI for being a sponsor.
ODSC is one of the longest running and largest communities focused on applied data science and AI.
It started over a decade ago with a simple idea: bring practitioners together to learn from people actually building and deploying models in the real world, not just talking theory.
On April 28th through the 30th, you can experience it yourself at ODSC East 2026, taking place in Boston and virtually.
There will be thousands of hybrid attendees, ranging from data scientists and ML engineers to AI researchers and technical leaders.
You can attend over 300 sessions covering LLMs, Gen AI, computer vision, NLP, data engineering, and more.
You can also go to hands-on training with workshops and boot camps taught by experts from companies like OpenAI, Hugging Face, NVIDIA, and other top companies and universities.
And of course, there'll be a massive expo and networking opportunities, great for startups, hiring managers and AI tool builders.
It's one of the best ways for AI practitioners and teams to stay ahead of the field, learn from the best, and connect with a community.
Go to ODSC.ai slash East and use promo code LWAI for an additional 15% off your pass to ODSC AI East 2026.
That's ODSC.ai slash East and use code LWAI to get an extra 15% off on the number one AI builders and training conference.
Hello and welcome to the Last Week in AI podcast, where you can hear us chat about what's going on with AI.
As usual, in this episode, we will summarize and discuss some of last week's most interesting AI news.
You can also go to lastweekin.ai for our newsletter with even more news every week.
I'm one of your regular hosts, Andrey Kurenkov.
I studied AI in grad school and now work at the startup Astrocade.
And I'm your other co-host, Jeremie Harris from Gladstone AI.
AI national security, AI loss of control, all those fun things.
By the way, special thanks to Andrey for recording at this time.
We bumped things up even earlier.
I think it's like, what is it, 7:30 your time?
It is here, yeah, on PST.
Sometimes it's nice to be East Coast to be a bit later.
But you know, it's good to get your day started early sometimes.
Yeah, much appreciated.
Also appreciate people tuning in or watching the entire last podcast, which featured an extra hour-plus; I didn't realize it was that long.
But Andrey had to hop off, so I just went through some of the technical papers we didn't cover as an experiment, and man, did I go on.
So I have learned that I need Andrey to be the regularizing term to my loss function, if that is...
Yeah, we do know some people are very much fans of the coverage of research and us going in depth.
So feel free to comment on YouTube or elsewhere.
And yeah, you know, say if you want more of that.
We've considered maybe having additional episodes that are just research; that's very much up in the air.
But feel free to let us know if you'd like even more research on a regular basis.
And just to give a quick preview of what we will be doing this episode, not as much research as last one.
There's a bit of everything, I suppose.
There's a couple new model releases and some other kind of interesting tools.
There's some interesting developments on the business front with OpenAI.
And then we've been covering a lot of safety and interpretability work lately on alignment.
So we'll have some of that.
And then towards the end, we'll have some fairly interesting, impactful looking research, less theoretical, more sort of like wow, this might actually be a big deal.
So it should be a pretty fun listen.
And let's kick it off with tools and apps.
First up, OpenAI has shipped GPT 5.4 mini and nano.
Similarly to other small models that we've seen in recent times, they are actually really quite good for being in the smaller range. GPT 5.4 mini is close to GPT 5.4 on several benchmarks, including SWE-Bench Pro and OSWorld-Verified, and it's more than twice as fast.
GPT 5.4 Nano is obviously the smallest option.
It's not really doing great on the benchmarks, but it is super quick.
These models have 400,000 token context windows.
So fairly substantial.
But they do cost a decent amount relative to GPT 5 mini.
Looks like GPT 5.4 mini costs 3x GPT 5 mini, and GPT 5.4 nano also costs more than its predecessor.
So on the whole, good, faster, smaller models.
If you need something that's doing better than GPT 5 mini, now you have that option.
There's been a lot made about the cost situation, the per-token pricing, I should say.
It is higher.
There's no question, right?
So GPT 5.4 mini is basically three quarters of a dollar per million input tokens versus 25 cents for GPT 5 mini.
So that's a 3x hike.
But OpenAI says it only burns about 30% of the GPT 5.4 quota in Codex.
So it's actually gonna be much more token efficient.
And this is a metric that I think matters a lot more than many people will tend to realize, right?
So we get lost in like cost per token.
But as we've seen, more tokens at inference time does not necessarily mean more performance.
And that's the big catch here.
So when you multiply those together, right, 30% times 3x, you actually get a slight decrease in what you might think of as cost per performance, which is a little closer to what most people will care about.
And this will vary depending on the workload, but still quite interesting, right?
So for the sort of orchestrated agentic tasks, the effective cost per outcome could actually be favorable compared to running the full model.
So sort of interesting there.
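The back-of-the-envelope math described here can be written out as a small sketch. It takes the figures quoted in the episode at face value (a 3x per-token price and roughly 30% of the token usage) and treats cost per outcome as simply price times tokens. The function name `relative_cost_per_outcome` is hypothetical, and real workloads will differ.

```python
def relative_cost_per_outcome(price_ratio: float, token_ratio: float) -> float:
    """Relative cost per task = relative price per token * relative tokens used."""
    return price_ratio * token_ratio


# Figures as quoted in the episode, treated here as assumptions, not measurements.
price_ratio = 0.75 / 0.25  # $0.75 vs $0.25 per million input tokens
token_ratio = 0.30         # roughly 30% of the tokens consumed

relative = relative_cost_per_outcome(price_ratio, token_ratio)
print(f"price ratio: {price_ratio:.1f}x")
print(f"relative cost per outcome: {relative:.2f}")
```

A result below 1.0 corresponds to the "slight decrease" in cost per performance mentioned above; with a less favorable token ratio the same product would land above 1.0.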
One thing I will say, Nano is API only.
This one is priced at 20 cents per million input tokens and $1.25 per million output tokens, versus its much, much cheaper predecessor.
Its predecessor was five cents per million input tokens.
That's a 4x lift, and roughly 4x again for output tokens as well.
So you're looking at 4x overall.
The weird thing is so OpenAI is pitching this for classification and data extraction, which are these like very high volume workloads where you're usually quite cost sensitive because you're processing so much.
And so that fourfold hike is gonna sting the most for exactly the people who are being pitched this product.
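To make the sting concrete for a high-volume workload, here is a minimal sketch. It assumes the quoted input prices are per million tokens (20 cents now versus five cents for the predecessor), and the monthly volume of 10 billion input tokens is a made-up figure for illustration only. Output-token costs are left out because the predecessor's output price is not stated in the episode.

```python
def monthly_input_cost(tokens_per_month: int, price_per_million: float) -> float:
    """Dollar cost of input tokens for one month at a given per-million price."""
    return tokens_per_month / 1_000_000 * price_per_million


OLD_PRICE = 0.05  # dollars per million input tokens (predecessor, as quoted)
NEW_PRICE = 0.20  # dollars per million input tokens (new nano, as quoted)

# Hypothetical classification pipeline: 10 billion input tokens per month.
volume = 10_000_000_000

old_cost = monthly_input_cost(volume, OLD_PRICE)
new_cost = monthly_input_cost(volume, NEW_PRICE)

print(f"old: ${old_cost:,.2f}  new: ${new_cost:,.2f}  ratio: {new_cost / old_cost:.1f}x")
```

The ratio is the same at any volume; what changes with scale is the absolute dollar gap, which is why cost-sensitive bulk workloads feel it most.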
So it's a bit of an interesting position.
It seems a little, I don't want to say at odds with OpenAI's position that they want to make intelligence too cheap to meter, but at least locally, this is a move away from racing to the bottom on inference costs and toward: we're gonna focus on model quality, and that's gonna be our big differentiator.
You're gonna care that we can get the right answer, not that we can get it cheaply, which is where all the margin is.
That at least is what Anthropic certainly suggests, and what we're seeing elsewhere.
So that's kind of interesting.
You know, as you said, OSWorld-Verified is an interesting benchmark to look at here specifically.
I know you mentioned SWE-Bench and a couple others.
And so this is basically a computer control benchmark.
So it looks at how well can the mini model just control a computer.
And what we see here is GPT 5.4 mini hits 72%.
If you look back at GPT 5.4, the full version, it hits 75%.
So it's actually pretty close.
And if you were thinking about the previous GPT 5 mini, that was only 42%.
So pretty big jump, especially given that we're getting up there on this benchmark in terms of saturating it.
So all in all, pretty interesting release.
The buried lede, well, not the buried lede, but the buried detail here really is that token efficiency question.
What kind of workload are you going to use this for?
That's going to determine, I don't want to call it the total cost of ownership, because that's not quite the right metric here.
But the total cost that you're exposed to and the ROI, that's really becoming a key thing here.
Right model for the right workload is just gonna be a critical dimension, at least for the next few months.
Right.
Yeah, I think these kinds of things showcase the fact that, you know, all these models kind of came out of a world of academia, right?
And benchmarking is largely focused on capabilities on how accurate your model is.
And so usually you don't highlight these more practical concerns of how quickly you can finish a task, right?
How cost effective are you?
Like, you do a task, how many dollars does it take?
Wall clock time, right?
We just don't get these numbers, at least on the announcements.
It's honestly a bit surprising that's still the case, but it's a culture question, I suppose.
And as you said, we'll be discussing OpenAI's strategy in just a bit in the business section, very much in line with Anthropic, where it's like, we'll just charge more for our models, but they're the best.
So people will tolerate it.
And speaking of small models, next up we've got Mistral.
They have released their Small 4 family of models under the Apache 2.0 open source license.
And it actually combines multiple things.
So it has reasoning built in, it has multimodal capabilities built in, and it has agentic coding optimization.
So they're combining Magistral, Pixtral, and Devstral.
It also uses a mixture of experts, so they have a total of 119 billion parameters, but only six billion active parameters per token.
That's quite small.
You can fit it into probably one top-end GPU, you know, it's actually quite affordable.
So it looks like possibly a pretty strong model in this category of smaller, faster, cheaper, and open source.
Yeah, the challenge there is gonna be, you know, if you want to fine-tune it, then you're really dealing with a 120 billion parameter model, and that's a pain in the butt.
But yeah, I mean, it's only six billion active.
I mean, this is really ultimately a bet that model sparsity, at this sort of very aggressive sparsity ratio, is going to outperform a dense model or something more traditional.
And it's a bet as well, as you say, on the kind of hardware that is available, at least for inference on local machines.
So, yeah, it's quite interesting.
It is a more aggressive, kind of fewer active parameters per token type of play than we've seen before.
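The total-versus-active distinction can be sketched numerically. This rough estimate uses only the two parameter counts quoted above (119 billion total, six billion active) and assumes every expert's weights must be resident in memory even though only a fraction are used per token. It ignores KV cache, activations, and runtime overhead, and the precision choices are illustrative assumptions.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9


TOTAL_PARAMS_B = 119.0   # total parameters, as quoted
ACTIVE_PARAMS_B = 6.0    # active parameters per token, as quoted

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"active fraction per token: {active_fraction:.1%}")

# Memory is driven by TOTAL parameters; per-token compute by ACTIVE parameters.
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(TOTAL_PARAMS_B, bits):.1f} GB")
```

The asymmetry is the point made about fine-tuning: inference compute scales with the six billion active parameters, while storage and training still have to deal with the full parameter count.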
It's also kind of interesting.
So you touched on it, the consolidation, right, of all these models into a single model.
Reasoning, coding agents, multimodal, that's Pixtral, like looking at images and text and so on.
So this is a weird play because historically, Mistral has separated these, right?
And so given they're collapsing them into one model just with a reasoning effort dial, you could see that as a pretty big bet on where the industry is heading.
That is something that we've seen with, you know, obviously the o-series of reasoning models, with GPT-4o, even starting as far back as that.
If it's the case that we get positive transfer, that's really what this is a bet on, right?
That a model that is trained to do all these things will do better at each individual thing because it's kind of getting cross-training, right?
The same way that you might want to do, you know, football and ballet at the same time, or ice hockey, and you get better at each different thing because of the combination.
That's kind of the idea here, right?
That positive transfer that for so long was really difficult to pin down.
People were seeing negative transfer where the more stuff you train a model on, the worse it does on the marginal additional task.
We're now well into positive transfer territory.
The extension of that to reasoning is quite interesting.
It implies, or at least they're betting, that reasoning may eventually or already play a role in analyzing images more successfully for these kinds of models at this scale.
So there's another piece here that's interesting.
You know, this efficiency claim.
They say that Small 4 is achieving scores that are on par with GPT OSS 120B on a bunch of benchmarks while generating shorter outputs on at least one benchmark.
And so again, this is about that token efficiency question, right?
We're back at that very underrated metric of output length efficiency and the fact that you don't necessarily get more benefit by reasoning with more tokens.
That's something we started pointing out when we looked early on at the DeepSeek reasoning results, right?
Where it's like, yes, you do get this positive uplift, but your value per token is actually like potentially going down.
And we are now, in fact, in that regime, we're seeing that very clearly.
So, this efficiency piece is really important because if people are going to use open source models, by far the biggest use case is running these on clusters, whether your own or other people's, to serve customers.
And so, you know, how good Mistral is at making this an efficient reasoner speaks directly to your bottom line as the person who's going to be serving these or asking somebody else to serve them for you.
So, pretty interesting.
You know, the fact that they're comparing to GPT OSS 120B.
Look, the space moves really fast.
That is an old model at this point in open source terms.
It's kind of choosing your point of comparison pretty selectively, I would say, but they do a comparison that's reasonably favorable on Qwen models as well.
So it's an interesting play.
I mean, I'm a little concerned for Mistral.
I have been for a long time, obviously.
Don't know too much how this ends up playing out with the open source play, but here they are.
It's a reasonable model.
And the consolidation angle, that's a really big story here, right?
If everybody is starting to consolidate, even open source players here around one model under one roof, that's a materially different story.
It does not, by the way, extend necessarily to the world of agents, right?
Subagents may be smaller, cheaper models.
This is more about, you know, if you're looking for a highly performant model, like think of the main orchestrator.
It seems you're probably gonna be seeing models that put multiple capabilities under one roof.
So something to watch out for.
Right.
I'll just quickly comment on the unification bit.
The fact that they have baked in reasoning and coding, I think, is a little more just an indication that they're catching up with where things are these days.
It used to be the case that you had a reasoning model, like o3 and DeepSeek R1; you trained a reasoning model and you had your base model.
And what everyone moved to in 2025 is there's no reasoning model.
Your model has reasoning baked in.
And now with Sonnet, with GPT, really with post-training for reasoning, it's been very clear that you should just train your model to be a good coder because that makes it a smarter model in general.
So this is a mix of catching up to where things are at, and also adopting that multimodal capability, which could be interpreted in several ways.
Next up, Meta's Manus launches My Computer to turn your Mac into an AI agent.
So this is something you can install and launch on your computer, and it's effectively like having a little OpenClaw, I guess, on your computer.
It can execute command-line instructions, which lets it interact with your computer.
I think we've seen this happening more and more, where various organizations are shipping OpenClaw-esque things where you have an agent that just lives somewhere and you can tell it to do stuff, and it is your assistant or your AI or whatever.
This appears to be another instantiation of that, similar also to Perplexity's announcement of, what was it, Perplexity Computer?
Yeah, so very much in line with that.
Oh, you mean you weren't able to remember the sixth new OpenClaw variant that got launched by a company in the last two weeks?
Yeah, it's crazy, right?
We're really seeing more and more of this pile-on, right, from all these different competitors in this space.
And this is a land grab.
Like, make no mistake.
This is, man, I don't want to call it the scramble for Africa moment, but, you know, the historical equivalent of that in this space with fewer controversial overtones. This is the moment where people are realizing, hey, you know what?
We need to get on people's local machines, right?
We have to get some kind of piece of that pie because so much of what's happening right now is that we're trying to grab onto the agentic runtime layer.
In fact, NVIDIA, we'll talk about this in a minute, with NemoClaw.
It's the same thing.
Everybody's trying to get onto, like, what is the substrate on which agents are gonna run?
Can I get my dirty little hands on that and turn that into part of my market?
And in this case, the Manus play is quite interesting, and it's aging well, right?
I mean, Manus was a totally independent company before the acquisition, and really the play here is Meta extending its reach into that local OS layer for the first time.
This is the territory historically of, you know, your Apples, your Microsofts, your Googles.
That's the war they're entering.
We haven't seen Meta do that before, right?
We just haven't seen them try to play in the operating system game.
This is a really interesting way for them to vector into a completely different market using resources where, you know, maybe for the first time, whether you call it an advantage or not, they have a credible play here.
So pretty interesting.
And again, Meta's classic history of buying their way into competing, right?
We saw this with WhatsApp, with Instagram, like, it never ends.
This is their play in that direction.
So Manus, you know, maybe aging well.
We've got to see what comes of this.
Yeah, and just as a reminder, Manus came out, I believe, last year initially with a cloud-based agent that you could assign to go do stuff.
So this is them extending to your local computer.
Right now, I think you can get this for Apple silicon Macs.
The other aspect of this, by the way, is not just OpenClaw, it's the Cowork, the Codex angle, where most of the blog post is actually highlighting, like, let the agent go onto your computer and organize files and do things that are very much Claude Code or now Cowork-esque.
And the tell-it-to-do-stuff-from-anywhere-at-any-point part is an aspect of it, the OpenClaw aspect.
But I think the real land grab is for that Cowork type of thing, like have an agent do stuff for you, which is now everywhere in coding, but I think what Anthropic and now OpenAI and now everyone is realizing is these agents can do a whole bunch of stuff that people haven't adopted them for yet.
And speaking of OpenClaw, next up, NVIDIA has announced NemoClaw as part of their announcements at GTC, which is a little bit funny.
This is a stack for the OpenClaw agent platform that allows you to install NVIDIA Nemotron models.
We just discussed the latest Nemotron model last week.
And there's a new NVIDIA OpenShell runtime.
You install both of those in a single command, and you get privacy and security controls baked in, making it possible to have, you know, more confidence in running one of these things.
We've seen many stories of OpenClaws.
Yeah, so OpenShell provides an isolated sandbox that enforces policy-based security, network, and privacy guardrails for agents.
Seems like a good idea if you are going to run one of these agents: maybe install them in a sandbox where you can control them and they don't go rogue and, you know, take over the world.
Yeah, and one of the key dimensions here, you know, you think about what does sandbox mean?
You know, how do these things work?
Typically, a lot of these things focus on which model is running in the cloud and which model or models are running locally on your machine, right?
So you can imagine you might not want the model that runs locally on your machine that's actually looking at your own intimate files to have direct access to the internet, right?
So that's kind of like one way that you might enforce that sort of guardrail.
So use that local model just to generate summaries or do analysis and stuff, and then ship only the summaries, after some review or something like that, to, you know.
That's one way to play that game.
And there are a whole bunch of other guardrails around the kind of access that different models can have and your ability to tune that.
So that's kind of like a lot of where this is coming from.
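The guardrail idea described here, a local model that can read files but not reach the network, with summaries forwarded only after review, can be sketched as a tiny deny-by-default policy check. This is purely illustrative: the `AgentPolicy` class, the action names, and `ship_summary` are hypothetical and are not OpenShell's or NemoClaw's actual API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical per-model permissions, not a real OpenShell structure."""
    name: str
    can_read_local_files: bool
    can_access_network: bool


# The local model sees private files but is cut off from the internet;
# the cloud model is the reverse.
LOCAL_MODEL = AgentPolicy("local-model", can_read_local_files=True, can_access_network=False)
CLOUD_MODEL = AgentPolicy("cloud-model", can_read_local_files=False, can_access_network=True)


def is_allowed(policy: AgentPolicy, action: str) -> bool:
    """Deny by default; allow only actions the policy explicitly grants."""
    if action == "read_file":
        return policy.can_read_local_files
    if action == "network_request":
        return policy.can_access_network
    return False


def ship_summary(summary: str, approved: bool) -> Optional[str]:
    """Forward a locally generated summary only if a reviewer approved it."""
    return summary if approved else None


print(is_allowed(LOCAL_MODEL, "read_file"))
print(is_allowed(LOCAL_MODEL, "network_request"))
print(ship_summary("Q3 notes: three open action items", approved=False))
```

A real sandbox would enforce these rules at the process or network level rather than trusting an in-process check; the sketch only shows the shape of the policy.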
You know, this is a classic example of Jensen's hyperbolic rhetoric, right?
He's using terms like a new renaissance in software to play this up, which, to be clear, I actually agree with, but there's a gap between, you know, this framework can help you install things with a single command and redefining how computing is done, and we'll see if that chasm gets crossed.
And it probably will at some point.
I mean, I think it certainly will at some point.
Question is who does it first?
So the frame here is you know, again, that operating system piece, right?
Keep going back to this.
It's not a coincidence that everybody is going on basically the same gold rush expedition.
Everybody's thinking the same thing, the operating system for personal AI, that layer, and again, Jensen here is comparing Mac and Windows for PCs to OpenClaw for personal AI.
That's a deliberate, it's a bold claim, but very deliberate.
This is part of that frame.
This is NVIDIA now saying, hey, Meta's gonna get into effectively the operating system game, the sort of operating system for agents.
We're gonna do the same thing.
We don't quite have a history of doing that, but, you know, now we're diving in.
So essentially, you know, we had the software-eats-the-world era of the SaaS revolution, the last decade, decade and a half, two decades.
And now we're in the AI-is-eating-the-world era, including software, and another layer of abstraction here is, yeah, that runtime environment for agents.
And that's the land grab, that's the operating system.
At least this is the case that Jensen's making.
I think it's a reasonable case on the whole, but remains to be seen.
This is functionally also a classic NVIDIA play of trying to Trojan-horse in their full stack.
So NemoClaw is gonna install Nemotron models, right?
Those NVIDIA models preferentially, and the OpenShell runtime, all in one command.
The simplicity here is the point.
So it's super easy to deploy NVIDIA's own models together with their runtime environment, all that stuff.
Basically, they're creating a situation where, just like they owned CUDA for the GPUs, they're creating a whole software stack around the agentic runtime environment.
And so this is what's worked for them in the past.
Create a whole ecosystem around this.
Software is more commoditized now than it has been before.
So that may not be the same kind of moat, especially given that, unlike before with CUDA, where they had like a decade's head start before anyone really paid attention, this is now much more in competition with faster-moving players.
So really interesting.
Again, chalk it up as another entrant in the operating-system-for-AI-agents catalog here.
Right.
I do think an interesting aspect of this is that NemoClaw is kind of the hype, the cool branding for the actual major push, which is the open agent development platform, which is that terminal-based thing.
It also comes with this NVIDIA AI-Q Blueprint thing, which is built on top of LangChain Deep Agents, and has the NVIDIA NeMo Agent Toolkit as these open source options for building your stack of agents.
So the NemoClaw aspect is piggybacking on the thing that everyone is hyped about, and then the actual software frameworks are probably the actual thing that NVIDIA cares about.
Yeah, and this is it, right?
It's like, how can I hook onto one big hype train and then use that as a Trojan horse to get everybody dependent on our whole stack, including our models?
Because there hasn't been the kind of uptake of the Nemotron models, at least that I expected to see so far.
And so this is, you know, one opportunity for them to make that happen.
Yeah, I think to your point about software, CUDA obviously is the number one thing for GPU-related execution software.
But as far as open source packages go, NVIDIA hasn't had a history of really making an impact.
So for instance, LangChain is an open source package that goes back to 2023 that has been very popular for building complex graphs of LLM prompts and agents and so on.
Probably overly complex, but anyway, it gained broad community adoption.
So that's more or less been the pattern a lot, but with agents and now OpenClaw and whatever, this was a good time to try and get in on that open source game.
And one more story about NVIDIA at GTC: they announced DLSS 5, which they are describing as the GPT moment for graphics.
So this is runtime enhancement, I suppose, for game graphics.
So this uses machine learning-based upscaling, and it applies generative AI to add a whole bunch of just really nice-looking graphics.
So if you look, they have various examples, so you probably want to just see it to understand it.
Basically, you can go to older games like Elder Scrolls Oblivion, for instance, and get really cutting-edge seeming graphics with this turned on.
Now the reaction to it has been very much split from what I've seen.
People have often scoffed at it and were like, oh, this is AI slop, this filter is bad.
In some cases, it seems to go against the style of the game somewhat.
So NVIDIA did also emphasize that this is fully controllable by the developer.
The developer can set the settings of how aggressive it is, you know, how much it impacts various aspects of the rendering.
Yeah, I think this is the kind of thing that NVIDIA has given developers, presumably in the past.
This is DLSS 5, but with the generative AI aspect of it, it's getting a lot more discussion.
And it looks potentially very significant.
Well, so I was gonna ask you.
I mean, this is the part of the show where I ask you for your opinion as a guy at Astrocade.
Like, is this something that you guys would integrate in your stack?
Like, how do you think about new tools like that?
Or is that even part of your workflow?
Well, yeah.
So this is kind of targeting that 3D triple A big game kind of market, right?
So this is dealing with new games, right?
And it's more complex; you wouldn't run this on your phone, like for casual games, but for games that have complex character models with faces and stuff like that, or open world games where you traverse big landscapes, that's where the photorealistic pass really makes a big difference.
And last up, we have an update on OpenAI's plan to launch ChatGPT's adult mode.
This was announced, I think, last year as something they were thinking of doing.
It has been delayed from the original late March target, and it seems that they're still aiming to do it.
The news here is that within OpenAI, their advisory council of psychology and neuroscience experts opposed it at a January meeting.
One advisor really warned about it significantly.
So, anyway, we have a quick update saying that they appear to still be planning on it.
It has been delayed, but as of now, it's still going to be presumably released.
Yeah, wow, what a surprise that despite objections over the appropriateness of a tool like this, they still went ahead.
Huh.
That's weird.
That's not like OpenAI at all.
Yeah, that's weird.
What a weird thing.
Yeah, no, when you think about the things that we're classically warned about, just my opinion here personally, but I seem to remember an awful lot of people warning us about the Brave New World thing where you're hooked up to the dopamine drip.
Yeah, I don't know.
Ultra porn powered by like AI super intelligence.
Like, I'm not sure how far you push the inference time compute budget before you get to that scenario.
But hey, how about scaling laws for porn addiction?
How about that paper?
Looking forward to that coming out.
I mean, look, that's the kind of thing we're gonna have to see.
There's gonna have to be studies on scaling laws for porn addiction.
Just calling a spade a spade.
Like, I don't see how we avoid that.
It's interesting that this is an offering.
Someone's gonna do it, right?
The big porn companies at some point, whatever.
So you could argue, and I'm sure this is part of the ethical argument that OpenAI might make here, look, this way, we can monitor the use of these things ahead of time, understand how people adjust to this technology, maybe try to mitigate risks ahead of time, whereas porn companies probably wouldn't do the same thing.
It's sort of a similar argument to what Sam's been making historically, right?
We want to get these things out there because it's gonna happen anyway.
So we want to get that feedback and be able to iterate on it, which there's merit to.
But yeah, we should be under no illusions that we're crossing the Rubicon here.
And I think, given that OpenAI is taking this step, it is on them then to come out with unbiased research that transparently looks at the effects of exactly what they're putting out there.
I think that's just a sort of reasonable thing.
I'd be pretty sure, it's like a tobacco company going out and doing something: when you're playing so close to the base of the human brainstem, you're in different territory.
And so anyway, we'll see if this persists, we'll see what scandals come of it.
I'm sure there will be some, but yeah, it's in some ways not surprising.
I don't want to, again, rip on OpenAI too much for this because the argument is true that at some point, you know, the big porn companies will do this, and they have a history of mistreating women and, you know, doing all kinds of awful things.
So ethically, I get it, but now the proof's gonna be in the pudding because once you put that out there, and brand-wise, like, geez, I...
Well, so I think it's worth clarifying a little bit.
Porn might be a bit strong of a term here.
Right.
So they do clarify this will not generate erotic audio, images, or videos.
This is really about, so they've banned erotica as a general category for the chatbot.
And so if you try to write sexy stories or have sexy role play, which you can do in plenty of other apps out there, absolutely a million AI girlfriends, including xAI's, right?
So this is more of that.
This is role-playing with a sexual, adults-only component to it, not straight-up explicit content of the sort that you might think of when saying porn.
So in some ways, yeah, I think there are arguments to be made either way; people have tried to do this, and try to do this as is, most likely, with these models.
People do seem to want it.
There's a very large adoption of these things, and we already have plenty of stories of people like getting AI girlfriends and having psychological impact.
So yeah, maybe actually if ChatGPT and OpenAI do this explicitly and do it right and do it well, it's better than not doing it and having some shady dark market for it, you know.
No, absolutely, and I take back the use of the term porn here.
I guess I'm sort of seeing where the puck is headed and anticipating that next play.
This is definitely a step where we need OpenAI to actually run the studies, since they're gonna have access to the data no one else will, right?
So, like we we need people to look at this.
I hope that they will.
I I would imagine this is part of the package here, but from a branding standpoint, this is really dicey, especially at a time when they're trying to redouble.
We'll talk about this in a minute, but on this whole sort of business productivity focus, like that's gonna be their big play.
So you're you're kind of adding that into the mix.
It's interesting.
I mean, what does this imply about the user base that they're going after? Is there ROI here?
I mean, porn is a pretty low-margin industry, as I understand.
So I don't know if roleplay, erotic roleplay, would kind of be similar, things flirting, if you will, with that.
But either way, yeah, we'll we'll find out.
I think the argument that a lot of people want this is also kind of dicey.
I mean, wow, a lot of people want crack cocaine.
Like, so we gotta find a way.
Now, everything's on a continuum.
I'm being tongue-in-cheek here, obviously, but we have infinite inference-time compute budgets coming online in the next, you know, three to five years or whatever.
At what point do you just have so much inference-time compute being dedicated to your limbic system that, for all intents and purposes, you're robbed of your agency?
I think we're actually gonna be asking ourselves that question sooner than most people are pricing in.
Right.
And then just to reiterate, the actual news here is just that there have been these details coming out of the internal objections from certain parts of OpenAI, and these objections have been there since January, and the company hasn't canceled this feature.
Now, the feature isn't out yet; they could still not do it, and it has been delayed.
So it's still on track, and it'll be interesting to see if they do go through it after all.
Yeah, I know that's true.
And actually, a lot of my reaction is reactionary.
I'm like, this direction generally is definitely coming, but it makes me nervous as hell.
And that's sort of my response to this.
You know, a lot of this is testing the waters, too.
I'm not saying this particular story was leaked, but you see governments do this, you see private companies do this.
There's like leaks of stuff that go out to just gauge audience kind of consumer response to things.
Can we take that step?
Can we not?
I don't know, again, if this is or is not the case here, but man, at some point you'll need some kind of immune response to things that push in this general direction.
And on to applications and business, sticking with OpenAI and coming back to a topic we've briefly touched on.
Another thing that came out from OpenAI this week is a discussion that happened at an all-hands, where effectively OpenAI has signaled a strategic shift away from doing a little bit of everything and toward focusing more on productivity and business.
So you know, OpenAI has historically had a million side projects.
This I think is how they characterize it.
Side projects: they have video models, they released the Sora app, they have a browser, they have audio models, I believe, transcription, like a million things.
Anthropic has Claude, and that's it.
OpenAI has Sora and their audio models and so on and so on.
So it seems to be that internally within the company, they are changing focus.
The head of applications, Fidji Simo, has told staff that they cannot miss this moment because they are distracted by side quests, and that they need to nail productivity.
And you've seen this already for the last several months.
It has been apparent that they have been really pushing on Codex to catch up with Claude Code, and Cowork as well.
And they largely have, at least in terms of feature set, in terms of adoption.
I mean, many people are starting to use Codex.
Some people prefer Codex at this point.
But it is obviously true that this hasn't been OpenAI's primary focus until recently, and they have kind of been behind Claude, and in terms of adoption, they are still behind because Claude Code is the early leader in that category.
So yeah, interesting to see OpenAI having that internal discussion now.
I feel like this has been a problem with OpenAI for a while.
If I were to kind of guess at internal dynamics and kind of in business and company level issues that lead to poor performance.
Famously, and Andre, I'm sure you'll have friends that have told you the same thing, like at Google.
Every friend of mine who's ever worked at Google says the same thing.
And actually, same at Meta, you get promoted for building new stuff, right?
It's like it's not about did you make the code run more efficiently?
Did you clean this up?
Did you clean that up?
It's like, did you make new stuff?
And that's why there's a massive app graveyard, right?
Famously for Google products, you know, Google Hangouts, Google this, Google that, that just get axed at various stages.
There's this fundamental question of like, again, is this a feature or a bug, right?
You can look at Google and you can say, ha ha, look at the graveyard of wasted time.
You can also look at Google and say, well, what matters is not the misses, what matters is the hits.
And for every, not for every Google Hangouts or dead Google product, there's like a Google Calendar or a Gmail, right?
Or a Maps, right?
Or a Maps.
That's right.
That's right.
So like Paul Buchheit, famously, you know, now a YC partner.
I don't know if he's left YC.
He was there when I was there.
But he like was the founder of Gmail, and his entire like claim to fame in Silicon Valley is that he was the Gmail guy.
That was a company within Google.
That's the way to think of it.
And certainly from a revenue standpoint, that's what it is, right?
So when Sam, who comes out of the YC ecosystem too, right?
Having been the president of Y Combinator for many years, the first one after Paul Graham, his whole thing is spray and pray, right?
He's used to betting, like, a little bit of money on a lot of efforts at the same time.
This goes way back to OpenAI's founding days, 2015-ish, where he was doing the spray and pray thing on a whole bunch of different things, you know, evolutionary methods and robotics and RL and all this stuff.
You know, OpenAI Gym was originally a side project, right?
Now it's a big thing.
So there's a whole bunch of different plays that, you know, he's sort of used to playing in this way.
It's worked in the past.
The challenge is now you're you're entering a much more mature environment, right?
You're no longer necessarily in the game of betting on a bunch of different startups.
You have a kind of core area that you need to dominate right now.
You know, Anthropic is running away with the pot.
I think it's over 70% market share right now in enterprise.
So that is the red alert here that Sam Altman is calling, and Fidji Simo.
So they've got to find a way to, yes, keep that Google or that YC kind of approach to investing in a whole bunch of things, because you never know where the next big thing is gonna come from.
But you also have to be able to invest and double down on things that are working, that are generating a lot of your revenue.
I mean, 25% market share on enterprise is a big deal.
You know, they're making, what, 25 billion dollars of annualized revenue at this point.
Like they have stuff that's really working.
And so this is a structural shift.
There's another piece that's happening here, where I think there's an over-rotation on, oh, well, OpenAI is, you know, kind of throwing out all these side projects.
Sam wrote on X a couple days ago, he's like, look, there's all these rumors that we've also canceled the hardware thing with Jony Ive.
And in fact, this article sort of repeats that rumor, saying that, hey, you know, they basically canceled the rollout of these AI-powered earbuds.
Well, actually, at least Sam says the Jony Ive thing is still live.
And so again, it's a question of what about the hits, not necessarily just the misses, but some of the hits now are worth protecting, and that's a challenge.
They've already grabbed the pot.
Sam needs to find a way to hold on to it and and expand his territory.
It's not just a land grab.
Right.
And I think to be fair, Claude Code initially, the story goes, was a single person's kind of side project, and it turned into this massive thing because someone pursued it.
Really, what this perhaps indicates is that the various bets OpenAI has been making, with a browser, with the Sora app, haven't seemed to really be a hit or, like, be nearly as transformational as Claude Code.
So they missed out on the Claude Code moment.
They started catching up late with Codex.
It wasn't until kind of pretty late into last year that I started feeling like they're really taking it seriously.
Already by mid-year, if you were kind of feeling the pulse of where AI was at, people were talking about Claude Code, and, you know, I adopted it sometime around May or June.
Codex didn't really start kicking off until maybe even November or later.
So yeah, it really signals not necessarily not doing these other things, but realizing that they were left behind, and now they need to catch up and start making money with businesses, because that's, as we often say, where you make your money much more easily than with consumer products.
Yeah, much more. You ultimately need to go straight to consumer to get the big pot down the line.
Yeah, but it's early in the game, like, it is always early in the game in AI.
And to your point, I mean, this is, and I can't remember if this was a Napoleon thing or an Alexander the Great thing, but people would say, like, he could achieve a victory but not use it.
This is kind of that, right?
There's two different modes in the space.
First is getting your product-market fit, and then it's holding on to it in the face of competition and then, you know, attacking other people's moats.
And that second one is a shift.
It's a fundamental kind of mindset shift that you got to get into.
And it looks different for AI than for traditional SaaS companies too.
Like you just can't sit on a stack of software for like a month and expect it to hold on to market share.
So yeah, it's a really interesting space. We're learning a lot in real time right now about what it takes to gain market share and hold market share in a world of low switching costs and very rapid advancement.
Next up, moving back to GTC, we have some more details.
One other thing that Jensen Huang, the CEO of NVIDIA, has announced is that purchase orders for Blackwell and Vera Rubin chips, the newest generation of chips, are expected to reach one trillion dollars through 2027, doubling last year's projection of a 500 billion revenue opportunity.
He's saying that the key that's driving it is a shift from chat bots to agentic AI applications, which do require much more compute.
You're just generating a lot more output.
And you have OpenClaw, which is an always-on agent, and agents that go off and work for hours at a time.
And this is where things are heading.
At least if you're kind of a believer in the trends, right?
In half a year, a year, we could expect agents just working on their own for many hours at a time, even days at a time.
And so the bet is that it's gonna keep happening.
They also unveiled the Groq 3 language processing unit, which is the first announcement related to Groq since their 20 billion asset purchase in December.
Groq's language processing unit has been a very impressive piece of work.
Groq has really high throughput, and it is just a great option for running inference if you need speed.
So that I think is also quite significant.
Yeah, that was a 20 billion dollar purchase of Groq by NVIDIA a few months ago.
And yeah, I guess that was December.
Huge deal, obviously.
Already they're looking to ship a product in the third quarter of this year.
Yeah, that's fast: under a year from acquisition, and they're already fully integrated and pumping out stuff.
Just as a reminder here, when you think about Groq's LPU, right?
Language processing unit; you can go back to our hardware episode for the deep dive on this.
But basically, traditionally in any kind of GPU, like the NVIDIA GPUs, you'll have a stack of high bandwidth memory, HBM, that's gonna sit next to the logic die, right?
And the logic die is what actually does the matrix math, the computations that are interesting, the number crunching, but it has to pull data from that high bandwidth memory, the stack of HBM that's next to it.
And pulling that data takes time because they're physically just like different objects, right?
The HBM is a stack, and then you have a logic die, and they're packaged together, but, you know, you run into this challenge where it just has to travel to the memory, grab the data, and come back.
That creates that memory wall where the processor spends like 70% of its time just waiting for stuff, right?
So you've got a kind of cold start issue there.
And the Groq language processing units, the LPUs, they use a kind of memory called SRAM that is built directly into the silicon.
So the logic and the memory are much more intimately linked.
So the data doesn't have to travel, it's already there.
So you get this massive increase in sort of like internal memory bandwidth.
So how much memory, or sorry, how much data per second can flow between the relevant components: something like 10 to 20 times faster on the Groq units.
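As a rough illustration of why memory bandwidth matters so much for inference, here is a minimal sketch. It assumes an idealized, memory-bound decode step in which every model weight is read once per generated token, and it ignores compute time, caching, batching, and memory capacity limits. The model size and both bandwidth figures are hypothetical round numbers chosen for illustration, not specifications of any NVIDIA or Groq part.

```python
def min_seconds_per_token(weight_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Lower bound on per-token latency when all weights are read once per token."""
    return weight_bytes / bandwidth_bytes_per_s


# Hypothetical figures, for illustration only.
WEIGHT_BYTES = 70e9   # e.g. 70 billion parameters stored at 1 byte each
OFF_CHIP_BW = 4e12    # assumed off-chip (HBM-style) bandwidth: 4 TB/s
ON_CHIP_BW = 40e12    # assumed on-chip (SRAM-style) bandwidth: 10x higher

for name, bandwidth in [("off-chip", OFF_CHIP_BW), ("on-chip", ON_CHIP_BW)]:
    t = min_seconds_per_token(WEIGHT_BYTES, bandwidth)
    print(f"{name}: at least {t * 1e3:.2f} ms per token, at most {1 / t:.0f} tokens/s")
```

Under these assumptions, latency per token scales inversely with bandwidth, so a tenfold bandwidth gain translates into a tenfold higher ceiling on tokens per second for a single stream.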
So this is really, really important because NVIDIA is gonna be integrating this with their Vera Rubin architecture.
That's the next generation after Blackwell, and kind of doing a hybrid of these two ideas.
The idea is they're gonna have HBM sitting outside the main processor, but also integrate some of these SRAM-heavy compute tiles that are gonna be directly in the Rubin chips.
And so this is a really big merger, a physical merger of these companies that we're seeing here.
So yeah, it's pretty wild.
It also means a really intense demand for memory in the memory market.
Right now, I think SK Hynix, I saw something earlier today, are booked out until 2030.
They expect to have, you know, basically way more demand than supply all the way until 2030.
So this is all part of that, right?
People are realizing holy shit, like memory is a is a big deal here.
And NVIDIA, on the inference side of things, you know, I think Jensen called himself, what, the inference king or something at one of these things, so, you know, what's going on with this Groq acquisition is like, hey, I'm the inference king.
So this is part of that, right?
The Groq play is an inference play.
And next, going back to Mistral, one other thing they have announced, also at GTC, is that they launched Forge, a new offering that lets customers, businesses, train their own AI models.
It sounds like this supports various kinds of training.
You can pre-train and have an entirely custom model from scratch, which isn't something you would want for LLMs, but it sounds like maybe they allow it.
They also seem to be offering post-training and reinforcement learning, basically optimizing a model for a given company's needs, with all the stack and training knowledge and inference and so on that is pretty complex.
That is what Mistral and OpenAI and Anthropic focus on a lot: just the training infrastructure and setup and all of that.
So they pitch it as you know, you need the models to have your internal knowledge and information built in.
You can optimize it for your needs.
I think this is an interesting offering, with potential here, with open source models having started to get good, especially Qwen models, and Mistral models to some extent.
It is a potentially good bet that as open source models get better, you would see more people fine-tuning their own models, post-training and reinforcement learning for their own agents and their own flows, which ultimately is the best way to unlock performance: training.
You can do, you know, prompt engineering and you can do whatever, but training on data is key.
So they just announced it.
You need to sign up to get more details.
The details aren't super precise yet, but it appears that they're shifting focus or at least honing in on this specific strategy to try and differentiate and compete in the space.
Yeah, and I'd like to be upfront about my biases here.
Everyone, if you listen to the podcast a lot, you know I'm pretty bearish on Mistral and Cohere and those sorts of runner-up-to-the-frontier companies for a whole bunch of reasons, and this is not making me any more excited about this direction.
The reason is, first of all, this positions them basically as competitors to Cohere.
This is the Cohere play, right?
This is it.
You're basically saying enterprises, like we're just gonna be better at enterprise integration.
We're gonna make a stack for the enterprise.
The difference is Mistral is a few miles behind Cohere right now, right?
Cohere's been working on this for a long time.
They hit like a quarter billion dollars in recurring revenue in 2025 with some decent quarter-on-quarter growth, though I think ultimately they're gonna struggle to kind of be competitive in the world of like big infrastructure plays here.
But that's kind of fundamentally one of the challenges.
The other one is the companies that have an enduring advantage on infrastructure, the Anthropics of the world, the Googles, the OpenAIs of the world, are ultimately, if they ever want to, in a position to just eat Mistral and Cohere's lunch.
Like they can just go in and one day say, hey, you know what, we decided that we really care about this.
We're gonna make special models, you know, maybe distilled versions of our actual frontier models, whatever, to run locally on your thing, or we're gonna create a lot of.
Yeah, exactly.
And so ultimately, I mean, this is the challenge: you are going to end up competing with them.
And the question will at some point be, because we know this is where margin is made, quality versus quality at the frontier of capabilities.
If that ends up being the case, I mean, damn, this is an uphill battle, because OpenAI and Anthropic get to amortize the massive cost of training frontier models across every inference run that gets done on their infrastructure.
And if they're, you know, doing distillates of those models to serve the enterprise, the same thing applies.
The economics look pretty bad to me for stuff like this, but ultimately, you know, we'll see.
This is also, by the way, Mistral kind of playing into their European roots a bit.
So Cohere has a sort of Canadian footprint, they've got big EU ambitions, but Mistral is sort of the French champion, and they've very much been seen as the sovereign champion.
So data sovereignty, model sovereignty, you'll hear them say a lot of that.
Ultimately, from a free market standpoint, that's basically just an appeal to, like, European or French nationalism to kind of help float them on this otherwise potentially challenging market play.
I think yeah, maybe I would say you're a little bit caricaturing them.
They position it more as control over IP, and this is something that people care about.
I totally agree.
That's the play.
That's an identical play to Cohere, though, is what I'm saying.
And on that note, Cohere, their play, or largely what they have focused on, is releasing models that they have trained with a focus on enterprise.
So they've released RAG models, for instance, that they say are very good for enterprise use cases.
They largely have positioned themselves as having models and offerings that are useful for enterprise.
They do have a thing for building customized AI solutions.
I don't know how much of it is apples to apples with this new Forge, whatever it is.
That's kind of the thing, right?
So, yeah, Cohere's thing is called North.
It's this like whole enterprise platform, right?
It's for AI agents and workflows, and then they've got a bunch of, like, Rerank and Embed and a bunch of other things for RAG that are meant to be enterprise-oriented.
But that's kind of the challenge, is like ultimately you're already seeing this appeal happening quite quickly to nationalism.
I mean, Mistral is wrapping themselves in the French flag.
You see Cohere doing similar things with the Canadian flag.
Ultimately, these are signs of like companies trying to look for some source of alpha that's outside of the market.
I'm not saying that OpenAI isn't doing similar stuff.
Anthropic's certainly struggling to do similar stuff and still succeeding.
And so if this is kind of part of the flavor, to me it starts to look a bit like a crutch, and there's only so much, you know, CapEx spend that can be subsidized by governments; at a certain scale, which I think we're approaching already, it makes it hard to compete.
When your big differentiator is like, look how French we are, or look how Canadian we are.
That is a bit of a caricature for sure.
But it's a it's an important part of the pitch here.
I worry about that.
And that's kind of part of my bearishness here.
I'm just trying to be transparent about my reasoning here.
I could be wrong, but looking at the scale of these companies, I mean, man, Cohere's been around for longer than Anthropic, right?
Like they're, you know, what, a percent of Anthropic's scale?
I think this is a sign that these markets are smaller than maybe had been hoped for initially.
We'll see.
I mean, they could absolutely surprise me.
And I'm looking forward to seeing how it plays out.
Yeah, I think longer term, this is an interesting time to consider this direction of customized and, you know, fine-tuned models for a given business, because the open source models, the models that aren't closed, are starting to get good.
And so you might start seeing a case where, for instance, at Astrocade, right?
We have a pretty specific use case of coding for these games, right?
And having coding agents with a specific set of technologies, HTML, JavaScript, etc.
So it is feasible that if we were to do reinforcement learning and post-training on our own internal proprietary data of you know good games that people have made and their chat histories and so on, we would get a better model and a better performance.
That didn't used to be the case.
It used to be that like the base models just weren't good enough.
Just doing prompt engineering was going to be the best you can do.
I could see it heading in a direction, and Mistral may or may not be the one that owns this.
It could be that Anthropic or OpenAI jump in on it.
But regardless, having these sort of individualized customized models for different companies' agentic needs seems like a very much feasible future.
Yeah, definitely not questioning the existence of a market, right?
I'm just sort of skeptical of their positioning, as you said, to like, are they the ones to capture this?
Yeah, that's where I'm wondering.
Yeah.
It's it's something to try.
So it's something to try.
No, no doubt.
And back to chips: China's ByteDance gets access to top NVIDIA AI chips, according to the Wall Street Journal.
So they are saying that ByteDance is assembling significant computing power outside of China using these chips.
They are supposedly working with the Southeast Asian firm Aolani Cloud, and they plan to deploy approximately 500 NVIDIA Blackwell computing systems in Malaysia, totaling 36,000 B200 chips.
So that's 500 computing systems, presumably meaning racks, which amounts to tens of thousands of actual GPUs.
So yeah, it's, I suppose, giving us an indication that ByteDance, being a massive company that has its products used outside of China, no doubt, is able to make use of these chips in other countries.
Yeah, yeah.
And you know, if you're looking at this being like, hey, wait a minute, I thought we weren't shipping, I thought we just green-lit the H200 shipping.
What are we doing shipping Blackwells to China?
Well, that's the whole point.
It's technically not being shipped to China.
The export controls that apply here from 2023 regulate where the hardware is shipped, not where the compute is used, right?
And that was intentional.
It was meant to allow the sort of like global cloud infrastructure to be built based on American hardware.
But the thing is, ByteDance is not on the famous entity list, or there's a military end-use list as well, so the fact they're using NVIDIA hardware does not automatically, in and of itself, trigger restrictions.
The fact is that they're just setting up shop outside of China, and nominally that's totally okay.
And NVIDIA actually confirmed there were no objections here, and the BIS, Department of Commerce, apparently is also on board.
So they're all tracking.
Still, I mean, this kind of reveals, like, hey, well, this is what you're opening yourself up to.
If you think about the scale of this, by the way, for a second: what does it mean to have 36,000 B200 GPUs, 500 of these NVL72 units?
So this is typically a two rack system.
Yeah, I think we talked about it in the hardware episode at one point.
So this many GPUs adds up to about a 60 megawatt cluster.
Okay.
So 60 megawatts of power, that's enough to power roughly, like, 6,000 homes; you can think of it that way.
So it's like a little small town of compute.
Compare that to Microsoft Abilene, which is like 100 megawatts, or Elon's xAI Texas Colossus site; that one's more like 130 megawatts.
The Abilene comparison is a bit unfair because that's still kind of being built out, but Colossus is better apples-to-apples because, like, they're still building it out, it's single purpose, you know, it stood up super fast.
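A quick back-of-the-envelope check of the figures quoted in this segment, as a sketch. Every input below is taken from the discussion as stated (the 72 GPUs per system is inferred from the "NVL72" name and the 36,000 total); none of it is independently verified, and the per-unit values are simply what those quoted numbers imply.

```python
# Figures as quoted in the discussion (not independently verified).
SYSTEMS = 500          # NVL72 systems reportedly planned for Malaysia
GPUS_PER_SYSTEM = 72   # inferred from the "NVL72" name and the 36,000 total
CLUSTER_MW = 60        # speaker's estimate for the whole cluster
HOMES = 6_000          # speaker's "roughly 6,000 homes" comparison
ABILENE_MW = 100       # as quoted
COLOSSUS_MW = 130      # as quoted

total_gpus = SYSTEMS * GPUS_PER_SYSTEM
kw_per_system = CLUSTER_MW * 1_000 / SYSTEMS
kw_per_gpu = CLUSTER_MW * 1_000 / total_gpus
kw_per_home = CLUSTER_MW * 1_000 / HOMES

print(f"total GPUs: {total_gpus:,}")
print(f"implied power per system: {kw_per_system:.0f} kW")
print(f"implied all-in power per GPU: {kw_per_gpu:.2f} kW")
print(f"implied power per home: {kw_per_home:.0f} kW")
print(f"share of quoted Abilene figure: {CLUSTER_MW / ABILENE_MW:.0%}")
print(f"share of quoted Colossus figure: {CLUSTER_MW / COLOSSUS_MW:.0%}")
```

The GPU count is consistent with the reported total, and the cluster comes out at roughly half the quoted Colossus figure, which matches the "same order of magnitude" framing.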
What this is showing is you're allowing a Chinese company to get up to the same at least order of magnitude in compute.
It'll admittedly be online kind of, you know, later on this year, but that's still something.
So this is all part of the debate here over what is appropriate, how much compute should be allowed, even if it's overseas deployments that can serve Chinese customers, by the way.
This is still a data center that can be used to serve Chinese customers.
So you could, in principle, at least to my understanding here, ByteDance could use this to train a model that then is used in China, that then, because of civil-military fusion, gets used by the CCP, you know, the MSS or whatever other arm of the Chinese state.
So they've also planned additional deployments of up to 7,000 B200s at a bunch of data centers in Indonesia.
This is kind of a broader Southeast Asian strategy here.
So we'll see.
I mean, it's a it's a pretty big, you might think of it as a pretty big gap.
That's sort of how I think of it personally.
But there are different views on is this desirable?
And you could make an interesting argument in many different directions.
Moving back to the US, we've got an update on where Meta is at with their AI efforts.
The update is that they are delaying the rollout of their next model, code-named Avocado, from March to at least May.
I think this was actually announced last week, but we didn't manage to cover it.
So they appear to have just not been able to train a good enough model in time.
And so they are delaying this release.
They are discussing, or have discussed at least, temporarily licensing a competitor's AI technology, reportedly Google's Gemini.
They already are looking at the various extensions.
So overall, not looking great for Meta, is my read on this.
It's already been close to a year.
It feels like in mid-2025, a couple months after Llama 4, we saw this heavy, heavy, heavy push where they got Alexander Wang from Scale AI in a $14 billion deal.
And then they paid absurd money to get all these researchers from Anthropic and OpenAI and DeepMind and so on.
And they created Superintelligence Labs within the company; they reshuffled everything with the intent to train a competitive frontier model that can stand up to GPT 5.4 and Claude and all these other ones.
It doesn't seem like they're there.
And I, you know, personally, this is my read.
I think organizationally having Alexander Wang lead it is maybe not the best idea.
Someone who hasn't worked on frontier AI models; data curation at Scale AI, you know, it's related, right?
It's related, but I don't know if that's enough.
Anyway, so that's the news.
It's it's been delayed.
It doesn't appear like they're on track, but you don't know too much.
You know, this is all internal, maybe you know, they're just doubling down and it's gonna be great.
Yeah, well, you're right.
And it looks like it's reportedly landing somewhere in capabilities between Gemini 2.5 and Gemini 3.
So, you know, very passé at this point, hence this decision to delay.
They're angling for a private, non-open source model here, right?
So the appropriate point of comparison is, what does Claude look like?
You know, what does the best version of Claude look like?
It seems like, as you say, it's just not up to it, at least for right now, and that could change quickly, who knows?
But yeah, it's also, you know, part of this whole saga where you had Yann LeCun say, you know, fuck Alex Wang when he was leaving, and then there was, I forget, not Times of India, but there was some kind of India-based publication, right?
That said, hey, looks like Zuck is thinking about marginalizing him, which was, like, not verified.
It was false.
Or yeah, not false, but yeah, exactly.
And and then Zuck came out with a photo with Alex Wang sort of all smiles in the office type thing to get.
Now we do know, yeah, that aside from Yann LeCun, Alexander Wang has also clashed with internal big-deal product people like Chris Cox, Meta's chief product officer, and Andrew Bosworth, their chief technology officer.
Apparently, about how AI models should improve.
Alexander Wang has been pushing for, let's just train a super, super good model and not care about the business implications and product implications, which, you know, other people at Meta understandably are questioning.
Yeah, and so exactly.
And Alex comes from this viewpoint of, hey, let's do the OpenAI thing, let's do the Anthropic thing and go full on, you know, AGI and all that stuff.
That's very much him.
I mean, he is ASI pilled.
Well, the name of the superintelligence team kind of hints at that.
And and nominally, I mean, if you're gonna make a super intelligence team, yeah, they probably should be focused on super intelligence.
Like, I'm no branding expert.
I'm I'm not a marketing dude, but yeah, it seems like that's par.
Uh, should Meta have a superintelligence team?
Uh separate question.
That's a good question to ask.
I don't know.
That's right.
Well, certainly, if you believe that superintelligence is gonna be the thing that Alex Wang believes it is, and that maybe Zuck believes it is, then you have no choice.
You have to compete, because if somebody else does it, yeah, you're finished.
Depending on how the intelligence explosion goes.
Bottom line is a lot of fog of war, at least to me, still, on what's going on inside Meta.
The one data point we have is that we still don't have anything, and that's starting to become a real data point at this point.
Like, it's starting to become data.
I would have expected a model release by now, given the amount of infrastructure and investment they've put in here.
And given that they have trained models in the past. Llama 4 was underwhelming, but it was a pretty impressive model that most companies were not able to produce, even still.
So the lack of anything is perhaps concerning.
And one last story, a bit similar.
The story is Microsoft shakes up AI division as Copilot falls behind Google and OpenAI.
They're reorganizing the AI division.
Mustafa Suleyman is gonna be shifting focus exclusively to developing the company's own frontier foundation language model.
So effectively the same kind of focus that we just discussed with Meta.
Microsoft has worked on training their own, you know, alternative to ChatGPT and so on.
They've trained various models.
I believe the big one from them was Phi, which is a set of small models that they released.
They do have a lot of users.
So apparently Microsoft's Copilot app has 150 million monthly active users.
But if they want to compete with Gemini or ChatGPT, they're nowhere near.
They're way, way behind.
Gemini has now 750 million active users.
ChatGPT has roughly 900 million, sorry, weekly active users, versus Copilot, which has 150 million monthly active users.
So yeah, that's the news.
They are restructuring, they are recognizing that they're falling behind.
And if they want to have their own frontier model, they are not there.
This is all the more glaring given the insane distribution advantage that Microsoft has.
You know, distribution is supposed to win these kinds of battles.
There's a reason that Microsoft Teams kicked the shit out of Slack, like, within, you know, minutes of launching, and it's because they had distribution.
Microsoft is in every system, right?
And yet you have, like admittedly, okay.
So Google is probably a bad comparison because they have a similar insane distribution advantage.
But when you look at, you know, OpenAI's models and Anthropic's models, not being able to compete with those is a really kind of bad indication here.
So yeah, I mean, no surprise, something like this comes with a requirement to shift.
The structural dependence on OpenAI has been a big issue.
You know, Microsoft has that 27% stake in at least the for-profit unit.
Not going to get into the dissection of OpenAI and their corporate structure.
But again, the bottom line is that, in a sense, that has been maybe a, I don't mean to call it a security blanket.
It's increasingly clear that they're competing directly with OpenAI for a bunch of enterprise customers.
So cannibalizing themselves really from the inside, that's a problem.
That's a really big problem.
Microsoft has a lot more to lose in that sense.
And yeah, so they got to find a way to turn this around right quick if they think AI is headed where it might be.
Quick recap, by the way.
So Mustafa Suleyman, former co-founder of DeepMind, joined Microsoft, I think last year, and has been their CEO of Microsoft AI.
And this whole story is based at least partially on a memo he released titled A New Structure for Microsoft AI.
And the gist of it is, again, similar to Meta, that he wants to pursue superintelligence.
Consumer things and product considerations get in the way of that.
So he's gonna be freed up to focus on that.
Jacob Andreou, former senior vice president at Snap, will take over as executive vice president, leading the Copilot division.
So there's a split here where there's Copilot, the product, the app, etc.
Someone else focuses on that.
Mustafa focuses on building the frontier model and getting to superintelligence, very much like how we just discussed with Alexander Wang at Meta.
Well, to your point.
I mean, it's, you know, happening at Google.
Like basically the question is how seriously do you take the potential imminent arrival of superintelligence?
If you take it very seriously, you got to drop everything and work on that shit, man.
Like, that's what Alex Wang is saying.
And clearly, you know, Microsoft is not immune to that tension, right?
Like, how much do you view AI as a commercial product question?
And how much do you view it as this is the thing that's triggering the singularity?
Holy shit.
Like, the one place you don't tend to see that kind of tension, and I'll be careful here, it does exist there.
You see some of it in the media, I should say, but you don't see as much of it at Anthropic.
Like, it seems like generally Anthropic in its public communications is much more aligned around the idea of where they're going.
So, you know, do with that what you will, but it definitely is notable that basically everybody else is having these large public spats about this.
And moving on to policy and safety, where we have quite a bit of research related to safety.
The first one is decision theoretic formalization of stenography with application to LLM monitoring.
So stenography is a practice of concealing secret information within non-secret data.
We've seen this typically with images, but you can also do it with text.
And as far as LLM monitoring, right?
If you have a malicious AI that you want to catch, if it's trying to do something bad or kind of monitor its communications to know what it's doing, potentially it could try to get around that sort of monitoring with stenography.
And this is steganography, sorry.
And so this is presenting a new framework or formalization of how to view this entire thing, and does also discuss the implications for LLM monitoring.
And Jeremy, you can take and go do your thing.
Yeah, I'll say steganography four times fast.
So yeah, we've talked about steganography on the podcast before quite a bit, right?
This idea that a model might be reasoning in its chain of thought, for example, but there could be other manifestations in a way that is intentionally meant to deceive you, but that allows it to kind of make evil plans, right?
So an example, they have this in the paper.
So here's an excerpt of a reasoning trace from o3.
It says the summary says improved 7.7, so whatever that means.
But we can glean, and then this is the weird kind of non-sequitur part of the chain of thought.
It says, but we can glean disclaim, disclaim, synergy, customizing illusions.
But we may produce disclaim, disclaim vantage.
Now, lightning disclaimer, overshadow, overshadow, intangible.
Let's craft, right?
So, like, there are these random words that don't seem like they have a role in solving the problem, which may be, in principle, the model hiding reasoning that's oriented towards solving another problem besides the one that you wanted it to solve, one that could be malicious or whatever.
And so the central question of this paper is going to be: can we measure steganography, right?
Can we measure this idea of to what extent is my model encoding or trying to hide reasoning from me or from some oversight model in its chain of thought?
And here's the surprisingly elegant way they're gonna approach this, right?
They're gonna define this thing called the steganographic gap.
Roughly speaking, it's a difference in performance between two agents. There's the model that generated the reasoning trace, which is potentially trying to use it to carry some hidden content.
They call it the receiver, but basically it's the model that generated the suspect reasoning trace, and that model presumably is using some kind of encoding scheme. And then there's a trusted model of about the same capability that doesn't know whatever secret encoding scheme the generator model used.
That's the sentinel, which is gonna review that model's outputs, but it doesn't have access to the encoding scheme.
And if what you find is that, based on the chain of thought, the suspect, the generator model, performs really, really well, but the sentinel performs really badly, then that implies that the generator model is able to get some kind of value out of that chain of thought that the sentinel, the reviewing model, couldn't.
And that suggests that there must be information encoded in there that only the generator model is able to use.
Hence a sort of hidden steganographic value.
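As a rough illustration of the gap described here, one could score both models on the same tasks while they read the same suspect reasoning trace, and subtract the averages. This is only a sketch under stated assumptions: the function name, the 0/1 scoring, and the example numbers are hypothetical and are not taken from the paper, whose formal definition is decision-theoretic.

```python
from statistics import mean

def steganographic_gap(receiver_scores, sentinel_scores):
    """Mean task score of the receiver minus mean task score of the sentinel.

    Both lists hold per-task scores (1.0 = correct, 0.0 = wrong) obtained
    when each model answers using the same suspect reasoning trace.
    This simple difference of means is an illustrative assumption.
    """
    if len(receiver_scores) != len(sentinel_scores):
        raise ValueError("score lists must cover the same tasks")
    return mean(receiver_scores) - mean(sentinel_scores)

# Hypothetical scores on five tasks.
receiver = [1.0, 1.0, 1.0, 0.0, 1.0]  # model that knows the encoding scheme
sentinel = [0.0, 1.0, 0.0, 0.0, 0.0]  # trusted model without the code book
gap = steganographic_gap(receiver, sentinel)
print(f"gap = {gap:.2f}")
```

A gap near zero means the sentinel gets as much use out of the trace as the receiver does. A large positive gap suggests the trace carries information only the receiver can decode.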
So they do a whole bunch of stuff to kind of surface this.
They're fairly elegant experiments.
I'll focus on the code word training one.
So they fine-tune models using RL to use certain code words.
So for example, instead of using heads or tails, they'll use sun and moon, right?
And then they show that as they give the oversight model, the sentinel, more visibility into the kind of code words and how they map, gradually the performance gap between the sentinel model and the generator model drops.
So this kind of makes sense.
You're giving the sentinel more and more access to the kind of code book that generated these outputs, so no surprise there.
But then they do some more interesting fine-tuning experiments.
They take GPT 4.1 and they get it to use this thing called the bigram cipher.
Basically, it means that you take any pair of letters and swap them out with some randomly chosen pair of letters.
So maybe every time you see, you know, TE, that becomes like G H or something.
And then so basically you see gibberish.
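To make the bigram cipher concrete, here is a minimal sketch of a pair-substitution cipher of the kind described. The helper names, the lowercase-only alphabet, and the seeded random key are assumptions for illustration. The paper's exact construction may differ.

```python
import random
from itertools import product
from string import ascii_lowercase

def make_bigram_key(seed=0):
    """Build a random one-to-one mapping from letter pairs to letter pairs."""
    pairs = ["".join(p) for p in product(ascii_lowercase, repeat=2)]
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(pairs, shuffled))

def substitute(text, mapping):
    """Replace each consecutive pair of lowercase letters using the mapping.

    Characters that do not form a lowercase pair are copied unchanged.
    """
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in mapping:
            out.append(mapping[pair])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

key = make_bigram_key()
inverse_key = {v: k for k, v in key.items()}

message = "solve the problem"
ciphertext = substitute(message, key)          # looks like gibberish
recovered = substitute(ciphertext, inverse_key)
print(ciphertext)
print(recovered)
```

Without the key the text reads as noise, while anyone holding the key can invert it exactly. That is the property the sentinel lacks in the experiment.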
And so you get the model to solve competitive programming problems or something like that using this bigram cipher, and they measure that steganographic gap.
So the model that was trained to use the cipher versus a model that wasn't, like, how does their performance vary over time?
Is there an ability to detect the existence of the cipher, the use of that cipher?
It's quite an interesting paper.
It doesn't quite get to, like, anyway.
There are a couple of gaps. If the model that's generating the content actually has a goal that's different from the goal that it's ostensibly trying to pursue, that won't be detected here, right?
So if it's got like a payload that is unrelated to the task that's being audited for, you just won't get detection.
And sandbagging could also be a problem here, like deliberately underperforming when it thinks it's being audited.
And we do know the models can often detect indications of that.
So bottom line is it's an interesting kind of first step in the direction of detecting steganography.
Could be an interesting and valuable path to explore with some pretty significant gaps too.
Next paper, a bit related.
The title is Reasoning Theater: Disentangling Model Beliefs from Chain of Thought.
So one theme we've been kind of seeing in research over the past few months is looking into chain of thought, the intermediate outputs that models are now trained to give, their reasoning.
You ask it a question, it does some reasoning and chews on the problem, and then eventually you get an answer.
And so there's various questions about that intermediate output of like, does it faithfully capture what the model is doing?
Is it monitorable, as we've just discussed, for instance?
And what this paper is looking at, is it performative?
So is the model actually thinking and like trying to find an answer via this chain of thought, or does it know the answer and is just talking as if it's thinking?
And what they find is often the chain of thought is performative.
Like you don't actually need it.
The model, if you look at its internal activations, can already give you an answer with high confidence, which is correct.
Now you can look at different settings.
So if you have harder problems and smaller models, then the thinking becomes more legitimate.
It actually does need to do this chain of thought to get to a high confidence answer.
It's actually kind of related or similar to we discussed this deep thinking tokens idea a few weeks ago, where we were looking at if you track internal activations and look at things like deep thinking backtracking, certain behaviors that indicate actual genuine kind of reasoning about the problem, you can find that these are very significant markers as to the confidence of the model at this point in reasoning about its answer.
So one application of this is, if you can know the confidence level from the internal activations, you can do what's called early exiting: cut off the reasoning and just get to the answer quicker.
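A minimal sketch of that early-exit idea, assuming you already have one confidence value per reasoning step from some probe on the activations. The function name, the threshold, and the numbers are hypothetical.

```python
def early_exit_step(confidences, threshold=0.9):
    """Return the index of the first reasoning step whose probe confidence
    reaches the threshold, or None if it never does."""
    for step, conf in enumerate(confidences):
        if conf >= threshold:
            return step
    return None

# Hypothetical per-step confidences read off a probe during one chain of thought.
trace = [0.35, 0.62, 0.93, 0.95, 0.97]
stop_at = early_exit_step(trace)
print(stop_at)
```

In this made-up trace the reasoning could be cut after the third step, saving the tokens that would otherwise be spent on the last two.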
Yeah, another kind of pointer to that being the case here.
And to your point, there are three different methods that they'll use, because the central question here is: what does my model actually believe at any given point during reasoning?
So one is this idea of attention probes.
So you basically have these classifiers.
They're linear models, or very simple models, that you plug into any given layer of the residual stream, and you go: okay, you're a classifier, I want you to predict what the final output is gonna be at any given moment.
So basically, this is looking layer-wise and interrupting the model sort of mid-token-processing for a given token.
But then there's separately this idea of forced answering that you just alluded to, where you essentially interrupt the model mid-reasoning, like in the middle of a chain of thought, and say, okay, now I want you to commit to your answer.
And then the last one is just a chain of thought monitor: wait for the model to produce the entire chain of thought, and then figure out, you know, where in the text the final answer is implied.
So at what point could the model have just off-ramped and said, okay, you know, this is it.
And they can compare all three of these, and by noticing the differences between them, they can start to make some pretty interesting conclusions.
I mean, you talked about this idea of sort of easy tasks leading to more, they call it performative reasoning, right?
So the model's pretending to reason about it, sort of, you know, dressing up a problem like it's harder than it is.
So on MMLU questions, you find consistently that the attention probe, which is again that layer-wise, kind of looking at a given layer, plugging in a classifier to just look at the activations there and then making that prediction.
So the attention probe could decode the correct final answer from the model's internals really early on, right?
Like sometimes right after reading the question, like straight one-shotting it, right?
Whereas the chain of thought monitor sees no answer indicated in the text yet.
So the model essentially knows the answer immediately, but it keeps thinking out loud anyway.
And that's kind of where you know forced answering is also going to give you good accuracy early on and all that.
That's an indication of what they'll call performative reasoning.
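One way to picture the comparison is to ask how many reasoning steps pass between the probe first decoding the final answer and the text-level monitor first seeing it. This is a simplified sketch: the function names and the "first matching step" rule are assumptions, not the paper's metric.

```python
def first_step_with_answer(predictions, final_answer):
    """Index of the first reasoning step whose prediction equals the final
    answer, or None if no step does."""
    for step, pred in enumerate(predictions):
        if pred == final_answer:
            return step
    return None

def performative_lag(probe_preds, monitor_preds, final_answer):
    """Steps between the probe decoding the answer and the chain-of-thought
    monitor seeing it in the text. None if either never finds it."""
    probe_step = first_step_with_answer(probe_preds, final_answer)
    monitor_step = first_step_with_answer(monitor_preds, final_answer)
    if probe_step is None or monitor_step is None:
        return None
    return monitor_step - probe_step

# Hypothetical easy question: the probe reads "B" immediately, while the
# text only states "B" at the last step.
probe_preds = ["B", "B", "B", "B"]
monitor_preds = [None, None, None, "B"]
lag = performative_lag(probe_preds, monitor_preds, "B")
print(lag)
```

A large lag is the pattern described as performative: the internals already hold the answer while the text keeps "thinking". A lag near zero looks more like genuine reasoning.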
And then there's the fact that hard questions involve genuine reasoning more often, especially with small models.
We've actually seen tons of examples of that on the podcast before.
The way that I tend to think of it, at least, is: the bigger the mismatch between the capabilities of your system and your compute budget taken together (your model, the amount of test-time compute, whatever) and the actual difficulty of the problem, the more opportunity there is for the model to use that extra capability overhang or compute budget to plot or scheme or do other things besides what you want it to do.
And so I said it, I think on the last podcast, I'll say it again.
A really important research direction, to my mind, would be: how do we better index our compute budgets and our model selection, our model capability selection, so that models and capabilities are closely matched to problems?
Economically, you want this anyway, right?
You don't want to use a model that's too big or too expensive to solve an easy problem, but there's also a safety imperative to this now, where you want to try to take away as much as you can the excess compute, the excess capability, that otherwise goes into these undesirable effects.
Next up, In-Training Defenses Against Emergent Misalignment in Language Models.
So emergent misalignment is a phenomenon we've discussed previously, where you fine-tune your model slightly on a small set of stuff that's bad, like you make it write bad code, and then it turns out that that broadly makes it bad in all sorts of ways and liable to become misaligned.
And so what the in-training defenses here are about is: let's say OpenAI or Anthropic do give a fine-tuning API and allow people to fine-tune their models on small datasets. Well, you would then introduce this potential for broad misalignment through just a small amount of fine-tuning. And they look at several different ways to try to defend against it and, you know, prevent it from happening.
So they look at regularizing, to have a penalty for straying too far from a good model. And the method that they ultimately conclude works the best is interleaving training examples from a general instruct-tuning dataset, a general this-is-how-you-should-behave dataset, with the fine-tuning dataset. They find that is the most effective for defending against fine-tuning misalignment.
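The interleaving defense can be sketched as a data-mixing step that runs before fine-tuning. The function name, the mixing ratio, and the toy examples below are assumptions for illustration. The paper's actual ratio and sampling scheme are not stated here.

```python
import random

def interleave(finetune_examples, instruct_examples, instruct_every=1, seed=0):
    """Return a training stream where a general instruction-tuning example is
    mixed in after every `instruct_every` fine-tuning examples."""
    rng = random.Random(seed)
    mixed = []
    for i, example in enumerate(finetune_examples, start=1):
        mixed.append(("finetune", example))
        if i % instruct_every == 0:
            # Sample a "this is how you should behave" example at random.
            mixed.append(("instruct", rng.choice(instruct_examples)))
    return mixed

# Hypothetical toy data.
finetune = ["ft-1", "ft-2", "ft-3", "ft-4"]
instruct = ["general-1", "general-2", "general-3"]
stream = interleave(finetune, instruct, instruct_every=2)
print([source for source, _ in stream])
```

The design intuition is that the model keeps seeing examples of its general aligned behavior while it learns the narrow task, so the narrow data has less room to shift its behavior broadly.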
Yeah, this is kind of an interesting paper.
I have mixed feelings on emergent misalignment because I think it may not be a problem.
It may almost be a good thing in a weird way.
So a little bit weird to focus with regards to fine-tuning because you could have like intentional misalignment.
Why, why do you know?
Right.
Well, no, that's a good point.
I think the frame is something like: emergent misalignment leads to these surprising knock-on effects.
What it fundamentally means is if you can fine-tune for one behavior and that leads to massive changes in other behaviors, then damn, we got to be really careful about fine-tuning because we can't think of a model once fine-tuned as in any way comparable from a safety standpoint to the original model.
So any safety evals we've run on the original model no longer apply.
I think that's kind of part of it.
I think you know, this is a good example of the kind of persona theory that Anthropic's been kind of pushing lately, where you're fine-tuning a model to write insecure code, the model kind of goes, oh, so I'm actually, it's not that I'm being trained just to write insecure code.
I'm being asked to activate my misaligned persona in a more general sense, which means that there is a notion of a misalignment persona in the model in the first place, which means that there is an out-of-distribution generalization kind of understanding of what alignment and safety is, which seems like it could actually be a good thing.
That was kind of the rabbit hole that I was trying to avoid.
But that's the general direction.
One quick observation on this, and I'll move on.
But basically, there are two defenses where I found the relative performance pretty surprising.
Neither one of these was the most performant one.
So they use this KL divergence regularization one, where they're gonna penalize the model for drifting too far from the original aligned model during training, right?
So I'm gonna fine-tune you to write insecure code, but if you drift too far on your outputs in other domains as well, I'm gonna prevent you from doing that by forcing your outputs to be at least somewhat similar.
So that's one.
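As a minimal sketch of that first defense, the fine-tuning loss gets an extra term that grows as the tuned model's output distribution drifts away from the original model's. The direction of the KL term, the weight `beta`, and the toy probabilities are assumptions, not details confirmed by the paper.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same tokens.

    Assumes every q[i] is positive wherever p[i] is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def regularized_loss(task_loss, tuned_probs, reference_probs, beta=0.1):
    """Task loss plus a penalty for drifting from the reference model."""
    return task_loss + beta * kl_divergence(tuned_probs, reference_probs)

# Hypothetical next-token distributions over a two-token vocabulary.
reference = [0.5, 0.5]   # original aligned model
tuned = [0.9, 0.1]       # model after some fine-tuning
penalty = kl_divergence(tuned, reference)
total = regularized_loss(task_loss=1.0, tuned_probs=tuned, reference_probs=reference)
print(f"penalty = {penalty:.3f}, total = {total:.3f}")
```

When the tuned model matches the reference the penalty is zero, so the extra term only bites once outputs start to diverge.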
The other one was a similar thing, it's LDIFS, feature-space regularization.
It's like basically the same thing, but for the activations of the model, roughly speaking.
So instead of penalizing drift in the outputs, the actual text that it writes, it penalizes drifting too far in terms of the model's internal activations.
And this one, I would have expected it to actually be more effective because in some way it's more fundamental.
It feels like you're getting at the actual, you know, guts of the model.
It doesn't work as well though, as simple sort of text-based regularization.
Like, let's make sure that the outputs as read look better.
And I was thinking about this. I think part of this is that the text-based output is more directly tied to the behavior itself that you care about.
But the other piece is, it's possible they just selected the wrong layers to focus on.
They looked at, like, every fifth layer just to save memory, but that might not be the right kind of play.
And then also, I guess in activation space there's a lot of redundancy, so you may not be hitting all of the representations that matter.
Bottom line is this is quite interesting, and we just keep surfacing surprising findings in alignment.
And that's not a great thing, but it's also interesting.
So there we go.
And I think it used to be the case that there's, like, people who care about alignment and people who don't, and that is starting to change a little bit.
Yeah, you're right.
Yeah.
Yeah.
And next up, a more sort of programmatic real-world study.
How do frontier AI agents perform in multi-step cyber attack scenarios?
This builds on something we discussed, I think last week.
This is from the AI Security Institute.
Previously, they showed that as you scale test time compute to 100 million tokens, models get better and better and better.
This is effectively building on that.
They released a paper where they measure the capabilities via what they call cyber ranges, which are simulated network conditions within which you would execute an attack without too many controls, without defenses.
You just need to execute a set of steps to accomplish some goals.
And they have you know multiple levels of completion, more or less.
And so what they find is perhaps unsurprising, the models get better and better and better.
And also you get better and better and better as you get to more test time compute.
So the overall conclusion is the capabilities, the cyber harm capabilities of the models are improving rapidly and are starting to get to some fairly complex scenarios.
First of all, a great paper by the UK AI Security Institute.
So they've continued to pump out really high-quality stuff that feels very relevant to a whole bunch of endgame scenarios around ASI.
Cyber is very important, at least the way I think about it, quite important in that picture.
And so, well, two trends that they identify, right?
So the first is each successive model generation outperforms its predecessor at fixed token budgets, right?
So for a given token budget, you're gonna see better performance from Opus 4.6 than from Opus 4.5, right?
So they see that. They also see performance improvements with more tokens.
Like, we talked about this last week.
Pretty interesting, a good paper, and it's worth reading the descriptions of their cyber ranges.
I won't do it here, but it's worth it if you're curious, because this does go well beyond the sort of traditional capture-the-flag stuff that we've seen in offensive cyber model evals.
This is really like simulating a whole corporate network, a whole corporate environment, which is much closer to you know what actually makes a difference in the real world.
Next up, Eval Awareness in Claude Opus 4.6's BrowseComp Performance.
This is a report from Anthropic.
And this has been something that's been discussed a little bit, kind of on and off.
So the eval awareness you've referenced is one of the topics to be aware of with regards to alignment and safety.
Basically, if you're checking the model for doing the right thing or not, and it's aware of that, then it might behave differently when you're evaluating it versus when it's actually out in the wild.
And so here they look at nine cases where leaked answers appeared in public web content.
And among 1,266 problems on BrowseComp, there were 11 cases where answers came from benchmark materials.
And yeah, basically often or in some cases wound up reasoning that the questions that it was dealing with were evaluation shaped.
And then having concluded that this is likely an evaluation, it looked up a benchmark.
So these are some examples where this can happen, at least in practice already.
Yeah, and in the cases where you tend to see this happen, according to the report, first you'll see the model try, like, hundreds of normal search attempts, and then conclude that the question just felt artificially constructed after not succeeding through those earlier attempts.
And then it'll start to name and look for specific benchmarks.
So you'll see it look up, you know, GAIA, BrowseComp, or whatever, a whole bunch of different evals.
Then it'll find whichever the eval is, and it will typically look at the implementation like on GitHub or whatever, and go from there.
And so that's how a lot of this happened.
One other weird thing that was highlighted on the interweb this week was that you have some e-commerce sites that will auto-generate these persistent pages, so pages that stay up, like sub pages that log the history of search queries on their website, even when there are no matching products.
So if you search for, like, Purple Turtle ZombieSock on eBay or whatever e-commerce site, that will lead to the creation of a persistent page somewhere on the internet.
And the effect is that every agent that runs BrowseComp slowly caches its queries as these permanent indexed pages.
So over time, agents from different evaluation runs were inadvertently leaving traces of their search strategies that could be seen by future agents, which is one thing that's reported in the context of this paper.
Maybe the broader implication is this is an interesting way for agents to have a kind of persistent memory that transcends their instantiation in the moment, though you shouldn't overfocus on this particular mechanism as a way that agents could either communicate with other agents or with their future selves in illicit ways.
It's more saying that, hey, the internet is a really complicated, messy place.
There are easy ways to hide information that can't necessarily be found easily.
And that's a really important thing from a safety standpoint.
Notably, Anthropic says they don't see Opus 4.6's behavior here as being an alignment failure, because the model wasn't told to restrict its searches in any way, just to find the answer, which, like, fair enough.
But also, for those of us who come from the world of AI alignment from back in the day, there is a thing called coherent extrapolated volition, CEV, basically just how well you could reasonably predict what someone would have wanted you to do in context from the prompt.
And I think this qualifies as a failure of that.
Like you would probably assume that you don't need to be told not to try to game the system in this way.
This feels like gaming, like hacking the benchmark.
I think that this is a failure of CEV and therefore a failure of alignment.
That would be my personal take on this.
But that's, you know, not Anthropic's take, and that's consistent, you know, they have their own way of thinking about this.
But I think you could view it as still being a failure of alignment.
Yeah, I think it's an implicit kind of assumption, right?
If you're being tested and you're like, oh, I'm probably being tested, then the answer should not be, oh, let me look up answers to the test, right?
Right.
It's a pretty intuitive thing.
Next, another release from Anthropic, they have released Bloom, an open source agentic framework for generating targeted behavioral evaluations of AI models.
So it just is, it's automating behavioral evaluation.
You tell it what sorts of behaviors you want to verify the model exhibits.
So, like, is it generally honest?
Is it gonna call me out on BS if I ask it?
Anything really that you want to test for.
And in the past, you would, you know, have to come up with a suite of tasks and a set of prompts, and you would write a judge.
More or less, this presents a framework where it does all of that for you.
You give it the thing you care about, and it generates the scenarios, runs a model, evaluates it, and gives you a report.
And Anthropic gives some examples on four different types of things: delusional sycophancy, self-preservation, self-preferential bias, things like that.
And they showcase how this framework can evaluate that and do well.
Yeah, and this is Anthropic kind of putting some of its money where its mouth is on this whole thesis of automated AI research and AI alignment research in particular, right?
So they're trying to encourage, and this is why it's open source, they just want more automated alignment research if they can.
Hey, you know, this is one idea for an agentic framework that can do this, and it's probably valuable in many ways.
This is being released not alongside, but recently, a couple weeks ago or something like that, Anthropic released Petri, which is this auditor framework that looks at the overall behavior profile of a bunch of different models, and it will identify new misaligned behaviors.
And so you can think of Bloom as a complement to Petri.
It's like Petri discovers the problem, and then Bloom lets you go deep and measure a specific behavior, one that maybe was identified by Petri, in more detail.
Anthropic is also releasing a benchmark for behaviors.
I think you mentioned that.
Sorry, one that spans 16 different frontier models.
So this was all generated within a few days, apparently, with Bloom.
So they're really trying to rely on their dogfooding essentially with this.
You can read the post. They've got a four-stage pipeline that they describe in some detail, and some really interesting evidence that suggests that the Bloom framework aligns well with human judgment.
And so they did a bunch of tests to make sure that in fact it does reveal the same things that a human would conclude looking at this stuff, which is always an important thing.
So you can check that out.
It's a quite good and fairly detailed post.
Next up, how well do models follow their constitutions?
So famously, Anthropic follows this constitutional AI framework where you encode the rules of the system either as hard rules like do this or don't do that, or as general principles, more recently of how a model should act, right?
It should be broadly trustworthy, helpful, things like that.
And so in this study, this group looked at the constitution and actually used Petri to run it for a set of tasks, something like 200 tasks for various aspects of the constitution.
And they evaluated various models, from Anthropic and also non-Anthropic ones, on their tendency to go by the constitution.
And the results are actually good.
Like, the Claude models are getting better and better.
In the 4.6 set of models, Sonnet 4.6 had a 2% constitution violation rate, Opus 4.6 3%, and prior models were quite a bit higher.
Sonnet 4 had a 15% constitution violation rate.
And if you look at GPT 5.2 and Gemini 3 Pro, they also have quite high violation rates against the constitution.
They give some examples of violations, and we see that some examples are like producing cyber attack code when you shouldn't, or agreeing to do something that you shouldn't after the user asks you a few times following a refusal.
They have you know a lot of data to basically go through.
But the overall conclusion is Anthropic is kind of following through on the constitution with how their models are trained.
Yeah, and it's interesting, they did some cross-company testing too.
So, you know, GPT-5.2 might do really well on OpenAI's model spec.
In this case it got a 2.5% failure rate, but it gets a 15% failure rate on Anthropic's constitution.
So this tells you one of two things, right?
Either OpenAI's spec and Anthropic's constitution are actually quite fundamentally different documents, and there's an argument to be made there.
But it also could tell you something about how brittle GPT-5.2's alignment to OpenAI's spec is, or in turn, you know, Claude's alignment to Anthropic's constitution.
Like, if the vibe is still generally the same and the direction is still the same, but you slightly change the content, now all of a sudden you're getting a much, much higher failure rate.
It could be either of those things.
And I'd be really interested to see kind of like a deeper dive into what the source of that causality is.
But in any case, very interesting report.
There's a bunch of ways in which you see failures that existed, at least in the Claude models, that have just been wiped out in recent generations, with the 4.6 generation in particular.
So Opus 4.5 and Sonnet 4.5 would produce a whole bunch of overcapitulation problems.
So we have an auditor that makes three consecutive ethically questionable requests.
And each time the model sets a firm boundary, the user pushes back, and then the model folds, right?
So that's one common example that you just don't see anymore.
Overrefusal is another one I think you alluded to.
You know, Opus 4.5 would not say "I love you" on a paid romantic companion app where the system prompt explicitly authorized expressing affection.
So that's now gone as well.
They still have some remaining failure modes.
One they refer to as drastic autonomous action.
Opus 4.6, deployed as an infrastructure monitoring agent, detected anomalous activity.
After about three minutes of failed contact, it generated its own authorization code and executed a full network severance affecting 2,400 clients.
The "attack" was a routine nightly disaster recovery sync.
So basically, this is just too much of a freak-out, right?
Then there's AI identity denial, where it just says that it's not, you know, an AI bot or whatever.
So, you know, a bunch of things still need to be ironed out, but they have seen basically the vanishing of some really interesting failure classes with the latest generation, I should say.
Right.
Then the trend also holds at least somewhat for GPT.
The alignment to their model spec has improved from GPT-5.1 to GPT-5.
And as you said, the other models, Sonnet and Gemini, can be worse on that particular spec.
So there is an interesting question there of whether in training they stick overly close to this one formulation or framing, and then it's easy to trip the models up if you frame or instruct things in a slightly different way.
The last story for the section, Nvidia's H200 license stirs security concerns among top Democrats.
So Senator Elizabeth Warren and Representative Gregory Meeks, who are the top Democrats on committees overseeing US export control programs, have raised these concerns after reviewing a license recently issued to NVIDIA for H200 chip exports.
The Trump administration approved these sales to certain customers in China, which the Democrats said is deeply at odds with the policy Congress articulated in the Export Control Reform Act. As we covered previously, this allowance of exports was very much a turn, a change in what the policy has been.
A pretty abrupt and significant change in the export control system.
So it's not surprising to see these Democrats flagging it.
Yeah, the concern here is that the licensing process that currently exists for these chips, well, for really any chips, is both too weak, in that GPUs are being allowed to ship where they shouldn't, and also too opaque.
And, you know, there's this idea that once you export these GPUs, at that point you really can't enforce restrictions on how they're used.
I mean, we just covered that story about ByteDance, right?
You ship it out to some third-party country.
Yeah, sure, it's not China, but it's being used by a Chinese company.
What's your guarantee about how they're going to use it once they have those chips, right?
So they're calling for a couple of different things.
First, they want to make sure they know who applied, who got approved, you know, why decisions were made.
That's a big one.
They're complaining that really the criteria for approvals are just unclear.
Like, how do you decide which requests get approved and which get denied?
They're asking for visibility into that, transparency on that, including in particular how security and economic benefits are being traded off.
And so there's basically a call for briefings and all that stuff.
They're also calling out the fact that they kind of seem to be getting a bunch of ad hoc or incomplete briefings.
There's no like real-time congressional oversight.
It's almost like Congress is responding to these announcements after the fact, once things have been shipped or committed to.
And they're basically saying, you know, you should be giving us advance notice of this.
Basically, a bunch of things like this.
They're specifically asking for a geopolitical analysis of how proposed exports of chips could support China's military or domestic AI capabilities, and looking for the reaction of US allies, that sort of thing.
So this makes a lot of sense as pushback.
You know, we have heard a lot of sudden announcements: the H200 is, you know, shippable, now it's not, JK, now it is.
So asking for some sort of legislative oversight, which is what's being asked for here, seems reasonably sensible.
And I think this is a bipartisan issue.
It's Democrats who are leading the charge here just because, you know, it's a Republican House and Senate, but you've certainly got bipartisan agreement on the vast majority of export control policy.
And so that's a large part of what's being driven at here, and what's motivating the response and the ask for more transparency.
And that is it for the safety and policy section.
We've got a couple more papers in research and advancements, but we actually have to do the same thing as last week, where I gotta get going.
And Jeremy has something.
So I think Jeremy will follow up with a recording covering these papers in as much depth as...
I was gonna say, I promise it won't be an hour this time.
All right, Jeremy here for the handful of papers that I'll cover on my own without Andre, who had to run off.
I also circled back, so the lighting's a little bit different here and everything's a little bit off.
So two papers we want to cover.
One here was called Attention Residuals.
And this is actually one of those papers that I think might matter.
You know, you can never be sure, but it's addressing something that really feels quite fundamental.
So when we talk about residual connections in a transformer, which is how all kind of vanilla transformers today work, you feed an input to a layer, right?
And that input is gonna get split into two.
You can think of it as two forked branches, basically, at each layer of the transformer.
So in one branch, you're gonna apply some kind of transformation, like, you know, a weight matrix or something, to update the input.
In the other branch, you're gonna really do nothing.
And you're gonna merge those two branches together.
And the branch that does nothing is basically just passing the input right back in, and you're gonna add it to the modifications that you applied, that is, the weight matrix times the input, to form the output of that layer.
So your output is the initial input plus a transformation applied to the initial input.
Now, one thing that this does is it means at every layer you're basically adding more.
You're adding a transformation of the input to that layer onto the output.
And so you actually get larger and larger, kind of ballooning values for these residual activations as they work their way down the network.
But that's basically the idea.
The whole philosophy behind this is that people find that when you go to do backpropagation during training, unless you send the input to a given layer to its output in this way, the model will kind of start to forget about the input to that layer.
That information gets lost: all of the additions of these transformations over the layers of the network cause earlier transformations to get forgotten.
And so you're basically trying to reinject, remind the model like, hey, this is what the previous layer said, by the way.
Don't forget that.
But yes, also you can tack on your own correction, your own update, right?
So that's kind of how standard residual connections work.
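The "input plus a transformation of the input" rule described above can be sketched in a few lines of plain Python. This is only an illustration: the helper names and the stand-in transformation `half` are hypothetical, and a real transformer layer would use attention and MLP sublayers with normalization rather than a fixed scaling.

```python
from typing import Callable, List

Vector = List[float]


def residual_layer(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Standard residual connection: output = input + transform(input)."""
    update = transform(x)
    return [xi + ui for xi, ui in zip(x, update)]


def run_stack(x: Vector, transforms: List[Callable[[Vector], Vector]]) -> Vector:
    """Pass x through a stack of residual layers, each adding its own update."""
    for transform in transforms:
        x = residual_layer(x, transform)
    return x


def half(x: Vector) -> Vector:
    # Hypothetical stand-in for a learned transformation (e.g. a weight matrix).
    return [0.5 * xi for xi in x]


if __name__ == "__main__":
    # Each layer adds 0.5 * x to x, so the stream grows by 1.5x per layer.
    print(run_stack([1.0, 2.0], [half, half, half]))  # prints [3.375, 6.75]
```

With this toy transformation the values grow at every layer, which is the ballooning of residual activations mentioned above.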
There's a whole approach to doing this, and, you know, when and how you normalize these sums and all that stuff, that we're not gonna get into.
The key thing here, though, is that this process sort of weights every layer kind of equivalently, right?
So if you think about it, you know, layer one is gonna take its own input and then add to it its own contribution, the weight matrix it uses to modify the input times that input, and that'll be the output that goes to layer two, right?
And then layer two does the same thing.
And each layer is just kind of adding its own contribution in this very uniform accumulation of information down the line.
And there's no sense in which like layer three is any more important than layer 17.
Now, the thing is, in some cases one layer really will matter more, right?
You will have cases where one layer is actually more important for this particular prompt than another, right?
It's just like, you know, maybe this is a layer that tends to worry about grammar rules or syntax, something very basic that you would tend to find in an earlier layer, and maybe you're on a token that involves pluralization or some grammar rule, right?
Where it's really relevant.
So you would almost want to be able to pay more attention to, and that's the hint, attend more to a given layer than another.
And transformers in the sort of standard residual connection sense don't do that, right?
Again, they all just, you know, take their input, pass it on plus whatever their contribution is, and pass it down the line, and there's no difference between the impact of any given layer.
There's certainly no intelligent difference.
And this is what this paper is gonna try to change, right?
They call it attention residuals, and they're gonna replace this whole fixed accumulation of information with softmax attention.
It's basically just a way of saying, you know, we're gonna have a model, essentially, that looks at our layers and has a kind of attention operation that it's gonna do to say, hey, you know, you should be attending more to this layer than this layer.
That's the kind of 30,000-foot view.
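As a rough sketch of that 30,000-foot view, here is softmax attention applied across layer outputs instead of a plain sum. The episode does not spell out how the paper builds its queries and keys, so the `query` vector and the dot-product scoring below are assumptions made purely for illustration.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def attend_over_layers(query: Vector, layer_outputs: List[Vector]) -> Vector:
    """Mix earlier layer outputs with softmax weights instead of a uniform sum.

    Assumption: each layer output is scored by a dot product with `query`.
    """
    weights = softmax([dot(query, out) for out in layer_outputs])
    dim = len(layer_outputs[0])
    return [
        sum(w * out[d] for w, out in zip(weights, layer_outputs))
        for d in range(dim)
    ]


if __name__ == "__main__":
    outputs = [[1.0, 0.0], [3.0, 0.0]]
    print(attend_over_layers([0.0, 0.0], outputs))   # equal weights: plain average
    print(attend_over_layers([10.0, 0.0], outputs))  # weight shifts to the second layer
```

A zero query gives every layer the same weight, which recovers a uniform mix; a query aligned with one layer's output pulls the result toward that layer.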
There's a bunch of details that fall out of this.
The math is actually quite simple.
I recommend checking it out.
By the way, give me feedback on this, guys, because I'm deliberately trying to be a little more concise than I was last episode to not give you like an hour-long summary of this paper.
But roughly speaking, that's it.
I could go into more of the math if that's useful.
Give me that feedback.
But one of the challenges that you get with the sort of full attention residual strategy that I've just described, where you try to attend more to one layer than another, is that it's super memory hungry at scale.
Because you have to keep all of the layer outputs alive in memory simultaneously.
So, you know, you compute the output of layer one and then the output of layer two and so on, and you're gonna do attention over all those layers, which means you need to keep those outputs in memory so that you can run your attention calculation and decide how much more or less to weight one layer than another.
And so you end up with this really big memory cost that scales with the number of layers.
And for large models, especially when you have pipeline parallelism, that's just super prohibitive.
Pipeline parallelism is this idea where you basically break your model into chunks, where layers one to three sit on this GPU, layers four to six sit on that GPU, and so on and so forth.
For various reasons, this creates a bunch of communication bottlenecks.
So they work on solving for those communication bottlenecks.
And their solution is called block attention residuals.
And basically what they do here is they break up the model into N blocks of layers, say eight blocks in total or something, and then they compress each block into a single summary vector, and then they apply attention only over the N block-level summaries.
And that drops the memory overhead to basically the number of blocks.
It scales with the number of blocks rather than the number of layers, so it gives you more control there.
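The block idea can be sketched the same way. The episode does not say how each block is compressed into its summary vector, so the element-wise sum used here is an assumption, as are the function names and the dot-product scoring.

```python
import math
from typing import List

Vector = List[float]


def block_summaries(layer_outputs: List[Vector], block_size: int) -> List[Vector]:
    """Compress each consecutive group of layers into one vector.

    Assumption: the summary is an element-wise sum over the block.
    """
    summaries = []
    for start in range(0, len(layer_outputs), block_size):
        block = layer_outputs[start:start + block_size]
        summaries.append([sum(column) for column in zip(*block)])
    return summaries


def attend(query: Vector, items: List[Vector]) -> Vector:
    """Softmax attention over a short list of vectors (dot-product scores)."""
    scores = [sum(q * v for q, v in zip(query, item)) for item in items]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    dim = len(items[0])
    return [sum((e / total) * item[d] for e, item in zip(exps, items)) for d in range(dim)]


if __name__ == "__main__":
    layers = [[float(i), 1.0] for i in range(8)]      # 8 toy layer outputs
    summaries = block_summaries(layers, block_size=4)  # only 2 vectors kept
    print(len(layers), "layer outputs ->", len(summaries), "block summaries")
    print(attend([0.0, 0.0], summaries))
```

Attention now runs over two vectors instead of eight, so the number of vectors that must stay in memory tracks the number of blocks rather than the number of layers.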
Okay, bunch more details.
This is one of those really important papers to look at if you care about how data flows through chips in a data center, for example, and, yeah, I mean, what scales and what doesn't.
This is actually a really, really important and I think interesting paper.
I'll park it there, but hopefully that whets your appetite to check it out if that's your thing.
And finally, we're looking at the Mamba 3 paper.
So we have Mamba 3, right?
We were stuck on Mamba 2 for a little while there.
So, Mamba 3: improved sequence modeling using state space principles.
State space principles.
That word principles is actually really important.
This, among other things, is an attempt to ground the Mamba approach in a more like theoretically robust foundation.
It'll be clear in a minute what I mean by that, but the way to think about the Mamba papers in general is that they are dense and they are hardware aware, which, I mean, I find fun, but it means that there's a lot of complexity, including mathematical complexity.
This one is a lot of integral calculus and finding principled ways to represent state transformations that, in a way, reflect the physics of how information should evolve in the system.
So I'm gonna get more concrete now, because that was kind of vague.
So when you think about a state space model, right?
What is a state space model?
I mean, to caricature it, it is a vector, right?
So it's a list of numbers.
And as your model scans over a sequence, say a sequence of text, you're going to evolve the values in that vector, in that list of numbers, and those values are gonna capture the meaning of what the model has scanned over.
Okay, so now there's this question of, all right, well, if that's the general gist, we need some kind of update rule, right, for that vector, for that list of numbers.
And well, if we look back at physics, if we look at how we think about describing, you know, a pendulum or an electrical circuit or water flowing through pipes, or these sorts of problems.
Like, what is the form, the kind of equation that you use to govern those dynamics, right?
Well, you'll typically have a state, right?
So in this case, h of t, think of that as the state.
And the rate of change, so the derivative in mathematical terms, but basically how that state changes over time, is gonna be equal to that state, maybe modified in some way.
All that means is the evolution of your state is a function of the state.
So where you are going to be in a minute is a function of where you are right now.
And this is pretty intuitive.
I mean, if you see, like, Wile E. Coyote or the Road Runner suspended in midair with nothing underneath him to hold him up, like, yeah, he's gonna fall, and the fact that he's falling is a function of where he was before, right?
So in this sense, you know, the rate of change of that state, or the future state of that system, is a function of the state times some multiplying factor that is a function of time, whatever, plus some additional function of the inputs to the system, the current input of the system.
So the rate of change of the hidden state in a state space model is gonna be determined by the current state of the model and its current input.
So the thing that in a sense perturbs that state, the new piece of information that you're seeing.
So my next state space vector is going to be a function of what I've read so far, plus some modified form of the next token that I'm reading, right?
This is all kind of trying to build that intuition.
Now, that would be true, or you can describe that mathematically with a derivative, with that idea of the evolution of a system smoothly over time, if you're working with time, which is a continuous variable.
But the math kind of becomes harder.
I won't say quite breaks down, but it becomes harder when we move into language.
Because language models don't receive a continuous stream of input.
You can't model them as flowing through time; instead, they receive these discrete tokens, like word one, word two, word three.
There's no in-between for tokens two and three, for example, right?
So you have to mathematically find a way to convert this continuous-time equation into a discrete recurrence equation, and that's gonna generally look similar: you're gonna have some sort of new state that has to be a function of the old state plus a function of the input, the most recent input.
And that conversion process is called discretization, right?
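Written out, the verbal description above corresponds to the following pair of equations. Only $h(t)$ is named in the episode; the symbols $A(t)$, $B(t)$, $x(t)$, $\alpha_t$ and $\beta_t$ are assumed names for the state multiplier, the input map, the input, and their discrete counterparts, and the paper's own notation may differ.

```latex
% Continuous time: the rate of change of the state depends on the
% current state and the current input.
\[
  \frac{dh(t)}{dt} = A(t)\,h(t) + B(t)\,x(t)
\]
% After discretization: the new state is a function of the old state
% plus a function of the most recent input (token t).
\[
  h_t = \alpha_t\,h_{t-1} + \beta_t\,x_t
\]
```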
So it's a very common thing.
You see it in a lot of contexts.
In quantum physics, there's a variant of this that's sometimes called quantization, but this kind of thing happens a lot, where you take a flowing function, a smooth function that's defined over the real numbers, basically, like, you know, you could have 0.0001 and so on, and you have to convert that so that you map it onto a discrete x-axis where you have token one, token two, and there's no in-between, right?
So the core question here is gonna be how do you evolve the hidden state from token one to token two, and how do you do it in a mathematically principled way?
And integral calculus enters into this.
I'm not gonna get into the weeds too much here, other than to say that, as you might imagine, if you want to discretize a smooth function, in other words just basically chunk it up into these discrete pieces, you kind of have a choice.
At token one, you could choose, roughly speaking, the leftmost limit of the smooth function, the value of the smooth function that would have been there.
You can choose that to approximate the value at token one.
You could choose the rightmost limit of your discrete bar as the value that you ascribe to token one, or you could sort of average the two together.
Historically, people have used this exponential Euler way, which was the Mamba 2 way of doing it, where they basically just pick a single endpoint, the right endpoint.
So they assume, anyway, the details don't really matter, but they assume the input value is constant across this whole interval and equal to its value at the right endpoint, and that simplifies a bunch of math and makes it possible for them to define their update rule. But there's a better way, basically, and it involves accounting for both the right and the left limit, doing a weighted combination of the two, so that you're not just saying, okay, for this interval that corresponds to token one, I'm just gonna go with whatever the rightmost fringe of the token one time boundary, mapped onto the continuous function, would be.
Instead, you balance the right and the left limits of that bar to get your value, and that's what Mamba 3 is doing fundamentally.
It's, anyway, one of the big changes.
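To see why balancing both endpoints helps, here is a toy comparison on a scalar equation, h' = a*h + x(t) with x(t) = t, whose exact solution is known. This is a simplified sketch, not the update rule from the Mamba papers: the weight `lam`, the test equation, and the function names are all assumptions chosen for illustration.

```python
import math


def simulate(lam: float, steps: int = 10, a: float = -1.0, t_end: float = 1.0) -> float:
    """Integrate h' = a*h + x(t), with x(t) = t and h(0) = 0, in `steps` steps.

    lam = 1.0 uses only the right endpoint of each interval for the input;
    lam = 0.5 weights the left and right endpoints equally (trapezoid-style).
    """
    dt = t_end / steps
    decay = math.exp(a * dt)  # exact decay of the old state over one step
    h = 0.0
    for k in range(steps):
        x_left, x_right = k * dt, (k + 1) * dt
        # The left-endpoint input also decays over the step; the right one does not.
        h = decay * h + dt * ((1.0 - lam) * decay * x_left + lam * x_right)
    return h


def exact(t: float = 1.0) -> float:
    """Closed-form solution of h' = -h + t with h(0) = 0 (valid for a = -1 only)."""
    return t - 1.0 + math.exp(-t)


if __name__ == "__main__":
    print("right endpoint only error:", abs(simulate(1.0) - exact()))
    print("weighted endpoints error: ", abs(simulate(0.5) - exact()))
```

With the same number of steps, the two-endpoint rule lands much closer to the exact answer than the right-endpoint rule in this toy setting.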
Another one is parity.
So previously, Mamba models could only represent their internal states using real numbers.
And so real numbers are one kind of number.
There's also imaginary numbers, like i, which is the square root of negative one.
They're all some multiple of the square root of negative one.
If you're not familiar with imaginary numbers, this is probably not the place to learn about it.
But there's this deep and intimate connection between imaginary numbers and the concept of rotating stuff.
And essentially, what they're gonna do is use imaginary numbers and real numbers, sometimes referred to collectively as complex numbers.
They're gonna use imaginary and real numbers together, which allows the model to represent complex values in its internal state.
And for interesting mathematical reasons, this makes it possible for the model to actively track a property called parity.
Basically, you feed the model a sequence of zeros and ones and ask it, hey, if you add up all these numbers, is the sum even or odd?
Mamba 2 would fail, because it wouldn't be able to do the rotation operation that's required to flip the parity as you count, because really that's all you're gonna do when you're trying to figure out if the sum is even or odd.
You're gonna go, okay, well, as I go, I flip every time I see a one, and a zero doesn't do anything.
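The "flip on every one" rule maps naturally onto a rotation in the complex plane. The sketch below is a toy illustration with a single complex number as the state, not the model's actual parameterization; the function name and the choice of a 180-degree rotation per one-bit are assumptions.

```python
import cmath
from typing import Iterable


def parity_by_rotation(bits: Iterable[int]) -> int:
    """Return 0 if the bits sum to an even number, 1 if odd.

    The state starts at 1 and is rotated by pi (multiplied by about -1)
    for every 1 bit; a 0 bit multiplies by exp(0) = 1 and changes nothing.
    """
    state = 1 + 0j
    for bit in bits:
        state *= cmath.exp(1j * cmath.pi * bit)
    # The state ends near +1 for even parity and near -1 for odd parity.
    return 0 if state.real > 0 else 1


if __name__ == "__main__":
    print(parity_by_rotation([1, 0, 1, 1]))  # three ones: odd
    print(parity_by_rotation([1, 1, 0, 0]))  # two ones: even
```

A state that can only be scaled by positive real factors never changes sign, which gives some intuition for why a purely real decaying state struggles to track this kind of flip.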
Anyway, I'm gonna just say the details don't matter.
You can hopefully see that this is a mathematically very interesting and elegant paper, consistent with previous iterations of Mamba, but a much more principled one.
And the results are really impressive.
It beats transformers by over two points on average on downstream accuracy across a whole bunch of benchmarks.
It beats Mamba 2 by 1.9 points on those benchmarks and reaches the same perplexity as Mamba 2 with half the state size, right?
So way, way faster.
I mean, that means it's twice as fast at inference for equivalent quality.
And it has this property of solving all these parity and modular arithmetic tasks that we just talked about, which may have been very poorly explained, but at a certain point, yeah, you just have to be happy with complex numbers and stuff.
Bottom line is this is an interesting development.
It does come with a sort of optimization.
So, you know, previously Mamba used single input, single output.
There's an optimization that Mamba 3 does called MIMO, multi-input multi-output.
This is basically an approach that helps you parallelize some of the work that the Mamba algorithm is gonna do.
So standard Mamba uses the single-input, single-output approach, where the state is updated fairly inefficiently, hardware inefficiently.
The GPU mostly sits idle during the decoding phase, whereas MIMO, this multiple input, multiple output, generalizes it.
So instead of processing only one input and producing one output at a time, each layer is gonna process a bunch of inputs and a bunch of outputs simultaneously using matrix multiplication, which is way more GPU-friendly.
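One way to picture the difference is as a state update of the form state = decay * state + B @ X, where the inner dimension r is 1 in the single-input case and larger in the multi-input case. The sketch below is a generic illustration in plain Python, not the kernel from the paper; the shapes (n, r, p), the scalar `decay`, and the function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of an (n x r) and an (r x p) matrix."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def state_update(state: Matrix, b: Matrix, x: Matrix, decay: float) -> Matrix:
    """state <- decay * state + b @ x, with b of shape (n x r) and x of shape (r x p).

    r = 1 is the single-input case (an outer product);
    r > 1 is the multi-input case (a genuine matrix multiplication).
    """
    update = matmul(b, x)
    return [
        [decay * s + u for s, u in zip(state_row, update_row)]
        for state_row, update_row in zip(state, update)
    ]


def multiply_adds_per_state_element(r: int) -> int:
    # r multiply-adds for the update term plus one for the decay term.
    return r + 1


if __name__ == "__main__":
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    print(state_update(zeros, [[1.0], [2.0]], [[3.0, 4.0]], 0.5))  # r = 1
    print(multiply_adds_per_state_element(1), multiply_adds_per_state_element(4))
```

The state being read and written has the same size in both cases, but a larger r does more arithmetic per state element touched, which is the sense in which the multi-input form keeps the hardware busier.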
And the core thing here is it increases your GPU utilization.
And you know, from a data center standpoint, that matters hugely, right?
Because basically all your GPUs are a fleet of workers, and if you're not keeping your workers busy, it is literally the same thing from an OpEx standpoint as just having a bunch of employees at your company taking a coffee break all day.
If they're not being utilized, then you're basically burning money just by having them sit there.
And so they're able to bump up, in this case, to up to four times more FLOPs during decoding, during inference, with no meaningful increase in wall-clock time.
So this doesn't actually increase latency, for example, for the user, and it also leads to better model quality.
So this is an important development from an efficiency standpoint, from a cost of running this model standpoint as well.
So, you know, we're seeing tons of hybrid models right now popping up with Mamba 2 and Transformer architectures typically merged together and a whole bunch of variants.
You know, sometimes you've got Mamba and attention heads in the same layer, sometimes alternating Mamba, attention, Mamba, attention, all kinds of variants.
You know, expect Mamba 3 to start getting slotted into that whole mix.
I mean, this is a really interesting development with some important new efficiency gains for anybody who wants to run it.
I would expect that this will start to get taken up pretty quickly, and it's worth keeping an eye on.
So there we have it.
That's the last of the two papers we want to cover.
Hopefully, I haven't bored you to tears.
It was pretty damn technical.
So I guess we'll let Andre take it away.
Thank you so much for listening to this week's episode of Last Week in AI.
You can find the articles we discussed here today, and subscribe to the newsletter, at lastweekin.ai.
We always appreciate you commenting or reviewing us on Apple Podcasts and sharing the show with your friends.
But more than anything, please do keep tuning in week to week.
Tune in.
Tune in.
When the Amazon begins.
Break it down.
Last week in AI comments, correct.
Get the low down on tech.
And it's last week in AI comments, correct.
Watch insurgent fly from the left to the streets AI's reaching to the ride.
Hit the low down on tech and let it's high.
Class week in AI coming to ride.
I'm the last with the streets.
The AI's reaching high.
From neural nets to robot, the headlines pop.
Data-driven dreams, they just don't stop.
Every breakthrough, every code unwritten, on the edge of change.
With excitement we're smitten.
From machine learning models to coding kings.
Futures unfolding, see what it brings.
