All podcast episode summaries matching MONITOR COMPUTE, aggregated across every podcast we track.
“What used to matter was that execution was very, very fucking difficult, and ideas were cheap. Now ideas are cheap and plentiful, but execution is very easy. So, really, only the good ideas are the ones that can justify the spend on super cheap implementation. ... If implementation costs continue to tank, which they are, and we don't even have Mythos yet ... what now comes to the world is a complete reordering of how economies work.”
“Mythos is potentially the biggest step up in model capabilities in two years. I think that's a really, really important detail: it's so good that they don't wanna release it, even though they already announced the price to the people they did a selective cyber release for, and it's 5 or 10x the token cost. They just don't wanna release it because they're worried about the impact on the world. And they're releasing a worse version, Opus 4.7, to us, and they explicitly said on the card, hey, we actually, purposely, made it worse at cyber.”
“This year, the spend has just skyrocketed, and it really started in late December with Opus. ... We signed an enterprise contract with Anthropic, and it's gotten to the point where, when I last talked to you, it was a $5,000,000 spend rate. It's actually a $7,000,000 spend rate now. ... We're north of 25% of spend on Claude Code as a percentage of salary. And if this trajectory continues, then, you know, we'll spend more than 100% by the end of the year, which is a bit terrifying.”
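The extrapolation in that quote is simple compound growth. A minimal sketch, assuming (as an illustration, not from the speaker) that the $5M to $7M jump represents one period of 1.4x growth, starting from 25% of salary:

```python
# Hypothetical compound-growth projection of AI spend as a share of salary.
# Assumed inputs: 25% of salary today, growing 1.4x per period (the 5M -> 7M
# jump from the quote), projected several periods ahead. Illustrative only.

def project_share(start_share: float, growth: float, periods: int) -> float:
    """Project spend-as-a-share-of-salary at a constant per-period growth rate."""
    return start_share * growth ** periods

if __name__ == "__main__":
    share = project_share(0.25, 7 / 5, 5)
    print(f"{share:.0%}")  # crosses 100% of salary under these assumptions
```

Under these assumed numbers, five more growth periods push the share well past 100%, which is the trajectory the speaker calls terrifying.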
“There are people who have argued GPUs' useful lives are less than five years. Complete nonsense. There are clusters re-signing now. Three- or four-year-old Hopper clusters re-signing for three or four more years. There are a hundred clusters that are re-signing for another couple of years. So the useful life is clearly not five years. It's maybe even seven or eight years, arguably. We don't know yet. We'll see when Hopper gets there, but it's clearly not five years. So useful life is extending, and the prices are going up on that renewal.”
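The accounting stakes of that useful-life claim are easy to quantify. A minimal sketch, using a hypothetical $100M cluster cost (not a figure from the quote), comparing straight-line depreciation under five- versus seven-year lives:

```python
# Straight-line annual depreciation of a GPU cluster under different
# useful-life assumptions. The $100M capex figure is hypothetical.

def annual_depreciation(capex: float, useful_life_years: int,
                        salvage: float = 0.0) -> float:
    """Annual straight-line depreciation expense."""
    return (capex - salvage) / useful_life_years

if __name__ == "__main__":
    capex = 100_000_000.0
    for life in (5, 7):
        print(f"{life}-year life: ${annual_depreciation(capex, life):,.0f}/year")
```

Stretching the assumed life from five to seven years cuts the annual expense by roughly 29%, which is why the useful-life debate matters for how profitable the big GPU buyers look on paper.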
“I think there will be a large-scale protest against Anthropic, and at AI. People hate AI. AI is less popular than ICE, less popular than politicians. With Anthropic adding so much revenue, that's gonna start causing business changes downstream. People are gonna get more and more scared of AI. They'll start blaming more and more of their own problems, and things that are global, deep-seated problems that have been around for a long time. Those will bubble up and be blamed on AI.”
“I don't think we have hit the limits. I think it's a bit more nuanced than that. So of course, when the leading companies all started building these large language models, you were getting enormous jumps with each generation of new system. At some point that had to slow down, so it's not continuing to be exponential. But that doesn't mean there aren't still great returns from scaling the existing systems up further.”
“The tasks that come up in the wild are more likely to be messy in some sense. They involve working with other people. They involve working in much larger code bases or more open-ended problems, maybe with something even adversarial going on. We do tend to see that the AIs are less capable of working on these more messy problems.”
“The autonomous agents, which I've been talking about, are going to consume orders of magnitude more tokens and change our lives. I'm excited to see more of what's coming, and OpenClaw was just this brief thing that woke us up to what Anthropic appears to be all in on: truly autonomous agents running 24/7, hopefully safely, hopefully not leaking all of our source code, but it's coming soon.”
“I've got a probability distribution around the timing, but I would say there's a very good chance of it being within the next five years. So that's not long at all. We used to do this extrapolation of compute and algorithmic progress, and basically we predicted it would take around 20 years from when we started out, and I think we're pretty much on track.”
“I sometimes quantify the coming of AGI as like 10 times the Industrial Revolution at 10 times the speed. We've been very consistent in how we define AGI: basically, a system that exhibits all the cognitive capabilities the human mind has. That's important because the brain is the only existence proof we have, that we know of in the universe, that general intelligence is possible.”
“In this case, we're talking about, for Opus 4.6, something like tasks that take humans 12 hours to do; we predict that it will succeed at those tasks around 50% of the time. It turns out that when you plot, using this particular difficulty measure, how performant AIs are relative to how long it takes humans to complete these tasks, we see an exponential increase in capabilities for AIs.”
“I want to start with, you guessed it, Anthropic: an unbelievable 28-day month of February, where they did $6 billion in revenue, which was more than Databricks has done in their entire lifetime. There was also the accidental leak of Claude Mythos: essentially 3,000 unpublished assets leaked. It's a 10 trillion parameter model, apparently, with next-level, step-change capabilities that they're not releasing because of how powerful it is.”
“I think clearly the central reason is that we are bottlenecked on technical talent, on incredibly capable people to come work on these questions. I was on a METR work retreat recently where we were brainstorming 20 or 30 of these, what seemed like world-important problems, problems that we think no one else is going to get to if we do not get to them.”
“We think of it as trying to build advanced science that can say, when are we getting to the point that AI systems could improve themselves or speed up the pace of AI development? When will AI research feed on itself? The core capability for that might be software engineering and machine learning research ability.”
“You're seeing the economists, the accountants have wandered into the room, and they said, we have a scarce resource here. Let's optimize it. Let's devote this compute to the people who can pay the most for it. You haven't lived till you've seen an 85% decline in an index. I think shooting Sora in the head is even more significant in terms of what it says about the strategic direction of the company.”
“I would say about 90% of the breakthroughs that underpin the modern AI industry were done either by Google Brain or Google Research or DeepMind. Those labs that have the capability to invent new algorithmic ideas are going to start having a bigger advantage over the next few years. The last set of ideas have sort of had all the juice wrung out of them.”
“None of that incremental capacity, the capacity they've decided to build in addition to the typical 20 to 30%, really gets here for a while. They can stretch a little bit, but, really, the true incremental supply doesn't come till '28, which is a very unique thing. Even if they wanted to build as fast as possible, it doesn't come till '28, late '27 at best. So the result is memory prices have gone through the roof. And guess what? They're gonna double and triple again, at least on DRAM, especially.”
“And what that ends up meaning is that you keep on having these doublings of capabilities every, let's say, four months, it seems, on recent trends, where the next model is not necessarily going to have merely an hour longer time horizon, but perhaps some multiple of the time horizon of the previous model that came out.”
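The doubling dynamic described here compounds quickly. A minimal sketch, taking the quoted four-month doubling period and the roughly 12-hour, 50%-success horizon mentioned earlier at face value (both are the speakers' claims, not verified figures):

```python
# Exponential time-horizon growth: the human task length an AI completes
# at 50% success, assumed to double every 4 months per the quoted trend.

def time_horizon_hours(baseline_hours: float, months_elapsed: float,
                       doubling_months: float = 4.0) -> float:
    """Projected 50%-success time horizon after `months_elapsed` months."""
    return baseline_hours * 2 ** (months_elapsed / doubling_months)

if __name__ == "__main__":
    # A year at this rate is three doublings: 12h -> 96h.
    print(time_horizon_hours(12, 12))  # 96.0
```

This is why "some multiple of the previous model's horizon" matters: on an exponential trend, each fixed interval multiplies the horizon rather than adding a constant amount to it.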
“One extraordinary fact from my perspective... is something like the R&D spend on compute of these companies has risen exponentially, of course, and in fact, it's risen exponentially at essentially the same rate as time horizon progress. You know, I think there's nothing necessary about that. You know, it doesn't mean by itself that if compute progress slows, then capabilities progress will also slow.”
“The faster we vibe code, the faster we ship, the more corners we cut in general on application-level security. It happens. I mean, so many folks are accidentally uploading code to insecure GitHub repos, to databases, to Supabases that are by default open. So this is accelerating the amount of our data that is just open on the Internet. And you could say, but God, this shouldn't happen at the Anthropic level. And I'm sure someone will get scolded.”
“I think compute is the big one, not just for the obvious reason of scaling up your ideas and your systems, the scaling laws as they're called, but the other thing you need a lot of compute for is doing experiments. The computer, the cloud, is our workbench, basically. So if you have a new algorithmic idea, you've got to test it at a reasonable scale, otherwise it won't hold when you put it into the main system.”
“METR is a research nonprofit based in the Bay Area... dedicated to advancing the science of measuring whether and when AI systems might pose catastrophic risks to humanity as a whole, focused specifically on threats that come from AI autonomy or AI systems themselves. We think it sets the stakes for conversations about AI misalignment.”
“On the cybersecurity leak, it was noteworthy that Anthropic, quote unquote, blamed human error. We may be at the stage where we throw the humans under the bus, not the AI anymore. Which I think, at some level, is pretty terrifying. But you know exactly what happened. You often see this when you're about to do a big announcement: you have your content management system, and you stage all the assets, be it the press release.”