
WATCH DIFFUSION

All podcast episode summaries matching WATCH DIFFUSION, aggregated across every podcast we track.

2 episodes

Quotes & Clips tagged WATCH DIFFUSION

18 on this page

Diffusion models enable controllable generation through external constraints

“Diffusion models, at least for images, are known to be much more suitable for controllable generation. The reason is that the object you're generating, say the image, is available to the model from the very beginning, so it's very easy for the model to check whether that object is consistent with some constraints or some control signal you want to use to make sure the output is what you want the model to generate. I was on some papers where we were doing medical imaging, and the idea is that when you do a CT scan, you're basically taking projections of your body cross section, and then you're trying to reconstruct what your body looks like from the measurements you get from the machine.”

— Stefano Ermon, Stanford professor and Inception Labs CEO
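The controllable-generation idea Ermon describes can be sketched as a measurement-guided sampling loop: because the full object estimate is available at every step, each iteration can be nudged toward consistency with linear measurements y = A x, a toy stand-in for CT projections. This is an illustrative assumption, not Inception Labs code, and the learned denoiser a real sampler would use is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear inverse problem, loosely analogous to CT: we observe
# projections y = A @ x_true and want to recover x_true.
d, m = 16, 8                        # signal size, number of measurements
x_true = rng.normal(size=d)
A = rng.normal(size=(m, d))
y = A @ x_true

x = rng.normal(size=d)              # start from pure noise
res0 = np.linalg.norm(A @ x - y)    # initial measurement error

step = 1.0 / np.linalg.norm(A, ord=2) ** 2   # stable gradient step size
for k in range(500, 0, -1):
    t = k / 500.0                   # annealed noise level, 1 -> 0
    # A real diffusion sampler would also apply a learned denoiser here;
    # this sketch keeps only the measurement-consistency correction plus
    # annealed noise, which is the "control signal" idea from the quote.
    x = x - step * (A.T @ (A @ x - y)) + 0.02 * t * rng.normal(size=d)

res = np.linalg.norm(A @ x - y)     # final error is far below res0
```

The point is only that the guidance term `A.T @ (A @ x - y)` can be evaluated against the whole current estimate at every step, which is exactly what autoregressive generation cannot do.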

Causal attention masks block reuse of pretrained autoregressive weights

“The real challenge is that the attention mask you use in a traditional autoregressive model is causal, so the model only knows how to use context to the left as it figures out what to do next. In a diffusion language model, you really want access to the context to the left and to the right as you decide what to change. That's one of the key properties that make these models potentially much higher quality than autoregressive models.”

— Stefano Ermon, Stanford professor and Inception Labs CEO
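The mask difference is easy to see concretely. A minimal sketch (toy sizes, not any lab's code) comparing a causal mask with the full mask a diffusion LM can use:

```python
import numpy as np

T = 5  # toy sequence length

# Autoregressive (causal) mask: position i attends only to positions j <= i.
causal = np.tril(np.ones((T, T), dtype=bool))

# Diffusion-LM mask: every position can attend to left AND right context.
bidirectional = np.ones((T, T), dtype=bool)

# What position 2 is allowed to see under each regime:
# causal[2]        -> [ True,  True,  True, False, False]  (left only)
# bidirectional[2] -> [ True,  True,  True,  True,  True]  (both sides)
```

Pretrained autoregressive weights were trained under the lower-triangular mask, which is why they cannot simply be reused when every position suddenly needs the right-hand context too.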

Mercury 2 matches frontier speed-tier quality 5-10x faster

“The latest model that we announced this week, Mercury 2, actually matches in quality some of the best speed-optimized models from the frontier labs: think the Haiku models, the Flash models, the mini models from OpenAI. It's at that quality level, but it's about five to ten times faster in terms of the time it takes to get an answer, using a diffusion model versus an autoregressive model.”

— Stefano Ermon, Stanford professor and Inception Labs CEO

Diffusion LLMs scale better than autoregressive models at inference

“If you need to scale up these models and they're actually getting into production, the price per token, what's needed per token, becomes the key metric you care about. What we're seeing with diffusion language models is that they scale better than autoregressive models at inference time. They're cheaper to serve, they're faster, and you get more tokens per GPU, which means the price is actually lower.”

— Stefano Ermon, Stanford professor and Inception Labs CEO

Federal AI moratorium for ten years is reckless given the timelines

“The thing that was being voted on is: we're going to ban all state regulation of AI for ten years, with no apparent plan to do any federal regulation of AI, which would take Congress to pass, which is a very high bar. Given the serious dangers that I lay out in ‘The Adolescence of Technology' around things like biological weapons and bioterrorism, autonomy risk, and the timelines we've been talking about, ten years is an eternity.”

— Dario Amodei, CEO of Anthropic

Claude Code emerged organically from internal Anthropic developer use

“It actually happened in a pretty simple way: we had our coding models, which were good at coding, and around the beginning of 2025 I said, I think the time has come where you can get nontrivial acceleration of your own research. Then this thing, which I think was originally called Claude CLI before the name got changed internally to Claude Code, was what everyone was using, and it was seeing fast internal adoption. I looked at it and said, we should probably launch this externally.”

— Dario Amodei, CEO of Anthropic

Diffusion is real, but AI will spread faster than previous technologies

“I think diffusion is very real, and it doesn't exclusively have to do with limitations of the AI models. There are people who use ‘diffusion' as a buzzword to say this isn't a big deal; I'm not talking about that. I think AI will diffuse much faster than previous technologies have, but not infinitely fast.”

— Dario Amodei, CEO of Anthropic

Authoritarianism becomes morally obsolete in the age of AGI

“What I actually believe could be the case is that dictatorships become morally obsolete. They become morally unworkable forms of government, and the crisis that creates is sufficient to force us to find another way. I just wonder if it will motivate new ways of thinking about how, with the new technology, to preserve and protect freedom.”

— Dario Amodei, CEO of Anthropic

Discrete tokens break the geometry diffusion relies on

“But if you think about text and you take two words, it's not clear what's in between the meanings of two different words. There is no real geometry to the space of possible tokens or possible words. That makes the idea of denoising much more challenging, because it's not clear what it means to perturb text, to add noise to it.”

— Stefano Ermon, Stanford professor and Inception Labs CEO
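The missing geometry is easy to demonstrate: averaging two images yields another valid image, while "averaging" two token IDs yields an arbitrary, unrelated token. An illustrative sketch with a made-up vocabulary:

```python
import numpy as np

# Continuous data: the midpoint of two images is itself a valid image,
# so perturbing pixels with noise keeps you inside the data space.
img_a = np.zeros((2, 2))           # an all-black toy "image"
img_b = np.ones((2, 2))            # an all-white toy "image"
midpoint = 0.5 * (img_a + img_b)   # a uniform gray image: still valid

# Discrete data: token IDs are arbitrary labels with no geometry.
vocab = ["cat", "dog", "house", "run"]
id_cat = vocab.index("cat")        # 0
id_house = vocab.index("house")    # 2
blend = (id_cat + id_house) // 2   # 1 -> lands on "dog", an unrelated
token = vocab[blend]               # token, not anything "between"
                                   # cat and house in meaning
```

This is why text diffusion needs a different corruption process than adding Gaussian noise to pixel values.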

Anthropic's revenue has grown 10x annually, hitting $9-10B in 2025

“What we've seen from the beginning, at least if you look within Anthropic, is this bizarre 10x-per-year growth in revenue. In 2023 it was zero to $100 million; in 2024, $100 million to $1 billion; in 2025, $1 billion to something like $9 or $10 billion.”

— Dario Amodei, CEO of Anthropic

Buying too much compute can bankrupt you if revenue forecasts miss by a year

“So I could buy a trillion dollars of compute, actually $5 trillion, because it would be a trillion dollars a year for five years, starting in 2027. And if my revenue is not a trillion dollars, if it's even $800 billion, there's no force on earth, no hedge on earth, that could stop me from going bankrupt if I buy that much compute.”

— Dario Amodei, CEO of Anthropic
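The arithmetic behind the quote, as a quick sanity check using only the numbers Amodei gives:

```python
# $1 trillion of compute per year, committed for five years.
commitment = 1_000_000_000_000 * 5      # $5 trillion total

# Revenue comes in at $800 billion per year instead of $1 trillion.
revenue = 800_000_000_000 * 5           # $4 trillion actually earned

shortfall = commitment - revenue        # a $1 trillion hole over the term
```

Even a 20% miss on the revenue forecast leaves an unhedgeable trillion-dollar gap, which is the point of the quote.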

Big labs face high switching costs to adopt diffusion

“My sense is that there is a big switching cost. They're very, very focused on Gemini, on their main model. That's the issue with these big labs: they're all in on one direction, and it's hard for them to really focus on an alternative direction. As a startup, we're in a much better position to do that, because we're laser-focused on one thing, and we can really deliver and build everything that's needed to get that technology to succeed.”

— Stefano Ermon, Stanford professor and Inception Labs CEO

Voice agents and fast agentic loops are killer use cases

“We're already seeing a lot of usage. You nailed the two main ones: voice, a lot of voice and customer support, and the educational agents. People love the speed of diffusion language models. They always have this issue where they would want to use a thinking model, a reasoning model, but the latency is just not good enough. Maybe they could use specialized AI inference chips, but that's too expensive and cannot scale to large volumes. So we have a bunch of customers building voice agents on top of diffusion language models.”

— Stefano Ermon, Stanford professor and Inception Labs CEO

The most consequential AGI decisions will be made in two-minute hallway conversations

“One of my worries, although it's also an insight into what's happening, is that some very critical decision will be one where someone just comes into my office and says, ‘Dario, you have two minutes: should we do thing A or thing B?' Someone hands me some random half-page memo, and I'm like, I don't know, I have to eat lunch, let's do B. And that ends up being the most consequential thing ever.”

— Dario Amodei, CEO of Anthropic

The country of geniuses in a data center is one to three years away

“On the ten years, I'm at, like, 90%, which is about as certain as you can be. I think it's crazy to say this won't happen by 2035. I have a strong view, 95 to 99%, that all of this will happen within ten years; I think that's just a super safe bet. And then I have a hunch, more like a fifty-fifty thing, that it's going to be more like one to two, maybe one to three years.”

— Dario Amodei, CEO of Anthropic

Anthropic's culture is held together by Dario's biweekly Vision Quest talks

“One, I write this thing called the DVQ, the Dario Vision Quest. I wasn't the one who named it that; that's the name it received. I tried to fight it, because it made it sound like I was going off and smoking peyote or something, but the name just stuck. So every two weeks I get up in front of the company with a three- or four-page document, and I just talk through three or four topics about what's going on internally.”

— Dario Amodei, CEO of Anthropic

Masking tokens replaces noise in diffusion text models

“One approach that works pretty well is basically one where you mask out tokens, so you hide them. You take a sentence, remove some of the tokens, hide them from the neural network, and then ask the network: can you predict what those tokens were? It's similar in some sense to next-token prediction, except that things are done out of order, and the network needs to use context from the left and the right, combining it in interesting ways, to figure out how to predict all the missing tokens in the sentence.”

— Stefano Ermon, Stanford professor and Inception Labs CEO
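The masking process Ermon describes can be sketched in a few lines. This is a generic masked-denoising illustration, not Inception's actual training code; the token IDs and mask rate are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

MASK = -1                                  # sentinel ID for a hidden token
tokens = np.array([11, 42, 7, 99, 3, 58])  # a toy "sentence" of token IDs

# Forward (noising) process: hide a random subset of positions.
hidden = rng.random(tokens.shape) < 0.5
corrupted = np.where(hidden, MASK, tokens)

# Training signal: given `corrupted` (so left AND right context around
# every hole), the model must predict the original IDs at hidden spots.
targets = tokens[hidden]

# Unlike next-token prediction, the holes can fall anywhere in the
# sequence, and all of them are predicted from bidirectional context.
```

Masking plays the role that Gaussian noise plays for images: a corruption whose reversal the model can learn, without needing any geometry on the token space.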

Existing serving engines cannot run diffusion language models

“I think one of the reasons there are still no other providers able to serve diffusion language models in production today is that you cannot run a diffusion language model on existing serving engines. Think of vLLM, SGLang, TensorRT: these frameworks, some of them open source, are really, really good at serving autoregressive LLMs very efficiently. The space for diffusion language models is much, much less developed, so we had to build our own serving engine.”

— Stefano Ermon, Stanford professor and Inception Labs CEO
