The application layer will survive through proprietary workflows

No Priors: Artificial Intelligence | Technology | Startups

May 1

“I think the application layer will exist for a number of reasons. One is because, you know, I think this idea, that what it what is valuable to a company, is, you know, the the user signal that they can gather, that only they can gather. And to the extent that that is encoded, in a model, I think a lot of their business will, be at risk. But to the to the extent that it is encoded in workflows, that is where they will be able to develop notes. So a good I think a good example of that is, say, a company like a bridge where the clinicians edits of the notes and what they do with those notes after the fact and the, the thing that happens in, inside the earmark three steps down, and that becomes a workflow that only”
— Tuhin Srivastava - CEO of Baseten

From “Baseten CEO Tuhin Srivastava on the AI Inference Crunch, Custom Models, and Building the Inference Cloud” →

No Priors: Artificial Intelligence | Technology | Startups

May 1

The inference supply crunch leaves zero slack compute

“I think, you know, there there's so much narrative around the supply crunch. And no matter like, as much as we hear about it, I don't think people realize how bad it really is. Like, there is, you know, there is very, very little Slack compute available. Like, you know, we we run pretty large clusters ourselves, and we run them in, like, uncomfortably high utilization. You know, we when I'm saying we're, like, mid nineties utilization Mhmm. Most of the time.”
— Tuhin Srivastava - CEO of Baseten

From “Baseten CEO Tuhin Srivastava on the AI Inference Crunch, Custom Models, and Building the Inference Cloud” →

No Priors: Artificial Intelligence | Technology | Startups

May 1

AWS execs culturally normalize frequent pager alerts

“I think, like, one, I think if you've worked at an infrastructure company like, we were once in a meeting with a bunch of AWS execs, and this was, you know, like, very senior AWS folks. All their pages went off multiple times, during our forty five minute meeting. You know? Like, it's a I I I think, like, it's it's it's very much like it's a cultural thing. But, yeah, like, I I don't you know, our, like, inference can't go down and, like, you know, we you know, you you you learn to like, you know, what's this? Like, I think Amir, my cofounder, when his pager goes off, his seven year old said, is that a p zero? Oh, is that is that is that a p zero? And so, you know, I I think that is you just have to get used to it, and that's the culture you live in.”
— Tuhin Srivastava - CEO of Baseten

From “Baseten CEO Tuhin Srivastava on the AI Inference Crunch, Custom Models, and Building the Inference Cloud” →

No Priors: Artificial Intelligence | Technology | Startups

May 1

The US must inevitably develop proprietary open source models

“I do think that to, to some extent is I I think there is importance to The US that we develop our own models. I think that that would be a massive loss if that there are five companies, you know, five different labs in China that are creating open source models, and we're struggling to get one set up. So it's necessary. I also think it's inevitable. And, you know, like, the deep sea the deep sea moment a year ago, I remember someone saying to me and I thought it was, like, very well said, which is, like, and the world's changing a lot, but they said, hey. You know, we should kinda just forget Mhmm. That this is a Chinese model. We should just act like this came from Mhmm. From meta and and build and build with that in mind.”
— Tuhin Srivastava - CEO of Baseten

From “Baseten CEO Tuhin Srivastava on the AI Inference Crunch, Custom Models, and Building the Inference Cloud” →

No Priors: Artificial Intelligence | Technology | Startups

May 1

GPUs as a service are commoditized and lack stickiness

“I think, like, GPUs as a service is not sticky. I think that's been seen. Like, customers generally just see that as as commodity. Imprint with the software they included is incredibly sticky. You know, like, just just like, you know, none of our top 30 customers have ever churned. You know, we're talking, like, 400% annual NDR Mhmm. Around our business. And so it's like very it's, it's very, very sticky. So I think that software layer is very important.”
— Tuhin Srivastava - CEO of Baseten

From “Baseten CEO Tuhin Srivastava on the AI Inference Crunch, Custom Models, and Building the Inference Cloud” →

No Priors: Artificial Intelligence | Technology | Startups

May 1

Founders micromanaging whole problems actually have hiring issues

“I think the two or three things that I'll say is, like, you want people where you can give them whole problems. Yeah. And so, like, you know, if if you are if you feel like you are micromanaging, if you feel like you need if you feel like, you know, you you have to be involved in everything, I think that's a bit of a cop out as a founder because you're just like, I just need to be involved in everything. It's like, no. You probably don't have the right people. I think the second thing is, be very, very clear what you're optimizing for.”
— Tuhin Srivastava - CEO of Baseten

From “Baseten CEO Tuhin Srivastava on the AI Inference Crunch, Custom Models, and Building the Inference Cloud” →

Hard Fork

May 1

ChatGPT for health struggles because medical records are extremely messy

“Not yet, but I think it could be at some point. I mean, so ChatGPT for health pulls in your data from the medical record and lets you chat with your medical records. Now reason number one for concern is privacy. That's obviously gonna have your entire medical history going to a AI company. It's also going to not be redacted by you in a way to remove identifiable things. Reason number two, I I think if we're talking about health record data, it's really messy. They include tabular data. They include copy forwarded data, pay that's been copied and pasted. And they also, if you've ever read your health records, they include things that are wrong. There's a lot of errors or misdocumented things in your health data. And it turns out that just copying a bunch of information, like, they're LOMs aren't magical. You can't just copy your entire medical record in and think that you're gonna get good performance. And I would never bet against the technology. I think that we will get to the point that we have ways to build representations of of humans and understand their health. But right now, there's, like, no advantage to just dumping everything in an LLM, which is what Chachipi do for health theoretically would allow you to do in a way that would allow you to better understand your health.”
— Adam Rodman - internal medicine physician

From “OpenAI’s Big Reset + A.I. in the Doctor’s Office + Talkie, a pre-1930s LLM” →

Hard Fork

May 1

OpenAI's Stargate project represents reality intruding on trillion dollar AI ambitions

“I think this was a case where, like, reality has just finally intruded on the Stargate project. Like, when all of these deals were getting announced initially, this is how they sounded. Well, we're gonna spend 1 batrillion dollars that we don't have to build 40 data centers. And at the time, people said that kind of seems like a lot. Can you guys actually live up to that? And they said, yeah. Just watch us. Well, guess what? They could in another changing course.”
— Casey - co-host of Hard Fork

From “OpenAI’s Big Reset + A.I. in the Doctor’s Office + Talkie, a pre-1930s LLM” →

Hard Fork

May 1

Elon Musk's lawsuit reveals OpenAI was fueled by grudges

“I think the lawsuit and this ongoing litigation between Elon Musk and OpenAI has been very distracting for OpenAI. But, like, as a journalist and as a person who wants to know more about the inner workings of how these companies run, I think it's been actually very valuable for a lot of these emails and early communications between OpenAI leaders to be released as part of this litigation. I have found it very useful in understanding some of the early dynamics, at OpenAI. And it also just illustrates the degree to which these projects are all just sort of fueled by grudges.”
— Kevin Roose - tech columnist at NYT

From “OpenAI’s Big Reset + A.I. in the Doctor’s Office + Talkie, a pre-1930s LLM” →

Hard Fork

May 1

Half of US doctors are using Open Evidence for decision support

“And then the the second sort of, I'd say, normal doctor, use case is for decision support. So there's this one company called Open Evidence that has created a free tool that has gone from, again, zero to crazy numbers of adoption. I will tell you younger doctors like my residents use it all the time. I don't know the actual numbers, but it's probably close to half of US doctors are using this right now.”
— Adam Rodman - internal medicine physician

From “OpenAI’s Big Reset + A.I. in the Doctor’s Office + Talkie, a pre-1930s LLM” →

Hard Fork

May 1

The deskilling of the medical workforce is a massive short-term worry

“Yes. So that is the biggest worry that I actually have about sort of the short to medium term is deskilling of the workforce. We have some evidence. There was a sort of scary study last year from Poland on a trial where they gave doctors not a language model, but a polyp detecting technology. And they they looked at their ability to detect polyps, so potentially cancerous lesions in the colon before using it and then after using it for three months. And when not using it, their ability to detect polyps dropped by six percentage points. So these are skilled doctors using, technology, and they lose six absolute percentage points of their ability to detect potentially cancer in three months. And then imagine that you're learning to do it for the first time.”
— Adam Rodman - internal medicine physician

From “OpenAI’s Big Reset + A.I. in the Doctor’s Office + Talkie, a pre-1930s LLM” →

Hard Fork

May 1

Machine forecasting needs a massive track record to build human trust

“We don't think people should take our word for it. And we also don't think that people should trust machine forecast unless they have a track record going back, like, decades and decades. So the idea here is that if we could build a model who really only knew about the world up to a certain date, we could ask it to forecast, like, five or ten years ahead of time. Like, ask it, what's the New York Times headline gonna be five years from now? Or is there gonna be another great war or something? And, we can iterate and see, like, what kinds of things are predictable. What does it take? Like, how far out can things be foreseen? And then, hopefully, eventually, we'll have machines that have, like, a hundred year track record of forecasting. And then we can ask them, you know, in 2026, what do you think is gonna happen, like, you know, two or four, eight years from now? And we'll have an idea of how much to trust those forecasts.”
— David Duvenaud - co-creator of Talkie

From “OpenAI’s Big Reset + A.I. in the Doctor’s Office + Talkie, a pre-1930s LLM” →

Hard Fork

May 1

Anachronistic classifiers help evaluate forecasting leakage

“So we have a classifier that tries to look for things that are anachronistic. And especially if you wanna use this for forecasting or to evaluate forecasting, it's really important that we really nail this issue. So we have all sorts of ideas for, like, canaries and things that we think, the model should just never assign any likelihood to. Like, think of, I don't know, Nagasaki and Hiroshima. Like, before World War two, like, those those two towns would just never show up in the same sentence ever almost, except for, like, some weird weird coincidence. So you can just tell whether there's been leakage about important events if the model just thinks that there's any chance that you'll see those particular names together.”
— David Duvenaud - co-creator of Talkie

From “OpenAI’s Big Reset + A.I. in the Doctor’s Office + Talkie, a pre-1930s LLM” →

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Apr 30

Inference research-to-production timeline is often just hours

“I mean, it might be the fastest timeline in the world. If you think about medicine, for example, it can take decades for research to reach a pharmacy. If you think about, you know, physics or engineering, it can take years to apply a new concept, material science. Even within AI, what moves faster. Training ones, for example, if you want to train a model off of a new technique, it can still take weeks or months to fine tune the hyperparameters and and find the exact right way to sort of express that technique. But with influence, the timeline is often hours. A new model architecture comes out, you have to figure out how to support it day zero. We had PoloQuant come out that research paper and an engineer on our model performance team had it implemented thirty one hours later, as a as a CUDA kernel.”
— Philip Kiely - head of AI education at Baseten

From “How to Engineer AI Inference Systems with Philip Kiely - #766” →

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Apr 30

Inference engineering is like mixed martial arts—many disciplines

“Yeah. So inference is a really fun and difficult topic, and I'm gonna go into metaphor territory for a second here, which is that I'm a I'm a martial artist. I have been my entire life. Are are you familiar with, like, UFC and and MMA and all those kind of things? So, you know, they are you can't just be an expert in one thing and expect to become a champion. You can't just be a great wrestler. You can't just be a great boxer. The idea is that you have a lot of different skills, each one of which can take a lifetime to master. And somehow you have to be excellent at all of them in order to be a well rounded mixed martial artist.”
— Philip Kiely - head of AI education at Baseten

From “How to Engineer AI Inference Systems with Philip Kiely - #766” →

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Apr 30

Knowing inference 'knobs' lets you build priority queues and quantize confidently

“You can trade off between, for example, latency and throughput in a given influence engine by adjusting things as simple as batch size or things like, you know, adding or removing a speculation algorithm. When you when you do that, when you create sort of a a spectrum of outcomes, an efficient frontier of high performance influence, then you start to understand, like, wait, I can choose to change the way that I consume these systems. So for example, you might start having priority queues in your product where paid user traffic gets prioritized over for user traffic. And maybe that's something that you couldn't have built previously, or maybe you add a concept of, you know, I have the ability to quantize models and because I have my own very sophisticated and product specific evals, I can do that with complete confidence that I'm not degrading the quality of the service my users are experiencing.”
— Philip Kiely - head of AI education at Baseten

From “How to Engineer AI Inference Systems with Philip Kiely - #766” →

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Apr 30

Hopper GPUs gained value because Chinese labs optimize for them

“actually, like, Hopper GPUs in particular still are very, very popular for influence. One big reason for that actually is because so much open source work comes out of Chinese labs who, due to export controls, generally work on hopper GPUs and not Blackwell GPUs. So you generally get like either, you know, FPA to inflow kernels, you get things that are built for hopper's asynchronous programming paradigm instead of the slightly different paradigm of Blackwell kernels, you get things that are models that are built for the size and restriction of say like an eight x h 200 node instead of say a GB 300 mvl 72 system.”
— Philip Kiely - head of AI education at Baseten

From “How to Engineer AI Inference Systems with Philip Kiely - #766” →

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Apr 30

You can't vibe code uptime for mission-critical inference

“But one of the things we like to say at Baystone is, like, you can't vibe code uptime. And ultimately, like, there is still going to need to be for these mission critical systems that have, you know, hundreds of millions or billions of dollars of of economic value, like, relying on them. There's going to need to be human owners who can be accountable for the results of this system.”
— Philip Kiely - head of AI education at Baseten

From “How to Engineer AI Inference Systems with Philip Kiely - #766” →

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Apr 30

Specialized runtimes turn 500ms tasks into 1ms tasks

“No one is truly doing named entity recognition, which is sort of extracting keywords from sentences. No one is truly using Frontier LMS to do that, or at least I sure hope they're not. But even if you're using, you know, like a flash model type of thing for that versus specialized model, maybe a highly optimized small LMS can do that task in five hundred milliseconds. We just released an named activity recognition run time that does it in one millisecond. One, not not five hundred. And if you have an agent that is doing this a 100 times per user request, all of a sudden, this has gone from something where you have to look at the spinner to something that happens instantly.”
— Philip Kiely - head of AI education at Baseten

From “How to Engineer AI Inference Systems with Philip Kiely - #766” →

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Apr 30

Inference runtime market concentrated around vLLM, SGLang, TensorRT-LLM

“Within influence, there's really been a concentration around, like, three major open source run times, the VLLM, SG Lang, and TensorRT LOM. And I think part of that is just the the complexity of standing up a new run time from Xero is is very, very high. And so most people find it more useful to sort of contribute to an existing one. There's a lot of really good open source work around inference optimization outside of that, like, good open source kernel libraries, open source quantization tools, kv cache reuse tools, new speculation stuff.”
— Philip Kiely - head of AI education at Baseten

From “How to Engineer AI Inference Systems with Philip Kiely - #766” →

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Apr 30

Owning your intelligence is the differentiator in 2026

“I think that this year, we're going to see a real increase in ownership of intelligence. You've seen, like, you saw with Shopify, then moving to a QAN model and and saving millions, tens of millions on on their workloads. You see companies like COSO coming out with very sophisticated models like Composer that are allowing them to, you know, really create a a novel experience for for the users who are depending on their platform. So I think that's the the trend that I'm most excited about is companies understanding that if they want to build a really differentiated product, they need to be differentiated at every level, and that's starting to include the model level.”
— Philip Kiely - head of AI education at Baseten

From “How to Engineer AI Inference Systems with Philip Kiely - #766” →

How I AI

Mar 11

Linear design-to-code workflows have collapsed into fluid loops

“It's definitely changed, our workflows, in a way that it's kinda really blown up what a workflow even is. Before, you know, for the the majority of our careers, we've had a very, like, linear agreed upon workflow where the where you increase fidelity as you go on. Right? Because it's really expensive to work in code, and it's really cheap just to trade ideas and sketch them out. But I basically collapsed that, and it's just as cheap to riff in code as it is to riff in design.”
— Gui Seiz - designer at Figma

From “From Figma to Claude Code and back | Gui Seiz & Alex Kern (Figma)” →

How I AI

Mar 11

Pull production code into Figma via MCP to fix drift

“Something that happens a lot is, the sources of truth diverge between design and code. So sometimes some things only really exist in a state in code, or you start working with a developer and you really, like, elevate what the thing was, the artifact that you originally supplied. Or sometimes, like, that just doesn't exist in code. You've inherited someone else's project from forever forever ago.”
— Gui Seiz - designer at Figma

From “From Figma to Claude Code and back | Gui Seiz & Alex Kern (Figma)” →

How I AI

Mar 11

Direct manipulation beats prompting for precision design edits

“I don't think we're there in general with, these kind of code tools in terms of the precision editing that you wanna do. And trust me, I I use the whole kind of landscape of tools to really see kind of where these workflows are going, and I think still the the gold standard for me is just being able to drag stuff around. And you can do a lot with a click that would take you a 100 words to kind of write and to, like, really precisely nail. Like, no one wants to prompt for the exact hex code or the the shade of yellow.”
— Gui Seiz - designer at Figma

From “From Figma to Claude Code and back | Gui Seiz & Alex Kern (Figma)” →

How I AI

Mar 11

Export every code state into Figma so designers see reality

“Oftentimes, the code base gets way ahead of where the actual design file is, and there's states or workflows that just don't exist at all within the design file. So what I can do is say, send all five states. Sign up flow to click on. Now the agent's going to do is read my code base, understand what I'm referring to when I say those five states. And for each one of those, it's going to individually import that one by one into Figma such that the Figma document will then have all of those states laid out all side by side so that my design partner can work against against it.”
— Alex Kern - engineer at Figma

From “From Figma to Claude Code and back | Gui Seiz & Alex Kern (Figma)” →

How I AI

Mar 11

AI shifts design work upstream to planning, downstream to craft

“What's really interesting about, like, our our role with all of this is kind of really moved upstream. And we're in this really I find, almost decadent moment in time where before we had to be so conditioned on really sharp product decision making skill that would have happened, like, almost immediately. So now we're kind of actually at this point where more of the priorities can make it above the cut line. And, also, we can spend a lot more time in the planning stage. And so we do that, and then on the other side, we spend a lot more time in the craft because we can, because we can reach higher for ideas.”
— Gui Seiz - designer at Figma

From “From Figma to Claude Code and back | Gui Seiz & Alex Kern (Figma)” →

How I AI

Mar 11

Run five Claude Code instances in parallel from your terminal

“I spend quite a lot of my time just sitting here inside of my terminal now. I do so much less. I think of the, like, what I used to have to do where I had to have a browser window open at the same time as having my code window open. So, I often have two, three, up to five maybe cloud code, instances running all at the same time, working on different aspects of the work that I'm I'm tracking.”
— Alex Kern - engineer at Figma

From “From Figma to Claude Code and back | Gui Seiz & Alex Kern (Figma)” →

How I AI

Mar 11

Turn engineering wikis into installable skills for your team

“Every engineering org that I've ever run has an internal Wiki that has that page, which is this is what you do before you push a PR and you get in everybody's way in the deploy pipeline. And every engineering team should go through their onboarding Wiki and pull every page out, every this is what you should do into a skill, and then give give access to that to their entire team. And so I think we're really shifted from this idea of, like, an SOP into a skill or a doc into a skill.”
— Claire Vo - host of How I AI

From “From Figma to Claude Code and back | Gui Seiz & Alex Kern (Figma)” →

How I AI

Mar 11

A slash-ship skill automates preflight checks and CI babysitting

“I have this slash ship skill that I wrote. I use it all the time, in my workflow. Often in order to get something into a large repo like the Figma repo, there's a lot of work that's involved in just making sure tests pass, making sure, all of the, like, preflight things are in order. And also then once it's pushed to, the repository, checking on CI and making sure that it correctly built and is all green so that I can actually merge it. Previously, I would have to kind of babysit these, you know, kind of processes, like, all the way at every single step.”
— Alex Kern - engineer at Figma

From “From Figma to Claude Code and back | Gui Seiz & Alex Kern (Figma)” →

How I AI

Mar 11

Cursing at the AI or saying your boss is mad actually works

“That can work, but I find that the more successful one is, is it's either cursing a little bit, in the prompt. I I, you know, am somewhat ashamed to admit that it that's extremely effective. But the more, common one I I use now is that my boss is mad at me, and it seems to work pretty well. It it kind of sympathizes with you, and it's it's it's kinda cute.”
— Alex Kern - engineer at Figma

From “From Figma to Claude Code and back | Gui Seiz & Alex Kern (Figma)” →

How I AI

Mar 11

Use AI to dig up lost lore buried in old codebases

“We have an internal service, for example, that I didn't know the origin of its name. And so I asked Claude to go and figure out based on the commit history, what the origin was, and it came back with a really good story, about, like, how that came to be. It got renamed multiple times that, you know, contributor has since left the company many years ago, so this was kind of lost in the ether. But now, now, all of this lore, honestly, from the company is actually, like, embedded inside of the code base, and, I can I can find it now?”
— Alex Kern - engineer at Figma

From “From Figma to Claude Code and back | Gui Seiz & Alex Kern (Figma)” →

How I AI

Mar 23

Warp shines as a CLI agent, not just a coding tool

“I started using Warp, ironically because our own one of our own teams here at Microsoft tuned me into it. They it was our PowerShell team, and they were like, you should try this Warp thing. It automates PowerShell really well. And so I tried it. And as soon as I started using it for certain things like, managing Azure and, you know, giving Azure subscriptions and stuff like that, then I was hooked. I was like, man alive. This is a really capable tool.”
— Marco Casalaina - VP of Core AI at Microsoft

From “How Microsoft's AI VP automates everything with Warp | Marco Casalaina” →

How I AI

Mar 23

Connect MCP servers and rules to make agents reliable

“Now I will tell you though that it there's a trick to making this stuff work. I mean, warp is pretty magical, to be honest, but you can add to the magic and make it work more smoothly. And there's a couple of ways you could do that. I do connect this to the Microsoft Docs MCP server when I'm doing, like, Azure administration. Because sometimes, you know, in this case, I knew exactly what roles I wanted to give Govan, but there are times when I have no idea what role somebody needs to do something.”
— Marco Casalaina - VP of Core AI at Microsoft

From “How Microsoft's AI VP automates everything with Warp | Marco Casalaina” →

How I AI

Mar 23

Treat AI scripts as ephemeral — rebuild instead of saving

“And what I would say is also, what I love about AI and what I would recommend to people with AI is, like, just get used to ephemeral stuff. Like, just toss it. Like, if you ever need to compress a video again, don't save this script. Don't, like, just just come back and do it again probably with a better model at some point, and it's gonna be just as cheap and just as easy. And so I think a lot of people get stuck in their head about, like, oh, how do I make this a product? Or how do I get this production? It's like, don't get it to production.”
— Claire Vo - host of How I AI

From “How Microsoft's AI VP automates everything with Warp | Marco Casalaina” →

How I AI

Mar 23

Build ad hoc agents on the fly for one-off tasks

“Now if you think about what I'm really doing with Warp, the way that I'm using Warp, I characterize it in a certain way. I call this an ad hoc agent. Because effectively, each one of these things that I'm doing, you know, when I'm assigning the Azure roles or when I'm I'm scanning the stuff or when I'm doing stuff with the videos, I'm kinda creating a little mini agent, an unnamed agent on the fly to do something for me. And that's becoming a trend.”
— Marco Casalaina - VP of Core AI at Microsoft

From “How Microsoft's AI VP automates everything with Warp | Marco Casalaina” →

How I AI

Mar 23

Trigger Microsoft 365 Copilot agents to auto-schedule meetings

“Well, I mean, here I am in m three sixty five Copilot. So this is Microsoft's general purpose agent for business. So I'm gonna kick this thing off. And what I said here is, when I get an email from Clervo requesting a meeting at a certain time, check my calendar. If that time is free, send her a thirty minute meeting invite for that time, and it will start to build this agent. And what it has built is an agent. It's a triggered agent. It's an email triggered agent. So if you send me an email and you're requesting a meeting, you're gonna get an invite from me if I'm free.”
— Marco Casalaina - VP of Core AI at Microsoft

From “How Microsoft's AI VP automates everything with Warp | Marco Casalaina” →

How I AI

Mar 23

Set ChatGPT to run scheduled cron-like tasks daily

“So I can actually set ChatChippity now to do this. I can say every day, look to see if there's a new podcast by Clervaux and notify me if there's a new one. And lo and behold, it absolutely does do it. It will daily at 9AM. I didn't actually even say what time to do it, but it decided on 9AM. Every day at 9AM, it's gonna check for new podcast episodes by you. And if I want, I can actually turn on desktop notifications so it will notify me on my desktop, like, boom, new Clervo podcast.”
— Marco Casalaina - VP of Core AI at Microsoft

From “How Microsoft's AI VP automates everything with Warp | Marco Casalaina” →

How I AI

Mar 23

Delegating drudgery to AI frees time for higher-value work

“This saves me many minutes a day. I mean, just think about last night, I was scanning, as I said, my daughter's, homework, my daughter's practice test, and I set WARP to running that. You know, I said, okay, WARP. You know, go scan that for me. And while it did that, she and I worked on one of the math problems themselves. So rather than me fumbling with the scanning software, the crappy thing that says now feed this and its letter size and all that stuff, I just told Warp to do it.”
— Marco Casalaina - VP of Core AI at Microsoft

From “How Microsoft's AI VP automates everything with Warp | Marco Casalaina” →

How I AI

Mar 23

Use AutoHotkey shortcuts to standardize repeated prompts

“Now sometimes, for example, I get these kinds of questions that I need to fill out and I want to limit them in terms of their characters. So I also will preprogram certain types of prompts. And so here, let's say, MBA five. So I have all these shortcuts like this. And I can say, answer from the perspective of Microsoft in 500 characters or less with no bullets or formatting if I just wanna give a quick answer to some question. That is, by the way, auto hotkey that I have running there.”
— Marco Casalaina - VP of Core AI at Microsoft

From “How Microsoft's AI VP automates everything with Warp | Marco Casalaina” →

How I AI

Mar 16

Name your Claude agents Bob and Ray to force review handoffs

“One is called the builder app. It's called Bob, Bob the builder. And he's got instructions to stop constantly, and you have to run everything by Ray, who's the review agent. Ray's job is senior software engineer who is obsessed with security, your reviews code of milestones, guard security architecture. And then the third agent is me. I am the person who breaks the tie that often happens between Bob wanting to do something and racing, you can't do it.”
— Daniel Roth - editor in chief at LinkedIn

From “From journalist to iOS developer: How LinkedIn’s editor builds with Claude Code | Daniel Roth” →

How I AI

Mar 16

Save everything as Markdown files to preserve long-term context

“And I save everything as dot MD file as markdown files. So I've got within my Community project, there are just there's a document folder, and then there's a list of markdown files. And I just every time I'm working with Commutely every time we're working with Claude, I'm saying, write it into a file. Log everything. Log everything. And I do that for two reasons. One is the context window. Claude, it's constantly forgetting what it's working on. And then I'm forgetting what I'm working on because I only do this on weekends.”
— Daniel Roth - editor in chief at LinkedIn

From “From journalist to iOS developer: How LinkedIn’s editor builds with Claude Code | Daniel Roth” →

How I AI

Mar 16

Always build in branches, never ship straight to main

“Everything that Bob does has to be done in a in a branch. That's one lesson I've learned. I used to ship everything to Maine, and I learned the pain of that early on. But what I I didn't realize this I found this out from real engineers later is it's not when I merge the branch my other app I merge the branch in the main, and it didn't work. There were for some reason, it didn't, like, merge perfectly. It took me weeks to be able to figure out why it wasn't merging.”
— Daniel Roth - editor in chief at LinkedIn

From “From journalist to iOS developer: How LinkedIn’s editor builds with Claude Code | Daniel Roth” →

How I AI

Mar 16

Be a picky customer, not a mediocre PM

“For a while, I thought I'm like a mediocre PM. And then I was like, no. Maybe I'm more like an architect. And now I realized, like, an architect actually knows real details. A PM is, like, super rigid and keeps the entire app in their head, and they're able to really prioritize well. I'm a bad prioritizer. All I am is a really picky customer. So I think that is, like, the role of the vibe coder is, what do I care about deeply? I'm, like, walking through this house, and I'm telling the architect, no. I want this room blue.”
— Daniel Roth - editor in chief at LinkedIn

From “From journalist to iOS developer: How LinkedIn’s editor builds with Claude Code | Daniel Roth” →

How I AI

Mar 16

Managing AI is like managing a smart but hungover intern

“It's like being a manager of I've read this somewhere else. This is not my thing. But someone once said that managing AI is almost like managing a really smart but hungover intern. And I feel that way all the time. It's like, you're a genius, but you don't have it this morning. Just remember we've gone over this already.”
— Daniel Roth - editor in chief at LinkedIn

From “From journalist to iOS developer: How LinkedIn’s editor builds with Claude Code | Daniel Roth” →

How I AI

Mar 16

End each workday by asking AI what you dropped the ball on

“So my command that I start the day with or I end the day with usually is what did I drop the ball on? It's gonna go through Outlook. It's gonna go through Teams. It's gonna go through any updated files. It knows who I report to. It knows who reports to me. And it keeps track of things that I'm constantly clicking on during the day. So it'll find it'll go through all my emails and find places where I'm not responding or teams where I'm not responding. And what's great is it's not just places where I've been at mentioned. It's stuff that it knows that I am actually interested in.”
— Daniel Roth - editor in chief at LinkedIn

From “From journalist to iOS developer: How LinkedIn’s editor builds with Claude Code | Daniel Roth” →

How I AI

Mar 16

The App Store is the last real friction for solo builders

“I will say the hardest part of building my two apps has been the App Store. I had no idea. Navigating the App Store is a whole separate I've got I have so many chats where I'm like, what does Apple want from me here? And just getting Claude to teach me how to use the App Store has been a real I feel like that's almost the last friction left in being a builder is navigating the App Store.”
— Daniel Roth - editor in chief at LinkedIn

From “From journalist to iOS developer: How LinkedIn’s editor builds with Claude Code | Daniel Roth” →

How I AI

Mar 16

Assume best intentions when prompting AI, just remind it

“There's something that we say that that I learned from a former manager and that we say a lot at LinkedIn, which is assume best intentions. And that is how that's, by the way, a big change from how newsrooms work, which is my last life, which was you assume worst intentions all the time. At at tech companies, you assume best intentions. I've now taken that to my family, and I've taken it also to building with AI. So I assume the AI has best intentions, but has to be reminded about how we work.”
— Daniel Roth - editor in chief at LinkedIn

From “From journalist to iOS developer: How LinkedIn’s editor builds with Claude Code | Daniel Roth” →

How I AI

Mar 9

Mood boards act as visual language for AI image tools

“One of the things I like about the mood mood board is it's a visual language to explain to Midjourney what you're trying to do. The picture is worth a thousand words. Like, literally a picture to an LLM is worth a thousand words.”
— Claire Vo - host of How I AI

From “Mastering Midjourney: How to create consistent, beautiful brand imagery without complex prompts | Jamey Gannon” →

How I AI

Mar 9

Style references beat mood boards for consistency in Midjourney

“There's not a ton of, like, documentation from MidJourney on exactly how it works. But as creatives, you can tell like the more kind of consistent a mood board is, let's say it's like five images of like fuzzy three d cats, you're more likely to get an image of like a fuzzy, whatever you prompt. When we're doing more generalized vibe stuff like this, Midjourney can tend to average things out with the mood board. And I find that using SREFs as the mood board instead essentially can give much better results.”
— Jamey Gannon - AI creative director

From “Mastering Midjourney: How to create consistent, beautiful brand imagery without complex prompts | Jamey Gannon” →

How I AI

Mar 9

Crop or remove the dominant element pulling generations off-style

“On the we're getting too much green, there was just such a compelling element in that green eye shadow photo, which was, like, half of the eye was very green. It was clearly the most obvious thing about the photo. And with this image, the most obvious thing about the photo is she's blowing a a bubble of gum. And so what I like is in both of those, you're like, just boot the thing that is so obvious and so overwhelming. And you can either do that by kicking out the image, or you could do that by cropping out, the part of the image that is pulling the rest of the generations down.”
— Claire Vo - host of How I AI

From “Mastering Midjourney: How to create consistent, beautiful brand imagery without complex prompts | Jamey Gannon” →

How I AI

Mar 9

One word like 'luxury' replaces a paragraph of prompting

“Another thing to to know what the prompt is, like, saying luxury New York City apartment, instead of saying on the 4th Floor of a high rise in a new post war New York City apartment, you just say luxury. Because, like, everybody knows, and the alum knows, or not alum, the AI knows, luxury is gonna be like a big metal wings like super high up, and where two people live on the top. So that one word, luxury, just gives us everything that we need to know about the scene, which is very fun.”
— Jamey Gannon - AI creative director

From “Mastering Midjourney: How to create consistent, beautiful brand imagery without complex prompts | Jamey Gannon” →

How I AI

Mar 9

Camera model names are cheat codes for photo styling

“I have a list, this is included in my course of, like, all these different cameras. So I have, like, DSLR. I have mirrorless. I have digital. I have film as kind of, like, quick shortcuts, because that's really hard to remember all of them and the aperture and stuff. I barely know what aperture means. So sometimes when I just want to try changing up the vibe or make things more realistic, make them have, like, a more nineties aesthetic, I'll just paste in, like, a camera, that mimics that.”
— Jamey Gannon - AI creative director

From “Mastering Midjourney: How to create consistent, beautiful brand imagery without complex prompts | Jamey Gannon” →

How I AI

Mar 9

Use Nano Banana like Photoshop you can talk to

“Nano Banana, literally is just Photoshop. That's exactly how you should think of it. You're just able to speak to Photoshop, essentially, for most people. So what I have here is this image that I really like, that we generated it, but I wanna upscale it. So I get more texture in her shirt and stuff. And then I want this to be like a real computer.”
— Jamey Gannon - AI creative director

From “Mastering Midjourney: How to create consistent, beautiful brand imagery without complex prompts | Jamey Gannon” →

How I AI

Mar 9

Deliver brand imagery as a reusable system, not one-off photos

“In the past, brand and creative, directors or agencies would, like, give you these photos and be like, cool. Call us and re up when you want more photos. K. And what I love is that you're like, look. I put in all this you're gonna you're gonna value me for all this upfront work that I'm gonna do to define the space, give you these codes, like, really give you reference images. And then now you can go do this for yourself.”
— Claire Vo - host of How I AI

From “Mastering Midjourney: How to create consistent, beautiful brand imagery without complex prompts | Jamey Gannon” →

How I AI

Mar 9

When AI fights you, walk away and rebuild the mood board

“Firstly, take a break. Always. Like, you're never gonna be able to, like, see properly if you're kind of, like, in it. So, like, sleeping on it or just, like, walking away. And then sometimes they come back and I just, like, doo doo, new s ref, different prompt, like, grab this reference, use this camera, and then it immediately works.”
— Jamey Gannon - AI creative director

From “Mastering Midjourney: How to create consistent, beautiful brand imagery without complex prompts | Jamey Gannon” →

How I AI

Mar 2

Engineering leaders must code with AI, not just decree it

“Show the engineers, not just tell. And the worst thing any engine leader could do is just be like, I decree you must use AI. Come on. No one's gonna listen to you.”
— Chintan Turakhia - Senior Director of Engineering, Coinbase

From “How Coinbase scaled AI to 1,000+ engineers | Chintan Turakhia” →

How I AI

Mar 2

The PR speed run got 100 engineers to ship 70 PRs in 15 minutes

“I had landed I was going to the East Coast. I landed, for my flight, got into an Uber, hopped on, like, an entire team, all hands, like, speed run. We call it it was, like, basically, cursor speed run. And I was in the Uber using cursor, putting up the PR. And the goal of the speed run was every single person would just pick up the most trivial thing. It could be like copy change, a bug, whatever, and just put up the PR. And we ended up, I think, in fifteen minutes I think a 100 people had joined. In fifteen minutes, we ended up putting up, like, 70 PRs. And we broke GitHub too, which was cool because we learned, like, our infrastructure needed improvement.”
— Chintan Turakhia - Senior Director of Engineering, Coinbase

From “How Coinbase scaled AI to 1,000+ engineers | Chintan Turakhia” →

How I AI

Mar 2

Coinbase cut PR review cycle time from 150 hours to 15

“Then the review time, like, all my devs complain. Review times take too long. We found some solutions, actually. I think we were doing average of, like, a hundred and fifty hours, like, was the cycle time for a PR review because there was so much. We reduced it by 10 x down to, like, fifteen hours or so, roughly.”
— Chintan Turakhia - Senior Director of Engineering, Coinbase

From “How Coinbase scaled AI to 1,000+ engineers | Chintan Turakhia” →

How I AI

Mar 2

Fix user feedback live during the customer call itself

“I was on a call with with, like, a a user of our product. Right? And they're like, hey. It'd be cool if you, like, changed x, y, and z. And what literally, while I was on the call, I just put up a PR and pushed it. And they're like, before the call ended, it was thirty minutes. I was like, just, you know, reload the app. It's fixed.”
— Chintan Turakhia - Senior Director of Engineering, Coinbase

From “How Coinbase scaled AI to 1,000+ engineers | Chintan Turakhia” →

How I AI

Mar 2

Use Cursor on Cursor data to find AI power user cohorts

“Cursor has, like, great analytics. Right? And so you go to the admin panel, you look at the analytics, and, you know, awesomely, they let you download into CSV. I was like, what if I just use cursor to figure out what my team is doing in terms of using cursor, but not in just, like, from a vanity metric point of view of, like, lines of code committed by AI. I think that's, like, kind of misleading, actually digging more into, how they're using Cursor and how do we sort of, like, replicate power users.”
— Chintan Turakhia - Senior Director of Engineering, Coinbase

From “How Coinbase scaled AI to 1,000+ engineers | Chintan Turakhia” →

How I AI

Mar 2

Slack is how AI tools go viral inside your company

“The trick here, why Slack, is because Slack is how things go viral within your company. If you have pulled out the magic into some separate tool that others can't see, it doesn't happen. And so by getting things into Slack, people just like, holy shit. This is possible. Let's go.”
— Chintan Turakhia - Senior Director of Engineering, Coinbase

From “How Coinbase scaled AI to 1,000+ engineers | Chintan Turakhia” →

How I AI

Mar 2

Hire a Super Builder whose job is creating more Super Builders

“I invented this role called super builder. And the single job single most important job of a Super Builder is to create more Super Builders. So we we hired our first Super Builder and they, we we talked about some ideas and one of the biggest things because most of our company uses Slack. We're all in Slack. And Slack, you know, I'm strong it's like strong believer is just a bunch of humans pretending to be systems. Right? And the cost of writing something in Slack is zero, but the cost of answering something in Slack is enormous, and most of it is noise.”
— Chintan Turakhia - Senior Director of Engineering, Coinbase

From “How Coinbase scaled AI to 1,000+ engineers | Chintan Turakhia” →

How I AI

Mar 2

Reverse-engineer your wine taste with ChatGPT and a notebook

“I love food and wine. I really do. And, like, I've done, like, sommelier training, etcetera, etcetera. With my friend in New York, we went to some, like, champagne tasting. And so, like, I just took notes. There's, like, this whole notebook. Effectively, then I just, like, popped this right in. And I said, here are a bunch of champagnes that I tasted. Figure out from my notes, like, what are my taste preferences?”
— Chintan Turakhia - Senior Director of Engineering, Coinbase

From “How Coinbase scaled AI to 1,000+ engineers | Chintan Turakhia” →

How I AI

Mar 2

Threaten Claude with Gemini when it stops listening

“The nuclear option is I threaten it, and I say, Claude, if I'm using, like, Claude Opus for 4.5 high, like, okay. I'm gonna stop using you, Claude. I'm gonna switch to Gemini, and then it gets its shit together.”
— Chintan Turakhia - Senior Director of Engineering, Coinbase

From “How Coinbase scaled AI to 1,000+ engineers | Chintan Turakhia” →

"The Cognitive Revolution"

Mar 9

Enterprises are radically more excited about AI than they ever were about cloud

“If I compare that to today with AI, recently I was in New York, met a couple dozen CIOs and customers and the reaction was, if I snapped the line at like two or three years in the cloud versus two or three years in the AI, couldn't be more different in terms of the environment. Enterprises are looking for almost as many use cases as possible that they can deploy AI in, probably in many cases more than is actually practical. You have a sense of creativity and excitement and innovation that didn't necessarily exist in the cloud. With the cloud, it was kind of like very pragmatic.”
— Aaron Levie - founder and CEO of Box

From “The Evolution of the Revolution” →

"The Cognitive Revolution"

Mar 9

IT departments must become the HR departments of AI

“Jensen at NVIDIA kind of put it the best, which is effectively the IT department becomes the HR department of AI. And that just opens up so many new questions about what the future of IT looks like, all of which are much more exciting, I think, than the past. But we are in for quite a bit of change in this space.”
— Aaron Levie - founder and CEO of Box

From “The Evolution of the Revolution” →

"The Cognitive Revolution"

Mar 9

Box accidentally built the perfect RAG architecture before ChatGPT existed

“We got lucky. We were building a product about a year before Chach BT launched, which was, it's now called Hubs. And what it was, was the ability to organize content on a topic by topic basis. So you could share that content or search that content on a per topic basis. We pinch ourselves because if we hadn't been building that, I do get very scared because it was about a year, year and a half of just deep architecture work. Like there was no way to build it any faster.”
— Aaron Levie - founder and CEO of Box

From “The Evolution of the Revolution” →

"The Cognitive Revolution"

Mar 9

Enterprise AI must clear 99.999% reliability, not 98%

“you can't have an enterprise workflow in a particularly a regulated industry that works 98% of the time. You would not find it acceptable if 98% of your flights that you scheduled were successful and 2% of the time you show up at the airport and you don't actually have a ticket, right? And that's enterprises need 99.999999% reliability on almost anything that's really important.”
— Aaron Levie - founder and CEO of Box

From “The Evolution of the Revolution” →

"The Cognitive Revolution"

Mar 9

Incremental gains don't sell — AI must be 10x better to win deals

“you can't go to a company and say, I can do exactly what you're doing today and you're going to save 40 percent. You know, like, in like, and this is like an economist would say, oh my God, everybody would do that deal all day long. But like once that meets real life, that person has 17 other projects. There's an incredible amount of people and attention and priorities that are all competing for their time. So if you could wave a magic wand and make something 40 percent cheaper, like you'd totally do it. But like of all of the things that I have to do, like that just might be like number nine on my list.”
— Aaron Levie - founder and CEO of Box

From “The Evolution of the Revolution” →

"The Cognitive Revolution"

Mar 9

Thin layers on Salesforce or OpenAI are doomed startup ideas

“if you're a brand new startup, you go after things that are not easy for the incumbent to go after. So if all you were doing was building a sort of a thin layer on top of OpenAI, bad idea. If you're building a thin layer on top of Salesforce with AI, bad idea. So, you know, Salesforce is very competent. They will build the CRM AI thing. Workday will build the HR AI thing.”
— Aaron Levie - founder and CEO of Box

From “The Evolution of the Revolution” →

"The Cognitive Revolution"

Mar 9

Klarna's homegrown AI replacement story is overhyped and not replicable

“So what do you make of like a Karna? I think right now it's the exception. I think I've sort of, as I've seen the reports, I'm more in the camp of maybe it's overplayed a little bit, but nothing about it is impossible. I mean, my understanding was they were going to build their own workday system with AI, and that's just like not a priority. Most companies are just not focused on building their own HR system to save a couple of $100,000.”
— Aaron Levie - founder and CEO of Box

From “The Evolution of the Revolution” →

"The Cognitive Revolution"

Mar 9

The home screen is the proxy — AI apps are flooding personal life

“I have more new apps on my home screen in the past, let's say, six months than probably any other time in the past decade, decade and a half. My home screen was like, okay, you added Uber, and then you added Spotify, and you added, you know, maybe one social app, and then WhatsApp, and like only every one or two years did something get to the home page. But recently, I have at least five new apps that I've added into the mix, which to me is a little bit of a proxy for just how much, how much infusion of AI has already occurred in our personal lives.”
— Aaron Levie - founder and CEO of Box

From “The Evolution of the Revolution” →

"The Cognitive Revolution"

Mar 20

Edit text in meaning-space, not word-space

“The kind of interface that I'm eventually building towards is a tool that lets you edit text or work through ideas, not in the native space of words and characters and tokens, but in the space of actual meaning or features, where features can be anything from, is this a question, is this a statement, is this uncertain or certain, to topical things like, is this about computers versus plans, or to probably other kinds of features that we don't really even have words for.”
— Linus Lee - AI product leader at Notion

From “Interfacing with AI, with Linus Lee of Notion AI” →

"The Cognitive Revolution"

Mar 20

Spectrograms inspire latent-space text editing interfaces

“The closest analogy that I have is spectrograms when people are dealing with audio. Normally, sound is like a wave in space. It's just a single kind of, I imagine, like a single string vibrating back and forth over time. If you work with audio, that's like the base thing that you work with. But if you work professionally with audio, then you actually most of the time work in a different representation space, where you don't look at vibrations over time, but you look at space of like frequencies over time, or what's called a spectrogram.”
— Linus Lee - AI product leader at Notion

From “Interfacing with AI, with Linus Lee of Notion AI” →

"The Cognitive Revolution"

Mar 20

Build your own tools to bottleneck-bust research

“The quality of the tools and how much you can iterate on the tools, I think bottlenecks how much you can iterate on the thing that you're working on with the tools. And so it pays to be able to quickly tweak the tool or add the functionality that you need to see something new, whether that's a tool that's for evaluating models or running models or visualizing things either in the outputs or in the training like behavior. And because of that, I think I've mostly defaulted to building my own little tools whenever I needed them.”
— Linus Lee - AI product leader at Notion

From “Interfacing with AI, with Linus Lee of Notion AI” →

"The Cognitive Revolution"

Mar 20

Copy-paste freely in research code without guilt

“One of the things that I've learned in doing more research things over building product is that in research land, I just do not feel guilty about copy-pasting code because you have no idea how the thing is going to change. And it may be that copy-pasting is just going to like save you from not having to overgeneralize anything.”
— Linus Lee - AI product leader at Notion

From “Interfacing with AI, with Linus Lee of Notion AI” →

"The Cognitive Revolution"

Mar 20

Models are lazy and only learn when forced

“Models are very lazy about what it has to learn. And it only learns the thing that you want it to learn when it's run out of options. It's exhausted all the other options that it has to try to minimize its loss. And the only remaining option is to finally learn the thing they want it to learn. In language data broadly, I think it's so difficult to get to that point. Even if you think about the math proofs that occur naturally in the internet, for example, there are a bunch of proofs on the internet that are just incorrect.”
— Linus Lee - AI product leader at Notion

From “Interfacing with AI, with Linus Lee of Notion AI” →

"The Cognitive Revolution"

Mar 20

Notion needs cheaper, faster, instruction-following models first

“The main ones that are always top of mind are, we want models that hallucinate less, we want models that are cheaper and faster, lower latency, and we want models that follow instructions better. There's a fourth one, which is a big one, but a very hard one, which is we want models that are better at general reasoning.”
— Linus Lee - AI product leader at Notion

From “Interfacing with AI, with Linus Lee of Notion AI” →

"The Cognitive Revolution"

Mar 20

Million-token context can't replace observable retrieval pipelines

“There's a lot of benefits of retrieving limited context rather than just putting everything in a model window. Some of them include observability. So if you give the model 10,000 inputs and it gives you the right answer, and it gives you the wrong answer, how do you debug that? Where if you have a pipeline that gives you maybe that top 10 documents and has a language model answer that, if you've got it wrong, you could ask useful questions like, did the answer exist in the documents that it saw? Was it at the beginning or the end of the context?”
— Linus Lee - AI product leader at Notion

From “Interfacing with AI, with Linus Lee of Notion AI” →

"The Cognitive Revolution"

Mar 20

Schedule weekly meetings to stare at failure cases

“Eventually what we've settled on for a lot of our features is instead, we have like the engineers have scheduled time on our calendar every week, where we go into our meeting room and we just stare at a Notion database of all the bad cases, like individual outputs that were bad, that were reported by our users, and we ask ourselves for each input, what is the exact step in the pipeline where this failed? What category does this belong in? We kind of treat it like a software bug.”
— Linus Lee - AI product leader at Notion

From “Interfacing with AI, with Linus Lee of Notion AI” →

"The Cognitive Revolution"

Mar 20

Package AI to amplify agency, not replace it

“I'm generally a pretty optimistic person about technology, as long as the way we package these things is more humanist, rather than just automate all of the things. You see companies situated at different points in the spectrum between, you want models to automate things in a way that takes away agency, i.e. replacement, or you want models that amplify. I think OpenAI is very much on the replacement side. Literally, their definition of, I think, AGI is something like a thing that can take over a single full human's job, where if you look at a company like Runway, a lot of their framing of usefulness is about extending that agency of what you want to express.”
— Linus Lee - AI product leader at Notion

From “Interfacing with AI, with Linus Lee of Notion AI” →

"The Cognitive Revolution"

Mar 20

Every AI model from now on is the worst it'll ever be

“Everything monotonically improves from here, right? I think that's the scary part. Omneky has this good video on Sora where he occurs this phrase of like, this is the worst that this technology is going to be from here on out. And I think that's a really succinct way of expressing the fact that like, okay, you may maybe you think GPT-4 is like not super, super, super smart. But like this is like, if you look back at the history of smartphones, every phone when it came out is the worst that smartphones are ever going to be from that point on out.”
— Linus Lee - AI product leader at Notion

From “Interfacing with AI, with Linus Lee of Notion AI” →

Quotes & Clips

The application layer will survive through proprietary workflows

The inference supply crunch leaves zero slack compute

AWS execs culturally normalize frequent pager alerts

The US must inevitably develop proprietary open source models

GPUs as a service are commoditized and lack stickiness

Founders micromanaging whole problems actually have hiring issues

ChatGPT for health struggles because medical records are extremely messy

OpenAI's Stargate project represents reality intruding on trillion dollar AI ambitions

Elon Musk's lawsuit reveals OpenAI was fueled by grudges

Half of US doctors are using Open Evidence for decision support

The deskilling of the medical workforce is a massive short-term worry

Machine forecasting needs a massive track record to build human trust

Anachronistic classifiers help evaluate forecasting leakage

Inference research-to-production timeline is often just hours

Inference engineering is like mixed martial arts—many disciplines

Knowing inference 'knobs' lets you build priority queues and quantize confidently

Hopper GPUs gained value because Chinese labs optimize for them

You can't vibe code uptime for mission-critical inference

Specialized runtimes turn 500ms tasks into 1ms tasks

Inference runtime market concentrated around vLLM, SGLang, TensorRT-LLM

Owning your intelligence is the differentiator in 2026

Linear design-to-code workflows have collapsed into fluid loops

Pull production code into Figma via MCP to fix drift

Direct manipulation beats prompting for precision design edits

Export every code state into Figma so designers see reality

AI shifts design work upstream to planning, downstream to craft

Run five Claude Code instances in parallel from your terminal

Turn engineering wikis into installable skills for your team

A slash-ship skill automates preflight checks and CI babysitting

Cursing at the AI or saying your boss is mad actually works

Use AI to dig up lost lore buried in old codebases

Warp shines as a CLI agent, not just a coding tool

Connect MCP servers and rules to make agents reliable

Treat AI scripts as ephemeral — rebuild instead of saving

Build ad hoc agents on the fly for one-off tasks

Trigger Microsoft 365 Copilot agents to auto-schedule meetings

Set ChatGPT to run scheduled cron-like tasks daily

Delegating drudgery to AI frees time for higher-value work

Use AutoHotkey shortcuts to standardize repeated prompts

Name your Claude agents Bob and Ray to force review handoffs

Save everything as Markdown files to preserve long-term context

Always build in branches, never ship straight to main

Be a picky customer, not a mediocre PM

Managing AI is like managing a smart but hungover intern

End each workday by asking AI what you dropped the ball on

The App Store is the last real friction for solo builders

Assume best intentions when prompting AI, just remind it

Mood boards act as visual language for AI image tools

Style references beat mood boards for consistency in Midjourney

Crop or remove the dominant element pulling generations off-style

One word like 'luxury' replaces a paragraph of prompting

Camera model names are cheat codes for photo styling

Use Nano Banana like Photoshop you can talk to

Deliver brand imagery as a reusable system, not one-off photos

When AI fights you, walk away and rebuild the mood board

Engineering leaders must code with AI, not just decree it

The PR speed run got 100 engineers to ship 70 PRs in 15 minutes

Coinbase cut PR review cycle time from 150 hours to 15

Fix user feedback live during the customer call itself

Use Cursor on Cursor data to find AI power user cohorts

Slack is how AI tools go viral inside your company

Hire a Super Builder whose job is creating more Super Builders

Reverse-engineer your wine taste with ChatGPT and a notebook

Threaten Claude with Gemini when it stops listening

Enterprises are radically more excited about AI than they ever were about cloud

IT departments must become the HR departments of AI

Box accidentally built the perfect RAG architecture before ChatGPT existed

Enterprise AI must clear 99.999% reliability, not 98%

Incremental gains don't sell — AI must be 10x better to win deals

Thin layers on Salesforce or OpenAI are doomed startup ideas

Klarna's homegrown AI replacement story is overhyped and not replicable

The home screen is the proxy — AI apps are flooding personal life

Edit text in meaning-space, not word-space

Spectrograms inspire latent-space text editing interfaces

Build your own tools to bottleneck-bust research

Copy-paste freely in research code without guilt

Models are lazy and only learn when forced

Notion needs cheaper, faster, instruction-following models first

Million-token context can't replace observable retrieval pipelines