ChatGPT started as a general chatbot bet over narrow tools
"Well, so the goal was we need to come up with some productionization of GPT-4. And there's questions about, like, how do we turn this incredibly powerful model into products? And we're all spitballing ideas like a writing bot, a coding bot, you know, very natural at the time. One of our less interesting ideas was a meeting bot. So it would just sit in a Google Meet, take notes, and send out, like, to-dos after. But John Schulman was very opinionated. He's like, we should keep it very general. Let's do a chatbot."
"GPUs are so extraordinarily expensive. And what's interesting is the compute cost relative to physical infrastructure is actually surprising: so much money is spent on the compute that the physical infrastructure cost is sometimes actually lower. But, you know, it has very large lead times, and there's intrinsic difficulty in having these well-calibrated, well-functioning physical systems. From a capital perspective, though, it's primarily a compute cost."
Software engineering self-improvement is happening now, other domains lag
"So these systems have become so incredibly impressive in this domain as a result of huge amounts of data and really cheap verifiable environments. Like, you know, you can check unit tests going from failing to passing with just a few CPUs. It's basically instantaneous. There's no domain expertise gap between an AI researcher and a software engineer. And, obviously, this will become and is becoming a larger contributor to the next generation of the system."
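The "cheap verifiable environment" idea above can be sketched as a tiny binary verifier: run a repository's test suite and emit 1.0 only if everything passes. This is a minimal illustration, not any particular lab's pipeline; the throwaway repo layout and function name are assumptions.

```python
import pathlib
import subprocess
import sys
import tempfile

def unit_test_reward(repo_dir, test_cmd=None):
    """Binary verifier: 1.0 if the repo's test suite passes, else 0.0."""
    cmd = test_cmd or [sys.executable, "-m", "unittest", "discover", "-q"]
    result = subprocess.run(cmd, cwd=repo_dir, capture_output=True, timeout=300)
    return 1.0 if result.returncode == 0 else 0.0

# Demo on a throwaway "repo" containing one passing test.
repo = tempfile.mkdtemp()
pathlib.Path(repo, "test_demo.py").write_text(
    "import unittest\n"
    "class T(unittest.TestCase):\n"
    "    def test_add(self):\n"
    "        self.assertEqual(1 + 1, 2)\n"
)
reward = unit_test_reward(repo)
```

Because the signal is just an exit code, it needs no human grading and only a few CPU-seconds per rollout, which is what makes software engineering such a cheap domain to verify.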
Literature data spans orders of magnitude and needs experimental grounding
"One of the engineers on our team was looking at a reported material property, and it was just sort of extracted values from literature. And it was really interesting to see the reported values spanned many orders of magnitude. So you train an ML system on that, and it's like, well, the best you can do is model this distribution, but you're no closer to, like, a ground truth. And that's where experimental data comes in, where you now have a grounding in this."
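To make "many orders of magnitude" concrete, here is a small sketch that measures the spread of literature-reported values in decades (powers of ten). The numbers are invented for illustration, not real extracted data.

```python
import math

# Hypothetical reported values for one material property, pulled from different
# papers (assumed to be in the same units); the numbers are illustrative only.
reported = [3.2e-6, 8.1e-5, 4.7e-4, 2.0e-3, 9.5e-2]

# How many orders of magnitude the literature spans for this single property.
span_decades = math.log10(max(reported) / min(reported))
```

A model trained on values like these can at best reproduce the spread; it takes a measured value from a controlled experiment to anchor the distribution to ground truth.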
Physicists dominate AI because they are principled, hard-nosed thinkers
"I think it's a great way to think about the world. It's, like, very principled, very, like, hard-nosed scientists, very careful. And I don't know. I think it's just such an incredible field. You have such high leverage in computer science, in AI. And so I think a lot of physicists were seeing that, particularly in, like, high-energy physics. After the discovery of the Higgs, I think a lot of high-energy physicists were sort of looking for what's next."
Language models act as orchestration layers over specialized neural nets
"Yeah. So, language models are incredibly powerful. It's a very natural interface, and so we continue to use these. But we think about them almost as an orchestration layer. So that's sort of a copilot assistant, but also a system that can direct experiments. And it's orchestrating other specialized models as well. So we do construct neural nets that are specially designed for atomic systems, where there's, like, some symmetry awareness, and those have much lower latency and have been, like, fine-tuned for that."
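The orchestration-layer pattern described above can be sketched as a dispatcher that routes requests to registered specialist models. Everything here is a stand-in: the registry, the task names, and the "energy" specialist are hypothetical, and a real system would use the language model itself to choose the task and the specialist.

```python
from typing import Callable

# Registry of specialist models, keyed by task name (hypothetical layout).
SPECIALISTS: dict[str, Callable[[dict], dict]] = {}

def register(task: str):
    def deco(fn):
        SPECIALISTS[task] = fn
        return fn
    return deco

@register("energy")
def symmetry_aware_energy(payload: dict) -> dict:
    # Stand-in for a low-latency, symmetry-aware atomic model;
    # here it just returns a fake per-atom energy.
    return {"energy_eV": -1.0 * len(payload["atoms"])}

def orchestrate(request: dict) -> dict:
    """Route a request to the matching specialist (an LLM would pick the task)."""
    return SPECIALISTS[request["task"]](request)

result = orchestrate({"task": "energy", "atoms": ["Si", "O", "O"]})
```

The design point is the separation of concerns: the general model handles language and planning, while latency-sensitive, physics-aware computation is delegated to purpose-built networks.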
Intelligence is spiky, not a single scalar capability
"I think one fallacy is thinking about intelligence as a scalar. We've consistently seen these systems have a very odd spikiness. And it's actually possible to architect a system that is world-class on some math domain, but then you could do some perturbations to the questions and actually degrade it substantially. So it's like a bad high school student. There's this, like, odd spikiness to these systems."
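The perturbation failure mode can be illustrated with a deliberately brittle toy solver that only recognizes exact phrasings. This is a caricature, not a claim about how any real model fails; the questions and the solver are made up.

```python
# Memorized question -> answer pairs (illustrative only).
QUESTIONS = {"What is 2 + 2?": "4", "What is 3 * 5?": "15"}

def brittle_solver(q: str):
    # A "world-class" solver on the exact benchmark phrasing, nothing else.
    return QUESTIONS.get(q)

def perturb(q: str) -> str:
    # Surface-level rewording that leaves the math unchanged.
    return q.replace("What is", "Compute")

def accuracy(solver, items) -> float:
    items = list(items)
    return sum(solver(q) == a for q, a in items) / len(items)

orig_acc = accuracy(brittle_solver, QUESTIONS.items())
pert_acc = accuracy(brittle_solver, [(perturb(q), a) for q, a in QUESTIONS.items()])
```

Scoring only on the original phrasings would report perfect accuracy, while the perturbed set exposes the spikiness: same underlying problems, collapsed performance.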
Robotics will massively accelerate but isn't required for Periodic
"One of the reasons I ask is, I used to run this company, Color, and we built our own liquid-handling robotic systems. We'd buy liquid-handling robots, but then we had to adjust them dramatically. We had, like, cameras that would use ML to monitor the system and sort of make adjustments. We had to 3D-print parts to decrease vibrations on the platform because we were dealing with such small volumes of liquid. And so there's an enormous amount of customization versus just buying something off the shelf. And the firmware for it was awful, and writing against that was painful."