Integrate simulation with real physical experiments
“Really important, it's not just like a pool of data. It's this interactive, closed-loop system that is so powerful. Once you have the experimental data, you can look through it. You can look for aberrations. You can look for patterns. You can look for consistency with simulation data, with literature. And then that helps drive the next set of experiments. So it's not just a pool of data. It's this very active loop.”
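A minimal sketch of what that active loop might look like in code. The helper functions `run_experiment`, `simulate`, and `propose_candidates` are hypothetical stand-ins for illustration, not anything Periodic has described:

```python
def closed_loop(run_experiment, simulate, propose_candidates, n_rounds=5):
    """Hypothetical closed experimental loop: measure, compare against
    simulation, and let the disagreement drive the next round."""
    history = []
    candidates = propose_candidates(history)
    for _ in range(n_rounds):
        measured = {c: run_experiment(c) for c in candidates}   # real data
        predicted = {c: simulate(c) for c in candidates}        # simulation
        # Look for aberrations: where experiment and simulation disagree most
        # (assumes scalar-valued measurements for simplicity).
        disagreement = {c: abs(measured[c] - predicted[c]) for c in candidates}
        history.append({"measured": measured, "predicted": predicted,
                        "disagreement": disagreement})
        # The data is not a static pool; it selects the next experiments.
        candidates = propose_candidates(history)
    return history
```

The point of the sketch is the last line of the loop: the accumulated history feeds back into candidate selection, rather than sitting in a warehouse.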
“I think over the next few years, past that, we saw ever-improving models. We saw reasoning. I think test-time inference became really important. That led to more reliable error correction, more reliable tool use, and we see the rise of coding agents and other agents. And I think those were foundational technologies necessary to then connect these systems to the physical world. It was just not possible with the AI technology of 2022.”
Language models act as orchestration layers over specialized neural nets
“Yeah. So, language models are incredibly powerful. It's a very natural interface, and so we continue to use these. But we think about them almost as an orchestration layer. So that's sort of a copilot assistant, but also a system that can direct experiments. And it's orchestrating other specialized models as well. So we do construct neural nets that are specially designed for atomic systems, where there's some symmetry awareness, and those have much lower latency and have been fine-tuned for that.”
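One way to read "orchestration layer" is the dispatch pattern below: the language model plans and routes, while specialized low-latency models do the actual physics. The names and the plan format here are illustrative assumptions, not Periodic's actual stack:

```python
def orchestrate(request, llm_plan, specialized_models):
    """Illustrative orchestration pattern: the LLM produces a plan, and each
    step is dispatched to a purpose-built model rather than being answered
    by the LLM itself."""
    results = []
    for task, payload in llm_plan(request):  # e.g. [("energy", structure), ...]
        model = specialized_models[task]     # e.g. a symmetry-aware atomic net
        results.append((task, model(payload)))
    return results
```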
Robotics will massively accelerate but isn't required for Periodic
“One of the reasons I ask is, I used to run this company, Color, and we built our own liquid handling robotic systems. We'd buy liquid handling robots, but then we had to adjust them dramatically. We had cameras that would use ML to monitor the system and make adjustments. We had to 3D print parts to decrease vibrations on the platform because we were dealing with such small volumes of liquid. So there's an enormous amount of customization versus just having something off the shelf. And the firmware for it was awful, and writing against that was painful.”
Literature data spans orders of magnitude and needs experimental grounding
“One of the engineers on our team was looking at a reported material property, and it was just sort of extracted values from literature. It was really interesting to see the reported value spanned many orders of magnitude. And so you train an ML system on that, and it's like, well, the best you can do is model this distribution, but you're no closer to, like, a ground truth. And that's where experimental data comes in, where you now have a grounding in this.”
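To make the orders-of-magnitude point concrete, here is a toy illustration with made-up numbers (not the actual property the engineer looked at): a model fit to such data can at best recover the distribution of reported values, never the true value.

```python
import numpy as np

# Hypothetical "literature" values for one material property,
# spanning several orders of magnitude (units arbitrary, data invented).
reported = np.array([3e-4, 1.2e-2, 8e-2, 0.5, 2.1, 40.0])

spread = np.log10(reported.max() / reported.min())
print(f"reported values span ~{spread:.1f} orders of magnitude")

# The best a model trained on this can do is reproduce the distribution:
mu, sigma = np.log10(reported).mean(), np.log10(reported).std()
print(f"best fit: log10(value) ~ N({mu:.2f}, {sigma:.2f}) -- still no ground truth")
```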
ChatGPT started as a general chatbot bet over narrow tools
“Well, so the goal was: we need to come up with some productionization of GPT-4. And there were questions about, like, how do we turn this incredibly powerful model into products? And we were all spitballing ideas: writing bot, coding bot, you know, very natural at the time. One of our less interesting ideas was a meeting bot. It would just sit in a Google Meet, take notes, and send out, like, to-dos after. But John Schulman was very opinionated. He's like, we should keep it very general. Let's do a chatbot.”
“GPUs are so extraordinarily expensive. And what's interesting is the compute cost relative to physical infrastructure: so much money is spent on the compute that the physical infrastructure cost is sometimes actually lower. But, you know, the physical infrastructure has very long lead times, and there's an intrinsic difficulty in having these well-calibrated, well-functioning physical systems. From a capital perspective, though, it's primarily a compute cost.”
Intelligence is spiky, not a single scalar capability
“I think one fallacy is thinking about intelligence as a scalar. We've consistently seen these systems have a very odd spikiness. And it's actually possible to architect a system that is world-class on some math domain, but then you could do some perturbations to the questions and actually degrade it substantially. So it's like a bad high school student. And so there's this odd spikiness to these systems.”
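A sketch of how that spikiness can be measured: score a model on a question and on surface-level perturbations that should not change the answer. The perturbations and exact-match scoring here are simplistic placeholders, not a real evaluation harness:

```python
def perturb(question: str) -> list[str]:
    """Trivial surface perturbations that should not change the answer."""
    return [
        question.replace("two", "2"),
        question + " Please show your work.",
        question.upper(),
    ]

def robustness(model_answer, question: str, expected: str) -> float:
    """Fraction of semantically equivalent rephrasings the model still gets
    right; 1.0 means robust, low values expose the 'spiky' behavior."""
    variants = [question] + perturb(question)
    correct = sum(model_answer(q).strip() == expected for q in variants)
    return correct / len(variants)
```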
Physicists dominate AI because they are principled, hard-nosed thinkers
“I think it's a great way to think about the world. It's very principled, very hard-nosed scientists, very careful. And I don't know, I think it's such an incredible field. You have such high leverage in computer science, in AI. And so I think a lot of physicists were seeing that, particularly in high-energy physics. After the discovery of the Higgs, I think a lot of high-energy physicists were sort of looking for what's next. Ultimately, it becomes bottlenecked on the new apparatus for pushing the next energy frontier.”
Software engineering self-improvement is happening now, other domains lag
“So these systems have become so incredibly impressive on this domain as a result of huge amounts of data and really cheap verifiable environments. Like, you know, you can check that unit tests go from failing to passing with just a few CPUs. It's basically instantaneous. There's no domain expertise gap between an AI researcher and a software engineer. And, obviously, this will become, and is becoming, a larger contributor to the next generation of the system.”
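The "cheap verifiable environment" is concrete enough to sketch: a reward function that runs a test suite and returns pass/fail. This assumes a pytest-based project; framing it as a binary reward for a coding agent is my gloss, not the speaker's:

```python
import subprocess

def unit_test_reward(repo_dir: str) -> int:
    """Binary reward for a coding agent: 1 if the test suite passes, else 0.
    Cheap to run and unambiguous, which is what makes software such a
    friendly domain for self-improvement loops."""
    result = subprocess.run(
        ["python", "-m", "pytest", "-q"],
        cwd=repo_dir,
        capture_output=True,
    )
    return 1 if result.returncode == 0 else 0
```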
“The opinion that I and others held at Periodic was: you're not going to see the same kind of acceleration in science and technology unless you start connecting these things to the physical world. Science ultimately isn't sitting in a room thinking really hard. You have to conduct experiments. You have to learn from them. You have to interface with reality. And the creation of ChatGPT in late 2022 was an important technology, but it was still far too weak. We couldn't have done Periodic on the technology of that era.”