Vlad Tenev and Tudor Achim on mathematical superintelligence, why math is harder than code for LLMs, and the end of buggy software
- LLMs hallucinate by predicting words instead of logic
“The core problem with current large language models is that they are fundamentally probabilistic word predictors. When you ask them to do math, they aren't reasoning; they are guessing the next most likely token. Harmonic's approach with Aristotle moves away from this by forcing the model to reason in formal logic and Lean code, which ensures the result is actually correct.”
- Aristotle 10xed verified Erdős problems in months
“In just a few months, Aristotle was able to 10x the total corpus of formally verified Erdős problems. This shows the scale at which AI can accelerate mathematical discovery when it isn't limited by human speed or the fear of making a mistake. We are essentially automating the verification process for complex mathematical conjectures.”
- Formal verification will eliminate software bugs
“The end of buggy software is within sight because of formal verification. If we can treat software code like a mathematical proof, we can mathematically guarantee that the program will behave exactly as intended. This shifts the paradigm from testing for bugs to proving that bugs cannot exist in the system.”
- AI will solve its first Millennium Prize problem by 2027 or 2028
“We are predicting that the first Millennium Prize problem will be solved by 2027 or 2028 using these advanced mathematical models. The progress we are seeing is exponential, and as these models get better at formal reasoning, the most difficult unsolved problems in mathematics become solvable targets.”
- Aristotle uses Lean code for verified outputs
“Aristotle produces 100% verified mathematical outputs because it reasons in Lean code rather than natural language. By shifting the foundation to formal logic, we eliminate the hallucinations that plague current generative AI models. This is the difference between a model that sounds smart and one that is provably correct.”
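
To make the idea of treating software like a mathematical proof concrete, here is a minimal Lean 4 sketch. It is an illustration only and is not taken from Aristotle or Harmonic: the function `clamp` and the theorem `clamp_le` are hypothetical names chosen for this example. It defines a small function and then proves, for every possible input, that the result never exceeds the limit.

```lean
-- Hypothetical toy function (an assumption for illustration, not from the source):
-- returns x when it is within the limit, otherwise returns the limit itself.
def clamp (limit x : Nat) : Nat :=
  if x ≤ limit then x else limit

-- The property we want guaranteed: the output never exceeds the limit.
-- Lean's kernel checks this proof; if it were wrong, the file would not compile.
theorem clamp_le (limit x : Nat) : clamp limit x ≤ limit := by
  unfold clamp
  split
  · assumption              -- case x ≤ limit: the result is x, and x ≤ limit holds
  · exact Nat.le_refl limit -- otherwise the result is limit, and limit ≤ limit
```

A unit test would check `clamp` on a handful of inputs, whereas the theorem covers all natural numbers at once. The guarantee only extends to the property that was actually stated, so the specification itself still has to be the right one.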
