PUBLISHED: APR 14, 2026
INDEXED: APR 19, 2026, 12:29 AM

The world of voice AI, with Mati Staniszewski of ElevenLabs

Key Takeaways

  •

    Voice AI lags behind text in conversational nuance

    “AI has conquered text but still struggles with conversational speech. We are trying to solve the voice Turing Test where you can't tell if you're talking to a machine or a human because of the subtle emotional shifts and the way we naturally interrupt or emphasize certain words.”

    - Mati Staniszewski
  •

    ElevenLabs reached an $11 billion valuation

    “The rapid ascent to an $11 billion valuation was driven by the realization that audio is the next frontier of accessibility. It’s not just about reading text; it’s about creating a presence that feels authentic across every language and allows for a more natural interaction with machines.”

    - Mati Staniszewski
  •

    Ukraine utilizes ElevenLabs for digital government services

    “Ukraine is actually using our tech for digital government services, which is an incredible use case for accessibility. It allows them to communicate vital information to citizens in a way that is both efficient and human-sounding, ensuring that the technology serves a real social purpose during a time of crisis.”

    - Mati Staniszewski
  •

    Voice agents transform industries like farming and healthcare

    “The potential for voice agents in everything from farming to healthcare is massive. Imagine a farmer being able to interact with complex data systems through natural conversation while their hands are busy in the field, or a healthcare provider getting instant, verbal updates on patient metrics without looking at a screen.”

    - Mati Staniszewski
  •

    Speech-to-speech models replace cascaded audio systems

    “We are moving away from cascaded models toward speech-to-speech, which allows for much lower latency and better preservation of intent. It's about designing an AI-native organization that focuses on the end-to-end audio experience rather than just stacking different layers of text and sound together.”

    - Mati Staniszewski

Episode Description

Mati Staniszewski is the co-founder of ElevenLabs, the research company making audio accessible across languages and voices. He sits down with John to discuss the "voice Turing Test" and why AI has conquered text but still struggles with conversational speech. They discuss the future of human-computer interaction, including why we still can't get our phones to read a PDF properly and the massive potential for voice agents in everything from farming to healthcare. Mati also opens up about ElevenLabs’ rapid ascent to an $11 billion valuation and gives a behind-the-scenes look at how Ukraine is using their tech for digital government services.

Timestamps

(00:00:27) How audio models work
(00:08:52) ElevenLabs business model
(00:17:50) The conversational Turing Test
(00:21:01) Link by Stripe
(00:26:02) Cascaded vs speech-to-speech
(00:31:53) Universal translation
(00:51:41) Designing an AI-native org
