The world of voice AI, with Mati Staniszewski of ElevenLabs
- Voice AI lags behind text in conversational nuance
  "AI has conquered text but still struggles with conversational speech. We are trying to solve the voice Turing Test, where you can't tell if you're talking to a machine or a human, because of the subtle emotional shifts and the way we naturally interrupt or emphasize certain words."
- ElevenLabs reached an $11 billion valuation
  "The rapid ascent to an $11 billion valuation was driven by the realization that audio is the next frontier of accessibility. It's not just about reading text; it's about creating a presence that feels authentic across every language and allows for a more natural interaction with machines."
- Ukraine uses ElevenLabs for digital government services
  "Ukraine is actually using our tech for digital government services, which is an incredible use case for accessibility. It allows them to communicate vital information to citizens in a way that is both efficient and human-sounding, ensuring that the technology serves a real social purpose during a time of crisis."
- Voice agents transform industries like farming and healthcare
  "The potential for voice agents in everything from farming to healthcare is massive. Imagine a farmer being able to interact with complex data systems through natural conversation while their hands are busy in the field, or a healthcare provider getting instant, verbal updates on patient metrics without looking at a screen."
- Speech-to-speech models replace cascaded audio systems
  "We are moving away from cascaded models toward speech-to-speech, which allows for much lower latency and better preservation of intent. It's about designing an AI-native organization that focuses on the end-to-end audio experience rather than just stacking different layers of text and sound together."
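The latency argument in that last quote can be made concrete with a small sketch. A cascaded stack waits on each stage in turn (speech recognition, then a text model, then synthesis), so its end-to-end delay is the sum of all three, while a single speech-to-speech model has one stage. The per-stage millisecond figures below are illustrative assumptions, not ElevenLabs measurements:

```python
# Illustrative per-stage delays (milliseconds); these numbers are
# assumptions chosen for the sketch, not real benchmarks.
STAGE_LATENCY_MS = {"asr": 300, "llm": 400, "tts": 350, "s2s": 450}

def cascaded_latency_ms():
    # Cascaded pipeline: ASR -> LLM -> TTS. Each stage blocks on the
    # previous one, so latencies add up. The intermediate text step
    # also discards tone and emphasis carried in the original audio.
    return (STAGE_LATENCY_MS["asr"]
            + STAGE_LATENCY_MS["llm"]
            + STAGE_LATENCY_MS["tts"])

def speech_to_speech_latency_ms():
    # End-to-end model: one stage maps input audio directly to output
    # audio, so there is no per-layer handoff cost.
    return STAGE_LATENCY_MS["s2s"]

print(cascaded_latency_ms())         # 1050
print(speech_to_speech_latency_ms()) # 450
```

Even with a generous single-model latency, the end-to-end design wins simply because it removes the stage boundaries, which is the structural point being made about "stacking different layers of text and sound together."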
