The world of voice AI, with Mati Staniszewski of ElevenLabs
- Voice AI lags behind text in conversational nuance
  "AI has conquered text but still struggles with conversational speech. We are trying to solve the voice Turing Test, where you can't tell if you're talking to a machine or a human, because of the subtle emotional shifts and the way we naturally interrupt or emphasize certain words."
- ElevenLabs reached an $11 billion valuation
  "The rapid ascent to an $11 billion valuation was driven by the realization that audio is the next frontier of accessibility. It's not just about reading text; it's about creating a presence that feels authentic across every language and allows for a more natural interaction with machines."
- Ukraine uses ElevenLabs for digital government services
  "Ukraine is actually using our tech for digital government services, which is an incredible use case for accessibility. It allows them to communicate vital information to citizens in a way that is both efficient and human-sounding, ensuring that the technology serves a real social purpose during a time of crisis."
- Voice agents transform industries like farming and healthcare
  "The potential for voice agents in everything from farming to healthcare is massive. Imagine a farmer being able to interact with complex data systems through natural conversation while their hands are busy in the field, or a healthcare provider getting instant, verbal updates on patient metrics without looking at a screen."
- Speech-to-speech models replace cascaded audio systems
  "We are moving away from cascaded models toward speech-to-speech, which allows for much lower latency and better preservation of intent. It's about designing an AI-native organization that focuses on the end-to-end audio experience rather than just stacking different layers of text and sound together."
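The latency argument in that last quote can be made concrete with a small sketch. A cascaded stack waits on each stage in turn (speech recognition, then a text model, then synthesis), so its end-to-end delay is the sum of all three, while a single speech-to-speech model has one stage. The per-stage millisecond figures below are illustrative assumptions, not ElevenLabs measurements:

```python
# Illustrative per-stage delays (milliseconds); these numbers are
# assumptions chosen for the sketch, not real benchmarks.
STAGE_LATENCY_MS = {"asr": 300, "llm": 400, "tts": 350, "s2s": 450}

def cascaded_latency_ms():
    # Cascaded pipeline: ASR -> LLM -> TTS. Each stage blocks on the
    # previous one, so latencies add up. The intermediate text step
    # also discards tone and emphasis carried in the original audio.
    return (STAGE_LATENCY_MS["asr"]
            + STAGE_LATENCY_MS["llm"]
            + STAGE_LATENCY_MS["tts"])

def speech_to_speech_latency_ms():
    # End-to-end model: one stage maps input audio directly to output
    # audio, so there is no per-layer handoff cost.
    return STAGE_LATENCY_MS["s2s"]

print(cascaded_latency_ms())         # 1050
print(speech_to_speech_latency_ms()) # 450
```

Even with a generous single-model latency, the end-to-end design wins simply because it removes the stage boundaries, which is the structural point being made about "stacking different layers of text and sound together."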
