Tobias Münch
Is the web ready for voice user interfaces?
#1 about 3 minutes
Why voice user interfaces are important for accessibility
Voice interfaces can significantly improve web accessibility for users with disabilities and provide hands-free convenience for mobile professionals.
#2 about 1 minute
Understanding the Web Speech API's core functions
The Web Speech API is a W3C specification with two parts: speech recognition, which converts voice to text, and speech synthesis, which converts text to voice.
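As a rough illustration of the synthesis half, the sketch below queues a string for playback through `speechSynthesis` and `SpeechSynthesisUtterance`. The helper name `speak` and the `scope` parameter (normally `window`) are assumptions made here so the function can be exercised outside a browser; they do not come from the talk.

```javascript
// Minimal text-to-speech sketch. `scope` is normally `window`; passing it in
// is an illustrative choice so the helper can be exercised with a stub.
function speak(scope, text, lang = 'en-US') {
  const synth = scope.speechSynthesis;
  const Utterance = scope.SpeechSynthesisUtterance;
  if (!synth || !Utterance) {
    return false; // synthesis is not available in this environment
  }
  const utterance = new Utterance(text);
  utterance.lang = lang;  // BCP 47 language tag, e.g. 'en-US' or 'de-DE'
  utterance.rate = 1;     // 1 is the default speaking rate
  synth.speak(utterance); // queues the utterance for playback
  return true;
}
```

In a browser this would be called as `speak(window, 'Hello')`. Some browsers may only play audio after a user interaction such as a click.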
#3 about 2 minutes
Reviewing VUI research and its current limitations
Research projects such as the Conversational Web and a wheelchair VUI demonstrate the potential of voice interfaces but suffer from inconsistent accuracy, online-only functionality, and missing wake-word support.
#4 about 3 minutes
How to implement the Web Speech API in JavaScript
Learn the step-by-step process of implementing speech recognition, including loading the class, configuring grammar with JSGF, starting the listener, and processing the results.
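The four steps named above can be sketched roughly as follows. This is not the speaker's code: the helper name `createRecognizer`, the `scope` and `onText` parameters, and the example command grammar are all assumptions for illustration. The vendor-prefixed `webkitSpeechRecognition` fallback is included because some browsers only expose the prefixed class, and grammar support varies between browsers and may be ignored.

```javascript
// Sketch of the four steps: load the class, configure a JSGF grammar,
// wire up result handling, and return a recognizer ready to be started.
// `scope` is normally `window`; `onText` is a hypothetical callback.
function createRecognizer(scope, onText) {
  // Step 1: load the class, falling back to the vendor-prefixed name.
  const Recognition = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  if (!Recognition) {
    return null; // speech recognition is not available here
  }
  const recognizer = new Recognition();
  recognizer.lang = 'en-US';
  recognizer.interimResults = false; // deliver final results only
  recognizer.maxAlternatives = 3;

  // Step 2: configure a grammar in JSGF, if the browser exposes the class.
  // The command words below are made-up examples.
  const GrammarList = scope.SpeechGrammarList || scope.webkitSpeechGrammarList;
  if (GrammarList) {
    const grammars = new GrammarList();
    const jsgf =
      '#JSGF V1.0; grammar commands; public <command> = open | close | next | back ;';
    grammars.addFromString(jsgf, 1);
    recognizer.grammars = grammars;
  }

  // Step 4: process results (the handler is registered before starting).
  recognizer.onresult = (event) => {
    const alternative = event.results[event.resultIndex][0];
    onText(alternative.transcript, alternative.confidence);
  };
  recognizer.onerror = (event) => {
    console.warn('Speech recognition error:', event.error);
  };
  return recognizer;
}
```

Step 3, starting the listener, is left to the caller, for example `createRecognizer(window, console.log)?.start()`. The browser usually asks for microphone permission at that point.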
#5 about 2 minutes
Navigating the Web Speech API's result data structure
The API returns a nested data structure containing a list of results, each with alternatives that include the text transcript and a confidence score.
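To make the nesting concrete, here is a small sketch that walks such a structure and keeps the highest-confidence alternative of each final result. The function name `bestTranscripts` is hypothetical. It assumes `results` is array-like (as `event.results` is in the browser), that each result is itself an array-like of alternatives carrying `transcript` and `confidence`, and that interim results are marked with `isFinal === false`.

```javascript
// Walk results -> alternatives and keep the most confident transcript
// of every final result. Works on any array-like input of that shape.
function bestTranscripts(results) {
  const best = [];
  for (const result of Array.from(results)) {
    const alternatives = Array.from(result);
    // Skip interim results and (defensively) results with no alternatives.
    if (result.isFinal === false || alternatives.length === 0) {
      continue;
    }
    const top = alternatives.reduce((a, b) =>
      b.confidence > a.confidence ? b : a
    );
    best.push({ text: top.transcript.trim(), confidence: top.confidence });
  }
  return best;
}
```

Inside an `onresult` handler this could be called as `bestTranscripts(event.results)`. Comparing the returned confidence against a threshold is one possible way to decide whether to act on a command or ask the user to repeat it.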
#6 about 3 minutes
Key challenges limiting Web Speech API adoption
The API's adoption is hindered by significant issues including poor developer experience, privacy risks from cloud processing, no offline support, and inconsistent browser implementations.
#7 about 3 minutes
A look inside the browser's implementation of speech recognition
An analysis of the Chromium source code reveals how the Web Speech API is implemented through layers that manage and dispatch recognition tasks to either remote cloud services or local OS-dependent engines.
#8 about 5 minutes
The future of VUIs with Stanford's React Genie
Stanford's React Genie project offers a new paradigm by loosely coupling a voice agent with React state, allowing for complex voice commands that can manipulate off-screen content and application logic.
#9 about 1 minute
Final verdict on the web's readiness for voice UIs
The current Web Speech API is suitable for experimentation but not reliable enough for production use. Promising research nevertheless points to a more capable future for web-based voice interfaces.