Real-time voice input and output with streaming speech generation. Transcription runs either in the browser via its built-in speech-to-text (STT) or on the server via Whisper.
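
As a rough illustration of the two transcription paths, the sketch below picks a backend by feature detection: it uses browser STT when the Web Speech API constructor is exposed and otherwise falls back to server-side Whisper. The names `SttBackend`, `SpeechGlobals`, `chooseSttBackend`, and `preferServer` are hypothetical and not part of this project; the actual selection logic may differ.

```typescript
// Hypothetical sketch: none of these names come from the project itself.
type SttBackend = "browser" | "server-whisper";

// The subset of a global object we inspect. Browsers that implement the
// Web Speech API expose SpeechRecognition or the prefixed webkitSpeechRecognition.
interface SpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

// Return "browser" when in-browser recognition is available and not overridden,
// otherwise fall back to server-side Whisper transcription.
function chooseSttBackend(globals: SpeechGlobals, preferServer = false): SttBackend {
  if (preferServer) {
    return "server-whisper";
  }
  const hasBrowserStt =
    globals.SpeechRecognition !== undefined ||
    globals.webkitSpeechRecognition !== undefined;
  return hasBrowserStt ? "browser" : "server-whisper";
}
```

Taking the global object as a parameter, rather than reading `window` directly, keeps the function usable outside a browser, where it simply reports the server-side path.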