Izwi
Local audio inference engine for speech, transcription, and voice cloning
Overview
Izwi is a complete audio inference engine that runs entirely on your Mac, with no cloud dependency. Built in Rust with Metal acceleration for Apple Silicon, it provides text-to-speech, speech recognition, voice cloning, voice design, conversational AI, and speaker diarization, all processed locally. The project cites performance 10x faster than cloud APIs, with first-token latency under 50 ms. It includes both a desktop GUI and a production-ready API server with OpenAI-compatible endpoints, making it suitable for personal use and local development workflows. Izwi is currently in alpha and open source under the Apache 2.0 license; it is free to use and requires no API keys or internet connection.
Architecture: Apple Silicon, Intel
Key Features
- Text-to-speech with multiple voices and pitch/speed control
- Speech recognition with word-level timestamps and multi-language support
- Voice cloning from just seconds of audio
- Voice design from text descriptions
- Conversational AI with real-time dialogue
- Speaker diarization for multi-speaker audio separation
- OpenAI-compatible API endpoints for local development
- Rust-native with Metal GPU acceleration on Apple Silicon
- 10x faster than cloud APIs, with first-token latency under 50 ms
- Desktop GUI and production API server modes
- Zero external dependencies: no internet or API keys required
- 100% local processing for complete privacy
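Because the API server is described as OpenAI-compatible, a local text-to-speech call would plausibly look like a request to OpenAI's `/audio/speech` endpoint pointed at localhost. The sketch below is illustrative only: the host, port, model name, and voice name are assumptions and are not taken from Izwi's documentation, so check the server's own docs for the real values. It uses only the Python standard library.

```python
import json
import urllib.request

# Assumption: placeholder address. Izwi's actual host, port, and path prefix
# are not stated in this listing.
BASE_URL = "http://localhost:8080/v1"


def build_speech_request(text, voice="default", model="local-tts", base_url=BASE_URL):
    """Build a POST request shaped like OpenAI's /audio/speech endpoint.

    The `model` and `voice` defaults are hypothetical placeholder names.
    """
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path, **kwargs):
    """Send the request and write the returned audio bytes to out_path.

    The audio container (WAV, MP3, ...) depends on the server, so the caller
    chooses the file name.
    """
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

With a server running, `synthesize("Hello from a local model", "hello.audio")` would write the response body to disk. Keeping request construction separate from the network call makes the payload easy to inspect without a live server.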
Similar Apps
Free Voice Reader
Complete voice AI suite for macOS - dictate, read aloud, clone voices, and transcribe meetings locally
Voice Clone: AI Voice Cloning
Clone any voice in minutes with AI
FonoX
High-Quality Text-to-Speech, Fully Local on macOS
Speaklone
Professional voice cloning and synthesis powered by on-device AI