Doing

Ultra-fast local voice transcription for Mac with 150x real-time speed

Paid · Dictation

Overview

Doing is a fast local voice transcription app for Mac that converts speech to text without cloud services or subscriptions. Hold the Fn key to speak, then release to paste the transcribed text at your cursor. It is powered by NVIDIA's Parakeet-TDT engine running entirely on-device and processes 60 seconds of audio in roughly 400 ms. Doing includes a Skills system for AI-powered post-processing with customizable prompts, a YOLO mode that presses Return automatically after pasting, and audio ducking that fades music during recording. Optional cloud engines such as OpenAI Whisper, Google Gemini, and AssemblyAI are available for users who prefer them. No account is required, and the free trial includes 100 transcriptions.

Pricing: $49

Minimum macOS: 14.0 (Sonoma)

Architecture: Apple Silicon

Key Features

  • 150x real-time transcription speed: 60 seconds of audio processed in ~400ms
  • Fully local on-device processing with NVIDIA Parakeet-TDT engine
  • Hold hotkey to speak, release to paste at cursor position
  • Skills system for AI-powered post-processing with customizable prompts
  • YOLO mode that automatically presses Return after pasting
  • Audio ducking that fades music during recording
  • Mouse pip indicator showing where text will be pasted
  • Optional cloud engines: OpenAI Whisper, Google Gemini, AssemblyAI
  • No account required; free trial includes 100 transcriptions
  • 3 device activations per license, plus a 14-day refund guarantee
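
The 150x figure in the first feature follows directly from the quoted timings: 60 seconds of audio divided by roughly 400 ms of processing. A minimal sketch of that arithmetic is below. The function name real_time_factor is hypothetical and is not part of Doing; it only illustrates how the ratio is computed.

```python
def real_time_factor(audio_ms: float, processing_ms: float) -> float:
    """Return how many times faster than real time the audio was processed.

    Hypothetical helper for illustration only; not part of the Doing app.
    """
    if processing_ms <= 0:
        raise ValueError("processing_ms must be positive")
    return audio_ms / processing_ms


# Figures quoted in the listing: 60 seconds of audio in roughly 400 ms.
factor = real_time_factor(audio_ms=60_000, processing_ms=400)
print(f"{factor:.0f}x real time")
```

Because the 400 ms figure is approximate, the resulting factor should be read as approximate too.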

Tags

voice input, transcription, text generation