Overview

Lemonade is an open-source local AI runtime that runs text, image, and speech models entirely on your own machine, with no cloud dependency and no telemetry. It exposes OpenAI-compatible API endpoints so existing OpenAI SDK clients can point at it unchanged, and it can serve multiple models simultaneously — large chat models like gpt-oss-120b and Qwen-Coder-Next with 64k+ context, vision models for image analysis, image generation and editing, automatic speech recognition, and speech synthesis. Hardware is auto-detected and optimized on the fly, with specific tuning for AMD Ryzen AI, Radeon, and Strix Halo systems alongside standard macOS, Windows, and Linux support. A built-in control panel app makes it easy to browse, download, and swap models, and the whole thing ships as a portable binary under 10 MB for simple deployment. Free, open source, private by design.

Pricing: Free (open source)

Architecture: Apple Silicon, Intel