Text Generation WebUI
By oobabooga
Gradio-based web UI for running local LLMs on multiple backends (llama.cpp, Transformers, ExLlama), with support for extensions.
Best for
- local model experimentation
- multi-backend serving
- power-user UI
Other Local & on-device AI
Ollama
Run open-weight LLMs locally with a single command. Bundles model weights, quantizations, and an OpenAI-compatible HTTP API into a clean CLI.
LM Studio
Desktop GUI for downloading and chatting with local LLMs. The friendly way to try open-weight models without touching a terminal.
llama.cpp
C/C++ inference engine for LLaMA-family models. The library behind many local AI apps: fast, low-level, and able to run on almost any hardware.
Jan
Open-source ChatGPT alternative that runs entirely offline. Built on llama.cpp with a clean desktop UI and an OpenAI-compatible API.
MLX
Apple's array framework for Apple Silicon. Designed to run ML workloads natively on M-series Macs, with memory shared between CPU and GPU.
GPT4All
Open-source desktop app for running LLMs locally with a chat UI, document RAG, and a browsable model catalog.
Open WebUI
Self-hosted, extensible ChatGPT-style web interface for local and remote models, with offline operation and RAG.
AnythingLLM
All-in-one desktop/self-hosted app for document chat (RAG) and agents over local or cloud models.