Local AI

MLX

Apple's array framework for machine learning on Apple Silicon. Native Metal support, unified memory, first-class LLM inference.

Try MLXgithub.com/ml-explore/mlx

Save

MLX is Apple's machine learning framework built specifically for Apple Silicon. Unlike running llama.cpp through Metal, MLX is designed ground-up for the unified memory architecture of M-series chips, which means model weights and KV cache can be shared between CPU and GPU with no copy overhead. For local inference on a Mac, this delivers noticeably better tokens-per-second than the generic options at the same memory footprint. The ecosystem now includes mlx-lm for LLM inference with a simple Python API, mlx-vlm for vision-language models, and community-maintained quantized weights for most popular open-source LLMs. For anyone doing serious local work on a MacBook Pro or Mac Studio, MLX is the default inference layer in 2026.

apple-silicon metal local inference apple m-series

Similar Tools

Local AI

llama.cpp

C++ inference engine for LLMs. GGUF format, quantization, CPU and Metal/CUDA support. The foundation most local tools build on.

Local AI

LM Studio

Desktop app for discovering, downloading, and running local LLMs. Clean chat UI, OpenAI-compatible API server, and automatic GPU detection. MLX engine optimized for Apple Silicon.

Local AI

vLLM

High-throughput inference server for LLMs. PagedAttention memory management. The go-to for serious local or self-hosted serving.

Local AI

Ollama

The easiest way to run LLMs locally. One command to pull and run any model. OpenAI-compatible API. 52M+ monthly downloads. Supports GGUF, Safetensors, and custom Modelfiles.

Get started with MLX

Apple's array framework for machine learning on Apple Silicon. Native Metal support, unified memory, first-class LLM inference.

Try MLX

Get weekly tool reviews

Honest takes on AI dev tools, frameworks, and infrastructure - delivered to your inbox.

Subscribe Free

Compare all pricing Compare side by side

More Local AI Tools

Ollama

The easiest way to run LLMs locally. One command to pull and run any model. OpenAI-compatible API. 52M+ monthly downloads. Supports GGUF, Safetensors, and custom Modelfiles.

LM Studio

Desktop app for discovering, downloading, and running local LLMs. Clean chat UI, OpenAI-compatible API server, and automatic GPU detection. MLX engine optimized for Apple Silicon.

Jan

Open-source ChatGPT alternative that runs 100% offline. Desktop app with local models, cloud API connections, custom assistants, and MCP integration. AGPLv3 licensed.

Related Guides

Guide

Run AI Models Locally with Ollama and LM Studio

Install Ollama and LM Studio, pull your first model, and run AI locally for coding, chat, and automation - with zero cloud dependency.

Getting Started

Guide

MCP Installation Scopes - Claude Code

Local, project, user, and plugin-level MCP configurations.

Claude Code

Guide

Scheduled Tasks (Desktop) - Claude Code

GUI-based scheduling on your local machine for recurring work.

Claude Code

All AI Tools

MLX

Similar Tools

llama.cpp

LM Studio

vLLM

Ollama

Get started with MLX

Get weekly tool reviews

More Local AI Tools

Ollama

LM Studio

Jan

Related Guides

Run AI Models Locally with Ollama and LM Studio

MCP Installation Scopes - Claude Code

Scheduled Tasks (Desktop) - Claude Code

Get Smarter About AI Dev

MLX

Similar Tools

llama.cpp

LM Studio

vLLM

Ollama

Get started with MLX

Get weekly tool reviews

More Local AI Tools

Ollama

LM Studio

Jan

Related Guides

Run AI Models Locally with Ollama and LM Studio

MCP Installation Scopes - Claude Code

Scheduled Tasks (Desktop) - Claude Code

Get Smarter About AI Dev