Running Claude Code fully offline on a MacBook — no API key, no cloud, 17s per task
TL;DR Highlight
A post sharing how to run Claude Code fully offline on a MacBook by connecting it to a local LLM without an API key or cloud, useful for developers who want to use an AI coding assistant at no cost.
Who Should Read
Developers who want to use AI coding assistants like Claude Code locally without API costs, or developers who need to leverage AI tools in internet-restricted environments (offline, secure networks, etc.).
Core Mechanics
- The author connected Claude Code to a local LLM running entirely on the MacBook, so no API key and no Anthropic cloud servers are involved.
- The bridge is a small (~200-line) Python server that speaks the Anthropic Messages API and forwards requests to a local MLX model; Claude Code itself runs unmodified.
- Reported performance is roughly 17 seconds per task — slower than the hosted API, but fully offline and at zero cost.
- Fully offline execution offers several practical benefits, including data privacy, no API costs, and usability in network-free or restricted environments.
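The bridging approach can be sketched as a tiny Anthropic Messages API-compatible server. This is an illustration, not the author's actual code: `generate_locally` is a placeholder standing in for a real local-model call (e.g., MLX or Ollama generation), and the port and response fields follow the public shape of the Anthropic Messages API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def generate_locally(prompt: str) -> str:
    """Placeholder for the local model call (e.g., an MLX or Ollama
    generate function); returns canned text here for illustration."""
    return f"echo: {prompt}"

class MessagesHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/v1/messages":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        req = json.loads(self.rfile.read(length))
        # Use the last user message as the prompt; content may be a
        # plain string or a list of content blocks.
        prompt = req["messages"][-1]["content"]
        if isinstance(prompt, list):
            prompt = " ".join(b.get("text", "") for b in prompt)
        completion = generate_locally(prompt)
        # Shape the reply like an Anthropic Messages API response.
        body = json.dumps({
            "id": "msg_local_001",
            "type": "message",
            "role": "assistant",
            "model": req.get("model", "local"),
            "content": [{"type": "text", "text": completion}],
            "stop_reason": "end_turn",
            "usage": {"input_tokens": 0, "output_tokens": 0},
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

# To serve for Claude Code (port is an assumption, pick any free one):
# HTTPServer(("127.0.0.1", 8080), MessagesHandler).serve_forever()
```

With a server like this listening locally, Claude Code's requests never leave the machine; the real version would swap `generate_locally` for the MLX generation call and stream tokens back.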
Evidence
- Author built ~200-line Python server so Claude Code talks directly to local MLX model via Anthropic Messages API — no proxy or middleware
- M5 Max (128GB) benchmark: ~2.2s for 100 tokens (45 tok/s), ~11s for 500 tokens — slower than API but fully offline at zero cost
- Counterpoint in the comments: this was already possible by pointing Claude Code's API base URL at a local endpoint, so a custom server adds unnecessary complexity; launching Claude Code against an Ollama endpoint achieves the same result.
- Positive responses: one user ran Qwen3.5 30B 4-bit and built Conway's Game of Life on the first try; another predicted the approach "will be essential as prices rise."
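The benchmark numbers above are internally consistent, as a quick throughput check shows (figures taken directly from the post's reported benchmark):

```python
# Benchmark numbers reported in the post: ~2.2 s for 100 tokens,
# ~11 s for 500 tokens, on an M5 Max with 128 GB of RAM.
tokens_short, secs_short = 100, 2.2
tokens_long, secs_long = 500, 11.0

print(round(tokens_short / secs_short))  # 45 tok/s, matching the reported rate
print(round(tokens_long / secs_long))    # 45 tok/s: throughput holds at longer outputs
```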
How to Apply
- Run a local, Anthropic Messages API-compatible server in front of a local model. The post uses an MLX model; connecting Ollama with a local model (e.g., the Qwen or Llama family) is a common alternative.
- Point Claude Code at the local server by setting its ANTHROPIC_BASE_URL environment variable to the server's address.
- For the author's full write-up and code, see the original thread: https://www.reddit.com/r/ClaudeAI/comments/1s43b8w/
- If you need an offline AI coding assistant more generally, the Continue.dev + Ollama combination is also worth considering as an alternative.
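The redirection step can be sketched in a few lines of shell. ANTHROPIC_BASE_URL is a documented Claude Code override; the localhost port and the dummy key value here are assumptions for illustration (some setups expect a non-empty key even though a local server ignores it).

```shell
# Point Claude Code at a local Anthropic-compatible server.
# Port 8080 is an assumption; use whatever your server listens on.
export ANTHROPIC_BASE_URL="http://localhost:8080"

# Dummy value: the local server ignores it, but clients may
# require the variable to be non-empty.
export ANTHROPIC_API_KEY="local-dummy-key"

claude   # launch Claude Code as usual; requests now stay on-machine
```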
Terminology
Claude Code: A CLI-based AI coding assistant created by Anthropic. It lets you write, edit, and debug code using natural language directly from the terminal.
Offline LLM: A large language model that runs directly on a local machine without an internet connection. It can be run on devices like a MacBook using tools such as Ollama or MLX.
API key: A secret key used for authentication and billing when accessing cloud AI services (e.g., Anthropic, OpenAI). Not required when running locally.