90%+ fewer tokens per session by reading a pre-compiled wiki instead of exploring files cold. Built from Karpathy's workflow.
TL;DR Highlight
A workflow-sharing post: pre-organizing a codebase as a wiki, instead of having Claude explore it cold every session, reportedly cuts token usage per session by more than 90%.
Who Should Read
Developers using Claude or other LLMs for codebase exploration and development tasks who are hitting token costs or context limits.
Core Mechanics
- Letting the AI explore files cold in each session burns many redundant tokens; pre-organizing the codebase in wiki format avoids repeating that exploration.
- This approach is inspired by the workflow used by Andrej Karpathy, and the key is to pre-compile the structure and core content of the codebase.
- It is reported that this method reduces token usage per session by more than 90%, significantly reducing the cost of repetitive codebase exploration.
- Due to blocked access to the original source, it was not possible to confirm the specific implementation methods, tools, and scripts.
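The original post's script could not be retrieved, so here is a minimal sketch of what compiling such a wiki might look like: a directory outline plus placeholder sections for the LLM (or a one-time exploration session) to fill in. The function name, output filename, and wiki layout are all illustrative assumptions, not the post's actual implementation.

```python
from pathlib import Path

def compile_wiki(repo_root: str, out_path: str = "CODEBASE_WIKI.md",
                 max_depth: int = 2) -> str:
    """Write a minimal codebase wiki: a shallow directory tree plus
    stub sections to be filled in by a one-time exploration pass."""
    root = Path(repo_root)
    lines = [f"# Codebase Wiki: {root.name}", "", "## Structure"]
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Skip deep paths and hidden files/directories to keep the wiki small.
        if len(rel.parts) > max_depth or any(p.startswith(".") for p in rel.parts):
            continue
        indent = "  " * (len(rel.parts) - 1)
        suffix = "/" if path.is_dir() else ""
        lines.append(f"{indent}- {rel.name}{suffix}")
    lines += ["", "## Key Modules",
              "<!-- one line per module: purpose, entry points -->"]
    Path(out_path).write_text("\n".join(lines), encoding="utf-8")
    return out_path
```

The point of the sketch: the expensive part (reading file contents) happens at most once, while every later session reads only this compact outline.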
Evidence
- No supporting quotes were available; access to the original post is blocked.
How to Apply
- If you work with Claude on the same codebase repeatedly, create a Markdown wiki in advance that maps the project structure, key modules, and function roles, and inject only that file at the start of each session.
- When starting a new project, have Claude explore the entire codebase only once and save the results to a file such as CODEBASE_WIKI.md; subsequent sessions then refer only to that file to save tokens.
- If you need specific implementation methods from the original post, visit the original Reddit URL (https://www.reddit.com/r/ClaudeAI/comments/1sfdztg/) directly or refer to Karpathy's publicly available workflow-related materials.
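The injection step above can be sketched as a small prompt builder: prepend the pre-compiled wiki when it exists, and fall back to a one-time cold exploration (which also asks the model to write the wiki) when it does not. The filename and prompt wording are assumptions for illustration, not the original post's exact method.

```python
from pathlib import Path

WIKI_PATH = Path("CODEBASE_WIKI.md")  # hypothetical filename, as suggested above

def build_session_prompt(task: str, wiki_path: Path = WIKI_PATH) -> str:
    """Prepend the pre-compiled wiki so the model can skip cold file exploration."""
    wiki = wiki_path.read_text(encoding="utf-8") if wiki_path.exists() else ""
    if not wiki:
        # First session only: explore cold, and persist the findings for next time.
        return (f"Explore the repository, then: {task}\n"
                f"Save a summary of the structure and key modules to {wiki_path}.")
    return (
        "Codebase reference (pre-compiled, do not re-explore files):\n"
        f"{wiki}\n\n"
        f"Task: {task}"
    )
```

Every session after the first pays only for the wiki's tokens instead of a fresh directory walk and file reads, which is where the reported savings come from.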
Related Papers
Using Claude Code: The unreasonable effectiveness of HTML
An article on why the Claude Code team began preferring HTML over Markdown as an LLM output format and its practical advantages; it directly affects workflows for building documents, specs, and dashboards with AI.
When to Vote, When to Rewrite: Disagreement-Guided Strategy Routing for Test-Time Scaling
Disagreement-guided routing boosts LLM accuracy on math and code by 3-7% with adaptive problem solving.
Less Is More: Engineering Challenges of On-Device Small Language Model Integration in a Mobile Application
Five failure modes and eight practical solutions emerged after five days of running on-device SLMs (Gemma 4 E2B, Qwen3 0.6B) with Wordle.
Dynamic Context Evolution for Scalable Synthetic Data Generation
A framework that completely eliminates duplication and repetition in large-scale synthetic data generation with LLMs using three mechanisms (VTS + Semantic Memory + Adaptive Prompt).
I mass deleted 3 months of AI generated code last week. Here is what I learned.
A retrospective by a developer who deleted 3 months' worth of code after over-relying on AI code generation; access to the original post is blocked, so the details could not be verified.
This new technique saves 60% of my token expenses