Nanocode: The best Claude Code that $200 can buy in pure JAX on TPUs
TL;DR Highlight
An open-source library that lets you train a 1.3B-parameter coding-agent model from scratch for about $200 (approximately 270,000 KRW) of TPU time, following Anthropic's Constitutional AI approach. It can serve as a hands-on reference for developers who want to understand the entire AI training pipeline first-hand.
Who Should Read
ML engineers who want to directly implement alignment learning techniques such as Constitutional AI or RLHF, or developers who want to understand end-to-end how coding agents like Claude are created internally.
Core Mechanics
- nanocode is a library that demonstrates how anyone can train their own coding agent model from scratch, closely following the Constitutional AI approach used in Anthropic's Claude model training.
- The entire training infrastructure and philosophy are derived from Karpathy's nanochat project; the structure is so similar that anyone familiar with nanochat will feel right at home in nanocode.
- The code is written entirely in JAX and optimized for training on Google TPUs. It does run on NVIDIA GPUs, but keep in mind that the code is TPU-first.
- The 1.3B parameter model, nanocode-d24, can be trained in about 9 hours on a TPU v6e-8, costing approximately $200 (approximately 270,000 KRW).
- The smaller 477M parameter model, nanocode-d20, can be trained in about 1.5 hours for $34 (approximately 45,000 KRW), making it good for quick experimentation.
- You can receive free TPU access for a month through Google's TRC (TPU Research Cloud) program, and new Google Cloud accounts can also receive a $300 credit, allowing you to get started without any cost.
- The training pipeline consists of SOUL.md creation (defining the model's value criteria) → agentic interface definition → synthetic data generation → preference optimisation.
- The author used TPUs for 3 months through the TRC program and was able to maintain the same pod for over a week, as spot instances were rarely interrupted.
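The summary does not spell out the final pipeline stage, preference optimisation. A common choice for that stage (an assumption here, not confirmed for nanocode) is a DPO-style objective, sketched below in JAX with illustrative function and argument names:

```python
import jax

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO-style preference loss over summed token log-probs.

    Illustrative sketch only; nanocode's actual objective may differ.
    """
    # Implicit rewards: how far the policy moved away from the reference model
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # -log sigmoid(margin): pushes the policy toward the preferred completion
    return -jax.nn.log_sigmoid(chosen_reward - rejected_reward)
```

With equal log-probs on both completions the loss is log 2; widening the margin between the chosen and rejected completions drives it toward zero.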
Evidence
- In a demo video, nanocode answered the prompt 'remove falsey values from a list without creating a new list' with a list comprehension, a method that does create a new list, suggesting it didn't fully understand the requirement. Commenters shared correct in-place implementations using reversed range + pop.
- One commenter asked, 'Why spend $200 to train it yourself when there are many free coding models?' This appears to miss that the project's aim is understanding the training principles first-hand rather than producing a model to use; no clear answer was given.
- A sharp observation was made, 'Claude Code is a harness (execution framework) for calling LLMs and executing tools, not something that can be trained itself. Are you using the term incorrectly?' It's important to understand that the project name is an homage to Claude Code, not a claim to train the actual Claude Code.
- A cautionary comment was made that there is another open-source project with a similar name, nanocoder (https://github.com/Nano-Collective/nanocoder), which could cause confusion.
- There was positive feedback that the content was well-written and easy to understand even for those with no ML experience, while some skeptical comments demanded verification, asking 'Does Anthropic actually use this method, and does it actually work?'
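The in-place fix suggested in the comments, iterating indices in reverse and popping falsey entries, can be sketched as:

```python
def remove_falsey_in_place(xs):
    """Remove falsey values from xs without creating a new list."""
    # Walk indices backwards so pops don't shift the items not yet visited
    for i in reversed(range(len(xs))):
        if not xs[i]:
            xs.pop(i)
    return xs
```

For example, `remove_falsey_in_place([0, 1, '', 'a', None, 2])` mutates the original list object into `[1, 'a', 2]`, which is what the prompt actually asked for.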
How to Apply
- If you want to create an alignment-trained model using the Constitutional AI approach, apply to the Google TRC program for free TPU access and start by training nanocode-d20 (477M parameters). You can run the entire pipeline for $34 in about 1.5 hours.
- If you want to create a coding agent that reflects your company or team's unique coding style, rules, and values, refer to the nanocode pipeline to create synthetic data based on your own SOUL.md and train it with preference optimisation.
- If you are new to JAX and TPU-based training infrastructure, reading the nanochat project first will significantly lower the learning curve, as nanocode uses almost the same commands and structure.
- If you want to experiment in an NVIDIA GPU environment, nanocode does run on GPUs, but keep in mind that it's based on TPU-optimized code, so there may be a performance difference. It's best to measure the cost/speed trade-off directly.
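Before measuring that cost/speed trade-off, it helps to confirm which accelerator JAX actually picked up. A minimal check using JAX's standard device API:

```python
import jax

# default_backend() returns 'tpu', 'gpu', or 'cpu' depending on what JAX found
backend = jax.default_backend()
print(f"backend={backend}, devices={len(jax.devices())}")
```

If this prints `cpu` on a GPU machine, the CUDA-enabled JAX wheel is not installed and any benchmark numbers would be meaningless.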
Related Papers
Shai-Hulud Themed Malware Found in the PyTorch Lightning AI Training Library
PyTorch Lightning packages 2.6.2 and 2.6.3 delivered credential-stealing malware via a supply chain attack.
Alignment whack-a-mole: Finetuning activates recall of copyrighted books in LLMs
Fine-tuning even safety-aligned LLMs can bypass safeguards and reproduce copyrighted text verbatim, revealing prompt filtering alone isn't enough to prevent copyright infringement.
Show HN: MacMind – A transformer neural network in HyperCard on a 1989 Macintosh
This is an educational project implementing a single-layer Transformer with 1,216 parameters in the scripting language HyperTalk (1987) and training it on a real Macintosh SE/30. It demonstrates that the core mathematics of modern LLMs works the same on hardware from 30 years ago.
MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU
Introducing MegaTrain, a system that leverages CPU memory as the primary storage and utilizes the GPU solely as a compute engine, enabling full-precision training of 120B parameter models with just a single H200 GPU.
Show HN: I built a tiny LLM to demystify how language models work
This educational project allows you to build a mini LLM with 8.7 million parameters, trained on a Guppy fish character, from scratch in just 5 minutes using a single Colab notebook, focusing on demystifying the black box nature of LLMs.