nanochat
nanochat — Full-Stack LLM Implementation by Andrej Karpathy
nanochat is a fascinating new project from Andrej Karpathy, discussed in detail in an accompanying forum post.
It delivers a complete ChatGPT-style LLM stack, including training, inference, and a web-based UI, all in a single, minimal, hackable, dependency-light codebase.
> "This repo is a full-stack implementation of an LLM like ChatGPT in a single, clean, minimal, hackable, dependency-lite codebase."
---
Key Features
- Compact codebase (~8,000 lines)
- Mostly Python, built on PyTorch
- Rust component: Tokenizer training tool
- Affordable training — as low as $100
- Supports full lifecycle: data prep → training → midtraining → fine‑tuning → deployment
---
Training Setup & Cost
Karpathy suggests:
- Hardware: Rent an 8×H100 NVIDIA GPU node
- Cost: ~$24/hour
- Training durations:
  - 4 hours (~$100) → produces a conversational model
  - 12 hours → slightly surpasses GPT‑2
- Longer runs could yield stronger results.
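The cost figures above follow directly from the hourly rate. As a quick sanity check, here is a minimal sketch of the arithmetic; the function name `run_cost` is ours, and the $24/hour rate is the approximate figure quoted above, so real prices will vary by provider.

```python
# Approximate rental price for an 8xH100 node, as quoted in the text.
HOURLY_RATE_USD = 24.0


def run_cost(hours: float, hourly_rate: float = HOURLY_RATE_USD) -> float:
    """Return the rental cost in USD for a training run of the given length."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * hourly_rate


if __name__ == "__main__":
    for hours in (4, 12):
        print(f"{hours:>2} h on the node: about ${run_cost(hours):.0f}")
```

At this rate a 4-hour run comes to $96, which is where the "~$100" figure comes from, and a 12-hour run comes to $288.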
---
Dataset & Training Phases
Stage 1 — Initial Training
- Dataset: ~24GB from karpathy/fineweb-edu-100b-shuffle (based on FineWeb-Edu)
Stage 2 — Midtraining
Script: mid_train.py
Data (568K examples):
- SmolTalk — 460K
- MMLU auxiliary train — 100K
- GSM8K — 8K
Stage 3 — Supervised Fine-Tuning (SFT)
Script: chat_sft.py
Data (21.4K examples):
- ARC-Easy — 2.3K
- ARC-Challenge — 1.1K
- GSM8K — 8K
- SmolTalk — 10K
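To see how the midtraining and SFT mixtures are weighted, the sketch below totals the example counts listed above and computes each dataset's share. The counts are the rounded figures from this article, not exact dataset sizes, and the dictionary and function names are illustrative only.

```python
# Rounded example counts as listed in the article (not exact dataset sizes).
MIDTRAINING_MIX = {
    "SmolTalk": 460_000,
    "MMLU auxiliary train": 100_000,
    "GSM8K": 8_000,
}
SFT_MIX = {
    "ARC-Easy": 2_300,
    "ARC-Challenge": 1_100,
    "GSM8K": 8_000,
    "SmolTalk": 10_000,
}


def mixture_shares(mix: dict[str, int]) -> dict[str, float]:
    """Return each dataset's fraction of the total example count."""
    total = sum(mix.values())
    return {name: count / total for name, count in mix.items()}


if __name__ == "__main__":
    for label, mix in (("Midtraining", MIDTRAINING_MIX), ("SFT", SFT_MIX)):
        print(f"{label}: {sum(mix.values()):,} examples")
        for name, share in mixture_shares(mix).items():
            print(f"  {name}: {share:.1%}")
```

The totals match the 568K and 21.4K figures above. Midtraining is dominated by SmolTalk conversations, while the much smaller SFT set gives GSM8K a far larger relative weight.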
---
Deployment
You can serve trained models using:
- Backend: chat_web.py
- Frontend: ui.html — a minimal vanilla JavaScript + HTML interface.
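To illustrate the backend/frontend split described above, here is a minimal sketch using only the Python standard library. It is not nanochat's `chat_web.py`: the `/chat` route, the JSON shape, the inline `UI_HTML` placeholder, and the `generate_reply` stub are all assumptions made for illustration, and a real backend would run model inference where the stub echoes the prompt.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Placeholder for the page that a file like ui.html would provide.
UI_HTML = b"<!doctype html><title>chat</title><p>placeholder UI</p>"


def generate_reply(prompt: str) -> str:
    """Hypothetical stub; a real backend would call the trained model here."""
    return f"(echo) {prompt}"


class ChatHandler(BaseHTTPRequestHandler):
    def _send(self, status: int, content_type: str, body: bytes) -> None:
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Serve the static frontend at the root path.
        if self.path == "/":
            self._send(200, "text/html; charset=utf-8", UI_HTML)
        else:
            self._send(404, "text/plain; charset=utf-8", b"not found")

    def do_POST(self):
        # Accept {"prompt": "..."} and answer with {"reply": "..."}.
        if self.path != "/chat":
            self._send(404, "text/plain; charset=utf-8", b"not found")
            return
        try:
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or b"{}")
            prompt = str(payload["prompt"])
        except (ValueError, KeyError, TypeError):
            self._send(400, "text/plain; charset=utf-8",
                       b"expected JSON with a 'prompt' field")
            return
        body = json.dumps({"reply": generate_reply(prompt)}).encode("utf-8")
        self._send(200, "application/json", body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_server(host: str = "127.0.0.1", port: int = 0) -> ThreadingHTTPServer:
    """Start the server in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer((host, port), ChatHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    server = start_server(port=8000)
    print(f"Serving on http://127.0.0.1:{server.server_port}/")
    try:
        threading.Event().wait()  # block until Ctrl+C
    except KeyboardInterrupt:
        server.shutdown()
```

The point of the sketch is the shape of the deployment: one process serves a static page and exposes a single endpoint that the page's JavaScript can call.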
---
AiToEarn — Monetizing AI Models & Content
For developers and creators experimenting with nanochat or other AI-driven projects, AiToEarn offers:
- Open-source global AI content monetization platform
- Cross-platform publishing to:
  - Douyin, Kwai, WeChat, Bilibili, Rednote (Xiaohongshu)
  - Facebook, Instagram, LinkedIn, Threads
  - YouTube, Pinterest, X (Twitter)
- Core features:
  - AI content generation tools
  - Automated multi-channel distribution
  - Analytics + insights
  - AI model ranking
📄 Documentation: docs.aitoearn.ai
---
In summary: nanochat is an accessible way to build your own conversational LLM from scratch, offering a clean codebase and low training costs. Combined with platforms like AiToEarn, creators can not only experiment with their own models but also efficiently publish, analyze, and monetize AI-generated content across the globe.