nanochat

nanochat — Full-Stack LLM Implementation by Andrej Karpathy

nanochat (via) is a fascinating new project from Andrej Karpathy, discussed in detail in this forum post.

It delivers a complete ChatGPT-style LLM stack, including training, inference, and a web-based UI, all in a single, minimal, hackable, dependency-light codebase.

> "This repo is a full-stack implementation of an LLM like ChatGPT in a single, clean, minimal, hackable, dependency-lite codebase."

---

Key Features

  • Compact codebase (~8,000 lines)
  • Primary language: Python + PyTorch
  • Rust component: Tokenizer training tool
  • Affordable training — as low as $100
  • Supports full lifecycle: data prep → training → midtraining → fine‑tuning → deployment

---

Training Setup & Cost

Karpathy suggests:

  • Hardware: Rent an 8×H100 NVIDIA GPU node
  • Cost: ~$24/hour
  • Training durations:
  • 4 hours (~$100) → produces a conversational model (example)
  • 12 hours → slightly surpasses GPT‑2
  • Longer runs could yield stronger results.

---

Dataset & Training Phases

Stage 1 — Initial Training

Stage 2 — Midtraining

Script: mid_train.py

Data (568K examples):

Stage 3 — Supervised Fine-Tuning (SFT)

Script: chat_sft.py

Data (21.4K examples):

---

Deployment

You can serve trained models using:

---

AiToEarn — Monetizing AI Models & Content

For developers and creators experimenting with nanochat or other AI-driven projects, AiToEarn offers:

  • Open-source global AI content monetization platform
  • Cross-platform publishing to:
  • Douyin, Kwai, WeChat, Bilibili, Rednote (Xiaohongshu)
  • Facebook, Instagram, LinkedIn, Threads
  • YouTube, Pinterest, X (Twitter)
  • Core features:
  • AI content generation tools
  • Automated multi-channel distribution
  • Analytics + insights
  • AI model ranking (view rankings)

📄 Documentation: docs.aitoearn.ai

---

In summary: nanochat is an accessible way to build your own conversational LLM from scratch, offering a clean codebase and low training costs. Combined with platforms like AiToEarn, creators can not only experiment with their own models but also efficiently publish, analyze, and monetize AI-generated content across the globe.

---

Do you want me to also rewrite this into a step‑by‑step “Getting Started” guide so developers can follow from training to deployment? That would make it even more practical.

Read more