nanochat — Full-Stack LLM Implementation by Andrej Karpathy

nanochat is a fascinating new project from Andrej Karpathy, discussed in detail in this forum post.

It delivers a complete ChatGPT-style LLM stack, including training, inference, and a web-based UI, all in a single, minimal, hackable, dependency-light codebase.

> "This repo is a full-stack implementation of an LLM like ChatGPT in a single, clean, minimal, hackable, dependency-lite codebase."

---

Key Features

  • Compact codebase (~8,000 lines)
  • Primary language: Python + PyTorch
  • Rust component: Tokenizer training tool
  • Affordable training — as low as $100
  • Supports full lifecycle: data prep → training → midtraining → fine‑tuning → deployment

---

Training Setup & Cost

Karpathy suggests:

  • Hardware: rent an 8×H100 NVIDIA GPU node
  • Cost: ~$24/hour
  • Training durations:
      • ~4 hours (~$100) → produces a basic conversational model
      • ~12 hours → slightly surpasses GPT‑2
      • Longer runs yield stronger results.
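The cost figures above follow from simple arithmetic on the quoted $24/hour rate. A minimal sketch (the rate and durations come from the article; the helper function is just illustration):

```python
# Rough training-cost estimates for an 8xH100 node at ~$24/hour,
# the rental rate quoted above.
HOURLY_RATE_USD = 24  # approximate cost of an 8xH100 node per hour

def training_cost(hours: float, rate: float = HOURLY_RATE_USD) -> float:
    """Estimated cost in USD for a training run of the given length."""
    return hours * rate

print(training_cost(4))   # ~$96 -- the "train a ChatGPT clone for $100" run
print(training_cost(12))  # ~$288 -- the longer, GPT-2-class run
```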

---

Dataset & Training Phases

Stage 1 — Initial Training

Stage 2 — Midtraining

Script: mid_train.py

Data (568K examples):

Stage 3 — Supervised Fine-Tuning (SFT)

Script: chat_sft.py

Data (21.4K examples):

---

Deployment

You can serve trained models using the repo's built-in inference code and the ChatGPT-style web UI mentioned above:

---

AiToEarn — Monetizing AI Models & Content

For developers and creators experimenting with nanochat or other AI-driven projects, AiToEarn offers:

  • Open-source global AI content monetization platform
  • Cross-platform publishing to:
      • Douyin, Kwai, WeChat, Bilibili, Rednote (Xiaohongshu)
      • Facebook, Instagram, LinkedIn, Threads
      • YouTube, Pinterest, X (Twitter)
  • Core features:
      • AI content generation tools
      • Automated multi-channel distribution
      • Analytics + insights
      • AI model ranking

📄 Documentation: docs.aitoearn.ai

---

In summary: nanochat is an accessible way to build your own conversational LLM from scratch, with a clean codebase and low training costs. Combined with platforms like AiToEarn, creators can not only experiment with their own models but also publish, analyze, and monetize AI-generated content across platforms.

