## Datawhale People
# Interview: “No Onion, Ginger, or Garlic” — Datawhale Member

---
## Introduction
In an era where **AI is reshaping technology and industry** at unprecedented speed, an open-source large-model tutorial project called **Happy-LLM** went viral — gaining **3k+ GitHub stars in just 24 hours**. Developers hailed it as *“the hottest tutorial in the large-model learning circle.”*
Behind this success is **Song Zhixue**, born in 1995, with a non-traditional technical background. He transitioned from **Surveying Engineering** to **AI**, evolving from an open-source beginner to the creator of highly popular projects like **Self-LLM** and **Happy-LLM**. His journey serves as a vivid “from-zero-to-expert” case study.
In this interview, Song — known online as **“No Onion, Ginger, or Garlic”** — shares how his passion for technology and open-source collaboration lowered the barriers to large-model learning, enabling more developers to participate in shaping the future of the intelligent era.
---
## 1. **Origin: A Non-Typical Programmer’s Open-Source Journey**

### Self-Introduction
I’m **Song Zhixue**, currently contributing large-model open-source content at **Datawhale** and serving as a research assistant in the **AGI Lab** at **Westlake University**, under Professor Zhang Chi.

My undergraduate degree was in **Surveying Engineering**, a field far from computer science. My turning point came via **LeetCode**, where solving algorithm problems sparked my interest. Discovering **Datawhale** was like entering an entirely new world.
I once joined a study group for *Dive into Deep Learning* and applied — despite doubts about my qualifications — to be a teaching assistant. That decision was the “first button” on my open-source path: transitioning from learner to contributor.


---
### Joining Datawhale and Early Growth
While taking notes for *Dive into Deep Learning*, community member **Hu Ruifeng** suggested making them open source. His advice — *“Don’t overthink it, just start doing”* — shaped my view of open source. I didn’t even know how to use GitHub then, but in Datawhale’s *exploration-friendly* atmosphere, I naturally stepped into contribution.
We even had a “Genshin Impact” group chat — but discussions revolved around deep learning rather than gaming.
Later, when I formally applied to join Datawhale, my goal was clear: to learn deep learning algorithms and exchange ideas with others. It was a major turning point in my career.

---

### Influence of *Dive into Deep Learning*
This book introduced me to deep learning and impressed me with its **dual teaching approach**:
1. **Implement from scratch** using NumPy/PyTorch to ensure understanding.
2. Then use **high-level APIs** for efficiency.
In *Happy-LLM*, we replicated this idea:
- **Chapter 5**: Build each layer of **LLaMA2** from scratch in PyTorch.
- **Chapter 6**: Use Hugging Face Transformers for rapid implementation.
This “know why, then know how” approach is ideal for cross-disciplinary learners.
🔗 [Dive into Deep Learning](https://zh.d2l.ai/index.html)
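
To make the contrast concrete, here is a from-scratch sketch of one LLaMA2 building block, RMSNorm, in PyTorch. This is my illustration of the Chapter 5 style rather than the book's or Happy-LLM's actual code; the Chapter 6 style would instead load the equivalent layers ready-made from Hugging Face Transformers.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Root-mean-square layer norm, one of the LLaMA2 building blocks.
    A from-scratch sketch for illustration, not Happy-LLM's actual code."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))  # learnable per-channel gain

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # scale each vector by the reciprocal RMS over its last dimension
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

x = torch.randn(2, 8, 512)        # (batch, seq_len, hidden_dim)
print(RMSNorm(512)(x).shape)      # torch.Size([2, 8, 512])
```

Hand-rolling a layer like this once makes the framework version far easier to trust and debug later.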


---
## 2. **Breakthrough: From Personal Notes to Community Sensation**

### Birth of Self-LLM
**Self-LLM**, with 25,000+ stars today, began modestly. Initially, I was writing **introductory LLM tutorials** for **InternLM** at the Shanghai AI Lab. With spare compute resources, I explored other models, documenting training scripts and demos for **ChatGLM**, **Qwen**, and others.
Requests for these scripts showed a **shared need**: beginner-friendly, practical LLM guides were scarce. The project became a “time capsule,” tracking LLM evolution — new models listed first, older ones later.

---

### Why Self-LLM Works
We didn’t overthink positioning — we just solved problems we ourselves had. Co-lead **Zou Yuheng** and I uploaded our scripts to GitHub; within days, Self-LLM had 1,000+ stars.
By **making every step of training explicit** — data prep, model loading, fine-tuning — learners understood both *how* and *why* (a minimal sketch follows below). Graduate students and engineers reported that they could then move to other frameworks with real confidence.
Over half of contributors are university students — their practical questions (“Our lab has GPUs — can we fine-tune this model?”) keep tutorials relevant.
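
For concreteness, here is a minimal sketch of that explicit flow with Hugging Face Transformers. The model name and the toy two-sentence dataset are placeholders chosen for illustration, not the actual Self-LLM recipes.

```python
# Minimal sketch of the explicit "data prep -> model load -> fine-tune" flow.
# Model name and toy data are illustrative placeholders, not Self-LLM's recipes.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "Qwen/Qwen2.5-0.5B"  # any small causal LM works here

# 1. Data prep: tokenize a toy corpus for causal-LM training
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:   # some tokenizers ship without a pad token
    tokenizer.pad_token = tokenizer.eos_token
raw = Dataset.from_dict({"text": [
    "Hello, large language models!",
    "Open source lowers the barrier to entry.",
]})
tokenized = raw.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
    remove_columns=["text"],
)

# 2. Model loading
model = AutoModelForCausalLM.from_pretrained(model_name)

# 3. Fine-tuning: every knob is visible instead of hidden behind a wrapper
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetune-out",
                           per_device_train_batch_size=2,
                           num_train_epochs=1,
                           learning_rate=2e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Because each stage is spelled out, a learner can swap in their own dataset or model and see exactly which step changes.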


---
## 3. **From Self-LLM to Happy-LLM — Growth by Listening**
Feedback revealed a gap: learners wanted **a panoramic, structured path** covering both LLM architecture and the full development process.
In mid-2023, discussions with **Zou Yuheng** and **Xie Wenrui** led to creating **Happy-LLM** — a full-chain learning resource from beginner to expert.
**Impact:**
- **3,000+ stars on day one**
- **10,000+ stars in under a month**
- **20,000+ stars today**


---
### Project Roadmap
**Self-LLM**: Community-driven updates for new models (e.g., MiniMax M2).
**Happy-LLM**: Three-stage evolution
1. **Printed edition** via People’s Posts and Telecommunications Press.
2. **Multi-language versions** to share China’s open-source education globally.
3. **Extended Chapters**: Community-submitted practice notes & innovations.
Also launching **Hello-Agents**:
- **Software engineering-oriented Agents** (*Dify*, *Coze*) for deployment.
- **AI-native Agent frameworks** (*Camel-AI*, *MetaGPT*) for capability evolution.

---

## 4. **Operating an Open-Source Community — Methodology**

### Lowering Barriers
It’s a myth that you need advanced skills to contribute: I began without even knowing how to open a PR. Anyone can join — learn as you go.
**Two principles:**
1. Lower participation threshold — share notes, not just code.
2. Maintain genuine passion — contribute for reasons beyond résumés or internships.
Open source thrives on persistence under constraints — at launch, we had no GPUs, just community resource sharing.

---

### Advice to Newcomers
- **Don’t wait until you feel ready.**
- Start small — file issues, send PRs.
- Follow contribution guidelines to lower friction.
- Focus on topics you love — passion sustains projects.
---
## 5. **Open Source in the AI Era — Opportunities & Challenges**

### Symbiosis of AI and Open Source
AI accelerates open-source innovation; open source fuels AI’s growth. This is a **paradigm shift** in collaboration and iteration.
Examples:
- **DeepSeek** and **Kimi** model releases let the community study cutting-edge designs.
- Open sharing helps newcomers stand on giants’ shoulders — but demands constant innovation to avoid being surpassed.


---
### Rising and Falling Barriers
- **Lower barriers** in tool usage and app development — with AI assistance (the “vibe coding” style), even non-programmers can build working projects.
- **Higher barriers** in resource-intensive R&D — e.g., improving **RoPE** (rotary position embeddings) or building 3D generative models (see the sketch below).
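
As a sense of scale for that second point, the baseline RoPE itself fits in a dozen lines; the hard, resource-intensive part is validating changes to it on real long-context training runs. Below is a minimal sketch of the baseline (my illustration, not any specific paper's code).

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotary position embeddings (baseline RoPE) for x of shape
    (seq_len, dim), with dim even. Illustrative sketch only."""
    seq_len, dim = x.shape
    # one rotation frequency per channel pair; "RoPE improvements" typically
    # change this frequency schedule to stretch the usable context length
    freqs = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs[None, :]
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]        # interleaved channel pairs
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin     # rotate each pair by its angle
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

q = torch.randn(16, 64)
print(rope(q).shape)  # torch.Size([16, 64])
```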


---
## 6. **Future Programmer Skills in the AI-Assisted Era**
Critical competencies:
- **System architecture & engineering management**
- **Requirements definition & problem discovery**
- **Holistic understanding of complex systems**
AI will not replace programmers, but those without systems thinking risk obsolescence. Value will lie in guiding AI on *what* to build, not just in writing the code.


---
## 7. **Current Research at Westlake University AGI Lab**
Focus:
- Properties of **multimodal models**
- **Agents** built on multimodal models
- An upcoming deep dive into **diffusion LLMs**
Habit: Study new models’ code and architecture immediately upon release.
Reflection: Technical value ≠ product value. Simple tools solving user pain points can surpass complex projects in impact. Adopting a **product-oriented mindset** helps craft relevant solutions.

---
## 8. **Message to Developers**
**Make technology accessible, so everyone can find their place in open source.**
Open source is about **democratizing knowledge** — lowering barriers and welcoming first steps.
**Project Links:**
- [Self-LLM](https://gitcode.com/datawhalechina/self-llm)
- [Happy-LLM](https://gitcode.com/GitHub_Trending/ha/happy-llm)
---
## Closing Thoughts
Song Zhixue’s journey — from research to product thinking, from personal growth to community building — shows that technical mastery is *also* an evolution of mindset. In fast-changing tech, curiosity, adaptability, and a user’s perspective are a developer’s greatest assets.