Highlights from My Appearance on CL Kao and Dori Wilson’s Data Renegades Podcast

26th November 2025 — Podcast Recap & AI-Driven Insights

I joined CL Kao and Dori Wilson for an episode of the Data Renegades podcast titled Data Journalism Unleashed with Simon Willison.

To analyse the conversation, I ran the transcript through Claude Opus 4.5, which generated an excellent list of topics, timestamps, and quotes. I kept most of its output intact—adding only light edits and extra context/resources.

---

Key Podcast Topics & Quotes

Data Journalism: The Most Engaging Use of Data Analytics [02:03]

> “There’s this whole field of data journalism… I think it is the single most interesting way to apply this stuff because everything is in scope for a journalist.”

---

Django’s Newspaper Beginnings [02:31]

> “We thought we were building a content management system…”

> Django 20th Birthday Post | Adrian Holovaty

---

The “Downloads Page” — Dynamic Radio Player for Local Bands [03:24]

> “Adrian built a site feature… create a small radio player of MP3s from bands performing that week.”

> Archived Downloads Page

---

The Guardian: Building Tools for Data Reporting [04:44]

> “If you give your audience a searchable database… it’s a great way to build credibility.”

---

Washington Post’s Opioid Crisis Project [05:22]

> “They shared the opioid files... so local newspapers could dive into the data themselves.”

---

NICAR & Collaboration in Data Journalism [07:00]

> “It’s all about getting the most value from this technology for the industry as a whole.”

> NICAR 2026 Conference

---

Nonprofit Newsrooms: ProPublica, Baltimore Banner [09:02]

> “The Baltimore Banner… close to breaking even on subscriptions.”

> Local News Resurgence Article

---

Datasette Origins & Plugin Ecosystem [10:31–12:36]

  • Shower Revelation: SQLite on serverless hosting.
  • Vision: Solve “data publishing” akin to WordPress solving blogging.
  • Unexpected Uses: Copenhagen electricity grid, Brooklyn Cemetery records.

---

Bellingcat & Russian FSB Food Delivery Leak [14:40]

> “Revealed which nights the FSB worked late… names and phone numbers of agents.”

> Bellingcat Article

---

Open Source Feedback Issues & Office Hours [16:14–16:49]

---

Data Cleaning Complaints [17:34]

> “95% of my time cleaning data… I hate it.”

---

Version Control in Data Teams [17:43]

> “Python scripts scattered on laptops… no Git.”

---

The Carpentries: Teaching Scientists Git & Fundamentals [18:12]

> The Carpentries

---

Documentation as API Contracts [21:11]

> “Data warehouse view should be documented as an API interface.”

---

View Source for Business Reports [23:21]

> “I need to see the query—many skeletons in the closet.”

---

Fact-Checking Data Reporting [24:16]

> “Separate data reporter must audit the numbers.”

---

Queries as First-Class Citizens [27:16]

> “Library of queries with author, creation date, change history, comments.”
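As a rough illustration of the quote above, here is a minimal sketch of a query library that records author, creation date, change history, and comments. The table names, column names, and helper functions are hypothetical; nothing here comes from the podcast or from any existing tool. It uses only Python's built-in `sqlite3` module.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per named query, one row per saved version.
SCHEMA = """
CREATE TABLE IF NOT EXISTS queries (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    author TEXT NOT NULL,
    created_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS query_versions (
    id INTEGER PRIMARY KEY,
    query_id INTEGER NOT NULL REFERENCES queries(id),
    sql TEXT NOT NULL,
    changed_by TEXT NOT NULL,
    changed_at TEXT NOT NULL,
    comment TEXT
);
"""


def _now():
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


def open_library(path=":memory:"):
    """Open (or create) the query library database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_query(conn, name, sql, author, comment=""):
    """Create the query if it is new, then append a version to its history."""
    row = conn.execute(
        "SELECT id FROM queries WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        cur = conn.execute(
            "INSERT INTO queries (name, author, created_at) VALUES (?, ?, ?)",
            (name, author, _now()),
        )
        query_id = cur.lastrowid
    else:
        query_id = row[0]
    conn.execute(
        "INSERT INTO query_versions "
        "(query_id, sql, changed_by, changed_at, comment) "
        "VALUES (?, ?, ?, ?, ?)",
        (query_id, sql, author, _now(), comment),
    )
    conn.commit()
    return query_id


def history(conn, name):
    """Return (changed_by, comment, sql) for every version, oldest first."""
    return conn.execute(
        "SELECT v.changed_by, v.comment, v.sql "
        "FROM query_versions v JOIN queries q ON q.id = v.query_id "
        "WHERE q.name = ? ORDER BY v.id",
        (name,),
    ).fetchall()
```

Calling `save_query` twice with the same name keeps the original author and creation date on the query while adding a second entry to its change history, which `history` then returns in order.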

---

Temporal Documentation [29:46]

> “‘This worked as of Friday, Oct 31’—timestamp very prominent.”

---

Starting an Internal Blog Without Permission [30:24]

> “Gives you credibility quickly because nobody else is doing it.”

---

Search Engine Across Multiple Documentation Systems [31:35]

> “Build the search engine—secretly control the company.”

---

TIL Blog Approach [33:05]

> TIL Site

---

Coding Agents like Claude Code [34:53]

> “Can do anything you can do by typing Unix shell commands.”

---

Skills for Claude [36:16]

> “Markdown file for census data… visualization instructions.”

> Claude Skills Article

---

Terminal vs Modern Interfaces [38:22]

> “Advanced AI using 1980s-style terminals—quirky moment in 2025.”

---

Future of BI Tools: Prompt-Driven Dashboards [39:54]

> “Paste JSON, get a dashboard—LLMs decide chart types.”

---

Exciting LLM Applications [43:06]

  • Text-to-SQL
  • Data extraction
  • Data enrichment
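To make the text-to-SQL item a little more concrete, here is a small sketch of the non-model half of that workflow: reading a SQLite schema, building a prompt from it, and roughly checking that the returned SQL is read-only. The function names and prompt wording are assumptions for illustration, and the call to an actual LLM is deliberately left out.

```python
import sqlite3


def schema_text(conn):
    """Collect the CREATE TABLE statements SQLite stores for each table."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n\n".join(row[0] for row in rows)


def build_prompt(conn, question):
    """Build a text-to-SQL prompt (hypothetical wording) for any LLM."""
    return (
        "Given this SQLite schema:\n\n"
        f"{schema_text(conn)}\n\n"
        "Write a single read-only SELECT query that answers:\n"
        f"{question}\n"
    )


def is_read_only(sql):
    """Very rough guard: one statement, starting with SELECT or WITH.

    This is not a security boundary; a read-only database connection
    is a safer way to run model-generated SQL.
    """
    statement = sql.strip().rstrip(";")
    starts_ok = statement.lower().startswith(("select", "with"))
    return starts_ok and ";" not in statement
```

The prompt returned by `build_prompt` would be sent to whichever model you use, and `is_read_only` applied to its answer before the query is run.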

---

Multimodal LLMs: Images, Audio, Video [45:42]

> “Captioning 70,000 photos could cost ~$13.”

> Cost Correction

---

First Programming Languages & Favourite Dataset

---

Showrunning as Management Model [50:07]

> “Vision is everything—helps in decision delegation.”

> Showrunning Laws PDF

---

Code in Version Control is Non-negotiable [52:21]

---

Automating Hacker News Scraping [52:45]

> Scraping Article

> shot-scraper

---

Dream Project: Whale Detection Camera [53:47]

---

Favourite Podcast: Mark Steel’s in Town [54:23]

> BBC Episodes

---

Favourite Fiction Genre: Bureaucratic British Wizards [55:06]

> The Laundry Files

> Rivers of London

> The Rook

---

Colophon & Workflow

  • Transcript contained `` markers.
  • A custom Claude prompt pulled only the most engaging quotes.
  • A second prompt created a timestamped bullet list and suggested supporting links.

Full Claude transcript available here: Link.

---
