Highlights from My Appearance on CL Kao and Dori Wilson’s Data Renegades Podcast
26th November 2025 — Podcast Recap & AI-Driven Insights
I joined CL Kao and Dori Wilson for an episode of the Data Renegades podcast titled Data Journalism Unleashed with Simon Willison.
To analyse the conversation, I ran the transcript through Claude Opus 4.5, which generated an excellent list of topics, timestamps, and quotes. I kept most of its output intact, adding only light edits plus extra context and resources.
---
Key Podcast Topics & Quotes
Data Journalism: The Most Engaging Use of Data Analytics [02:03]
> “There’s this whole field of data journalism… I think it is the single most interesting way to apply this stuff because everything is in scope for a journalist.”
---
Django’s Newspaper Beginnings [02:31]
> “We thought we were building a content management system…”
> Django 20th Birthday Post | Adrian Holovaty
---
The “Downloads Page” — Dynamic Radio Player for Local Bands [03:24]
> “Adrian built a site feature… create a small radio player of MP3s from bands performing that week.”
---
The Guardian: Building Tools for Data Reporting [04:44]
> “If you give your audience a searchable database… it’s a great way to build credibility.”
---
Washington Post’s Opioid Crisis Project [05:22]
> “They shared the opioid files... so local newspapers could dive into the data themselves.”
---
NICAR & Collaboration in Data Journalism [07:00]
> “It’s all about getting the most value from this technology for the industry as a whole.”
---
Nonprofit Newsrooms: ProPublica, Baltimore Banner [09:02]
> “The Baltimore Banner… close to breaking even on subscriptions.”
> Local News Resurgence Article
---
Datasette Origins & Plugin Ecosystem [10:31–12:36]
- Shower Revelation: SQLite on serverless hosting.
- Vision: Solve “data publishing” akin to WordPress solving blogging.
- Unexpected Uses: Copenhagen electricity grid, Brooklyn Cemetery records.
---
Bellingcat & Russian FSB Food Delivery Leak [14:40]
> “Revealed which nights the FSB worked late… names and phone numbers of agents.”
---
Open Source Feedback Issues & Office Hours [16:14–16:49]
- Frustration: little feedback from users.
- Solution: Calendly office hours.
---
Data Cleaning Complaints [17:34]
> “95% of my time cleaning data… I hate it.”
---
Version Control in Data Teams [17:43]
> “Python scripts scattered on laptops… no Git.”
---
The Carpentries: Teaching Scientists Git & Fundamentals [18:12]
---
Documentation as API Contracts [21:11]
> “Data warehouse view should be documented as an API interface.”
---
View Source for Business Reports [23:21]
> “I need to see the query—many skeletons in the closet.”
---
Fact-Checking Data Reporting [24:16]
> “Separate data reporter must audit the numbers.”
---
Queries as First-Class Citizens [27:16]
> “Library of queries with author, creation date, change history, comments.”
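One way to picture a query library like this is as a small record type. The sketch below is illustrative only: the names `SavedQuery`, `QueryRevision`, and `revise` are hypothetical, not taken from the episode or from any particular tool. It assumes each saved query carries its author, creation date, comments, and a list of earlier versions.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class QueryRevision:
    """One entry in a query's change history (hypothetical structure)."""
    changed_on: date
    changed_by: str
    previous_sql: str  # the SQL text that this change replaced
    note: str = ""


@dataclass
class SavedQuery:
    """A query treated as a first-class object with its own metadata."""
    name: str
    author: str
    created: date
    sql: str
    comments: list[str] = field(default_factory=list)
    history: list[QueryRevision] = field(default_factory=list)

    def revise(self, new_sql: str, changed_by: str, changed_on: date, note: str = "") -> None:
        """Swap in new SQL and record what it replaced, who changed it, and when."""
        self.history.append(QueryRevision(changed_on, changed_by, self.sql, note))
        self.sql = new_sql


# Example usage with made-up values.
query = SavedQuery(
    name="weekly_signups",
    author="alice",
    created=date(2025, 10, 1),
    sql="select count(*) from signups",
)
query.comments.append("Used in the Monday report.")
query.revise(
    "select count(*) from signups where is_test = 0",
    changed_by="bob",
    changed_on=date(2025, 10, 31),
    note="Exclude test accounts",
)
```

Keeping the replaced SQL in `history` means a reader can always see what a number was based on before the latest change.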
---
Temporal Documentation [29:46]
> “‘This worked as of Friday, Oct 31’—timestamp very prominent.”
---
Starting an Internal Blog Without Permission [30:24]
> “Gives you credibility quickly because nobody else is doing it.”
---
Search Engine Across Multiple Documentation Systems [31:35]
> “Build the search engine—secretly control the company.”
---
TIL Blog Approach [33:05]
> TIL Site
---
Coding Agents like Claude Code [34:53]
> “Can do anything you can do by typing Unix shell commands.”
---
Skills for Claude [36:16]
> “Markdown file for census data… visualization instructions.”
---
Terminal vs Modern Interfaces [38:22]
> “Advanced AI using 1980s-style terminals—quirky moment in 2025.”
---
Future of BI Tools: Prompt-Driven Dashboards [39:54]
> “Paste JSON, get a dashboard—LLMs decide chart types.”
---
Exciting LLM Applications [43:06]
- Text-to-SQL
- Data extraction
- Data enrichment
---
Multimodal LLMs: Images, Audio, Video [45:42]
> “Captioning 70,000 photos could cost ~$13.”
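To put that figure in per-photo terms, the arithmetic can be sketched as below. It only divides the two numbers quoted above; the model and its pricing are not stated here, so nothing about token counts or rates is assumed.

```python
total_cost_usd = 13.0   # approximate total quoted in the episode
photo_count = 70_000    # number of photos quoted in the episode

cost_per_photo = total_cost_usd / photo_count
photos_per_dollar = photo_count / total_cost_usd

print(f"about ${cost_per_photo:.5f} per photo")
print(f"about {photos_per_dollar:,.0f} photos per dollar")
```

That works out to roughly two hundredths of a cent per caption.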
---
First Programming Languages & Favourite Dataset
- Languages: Commodore 64 BASIC, followed by misadventures with C++.
- Dataset: SF Tree List CSV.
---
Showrunning as Management Model [50:07]
> “Vision is everything—helps in decision delegation.”
---
Code in Version Control is Non-negotiable [52:21]
---
Automating Hacker News Scraping [52:45]
---
Dream Project: Whale Detection Camera [53:47]
---
Favourite Podcast: Mark Steel’s in Town [54:23]
---
Favourite Fiction Genre: Bureaucratic British Wizards [55:06]
> Rivers of London | The Rook
---
Colophon & Workflow
- Transcript contained `` markers.
- A custom Claude prompt pulled out only the most engaging quotes.
- A second prompt created the timestamped bullet list and suggested supporting links.
Full Claude transcript available here: Link.