New CLI Tool Transcribes Podcasts and YouTube Videos into Markdown with AI Diarization
A developer has launched 'podscript,' a command-line tool that converts audio and video content into clean, timestamped Markdown transcripts using ElevenLabs' diarization technology. The innovation arrives as Apple and other platforms ramp up video podcast features, signaling a broader shift toward accessible, machine-processed media content.

Developer Unveils Open-Source Tool to Automate Podcast and YouTube Transcription
A new open-source command-line interface (CLI) tool named podscript is gaining traction among content creators, researchers, and accessibility advocates for its ability to turn a podcast or YouTube video into a clean, structured Markdown transcript with speaker labels and timestamps. Created by developer timf34 and published on GitHub, the tool uses ElevenLabs’ voice diarization API to distinguish between multiple speakers, a step long considered a bottleneck in automated transcription workflows.
After installing the tool with pip install podscript, users can paste a URL to a YouTube video or podcast episode and receive a well-formatted .md file containing speaker names (or labels like "Speaker 1" and "Speaker 2"), timecodes, and clean paragraph breaks. This removes the need for manual note-taking or expensive transcription services, which makes it particularly valuable for journalists, academics, and podcasters seeking to repurpose long-form audio into searchable, shareable text.
The release comes at a pivotal moment in digital media. According to MacRumors, Apple is set to roll out HLS-based video podcast support in iOS 26.4, signaling its intent to compete directly with YouTube and Spotify in the video podcasting space. Meanwhile, Podsqueeze and similar AI tools are enabling creators to auto-generate short-form clips from long-form audio for TikTok and Instagram Reels, highlighting a growing ecosystem where AI transforms passive listening into active content distribution.
What sets podscript apart is its simplicity and focus on structure. Unlike many commercial transcription services that output raw, unedited text, podscript organizes its output into Markdown headers, speaker blocks, and timestamps, a format that works well with note-taking apps like Obsidian, Notion, and Typora. This makes it well suited to researchers compiling interview data and journalists verifying quotes. Because the tool relies on ElevenLabs, known for its high-fidelity voice models, it is expected to separate speakers accurately even in noisy or multi-speaker recordings with overlapping speech.
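To make that structure concrete, here is a minimal Python sketch of how diarized segments could be rendered as a Markdown transcript. It is not taken from podscript's source: the Segment class, the to_markdown function, and the exact output layout are hypothetical and only illustrate the general idea of a header, speaker blocks, and timestamps.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One diarized utterance (hypothetical shape, not podscript's data model)."""
    speaker: str   # e.g. "Speaker 1"
    start: float   # start time in seconds
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS (fractions are dropped)."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def to_markdown(title: str, segments: list[Segment]) -> str:
    """Build a transcript, merging consecutive segments from the same speaker."""
    lines = [f"# {title}", ""]
    previous_speaker = None
    for seg in segments:
        if seg.speaker != previous_speaker:
            if previous_speaker is not None:
                lines.append("")  # blank line between speaker blocks
            # The timestamp marks where this speaker's block begins.
            lines.append(f"**{seg.speaker}** [{format_timestamp(seg.start)}]")
            lines.append("")
            lines.append(seg.text)
        else:
            # Same speaker keeps talking: extend the current paragraph.
            lines[-1] += " " + seg.text
        previous_speaker = seg.speaker
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = [
        Segment("Speaker 1", 0.0, "Hello."),
        Segment("Speaker 1", 4.2, "Welcome back."),
        Segment("Speaker 2", 65.0, "Thanks."),
    ]
    print(to_markdown("Demo Episode", demo))
```

Merging consecutive segments from the same speaker keeps the transcript readable as paragraphs instead of one line per utterance, which is one plausible reading of the "clean paragraph breaks" described above.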
While the tool is currently open-source and free, its underlying reliance on ElevenLabs’ API may raise questions about scalability and cost for enterprise users. However, timf34 has designed the system to allow for local model substitution, encouraging community contributions to reduce dependency on proprietary services. GitHub discussions already show users experimenting with Whisper and Vosk as alternatives, suggesting a potential open-source movement around audio transcription infrastructure.
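One common way to support this kind of substitution is to have the rest of the program depend on a small interface instead of a specific vendor. The Python sketch below illustrates that pattern only; the Transcriber protocol, the CannedTranscriber class, and the run function are hypothetical names, and podscript's actual design may differ.

```python
from typing import Protocol


class Transcriber(Protocol):
    """Hypothetical backend interface; podscript's real abstraction may differ."""

    def transcribe(self, audio_path: str) -> list[dict]:
        """Return utterances as dicts with 'speaker', 'start', and 'text' keys."""
        ...


class CannedTranscriber:
    """Stand-in for a local backend; returns fixed output for illustration only."""

    def transcribe(self, audio_path: str) -> list[dict]:
        return [
            {"speaker": "Speaker 1", "start": 0.0, "text": f"(transcript of {audio_path})"}
        ]


def run(backend: Transcriber, audio_path: str) -> list[dict]:
    """Depends only on the interface, so any conforming backend can be passed in."""
    return backend.transcribe(audio_path)


if __name__ == "__main__":
    print(run(CannedTranscriber(), "episode.mp3"))
```

Under this arrangement, a hosted API client and a local model wrapper would each implement the same transcribe method, and the code that writes the Markdown file would not need to change.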
Industry analysts note that this innovation reflects a broader trend: the democratization of AI-powered media processing. "We’re moving from a world where only studios could afford to transcribe and repurpose content to one where any creator with a laptop can turn hours of audio into structured, searchable data," said Dr. Lena Torres, a digital media researcher at Stanford. "Tools like podscript are the quiet backbone of the next generation of content workflows."
As Apple, Spotify, and YouTube compete for dominance in video podcasting, tools like podscript empower individual creators to own and repurpose their content without platform lock-in. With no registration required, no credit card needed, and full transparency in code, podscript represents not only a technical advance but also a philosophical position: that media should be accessible, editable, and truly owned by its creators.
For developers and content creators alike, the GitHub repository (github.com/timf34/podscript) is now a go-to resource for bridging the gap between audio and text in the age of AI.


