YouTube Transcript
Download a YouTube video's transcript and automatically extract frames at the moments where the speaker refers to something on screen.
Metadata
| Field | Value |
|---|---|
| Type | command |
| Invoked by | /youtube-transcript |
| Dependencies | yt-dlp, ffmpeg |
Usage
/youtube-transcript https://www.youtube.com/watch?v=VIDEO_ID
What It Does
- Downloads transcript - Auto-generated or manual captions
- Detects visual references - Phrases like “as you can see”, “look at this”
- Extracts frames - Screenshots at key moments
- Presents combined output - Transcript with embedded images
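The download and frame-extraction steps above can be sketched as two command builders. This is a minimal illustration, not the skill's actual implementation: the helper names `caption_command` and `frame_command` are hypothetical, and how the skill really invokes `yt-dlp` and `ffmpeg` is not documented here. The flags themselves are standard options of those tools.

```python
def caption_command(url: str, lang: str = "en") -> list[str]:
    """Build a yt-dlp call that fetches captions without downloading the video."""
    return [
        "yt-dlp",
        "--skip-download",     # captions only, no media file
        "--write-subs",        # manual captions, if the video has them
        "--write-auto-subs",   # auto-generated captions as well
        "--sub-langs", lang,   # assumed default language: English
        url,
    ]


def frame_command(video_path: str, seconds: int, out_path: str) -> list[str]:
    """Build an ffmpeg call that saves a single frame at the given offset."""
    return [
        "ffmpeg",
        "-ss", str(seconds),   # seek to the timestamp before reading input
        "-i", video_path,
        "-frames:v", "1",      # write exactly one video frame
        out_path,
    ]
```

Either list can be passed to `subprocess.run(cmd, check=True)`. Keeping the builders separate from the call makes the commands easy to inspect before anything is executed.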
Example Output
[00:00] Introduction to the topic...
[01:23] Now let me show you the architecture diagram
[Frame extracted: architecture-01-23.png]
[02:45] As you can see here, the data flows from...
[Frame extracted: diagram-02-45.png]
Visual Reference Detection
The skill detects phrases in English and German:
English:
- “as you can see”
- “look at this”
- “here’s the diagram”
- “on this slide”
- “let me show you”
German:
- “wie Sie sehen”
- “schauen Sie hier”
- “auf dieser Folie”
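As a rough illustration of how phrase detection like this could work, the sketch below scans `[MM:SS]`-prefixed transcript lines for the phrases listed above. It assumes simple case-insensitive substring matching and the timestamp format shown in the example output; the names `VISUAL_PHRASES` and `find_visual_references` are hypothetical, and the skill's real matching logic may differ.

```python
import re

# Phrases from the lists above, lowercased for case-insensitive matching.
VISUAL_PHRASES = [
    "as you can see", "look at this", "here's the diagram",
    "on this slide", "let me show you",
    "wie sie sehen", "schauen sie hier", "auf dieser folie",
]

# Matches lines such as "[01:23] Now let me show you ..."
LINE_RE = re.compile(r"^\[(\d{2}):(\d{2})\]\s*(.*)$")


def _normalize(text: str) -> str:
    """Casefold and map curly apostrophes to straight ones."""
    return text.replace("\u2019", "'").casefold()


def find_visual_references(transcript: str) -> list[tuple[int, str]]:
    """Return (offset_in_seconds, text) for each line containing a phrase."""
    hits = []
    for line in transcript.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines without a [MM:SS] timestamp
        minutes, seconds, text = match.groups()
        normalized = _normalize(text)
        if any(phrase in normalized for phrase in VISUAL_PHRASES):
            hits.append((int(minutes) * 60 + int(seconds), text))
    return hits
```

Each returned offset is the point at which a frame would be worth capturing. Substring matching keeps the sketch short, but it would also fire on incidental uses of a phrase, so a real detector might need stricter rules.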
Use Cases
- Conference talks - Extract slides and diagrams
- Tutorials - Capture UI screenshots with instructions
- Code walkthroughs - Save code snippets shown on screen
- Presentations - Get slides without screen recording
Requirements
Dependencies are installed automatically when you select this skill:
- yt-dlp - Downloads videos and transcripts
- ffmpeg - Extracts frames from video
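If you want to confirm that both tools ended up on your `PATH`, a small check like the following would do. It is only a sketch using the standard library's `shutil.which`; the helper name `missing_tools` is hypothetical and not part of the skill.

```python
import shutil

# The two dependencies named in this document.
REQUIRED_TOOLS = ("yt-dlp", "ffmpeg")


def missing_tools(tools=REQUIRED_TOOLS) -> list[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty result from `missing_tools()` means both executables were found.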
Limitations
- Requires captions (auto-generated or manual)
- Frame quality depends on video quality
- Large videos take longer to process
Related
- Record 021 - Implementation details