Extract everything from a YouTube video: metadata, transcript, chapters, description links, social profiles, tags, engagement stats — saved to ~/.hermes/youtube/ for reuse across tasks.
This skill now keeps the best parts of the original Hermes YouTube extractor and borrows the strongest workflow ideas from Brad Bonanno's claude-video /watch skill.
Use youtube-content when you need:
- structured metadata archival
- transcript/chapter extraction
- description-link mining
- reusable JSON artifacts in ~/.hermes/youtube/
- durable source records that pair cleanly with the separate video-watch visual-analysis workflow
Use video-watch when you need:
- URL-or-local-file video analysis beyond YouTube
- bug debugging from a screen recording
- hook analysis from the first seconds of a video
- focused timestamp-window inspection with denser frame extraction
- answers grounded in extracted frames rather than only transcript/metadata
Best combined workflow:
1. Run youtube-content first for YouTube videos to get durable metadata, transcript, chapters, and link extraction.
2. If the user's question depends on what is visibly on screen, follow with a video-watch-style frame workflow focused on the specific chapter or timestamp range.
3. For long videos, do not trust a sparse whole-video visual pass if the user only cares about one moment. Re-run on a bounded window.
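Step 3 can be sketched as a small helper that turns a timestamp of interest into a bounded window. This is only an illustration: the function name chapter_window is hypothetical, and it assumes the chapters list and duration_seconds field have the shape used in the saved JSON (each chapter carrying a start_seconds value).

```python
from typing import Dict, List, Tuple


def chapter_window(chapters: List[Dict], duration_seconds: int, at_seconds: int) -> Tuple[int, int]:
    """Return (start, end) in seconds of the chapter containing at_seconds.

    Falls back to the whole video when no chapters are available.
    """
    start, end = 0, duration_seconds
    for chapter_start in sorted(c["start_seconds"] for c in chapters):
        if chapter_start > at_seconds:
            # The next chapter begins here, so it closes the window.
            end = chapter_start
            break
        start = chapter_start
    return (start, end)


# Illustrative chapters, not taken from a real video.
chapters = [
    {"title": "Intro", "time": "0:00", "start_seconds": 0},
    {"title": "Demo", "time": "1:00", "start_seconds": 60},
    {"title": "Wrap-up", "time": "3:00", "start_seconds": 180},
]
print(chapter_window(chapters, 213, 95))  # (60, 180)
```

The resulting (start, end) pair is what a denser frame pass would be restricted to, instead of sampling the whole video sparsely.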
This skill is now intentionally metadata/transcript-first.
When the user needs actual visual understanding of the video, switch to video-watch instead of trying to do URL-level Gemini analysis inside this extractor. That keeps the workflow closer to Brad Bonanno's /watch approach.
# Required
export SUPADATA_API_KEY=sd_... # In ~/.hermes/.env
# Optional (legacy fallback)
pip install youtube-transcript-api pysocks
SUPADATA_API_KEY=sd_... python3 SKILL_DIR/scripts/youtube_extract.py "https://youtube.com/watch?v=VIDEO_ID"
SUPADATA_API_KEY=sd_... python3 SKILL_DIR/scripts/youtube_extract.py "VIDEO_URL" --json
SUPADATA_API_KEY=sd_... python3 SKILL_DIR/scripts/youtube_extract.py "VIDEO_URL" -l es
SUPADATA_API_KEY=sd_... python3 SKILL_DIR/scripts/youtube_extract.py "VIDEO_URL" --no-save
from hermes_tools import terminal
import os
url = "https://www.youtube.com/watch?v=VIDEO_ID"
key = os.environ.get("SUPADATA_API_KEY", "")
result = terminal(f'SUPADATA_API_KEY={key} python3 ~/.hermes/skills/media/youtube-content/scripts/youtube_extract.py "{url}" --json')
print(result["output"])
import json, os
from urllib.parse import urlencode
from urllib.request import Request, urlopen
# Quick transcript via Supadata; no script needed
video_id = "dQw4w9WgXcQ"
key = os.environ.get("SUPADATA_API_KEY", "")
# URL-encode the nested watch URL so its own "?v=" is not confused with the API's query string
query = urlencode({"url": f"https://www.youtube.com/watch?v={video_id}", "mode": "auto"})
req = Request(
    f"https://api.supadata.ai/v1/transcript?{query}",
    headers={"x-api-key": key}
)
data = json.loads(urlopen(req, timeout=60).read())
print(data["content"][:500])
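The snippet above starts from a bare video ID. When only a URL is at hand, the ID could be recovered along these lines. This is a hedged sketch, not the extractor's own parser: extract_video_id is a hypothetical helper, and it only covers the watch, youtu.be, shorts, embed, and live URL forms.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the 11-character video ID from a common YouTube URL form, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    segments = [s for s in parsed.path.split("/") if s]
    candidate = None
    if host == "youtu.be" and segments:
        candidate = segments[0]
    elif host in ("youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif len(segments) >= 2 and segments[0] in ("shorts", "embed", "live"):
            candidate = segments[1]
    # YouTube video IDs are 11 characters long; reject anything else.
    if candidate and len(candidate) == 11:
        return candidate
    return None


print(extract_video_id("https://youtu.be/dQw4w9WgXcQ"))  # dQw4w9WgXcQ
```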
All extractions are saved to ~/.hermes/youtube/:
~/.hermes/youtube/
├── index.json # Master index of all extracted videos
├── VIDEO_ID_title-slug.json # Full structured extraction (JSON)
├── VIDEO_ID_transcript.txt # Human-readable transcript with chapters
└── ...
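A minimal sketch of reusing a saved extraction, assuming files follow the VIDEO_ID_title-slug.json naming shown above. The helper name load_extraction is hypothetical, and the demo writes to a temporary directory rather than the real ~/.hermes/youtube/ folder.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def load_extraction(video_id: str, root: Path) -> Optional[dict]:
    """Load the first VIDEO_ID_*.json extraction under root, or None if absent."""
    matches = sorted(root.glob(f"{video_id}_*.json"))
    if not matches:
        return None
    return json.loads(matches[0].read_text(encoding="utf-8"))


# Demo against a throwaway directory with one fake extraction file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sample = {"video_id": "dQw4w9WgXcQ", "title": "Example title"}
    (root / "dQw4w9WgXcQ_example-title.json").write_text(json.dumps(sample), encoding="utf-8")
    found = load_extraction("dQw4w9WgXcQ", root)
    missing = load_extraction("AAAAAAAAAAA", root)

print(found["title"])  # Example title
print(missing)         # None
```

In real use, root would be Path.home() / ".hermes" / "youtube". Checking for an existing file first avoids spending API credits on a video that was already extracted.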
{
"extracted_at": "ISO timestamp",
"video_id": "11-char ID",
"url": "full YouTube URL",
"title": "Video title",
"channel": {"name", "id", "url", "handle"},
"description": "Full description text",
"duration_seconds": 213,
"duration_string": "3:33",
"upload_date": "2024-01-15T...",
"category": "Education",
"tags": ["tag1", "tag2"],
"stats": {"views": 1000000, "likes": 50000},
"chapters": [{"title": "Intro", "time": "0:00", "start_seconds": 0}],
"thumbnail": "https://i.ytimg.com/...",
"description_links": ["https://..."],
"description_timestamps": [{"time": "0:00", "seconds": 0, "label": "Intro"}],
"social_links": {"twitter": ["@handle"], "github": ["repo"]},
"endscreen_videos": [{"title": "...", "url": "..."}],
"transcript": {
"available": true,
"source": "supadata",
"segment_count": 250,
"full_text": "complete transcript as one string",
"timestamped_text": "0:00 first line\n0:05 second line...",
"segments": [{"text": "...", "start": 0.0, "duration": 2.5}]
}
}
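As an illustration of how this structure can be consumed, the sketch below pulls the transcript segments that fall inside a time range, for example one chapter. It assumes only the transcript.available and transcript.segments fields shown above; segments_between is a hypothetical helper and the record is made-up sample data.

```python
from typing import Dict, List


def segments_between(record: Dict, start: float, end: float) -> List[Dict]:
    """Return transcript segments whose start time falls in [start, end)."""
    transcript = record.get("transcript") or {}
    if not transcript.get("available"):
        return []
    return [s for s in transcript.get("segments", []) if start <= s["start"] < end]


# Made-up record following the schema above.
record = {
    "transcript": {
        "available": True,
        "segments": [
            {"text": "welcome", "start": 0.0, "duration": 2.5},
            {"text": "first demo", "start": 61.0, "duration": 3.0},
            {"text": "thanks", "start": 190.0, "duration": 2.0},
        ],
    }
}
print(" ".join(s["text"] for s in segments_between(record, 60, 180)))  # first demo
```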
We removed Gemini/OpenRouter video-url enrichment from the default YouTube extraction flow.
Reasons:
- it created provider/config friction in Hermes
- it blurred the line between archival extraction and actual video watching
- Brad-style analysis is better served by the dedicated video-watch skill, which works from frames, contact sheets, timestamps, and captions
Use youtube-content to archive and structure the source.
Use video-watch to actually watch the video.
Base URL: https://api.supadata.ai/v1
Auth: x-api-key header
GET /transcript
- url: YouTube URL (required)
- lang: language code (optional)
- mode: native (1 credit), generate (2 credits/min), auto (tries native first)
- jobId for async polling
GET /youtube/video
- videoId: 11-char video ID (required)
GET /metadata
- url: any social media URL (YouTube, TikTok, Instagram, X, Facebook)
GET /job/{jobId}
YouTube URL forms:
- https://www.youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
- https://youtube.com/shorts/VIDEO_ID
- https://youtube.com/embed/VIDEO_ID
- https://youtube.com/live/VIDEO_ID
When Supadata is not configured or available and normal transcript tools fail, do not stop at a sparse noembed result. Fetch the watch page HTML directly and parse ytInitialData / ytcfg to recover useful structured data:
python3 - <<'PY'
import json, re, urllib.request
video_id = 'VIDEO_ID'
url = f'https://www.youtube.com/watch?v={video_id}&hl=en&gl=US'
req = urllib.request.Request(url, headers={
    'User-Agent': 'Mozilla/5.0',
    'Accept-Language': 'en-US,en;q=0.9',
})
html = urllib.request.urlopen(req, timeout=20).read().decode('utf-8', 'ignore')
open('/tmp/yt.html', 'w').write(html)
# Pull the two embedded JSON blobs out of the page and pretty-print them under /tmp
for name, pat in [('player', r'ytInitialPlayerResponse\s*=\s*({.+?});'), ('data', r'ytInitialData\s*=\s*({.+?});')]:
    m = re.search(pat, html)
    if m:
        open(f'/tmp/{name}.json', 'w').write(json.dumps(json.loads(m.group(1)), indent=2))
PY
Then inspect /tmp/data.json recursively for:
- videoPrimaryInfoRenderer: title, views, likes
- videoSecondaryInfoRenderer: channel name/id/handle/subscriber count
- structuredDescriptionContentRenderer → videoDescriptionHeaderRenderer: publish date, views, likes
- expandableVideoDescriptionBodyRenderer.attributedDescriptionBodyText.content: full visible description and chapter timestamps
- engagementPanels[] with panelIdentifier == engagement-panel-searchable-transcript: transcript availability and getTranscriptEndpoint.params
Use this fallback to update the saved JSON with metadata, description, timestamps, and extraction notes even when transcript text cannot be retrieved.
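The recursive inspection described above could look roughly like this. It is a sketch rather than the extractor's own code: find_key is a hypothetical helper, and the sample dictionary is a tiny stand-in whose nesting is invented for the demo, not the real ytInitialData layout.

```python
from typing import Any, Iterator


def find_key(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key` anywhere in a nested JSON structure."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            # Keep descending: the same key may also appear deeper in the tree.
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


# Stand-in data; with a real page this would be json.load(open('/tmp/data.json')).
sample = {
    "contents": {
        "results": [
            {"videoPrimaryInfoRenderer": {"title": {"runs": [{"text": "Example"}]}}},
            {"videoSecondaryInfoRenderer": {"owner": {}}},
        ]
    }
}
primary = next(find_key(sample, "videoPrimaryInfoRenderer"), None)
print(primary["title"]["runs"][0]["text"])  # Example
```

Searching by renderer name instead of a fixed path keeps the lookup working when the surrounding nesting changes.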
- Requires the SUPADATA_API_KEY env var. Key format: sd_...
- Try the ytInitialData fallback above before reporting partial extraction.
- The transcript panel may expose getTranscriptEndpoint.params, but /youtubei/v1/get_transcript can still fail with 400 FAILED_PRECONDITION from VPS/cloud environments. Record that the panel exists and save chapters/description; do not claim transcript extraction succeeded.
- yt-dlp --dump-single-json may fail with “Sign in to confirm you’re not a bot” on cloud hosts. Treat it as another blocked path, not a final failure.
- youtube-transcript-api results expose a .snippets attribute, not dict keys.
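When transcript retrieval is blocked, chapters can still be saved from the description text. The sketch below shows one way to do that, producing entries shaped like the description_timestamps field. The regex and the parse_timestamps name are assumptions for illustration; real descriptions may need a looser pattern.

```python
import re
from typing import Dict, List

# A line that starts with M:SS, MM:SS, or H:MM:SS, then an optional "- ", then a label.
TIMESTAMP_LINE = re.compile(r"^\s*(\d{1,2}(?::\d{2}){1,2})\s+(?:-\s+)?(\S.*?)\s*$")


def parse_timestamps(description: str) -> List[Dict]:
    """Extract chapter-style timestamp lines from a video description."""
    results = []
    for line in description.splitlines():
        match = TIMESTAMP_LINE.match(line)
        if not match:
            continue
        time_text, label = match.groups()
        seconds = 0
        for part in time_text.split(":"):
            seconds = seconds * 60 + int(part)
        results.append({"time": time_text, "seconds": seconds, "label": label})
    return results


description = "Links below\n0:00 Intro\n1:05 - Setup\n1:02:03 Deep dive\nFollow me at 5pm"
for entry in parse_timestamps(description):
    print(entry)
```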