video-podcast-maker (Claude Skill) — Install & Live Demo

Why use it

Key features

Script writing with two-host conversational arc
6 TTS engines: Edge / Azure / OpenAI / ElevenLabs / etc. — pick by quality/cost
Multi-language: zh-CN and en-US first-class
Auto-assembly: aligns voice tracks with visuals + intro/outro
Direct upload helpers for Bilibili / YouTube

Live Demo

What it looks like in practice


Install

Pick your client

Claude Desktop · macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "video-podcast-maker-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/Agents365-ai/video-podcast-maker",
        "~/.claude/skills/video-podcast-maker"
      ],
      "_inferred": true
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.
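
If the config file already lists other servers, the entry has to be merged in rather than pasted over the whole file. The following is a minimal sketch of that merge, not part of the skill: the helper names merge_server and update_config_file are hypothetical, and the non-standard "_inferred" marker from the snippet above is left out on the assumption that clients ignore it.

```python
import json
from pathlib import Path

# Entry copied from the config above ("_inferred" omitted; see the assumption in the text).
ENTRY = {
    "command": "git",
    "args": [
        "clone",
        "https://github.com/Agents365-ai/video-podcast-maker",
        "~/.claude/skills/video-podcast-maker",
    ],
}


def merge_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry stored under mcpServers[name]."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # copy so the input is not mutated
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, name: str, entry: dict) -> None:
    """Read the JSON config if it exists, merge the entry, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    merged = merge_server(config, name, entry)
    path.write_text(json.dumps(merged, indent=2), encoding="utf-8")
```

Existing servers and unrelated top-level settings are carried over unchanged; only the named entry is added or replaced.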

Cursor · global: ~/.cursor/mcp.json · project: .cursor/mcp.json

{
  "mcpServers": {
    "video-podcast-maker-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/Agents365-ai/video-podcast-maker",
        "~/.claude/skills/video-podcast-maker"
      ],
      "_inferred": true
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.
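
The precedence rule can be pictured as a dictionary merge. This is only a sketch under the assumption that the override happens per server name; Cursor's actual resolution logic is not described here, and effective_servers is a hypothetical helper.

```python
def effective_servers(global_cfg: dict, project_cfg: dict) -> dict:
    """Combine two mcpServers maps; project entries replace global ones by name."""
    servers = dict(global_cfg.get("mcpServers", {}))
    servers.update(project_cfg.get("mcpServers", {}))  # project config wins
    return servers
```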

VS Code → Cline → MCP Servers → Edit

{
  "mcpServers": {
    "video-podcast-maker-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/Agents365-ai/video-podcast-maker",
        "~/.claude/skills/video-podcast-maker"
      ],
      "_inferred": true
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

Windsurf · ~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "video-podcast-maker-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/Agents365-ai/video-podcast-maker",
        "~/.claude/skills/video-podcast-maker"
      ],
      "_inferred": true
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

Continue · ~/.continue/config.json

{
  "mcpServers": [
    {
      "name": "video-podcast-maker-skill",
      "command": "git",
      "args": [
        "clone",
        "https://github.com/Agents365-ai/video-podcast-maker",
        "~/.claude/skills/video-podcast-maker"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.
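
To make the difference concrete, here is a small sketch that converts the map form used by the other clients into the array form shown above. The function name to_continue_format is hypothetical, and dropping keys that start with an underscore (such as "_inferred") is an assumption based on the Continue snippet not containing one.

```python
def to_continue_format(config: dict) -> dict:
    """Convert {"mcpServers": {name: entry}} into {"mcpServers": [{"name": name, ...}]}."""
    servers = []
    for name, entry in config.get("mcpServers", {}).items():
        # Keep the real fields; skip annotation keys like "_inferred".
        fields = {k: v for k, v in entry.items() if not k.startswith("_")}
        servers.append({"name": name, **fields})
    return {"mcpServers": servers}
```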

Zed · ~/.config/zed/settings.json

{
  "context_servers": {
    "video-podcast-maker-skill": {
      "command": {
        "path": "git",
        "args": [
          "clone",
          "https://github.com/Agents365-ai/video-podcast-maker",
          "~/.claude/skills/video-podcast-maker"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.
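
Zed nests the executable and its arguments one level deeper than the mcpServers map. A sketch of that reshaping, assuming the input uses the map form from the Claude Desktop snippet; to_zed_format is a hypothetical helper, not a Zed API.

```python
def to_zed_format(config: dict) -> dict:
    """Convert an mcpServers map into the context_servers shape shown above."""
    return {
        "context_servers": {
            name: {
                "command": {
                    "path": entry["command"],  # the executable to launch
                    "args": list(entry.get("args", [])),
                }
            }
            for name, entry in config.get("mcpServers", {}).items()
        }
    }
```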

claude mcp add video-podcast-maker-skill -- git clone https://github.com/Agents365-ai/video-podcast-maker ~/.claude/skills/video-podcast-maker

One-liner for Claude Code. Verify with claude mcp list; remove with claude mcp remove video-podcast-maker-skill.
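
Every snippet in this section comes down to the same git clone into the skills directory. As a rough sketch, the same step can be scripted directly. The function names are hypothetical; the tilde is expanded explicitly because a process started without a shell receives "~" as a literal character.

```python
import os
import subprocess

REPO_URL = "https://github.com/Agents365-ai/video-podcast-maker"
SKILL_DIR = "~/.claude/skills/video-podcast-maker"


def clone_command(repo_url: str = REPO_URL, dest: str = SKILL_DIR) -> list[str]:
    """Build the git clone argv with '~' expanded to the user's home directory."""
    return ["git", "clone", repo_url, os.path.expanduser(dest)]


def install_skill() -> None:
    """Clone the skill unless the destination directory already exists."""
    cmd = clone_command()
    if os.path.isdir(cmd[-1]):
        return  # already installed; nothing to do
    subprocess.run(cmd, check=True)
```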

Use Cases

Real-world ways to use video-podcast-maker

Convert a blog post into a podcast-style video

👤 Content creators repurposing written work ⏱ ~60 min intermediate

When to use: You have a long-form post and want a 10-minute video to push to YouTube or Bilibili.

Prerequisites

Skill installed — git clone https://github.com/Agents365-ai/video-podcast-maker ~/.claude/skills/video-podcast-maker
TTS engine credentials — Set the env var for the engine you choose (e.g. AZURE_TTS_KEY)
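
A quick pre-flight check can catch a missing key before any rendering starts. This is a minimal sketch: AZURE_TTS_KEY is the only variable name taken from the text above, and missing_credentials is a hypothetical helper rather than something the skill provides.

```python
import os


def missing_credentials(required: list[str], env=None) -> list[str]:
    """Return the names of required environment variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials(["AZURE_TTS_KEY"])
    if missing:
        print("Set these variables before rendering:", ", ".join(missing))
```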

Flow

Generate script

Use video-podcast-maker. From post.md, write a two-host script (10 min) in en-US. Hosts: Alice (analytical), Bob (curious).

→ Turn-by-turn script with Alice/Bob lines
Render voices

Render with Azure TTS. Alice: en-US-JennyNeural, Bob: en-US-GuyNeural.

→ Two MP3 tracks; alignment metadata
Assemble

Assemble video: title card, b-roll keywords from script, host avatars, captions.

→ MP4 ready, 1920×1080
Upload

Push to YouTube as unlisted with description + tags from the script.

→ YouTube URL

Outcome: Polished podcast video from one long-form post in under an hour.

Pitfalls

TTS pronunciation off on technical terms — Pre-mark tricky words in script with phonetic hints; supported by most engines
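
One common way to carry a phonetic hint is the SSML sub element, which tells the engine to speak an alias in place of the written term. The sketch below assumes the chosen engine accepts SSML input and that the skill passes it through, neither of which is confirmed here; add_pronunciation_hints and the example hints are hypothetical.

```python
import re
from xml.sax.saxutils import escape, quoteattr


def add_pronunciation_hints(line: str, hints: dict[str, str]) -> str:
    """Wrap each listed term in <sub alias="...">, escaping the rest for XML."""
    if not hints:
        return escape(line)
    # Longest terms first so a short term never pre-empts a longer one.
    terms = sorted(hints, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(t) for t in terms))
    parts, pos = [], 0
    for match in pattern.finditer(line):
        parts.append(escape(line[pos:match.start()]))
        term = match.group(0)
        parts.append(f"<sub alias={quoteattr(hints[term])}>{escape(term)}</sub>")
        pos = match.end()
    parts.append(escape(line[pos:]))
    return "".join(parts)
```

The result is an SSML fragment, so it still needs to be placed inside whatever speak and voice wrapper the engine expects.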

Combine with: humanizer-skill

Run a weekly podcast on a topic with consistent hosts

👤 Niche-topic creators with a publishing cadence ⏱ ~45 min advanced

When to use: You want an automated weekly Friday podcast on AI, Web3, or any other topic.

Flow

Define personas

Set host personas: Alice (skeptic), Bob (enthusiast). Save as default.

→ Persona file saved
Pull weekly news

Use video-podcast-maker. Pull this week's top 5 stories on <topic> from RSS feeds. Generate the script.

→ Script with 5 segments
Render + publish

Render and publish to YouTube + Bilibili at Friday 9am.

→ Both platforms have the episode

Outcome: Consistent weekly content; near-zero per-episode effort.
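
The "Friday 9am" slot in the flow above has to be turned into a concrete timestamp by whatever scheduler drives the run. A small sketch of that calculation, assuming local naive datetimes; how the skill itself schedules publishing is not described, and next_friday_9am is a hypothetical helper.

```python
from datetime import datetime, timedelta


def next_friday_9am(now: datetime) -> datetime:
    """Return the next Friday 09:00 strictly after now (same timezone as now)."""
    days_ahead = (4 - now.weekday()) % 7  # Monday is 0, so Friday is 4
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=9, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        candidate += timedelta(days=7)  # this week's slot has already passed
    return candidate
```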

Pitfalls

AI voices sound the same after a few episodes — Rotate personas; vary engine; add real intro stings

Combine with: duckduckgo-mcp

Localize an English podcast to Chinese (or vice versa)

👤 Creators serving cross-language audiences ⏱ ~50 min intermediate

When to use: You have an English podcast and want a Chinese version for Bilibili with native voices.

Flow

Translate script

Use video-podcast-maker. Translate script from en-US to zh-CN preserving the conversational tone.

→ zh-CN script with cultural adaptation, not literal
Render with native voices

Render with zh-CN voices (e.g. Azure XiaoxiaoNeural + YunxiNeural).

→ Native-quality audio
Re-assemble + upload to Bilibili

Use the same b-roll; new audio; new captions in zh-CN. Upload to Bilibili.

→ Bilibili URL

Outcome: An authentic cross-language version rather than a word-for-word dub.

Pitfalls

Direct translation loses idioms — Skill is configured to adapt culturally; review jokes/references manually

Combinations

Pair with other MCPs for 10x leverage

video-podcast-maker-skill + humanizer-skill

Strip AI tells from generated scripts

Run humanizer on the script before TTS so it sounds less generated and more conversational.

video-podcast-maker-skill + duckduckgo-mcp

Pull fresh news for the script

Search latest <topic> stories; feed top 5 into make_script.

Tools

What this MCP exposes

Tool	Inputs	When to call	Cost
make_script	source_text, hosts, length_min, language	Step 1 — script writing	LLM tokens
render_tts	script, engine, voices	After script approved	TTS engine quota / $
assemble_video	audio_tracks, b_roll_keywords, theme	Final assembly	Local CPU/GPU
publish	platform, mp4_path, metadata	Push to YouTube / Bilibili	0
translate_script	script, target_language	Localization step	LLM tokens
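
The Inputs column can double as a checklist before a tool is invoked. The sketch below copies the tool and input names from the table; treating every listed input as required is an assumption, and missing_inputs is a hypothetical helper.

```python
# Tool names and input names as listed in the table above.
TOOL_INPUTS = {
    "make_script": ("source_text", "hosts", "length_min", "language"),
    "render_tts": ("script", "engine", "voices"),
    "assemble_video": ("audio_tracks", "b_roll_keywords", "theme"),
    "publish": ("platform", "mp4_path", "metadata"),
    "translate_script": ("script", "target_language"),
}


def missing_inputs(tool: str, payload: dict) -> list[str]:
    """Return the listed inputs for tool that are absent from payload."""
    if tool not in TOOL_INPUTS:
        raise KeyError(f"unknown tool: {tool}")
    return [field for field in TOOL_INPUTS[tool] if field not in payload]
```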

Cost & Limits

What this costs to run

API quota: TTS engines have per-character limits; Azure free tier ~500k chars/month
Tokens per call: Script ~3k–6k tokens; assembly is local
Monetary: The skill itself is free; you pay for TTS engine and LLM usage
Tip: Use Edge TTS (free) for drafts; Azure/ElevenLabs only for production
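
The character quota translates into a rough episode budget. The sketch below uses the ~500k characters per month figure from above; the 9,000 characters per episode in the example is an illustrative assumption (roughly a 10-minute script), not a measured value, and both helper names are hypothetical.

```python
def script_chars(lines: list[str]) -> int:
    """Count the characters that would be sent to the TTS engine."""
    return sum(len(line) for line in lines)


def episodes_within_quota(chars_per_episode: int, monthly_quota: int = 500_000) -> int:
    """Return how many whole episodes fit into the monthly character quota."""
    if chars_per_episode <= 0:
        raise ValueError("chars_per_episode must be positive")
    return monthly_quota // chars_per_episode


if __name__ == "__main__":
    # Assumed episode size; measure a real script with script_chars() instead.
    print(episodes_within_quota(9_000))
```

Re-renders count against the same quota, so drafts on a free engine stretch the budget further.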

Security

Permissions, secrets, blast radius

Minimum scopes: filesystem-write (output)

Credential storage: Engine API keys via env vars; YouTube/Bilibili tokens in secrets file

Data egress: TTS engines, LLM provider, target platforms

Never grant: OAuth tokens that can post to your channel without confirmation

Voice cloning features (if enabled) raise consent concerns — only clone with explicit permission

Troubleshooting

Common errors and fixes

TTS truncates long lines

Most engines cap a single line at about 250 characters. The skill auto-splits longer lines, but check that long sentences were split at sensible points.
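
For checking or pre-splitting lines by hand, a sentence-aware splitter is enough. This is a sketch of one possible approach, not the skill's own splitter: split_line is a hypothetical helper, it only recognises ".", "!" and "?" followed by whitespace as sentence ends, and the 250 default simply mirrors the figure above.

```python
import re


def split_line(text: str, limit: int = 250) -> list[str]:
    """Split text into chunks of at most limit characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in filter(None, sentences):
        # A single sentence over the limit is hard-wrapped at the last space.
        while len(sentence) > limit:
            cut = sentence.rfind(" ", 0, limit)
            cut = cut if cut > 0 else limit
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:cut].strip())
            sentence = sentence[cut:].strip()
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate  # sentence still fits in the current chunk
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```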

Audio drift across long renders

Render in chunks and concatenate them with a crossfade. The skill does this by default for renders longer than 5 minutes.
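
When doing the concatenation manually, ffmpeg's acrossfade filter can chain the chunks together. The sketch below only builds the command line. Whether the skill uses ffmpeg internally is not stated, and the function name, file names and the 0.5 second fade are assumptions.

```python
def crossfade_concat_command(
    chunks: list[str], output: str, fade_seconds: float = 0.5
) -> list[str]:
    """Build an ffmpeg argv that joins audio chunks with a crossfade between each pair."""
    if len(chunks) < 2:
        raise ValueError("need at least two chunks to crossfade")
    cmd = ["ffmpeg", "-y"]
    for path in chunks:
        cmd += ["-i", path]
    filters, previous = [], "[0:a]"
    for index in range(1, len(chunks)):
        label = f"[x{index}]"
        # Fade the running result into the next input and name the new result.
        filters.append(f"{previous}[{index}:a]acrossfade=d={fade_seconds}{label}")
        previous = label
    cmd += ["-filter_complex", ";".join(filters), "-map", previous, output]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) once ffmpeg is on the PATH.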

Bilibili upload fails

Verify the cookies in the secrets file; Bilibili sometimes requires logging in again.

Captions out of sync

Re-run alignment. Some TTS engines under-report timing, so the skill includes a resync mode.
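
If an engine under-reports timing by a roughly constant factor, captions can be stretched to the measured audio length. This sketch assumes the drift is linear and is not a description of the skill's resync mode; rescale_captions and the tuple layout (start, end, text) are hypothetical.

```python
def rescale_captions(
    captions: list[tuple[float, float, str]],
    reported_duration: float,
    actual_duration: float,
) -> list[tuple[float, float, str]]:
    """Scale caption start and end times from the reported to the actual duration."""
    if reported_duration <= 0:
        raise ValueError("reported_duration must be positive")
    scale = actual_duration / reported_duration
    return [
        (round(start * scale, 3), round(end * scale, 3), text)
        for start, end, text in captions
    ]
```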

Alternatives

video-podcast-maker vs others

Alternative	When to use it instead	Tradeoff
ElevenLabs Studio	You want a polished SaaS UI	Costs more; less automation in chat
NotebookLM Audio Overview	You want a one-shot two-host audio summary from any source	No video; less control; cloud-only

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
