● Community Agents365-ai 🔑 Needs your key

video-podcast-maker

by Agents365-ai · Agents365-ai/video-podcast-maker

Idea to Bilibili / YouTube podcast video in one flow — script writing, multi-voice TTS, automated assembly, multi-language.

video-podcast-maker is a Claude Code skill that takes a topic or article and produces a video-podcast asset: a two-host script, multi-voice TTS render via 6 engines (Edge / Azure / OpenAI / etc.), b-roll-style visuals, and a Bilibili/YouTube-ready video. Multi-language support for zh-CN and en-US out of the box.

Install

Pick your client

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "video-podcast-maker-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/Agents365-ai/video-podcast-maker",
        "~/.claude/skills/video-podcast-maker"
      ],
      "_inferred": true
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.

~/.cursor/mcp.json · .cursor/mcp.json
{
  "mcpServers": {
    "video-podcast-maker-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/Agents365-ai/video-podcast-maker",
        "~/.claude/skills/video-podcast-maker"
      ],
      "_inferred": true
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "video-podcast-maker-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/Agents365-ai/video-podcast-maker",
        "~/.claude/skills/video-podcast-maker"
      ],
      "_inferred": true
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "video-podcast-maker-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/Agents365-ai/video-podcast-maker",
        "~/.claude/skills/video-podcast-maker"
      ],
      "_inferred": true
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "video-podcast-maker-skill",
      "command": "git",
      "args": [
        "clone",
        "https://github.com/Agents365-ai/video-podcast-maker",
        "~/.claude/skills/video-podcast-maker"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.
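
The difference between the two layouts can be sketched in Python. The helper name `map_to_array` is hypothetical and not part of Continue or of this skill; it only mirrors the two shapes shown on this page, and dropping underscore-prefixed keys such as `_inferred` is an assumption.

```python
def map_to_array(config: dict) -> dict:
    """Convert a {"mcpServers": {name: spec}} map into the array-of-objects layout."""
    servers = []
    for name, spec in config.get("mcpServers", {}).items():
        # Assumption: keys starting with "_" (e.g. "_inferred") are annotations and are dropped
        entry = {"name": name}
        entry.update({k: v for k, v in spec.items() if not k.startswith("_")})
        servers.append(entry)
    return {"mcpServers": servers}
```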

~/.config/zed/settings.json
{
  "context_servers": {
    "video-podcast-maker-skill": {
      "command": {
        "path": "git",
        "args": [
          "clone",
          "https://github.com/Agents365-ai/video-podcast-maker",
          "~/.claude/skills/video-podcast-maker"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add video-podcast-maker-skill -- git clone https://github.com/Agents365-ai/video-podcast-maker ~/.claude/skills/video-podcast-maker

One-liner. Verify with claude mcp list. Remove with claude mcp remove.

Use Cases

Real-world ways to use video-podcast-maker

Convert a blog post into a podcast-style video

👤 Content creators repurposing written work ⏱ ~60 min intermediate

When to use: You have a long-form post and want a 10-minute video to publish on YouTube or Bilibili.

Prerequisites
  • Skill installed — git clone https://github.com/Agents365-ai/video-podcast-maker ~/.claude/skills/video-podcast-maker
  • TTS engine credentials — Set the env var for the engine you choose (e.g. AZURE_TTS_KEY)
Flow
  1. Generate script
    Use video-podcast-maker. From post.md, write a two-host script (10 min) in en-US. Hosts: Alice (analytical), Bob (curious).
    → Turn-by-turn script with Alice/Bob lines
  2. Render voices
    Render with Azure TTS. Alice: en-US-JennyNeural, Bob: en-US-GuyNeural.
    → Two MP3 tracks; alignment metadata
  3. Assemble
    Assemble video: title card, b-roll keywords from script, host avatars, captions.
    → MP4 ready, 1920×1080
  4. Upload
    Push to YouTube as unlisted with description + tags from the script.
    → YouTube URL

Outcome: Polished podcast video from one long-form post in under an hour.

Pitfalls
  • TTS pronunciation off on technical terms — Pre-mark tricky words in script with phonetic hints; supported by most engines
Combine with: humanizer-skill
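
One way to pre-mark tricky words, assuming the chosen engine accepts SSML input (Azure does; other engines may not), is the standard SSML `<sub alias="...">` element, which tells the engine to read the alias in place of the written word. The helper name `add_pronunciation_hints` and the example aliases are illustrative and do not come from the skill.

```python
import re
from xml.sax.saxutils import escape, quoteattr

def add_pronunciation_hints(line: str, hints: dict[str, str]) -> str:
    """Wrap each hinted word in an SSML <sub> element so the engine speaks the alias."""
    if not hints:
        return escape(line)
    # Longest terms first so a longer term wins over a shorter one it contains
    terms = sorted(hints, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b")
    out, pos = [], 0
    for m in pattern.finditer(line):
        out.append(escape(line[pos:m.start()]))  # plain text is XML-escaped
        alias = quoteattr(hints[m.group(0)])     # quoted, escaped attribute value
        out.append(f"<sub alias={alias}>{escape(m.group(0))}</sub>")
        pos = m.end()
    out.append(escape(line[pos:]))
    return "".join(out)
```

The result is an SSML fragment; it still has to be placed inside whatever `<speak>` and `<voice>` wrapper the engine expects.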

Run a weekly podcast on a topic with consistent hosts

👤 Niche-topic creators with a publishing cadence ⏱ ~45 min advanced

When to use: You want an automated weekly podcast, published every Friday, on AI, Web3, or any other topic.

Flow
  1. Define personas
    Set host personas: Alice (skeptic), Bob (enthusiast). Save as default.
    → Persona file saved
  2. Pull weekly news
    Use video-podcast-maker. Pull this week's top 5 stories on <topic> from RSS feeds. Generate the script.
    → Script with 5 segments
  3. Render + publish
    Render and publish to YouTube + Bilibili at Friday 9am.
    → Both platforms have the episode
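
The page does not describe how the publish time is scheduled. As a minimal sketch, the "Friday 9am" slot could be computed like this; `next_friday_9am` is a hypothetical helper and works on naive local datetimes.

```python
from datetime import datetime, timedelta

FRIDAY = 4  # datetime.weekday(): Monday is 0, Friday is 4

def next_friday_9am(now: datetime) -> datetime:
    """Return the next Friday 09:00 strictly after `now`."""
    days_ahead = (FRIDAY - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=9, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # Already at or past this Friday's slot: move to next week
        candidate += timedelta(days=7)
    return candidate
```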

Outcome: Consistent weekly content; near-zero per-episode effort.

Pitfalls
  • AI voices sound the same after a few episodes — Rotate personas; vary engine; add real intro stings
Combine with: duckduckgo-mcp

Localize an English podcast to Chinese (or vice versa)

👤 Creators serving cross-language audiences ⏱ ~50 min intermediate

When to use: You have an English podcast and want a Chinese version for Bilibili with native voices.

Flow
  1. Translate script
    Use video-podcast-maker. Translate script from en-US to zh-CN preserving the conversational tone.
    → zh-CN script with cultural adaptation, not literal
  2. Render with native voices
    Render with zh-CN voices (e.g. Azure XiaoxiaoNeural + YunxiNeural).
    → Native-quality audio
  3. Re-assemble + upload to Bilibili
    Use the same b-roll; new audio; new captions in zh-CN. Upload to Bilibili.
    → Bilibili URL

Outcome: Authentic cross-language version, not a word-for-word dub.

Pitfalls
  • Direct translation loses idioms — Skill is configured to adapt culturally; review jokes/references manually

Combinations

Pair with other MCPs for 10x leverage

video-podcast-maker-skill + humanizer-skill

Strip AI tells from generated scripts

Run humanizer on the script before TTS so it sounds less generated and more conversational.
video-podcast-maker-skill + duckduckgo-mcp

Pull fresh news for the script

Search latest <topic> stories; feed top 5 into make_script.

Tools

What this MCP exposes

Tool             | Inputs                                   | When to call                 | Cost
make_script      | source_text, hosts, length_min, language | Step 1: script writing       | LLM tokens
render_tts       | script, engine, voices                   | After script approved        | TTS engine quota / $
assemble_video   | audio_tracks, b_roll_keywords, theme     | Final assembly               | Local CPU/GPU
publish          | platform, mp4_path, metadata             | Push to YouTube / Bilibili   | 0
translate_script | script, target_language                  | Localization step            | LLM tokens
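
To make the input lists above concrete, the first two tools' arguments can be sketched as Python dataclasses. The field names come from the Inputs column; the class names and all field types are assumptions, since the page does not specify them.

```python
from dataclasses import dataclass, asdict

@dataclass
class MakeScriptInput:
    source_text: str      # article or topic text (type assumed)
    hosts: list[str]      # host names (type assumed)
    length_min: int       # target length in minutes (type assumed)
    language: str         # e.g. "en-US" or "zh-CN"

@dataclass
class RenderTtsInput:
    script: str
    engine: str               # e.g. "azure" (value format assumed)
    voices: dict[str, str]    # host name -> voice name (shape assumed)

request = MakeScriptInput(
    source_text="Contents of post.md",
    hosts=["Alice", "Bob"],
    length_min=10,
    language="en-US",
)
payload = asdict(request)  # plain dict, ready to serialize as JSON
```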

Cost & Limits

What this costs to run

API quota: TTS engines have per-character limits; Azure free tier ~500k chars/month
Tokens per call: Script ~3k–6k tokens; assembly is local
Monetary: Free for skill; pay per TTS engine + LLM
Tip: Use Edge TTS (free) for drafts; Azure/ElevenLabs only for production
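
As a rough worked example of what the per-character quota means in practice, the sketch below estimates how many episodes fit in the ~500k chars/month figure quoted above. The speaking pace and characters-per-word values are assumptions for English speech, not numbers from the skill.

```python
AZURE_FREE_CHARS_PER_MONTH = 500_000  # approximate figure quoted above
WORDS_PER_MINUTE = 150                # assumption: typical conversational pace
CHARS_PER_WORD = 6                    # assumption: English average incl. trailing space

def episodes_per_month(minutes_per_episode: int,
                       quota: int = AZURE_FREE_CHARS_PER_MONTH) -> int:
    """Estimate how many full episodes of a given length fit in the monthly quota."""
    chars_per_episode = minutes_per_episode * WORDS_PER_MINUTE * CHARS_PER_WORD
    return quota // chars_per_episode

print(episodes_per_month(10))  # 10 min is about 9,000 chars, so this prints 55
```

Re-renders during drafting count against the same quota, which is one reason to draft with a free engine first.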

Security

Permissions, secrets, blast radius

Minimum scopes: filesystem-write (output)
Credential storage: Engine API keys via env vars; YouTube/Bilibili tokens in secrets file
Data egress: TTS engines, LLM provider, target platforms
Never grant: OAuth tokens that can post to your channel without confirmation

Troubleshooting

Common errors and fixes

TTS truncates long lines

Most engines cap a line at roughly 250 characters; the skill auto-splits longer lines, but check long sentences in the output
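
The skill's own splitting logic is not documented here. A minimal sketch of the general idea, preferring sentence boundaries and falling back to word boundaries, might look like this; `split_for_tts` and the default limit of 250 are illustrative.

```python
import re

def split_for_tts(text: str, limit: int = 250) -> list[str]:
    """Split text into chunks of at most `limit` chars, preferring sentence boundaries."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is broken on word boundaries
        while len(sentence) > limit:
            cut = sentence.rfind(" ", 0, limit + 1)
            if cut <= 0:
                cut = limit  # no space available: hard cut
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:cut].rstrip())
            sentence = sentence[cut:].lstrip()
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate  # still fits: keep packing sentences together
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```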

Audio drift across long renders

Render in chunks and concatenate them with a crossfade; the skill does this by default for renders longer than 5 minutes
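
How the skill performs the crossfade is not described on this page. The idea itself can be sketched in pure Python on lists of float samples; `crossfade_concat` is a hypothetical helper using a linear fade, and a real pipeline would more likely delegate this to an audio tool.

```python
def crossfade_concat(a: list[float], b: list[float], overlap: int) -> list[float]:
    """Join two sample lists, linearly fading `a` out and `b` in across `overlap` samples."""
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        return a + b
    head, tail = a[:-overlap], a[-overlap:]
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of b rises from just above 0 to just below 1
        mixed.append(tail[i] * (1 - w) + b[i] * w)
    # The overlapping region is counted once, so the result is shorter than len(a) + len(b)
    return head + mixed + b[overlap:]
```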

Bilibili upload fails

Verify the cookies in the secrets file; Bilibili sometimes requires a fresh login

Captions out of sync

Re-run alignment; some TTS engines under-report timing, and the skill has a resync mode for this
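
The resync mode itself is not documented here. One simple correction, assuming the timing error is uniform across the track (a simplification), is to stretch every cue so that the last one ends at the measured audio duration. `rescale_captions` and the `(start, end, text)` cue shape are hypothetical.

```python
Cue = tuple[float, float, str]  # (start_seconds, end_seconds, text), shape assumed

def rescale_captions(cues: list[Cue], actual_duration: float) -> list[Cue]:
    """Stretch cue times so the last cue ends at the measured audio duration."""
    if not cues:
        return []
    reported_end = max(end for _, end, _ in cues)
    if reported_end <= 0:
        return list(cues)  # nothing sensible to scale against
    scale = actual_duration / reported_end
    return [
        (round(start * scale, 3), round(end * scale, 3), text)
        for start, end, text in cues
    ]
```

If the drift is not uniform, per-segment alignment is needed instead of a single scale factor.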

Alternatives

video-podcast-maker vs others

Alternative               | When to use it instead                                      | Tradeoff
ElevenLabs Studio         | You want a polished SaaS UI                                 | Costs more; less automation in chat
NotebookLM Audio Overview | You want a one-shot two-host audio summary from any source | No video; less control; cloud-only

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
