VoiceMode MCP: Use Cases, Installation & Live Demo

Why use it

Key features

Local Whisper option — no cloud audio
Multiple TTS backends: OpenAI, ElevenLabs, local Coqui
Push-to-talk or voice-activated modes
Streams partial responses so you hear Claude thinking
Works in terminal alongside Claude Code CLI

Installation

Choose your client

Claude Desktop · macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "voicemode-mcp": {
      "command": "uvx",
      "args": [
        "voice-mode"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart the app after saving.
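If the config file already lists other servers, pasting the snippet over it would drop them. The following is a minimal Python sketch of merging the entry instead. The helper names add_server and update_config_file are hypothetical, and the "filesystem" entry in the demo is a placeholder rather than a real server definition.

```python
import json
from pathlib import Path

# The server entry from the snippet above.
VOICEMODE_ENTRY = {"command": "uvx", "args": ["voice-mode"]}


def add_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry registered under mcpServers[name].

    Other top-level keys and other servers are left untouched.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config at path (if it exists), add VoiceMode, write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    updated = add_server(config, "voicemode-mcp", VOICEMODE_ENTRY)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n")


if __name__ == "__main__":
    # Demo on an in-memory config so nothing on disk is modified.
    existing = {"mcpServers": {"filesystem": {"command": "placeholder", "args": []}}}
    print(json.dumps(add_server(existing, "voicemode-mcp", VOICEMODE_ENTRY), indent=2))
```

To apply it to a real file, call update_config_file with the path for your platform, and back the file up first.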

Cursor · global: ~/.cursor/mcp.json · project: .cursor/mcp.json

{
  "mcpServers": {
    "voicemode-mcp": {
      "command": "uvx",
      "args": [
        "voice-mode"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project-level settings take precedence over global ones.
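One way to picture the precedence rule is as a per-server override. The sketch below assumes that a project entry replaces a global entry with the same name and that entries unique to either file are kept. Cursor's actual merge logic is not described here, so treat this purely as an illustration. The function name effective_servers, the "other-server" entry, and the env override are hypothetical.

```python
def effective_servers(global_cfg: dict, project_cfg: dict) -> dict:
    """Combine two mcpServers maps; project entries win on name clashes."""
    merged = dict(global_cfg.get("mcpServers", {}))
    merged.update(project_cfg.get("mcpServers", {}))
    return merged


# Stand-in for ~/.cursor/mcp.json
global_cfg = {
    "mcpServers": {
        "voicemode-mcp": {"command": "uvx", "args": ["voice-mode"]},
        "other-server": {"command": "placeholder", "args": []},
    }
}

# Stand-in for .cursor/mcp.json: same server name, with an env override
project_cfg = {
    "mcpServers": {
        "voicemode-mcp": {
            "command": "uvx",
            "args": ["voice-mode"],
            "env": {"VOICE_MODE_TTS": "openai"},
        }
    }
}

servers = effective_servers(global_cfg, project_cfg)
print(sorted(servers))
print(servers["voicemode-mcp"].get("env"))
```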

VS Code → Cline → MCP Servers → Edit

{
  "mcpServers": {
    "voicemode-mcp": {
      "command": "uvx",
      "args": [
        "voice-mode"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then select "Edit Configuration".

Windsurf: ~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "voicemode-mcp": {
      "command": "uvx",
      "args": [
        "voice-mode"
      ]
    }
  }
}

Same format as Claude Desktop. Changes take effect after restarting Windsurf.

Continue: ~/.continue/config.json

{
  "mcpServers": [
    {
      "name": "voicemode-mcp",
      "command": "uvx",
      "args": [
        "voice-mode"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.
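The difference between the two shapes is mechanical: each map key becomes a "name" field on an array element. Here is a small Python sketch of that conversion, using only the two snippets shown on this page as reference. The function name to_continue_format is hypothetical.

```python
def to_continue_format(config: dict) -> dict:
    """Convert {"mcpServers": {name: entry}} to {"mcpServers": [{"name": name, ...entry}]}."""
    servers = config.get("mcpServers", {})
    return {"mcpServers": [{"name": name, **entry} for name, entry in servers.items()]}


map_style = {
    "mcpServers": {
        "voicemode-mcp": {"command": "uvx", "args": ["voice-mode"]}
    }
}

array_style = to_continue_format(map_style)
print(array_style)
```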

Zed: ~/.config/zed/settings.json

{
  "context_servers": {
    "voicemode-mcp": {
      "command": {
        "path": "uvx",
        "args": [
          "voice-mode"
        ]
      }
    }
  }
}

Add the entry under context_servers. Zed hot-reloads the file on save.

claude mcp add voicemode-mcp -- uvx voice-mode

A one-line command for the Claude Code CLI. Verify with claude mcp list; remove with claude mcp remove.

Use cases

Practical recipes: VoiceMode

Drive a Claude Code session hands-free while reading on another screen

👤 Devs who read docs or designs on one monitor while coding · ⏱ ~30 min · intermediate

When to use: You're reading a design doc and want to dictate changes without alt-tabbing.

Prerequisites

Microphone + speakers: system audio configured; test with say "hello" or an equivalent command
Whisper model ready: voice-mode install-whisper downloads the local model

Flow

Start voice

Use voicemode. Listen for prompts and speak responses. Repeat after me: "ready"

→ TTS plays "ready"

Dictate a change

[spoken] Update src/auth.ts to use bcrypt instead of plain SHA256 for passwords.

→ Transcription correct; change applied; TTS confirms

Review

[spoken] Read me the diff.

→ TTS reads diff in chunks, pausable

Result: A working session where your hands never leave what they were doing.

Pitfalls

TTS talking over your prompts: enable push-to-talk mode or a wake word

Pairs well with: filesystem

Code by voice for accessibility or RSI recovery

👤 Devs with RSI or low vision, or who prefer speech input · ⏱ ~60 min · intermediate

When to use: You can't type for a while and need to keep shipping.

Prerequisites

Tolerable ambient noise: a quiet room; a headset mic beats a laptop mic

Flow

Baseline

[spoken] Use voicemode. Read the latest git diff out loud, pausing between files.

→ Clear TTS read

Workflow

[spoken] Refactor the user model in src/models/user.ts. Move password hashing into a method. Show me the plan first.

→ Plan spoken; confirmation required before changes

Result: A full coding session without keyboard input.

Pitfalls

Code symbols mispronounced by TTS: configure the TTS phoneme dictionary for common programming terms
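The phoneme dictionary format depends on the TTS backend and is not shown on this page. A backend-independent alternative is to rewrite troublesome tokens before the text reaches speech synthesis. The sketch below does that with a plain substitution table. The table entries and the function name speakable are illustrative assumptions, not VoiceMode settings, and the matching is deliberately naive (for example, ".ts" would also match inside ".tsx").

```python
import re

# Hypothetical spoken forms for tokens that TTS engines tend to garble.
SPOKEN_FORMS = {
    "src/": "source slash ",
    ".ts": " dot T S",
    "bcrypt": "bee crypt",
    "SHA256": "sha two fifty six",
    "!=": " not equal to ",
    "=>": " arrow ",
}

# Longest keys first so longer tokens are tried before shorter ones.
_PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(SPOKEN_FORMS, key=len, reverse=True))
)


def speakable(text: str) -> str:
    """Replace known tokens with their spoken forms and collapse extra whitespace."""
    spoken = _PATTERN.sub(lambda match: SPOKEN_FORMS[match.group(0)], text)
    return re.sub(r"\s+", " ", spoken).strip()


print(speakable("src/auth.ts uses bcrypt"))
```

The output of speakable would then be passed to whatever produces the audio, in place of the raw text.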

Combinations

Combine with other MCPs to get more done

voicemode-mcp + filesystem

Voice-dictated code changes land in the repo

I'll dictate changes; apply them in files after reading each back.

voicemode-mcp + github

Dictate a PR description after voice-reviewing the diff

Read me the staged changes, then open a PR with a description I'll dictate.

Tools

What this MCP exposes

Tool	Input	When to call	Cost
start_listening	mode: "ptt"|"vad"	Begin a voice session	Free (local) or OpenAI Whisper API
speak	text: str, voice?: str	Any time Claude wants to surface something audibly	Depends on the TTS provider
transcribe_last	none	Fetch what the user just said	One Whisper call
stop_listening	none	End the voice session	Free
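In MCP, a client invokes a tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds the requests one voice session might produce, using the tool names and inputs from the table above. The helper functions and the mode validation are illustrative assumptions; the server's real argument schema may differ.

```python
import itertools
import json
from typing import Optional

_request_ids = itertools.count(1)

ALLOWED_MODES = {"ptt", "vad"}  # taken from the start_listening row


def tool_call(name: str, arguments: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


def start_listening(mode: str) -> dict:
    """Request a voice session in push-to-talk ("ptt") or voice-activated ("vad") mode."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}, got {mode!r}")
    return tool_call("start_listening", {"mode": mode})


def speak(text: str, voice: Optional[str] = None) -> dict:
    """Request speech output; voice is only sent when given."""
    arguments = {"text": text}
    if voice is not None:
        arguments["voice"] = voice
    return tool_call("speak", arguments)


session = [
    start_listening("ptt"),
    tool_call("transcribe_last"),
    speak("Change applied."),
    tool_call("stop_listening"),
]

for message in session:
    print(json.dumps(message))
```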

Cost and limits

Operating costs

API quota: Local is free. OpenAI Whisper: $0.006/min. ElevenLabs TTS: ~$0.30 per 1k characters.
Tokens per call: Audio pipelines are not billed in tokens directly
Cost: Free with the local stack; metered with cloud providers
Tip: Local Whisper + Coqui TTS is free but lower quality. Start with cloud providers and move to the local stack later
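The per-minute and per-character rates above make a session estimate simple arithmetic. The sketch below assumes OpenAI Whisper for STT and ElevenLabs for TTS at the quoted rates; the function name session_cost and the sample figures are hypothetical, and real provider pricing may have changed.

```python
WHISPER_USD_PER_MIN = 0.006          # OpenAI Whisper rate quoted above
ELEVENLABS_USD_PER_1K_CHARS = 0.30   # approximate ElevenLabs rate quoted above


def session_cost(stt_minutes: float, tts_chars: int, local: bool = False) -> float:
    """Estimate the cost in USD of one voice session.

    stt_minutes: minutes of speech sent for transcription.
    tts_chars: characters of text sent for speech synthesis.
    local: True when both STT and TTS run on the local stack (no metered cost).
    """
    if local:
        return 0.0
    stt_cost = stt_minutes * WHISPER_USD_PER_MIN
    tts_cost = tts_chars / 1000 * ELEVENLABS_USD_PER_1K_CHARS
    return round(stt_cost + tts_cost, 4)


# Hypothetical session: 30 minutes of dictation, 10,000 characters read back.
print(session_cost(stt_minutes=30, tts_chars=10_000))
print(session_cost(stt_minutes=30, tts_chars=10_000, local=True))
```

At these rates the sample session comes to about $3.18, almost all of it TTS, which suggests trimming spoken output before trimming dictation time.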

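Security

Permissions, secrets, blast radius

Minimum scope: microphone, speakers

Credential storage: TTS/STT API keys in environment variables

Data egress: With a cloud provider, recorded audio goes to the STT service and response text goes to the TTS service; nothing leaves the machine with the local stack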

Never use cloud STT in calls with confidential audio unless you trust the provider's retention policy

Troubleshooting

Common errors and fixes

Mic not detected

System audio permission: grant the terminal or Claude Code microphone access

Verify: `voice-mode test-mic` prints input levels

TTS sounds robotic

The default is local Coqui. Switch to OpenAI tts-1-hd by setting VOICE_MODE_TTS=openai

Lag between my speech and response

Use local Whisper-tiny for STT; cloud STT adds 500 ms or more of latency

Alternatives

How VoiceMode compares

Alternative	When to use	Trade-offs
macOS Dictation + say command	You just want basic OS-level voice	No integration with Claude's output; one-way only
Superwhisper / Wispr Flow	You want a polished native macOS dictation app	Not MCP-integrated; no agent-level workflows
