● Official microsoft ⚡ Instant

MarkItDown MCP

by microsoft · microsoft/markitdown

Microsoft's MarkItDown as MCP — convert PDF, DOCX, PPTX, XLSX, audio, and HTML to clean Markdown for Claude to read.

MarkItDown is Microsoft's universal document-to-Markdown converter, packaged as an MCP server. Hand it an Office document, PDF, image, audio file, ZIP, EPUB, or URL and get back structured Markdown that Claude can reason about. The MCP layer (markitdown-mcp) is a separate package in the same monorepo.

Install

Pick your client

~/Library/Application Support/Claude/claude_desktop_config.json  · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "markitdown-mcp": {
      "command": "uvx",
      "args": [
        "markitdown-mcp"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.
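If the config file already lists other servers, pasting the snippet over it would drop them. A minimal Python sketch of a merge, assuming the file is plain JSON with an "mcpServers" map as shown above; the helper name add_mcp_server is hypothetical and not part of any client:

```python
import json
from pathlib import Path


def add_mcp_server(config_path, name, command, args):
    """Add or update one entry under "mcpServers", keeping every other key."""
    path = Path(config_path)
    # Reuse the existing config when the file exists and is not blank.
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Call it as add_mcp_server(path, "markitdown-mcp", "uvx", ["markitdown-mcp"]) with the config path for your OS, and try it on a copy of the file first.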

~/.cursor/mcp.json · .cursor/mcp.json
{
  "mcpServers": {
    "markitdown-mcp": {
      "command": "uvx",
      "args": [
        "markitdown-mcp"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "markitdown-mcp": {
      "command": "uvx",
      "args": [
        "markitdown-mcp"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "markitdown-mcp": {
      "command": "uvx",
      "args": [
        "markitdown-mcp"
      ]
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "markitdown-mcp",
      "command": "uvx",
      "args": [
        "markitdown-mcp"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.
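The difference between the two shapes is mechanical: each map key becomes a "name" field. A small Python sketch of that conversion, assuming only the fields shown in the snippets on this page; map_to_array is a hypothetical helper name:

```python
def map_to_array(mcp_servers):
    """Turn {"name": {...}} into [{"name": "name", ...}], keeping insertion order."""
    return [{"name": name, **spec} for name, spec in mcp_servers.items()]


# The map form used by Claude Desktop, Cursor, Cline, and Windsurf above.
claude_style = {"markitdown-mcp": {"command": "uvx", "args": ["markitdown-mcp"]}}

# The array form shown for Continue.
continue_style = {"mcpServers": map_to_array(claude_style)}
print(continue_style)
```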

~/.config/zed/settings.json
{
  "context_servers": {
    "markitdown-mcp": {
      "command": {
        "path": "uvx",
        "args": [
          "markitdown-mcp"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add markitdown-mcp -- uvx markitdown-mcp

One-liner. Verify with claude mcp list. Remove with claude mcp remove markitdown-mcp.

Use Cases

Real-world ways to use MarkItDown MCP

Drop a 200-page PDF in front of Claude as readable Markdown

👤 Researchers, lawyers, anyone with PDF-heavy workflows · ⏱ ~15 min · beginner

When to use: You need to discuss specifics from a PDF and don't want to copy-paste pages.

Flow
  1. Convert
    Use markitdown. Convert ~/Downloads/whitepaper.pdf to Markdown. Tell me total length and section count.
    → Markdown returned with TOC summary
  2. Discuss specifics
    From section 3, what claims do they make about throughput? Quote the exact lines.
    → Direct quotes with section refs
  3. Compare to another doc
    Now convert competitor.pdf the same way. Compare their throughput claims.
    → Per-doc table of claims

Outcome: Two PDFs ingested, compared, and quotable in chat.

Pitfalls
  • Scanned PDFs come out empty. MarkItDown's OCR is basic at best, so for image-only PDFs run OCR upstream first

Combine with: filesystem
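Step 1 asks for total length and section count. If you would rather compute those yourself from the returned Markdown, a rough Python sketch follows. It assumes sections are marked with #-style headings, which PDF output does not always contain, so a count of zero is possible; markdown_stats is a hypothetical helper:

```python
import re

ATX_HEADING = re.compile(r"^#{1,6}\s+\S")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def markdown_stats(text):
    """Count characters, lines, and #-style headings outside fenced code."""
    lines = text.splitlines()
    in_code = False
    sections = 0
    for line in lines:
        if line.lstrip().startswith(FENCE):
            in_code = not in_code  # toggle on every fence line
        elif not in_code and ATX_HEADING.match(line):
            sections += 1
    return {"characters": len(text), "lines": len(lines), "sections": sections}


sample = "# Title\n\nIntro text.\n\n## Methods\n\nBody.\n"
print(markdown_stats(sample))
```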

Convert any URL to clean Markdown without browser scraping

👤 Anyone who wants to ingest articles or docs by URL · ⏱ ~10 min · beginner

When to use: You want an article's content as structured Markdown rather than raw HTML. Dynamically rendered or paywalled pages may need a browser-based fetch first (see Pitfalls).

Flow
  1. Fetch and convert
    Use markitdown to convert https://example.com/long-article. Strip nav and footer.
    → Article body in Markdown
  2. Summarize or quote
    Give me the core claim and the strongest evidence cited.
    → Structured summary

Outcome: URL turned into reasoning-grade Markdown.

Pitfalls
  • JS-heavy SPAs return empty — Use a browser-based MCP (browser-act, mcp-chrome) for SPAs, then pipe to markitdown
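One way to notice the empty-SPA case in a script is to check how much visible text came back before deciding to fall back to a browser-based MCP. A heuristic Python sketch; the looks_empty name and the 50-word threshold are assumptions to tune, not anything MarkItDown defines:

```python
import re


def looks_empty(markdown, min_words=50):
    """Return True when the converted page has too little visible text to be useful."""
    # Keep link and image text but drop the URL part, so long URLs are not counted as words.
    text = re.sub(r"!?\[([^\]]*)\]\([^)]*\)", r"\1", markdown)
    words = re.findall(r"[A-Za-z0-9']+", text)
    return len(words) < min_words
```

A page that converts to a handful of navigation links would trip this check, while a normal article body would not.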

Batch-convert a folder of mixed Office docs to a knowledge base

👤 Knowledge ops, support teams building internal corpora · ⏱ ~30 min · intermediate

When to use: You have a Dropbox/SharePoint folder of mixed docs and want them all readable.

Flow
  1. Inventory
    List ~/docs/ — group by extension. How many PDFs, DOCXs, PPTXs?
    → Per-extension counts
  2. Convert all
    Convert every doc in ~/docs/ to Markdown into ~/docs-md/. Preserve folder structure.
    → Mirror tree with .md files
  3. Index for retrieval
    Now give me a single index.md listing each doc's title and 2-line summary.
    → Knowledge-base index file

Outcome: Mixed-format folder turned into a homogeneous Markdown corpus.

Combine with: filesystem
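The inventory and mirror-tree steps come down to path arithmetic. A Python sketch of just that part, with the conversion itself left out; the helper names and sample paths are made up for illustration:

```python
from collections import Counter
from pathlib import Path


def count_by_extension(paths):
    """Group files by lowercase extension, e.g. {".pdf": 2, ".docx": 1}."""
    return Counter(Path(p).suffix.lower() or "(none)" for p in paths)


def mirror_path(src, src_root, dst_root):
    """Map a source file to the same relative location under dst_root, with .md appended."""
    relative = Path(src).relative_to(src_root)
    # Append rather than replace the suffix so report.pdf and report.docx do not collide.
    return Path(dst_root) / relative.with_name(relative.name + ".md")


files = ["docs/a/report.PDF", "docs/a/notes.docx", "docs/b/deck.pptx", "docs/b/q3.pdf"]
print(count_by_extension(files))
for f in files:
    print(f, "->", mirror_path(f, "docs", "docs-md"))
```

Appending .md keeps the original extension visible in the output name, which also makes it easy to trace each Markdown file back to its source.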

Combinations

Pair with other MCPs for 10x leverage

markitdown-mcp + filesystem

Read source files then convert in batch

List ~/inbox/, convert each via markitdown, save to ~/processed/.

markitdown-mcp + office-word-mcp

Convert Word doc to Markdown for editing then convert back

MarkItDown the .docx → edit the .md → use office-word-mcp to write a new .docx with the edits.

Tools

What this MCP exposes

Tool | Inputs | When to call | Cost
convert_to_markdown | uri (file:// or http(s)://) | Any document you want as text | Free (optional LLM hooks are billed separately)
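Because the tool takes a URI rather than a bare path, a local file has to be expressed as file://. A short Python sketch of that normalization using the standard library; to_uri is a hypothetical helper, and the pass-through schemes are the ones named on this page:

```python
from pathlib import Path

PASS_THROUGH = ("http://", "https://", "file://")


def to_uri(target):
    """Return URLs unchanged and turn local paths into absolute file:// URIs."""
    if target.lower().startswith(PASS_THROUGH):
        return target
    # expanduser handles "~", resolve makes the path absolute, as_uri percent-encodes it.
    return Path(target).expanduser().resolve().as_uri()


print(to_uri("https://example.com/long-article"))
print(to_uri("~/Downloads/whitepaper.pdf"))
```

Spaces and other special characters in file names are percent-encoded by as_uri.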

Cost & Limits

What this costs to run

API quota: N/A (runs locally)
Tokens per call: Variable; large PDFs can yield tens of thousands of Markdown lines
Monetary: Free (MIT). Optional LLM/Whisper hooks billed separately.
Tip: Skip large appendices by converting page ranges, if the SDK supports it for your file type
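When you script around the converter, you can get a rough feel for token cost and trim an appendix after conversion. A Python sketch under two loud assumptions: about 4 characters per token is only a rule of thumb, and the appendix is introduced by a #-style heading. Both helper names are hypothetical:

```python
import math
import re


def estimate_tokens(markdown, chars_per_token=4.0):
    """Very rough token estimate; real tokenizers will differ."""
    return math.ceil(len(markdown) / chars_per_token)


def drop_from_heading(markdown, title_pattern=r"appendi(x|ces)"):
    """Cut the document at the first heading whose title starts with title_pattern."""
    heading = re.compile(
        r"^#{1,6}[ \t]+(" + title_pattern + r")\b.*$",
        re.IGNORECASE | re.MULTILINE,
    )
    match = heading.search(markdown)
    if match is None:
        return markdown
    return markdown[: match.start()].rstrip() + "\n"


doc = "# Report\n\nBody text.\n\n## Appendix A\n\nLong tables.\n"
trimmed = drop_from_heading(doc)
print(estimate_tokens(doc), estimate_tokens(trimmed))
```

This trims text after conversion; it does not reduce the work the converter itself does on the file.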

Security

Permissions, secrets, blast radius

Minimum scopes: filesystem-read, outbound:url-fetch
Credential storage: None by default; LLM hooks need their own keys
Data egress: URLs you ask it to fetch; LLM endpoints if hooks enabled

Troubleshooting

Common errors and fixes

ImportError on rare format

MarkItDown ships some format parsers as optional extras. Run pip install 'markitdown[all]' to include them all, such as YouTube transcription and Azure Document Intelligence.

Encoding errors on legacy .doc files

Re-save in Office as DOCX before converting; .doc support is best-effort

Tables look squashed

MarkItDown preserves table structure, but Claude may need an explicit prompt to render it. Ask it to output the table as HTML.

Alternatives

MarkItDown MCP vs others

Alternative | When to use it instead | Tradeoff
Docling / Unstructured | You need PDF layout fidelity for complex scientific docs | Heavier deps; more accurate on tables
kreuzberg | Pure text extraction with OCR on scanned docs | Different optimizing target

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues