● Official unstructured-io ⚡ Ready to use

Unstructured UNS MCP

By unstructured-io · unstructured-io/uns-mcp

Parse, chunk, and embed any document with Unstructured's pipeline.

Unstructured's library extracts clean text and structure from messy documents (PDFs, scans, PPTX, emails). This MCP wraps the Unstructured Serverless pipeline so Claude can ingest a folder of raw files and turn them into a queryable RAG corpus.

Installation

Choose your client

~/Library/Application Support/Claude/claude_desktop_config.json  · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "unstructured-uns-mcp": {
      "command": "uvx",
      "args": [
        "uns-mcp"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart the app after saving.
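
If you would rather script the edit than paste it by hand, a minimal Python sketch along these lines could merge the entry into an existing config without overwriting other servers. The helper names (add_mcp_server, update_config_file) and the "other-server" entry are illustrative assumptions, not part of uns-mcp or Claude Desktop.

```python
from __future__ import annotations

import json
from pathlib import Path


def add_mcp_server(config: dict, name: str, command: str, args: list[str],
                   env: dict[str, str] | None = None) -> dict:
    """Return a copy of `config` with one entry added under "mcpServers"."""
    merged = json.loads(json.dumps(config))  # cheap deep copy for JSON data
    entry = {"command": command, "args": list(args)}
    if env:
        # Optional per-server environment variables, e.g. an API key.
        entry["env"] = dict(env)
    merged.setdefault("mcpServers", {})[name] = entry
    return merged


def update_config_file(path: Path, name: str, command: str, args: list[str]) -> None:
    """Read the config file if it exists, add the server, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_mcp_server(config, name, command, args), indent=2))


# "other-server" is a placeholder for whatever is already configured.
existing = {"mcpServers": {"other-server": {"command": "example", "args": []}}}
updated = add_mcp_server(existing, "unstructured-uns-mcp", "uvx", ["uns-mcp"])
print(json.dumps(updated, indent=2))
```

The merge works on a copy, so the original dictionary is left untouched if something fails before the file is written.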

~/.cursor/mcp.json · .cursor/mcp.json
{
  "mcpServers": {
    "unstructured-uns-mcp": {
      "command": "uvx",
      "args": [
        "uns-mcp"
      ]
    }
  }
}

Cursor uses the same mcpServers format as Claude Desktop. Project-level configuration takes precedence over the global one.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "unstructured-uns-mcp": {
      "command": "uvx",
      "args": [
        "uns-mcp"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then choose "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "unstructured-uns-mcp": {
      "command": "uvx",
      "args": [
        "uns-mcp"
      ]
    }
  }
}

Same format as Claude Desktop. Restart Windsurf for the change to take effect.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "unstructured-uns-mcp",
      "command": "uvx",
      "args": [
        "uns-mcp"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json
{
  "context_servers": {
    "unstructured-uns-mcp": {
      "command": {
        "path": "uvx",
        "args": [
          "uns-mcp"
        ]
      }
    }
  }
}

Add this under context_servers. Zed hot-reloads on save.
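
The client snippets above differ only in shape: a name-keyed map, an array of named objects, and Zed's nested command object. As a rough sketch, all three can be generated from one server definition. The function names here are made up for illustration.

```python
import json

# Single source of truth for the server entry shown on this page.
SERVER = {"name": "unstructured-uns-mcp", "command": "uvx", "args": ["uns-mcp"]}


def as_map(server: dict) -> dict:
    """Claude Desktop, Cursor, Cline, Windsurf: servers keyed by name."""
    entry = {"command": server["command"], "args": list(server["args"])}
    return {"mcpServers": {server["name"]: entry}}


def as_array(server: dict) -> dict:
    """Continue: a list of server objects, each carrying its own name."""
    return {"mcpServers": [dict(server)]}


def as_zed(server: dict) -> dict:
    """Zed: context_servers, with the command split into path and args."""
    command = {"path": server["command"], "args": list(server["args"])}
    return {"context_servers": {server["name"]: {"command": command}}}


for render in (as_map, as_array, as_zed):
    print(render.__name__)
    print(json.dumps(render(SERVER), indent=2))
```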

claude mcp add unstructured-uns-mcp -- uvx uns-mcp

One command does it. Verify with claude mcp list; uninstall with claude mcp remove.

Use cases

Hands-on usage: Unstructured UNS MCP

Stand up a RAG pipeline from a SharePoint folder to Pinecone

👤 Data engineers / RAG builders ⏱ ~15 min intermediate

When to use: You have a corporate doc dump and want a Claude-queryable index.

Prerequisites
  • Server/skill installed and authenticated (see the repo README)
Steps
  1. Define the pipeline
    Create an Unstructured pipeline: source SharePoint folder X, chunk by_title with 1024 token max, embed with text-embedding-3-small, target Pinecone index 'corp-docs'.
    → Pipeline id
  2. Run and monitor
    Run it and tell me when it's done. Report any failed documents.
    → Status updates + final summary

Result: Production-grade ingest with proper chunking, not naive PDF text dumps.
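
The "run and monitor" step amounts to polling until the job reaches a terminal state. Here is a minimal sketch, assuming a hypothetical get_status callable, made-up state names ("running", "completed", "failed") and a made-up "failed_documents" field. None of these come from the Unstructured API; check the repo README for the real job status shape.

```python
import time
from typing import Callable


def wait_for_job(get_status: Callable[[], dict], poll_seconds: float = 5.0,
                 timeout_seconds: float = 900.0) -> dict:
    """Poll `get_status` until the job completes, fails, or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status["state"] in ("completed", "failed"):  # hypothetical states
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish in time")
        time.sleep(poll_seconds)


# Fake status source standing in for a real job lookup.
_states = iter([
    {"state": "running", "failed_documents": []},
    {"state": "running", "failed_documents": []},
    {"state": "completed", "failed_documents": ["scan-017.pdf"]},
])
final = wait_for_job(lambda: next(_states), poll_seconds=0.0)
print(final["state"], final["failed_documents"])  # completed ['scan-017.pdf']
```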

Caveats
  • Default chunkers can split tables across chunks. For dense tabular docs, use 'by_title' with combine_text_under_n_chars.
Pairs with: filesystem · qdrant-mcp-server
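
To make the chunking caveat concrete, here is a toy illustration of title-based chunking with a "combine small sections" threshold. It is not Unstructured's implementation: the function name, the (kind, text) element format, and the sample data are all invented, and oversized sections are not split.

```python
def toy_chunk_by_title(elements, max_characters=500, combine_text_under_n_chars=0):
    """Group (kind, text) elements into chunks, starting a new section at each Title."""
    # 1. A Title element closes the current section and opens a new one.
    sections, current = [], []
    for kind, text in elements:
        if kind == "Title" and current:
            sections.append(current)
            current = []
        current.append(text)
    if current:
        sections.append(current)

    # 2. Append a section to the previous chunk while that chunk is still
    #    shorter than the threshold and the result fits in max_characters.
    chunks = []
    for section in sections:
        text = "\n".join(section)
        if (chunks
                and len(chunks[-1]) < combine_text_under_n_chars
                and len(chunks[-1]) + 1 + len(text) <= max_characters):
            chunks[-1] = chunks[-1] + "\n" + text
        else:
            chunks.append(text)
    return chunks


# Invented sample document: a heading, its table, and a short notes section.
elements = [
    ("Title", "Q1 revenue"),
    ("Table", "region | revenue\nEMEA | 10\nAPAC | 12"),
    ("Title", "Notes"),
    ("NarrativeText", "Figures are unaudited."),
]
separate = toy_chunk_by_title(elements)
combined = toy_chunk_by_title(elements, combine_text_under_n_chars=200)
print(len(separate), len(combined))  # 2 1
```

In both runs the table stays in the same chunk as its heading; the threshold only decides whether the short notes section is folded in as well.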

Combinations

Pair with other MCPs for 10x leverage

unstructured-uns-mcp + filesystem

Pair with filesystem for complementary capabilities

Use this server together with filesystem to complete a multi-step task.

unstructured-uns-mcp + qdrant-mcp-server

Pair with qdrant-mcp-server for complementary capabilities

Use this server together with qdrant-mcp-server to complete a multi-step task.

Tools

Capabilities exposed by this MCP

Tool             Input parameters                          When to call                  Cost
create_pipeline  source, dest, partition_args, chunk_args  Define a new ingest job       1 API call
run_pipeline     pipeline_id                               Execute the pipeline          Per Unstructured plan
list_workflows   (none)                                    See all configured workflows  1 API call
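
Under the hood, an MCP client invokes these tools with JSON-RPC 2.0 "tools/call" requests. The sketch below only builds the request messages. The tool and parameter names are taken from the table above, the helper name and the "<pipeline-id>" placeholder are illustrative, and transport and session setup are left out.

```python
import itertools
import json
from typing import Optional

_ids = itertools.count(1)  # JSON-RPC request ids


def tool_call_request(name: str, arguments: Optional[dict] = None) -> dict:
    """Build an MCP tools/call request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


print(json.dumps(tool_call_request("list_workflows")))
# "<pipeline-id>" stands in for the id returned when the pipeline is created.
print(json.dumps(tool_call_request("run_pipeline", {"pipeline_id": "<pipeline-id>"})))
```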

Cost and limits

What it costs to run

API quota: See provider docs for rate limits
Tokens per call: Varies by tool
Pricing: See repo README for pricing details
Tip: Cache tool results and avoid repeated identical calls.
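
One way to follow the caching tip is to memoize results by tool name and arguments. This is a minimal sketch with an invented ToolCache class and a fake backend. It suits read-only tools such as list_workflows, not calls with side effects such as run_pipeline.

```python
import json
from typing import Any, Callable


class ToolCache:
    """Memoize tool results keyed by tool name and JSON-serialized arguments."""

    def __init__(self, call_tool: Callable[[str, dict], Any]):
        self._call_tool = call_tool
        self._results = {}
        self.misses = 0  # number of calls that reached the backend

    def call(self, name: str, arguments: dict) -> Any:
        # sort_keys makes the key independent of argument ordering.
        key = json.dumps([name, arguments], sort_keys=True)
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._call_tool(name, arguments)
        return self._results[key]


def fake_backend(name: str, arguments: dict) -> dict:
    """Stand-in for a real MCP tool call."""
    return {"tool": name, "workflows": []}


cache = ToolCache(fake_backend)
first = cache.call("list_workflows", {})
second = cache.call("list_workflows", {})
print(cache.misses)  # 1
```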

Security

Permissions, secrets, blast radius

Credential storage: Use environment variables; never commit secrets
Data egress: Tool calls go to the provider's API as documented
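
A small fail-fast check keeps credentials in the environment and out of source files. This sketch assumes the two variable names this page gives for the 401 fix (UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL). The function name and the placeholder values are illustrative.

```python
import os
from typing import Mapping


def load_credentials(environ: Mapping[str, str] = os.environ) -> dict:
    """Return the required settings, or raise if any is missing or empty."""
    required = ("UNSTRUCTURED_API_KEY", "UNSTRUCTURED_API_URL")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in required}


# Demo with an explicit mapping; in real use, call load_credentials() with no
# arguments so the values come from the process environment.
demo_env = {"UNSTRUCTURED_API_KEY": "<your-key>", "UNSTRUCTURED_API_URL": "<your-api-url>"}
print(sorted(load_credentials(demo_env)))
```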

Troubleshooting

Common errors and fixes

401 Unauthorized

Get an API key at unstructured.io → Settings → API keys; set UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL.

Verify: list_workflows returns at least one workflow

source connector auth failure

Connector credentials (e.g. a SharePoint app token) are configured per source. Recreate the source in the Unstructured UI to refresh them.

Verify: Run a 1-file test pipeline first

Alternatives

Unstructured UNS MCP versus alternatives

Alternative              When to use it instead                       Trade-off
LlamaIndex / LlamaCloud  You want a managed RAG-as-a-service product  Higher-level but less control over chunking

More

Resources

📖 Read the official README on GitHub

🐙 View open issues