Unstructured UNS MCP

Why use it

Key features

60+ source connectors (S3, Drive, OneDrive, SharePoint, GCS)
Document partitioning into elements (Title, Table, NarrativeText, ListItem)
Built-in chunkers (by_title, basic)
Embeddings via OpenAI, Voyage, or local models
Targets: vector DBs (Pinecone, Weaviate, Qdrant) or warehouses

Installation

Choose your client

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "unstructured-uns-mcp": {
      "command": "uvx",
      "args": [
        "uns-mcp"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Save the file, then restart the app.
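If the config file already lists other servers, editing it by hand risks overwriting them. The following is a minimal Python sketch, not part of uns-mcp, that merges the entry shown above into an existing config; the helper names add_server and update_config_file are hypothetical.

```python
import json
from pathlib import Path


def add_server(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of config with one MCP server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the JSON config at path (if it exists), add uns-mcp, write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_server(existing, "unstructured-uns-mcp", "uvx", ["uns-mcp"])
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

Other servers and unrelated top-level keys are left untouched; only the unstructured-uns-mcp entry is added or replaced.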

~/.cursor/mcp.json · .cursor/mcp.json

{
  "mcpServers": {
    "unstructured-uns-mcp": {
      "command": "uvx",
      "args": [
        "uns-mcp"
      ]
    }
  }
}

Cursor uses the same mcpServers format as Claude Desktop. The project-level config takes precedence over the global one.

VS Code → Cline → MCP Servers → Edit

{
  "mcpServers": {
    "unstructured-uns-mcp": {
      "command": "uvx",
      "args": [
        "uns-mcp"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then choose "Edit Configuration".

~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "unstructured-uns-mcp": {
      "command": "uvx",
      "args": [
        "uns-mcp"
      ]
    }
  }
}

Same format as Claude Desktop. Restart Windsurf for the change to take effect.

~/.continue/config.json

{
  "mcpServers": [
    {
      "name": "unstructured-uns-mcp",
      "command": "uvx",
      "args": [
        "uns-mcp"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.
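To make the difference concrete, here is a small Python sketch that converts the name-keyed mcpServers map used by the other clients into the array shape shown above. The function name map_to_continue is hypothetical.

```python
def map_to_continue(mcp_servers: dict) -> list[dict]:
    """Turn {"name": {...entry...}} into [{"name": "name", ...entry...}]."""
    return [{"name": name, **entry} for name, entry in mcp_servers.items()]
```

Each key of the map becomes the name field of the corresponding array element; the command and args fields carry over unchanged.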

~/.config/zed/settings.json

{
  "context_servers": {
    "unstructured-uns-mcp": {
      "command": {
        "path": "uvx",
        "args": [
          "uns-mcp"
        ]
      }
    }
  }
}

Add the entry under context_servers. Zed hot-reloads the settings on save.
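Zed nests the executable and its arguments one level deeper than the mcpServers format. As an illustration only, this Python sketch rewrites a name-keyed mcpServers map into the context_servers shape shown above; map_to_zed is a hypothetical helper.

```python
def map_to_zed(mcp_servers: dict) -> dict:
    """Wrap each entry's command and args in Zed's nested "command" object."""
    return {
        "context_servers": {
            name: {
                "command": {
                    "path": entry["command"],
                    "args": list(entry.get("args", [])),
                }
            }
            for name, entry in mcp_servers.items()
        }
    }
```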

claude mcp add unstructured-uns-mcp -- uvx uns-mcp

A single command does it. Verify with claude mcp list; uninstall with claude mcp remove.

Use cases

In practice: Unstructured UNS MCP

Stand up a RAG pipeline from a SharePoint folder to Pinecone

👤 Data engineers / RAG builders ⏱ ~15 min intermediate

When to use: You have a corporate doc dump and want a Claude-queryable index.

Prerequisites

Server/skill installed and authenticated (see the repo README)

Steps

Define the pipeline

Create an Unstructured pipeline: source SharePoint folder X, chunk by_title with a 1024-token max, embed with text-embedding-3-small, target Pinecone index 'corp-docs'.

→ Pipeline id

Run and monitor

Run it and tell me when it's done. Report any failed documents.

→ Status updates + final summary

Result: Production-grade ingest with proper chunking, not naive PDF text dumps.
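The "run and monitor" step amounts to polling until the job reaches a terminal state. A minimal Python sketch of that loop follows. It assumes a caller-supplied get_status function and the status names COMPLETED and FAILED, all of which are hypothetical; the source does not document how uns-mcp reports job state.

```python
import time
from typing import Callable


def wait_for_pipeline(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it returns a terminal state or the timeout passes."""
    terminal = {"COMPLETED", "FAILED"}  # hypothetical status names
    waited = 0.0
    while True:
        status = get_status()
        if status in terminal:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"pipeline still {status!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

The sleep function is injectable so the loop can be exercised without real waiting; the 900-second default simply mirrors the ~15 min estimate above.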

Caveats

Default chunkers can split tables across chunks. For dense tabular docs, use 'by_title' with combine_text_under_n_chars.
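To show what these two settings control, here is a toy Python sketch of title-based chunking. It is an illustration of the idea, not Unstructured's implementation: elements are plain (type, text) tuples, and the function name chunk_by_title_sketch is hypothetical. The parameter names max_characters and combine_text_under_n_chars follow the by_title chunker's options.

```python
def chunk_by_title_sketch(elements, max_characters=500, combine_text_under_n_chars=200):
    """Group elements into sections at each Title, then merge short sections."""
    # 1. Every Title element opens a new section; other elements join the
    #    current one, so a table stays with the heading it sits under.
    sections = []
    for kind, text in elements:
        if kind == "Title" or not sections:
            sections.append([text])
        else:
            sections[-1].append(text)
    texts = ["\n".join(parts) for parts in sections]

    # 2. Merge a section into the previous chunk while that chunk is still
    #    short and the merged text fits within max_characters.
    chunks = []
    for text in texts:
        if (
            chunks
            and len(chunks[-1]) < combine_text_under_n_chars
            and len(chunks[-1]) + 1 + len(text) <= max_characters
        ):
            chunks[-1] = chunks[-1] + "\n" + text
        else:
            chunks.append(text)
    return chunks
```

This sketch never splits an oversized section, which a real chunker must do; it only demonstrates how section boundaries and the combine threshold interact.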

Pairs with: filesystem · qdrant-mcp-server

Combinations

Pair with other MCPs for tenfold leverage

unstructured-uns-mcp + filesystem

Pair with filesystem for complementary capabilities

Use this server together with filesystem to complete a multi-step task.

unstructured-uns-mcp + qdrant-mcp-server

Pair with qdrant-mcp-server for complementary capabilities

Use this server together with qdrant-mcp-server to complete a multi-step task.

Tools

What this MCP exposes

Tool	Inputs	When to call	Cost
create_pipeline	source, dest, partition_args, chunk_args	Define a new ingest job	1 API call
run_pipeline	pipeline_id	Execute the pipeline	Per Unstructured plan
list_workflows	(none)	See all configured workflows	1 API call
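MCP clients invoke these tools with JSON-RPC 2.0 messages using the tools/call method. The Python sketch below builds such a request for the tools in the table; the helper name tool_call_request and the pipeline id value are placeholders, and a real client library would normally construct and send these messages for you.

```python
import json
from itertools import count
from typing import Optional

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call_request(tool: str, arguments: Optional[dict] = None) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }
    return json.dumps(message)


# list_workflows takes no inputs; run_pipeline takes the pipeline_id input.
list_request = tool_call_request("list_workflows")
run_request = tool_call_request("run_pipeline", {"pipeline_id": "PIPELINE_ID_PLACEHOLDER"})
```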

Cost and limits

What it costs to run

API quota: See provider docs for rate limits
Tokens per call: Varies by tool
Cost: See repo README for pricing details
Tip: Cache tool results and avoid repeated identical calls.
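One way to follow the caching tip is to memoize results by tool name and arguments. The sketch below is a minimal Python illustration; the class name ToolResultCache and the call_tool callable it wraps are hypothetical stand-ins for whatever client function actually issues the call.

```python
import json
from typing import Any, Callable


class ToolResultCache:
    """Memoize tool results keyed by tool name and canonicalized arguments."""

    def __init__(self, call_tool: Callable[[str, dict], Any]) -> None:
        self._call_tool = call_tool
        self._results: dict[str, Any] = {}

    def call(self, tool: str, arguments: dict) -> Any:
        # sort_keys makes {"a": 1, "b": 2} and {"b": 2, "a": 1} share one key.
        key = tool + ":" + json.dumps(arguments, sort_keys=True)
        if key not in self._results:
            self._results[key] = self._call_tool(tool, arguments)
        return self._results[key]
```

Caching only makes sense for read-only tools such as list_workflows. A call with side effects, such as run_pipeline, should bypass the cache.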

Security

Permissions, secrets, blast radius

Credential storage: Use environment variables; never commit secrets

Data egress: Tool calls go to the provider's API as documented

Troubleshooting

Common errors and fixes

401 Unauthorized

Get an API key at unstructured.io → Settings → API keys; set UNSTRUCTURED_API_KEY and UNSTRUCTURED_API_URL.

Verify: list_workflows returns at least one workflow
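Before chasing a 401 further, it can help to confirm that both variables are actually set in the environment the server is launched from. The following Python sketch performs that preflight check; the function name missing_credentials is hypothetical, and the check says nothing about whether the key itself is valid.

```python
import os
from typing import Mapping

REQUIRED_VARS = ("UNSTRUCTURED_API_KEY", "UNSTRUCTURED_API_URL")


def missing_credentials(environ: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not environ.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        raise SystemExit("missing environment variables: " + ", ".join(missing))
    print("both variables are set")
```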

source connector auth failure

Connector creds (e.g. SharePoint app token) are configured per-source. Recreate the source via Unstructured UI to refresh.

Verify: Run a 1-file test pipeline first

Alternatives