WebClaw

Name: WebClaw MCP Server
Author: 0xMassi

Rust-fast local scraping and structured extraction — pull a page, get cleaned markdown + structured fields without sending the URL to a third party.

WebClaw is a local-first web-content extractor: it scrapes, crawls, applies readability parsing, and extracts structured fields, all in a single Rust binary. Use it when you don't want pages going through a SaaS scraper.

Why use it

Key features

Local Rust binary — no SaaS hop
Readability + markdown output
Structured field extraction with schemas
Crawl with depth limits
Respects robots.txt
User-agent rotation

Install

Pick your client

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "webclaw-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "webclaw-mcp"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.
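If the config file already lists other servers, pasting the block above over it would drop them. The following is a minimal sketch, not part of WebClaw, that merges the entry into an existing mcpServers map; the function name add_server is hypothetical, and the entry is copied verbatim from the config shown above.

```python
import copy
import json
from pathlib import Path

# The server entry exactly as shown in the config above.
WEBCLAW_ENTRY = {"command": "npx", "args": ["-y", "webclaw-mcp"]}


def add_server(config_path: Path, name: str = "webclaw-mcp") -> dict:
    """Merge the WebClaw entry into an mcpServers config, keeping other servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():  # tolerate an empty file
            config = json.loads(text)

    # Create the map if it is missing, then add or overwrite only our entry.
    servers = config.setdefault("mcpServers", {})
    servers[name] = copy.deepcopy(WEBCLAW_ENTRY)

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS this could be called as add_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"). Back up the file first, and restart the client afterwards.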

Global: ~/.cursor/mcp.json · Project: .cursor/mcp.json

{
  "mcpServers": {
    "webclaw-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "webclaw-mcp"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit

{
  "mcpServers": {
    "webclaw-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "webclaw-mcp"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json

{
  "mcpServers": {
    "webclaw-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "webclaw-mcp"
      ]
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json

{
  "mcpServers": [
    {
      "name": "webclaw-mcp",
      "command": "npx",
      "args": [
        "-y",
        "webclaw-mcp"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json

{
  "context_servers": {
    "webclaw-mcp": {
      "command": {
        "path": "npx",
        "args": [
          "-y",
          "webclaw-mcp"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.
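The configs above carry the same command and args in three shapes: a name-keyed mcpServers map (Claude Desktop, Cursor, Cline, Windsurf), an array of named objects (Continue), and a nested command object under context_servers (Zed). As an illustration only, the sketch below derives the latter two from the map form; the helper names to_continue and to_zed are hypothetical, and the output shapes mirror the examples on this page rather than any client's full schema.

```python
def to_continue(servers: dict) -> dict:
    """Map form -> Continue's array of server objects, each carrying its name."""
    return {
        "mcpServers": [
            {"name": name, "command": entry["command"], "args": list(entry["args"])}
            for name, entry in servers.items()
        ]
    }


def to_zed(servers: dict) -> dict:
    """Map form -> Zed's context_servers, with command split into path and args."""
    return {
        "context_servers": {
            name: {"command": {"path": entry["command"], "args": list(entry["args"])}}
            for name, entry in servers.items()
        }
    }


# The map form used by Claude Desktop, Cursor, Cline, and Windsurf above.
servers = {"webclaw-mcp": {"command": "npx", "args": ["-y", "webclaw-mcp"]}}
continue_config = to_continue(servers)
zed_config = to_zed(servers)
```

Keeping one map as the source of truth and generating the other two avoids the shapes drifting apart when a server's arguments change.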

claude mcp add webclaw-mcp -- npx -y webclaw-mcp

One-liner for Claude Code. Verify with claude mcp list; remove with claude mcp remove webclaw-mcp.

Use Cases

Real-world ways to use WebClaw

Scrape a doc site without leaking URLs to a SaaS

👤 Developers ⏱ ~15 min intermediate

When to use: an NDA or compliance policy rules out SaaS scrapers.

Flow

Run

webclaw fetch https://internal-docs.corp/x

→ Markdown returned

Iterate

Claude refines extraction

→ Clean text

Outcome: Page content is available to Claude without a third-party scraper.

Combinations

Pair with other MCPs for 10x leverage

webclaw-mcp + filesystem

Save extracted markdown for downstream RAG

Combine webclaw-mcp with filesystem: save extracted markdown for downstream RAG
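For the save step, a stable, collision-resistant filename per URL makes re-runs overwrite the same file instead of piling up copies. The sketch below is one possible layout, not something WebClaw or the filesystem server prescribes: slug_for and save_markdown are hypothetical helpers, and the markdown argument stands in for whatever text the fetch step returned.

```python
import hashlib
import re
from pathlib import Path
from urllib.parse import urlparse


def slug_for(url: str) -> str:
    """Readable slug from host and path, plus a short hash of the full URL."""
    parsed = urlparse(url)
    raw = f"{parsed.netloc}{parsed.path}".strip("/")
    slug = re.sub(r"[^A-Za-z0-9]+", "-", raw).strip("-").lower() or "page"
    # The hash keeps URLs that differ only in query string from colliding.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:8]
    return f"{slug}-{digest}"


def save_markdown(url: str, markdown: str, out_dir: Path) -> Path:
    """Write the markdown to out_dir, recording the source URL in a comment."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{slug_for(url)}.md"
    path.write_text(f"<!-- source: {url} -->\n\n{markdown}", encoding="utf-8")
    return path
```

The source comment at the top of each file lets a downstream RAG indexer attribute chunks back to the original page.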

Tools

What this MCP exposes

Tool	Inputs	When to call	Cost
fetch	(see docs)	Pull a URL as cleaned markdown	1 call
extract	(see docs)	Apply a schema to extract structured fields	1 call
crawl	(see docs)	Walk a site with depth limits	1 call
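The table lists inputs as "(see docs)", but an MCP server also advertises each tool's input schema through the protocol's tools/list method, so the parameters can be read from the server itself. The sketch below only builds that JSON-RPC request and pulls the schemas out of a response; it does not start WebClaw or perform the initialize handshake that must precede it (an MCP client normally handles both). The helper names are hypothetical.

```python
import json


def tools_list_request(request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request for MCP's tools/list method."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def input_schemas(response_text: str) -> dict:
    """Map each advertised tool name to its inputSchema (empty if absent)."""
    response = json.loads(response_text)
    tools = response.get("result", {}).get("tools", [])
    return {tool["name"]: tool.get("inputSchema", {}) for tool in tools}
```

Passing a real tools/list response from the running server to input_schemas shows what fetch, extract, and crawl actually accept, without guessing at argument names.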

Cost & Limits

What this costs to run

API quota: Local CPU
Tokens per call: Page-sized
Monetary: Free OSS
Tip: Use --readability; raw HTML burns tokens

Security

Permissions, secrets, blast radius

Credential storage: None

Data egress: Wherever you fetch

Never grant: permission to scrape paywalled or login-required content as an agent

Troubleshooting

Common errors and fixes

Blocked by site

If robots.txt is what blocks the fetch, try --respect-robots false, and only on your own sites

Alternatives

WebClaw vs others

Alternative	When to use it instead	Tradeoff
firecrawl-mcp	You want managed SaaS	Pages go through Firecrawl

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
