WebClaw

by 0xMassi · 0xMassi/webclaw

Rust-fast local scraping and structured extraction — pull a page, get cleaned markdown + structured fields without sending the URL to a third party.

WebClaw is a local-first web-content extractor: it scrapes, crawls, applies readability parsing, and extracts structured fields, all from a single Rust binary. Use it when you don't want pages going through a SaaS scraper.

Install

Pick your client

Claude Desktop · macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "webclaw-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "webclaw-mcp"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.

~/.cursor/mcp.json (global) · .cursor/mcp.json (project)
{
  "mcpServers": {
    "webclaw-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "webclaw-mcp"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.
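The precedence rule can be pictured as an override keyed by server name. A minimal sketch, assuming a project entry replaces a global entry of the same name wholesale rather than being deep-merged with it; Cursor's exact merge behavior is not confirmed here, and `merge_mcp_configs` and the project-level command are hypothetical:

```python
import json

def merge_mcp_configs(global_cfg: dict, project_cfg: dict) -> dict:
    """Combine two mcpServers maps; project entries override global ones by name."""
    merged = dict(global_cfg.get("mcpServers", {}))
    # Assumption: a same-named project entry replaces the global entry entirely.
    merged.update(project_cfg.get("mcpServers", {}))
    return {"mcpServers": merged}

global_cfg = {
    "mcpServers": {"webclaw-mcp": {"command": "npx", "args": ["-y", "webclaw-mcp"]}}
}
# Hypothetical project override that points at a locally installed binary.
project_cfg = {
    "mcpServers": {"webclaw-mcp": {"command": "webclaw-mcp", "args": []}}
}

effective = merge_mcp_configs(global_cfg, project_cfg)
print(json.dumps(effective, indent=2))
```

Under this assumption the effective entry for webclaw-mcp is the project one; servers defined only in the global file stay available.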

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "webclaw-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "webclaw-mcp"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "webclaw-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "webclaw-mcp"
      ]
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "webclaw-mcp",
      "command": "npx",
      "args": [
        "-y",
        "webclaw-mcp"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.
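If you already have the map form used by the other clients, the array form can be derived mechanically. A small sketch of that conversion, based only on the two shapes shown on this page; `map_to_continue_array` is a hypothetical helper name:

```python
import json

def map_to_continue_array(config: dict) -> dict:
    """Turn {"mcpServers": {name: spec}} into {"mcpServers": [{"name": name, ...spec}]}."""
    servers = config.get("mcpServers", {})
    return {"mcpServers": [{"name": name, **spec} for name, spec in servers.items()]}

claude_style = {
    "mcpServers": {"webclaw-mcp": {"command": "npx", "args": ["-y", "webclaw-mcp"]}}
}

continue_style = map_to_continue_array(claude_style)
print(json.dumps(continue_style, indent=2))
```

The server name moves from the map key into a "name" field; command and args carry over unchanged.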

~/.config/zed/settings.json
{
  "context_servers": {
    "webclaw-mcp": {
      "command": {
        "path": "npx",
        "args": [
          "-y",
          "webclaw-mcp"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add webclaw-mcp -- npx -y webclaw-mcp

One-liner for the Claude Code CLI. Verify with claude mcp list. Remove with claude mcp remove webclaw-mcp.

Use Cases

Real-world ways to use WebClaw

Scrape a doc site without leaking URLs to a SaaS

Audience: Developers · Time: ~15 min · Level: intermediate

When to use: NDA or compliance limits SaaS scrapers.

Flow
  1. Run
    webclaw fetch https://internal-docs.corp/x
    → Markdown returned
  2. Iterate
    Claude refines extraction
    → Clean text

Outcome: Page content available to Claude without third-party scraper.

Combinations

Pair with other MCPs for 10x leverage

webclaw-mcp + filesystem

Save extracted markdown for downstream RAG
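In an agent session the filesystem MCP would do the writing. As a rough standalone sketch of the same idea, the snippet below stores a markdown string under a stable, URL-derived filename so a later indexing step can find and cite it. The function name, filename scheme, and source-comment header are illustrative assumptions, not part of WebClaw:

```python
import hashlib
import re
import tempfile
from pathlib import Path
from urllib.parse import urlparse

def save_markdown(url: str, markdown: str, out_dir: Path) -> Path:
    """Write extracted markdown to out_dir under a filename derived from the URL."""
    host = re.sub(r"[^A-Za-z0-9.-]", "_", urlparse(url).netloc)  # filesystem-safe host
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:12]  # same URL, same file
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{host}-{digest}.md"
    # Keep the source URL on the first line so an indexer can attribute chunks.
    path.write_text(f"<!-- source: {url} -->\n{markdown}", encoding="utf-8")
    return path

with tempfile.TemporaryDirectory() as tmp:
    saved = save_markdown("https://internal-docs.corp/x", "# Title\n\nBody text.", Path(tmp))
    saved_name = saved.name
    saved_text = saved.read_text(encoding="utf-8")
    print(saved_name)
```

Hashing the full URL keeps filenames stable across runs, so re-fetching a page overwrites its previous copy instead of piling up duplicates.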


Tools

What this MCP exposes

Tool     Inputs      When to call                                 Cost
fetch    (see docs)  Pull a URL as cleaned markdown               1 call
extract  (see docs)  Apply a schema to extract structured fields  1 call
crawl    (see docs)  Walk a site with depth limits                1 call
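MCP clients invoke these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for the fetch tool. The envelope follows the MCP specification, but the `url` argument name is an assumption, since the inputs are not listed here; check the tool's published input schema:

```python
import json

def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# "url" is a hypothetical argument name, used for illustration only.
request = tools_call_request(1, "fetch", {"url": "https://internal-docs.corp/x"})
print(request)
```

In normal use the client (Claude Desktop, Cursor, and so on) constructs this message for you; the sketch only shows what travels over the wire.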

Cost & Limits

What this costs to run

API quota: Local CPU
Tokens per call: Page-sized
Monetary: Free OSS
Tip: Use --readability; raw HTML burns tokens
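To see why raw HTML is costly, the sketch below compares a raw HTML string with its visible text. It uses a plain tag stripper from the Python standard library, not WebClaw's readability mode, and the four-characters-per-token figure is a crude assumption; real tokenizers differ:

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Collect visible text, skipping the bodies of <script> and <style>."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())

def rough_tokens(text: str) -> int:
    # Crude heuristic: about 4 characters per token.
    return max(1, len(text) // 4)

html = (
    "<html><head><style>body{margin:0}</style><script>var a=1;</script></head>"
    '<body><nav><a href="/">Home</a></nav><article><h1>Title</h1>'
    "<p>Body text.</p></article></body></html>"
)
parser = TextOnly()
parser.feed(html)
parser.close()
text = " ".join(parser.parts)
print(rough_tokens(html), rough_tokens(text))
```

Even on this tiny page the markup dominates the payload; on real pages with scripts, styles, and navigation the gap is usually much larger.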

Security

Permissions, secrets, blast radius

Credential storage: None
Data egress: Wherever you fetch
Never grant: permission for the agent to scrape paywalled or login-required content

Troubleshooting

Common errors and fixes

Blocked by site

If robots.txt is the cause, try --respect-robots false, but only on sites you own.
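Before changing any setting, it can help to confirm that robots.txt is what blocks the request. A minimal sketch using Python's standard `urllib.robotparser` on robots.txt text you have already downloaded; the `webclaw` user-agent string and the example rules are assumptions for illustration:

```python
from urllib.robotparser import RobotFileParser

def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Example rules; a real check would use the target site's own robots.txt.
robots_txt = "User-agent: *\nDisallow: /private/\n"

blocked = allowed_by_robots(robots_txt, "webclaw", "https://example.com/private/page")
open_page = allowed_by_robots(robots_txt, "webclaw", "https://example.com/docs/page")
print(blocked, open_page)
```

If the check passes and the fetch still fails, the block is coming from somewhere else (rate limiting, bot detection, authentication) and the robots setting will not help.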

Alternatives

WebClaw vs others

Alternative    When to use it instead  Tradeoff
firecrawl-mcp  You want managed SaaS   Pages go through Firecrawl

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
