
browser-act skills

by browser-act · browser-act/skills

Browse, scrape, and extract structured data from complex sites — faster and cheaper than driving a full headless browser.

browser-act is a Claude skill for web automation that prioritizes structured extraction over screenshot-and-think loops. Visit pages, navigate flows, and extract typed data with explicit selectors or schema-guided prompts. It works on JS-heavy sites where DDG-style fetches return nothing, yet stays cheaper than a full Playwright MCP for many tasks.
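As a concrete sketch, a schema-guided request lines up with the extract_typed tool listed under Tools below. The URL, selector, and exact object shape here are illustrative assumptions, not a documented API:

// Illustrative request shape for schema-guided extraction (TypeScript).
// The url and wait_for values are hypothetical; the schema mirrors the
// items[]{title, price, available} style used in the use cases below.
const request = {
  url: "https://example.com/products",
  wait_for: ".product-grid",
  schema: {
    items: [{ title: "string", price: "number", available: "boolean" }],
  },
};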


Install

Pick your client

~/Library/Application Support/Claude/claude_desktop_config.json  · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "browser-act-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/browser-act/skills",
        "~/.claude/skills/browser-act"
      ],
      "_inferred": true
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.

~/.cursor/mcp.json · .cursor/mcp.json
{
  "mcpServers": {
    "browser-act-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/browser-act/skills",
        "~/.claude/skills/browser-act"
      ],
      "_inferred": true
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "browser-act-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/browser-act/skills",
        "~/.claude/skills/browser-act"
      ],
      "_inferred": true
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "browser-act-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/browser-act/skills",
        "~/.claude/skills/browser-act"
      ],
      "_inferred": true
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "browser-act-skill",
      "command": "git",
      "args": [
        "clone",
        "https://github.com/browser-act/skills",
        "~/.claude/skills/browser-act"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json
{
  "context_servers": {
    "browser-act-skill": {
      "command": {
        "path": "git",
        "args": [
          "clone",
          "https://github.com/browser-act/skills",
          "~/.claude/skills/browser-act"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add browser-act-skill -- git clone https://github.com/browser-act/skills ~/.claude/skills/browser-act

One-liner. Verify with claude mcp list. Remove with claude mcp remove browser-act-skill.

Use Cases

Real-world ways to use browser-act skills

Extract a typed list from a JS-heavy SPA

👤 Devs needing one-off data from sites without APIs · ⏱ ~25 min · intermediate

When to use: Site renders the data you need only after JS executes; plain fetch returns nothing.

Prerequisites
  • Skill installed — git clone https://github.com/browser-act/skills ~/.claude/skills/browser-act
Flow
  1. Define schema
Use browser-act. Schema: items[]{title:str, price:number, available:bool}.
    → Schema accepted
  2. Visit + extract
Open <url>; wait for the product grid; extract matching the schema.
    → Typed JSON list
  3. Verify outliers
Spot-check 3 random rows by re-fetching their detail page; check parsing is correct.
    → Spot-checks pass; or you find a parser bug to fix

Outcome: Reliable typed data; no manual click-through. A sketch of the typed result is shown below.

Pitfalls
  • Site bot-detection blocks after 50 requests — Lower concurrency; rotate UA; respect robots.txt or skip the task
Combine with: filesystem
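Under the schema from step 1, a successful run comes back as typed rows along these lines (a sketch; the values are invented for illustration):

// Hypothetical result matching items[]{title:str, price:number, available:bool}
const result = {
  items: [
    { title: "Widget A", price: 19.99, available: true },
    { title: "Widget B", price: 24.5, available: false },
  ],
};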

Walk a multi-step form to reach data behind it

👤 Devs scraping data behind login or wizards · ⏱ ~40 min · intermediate

When to use: Public dataset hidden behind a 'select country → select year → click view' flow.

Flow
  1. Plan the flow
Use browser-act. Steps: pick country=US, year=2025, click 'View'. Then extract the table.
    → Flow plan accepted
  2. Execute
Run the flow for 50 country/year combinations.
    → Typed rows for all 50
  3. Persist
Write each combo to /data/<country>-<year>.json.
    → Files in /data/

Outcome: Bulk data behind clicky UIs without manual labor. A flow sketch is shown below.

Pitfalls
  • Flow breaks when the site adds a step or relabels a button — Skill notices and pauses; you re-record the flow once, not 50 times
Combine with: filesystem
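One way the step plan from this flow might be written down before looping over the 50 combinations. The action names and step shape are assumptions, since the skill's actual flow format isn't shown here; only the run_flow inputs (steps[], schema?) come from the Tools table:

// Hypothetical flow definition for run_flow; action names are illustrative.
const flow = {
  steps: [
    { action: "select", target: "country", value: "US" },
    { action: "select", target: "year", value: "2025" },
    { action: "click", target: "View" },
  ],
  schema: { rows: [{ metric: "string", value: "number" }] },
};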

Monitor a page for change and alert

👤 Anyone watching a status page / availability tracker · ⏱ ~15 min · beginner

When to use: You want to know when a slot opens, a price drops, a doc updates.

Flow
  1. Define the watch
Use browser-act. Watch <url> selector '.availability-banner' every 10 minutes. Alert if text changes.
    → Watch active
  2. Define alert path
Alert via: write to ~/inbox/alerts.txt + notify webhook https://<my-webhook>.
    → On change, both fire

Outcome: Hands-off monitoring of a specific signal. A watch sketch is shown below.

Pitfalls
  • Watching too aggressively gets you blocked — Stick to ≥5-min intervals on most sites; respect 429s
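Putting the two steps together, the watch maps onto the watch tool's inputs (url, selector, interval, action). The interval and action shapes below are assumptions for illustration:

// Sketch of a watch definition; the 10-minute interval follows the
// pitfall above. The action payload shape is hypothetical.
const watchDef = {
  url: "https://example.com/status",
  selector: ".availability-banner",
  interval: "10m",
  action: {
    appendFile: "~/inbox/alerts.txt",
    webhook: "https://<my-webhook>",
  },
};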

Combinations

Pair with other MCPs for 10× leverage

browser-act-skill + filesystem

Persist scraped data into structured paths

Save extraction outputs to /data/<source>/<date>.json with provenance metadata.
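What a persisted file might contain, assuming a simple provenance envelope; the metadata field names here are suggestions, not a defined format:

// Hypothetical contents of a /data/<source>/<date>.json file
const saved = {
  source: "https://example.com/products",
  scraped_at: "2025-01-15T09:30:00Z",
  tool: "extract_typed",
  data: [{ title: "Widget A", price: 19.99, available: true }],
};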
browser-act-skill + duckduckgo-mcp

Find pages first via search, then extract structured

Search via duckduckgo-mcp for the data source; pass the URL into browser-act for typed extraction.

Tools

What this MCP exposes

Tool · Inputs · When to call · Cost
extract_typed · url, schema, wait_for? · Pull structured data from a page · Browser run + LLM tokens
run_flow · steps[], schema? · Multi-step navigation · Multi-step browser cost
watch · url, selector, interval, action · Long-running change detection · Per-poll cost
screenshot · url, full_page? · Visual debugging · Browser run
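As a sketch of how these tools compose, assume a call helper standing in for whatever invocation syntax your MCP client uses; only the tool names and inputs come from the table above:

// Stand-in for an MCP tool invocation; not a real client API.
declare function call(tool: string, input: Record<string, unknown>): Promise<unknown>;

// Typical sequence: try a typed extraction first, then fall back to a
// screenshot for visual debugging if the result comes back empty.
const rows = await call("extract_typed", {
  url: "https://example.com/products",
  schema: { items: [{ title: "string", price: "number" }] },
  wait_for: ".product-grid",
});
if (!rows) {
  await call("screenshot", { url: "https://example.com/products", full_page: true });
}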

Cost & Limits

What this costs to run

API quota: Depends on provider; some flows are free with a bundled browser
Tokens per call: Schema-guided extraction is cheaper than raw screenshot-and-think
Monetary: Free skill; LLM tokens for extraction
Tip: Always pass a schema; undirected extraction wastes tokens on noise

Security

Permissions, secrets, blast radius

Minimum scopes: Outbound HTTPS
Credential storage: If a site requires login, pass secrets via env vars; rotate them after one-off scrapes
Data egress: Target sites + LLM provider
Never grant: Persistent login tokens stored in the skill's workspace

Troubleshooting

Common errors and fixes

Extraction returns empty

Wait longer for the selector; the site may load data after a delay or via an XHR triggered by a click.
Verify: use the screenshot tool to check the page state.

Site bot-detects

Lower concurrency; rotate the UA; consider whether the site permits scraping at all.

Schema mismatch

Loosen types (string vs number); the site might use formatting that breaks strict types
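For example, if prices render as "$1,299", a strict number type will fail; capture the raw string and parse it afterwards. An illustrative loosening, not skill-specific syntax:

// Strict schema: breaks on formatted values like "$1,299"
const strict = { items: [{ price: "number" }] };

// Loosened: capture the raw text, then parse in post-processing
const loose = { items: [{ price: "string" }] };
const parsePrice = (raw: string): number => Number(raw.replace(/[^0-9.]/g, ""));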

Watch fires repeatedly on cosmetic changes

Pin the selector tighter, or switch to a text-based diff instead of an HTML diff

Alternatives

browser-act skills vs others

Alternative · When to use it instead · Tradeoff
Playwright MCP / chrome-devtools-mcp · You need full browser control with all DevTools features · Heavier; more expensive per call
Firecrawl MCP · Site-wide crawling, not per-page typed extraction · Different shape; paid for serious volumes
duckduckgo-mcp fetch_content · Page is plain HTML; no JS needed · Won't work on SPAs

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
