
browser-act skills

by browser-act · browser-act/skills

Browse, scrape, and extract structured data from complex sites — faster and cheaper than driving a full headless browser.

browser-act is a Claude skill for web automation that prioritizes structured extraction over screenshot-and-think loops. Visit pages, navigate flows, and extract typed data with explicit selectors or schema-guided prompts. It works on JS-heavy sites where DDG-style fetches return nothing, yet stays cheaper than a full Playwright MCP for many tasks.
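As a concrete sketch, a schema-guided request lines up with the extract_typed tool listed under Tools below. The URL, selector, and exact object shape here are illustrative assumptions, not a documented API:

// Illustrative request shape for schema-guided extraction (TypeScript).
// The url and wait_for values are hypothetical; the schema mirrors the
// items[]{title, price, available} style used in the use cases below.
const request = {
  url: "https://example.com/products",
  wait_for: ".product-grid",
  schema: {
    items: [{ title: "string", price: "number", available: "boolean" }],
  },
};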


Install

Pick your client

~/Library/Application Support/Claude/claude_desktop_config.json  · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "browser-act-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/browser-act/skills",
        "~/.claude/skills/browser-act"
      ],
      "_inferred": true
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.

~/.cursor/mcp.json · .cursor/mcp.json
{
  "mcpServers": {
    "browser-act-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/browser-act/skills",
        "~/.claude/skills/browser-act"
      ],
      "_inferred": true
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "browser-act-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/browser-act/skills",
        "~/.claude/skills/browser-act"
      ],
      "_inferred": true
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "browser-act-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/browser-act/skills",
        "~/.claude/skills/browser-act"
      ],
      "_inferred": true
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "browser-act-skill",
      "command": "git",
      "args": [
        "clone",
        "https://github.com/browser-act/skills",
        "~/.claude/skills/browser-act"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json
{
  "context_servers": {
    "browser-act-skill": {
      "command": {
        "path": "git",
        "args": [
          "clone",
          "https://github.com/browser-act/skills",
          "~/.claude/skills/browser-act"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add browser-act-skill -- git clone https://github.com/browser-act/skills ~/.claude/skills/browser-act

One-liner. Verify with claude mcp list. Remove with claude mcp remove browser-act-skill.

Use Cases

Real-world ways to use browser-act skills

Extract a typed list from a JS-heavy SPA

👤 Devs needing one-off data from sites without APIs · ⏱ ~25 min · intermediate

When to use: Site renders the data you need only after JS executes; plain fetch returns nothing.

Prerequisites
  • Skill installed — git clone https://github.com/browser-act/skills ~/.claude/skills/browser-act
Flow
  1. Define schema
Use browser-act. Schema: items[]{title:str, price:number, available:bool}.
    → Schema accepted
  2. Visit + extract
Open <url>; wait for the product grid; extract matching the schema.
    → Typed JSON list
  3. Verify outliers
Spot-check 3 random rows by re-fetching their detail page; check parsing is correct.
    → Spot-checks pass; or you find a parser bug to fix

Outcome: Reliable typed data; no manual click-through. A sketch of the typed result is shown below.

Pitfalls
  • Site bot-detection blocks after 50 requests — Lower concurrency; rotate UA; respect robots.txt or skip the task
Combine with: filesystem
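Under the schema from step 1, a successful run comes back as typed rows along these lines (a sketch; the values are invented for illustration):

// Hypothetical result matching items[]{title:str, price:number, available:bool}
const result = {
  items: [
    { title: "Widget A", price: 19.99, available: true },
    { title: "Widget B", price: 24.5, available: false },
  ],
};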

Walk a multi-step form to reach data behind it

👤 Devs scraping data behind login or wizards · ⏱ ~40 min · intermediate

When to use: Public dataset hidden behind a 'select country → select year → click view' flow.

Flow
  1. Plan the flow
Use browser-act. Steps: pick country=US, year=2025, click 'View'. Then extract the table.
    → Flow plan accepted
  2. Execute
Run the flow for 50 country/year combinations.
    → Typed rows for all 50
  3. Persist
Write each combo to /data/<country>-<year>.json.
    → Files in /data/

Outcome: Bulk data behind clicky UIs without manual labor. A flow sketch is shown below.

Pitfalls
  • Flow breaks when the site adds a step or relabels a button — Skill notices and pauses; you re-record the flow once, not 50 times
Combine with: filesystem
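One way the step plan from this flow might be written down before looping over the 50 combinations. The action names and step shape are assumptions, since the skill's actual flow format isn't shown here; only the run_flow inputs (steps[], schema?) come from the Tools table:

// Hypothetical flow definition for run_flow; action names are illustrative.
const flow = {
  steps: [
    { action: "select", target: "country", value: "US" },
    { action: "select", target: "year", value: "2025" },
    { action: "click", target: "View" },
  ],
  schema: { rows: [{ metric: "string", value: "number" }] },
};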

Monitor a page for change and alert

👤 Anyone watching a status page / availability tracker · ⏱ ~15 min · beginner

When to use: You want to know when a slot opens, a price drops, a doc updates.

Flow
  1. Define the watch
Use browser-act. Watch <url> selector '.availability-banner' every 10 minutes. Alert if text changes.
    → Watch active
  2. Define alert path
Alert via: write to ~/inbox/alerts.txt + notify webhook https://<my-webhook>.
    → On change, both fire

Outcome: Hands-off monitoring of a specific signal. A watch sketch is shown below.

Pitfalls
  • Watching too aggressively gets you blocked — Stick to ≥5-min intervals on most sites; respect 429s
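Putting the two steps together, the watch maps onto the watch tool's inputs (url, selector, interval, action). The interval and action shapes below are assumptions for illustration:

// Sketch of a watch definition; the 10-minute interval follows the
// pitfall above. The action payload shape is hypothetical.
const watchDef = {
  url: "https://example.com/status",
  selector: ".availability-banner",
  interval: "10m",
  action: {
    appendFile: "~/inbox/alerts.txt",
    webhook: "https://<my-webhook>",
  },
};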

Combinations

Pair with other MCPs for 10× leverage

browser-act-skill + filesystem

Persist scraped data into structured paths

Save extraction outputs to /data/<source>/<date>.json with provenance metadata.
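What a persisted file might contain, assuming a simple provenance envelope; the metadata field names here are suggestions, not a defined format:

// Hypothetical contents of a /data/<source>/<date>.json file
const saved = {
  source: "https://example.com/products",
  scraped_at: "2025-01-15T09:30:00Z",
  tool: "extract_typed",
  data: [{ title: "Widget A", price: 19.99, available: true }],
};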
browser-act-skill + duckduckgo-mcp

Find pages first via search, then extract structured

Search via duckduckgo-mcp for the data source; pass the URL into browser-act for typed extraction.

Tools

What this MCP exposes

Tool · Inputs · When to call · Cost
extract_typed · url, schema, wait_for? · Pull structured data from a page · Browser run + LLM tokens
run_flow · steps[], schema? · Multi-step navigation · Multi-step browser cost
watch · url, selector, interval, action · Long-running change detection · Per-poll cost
screenshot · url, full_page? · Visual debugging · Browser run
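As a sketch of how these tools compose, assume a call helper standing in for whatever invocation syntax your MCP client uses; only the tool names and inputs come from the table above:

// Stand-in for an MCP tool invocation; not a real client API.
declare function call(tool: string, input: Record<string, unknown>): Promise<unknown>;

// Typical sequence: try a typed extraction first, then fall back to a
// screenshot for visual debugging if the result comes back empty.
const rows = await call("extract_typed", {
  url: "https://example.com/products",
  schema: { items: [{ title: "string", price: "number" }] },
  wait_for: ".product-grid",
});
if (!rows) {
  await call("screenshot", { url: "https://example.com/products", full_page: true });
}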

Cost & Limits

What this costs to run

API quota: Depends on provider; some flows are free with a bundled browser
Tokens per call: Schema-guided extraction is cheaper than raw screenshot-and-think
Monetary: Free skill; LLM tokens for extraction
Tip: Always pass a schema; undirected extraction wastes tokens on noise

Security

Permissions, secrets, blast radius

Minimum scopes: Outbound HTTPS
Credential storage: If a site requires login, pass secrets via env vars; rotate them after one-off scrapes
Data egress: Target sites + LLM provider
Never grant: Persistent login tokens stored in the skill's workspace

Troubleshooting

Common errors and fixes

Extraction returns empty

Wait longer for the selector; the site may load data after a delay or via an XHR triggered by a click.
Verify: use the screenshot tool to check the page state.

Site bot-detects

Lower concurrency; rotate the UA; consider whether the site permits scraping at all.

Schema mismatch

Loosen types (string vs number); the site might use formatting that breaks strict types
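For example, if prices render as "$1,299", a strict number type will fail; capture the raw string and parse it afterwards. An illustrative loosening, not skill-specific syntax:

// Strict schema: breaks on formatted values like "$1,299"
const strict = { items: [{ price: "number" }] };

// Loosened: capture the raw text, then parse in post-processing
const loose = { items: [{ price: "string" }] };
const parsePrice = (raw: string): number => Number(raw.replace(/[^0-9.]/g, ""));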

Watch fires repeatedly on cosmetic changes

Pin the selector tighter, or switch to a text-based diff instead of an HTML diff

Alternatives

browser-act skills vs others

Alternative · When to use it instead · Tradeoff
Playwright MCP / chrome-devtools-mcp · You need full browser control with all DevTools features · Heavier; more expensive per call
Firecrawl MCP · Site-wide crawling, not per-page typed extraction · Different shape; paid for serious volumes
duckduckgo-mcp fetch_content · Page is plain HTML; no JS needed · Won't work on SPAs

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
