Semble

by MinishLab · MinishLab/semble

Code search that costs ~2% of grep+read in tokens — semantic + lexical, fully local, sub-second on million-line repos.

Semble is a code-search MCP server from Minish Lab. It indexes a repo with hybrid sparse+dense embeddings and returns semantic search results that are tighter than grep output, so Claude does not need to dump whole files. Token usage is far lower than with the standard 'grep then read' pattern, especially on large monorepos.

Install

Pick your client

Claude Desktop · macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "semble-mcp": {
      "command": "uvx",
      "args": [
        "--from",
        "semble[mcp]",
        "semble"
      ]
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.
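
If the config file already lists other servers, pasting the block above over it would drop them. The sketch below merges the semble-mcp entry into an existing mcpServers map instead. The helper name add_semble_server is made up for illustration, and it assumes the file is plain JSON with the mcpServers shape shown above.

```python
import json
from pathlib import Path

# Server entry copied from the config shown on this page.
SEMBLE_ENTRY = {"command": "uvx", "args": ["--from", "semble[mcp]", "semble"]}


def add_semble_server(config_path: Path) -> dict:
    """Add the semble-mcp entry to a config file, keeping any other servers.

    Hypothetical helper: assumes a JSON file with a top-level "mcpServers" map.
    """
    text = config_path.read_text() if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    # Create the map if it is missing, then add or overwrite only our entry.
    config.setdefault("mcpServers", {})["semble-mcp"] = SEMBLE_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Call it with the config path for your platform, then restart the client as described above.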

Cursor · global: ~/.cursor/mcp.json · project: .cursor/mcp.json
{
  "mcpServers": {
    "semble-mcp": {
      "command": "uvx",
      "args": [
        "--from",
        "semble[mcp]",
        "semble"
      ]
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. The project-level .cursor/mcp.json takes precedence over the global ~/.cursor/mcp.json.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "semble-mcp": {
      "command": "uvx",
      "args": [
        "--from",
        "semble[mcp]",
        "semble"
      ]
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

Windsurf · ~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "semble-mcp": {
      "command": "uvx",
      "args": [
        "--from",
        "semble[mcp]",
        "semble"
      ]
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

Continue · ~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "semble-mcp",
      "command": "uvx",
      "args": [
        "--from",
        "semble[mcp]",
        "semble"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.
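
To make the difference concrete, here is a small sketch that converts the map shape used by the other clients into the array shape shown above for Continue. The function name is hypothetical; it only reshapes the dictionary and does not validate anything against Continue itself.

```python
def servers_map_to_array(config: dict) -> dict:
    """Convert {"mcpServers": {name: spec}} into {"mcpServers": [{"name": name, **spec}]}.

    Illustrative only: mirrors the two shapes shown on this page.
    """
    servers = config.get("mcpServers", {})
    return {"mcpServers": [{"name": name, **spec} for name, spec in servers.items()]}


# The map form used by Claude Desktop, Cursor, Cline, and Windsurf above.
map_config = {
    "mcpServers": {
        "semble-mcp": {"command": "uvx", "args": ["--from", "semble[mcp]", "semble"]}
    }
}
array_config = servers_map_to_array(map_config)
```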

Zed · ~/.config/zed/settings.json
{
  "context_servers": {
    "semble-mcp": {
      "command": {
        "path": "uvx",
        "args": [
          "--from",
          "semble[mcp]",
          "semble"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

Claude Code (CLI)
claude mcp add semble-mcp -- uvx --from "semble[mcp]" semble

One-liner. The quotes keep zsh from treating [mcp] as a glob pattern. Verify with claude mcp list. Remove with claude mcp remove semble-mcp.

Use Cases

Real-world ways to use Semble

Find where a concept is implemented across a large repo

👤 Engineers exploring big codebases ⏱ ~5 min beginner

When to use: You need 'where does session validation happen?' answered in one shot.

Prerequisites
  • Repo indexed: run semble index . (the first run takes a few minutes on big repos)
Flow
  1. Search
    Use semble. Find where session token validation logic lives. Return file:line ranges.
    → 3-5 hits with concrete file:line ranges
  2. Read tight
    Read just those line ranges (not whole files). Summarize the validation flow.
    → Concise flow summary; small token footprint

Outcome: Question answered with a fraction of the tokens grep+read would burn.
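
The "read tight" step can be sketched as follows. This is an illustration of the pattern, not Semble's or the filesystem MCP's own code: the hit format path:start-end and both function names are assumptions made for the example.

```python
from itertools import islice


def parse_hit(hit: str) -> tuple[str, int, int]:
    """Parse a hypothetical 'path:start-end' (or 'path:line') hit string."""
    path, _, span = hit.rpartition(":")
    start, _, end = span.partition("-")
    return path, int(start), int(end or start)


def read_line_range(path: str, start: int, end: int) -> str:
    """Return lines start..end (1-indexed, inclusive) without reading the rest of the file."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    with open(path, encoding="utf-8", errors="replace") as f:
        # islice stops consuming the file once line `end` has been read.
        return "".join(islice(f, start - 1, end))
```

Only the requested lines end up in the model's context, which is where the token saving comes from.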

Pitfalls
  • Stale index after big rebases: run semble reindex or set up a file-watcher hook
Combine with: filesystem
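
One cheap alternative to a full file watcher is a staleness check you run after a rebase or from a git hook. The sketch below compares file modification times against a stamp file and only then shells out. The semble reindex command is taken from this page and its exact arguments are not verified here (check semble --help); the stamp file and function names are hypothetical.

```python
import subprocess
from pathlib import Path


def changed_since(root: Path, last_indexed: float, suffixes=(".py",)) -> list[Path]:
    """List source files under root modified after the given timestamp."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in suffixes and p.stat().st_mtime > last_indexed
    )


def reindex_if_stale(root: Path, stamp: Path) -> bool:
    """Run a reindex only when something changed since the stamp file was touched."""
    last = stamp.stat().st_mtime if stamp.exists() else 0.0
    changed = changed_since(root, last)
    if changed:
        # Assumption: the CLI accepts a path here, like the `reindex path` tool does.
        subprocess.run(["semble", "reindex", str(root)], check=True)
        stamp.touch()
    return bool(changed)
```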

Audit how a coding pattern is used across a repo

👤 Refactoring teams ⏱ ~15 min intermediate

When to use: You're about to deprecate a helper and want all its uses + variants.

Flow
  1. Lexical pass
    Find all calls to legacy_token_check.
    → Exact-match list
  2. Semantic pass
    Find functions that do the same job under a different name.
    → Look-alike implementations flagged

Outcome: Refactor scope is fully known before starting.

Pitfalls
  • False positives on semantic pass — Verify each candidate; semble gives a similarity score to triage
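
A minimal sketch of the triage step, assuming each hit carries a file path, a line number, and a similarity score between 0 and 1. The Hit shape and the 0.6 threshold are illustrative choices, not Semble's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    path: str
    line: int
    score: float  # hypothetical similarity score in [0, 1]


def triage(lexical: list[Hit], semantic: list[Hit], min_score: float = 0.6) -> list[Hit]:
    """Return semantic-only candidates worth a manual look, best first."""
    exact = {(h.path, h.line) for h in lexical}
    candidates = [
        h for h in semantic
        if (h.path, h.line) not in exact and h.score >= min_score
    ]
    return sorted(candidates, key=lambda h: h.score, reverse=True)
```

Hits already found by the lexical pass are dropped, so what remains is the list of look-alikes that still need human verification.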

Combinations

Pair with other MCPs for 10x leverage

semble-mcp + filesystem

Search → tight read pattern (only the ranges semble returns)

Use semble to find candidates, then filesystem to read just the matched ranges.

semble-mcp + github

Build a refactor PR with all call sites updated

Find all call sites with semble, generate a PR that updates each.

Tools

What this MCP exposes

| Tool | Inputs | When to call | Cost |
|---|---|---|---|
| search | query, k?, mode? | Default; find by concept | free |
| lexical_search | pattern, k? | Exact tokens / regex | free |
| index | path | First run / after big restructure | free |
| reindex | path | Catch up after edits | free |

Cost & Limits

What this costs to run

API quota: None (local)
Tokens per call: Tiny, usually 100–500
Monetary: Free
Tip: Index the repo once; queries are cheap. Token savings compound.

Security

Permissions, secrets, blast radius

Minimum scopes: Local file read for indexing
Credential storage: None
Data egress: None — embeddings stay local

Troubleshooting

Common errors and fixes

Indexing OOM on huge repos

Lower batch size in config or index per-package
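
Indexing per package could look like the sketch below. It treats any directory holding a common manifest file as a package root and indexes each one in its own process. The marker list and function names are assumptions, the semble index command is taken from this page, and whether separately built indexes can be searched together is not covered here.

```python
import subprocess
from pathlib import Path

# Manifest files treated as "this directory is a package" (an illustrative choice).
MARKERS = ("pyproject.toml", "package.json", "Cargo.toml", "go.mod")


def find_package_roots(repo: Path, markers=MARKERS) -> list[Path]:
    """Return sub-directories of repo that contain a manifest, skipping node_modules."""
    roots = {
        p.parent for p in repo.rglob("*")
        if p.name in markers and "node_modules" not in p.relative_to(repo).parts
    }
    roots.discard(repo)  # the repo root itself is the case that ran out of memory
    return sorted(roots)


def index_per_package(repo: Path) -> None:
    """Index each package separately so one run never has to hold the whole repo."""
    for pkg in find_package_roots(repo):
        subprocess.run(["semble", "index", str(pkg)], check=True)
```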

Search returns nothing relevant

Re-index (the index may have drifted from your working copy), or refine the query

uvx install hangs

Pre-warm with uv pip install "semble[mcp]"

Alternatives

Semble vs others

| Alternative | When to use it instead | Tradeoff |
|---|---|---|
| ripgrep + grep MCP | You only need exact-match search | No semantic recall; higher token cost on results |
| ast-grep / serena | You want syntax-aware structural search | Different model; not embedding-based |

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
