
ARIS (Auto-Research-In-Sleep)

by wanshuiyin · wanshuiyin/Auto-claude-code-research-in-sleep

Skills that turn an LLM coding agent into an autonomous ML researcher — idea discovery, cross-model review loops, overnight experiment automation.

ARIS (Auto-Research-In-Sleep) is a Markdown-only skill bundle for autonomous ML research. It is designed to run unattended (typically overnight) and to produce a stack of vetted research candidates, experiment results, and review notes. There is no framework lock-in: the skills work with Claude Code, Codex, OpenCode, and any agent that loads Markdown skills. Built for solo researchers and small labs that want compute and reasoning to compound while they're asleep.


Install

Pick your client

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "aris-research-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "~/.claude/skills/aris"
      ],
      "_inferred": true
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.
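One caveat for all of the JSON configs on this page: most clients spawn the command directly rather than through a shell, so the `~` in the clone target may not be expanded. If the clone lands in a literal `~` directory, a variant with an absolute path works everywhere (sketch; the macOS username below is a placeholder you must replace):

```json
{
  "mcpServers": {
    "aris-research-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "/Users/YOUR_USERNAME/.claude/skills/aris"
      ]
    }
  }
}
```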

~/.cursor/mcp.json · .cursor/mcp.json
{
  "mcpServers": {
    "aris-research-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "~/.claude/skills/aris"
      ],
      "_inferred": true
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "aris-research-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "~/.claude/skills/aris"
      ],
      "_inferred": true
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "aris-research-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "~/.claude/skills/aris"
      ],
      "_inferred": true
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "aris-research-skill",
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "~/.claude/skills/aris"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json
{
  "context_servers": {
    "aris-research-skill": {
      "command": {
        "path": "git",
        "args": [
          "clone",
          "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
          "~/.claude/skills/aris"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add aris-research-skill -- git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep ~/.claude/skills/aris

One-liner. Verify with claude mcp list. Remove with claude mcp remove aris-research-skill.
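Whichever client you pick, the install only matters if the skill files actually land where your agent looks for them. A quick check (the `~/.claude/skills/aris` path follows the install commands above):

```shell
# Check that the ARIS skill bundle is present where the agent expects it.
SKILL_DIR="$HOME/.claude/skills/aris"
if [ -d "$SKILL_DIR" ]; then
  ls "$SKILL_DIR"
else
  echo "not installed yet: $SKILL_DIR"
fi
```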

Use Cases

Real-world ways to use ARIS (Auto-Research-In-Sleep)

Run an autonomous research session overnight

👤 Solo researchers / small labs · ⏱ ~720 min · advanced

When to use: You have a research direction and 12 hours; you want to wake up to candidate experiments.

Prerequisites
  • Compute access (GPU/cloud) for experiments — ARIS schedules; you provide the runner
Flow
  1. Scope
    Use ARIS. Goal: explore parameter-efficient fine-tuning variants for small LLMs. Budget: 12h, 5 ideas, 2 experiments per idea.
    → Plan with budgets per stage
  2. Discover
    Generate 10 candidate ideas. Cross-review. Keep the top 5.
    → Ranked idea list with critiques
  3. Run
    For each top idea, draft an experiment, run it, log metrics. Stop when budget exhausted.
    → Experiments completed; metrics logged

Outcome: Morning report: ranked ideas with actual experimental signal.

Pitfalls
  • Compute usage runs away — Hard caps on per-experiment budget; ARIS respects them
Combine with: filesystem
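ARIS enforces budgets at the planning level, but it cannot stop a runner the operating system keeps alive, so a second line of defense in the runner itself is worth having. A minimal sketch using coreutils `timeout` (GNU/Linux; install coreutils on macOS) — here a 1-second cap on a 5-second dummy "experiment" stands in for a real per-experiment budget:

```shell
# Hard wall-clock cap: timeout kills the command and exits with status 124.
timeout 1 sleep 5
status=$?
if [ "$status" -eq 124 ]; then
  echo "experiment killed at cap (exit $status)"
fi
# → experiment killed at cap (exit 124)
```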

Run a literature scan and extract experimental claims

👤 Researchers entering a new area · ⏱ ~90 min · intermediate

When to use: You need to map a subfield's claims and gaps.

Flow
  1. Scope
    Scan papers from 2024-2026 on diffusion-LM hybrids. Extract claimed contributions and benchmarks.
    → Claim table
  2. Critique
    Cross-review claims against each other. What's contradictory or under-tested?
    → Gap analysis

Outcome: Map of the subfield with concrete research gaps.

Pitfalls
  • Hallucinated citations — ARIS includes a verification step; pair it with an arXiv MCP for ground truth
Combine with: notebooklm-py-skill
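The verification step can also be backstopped cheaply on your side. A sketch (my own helper, not an ARIS tool): reject extracted citations whose arXiv IDs don't even match the modern `YYMM.NNNNN[vN]` format before spending tokens or API calls verifying them:

```shell
# Cheap format gate for extracted arXiv IDs (sketch; real verification
# should still query the arXiv API for the actual paper).
plausible_arxiv_id() {
  printf '%s' "$1" | grep -Eq '^[0-9]{4}\.[0-9]{4,5}(v[0-9]+)?$'
}

plausible_arxiv_id "2405.12345" && echo "2405.12345: plausible"
plausible_arxiv_id "fake-paper" || echo "fake-paper: rejected"
# → 2405.12345: plausible
# → fake-paper: rejected
```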

Combinations

Pair with other MCPs for 10× leverage

aris-research-skill + notebooklm-py-skill

Literature digest before / after research

Use NotebookLM to digest related papers and ARIS to design experiments.
aris-research-skill + filesystem

Persist experiment artifacts under /research/<run-id>/

Save all logs and results under /research/2026-05-03/.
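The layout this pairing implies can be sketched in a few lines (paths are illustrative; the example above uses an absolute /research/ root, a relative one is used here so the sketch runs anywhere):

```shell
# Per-run artifact layout: one directory per run id, logs and results separated.
RUN_ID="2026-05-03"           # illustrative run id, matching the example above
BASE="research/$RUN_ID"
mkdir -p "$BASE/logs" "$BASE/results"
echo "session started" > "$BASE/logs/session.log"
find "$BASE" -type f
# → research/2026-05-03/logs/session.log
```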

Tools

What this MCP exposes

Tool            Inputs               When to call            Cost
plan_research   goal, budget         Session start           tokens
discover_ideas  topic, n             Idea generation phase   tokens
cross_review    idea, models[]       Filtering ideas         tokens × n models
run_experiment  idea, runner config  Validation              compute + tokens
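If the skill is exposed through an MCP server, a call to plan_research would follow the standard MCP tools/call envelope. A sketch with the input names taken from the table (argument values are illustrative, and the exact argument schema is an assumption):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "plan_research",
    "arguments": {
      "goal": "explore parameter-efficient fine-tuning variants for small LLMs",
      "budget": "12h"
    }
  }
}
```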

Cost & Limits

What this costs to run

API quota
Tokens + compute — both can be substantial
Tokens per call
Idea discovery and cross-review are token-heavy
Monetary
Free skill; you pay compute and model usage
Tip
Set hard token / compute caps in the plan stage

Security

Permissions, secrets, blast radius

Minimum scopes: compute runner access; read access to data
Credential storage: External — handled by your runner
Data egress: Depends on your model/runner setup
Never grant: Production data to autonomous research loops

Troubleshooting

Common errors and fixes

Loops without converging

Tighten budget; ARIS respects caps and stops

Cross-review collapses to one viewpoint

Diversify reviewer models; if all reviewers come from the same model family, you'll get correlated views

Experiments fail silently

Tighten runner error reporting; ARIS logs but can't fix bad runners

Alternatives

ARIS (Auto-Research-In-Sleep) vs others

Alternative              When to use it instead         Tradeoff
Hand-run experiments     You need full control          Doesn't scale beyond your hours
Hosted AutoML platforms  You want managed compute + UI  Less flexibility; cost

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
