
ARIS (Auto-Research-In-Sleep)

by wanshuiyin · wanshuiyin/Auto-claude-code-research-in-sleep

Skills that turn an LLM coding agent into an autonomous ML researcher — idea discovery, cross-model review loops, overnight experiment automation.

ARIS (Auto-Research-In-Sleep) is a Markdown-only skill bundle for autonomous ML research. It is designed to run unattended (typically overnight) and to produce a stack of vetted research candidates, experiment results, and review notes. There is no framework lock-in: the skills work with Claude Code, Codex, OpenCode, and any agent that loads Markdown skills. Built for solo researchers and small labs that want compute and reasoning to compound while they're asleep.


Install

Pick your client

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json · Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "aris-research-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "~/.claude/skills/aris"
      ],
      "_inferred": true
    }
  }
}

Open Claude Desktop → Settings → Developer → Edit Config. Restart after saving.
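One caveat for all of the JSON configs on this page: most clients spawn the command directly rather than through a shell, so the `~` in the clone target may not be expanded. If the clone lands in a literal `~` directory, a variant with an absolute path works everywhere (sketch; the macOS username below is a placeholder you must replace):

```json
{
  "mcpServers": {
    "aris-research-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "/Users/YOUR_USERNAME/.claude/skills/aris"
      ]
    }
  }
}
```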

~/.cursor/mcp.json · .cursor/mcp.json
{
  "mcpServers": {
    "aris-research-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "~/.claude/skills/aris"
      ],
      "_inferred": true
    }
  }
}

Cursor uses the same mcpServers schema as Claude Desktop. Project config wins over global.

VS Code → Cline → MCP Servers → Edit
{
  "mcpServers": {
    "aris-research-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "~/.claude/skills/aris"
      ],
      "_inferred": true
    }
  }
}

Click the MCP Servers icon in the Cline sidebar, then "Edit Configuration".

~/.codeium/windsurf/mcp_config.json
{
  "mcpServers": {
    "aris-research-skill": {
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "~/.claude/skills/aris"
      ],
      "_inferred": true
    }
  }
}

Same shape as Claude Desktop. Restart Windsurf to pick up changes.

~/.continue/config.json
{
  "mcpServers": [
    {
      "name": "aris-research-skill",
      "command": "git",
      "args": [
        "clone",
        "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
        "~/.claude/skills/aris"
      ]
    }
  ]
}

Continue uses an array of server objects rather than a map.

~/.config/zed/settings.json
{
  "context_servers": {
    "aris-research-skill": {
      "command": {
        "path": "git",
        "args": [
          "clone",
          "https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep",
          "~/.claude/skills/aris"
        ]
      }
    }
  }
}

Add to context_servers. Zed hot-reloads on save.

claude mcp add aris-research-skill -- git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep ~/.claude/skills/aris

One-liner. Verify with claude mcp list. Remove with claude mcp remove aris-research-skill.
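Whichever client you pick, the install only matters if the skill files actually land where your agent looks for them. A quick check (the `~/.claude/skills/aris` path follows the install commands above):

```shell
# Check that the ARIS skill bundle is present where the agent expects it.
SKILL_DIR="$HOME/.claude/skills/aris"
if [ -d "$SKILL_DIR" ]; then
  ls "$SKILL_DIR"
else
  echo "not installed yet: $SKILL_DIR"
fi
```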

Use Cases

Real-world ways to use ARIS (Auto-Research-In-Sleep)

Run an autonomous research session overnight

👤 Solo researchers / small labs · ⏱ ~720 min · advanced

When to use: You have a research direction and 12 hours; you want to wake up to candidate experiments.

Prerequisites
  • Compute access (GPU/cloud) for experiments — ARIS schedules; you provide the runner
Flow
  1. Scope
    Use ARIS. Goal: explore parameter-efficient fine-tuning variants for small LLMs. Budget: 12h, 5 ideas, 2 experiments per idea.
    → Plan with budgets per stage
  2. Discover
    Generate 10 candidate ideas. Cross-review. Keep the top 5.
    → Ranked idea list with critiques
  3. Run
    For each top idea, draft an experiment, run it, log metrics. Stop when budget exhausted.
    → Experiments completed; metrics logged

Outcome: Morning report: ranked ideas with actual experimental signal.

Pitfalls
  • Compute usage runs away — Hard caps on per-experiment budget; ARIS respects them
Combine with: filesystem
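ARIS enforces budgets at the planning level, but it cannot stop a runner the operating system keeps alive, so a second line of defense in the runner itself is worth having. A minimal sketch using coreutils `timeout` (GNU/Linux; install coreutils on macOS) — here a 1-second cap on a 5-second dummy "experiment" stands in for a real per-experiment budget:

```shell
# Hard wall-clock cap: timeout kills the command and exits with status 124.
timeout 1 sleep 5
status=$?
if [ "$status" -eq 124 ]; then
  echo "experiment killed at cap (exit $status)"
fi
# → experiment killed at cap (exit 124)
```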

Run a literature scan and extract experimental claims

👤 Researchers entering a new area · ⏱ ~90 min · intermediate

When to use: You need to map a subfield's claims and gaps.

Flow
  1. Scope
    Scan papers from 2024-2026 on diffusion-LM hybrids. Extract claimed contributions and benchmarks.
    → Claim table
  2. Critique
    Cross-review claims against each other. What's contradictory or under-tested?
    → Gap analysis

Outcome: Map of the subfield with concrete research gaps.

Pitfalls
  • Hallucinated citations — ARIS includes a verification step; pair it with an arXiv MCP for ground truth
Combine with: notebooklm-py-skill
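The verification step can also be backstopped cheaply on your side. A sketch (my own helper, not an ARIS tool): reject extracted citations whose arXiv IDs don't even match the modern `YYMM.NNNNN[vN]` format before spending tokens or API calls verifying them:

```shell
# Cheap format gate for extracted arXiv IDs (sketch; real verification
# should still query the arXiv API for the actual paper).
plausible_arxiv_id() {
  printf '%s' "$1" | grep -Eq '^[0-9]{4}\.[0-9]{4,5}(v[0-9]+)?$'
}

plausible_arxiv_id "2405.12345" && echo "2405.12345: plausible"
plausible_arxiv_id "fake-paper" || echo "fake-paper: rejected"
# → 2405.12345: plausible
# → fake-paper: rejected
```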

Combinations

Pair with other MCPs for 10× leverage

aris-research-skill + notebooklm-py-skill

Literature digest before / after research

Use NotebookLM to digest related papers and ARIS to design experiments.
aris-research-skill + filesystem

Persist experiment artifacts under /research/<run-id>/

Save all logs and results under /research/2026-05-03/.
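The layout this pairing implies can be sketched in a few lines (paths are illustrative; the example above uses an absolute /research/ root, a relative one is used here so the sketch runs anywhere):

```shell
# Per-run artifact layout: one directory per run id, logs and results separated.
RUN_ID="2026-05-03"           # illustrative run id, matching the example above
BASE="research/$RUN_ID"
mkdir -p "$BASE/logs" "$BASE/results"
echo "session started" > "$BASE/logs/session.log"
find "$BASE" -type f
# → research/2026-05-03/logs/session.log
```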

Tools

What this MCP exposes

Tool            Inputs               When to call            Cost
plan_research   goal, budget         Session start           tokens
discover_ideas  topic, n             Idea generation phase   tokens
cross_review    idea, models[]       Filtering ideas         tokens × n models
run_experiment  idea, runner config  Validation              compute + tokens
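If the skill is exposed through an MCP server, a call to plan_research would follow the standard MCP tools/call envelope. A sketch with the input names taken from the table (argument values are illustrative, and the exact argument schema is an assumption):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "plan_research",
    "arguments": {
      "goal": "explore parameter-efficient fine-tuning variants for small LLMs",
      "budget": "12h"
    }
  }
}
```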

Cost & Limits

What this costs to run

API quota
Tokens + compute — both can be substantial
Tokens per call
Idea discovery and cross-review are token-heavy
Monetary
Free skill; you pay compute and model usage
Tip
Set hard token / compute caps in the plan stage

Security

Permissions, secrets, blast radius

Minimum scopes: compute runner access; read access to data
Credential storage: External — handled by your runner
Data egress: Depends on your model/runner setup
Never grant: Production data to autonomous research loops

Troubleshooting

Common errors and fixes

Loops without converging

Tighten budget; ARIS respects caps and stops

Cross-review collapses to one viewpoint

Diversify reviewer models; if all reviewers come from the same model family, you'll get correlated views

Experiments fail silently

Tighten runner error reporting; ARIS logs but can't fix bad runners

Alternatives

ARIS (Auto-Research-In-Sleep) vs others

Alternative              When to use it instead         Tradeoff
Hand-run experiments     You need full control          Doesn't scale beyond your hours
Hosted AutoML platforms  You want managed compute + UI  Less flexibility; cost

More

Resources

📖 Read the official README on GitHub

🐙 Browse open issues
