English | ZH-CN
EvoHarness delivers terminal-native agent infrastructure: tools, commands, skills, agents, plugins, MCP, memory, approvals, and controlled self-evolution.
Build with the project: shape an open, visible, and research-grade harness for coding workflows.
Evidence, operators, candidate patches, and promotion paths
EvoHarness treats self-evolution as control over the harness surface, not as unconstrained agent autonomy.
The real question is not "can the model rewrite itself once," but:
- when to evolve: use real sessions, traces, failures, approvals, and workspace state as evidence
- which operator to choose: `revise_command`, `revise_skill`, `distill_memory`, `grow_ecosystem`, or `stop`
- when not to evolve: low-value changes should be filtered before mutation
- how changes enter the runtime: candidate patches must pass validation, then get promoted, held, or rolled back
So the loop can be read in one line:
evidence -> operator choice -> candidate patch -> validation -> promote / hold / rollback
In short, EvoHarness studies self-evolution as a long-horizon operator-control problem over commands, skills, agents, plugins, MCP, memory, and policy surfaces.
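The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Evidence` fields, the `min_signal` threshold, and the `validate` callback are hypothetical and are not taken from the EvoHarness codebase. Only the operator names and the promote / hold / rollback outcomes come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Operator(Enum):
    # Operator names as listed in the README.
    REVISE_COMMAND = "revise_command"
    REVISE_SKILL = "revise_skill"
    DISTILL_MEMORY = "distill_memory"
    GROW_ECOSYSTEM = "grow_ecosystem"
    STOP = "stop"


class Decision(Enum):
    PROMOTE = "promote"
    HOLD = "hold"
    ROLLBACK = "rollback"


@dataclass
class Evidence:
    # Hypothetical counters; real evidence would come from sessions, traces, and approvals.
    failed_command_runs: int = 0
    skill_misfires: int = 0
    repeated_lessons: int = 0
    missing_capabilities: int = 0


def choose_operator(evidence: Evidence, min_signal: int = 2) -> Operator:
    """Pick the operator with the strongest signal, or stop when the signal is weak."""
    signals = {
        Operator.REVISE_COMMAND: evidence.failed_command_runs,
        Operator.REVISE_SKILL: evidence.skill_misfires,
        Operator.DISTILL_MEMORY: evidence.repeated_lessons,
        Operator.GROW_ECOSYSTEM: evidence.missing_capabilities,
    }
    operator, strength = max(signals.items(), key=lambda item: item[1])
    # Low-value changes are filtered out before any mutation happens.
    return operator if strength >= min_signal else Operator.STOP


def decide(validation_passed: bool, regressed: bool) -> Decision:
    """Map validation results to promote / hold / rollback."""
    if regressed:
        return Decision.ROLLBACK
    return Decision.PROMOTE if validation_passed else Decision.HOLD


def evolve_once(evidence: Evidence, validate):
    """Run one pass: evidence -> operator -> candidate patch -> validation -> decision."""
    operator = choose_operator(evidence)
    if operator is Operator.STOP:
        return operator, None
    candidate = {"operator": operator.value, "patch": f"candidate for {operator.value}"}
    passed, regressed = validate(candidate)
    return operator, decide(passed, regressed)
```

The point of the sketch is that `stop` is a first-class operator: the loop can end before any candidate patch is produced.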
Three-stage mascot evolution: saddle -> harness -> elegant evolution
One runtime core • visible harness surfaces • one long-horizon state layer
EvoHarness is built around one architectural bet: the harness should be a first-class system surface, not hidden orchestration glue.
What makes the architecture distinctive:
- visible by default: tools, commands, skills, agents, plugins, and MCP stay inspectable in the workspace
- workspace-native: markdown, registries, settings, memory, and policy surfaces live as real project artifacts
- long-horizon aware: approvals, archived sessions, analytics, and evolution planning remain in the same runtime
- research-ready: the harness is observable, countable, and evolvable instead of disappearing behind a black box
At the system surface, EvoHarness exposes:
- 27 tools for files, shell, search, tasks, registry, MCP, and subagents
- 32 commands as workflow entry points
- 34 skills as on-demand procedural guidance
- 32 agents for bounded delegation
- 7 plugins for workspace-native ecosystem growth
- 11 MCP servers / 29 MCP tools for external tools, resources, and prompts
A guided peek: tools, commands, skills, agents, plugins, and MCP all meet at the runtime core
If you want to understand the project quickly, start here:
- run `evoh doctor --workspace .` to inspect the resolved runtime surface
- run `evoh tools-list --workspace .`, `evoh commands-list --workspace .`, `evoh agents-list --workspace .`, and `evoh mcp-list --workspace . --kind all`
- use `/help`, `/commands`, `/skills`, `/agents`, and `/mcp` once you enter the session
- browse `plugins`, `.claude`, and `.evo-harness/mcp.json` to see the harness as a real workspace product
In short: EvoHarness is not just "an agent with tools"; it is a visible, editable, and evolvable harness workspace (^_^)
| Item | Why It Matters |
|---|---|
| Python 3.11+ | required for the runtime, CLI, MCP helpers, and local harness surface |
| Node.js 18+ | optional, only if you want the React/Ink frontend |
Without Node, EvoHarness still opens the text session (^_^)/
git clone https://github.com/HITSZ-DS/EvoHarness.git
cd EvoHarness
python -m pip install -e .
evoh doctor --workspace .
If the doctor report looks healthy, you are ready to enter the harness.
evoh --workspace .
If npm is available, EvoHarness will try the React/Ink frontend first.
If not, it falls back to the text session automatically.
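The fallback rule can be pictured with a small Python sketch. It assumes that "npm is available" means `npm` resolves on `PATH`. The function name `pick_frontend` and the returned labels are hypothetical, and the real launcher may apply further checks before committing to the React/Ink frontend.

```python
import shutil


def pick_frontend(which=shutil.which) -> str:
    """Return "react-ink" when npm resolves on PATH, otherwise "text".

    `which` is injectable so the rule can be exercised without touching the real PATH.
    """
    return "react-ink" if which("npm") else "text"


if __name__ == "__main__":
    print(pick_frontend())
```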
First look: runtime deck, slash commands, and live harness status in one terminal surface
Inside the session, run:
/setup
EvoHarness will ask for four things:
- Provider profile: which API family or gateway style you want
- Model: the exact model name you want to use
- API key: paste it now, or leave it blank if you already keep it elsewhere
- Base URL: required for custom gateways and non-default endpoints
`/setup` is the fastest way to make a fresh session actually usable
| Profile | Best Fit | API Style | Typical Key Env |
|---|---|---|---|
| `anthropic` | native Claude usage | Anthropic Messages API | `ANTHROPIC_API_KEY` |
| `openai-compatible` | GLM, Qwen, DeepSeek, DashScope, OpenAI-like gateways | `/v1/chat/completions` | `OPENAI_API_KEY` by default |
| `moonshot` | Kimi / Moonshot | OpenAI-compatible | `MOONSHOT_API_KEY` |
| `anthropic-compatible` | Claude-style proxies and internal gateways | Anthropic-compatible | `ANTHROPIC_API_KEY` |
| `claude-code-cli` | Local Claude Code CLI | Uses your Claude Code subscription | No API key needed |
| `ollama` | Local open-source models | Llama, Qwen, Mistral, etc. | No API key needed |
| `codex` | OpenAI Codex for code | OpenAI-compatible | `OPENAI_API_KEY` |
| `auto` | fastest first try | inferred from model + base URL | inferred from your setup |
New: Local & Low-Cost Options!
- `claude-code-cli`: Use your Claude Code subscription instead of API keys
- `ollama`: Run completely free open-source models locally (Llama 3, Qwen 2, etc.)
- `codex`: OpenAI Codex for code-specific tasks
See LOCAL_PROVIDERS.md for detailed setup instructions.
Recommended pattern:
- keep your key in an environment variable when possible
- use `/setup` to choose the profile, model, and base URL
- use `evoh init` if you want those settings scaffolded into a fresh workspace
Quick key rules:
- `anthropic` and `anthropic-compatible` typically use `ANTHROPIC_API_KEY`
- `moonshot` typically uses `MOONSHOT_API_KEY`
- `openai-compatible` defaults to `OPENAI_API_KEY`, unless you scaffold a custom one with `evoh init --api-key-env ...`
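As a rough illustration, the key rules amount to a lookup with an optional override. The sketch below assumes only the defaults stated in this README. The helper names `key_env_for` and `key_is_set` are hypothetical and are not part of the `evoh` CLI.

```python
import os

# Default key environment variables per provider profile, as described in the README.
DEFAULT_KEY_ENV = {
    "anthropic": "ANTHROPIC_API_KEY",
    "anthropic-compatible": "ANTHROPIC_API_KEY",
    "moonshot": "MOONSHOT_API_KEY",
    "openai-compatible": "OPENAI_API_KEY",
}


def key_env_for(profile, override=None):
    """Return the env var name for a profile; a custom name (as with --api-key-env) wins."""
    if override:
        return override
    return DEFAULT_KEY_ENV.get(profile)


def key_is_set(profile, override=None, environ=None):
    """Check whether the resolved env var holds a non-empty value."""
    environ = os.environ if environ is None else environ
    name = key_env_for(profile, override)
    return bool(name and environ.get(name))
```

For example, `key_env_for("openai-compatible", "ZHIPUAI_API_KEY")` resolves to the custom variable, which mirrors the `evoh init` example in this README.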
If you want to bring EvoHarness into another project instead of only running this repo:
evoh init --workspace . --provider-profile openai-compatible --model glm-5 --api-key-env ZHIPUAI_API_KEY --base-url https://open.bigmodel.cn/api/paas/v4/
This creates CLAUDE.md, .evo-harness/settings.json, starter .claude/ assets, and the local MCP registry.
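A quick way to confirm the scaffold landed is to check for those artifacts on disk. The sketch below is an assumption-laden helper, not an `evoh` feature: `missing_artifacts` is a hypothetical name, and it assumes the local MCP registry is the `.evo-harness/mcp.json` file mentioned earlier in this README.

```python
from pathlib import Path

# Paths the README says `evoh init` creates. The mcp.json location is an assumption
# based on the `.evo-harness/mcp.json` file referenced elsewhere in this README.
EXPECTED_ARTIFACTS = (
    "CLAUDE.md",
    ".evo-harness/settings.json",
    ".claude",
    ".evo-harness/mcp.json",
)


def missing_artifacts(workspace):
    """Return the expected scaffold paths that do not exist under `workspace`."""
    root = Path(workspace)
    return [rel for rel in EXPECTED_ARTIFACTS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_artifacts(".")
    print("scaffold looks complete" if not missing else f"missing: {missing}")
```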
Useful follow-up checks:
evoh provider-detect --workspace .
evoh provider-template --profile openai-compatible --model glm-5
evoh doctor --workspace .

evoh doctor --workspace .
evoh tools-list --workspace .
evoh commands-list --workspace .
evoh agents-list --workspace .
evoh mcp-list --workspace . --kind all
evoh provider-detect --workspace .

/help
/setup
/login
/doctor
/plugins
/resume
/permissions
/exit
installable workflow bundles • MCP-native utilities • one visible workspace ecosystem
EvoHarness treats plugins and MCP as part of the product surface, not as sidecar extras.
What that means in practice:
- plugins bundle commands, skills, agents, and MCP surfaces around one workflow family
- MCP bundles expose reusable tools, resources, and prompts for docs, sessions, quality, and workspace mapping
- everything stays workspace-native, so users can inspect the ecosystem instead of guessing what exists
| Plugin | Focus | What It Adds |
|---|---|---|
| `safe-inspector` | safe read-only review | cautious inspection command, skill, and reviewer agent |
| `evolution-studio` | trace triage + ecosystem growth | evolution commands, planning skills, and evolution-focused agents |
| `web-research` | public web research | web command, research skill, scout agent, and MCP search/fetch bundle |
| `workspace-ops` | workspace mapping + registry hygiene | topology commands, packaging skills, and workspace-intel MCP |
| `delivery-lab` | release readiness + regression review | ship-readiness workflows and the quality-gate MCP bundle |
| `docs-foundry` | docs repair + onboarding polish | README/docs workflows and the docs-gap MCP bundle |
| `session-lab` | sessions + approvals + tasks | task-board / forensics workflows and the session-lab MCP bundle |
| MCP Surface | What It Exposes | Best For |
|---|---|---|
| `workspace-docs` / `docs-gap` | doc search, excerpts, repair prompts | onboarding, README drift, docs lookup |
| `workspace-intel` | workspace snapshot + surface search | understanding the live harness layout |
| `quality-gate` | doctor report, promotions, session summary | release readiness and regression review |
| `session-lab` | recent sessions, approvals, task board | long-horizon workflow forensics |
| `web-research:web-research` | `search_web` + `fetch_page` | public-web research without leaving the harness |
| Surface | Count | Why It Matters |
|---|---|---|
| builtin tools | 26 | direct file, shell, registry, web, task, and runtime actions |
| commands | 32 | reusable workflow entry points |
| skills | 34 | on-demand procedural guidance |
| agents | 32 | bounded delegation and focused side work |
| plugins | 7 | installable workflow families |
| MCP servers | 10 | local service bundles for tools, resources, and prompts |
| MCP tools / resources / prompts | 29 / 27 / 10 | reusable externalized knowledge and actions |
Inside the session:
/plugins
/plugins marketplaces
/mcp
/commands
/agents
/skills
From the CLI:
evoh plugins-list --workspace .
evoh marketplaces-list --workspace .
evoh marketplace-plugins --workspace .
evoh mcp-list --workspace . --kind all
provider + model • permission mode • active workflow command • live surface counts
Once the session opens, EvoHarness keeps its operating state visible in the runtime deck instead of hiding it.
The main runtime fields mean:
| Runtime Field | What It Tells You |
|---|---|
| provider + model | which backend family and model you are talking to right now |
| mode | the current permission mode |
| `/<workspace-command>` | the active markdown workflow, such as `/read-only-inspect` |
| surface | the live counts for commands, skills, agents, plugins, and MCP |
| pulse | the current tasks, approvals, sessions, and token counters |
Permission modes are simple:
| Mode | Behavior | Best For |
|---|---|---|
| `default` | read-only work runs freely; mutating actions ask for approval | normal day-to-day coding |
| `plan` | blocks mutating tools so you can inspect, map, and plan safely | audits, exploration, repo understanding |
| `full-auto` | allows actions automatically inside sandbox bounds | trusted fast iteration |
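The three modes can be read as a small decision rule. The sketch below is a minimal model of the table above, not the harness's actual policy engine. The `gate` function and its return labels are hypothetical, and blocking `full-auto` actions that fall outside the sandbox is an assumption, since the README only describes the in-sandbox case.

```python
from enum import Enum


class Mode(Enum):
    DEFAULT = "default"
    PLAN = "plan"
    FULL_AUTO = "full-auto"


def gate(mode: Mode, mutating: bool, in_sandbox: bool = True) -> str:
    """Return "allow", "ask", or "block" for one tool action."""
    if not mutating:
        # Read-only work runs freely in every mode described in the table.
        return "allow"
    if mode is Mode.PLAN:
        return "block"
    if mode is Mode.DEFAULT:
        return "ask"
    # full-auto: automatic only inside sandbox bounds.
    # Blocking outside the sandbox is an assumption of this sketch.
    return "allow" if in_sandbox else "block"
```

Under this reading, `plan` is the only mode in which a mutating tool can never run, which is why it suits audits and exploration.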
Command layers are also explicit:
| Surface | What It Does |
|---|---|
| `/help`, `/setup`, `/doctor`, `/permissions`, `/resume`, `/plugins`, `/mcp` | session slash commands |
| `/<workspace-command>` | activates one markdown workflow from `.claude/commands/` |
| skills | on-demand workflow guides |
| agents | bounded delegates |
| plugins | bundle commands, skills, agents, and MCP surfaces together |
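One way to picture the first two layers is a tiny router that separates session slash commands from workspace workflows. This is a sketch under assumptions: the `route` function is hypothetical, the session command set is collected from the slash commands named in this README, and the `<name>.md` file layout under `.claude/commands/` is assumed rather than confirmed here.

```python
from pathlib import Path

# Session slash commands named in this README; anything else is treated as a workflow.
SESSION_COMMANDS = {
    "help", "setup", "login", "doctor", "plugins", "resume",
    "permissions", "exit", "mcp", "commands", "skills", "agents",
}


def route(line, workspace="."):
    """Classify one input line as chat text, a session command, or a markdown workflow."""
    if not line.startswith("/"):
        return ("chat", line, "")
    name, _, args = line[1:].partition(" ")
    if name in SESSION_COMMANDS:
        return ("session", name, args.strip())
    # Assumed layout: one markdown file per workflow under .claude/commands/.
    path = Path(workspace) / ".claude" / "commands" / f"{name}.md"
    return ("workflow", path.as_posix(), args.strip())
```

With this rule, `/read-only-inspect auth flow` resolves to a workflow file plus the argument string `auth flow`, while `/doctor` stays a session command.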
Good first commands inside the session:
/help
/doctor
/commands
/skills
/agents
/mcp
/permissions
/read-only-inspect auth flow
If you want the same view from the CLI before chatting:
evoh commands-list --workspace .
evoh agents-list --workspace .
evoh tools-list --workspace .
evoh mcp-list --workspace . --kind all

Apache-2.0. See LICENSE.






