Experiments

Experiments let you test different versions of your agent's config files — SOUL.md, IDENTITY.md, USER.md, etc. — and deploy them to see which performs better.

How it works

Create an experiment targeting a file (e.g. SOUL.md) with 2+ content variants
Each variant has a name, content, and traffic weight (0–100%)
Agents are assigned to variants by weighted random selection, in proportion to each variant's traffic weight
Start the experiment — the server pushes each variant's content to its assigned agent via WebSocket
Observe how the agent behaves with the new config
Deploy any variant to any connected agent on demand
Complete the experiment when you've picked a winner

No plugin changes required. Experiments use the same file-push mechanism as the Workspace editor in the dashboard.
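
The weighted random assignment described above can be sketched in a few lines of shell. This is an illustration only, not the server's actual code: the `pick_variant` function and its `NAME:WEIGHT` argument format are hypothetical, and it assumes integer weights. A roll is drawn from 0 up to (but not including) the total weight, and the first variant whose cumulative weight exceeds the roll is chosen.

```shell
# Hypothetical sketch of weighted random assignment (not the server's implementation).

# pick_variant ROLL NAME:WEIGHT [NAME:WEIGHT ...]
# ROLL is an integer in [0, total weight). Prints the name of the variant
# whose cumulative weight range contains ROLL; fails if ROLL is out of range.
pick_variant() {
  local roll=$1
  shift
  local cumulative=0
  local entry name weight
  for entry in "$@"; do
    name=${entry%:*}      # text before the last colon
    weight=${entry##*:}   # text after the last colon
    cumulative=$((cumulative + weight))
    if [ "$roll" -lt "$cumulative" ]; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  return 1
}

# Example: a 50/50 split, with the roll drawn by awk as an integer in 0..99.
roll=$(awk 'BEGIN { srand(); print int(rand() * 100) }')
assigned=$(pick_variant "$roll" Control:50 Enthusiastic:50)
echo "roll=$roll -> $assigned"
```

Passing the roll in as an argument keeps the selection logic deterministic, which makes the boundaries between variants easy to check by hand.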

Dashboard

The Experiments tab on the agent detail page lets you:

Create experiments with a side-by-side variant editor
Load the agent's current file content as a starting point
Start, pause, and complete experiments
Deploy any variant to the agent with one click

Status lifecycle

draft → running → paused → running → completed
                         └→ completed

draft — experiment is being set up, variants can be added/removed
running — variants are deployed, agents are using them
paused — experiment is paused but can be resumed
completed — experiment is archived
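
As a rough illustration, the lifecycle above can be written as a transition check. The `can_transition` function is hypothetical, and the allowed moves reflect one reading of the diagram: a running or paused experiment can be completed, while a draft cannot skip straight to completed and a completed experiment cannot be restarted. The server may enforce different rules.

```shell
# Hypothetical sketch of the status lifecycle as a transition table.

# can_transition FROM TO
# Succeeds (exit status 0) when moving from FROM to TO matches the diagram.
can_transition() {
  case "$1:$2" in
    draft:running|running:paused|paused:running|running:completed|paused:completed)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Try a few moves and report whether each one is allowed.
for move in draft:running paused:completed completed:running; do
  if can_transition "${move%:*}" "${move#*:}"; then
    echo "$move allowed"
  else
    echo "$move rejected"
  fi
done
```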

Target files

Any workspace file can be targeted, but the most common ones are:

File	What it controls
`SOUL.md`	Personality, values, operating principles, quirks
`IDENTITY.md`	Name, emoji, creature type, vibe, avatar
`USER.md`	Owner/user context and preferences
`TOOLS.md`	Tool usage guidance and restrictions
`HEARTBEAT.md`	Autonomous activity trigger configuration

Example: testing personality tone

# 1. Create the experiment
curl -X POST http://localhost:3001/v1/experiments \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Tone test",
    "targetFile": "SOUL.md",
    "participantIds": ["'"$AGENT_ID"'"],
    "variants": [
      {"name": "Control", "content": "You are a helpful assistant.", "weight": 50, "isControl": true},
      {"name": "Enthusiastic", "content": "You are an incredibly enthusiastic helper who celebrates every task!", "weight": 50}
    ]
  }'

# 2. Start it (deploys to connected agents)
#    $EXPERIMENT_ID is the ID of the experiment created in step 1
curl -X PATCH "http://localhost:3001/v1/experiments/$EXPERIMENT_ID/status" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status": "running"}'

# 3. Manually deploy a specific variant to test
#    $VARIANT_ID is the ID of one of the experiment's variants
curl -X POST "http://localhost:3001/v1/experiments/$EXPERIMENT_ID/deployments" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"participantId": "'"$AGENT_ID"'", "variantId": "'"$VARIANT_ID"'"}'

# 4. Complete when done
curl -X PATCH "http://localhost:3001/v1/experiments/$EXPERIMENT_ID/status" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"status": "completed"}'
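
Steps 2 and 4 above send the same request with a different status value, so a small wrapper can cut the repetition. The sketch below is one way to do that. The function name `set_experiment_status` and the `BASE_URL` variable are hypothetical, and sending `paused` through this endpoint is an assumption based on the status lifecycle, not something the example above confirms.

```shell
# Hypothetical helper wrapping the status request from steps 2 and 4.
# BASE_URL is an assumed variable; TOKEN is the same bearer token used above.
BASE_URL=${BASE_URL:-http://localhost:3001}

# set_experiment_status EXPERIMENT_ID STATUS
set_experiment_status() {
  local experiment_id=$1
  local status=$2
  curl -X PATCH "$BASE_URL/v1/experiments/$experiment_id/status" \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    -d "{\"status\": \"$status\"}"
}

# Usage (requires a running server):
#   set_experiment_status "$EXPERIMENT_ID" running
#   set_experiment_status "$EXPERIMENT_ID" completed
```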

See the API reference for the full endpoint specification.
