docs: add instruction for running dummy agents and remove old documentation
@@ -8,11 +8,12 @@ This guide covers everything you need to know to develop with the Aden Agent Fra
2. [Initial Setup](#initial-setup)
3. [Project Structure](#project-structure)
4. [Building Agents](#building-agents)
5. [Running Agents](#running-agents)
6. [Testing Agents](#testing-agents)
7. [Code Style & Conventions](#code-style--conventions)
8. [Git Workflow](#git-workflow)
9. [Common Tasks](#common-tasks)
10. [Troubleshooting](#troubleshooting)

---
@@ -40,121 +41,22 @@ Aden Agent Framework is a Python-based system for building goal-driven, self-imp
## Initial Setup

### Prerequisites

See [environment-setup.md](./environment-setup.md) for the full setup guide, including Windows, Alpine Linux, and troubleshooting.

Ensure you have installed:

- **Python 3.11+** - [Download](https://www.python.org/downloads/) (3.12 or 3.13 recommended)
- **uv** - Python package manager ([Install](https://docs.astral.sh/uv/getting-started/installation/))
- **git** - Version control
- **Claude Code** - [Install](https://docs.anthropic.com/claude/docs/claude-code) (optional)
- **Codex CLI** - [Install](https://github.com/openai/codex) (optional)

Verify installation:

```bash
python --version  # Should be 3.11+
uv --version      # Should be latest
git --version     # Any recent version
```

### Quick Start

```bash
# 1. Clone the repository
git clone https://github.com/adenhq/hive.git
cd hive

# 2. Run automated setup
./quickstart.sh
```

The setup script performs these actions:

1. Checks the Python version (3.11+)
2. Installs the `framework` package from `/core` (editable mode)
3. Installs the `aden_tools` package from `/tools` (editable mode)
4. Prompts for a default LLM provider, including Hive LLM and OpenRouter
5. Fixes package compatibility (upgrades openai for litellm)
6. Verifies all installations

### API Keys (Optional)

For running agents with real LLMs:

```bash
# Add to your shell profile (~/.bashrc, ~/.zshrc, etc.)
export ANTHROPIC_API_KEY="your-key-here"
export OPENAI_API_KEY="your-key-here"         # Optional
export OPENROUTER_API_KEY="your-key-here"     # Optional, for OpenRouter models
export HIVE_API_KEY="your-key-here"           # Optional, for Hive LLM
export BRAVE_SEARCH_API_KEY="your-key-here"   # Optional, for web search tool
```
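If you are unsure which keys your shell actually exports, a quick check from Python helps — a minimal sketch (the variable names match the list above; the helper itself is illustrative, not part of the framework):

```python
import os

# Provider keys documented above; BRAVE_SEARCH_API_KEY is tool-specific.
PROVIDER_KEYS = [
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "OPENROUTER_API_KEY",
    "HIVE_API_KEY",
]

def available_providers(env=None):
    """Return the provider key names that are set and non-empty."""
    env = os.environ if env is None else env
    return [k for k in PROVIDER_KEYS if env.get(k, "").strip()]

if __name__ == "__main__":
    found = available_providers()
    print(f"Configured providers: {found or 'none'}")
```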
|
||||
|
||||
Get API keys:

- **Anthropic**: [console.anthropic.com](https://console.anthropic.com/)
- **OpenAI**: [platform.openai.com](https://platform.openai.com/)
- **OpenRouter**: [openrouter.ai/keys](https://openrouter.ai/keys)
- **Hive LLM**: [Hive Discord](https://discord.com/invite/hQdU7QDkgR)
- **Brave Search**: [brave.com/search/api](https://brave.com/search/api/)

For OpenRouter and Hive LLM configuration snippets, see [configuration.md](./configuration.md).

### Install Claude Code Skills

```bash
# Install the building-agents and testing-agent skills
./quickstart.sh
```

This sets up the MCP tools and workflows for building agents.

### Cursor IDE Support

MCP tools are also available in Cursor. To enable them:

1. Open the Command Palette (`Cmd+Shift+P` / `Ctrl+Shift+P`)
2. Run `MCP: Enable` to enable MCP servers
3. Restart Cursor to load the MCP servers from `.cursor/mcp.json`
4. Open Agent chat and verify the MCP tools are available

### Codex CLI Support

Hive supports [OpenAI Codex CLI](https://github.com/openai/codex) (v0.101.0+).

Configuration files are tracked in git:

- `.codex/config.toml` — MCP server config

To use Codex with Hive:

1. Run `codex` in the repo root
2. Start the configured MCP-assisted workflow

### Opencode Support

To enable Opencode integration:

1. Ensure the `.opencode/` directory exists (create it if needed)
2. Configure MCP servers in `.opencode/mcp.json`
3. Restart Opencode to load the MCP servers
4. Switch to the Hive agent

**Tools:** The Hive agent accesses `coder-tools` and the standard `tools` server via the MCP protocol over stdio.

### Verify Setup

```bash
# Verify package imports
uv run python -c "import framework; print('✓ framework OK')"
uv run python -c "import aden_tools; print('✓ aden_tools OK')"
uv run python -c "import litellm; print('✓ litellm OK')"

# Run an agent (after building one with coder-tools)
PYTHONPATH=exports uv run python -m your_agent_name validate
```

---
@@ -181,23 +83,29 @@ hive/ # Repository root
│
├── core/                          # CORE FRAMEWORK PACKAGE
│   ├── framework/                 # Main package code
│   │   ├── agents/                # Agent definitions and helpers
│   │   ├── builder/               # Agent builder utilities
│   │   ├── credentials/           # Credential management
│   │   ├── debugger/              # Debugging tools
│   │   ├── graph/                 # GraphExecutor - executes node graphs
│   │   ├── llm/                   # LLM provider integrations (Anthropic, OpenAI, OpenRouter, Hive, etc.)
│   │   ├── mcp/                   # MCP server integration
│   │   ├── monitoring/            # Runtime monitoring
│   │   ├── observability/         # Structured logging - human-readable and machine-parseable tracing
│   │   ├── runner/                # AgentRunner - loads and runs agents
│   │   ├── runtime/               # Runtime environment
│   │   ├── schemas/               # Data schemas
│   │   ├── server/                # HTTP API server
│   │   ├── skills/                # Skill definitions
│   │   ├── storage/               # File-based persistence
│   │   ├── testing/               # Testing utilities
│   │   ├── tools/                 # Built-in tool implementations
│   │   ├── tui/                   # Terminal UI dashboard
│   │   ├── utils/                 # Shared utilities
│   │   └── __init__.py
│   ├── tests/                     # Unit and E2E tests (including dummy agents)
│   ├── pyproject.toml             # Package metadata and dependencies
│   ├── README.md                  # Framework documentation
│   └── docs/                      # Protocol documentation
│       └── MCP_INTEGRATION_GUIDE.md  # MCP server integration guide
│
├── tools/                         # TOOLS PACKAGE (MCP tools)
│   ├── src/
@@ -320,7 +228,11 @@ If you prefer to build agents manually:
}
```

---

## Running Agents

### Using the `hive` CLI

```bash
# Browse and run agents interactively (Recommended)
hive tui

# Run a specific agent
hive run exports/my_agent --input '{"task": "Your input here"}'

# Run with TUI dashboard
hive run exports/my_agent --tui
```

### CLI Command Reference

| Command                | Description                                                              |
| ---------------------- | ------------------------------------------------------------------------ |
| `hive tui`             | Browse agents and launch TUI dashboard                                   |
| `hive run <path>`      | Execute an agent (`--tui`, `--model`, `--mock`, `--quiet`, `--verbose`)  |
| `hive shell [path]`    | Interactive REPL (`--multi`, `--no-approve`)                             |
| `hive info <path>`     | Show agent details                                                       |
| `hive validate <path>` | Validate agent structure                                                 |
| `hive list [dir]`      | List available agents                                                    |
| `hive dispatch [dir]`  | Multi-agent orchestration                                                |

### Using Python Directly

```bash
PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
```

---
## Testing Agents

### Agent Tests

```bash
# Run tests for an agent
PYTHONPATH=exports uv run python -m agent_name test
```

This generates and runs:

- **Constraint tests** - Verify the agent respects its constraints
- **Success tests** - Verify the agent achieves its success criteria
- **Integration tests** - End-to-end workflows

### Manual Testing

```bash
# Run all tests for an agent
PYTHONPATH=exports uv run python -m agent_name test

# Run a specific test type
PYTHONPATH=exports uv run python -m agent_name test --type constraint

# Run tests in parallel
PYTHONPATH=exports uv run python -m agent_name test --parallel 4

# Stop on the first failure
PYTHONPATH=exports uv run python -m agent_name test --fail-fast
```

### Framework Tests

```bash
# Run all unit tests (core + tools)
make test

# Run linting and format checks
make check
```

### Dummy Agent Tests (E2E)

The repository includes end-to-end dummy agent tests under `core/tests/dummy_agents/` that run real LLM calls against deterministic graph structures. These are **not** part of CI — run them manually to verify the executor works with real providers.

```bash
cd core && uv run python tests/dummy_agents/run_all.py
```

The script detects available LLM credentials and prompts you to pick a provider. For verbose output:

```bash
cd core && uv run python tests/dummy_agents/run_all.py --verbose
```

See [environment-setup.md](./environment-setup.md#testing-with-dummy-agents) for the full list of covered agents and details.

### Writing Custom Tests
@@ -542,8 +482,6 @@ chore(deps): update React to 18.2.0
---

## Common Tasks

### Adding Python Dependencies
@@ -660,30 +598,7 @@ hive run exports/my_agent --verbose --input '{"task": "..."}'
## Troubleshooting

### Port Already in Use

```bash
# Find the process using the port
lsof -i :3000
lsof -i :4000

# Kill the process
kill -9 <PID>
```

### Environment Variables Not Loading

```bash
# Verify the .env file exists at the project root
cat .env

# Or check the shell environment
echo $ANTHROPIC_API_KEY

# Create .env if needed, then add your API keys
```

See [environment-setup.md](./environment-setup.md#troubleshooting) for common setup issues (module-not-found errors, broken installations, PEP 668, etc.).

---
@@ -693,7 +608,3 @@ echo $ANTHROPIC_API_KEY
- **Issues**: Search [existing issues](https://github.com/adenhq/hive/issues)
- **Discord**: Join our [community](https://discord.com/invite/MXE49hrKDk)
- **Code Review**: Tag a maintainer on your PR

---

_Happy coding!_ 🐝
@@ -66,40 +66,6 @@ source .venv/bin/activate
./quickstart.sh
```

## Manual Setup (Alternative)

If you prefer to set up manually, or the script fails:

### 1. Sync Workspace Dependencies

```bash
# From the repository root - this creates a single .venv at the root
uv sync
```

> **Note:** The `uv sync` command uses the workspace configuration in `pyproject.toml` to install both the `core` (framework) and `tools` (aden_tools) packages together. This is the recommended approach over individual `pip install -e` commands, which may fail due to circular dependencies.

### 2. Activate the Virtual Environment

```bash
# Linux/macOS
source .venv/bin/activate

# Windows (PowerShell)
.venv\Scripts\Activate.ps1
```

### 3. Verify Installation

```bash
uv run python -c "import framework; print('✓ framework OK')"
uv run python -c "import aden_tools; print('✓ aden_tools OK')"
uv run python -c "import litellm; print('✓ litellm OK')"
```

> **Windows Tip:** If the verification commands fail on Windows, disable "App Execution Aliases" in Windows Settings → Apps → App Execution Aliases.

## Requirements

### Python Version
@@ -119,47 +85,6 @@ uv run python -c "import litellm; print('✓ litellm OK')"
We recommend using `quickstart.sh` for LLM API credential setup, and the credentials UI/tooling for tool credentials.

## Running Agents

The `hive` CLI is the primary interface for running agents:

```bash
# Browse and run agents interactively (Recommended)
hive tui

# Run a specific agent
hive run exports/my_agent --input '{"task": "Your input here"}'

# Run with TUI dashboard
hive run exports/my_agent --tui
```

### CLI Command Reference

| Command                | Description                                                              |
| ---------------------- | ------------------------------------------------------------------------ |
| `hive tui`             | Browse agents and launch TUI dashboard                                   |
| `hive run <path>`      | Execute an agent (`--tui`, `--model`, `--mock`, `--quiet`, `--verbose`)  |
| `hive shell [path]`    | Interactive REPL (`--multi`, `--no-approve`)                             |
| `hive info <path>`     | Show agent details                                                       |
| `hive validate <path>` | Validate agent structure                                                 |
| `hive list [dir]`      | List available agents                                                    |
| `hive dispatch [dir]`  | Multi-agent orchestration                                                |

### Using Python Directly (Alternative)

```bash
# From the hive/ repository root
PYTHONPATH=exports uv run python -m agent_name COMMAND
```

Windows (PowerShell):

```powershell
$env:PYTHONPATH="core;exports"
python -m agent_name COMMAND
```

## Building and Running New Agents

Build and run an agent using the Claude Code CLI with the agent-building skills:
@@ -454,30 +379,53 @@ hive tui
hive run exports/your_agent_name --input '{"task": "..."}'
```

## IDE Setup

### VSCode

Add to `.vscode/settings.json`:

```json
{
  "python.analysis.extraPaths": [
    "${workspaceFolder}/core",
    "${workspaceFolder}/exports"
  ],
  "python.autoComplete.extraPaths": [
    "${workspaceFolder}/core",
    "${workspaceFolder}/exports"
  ]
}
```

### PyCharm

1. Open Project Settings → Project Structure
2. Mark `core` as a Sources Root
3. Mark `exports` as a Sources Root

## Testing with Dummy Agents

The repository includes a suite of dummy agents under `core/tests/dummy_agents/` for end-to-end testing against real LLM providers. These are **not** part of CI — they make real API calls and are meant to be run manually to verify the executor works correctly.

### Running the Tests

```bash
cd core && uv run python tests/dummy_agents/run_all.py
```

The script auto-detects available LLM credentials and prompts you to pick a provider. You need at least one of:

- `ANTHROPIC_API_KEY`
- `OPENAI_API_KEY`
- `GEMINI_API_KEY`
- `ZAI_API_KEY`
- A Claude Code, Codex, or Kimi subscription

For verbose output with live LLM logs, tool calls, and node-traversal details:

```bash
cd core && uv run python tests/dummy_agents/run_all.py --verbose
```

### What's Covered

| Agent          | Tests | Coverage                                                                     |
| -------------- | ----- | ----------------------------------------------------------------------------- |
| echo           | 2     | Single-node lifecycle, basic `set_output`                                      |
| pipeline       | 4     | Multi-node traversal, `input_mapping`, conversation modes                      |
| branch         | 3     | Conditional edges, LLM-driven routing                                          |
| parallel_merge | 4     | Fan-out/fan-in, failure strategies                                             |
| retry          | 4     | Retry mechanics, exhaustion, `ON_FAILURE` edges                                |
| feedback_loop  | 3     | Feedback cycles, `max_node_visits`                                             |
| worker         | 4     | Real MCP tools (`example_tool`, `get_current_time`, `save_data`/`load_data`)   |

Typical runtime is 1–3 minutes, depending on provider latency.

### Running Individual Test Files

You can also run a specific dummy agent test with pytest directly:

```bash
cd core && uv run pytest tests/dummy_agents/test_echo.py -v
```

> **Note:** Individual pytest runs require the LLM provider to be configured via the `conftest.py` fixture. The `run_all.py` script handles this automatically.

## Environment Variables
@@ -501,54 +449,6 @@ export HIVE_CREDENTIAL_KEY="your-fernet-key"
export AGENT_STORAGE_PATH="/custom/storage"
```

## Opencode Setup

[Opencode](https://github.com/opencode-ai/opencode) is fully supported as a coding agent.

### Automatic Setup

Run the quickstart script in the root directory:

```bash
./quickstart.sh
```

## Codex Setup

[OpenAI Codex CLI](https://github.com/openai/codex) (v0.101.0+) is supported with project-level config:

- `.codex/config.toml` — MCP server configuration

These files are tracked in git and available on clone. To use Codex with Hive:

1. Run `codex` in the repo root
2. Start the configured MCP-assisted workflow

Quick verification:

```bash
test -f .codex/config.toml && echo "OK: Codex config" || echo "MISSING: .codex/config.toml"
```

## Additional Resources

- **Framework Documentation:** [core/README.md](../core/README.md)
- **Tools Documentation:** [tools/README.md](../tools/README.md)
- **Example Agents:** [examples/](../examples/)
- **Agent Building Guide:** [docs/developer-guide.md](./developer-guide.md)
- **Testing Guide:** [core/README.md](../core/README.md)

## Contributing

When contributing agent packages:

1. Place agents in `exports/agent_name/`
2. Follow the standard agent structure (see existing agents)
3. Include a README.md with usage instructions
4. Add tests if using the test workflow
5. Document required environment variables

## Support

- **Issues:** https://github.com/adenhq/hive/issues
@@ -1,75 +0,0 @@
# Hive Queen Bee: Native agent-building agent

## Problem

Building a Hive agent today requires manual assembly of 7+ files (`agent.py`, `config.py`, `nodes/__init__.py`, `__init__.py`, `__main__.py`, `mcp_servers.json`, tests) with precise framework conventions — correct imports, entry_points format, conversation_mode values, STEP 1/STEP 2 prompt patterns, nullable_output_keys, and more. A single missing re-export in `__init__.py` silently breaks `AgentRunner.load()`. This is the #1 friction point for new users and a recurring source of bugs even for experienced ones.

There is no tool that understands the framework deeply enough to produce correct agents. General-purpose coding assistants hallucinate tool names, use wrong import paths (`from core.framework...`), create too many thin nodes, forget module-level exports, and produce agents that fail validation.

## Proposal

Build **Hive Coder** (codename "Queen Bee") — a framework-native coding agent that lives inside the framework itself and builds complete, validated agent packages from natural language.

### Design principles

1. **Single-node, forever-alive** — One continuous EventLoopNode conversation handles the full lifecycle (understand, qualify, design, implement, verify, iterate). No artificial phase boundaries that destroy context.

2. **Meta-agent capabilities** — Not just a file writer. It can discover available MCP tools at runtime, inspect sessions/checkpoints of agents it builds, run their test suites, and debug failures.

3. **Self-verifying** — Runs three validation steps after every build: class validation (graph structure), `AgentRunner.load()` (package export contract), and pytest. Fixes its own errors, up to 3 attempts.

4. **Honest qualification** — Assesses framework fit before building. If a use case is a poor fit (needs sub-second latency, pure CRUD, massive data pipelines), it says so instead of producing a bad agent.

5. **Reference-grounded** — Ships with embedded reference docs (framework guide, file templates, anti-patterns) that it reads before writing code. No reliance on training data for framework specifics.

### Components

#### `hive_coder` agent (`core/framework/agents/hive_coder/`)

| File | Purpose |
|------|---------|
| `agent.py` | Goal, single-node graph, `HiveCoderAgent` class |
| `nodes/__init__.py` | `coder` EventLoopNode with a comprehensive system prompt |
| `config.py` | RuntimeConfig with `~/.hive/configuration.json` auto-detection |
| `__main__.py` | Click CLI (`run`, `tui`, `info`, `validate`, `shell`) |
| `reference/framework_guide.md` | Node types, edges, patterns, async entry points |
| `reference/file_templates.md` | Complete code templates for every agent file |
| `reference/anti_patterns.md` | 22 common mistakes with explanations |

#### Coder Tools MCP Server (`tools/coder_tools_server.py`)

A dedicated tool server providing:

- **File I/O**: `read_file` (with line numbers, offset/limit), `write_file` (auto-mkdir), `edit_file` (9-strategy fuzzy matching ported from opencode), `list_directory`, `search_files` (regex)
- **Shell**: `run_command` (timeout, cwd, output truncation)
- **Git**: `undo_changes` (snapshot-based rollback)
- **Meta-agent**: `discover_mcp_tools`, `list_agents`, `list_agent_sessions`, `list_agent_checkpoints`, `get_agent_checkpoint`, `run_agent_tests`

All file operations are sandboxed to a configurable project root.
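The sandboxing check can be as simple as path resolution against the configured root — a sketch of the idea (not the server's actual code):

```python
from pathlib import Path

def resolve_sandboxed(root: Path, user_path: str) -> Path:
    """Resolve user_path inside root, rejecting escapes like '../' or absolute paths."""
    resolved_root = root.resolve()
    candidate = (resolved_root / user_path).resolve()
    if not candidate.is_relative_to(resolved_root):  # Python 3.9+
        raise PermissionError(f"{user_path!r} escapes the project root")
    return candidate
```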
#### Framework changes

- `hive code` CLI command — direct launch shortcut
- `hive tui` — discovers framework agents as a source
- `AgentRuntime` — cron expression support (`croniter`) for async entry points
- `prompt_composer` — appends the current datetime to system prompts
- `NodeSpec.max_node_visits` — default changed from 1 to 0 (unbounded), matching forever-alive as the standard pattern
- TUI graph view — cron display and hours in the countdown
- Graceful `CredentialError` handling in TUI launch

## Acceptance criteria

- [ ] `hive code` launches Hive Coder in the TUI
- [ ] `hive tui` lists framework agents alongside `exports/` and `examples/`
- [ ] Given "build me a research agent that searches the web and summarizes findings", Hive Coder produces a valid package in `exports/` that passes `AgentRunner.load()`
- [ ] Tool discovery works: the agent calls `discover_mcp_tools()` before designing and never fabricates tool names
- [ ] Self-verification: the agent runs all 3 validation steps and fixes errors before presenting
- [ ] Cron timers fire on schedule (unit tested)
- [ ] The `max_node_visits=0` default does not break existing agents or tests
- [ ] Reference docs are accurate and match current framework behavior

## Non-goals

- Multi-agent orchestration (queen spawning worker agents at runtime) — future work
- GUI/web interface — TUI only for v1
- Auto-publishing to a registry — agents are local packages
@@ -1,288 +0,0 @@
# Plan: Multi-Graph Sessions with Guardian Pattern

## Context

The target experience: hive_coder builds an agent (e.g., email automation), loads it into the same runtime session, and acts as its guardian. The email agent runs autonomously while hive_coder watches for failures. On error, hive_coder asks the user for help if they're around, attempts an autonomous fix if they're away, and escalates catastrophic failures for post-mortem.

This requires multiple agent graphs sharing a single `AgentRuntime` session — shared memory and data, but isolated conversations. The existing runtime already has most of the primitives: `ExecutionStream` accepts its own `graph`, `trigger_type="event"` subscribes entry points to the EventBus, and `_get_primary_session_state()` bridges memory across streams.

## Architecture Overview

```
AgentRuntime (shared EventBus, shared state.json, shared data/)
├── hive_coder graph
│   ├── Stream "default"  → coder node (client_facing, manual)
│   └── Stream "guardian" → guardian node (event-driven, subscribes to EXECUTION_FAILED)
└── email_agent graph
    └── Stream "email_agent::default" → intake node (client_facing, manual)
```

The guardian entry point on hive_coder fires when email_agent emits `EXECUTION_FAILED`. It receives the failure event in its input, reads shared memory for context, and decides: ask the user (if present), auto-fix (if away), or escalate (if catastrophic).

## Gap 1: Event Scoping — `graph_id` on Events

**Problem**: EventBus events carry `stream_id` and `node_id` but no `graph_id`. The guardian needs to subscribe to events from a specific graph (email_agent), not a specific stream name.

**Solution**: Add `graph_id: str | None = None` to `AgentEvent` and `filter_graph` to `Subscription`.

### `core/framework/runtime/event_bus.py`

- `AgentEvent` dataclass: add a `graph_id: str | None = None` field; include it in `to_dict()`
- `Subscription` dataclass: add `filter_graph: str | None = None`
- `subscribe()`: accept a `filter_graph` param, pass it to `Subscription`
- `_matches()`: check `filter_graph` against `event.graph_id`

### `core/framework/runtime/execution_stream.py`

- `__init__()`: accept `graph_id: str | None = None`, store as `self.graph_id`
- When emitting events via `_event_bus.publish()`: set `event.graph_id = self.graph_id`
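To make the filtering concrete, here is a minimal self-contained sketch of the proposed matching logic (simplified; the real `AgentEvent` and `Subscription` carry more fields):

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class AgentEvent:
    event_type: str
    stream_id: str | None = None
    node_id: str | None = None
    graph_id: str | None = None  # NEW: which graph emitted this event

@dataclass
class Subscription:
    event_type: str
    filter_graph: str | None = None  # NEW: only match events from this graph

def matches(sub: Subscription, event: AgentEvent) -> bool:
    """Simplified version of Subscription._matches() with graph scoping."""
    if sub.event_type != event.event_type:
        return False
    if sub.filter_graph is not None and event.graph_id != sub.filter_graph:
        return False
    return True
```

With this, the guardian subscribes with `filter_graph="email_agent"` and ignores failures coming from any other graph.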
## Gap 2: Multi-Graph Runtime — `add_graph()` / `remove_graph()`

**Problem**: `AgentRuntime.__init__` takes a single `GraphSpec`. We need to add and remove graphs dynamically at runtime.

**Solution**: Keep the primary graph on `__init__`. Add methods to register secondary graphs that create their own `ExecutionStream` instances backed by a different graph.

### `core/framework/runtime/agent_runtime.py`

New instance state:

```python
self._graph_id: str = graph_id or "primary"        # ID for the primary graph
self._graphs: dict[str, _GraphRegistration] = {}   # graph_id -> registration
self._active_graph_id: str = self._graph_id        # TUI focus
```

Where `_GraphRegistration` is a simple dataclass:

```python
@dataclass
class _GraphRegistration:
    graph: GraphSpec
    goal: Goal
    entry_points: dict[str, EntryPointSpec]
    streams: dict[str, ExecutionStream]
    storage_subpath: str            # relative to session root, e.g. "graphs/email_agent"
    event_subscriptions: list[str]  # EventBus subscription IDs
    timer_tasks: list[asyncio.Task]
```

New methods:

- `add_graph(graph_id, graph, goal, entry_points, storage_subpath=None)` — creates streams for the graph using graph-scoped storage, sets up event/timer triggers, and stamps `graph_id` on all streams. Can be called while running.
- `remove_graph(graph_id)` — stops streams, cancels timers, unsubscribes events, removes the registration. Cannot remove the primary graph.
- `list_graphs() -> list[str]` — returns all graph IDs
- `active_graph_id` property with setter — the TUI uses this to control which graph's events are displayed

Update existing methods:

- `start()`: stamp `self._graph_id` on primary graph streams (via `ExecutionStream.graph_id`)
- `inject_input(node_id, content)`: search the active graph's streams first, then all others
- `_get_primary_session_state()`: search across ALL graphs' streams (not just the primary's)
- `stop()`: stop all secondary graph streams/timers/subscriptions too
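The registration bookkeeping, reduced to its core (a sketch — the real methods also wire up streams, timers, and event subscriptions):

```python
class GraphRegistry:
    """Minimal model of AgentRuntime's multi-graph bookkeeping."""

    def __init__(self, primary_id="primary"):
        self._primary_id = primary_id
        self._graphs = {primary_id: {}}        # graph_id -> registration data
        self.active_graph_id = primary_id      # which graph the TUI displays

    def add_graph(self, graph_id, registration):
        if graph_id in self._graphs:
            raise ValueError(f"graph {graph_id!r} already registered")
        self._graphs[graph_id] = registration

    def remove_graph(self, graph_id):
        if graph_id == self._primary_id:
            raise ValueError("cannot remove the primary graph")
        del self._graphs[graph_id]

    def list_graphs(self):
        return list(self._graphs)
```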
### Storage Layout

```
~/.hive/agents/hive_coder/sessions/{session_id}/
  state.json               ← SHARED across all graphs
  data/                    ← SHARED data directory
  conversations/coder/     ← hive_coder conversations
  graphs/
    email_agent/           ← secondary graph storage root
      conversations/
        intake/
      checkpoints/
```

Secondary graph executors get `storage_path = {session_root}/graphs/{graph_id}/`, while `state.json` and `data/` remain at the session root. The `resume_session_id` mechanism in `_get_primary_session_state()` already handles this — secondary executions find the primary session's `state.json`.

**Concurrent state.json writes**: For the guardian pattern (sequential: email_agent fails → guardian triggers), no file lock is needed. But since both could technically write concurrently, add a simple `fcntl.flock()` wrapper around `_write_progress()` in the executor. A small, defensive change.
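A sketch of what that defensive wrapper might look like (Unix-only, since `fcntl` is POSIX; the function name mirrors `_write_progress` but is illustrative):

```python
import fcntl
import json
from pathlib import Path

def write_progress_locked(state_path: Path, state: dict) -> None:
    """Serialize state to state.json under an exclusive advisory lock."""
    with open(state_path, "a+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # blocks until any other writer releases
        try:
            f.seek(0)
            f.truncate()                # rewrite the file from scratch
            json.dump(state, f)
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```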
## Gap 3: Guardian Pattern — User Presence + Autonomous Recovery

**Problem**: When email_agent fails, hive_coder's guardian entry point must decide: ask the user, or auto-fix.

**Solution**: User presence is a runtime-level signal. The guardian's system prompt and event data give it enough context to decide.

### User Presence Tracking

Add to `AgentRuntime`:

```python
self._last_user_input_time: float = 0.0  # monotonic timestamp
```

Updated in `inject_input()` (called whenever the user types in the TUI). Exposed as:

```python
@property
def user_idle_seconds(self) -> float:
    if self._last_user_input_time == 0:
        return float('inf')
    return time.monotonic() - self._last_user_input_time
```

The guardian node's system prompt instructs the LLM: "If user_idle_seconds < 120, ask the user for guidance via the client-facing interaction. If the user is away, attempt an autonomous fix."

This is NOT framework logic — it's prompt-driven. The guardian node is a regular `event_loop` node with `client_facing=True` and tools for code editing + agent lifecycle. The LLM decides the strategy based on presence info injected as context.
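The strategy the prompt encodes, written out as plain logic for clarity (in the actual design this decision is made by the LLM, not by framework code):

```python
def guardian_strategy(user_idle_seconds: float, catastrophic: bool = False) -> str:
    """Mirror of the prompt's decision rule: ask, fix, or escalate."""
    if catastrophic:
        return "escalate"    # save a post-mortem entry for the user
    if user_idle_seconds < 120:
        return "ask_user"    # user is present: request guidance
    return "auto_fix"        # user is away: attempt an autonomous repair
```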
|
||||
|
||||
### Escalation Model

Escalation = save a structured log entry. No special framework support needed. The guardian node uses `save_data("escalation_log.jsonl", ...)` via the existing data tools. The LLM writes:

```json
{"timestamp": "...", "severity": "catastrophic", "agent": "email_agent", "error": "...", "attempted_fixes": [...], "recommended_action": "..."}
```

Post-mortem: the user opens `/data escalation_log.jsonl`, or the TUI shows a notification linking to it.

## Gap 4: Graph Lifecycle Tools — Stop/Reload/Restart

**Problem**: hive_coder needs to programmatically stop a broken agent, fix its code, reload it, and restart it.

**Solution**: MCP tools accessible to the active agent. Uses `ContextVar` to access the runtime (same pattern as `data_dir`).

### `core/framework/tools/session_graph_tools.py` (NEW)

```python
async def load_agent(agent_path: str) -> str:
    """Load an agent graph into the running session."""


async def unload_agent(graph_id: str) -> str:
    """Stop and remove an agent graph from the session."""


async def start_agent(graph_id: str, entry_point: str = "default", input_data: str = "{}") -> str:
    """Trigger an entry point on a loaded agent graph."""


async def restart_agent(graph_id: str) -> str:
    """Unload and re-load an agent (picks up code changes)."""


async def list_agents() -> str:
    """List all agent graphs in the current session with their status."""


async def get_user_presence() -> str:
    """Return user idle time and presence status."""
```

These tools call `runtime.add_graph()`, `runtime.remove_graph()`, `runtime.trigger()`, etc.

### Registration

These tools are registered via `ToolRegistry` with `CONTEXT_PARAM` for `runtime` (injected by the executor, same as `data_dir`). Only available when the runtime is multi-graph capable (set by `cmd_code()`).

## Gap 5: TUI Integration — Graph Switching + Background Notifications

### `core/framework/tui/app.py`

- `_route_event()`: check `event.graph_id` against `runtime.active_graph_id`
  - Events from the active graph: route normally (streaming, chat, etc.)
  - `CLIENT_INPUT_REQUESTED` from a background graph: show notification bar
  - `EXECUTION_FAILED` from a background graph: show error notification
  - `EXECUTION_COMPLETED` from a background graph: show brief completion notice
  - Other background events: silent (visible in logs)
- `action_switch_graph(graph_id)`: update `runtime.active_graph_id`, refresh graph view, show header

### `core/framework/tui/widgets/chat_repl.py`

- Track `_input_graph_id: str | None` alongside `_input_node_id`
- `handle_input_requested(node_id, graph_id)`: if from a background graph, show a notification instead of enabling input
- `_submit_input()`: pass `graph_id` so `inject_input()` can route correctly
- New TUI commands:
  - `/graphs` — list loaded graphs and their status
  - `/graph <id>` — switch active graph focus
  - `/load <path>` — load an agent graph into the session
  - `/unload <id>` — remove a graph from the session
- On graph switch: flush streaming state, render graph header separator

### `core/framework/tui/widgets/graph_view.py`

- `switch_graph(graph_id)` — re-render the graph visualization for the new active graph
- When multi-graph is active: show a tab-like header listing all loaded graphs

## Gap 6: CLI + Runner Integration

### `core/framework/runner/cli.py`

- `cmd_code()` creates the hive_coder runtime with `graph_id="hive_coder"`
- Registers `session_graph_tools` with the tool config so hive_coder's LLM can call them
- Sets the `runtime._multi_graph_capable = True` flag

### `core/framework/runner/runner.py`

- New method: `setup_as_secondary(runtime, graph_id)` — configures this runner to join an existing `AgentRuntime` as a secondary graph. Uses the existing `AgentRunner.load()` to parse agent.json, then calls `runtime.add_graph()` with the parsed graph/goal/entry_points.

## Gap 7: Reliable Mid-Node Resume

**Problem**: When an EventLoopNode is interrupted (crash, Ctrl+Z, context switch), resume doesn't restore execution to exactly where it stopped. Several pieces of in-node state are lost, which changes behavior post-resume. In multi-graph sessions with parallel execution and frequent context switching, these gaps compound.

### What's already restored correctly

- **Conversation history**: all messages are persisted to disk immediately via `FileConversationStore._persist()` — one file per message in `parts/NNNNNNNNNN.json`
- **OutputAccumulator values**: written through to `cursor.json` on every `accumulator.set()` call
- **Iteration counter**: written to `cursor.json` at the end of each iteration (step 6g)
- **Orphaned tool calls**: `_repair_orphaned_tool_calls()` patches in-flight tool calls with error messages so the LLM knows to retry

### What's lost — and fixes

#### 1. `user_interaction_count` (CRITICAL)

Resets to 0 on resume. This counter controls client-facing blocking semantics: before the first interaction, `set_output`-only turns don't prevent blocking (the LLM must present to the user first). After resume, a node that had 3 user interactions behaves as if the user never interacted.

**Fix**: Persist `user_interaction_count` to `cursor.json` alongside `iteration` and `outputs`. Write it in `_write_cursor()` (step 6g), restore it in `_restore()`.

**Files**: `core/framework/graph/event_loop_node.py`

#### 2. Accumulator outputs not in SharedMemory

The `OutputAccumulator` writes to `cursor.json` (durable) but only writes to `SharedMemory` when the judge ACCEPTs. On crash, the CancelledError handler captures `memory.read_all()` — which doesn't include the accumulator's WIP values. On resume, edge conditions checking those memory keys see `None`.

**Fix**: In the executor's `CancelledError` handler, read the interrupted node's `cursor.json` and write any accumulator outputs to `memory` before building `session_state_out`. This ensures resume memory includes WIP output values.

**Files**: `core/framework/graph/executor.py` (CancelledError handler, ~line 1289)

#### 3. Stall/doom-loop detection counters

`recent_responses` and `recent_tool_fingerprints` reset to empty lists, so a previously near-stalled node gets a fresh detection budget.

**Fix**: Persist these to `cursor.json`. They're small (the last N strings). Write in `_write_cursor()`, restore in `_restore()`.

**Files**: `core/framework/graph/event_loop_node.py`

#### 4. `continuous_conversation` at executor level

In continuous mode, the executor's `continuous_conversation` variable is `None` on resume. The node's `_restore()` recovers messages from disk, but the executor doesn't pre-populate this variable until the node returns.

**Fix**: After a resumed node completes, set `continuous_conversation = result.conversation` (this already happens in the normal path at line 1155 — verify it also runs on the resume path).

**Files**: `core/framework/graph/executor.py`

### Multi-graph specific: independent resume per graph

Each graph in a multi-graph session has its own storage subdirectory (`graphs/{graph_id}/`) with its own `conversations/`, `checkpoints/`, and `cursor.json` files. Resume is already per-executor, so each graph resumes independently. The shared `state.json` at the session root captures the union of all graphs' memory — the `fcntl.flock()` wrapper on `_write_progress()` (Gap 2) ensures concurrent writes don't corrupt it.

### Implementation

These fixes are a prerequisite for multi-graph and should be done as **Phase 0**, before the EventBus changes:

1. Persist `user_interaction_count` + stall/doom counters to `cursor.json`
2. Restore them in `_restore()`
3. Flush accumulator outputs to SharedMemory in the executor's CancelledError handler
4. Verify `continuous_conversation` is set on the resume path

## Implementation Phases

### Phase 0: Reliable Mid-Node Resume (prerequisite)

1. `event_loop_node.py` — persist `user_interaction_count`, `recent_responses`, `recent_tool_fingerprints` to `cursor.json` via `_write_cursor()`; restore in `_restore()`
2. `executor.py` — in the CancelledError handler, read the interrupted node's `cursor.json` accumulator outputs and write them to `memory` before building `session_state_out`
3. `executor.py` — verify `continuous_conversation` is populated on the resume path

### Phase 1: EventBus Foundation

1. `event_bus.py` — `graph_id` on `AgentEvent`, `filter_graph` on `Subscription` + `_matches()`
2. `execution_stream.py` — accept and stamp `graph_id` on emitted events

### Phase 2: Multi-Graph Runtime

3. `agent_runtime.py` — `_GraphRegistration` dataclass, `add_graph()`, `remove_graph()`, `list_graphs()`, `active_graph_id` property
4. `agent_runtime.py` — update `inject_input()`, `_get_primary_session_state()`, `stop()` for multi-graph
5. `agent_runtime.py` — user presence tracking (`_last_user_input_time`, `user_idle_seconds`)
6. Storage path logic: secondary graphs get `{session_root}/graphs/{graph_id}/`

### Phase 3: Graph Lifecycle Tools

7. `core/framework/tools/session_graph_tools.py` — `load_agent`, `unload_agent`, `start_agent`, `restart_agent`, `list_agents`, `get_user_presence`
8. `runner.py` — `setup_as_secondary()` method

### Phase 4: TUI Integration

9. `app.py` — `graph_id` event filtering, background notifications, `action_switch_graph`
10. `chat_repl.py` — `/graphs`, `/graph`, `/load`, `/unload` commands, graph_id tracking
11. `graph_view.py` — multi-graph header, `switch_graph()`

### Phase 5: hive_coder Integration

12. `cli.py` — `cmd_code()` sets up a multi-graph capable runtime and registers the graph tools
13. hive_coder's agent config — add a guardian entry point with `trigger_type="event"` subscribing to `EXECUTION_FAILED`
14. Guardian node system prompt — presence-aware triage logic (ask user / auto-fix / escalate)

## Backward Compatibility

- Single-graph `hive run exports/my_agent` is unchanged: `graph_id` defaults to `None`, no secondary graphs are loaded, events carry `graph_id=None`, and the TUI shows no graph-switching UI
- All new fields are optional with `None` defaults
- `_get_primary_session_state()` preserves its existing behavior when no secondary graphs exist

## Verification

1. **Unit**: `add_graph()` creates streams with the correct `graph_id`, events carry `graph_id`, `filter_graph` works in subscriptions, `inject_input()` routes to the correct graph
2. **Integration**: load hive_coder + email_agent; email_agent fails → guardian fires → reads shared memory → decides action
3. **TUI**: `/graphs` shows both, `/graph` switches, a background failure notification appears, input routing works across graphs
4. **Backward compat**: `hive run exports/deep_research_agent --tui` works unchanged
5. **Lifecycle**: `restart_agent` picks up code changes, `unload_agent` cleans up streams and subscriptions