docs: add instruction for running dummy agents and remove old documentation

This commit is contained in:
Richard Tang
2026-03-24 18:20:27 -07:00
parent 8ecb728148
commit 3154e34c7a
5 changed files with 112 additions and 2079 deletions
File diff suppressed because it is too large
+71 -160
@@ -8,11 +8,12 @@ This guide covers everything you need to know to develop with the Aden Agent Fra
2. [Initial Setup](#initial-setup)
3. [Project Structure](#project-structure)
4. [Building Agents](#building-agents)
5. [Running Agents](#running-agents)
6. [Testing Agents](#testing-agents)
7. [Code Style & Conventions](#code-style--conventions)
8. [Git Workflow](#git-workflow)
9. [Common Tasks](#common-tasks)
10. [Troubleshooting](#troubleshooting)
---
@@ -40,121 +41,22 @@ Aden Agent Framework is a Python-based system for building goal-driven, self-imp
## Initial Setup
### Prerequisites
See [environment-setup.md](./environment-setup.md) for the full setup guide, including Windows, Alpine Linux, and troubleshooting.
Ensure you have installed:
- **Python 3.11+** - [Download](https://www.python.org/downloads/) (3.12 or 3.13 recommended)
- **uv** - Python package manager ([Install](https://docs.astral.sh/uv/getting-started/installation/))
- **git** - Version control
- **Claude Code** - [Install](https://docs.anthropic.com/claude/docs/claude-code) (optional)
- **Codex CLI** - [Install](https://github.com/openai/codex) (optional)
Verify installation:
### Quick Start
```bash
python --version # Should be 3.11+
uv --version # Should be latest
git --version # Any recent version
```
### Step-by-Step Setup
```bash
# 1. Clone the repository
git clone https://github.com/adenhq/hive.git
cd hive
# 2. Run automated setup
./quickstart.sh
```
The setup script performs these actions:
1. Checks Python version (3.11+)
2. Installs `framework` package from `/core` (editable mode)
3. Installs `aden_tools` package from `/tools` (editable mode)
4. Prompts for a default LLM provider, including Hive LLM and OpenRouter
5. Fixes package compatibility (upgrades openai for litellm)
6. Verifies all installations
### API Keys (Optional)
For running agents with real LLMs:
```bash
# Add to your shell profile (~/.bashrc, ~/.zshrc, etc.)
export ANTHROPIC_API_KEY="your-key-here"
export OPENAI_API_KEY="your-key-here" # Optional
export OPENROUTER_API_KEY="your-key-here" # Optional, for OpenRouter models
export HIVE_API_KEY="your-key-here" # Optional, for Hive LLM
export BRAVE_SEARCH_API_KEY="your-key-here" # Optional, for web search tool
```
Get API keys:
- **Anthropic**: [console.anthropic.com](https://console.anthropic.com/)
- **OpenAI**: [platform.openai.com](https://platform.openai.com/)
- **OpenRouter**: [openrouter.ai/keys](https://openrouter.ai/keys)
- **Hive LLM**: [Hive Discord](https://discord.com/invite/hQdU7QDkgR)
- **Brave Search**: [brave.com/search/api](https://brave.com/search/api/)
For OpenRouter and Hive LLM configuration snippets, see [configuration.md](./configuration.md).
### Install Claude Code Skills
```bash
# Install building-agents and testing-agent skills
./quickstart.sh
```
This sets up the MCP tools and workflows for building agents.
### Cursor IDE Support
MCP tools are also available in Cursor. To enable:
1. Open Command Palette (`Cmd+Shift+P` / `Ctrl+Shift+P`)
2. Run `MCP: Enable` to enable MCP servers
3. Restart Cursor to load the MCP servers from `.cursor/mcp.json`
4. Open Agent chat and verify MCP tools are available
### Codex CLI Support
Hive supports [OpenAI Codex CLI](https://github.com/openai/codex) (v0.101.0+).
Configuration files are tracked in git:
- `.codex/config.toml` — MCP server config
To use Codex with Hive:
1. Run `codex` in the repo root
2. Start the configured MCP-assisted workflow
Example:
```
Start Codex in the repo root and use the configured MCP tools
```
### Opencode Support
To enable Opencode integration:
1. Ensure the `.opencode/` directory exists (create it if needed)
2. Configure MCP servers in `.opencode/mcp.json`
3. Restart Opencode to load the MCP servers
4. Switch to the Hive agent
* **Tools:** Accesses `coder-tools` and the standard `tools` server via the MCP protocol over stdio.
### Verify Setup
```bash
# Verify package imports
uv run python -c "import framework; print('✓ framework OK')"
uv run python -c "import aden_tools; print('✓ aden_tools OK')"
uv run python -c "import litellm; print('✓ litellm OK')"
# Run an agent (after building one with coder-tools)
PYTHONPATH=exports uv run python -m your_agent_name validate
```
---
@@ -181,23 +83,29 @@ hive/ # Repository root
├── core/ # CORE FRAMEWORK PACKAGE
│ ├── framework/ # Main package code
│ │ ├── agents/ # Agent definitions and helpers
│ │ ├── builder/ # Agent builder utilities
│ │ ├── credentials/ # Credential management
│ │ ├── debugger/ # Debugging tools
│ │ ├── graph/ # GraphExecutor - executes node graphs
│ │ ├── llm/ # LLM provider integrations (Anthropic, OpenAI, OpenRouter, Hive, etc.)
│ │ ├── mcp/ # MCP server integration
│ │ ├── monitoring/ # Runtime monitoring
│ │ ├── observability/ # Structured logging - human-readable and machine-parseable tracing
│ │ ├── runner/ # AgentRunner - loads and runs agents
│ │ ├── runtime/ # Runtime environment
│ │ ├── schemas/ # Data schemas
│ │ ├── server/ # HTTP API server
│ │ ├── skills/ # Skill definitions
│ │ ├── storage/ # File-based persistence
│ │ ├── testing/ # Testing utilities
│ │ ├── tools/ # Built-in tool implementations
│ │ ├── tui/ # Terminal UI dashboard
│ │ ├── utils/ # Shared utilities
│ │ └── __init__.py
│ ├── tests/ # Unit and E2E tests (including dummy agents)
│ ├── pyproject.toml # Package metadata and dependencies
│ ├── README.md # Framework documentation
│ ├── docs/ # Protocol documentation
│ └── MCP_INTEGRATION_GUIDE.md # MCP server integration guide
├── tools/ # TOOLS PACKAGE (MCP tools)
│ ├── src/
@@ -320,7 +228,11 @@ If you prefer to build agents manually:
}
```
---
## Running Agents
### Using the `hive` CLI
```bash
# Browse and run agents interactively (Recommended)
@@ -331,33 +243,35 @@ hive run exports/my_agent --input '{"ticket_content": "My login is broken", "cus
# Run with TUI dashboard
hive run exports/my_agent --tui
```
### CLI Command Reference
| Command | Description |
| ---------------------- | ----------------------------------------------------------------------- |
| `hive tui` | Browse agents and launch TUI dashboard |
| `hive run <path>` | Execute an agent (`--tui`, `--model`, `--mock`, `--quiet`, `--verbose`) |
| `hive shell [path]` | Interactive REPL (`--multi`, `--no-approve`) |
| `hive info <path>` | Show agent details |
| `hive validate <path>` | Validate agent structure |
| `hive list [dir]` | List available agents |
| `hive dispatch [dir]` | Multi-agent orchestration |
### Using Python Directly
```bash
PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
```
---
## Testing Agents
### Agent Tests
This generates and runs:
- **Constraint tests** - Verify agent respects constraints
- **Success tests** - Verify agent achieves success criteria
- **Integration tests** - End-to-end workflows
```bash
# Run all tests for an agent
PYTHONPATH=exports uv run python -m agent_name test
# Run specific test type
PYTHONPATH=exports uv run python -m agent_name test --type constraint
@@ -370,6 +284,32 @@ PYTHONPATH=exports uv run python -m agent_name test --parallel 4
PYTHONPATH=exports uv run python -m agent_name test --fail-fast
```
### Framework Tests
```bash
# Run all unit tests (core + tools)
make test
# Run linting and format checks
make check
```
### Dummy Agent Tests (E2E)
The repository includes end-to-end dummy agent tests under `core/tests/dummy_agents/` that run real LLM calls against deterministic graph structures. These are **not** part of CI — run them manually to verify the executor works with real providers.
```bash
cd core && uv run python tests/dummy_agents/run_all.py
```
The script detects available LLM credentials and prompts you to pick a provider. For verbose output:
```bash
cd core && uv run python tests/dummy_agents/run_all.py --verbose
```
See [environment-setup.md](./environment-setup.md#testing-with-dummy-agents) for the full list of covered agents and details.
### Writing Custom Tests
@@ -542,8 +482,6 @@ chore(deps): update React to 18.2.0
---
## Common Tasks
### Adding Python Dependencies
@@ -660,30 +598,7 @@ hive run exports/my_agent --verbose --input '{"task": "..."}'
## Troubleshooting
### Port Already in Use
```bash
# Find process using port
lsof -i :3000
lsof -i :4000
# Kill process
kill -9 <PID>
```
### Environment Variables Not Loading
```bash
# Verify .env file exists at project root
cat .env
# Or check shell environment
echo $ANTHROPIC_API_KEY
# Create .env if needed
# Then add your API keys
```
See [environment-setup.md](./environment-setup.md#troubleshooting) for common setup issues (module not found errors, broken installations, PEP 668, etc.).
---
@@ -693,7 +608,3 @@ echo $ANTHROPIC_API_KEY
- **Issues**: Search [existing issues](https://github.com/adenhq/hive/issues)
- **Discord**: Join our [community](https://discord.com/invite/MXE49hrKDk)
- **Code Review**: Tag a maintainer on your PR
---
_Happy coding!_ 🐝
+41 -141
@@ -66,40 +66,6 @@ source .venv/bin/activate
./quickstart.sh
```
## Manual Setup (Alternative)
If you prefer to set up manually or the script fails:
### 1. Sync Workspace Dependencies
```bash
# From repository root - this creates a single .venv at the root
uv sync
```
> **Note:** The `uv sync` command uses the workspace configuration in `pyproject.toml` to install both the `core` (framework) and `tools` (aden_tools) packages together. This is recommended over individual `pip install -e` commands, which may fail due to circular dependencies.
### 2. Activate the Virtual Environment
```bash
# Linux/macOS
source .venv/bin/activate
# Windows (PowerShell)
.venv\Scripts\Activate.ps1
```
### 3. Verify Installation
```bash
uv run python -c "import framework; print('✓ framework OK')"
uv run python -c "import aden_tools; print('✓ aden_tools OK')"
uv run python -c "import litellm; print('✓ litellm OK')"
```
> **Windows Tip:**
> If the verification commands fail on Windows, disable "App Execution Aliases" in Windows Settings → Apps → App Execution Aliases.
## Requirements
### Python Version
@@ -119,47 +85,6 @@ uv run python -c "import litellm; print('✓ litellm OK')"
We recommend using `quickstart.sh` for LLM API credential setup and the credentials UI/tooling for tool credentials.
## Running Agents
The `hive` CLI is the primary interface for running agents:
```bash
# Browse and run agents interactively (Recommended)
hive tui
# Run a specific agent
hive run exports/my_agent --input '{"task": "Your input here"}'
# Run with TUI dashboard
hive run exports/my_agent --tui
```
### CLI Command Reference
| Command | Description |
| ---------------------- | ----------------------------------------------------------------------- |
| `hive tui` | Browse agents and launch TUI dashboard |
| `hive run <path>` | Execute an agent (`--tui`, `--model`, `--mock`, `--quiet`, `--verbose`) |
| `hive shell [path]` | Interactive REPL (`--multi`, `--no-approve`) |
| `hive info <path>` | Show agent details |
| `hive validate <path>` | Validate agent structure |
| `hive list [dir]` | List available agents |
| `hive dispatch [dir]` | Multi-agent orchestration |
### Using Python directly (alternative)
```bash
# From /hive/ directory
PYTHONPATH=exports uv run python -m agent_name COMMAND
```
Windows (PowerShell):
```powershell
$env:PYTHONPATH="core;exports"
python -m agent_name COMMAND
```
## Building and Running New Agents
Build and run an agent using Claude Code CLI with the agent building skills:
@@ -454,30 +379,53 @@ hive tui
hive run exports/your_agent_name --input '{"task": "..."}'
```
## Testing with Dummy Agents
The repository includes a suite of dummy agents under `core/tests/dummy_agents/` for end-to-end testing against real LLM providers. These are **not** part of CI — they make real API calls and are meant to be run manually to verify the executor works correctly.
### Running the Tests
```bash
cd core && uv run python tests/dummy_agents/run_all.py
```
The script auto-detects available LLM credentials and prompts you to pick a provider. You need at least one of:
- `ANTHROPIC_API_KEY`
- `OPENAI_API_KEY`
- `GEMINI_API_KEY`
- `ZAI_API_KEY`
- A Claude Code, Codex, or Kimi subscription
For verbose output with live LLM logs, tool calls, and node traversal details:
```bash
cd core && uv run python tests/dummy_agents/run_all.py --verbose
```
### What's Covered
| Agent | Tests | Coverage |
| -------------- | ----- | ------------------------------------------------- |
| echo | 2 | Single-node lifecycle, basic `set_output` |
| pipeline | 4 | Multi-node traversal, `input_mapping`, conversation modes |
| branch | 3 | Conditional edges, LLM-driven routing |
| parallel_merge | 4 | Fan-out/fan-in, failure strategies |
| retry | 4 | Retry mechanics, exhaustion, `ON_FAILURE` edges |
| feedback_loop | 3 | Feedback cycles, `max_node_visits` |
| worker | 4 | Real MCP tools (`example_tool`, `get_current_time`, `save_data`/`load_data`) |
Typical runtime is 13 minutes, depending on provider latency.
### Running Individual Test Files
You can also run a specific dummy agent test with pytest directly:
```bash
cd core && uv run pytest tests/dummy_agents/test_echo.py -v
```
> **Note:** Individual pytest runs require the LLM provider to be configured via the `conftest.py` fixture. The `run_all.py` script handles this automatically.
## Environment Variables
@@ -501,54 +449,6 @@ export HIVE_CREDENTIAL_KEY="your-fernet-key"
export AGENT_STORAGE_PATH="/custom/storage"
```
## Opencode Setup
[Opencode](https://github.com/opencode-ai/opencode) is fully supported as a coding agent.
### Automatic Setup
Run the quickstart script in the root directory:
```bash
./quickstart.sh
```
## Codex Setup
[OpenAI Codex CLI](https://github.com/openai/codex) (v0.101.0+) is supported with project-level config:
- `.codex/config.toml` — MCP server configuration
These files are tracked in git and available on clone. To use Codex with Hive:
1. Run `codex` in the repo root
2. Start the configured MCP-assisted workflow
Quick verification:
```bash
test -f .codex/config.toml && echo "OK: Codex config" || echo "MISSING: .codex/config.toml"
```
## Additional Resources
- **Framework Documentation:** [core/README.md](../core/README.md)
- **Tools Documentation:** [tools/README.md](../tools/README.md)
- **Example Agents:** [examples/](../examples/)
- **Agent Building Guide:** [docs/developer-guide.md](./developer-guide.md)
- **Testing Guide:** [core/README.md](../core/README.md)
## Contributing
When contributing agent packages:
1. Place agents in `exports/agent_name/`
2. Follow the standard agent structure (see existing agents)
3. Include README.md with usage instructions
4. Add tests if using the `test` workflow
5. Document required environment variables
## Support
- **Issues:** https://github.com/adenhq/hive/issues
-75
@@ -1,75 +0,0 @@
# Hive Queen Bee: Native agent-building agent
## Problem
Building a Hive agent today requires manual assembly of 7+ files (`agent.py`, `config.py`, `nodes/__init__.py`, `__init__.py`, `__main__.py`, `mcp_servers.json`, tests) with precise framework conventions — correct imports, entry_points format, conversation_mode values, STEP 1/STEP 2 prompt patterns, nullable_output_keys, and more. A single missing re-export in `__init__.py` silently breaks `AgentRunner.load()`. This is the #1 friction point for new users and a recurring source of bugs even for experienced ones.
There is no tool that understands the framework deeply enough to produce correct agents. General-purpose coding assistants hallucinate tool names, use wrong import paths (`from core.framework...`), create too many thin nodes, forget module-level exports, and produce agents that fail validation.
## Proposal
Build **Hive Coder** (codename "Queen Bee") — a framework-native coding agent that lives inside the framework itself and builds complete, validated agent packages from natural language.
### Design principles
1. **Single-node, forever-alive** — One continuous EventLoopNode conversation handles the full lifecycle (understand, qualify, design, implement, verify, iterate). No artificial phase boundaries that destroy context.
2. **Meta-agent capabilities** — Not just a file writer. Can discover available MCP tools at runtime, inspect sessions/checkpoints of agents it builds, run their test suites, and debug failures.
3. **Self-verifying** — Runs three validation steps after every build: class validation (graph structure), `AgentRunner.load()` (package export contract), and pytest. Fixes its own errors up to 3 attempts.
4. **Honest qualification** — Assesses framework fit before building. If a use case is a poor fit (needs sub-second latency, pure CRUD, massive data pipelines), says so instead of producing a bad agent.
5. **Reference-grounded** — Ships with embedded reference docs (framework guide, file templates, anti-patterns) that it reads before writing code. No reliance on training data for framework specifics.
### Components
#### `hive_coder` agent (`core/framework/agents/hive_coder/`)
| File | Purpose |
|------|---------|
| `agent.py` | Goal, single-node graph, `HiveCoderAgent` class |
| `nodes/__init__.py` | `coder` EventLoopNode with comprehensive system prompt |
| `config.py` | RuntimeConfig with `~/.hive/configuration.json` auto-detection |
| `__main__.py` | Click CLI (`run`, `tui`, `info`, `validate`, `shell`) |
| `reference/framework_guide.md` | Node types, edges, patterns, async entry points |
| `reference/file_templates.md` | Complete code templates for every agent file |
| `reference/anti_patterns.md` | 22 common mistakes with explanations |
#### Coder Tools MCP Server (`tools/coder_tools_server.py`)
Dedicated tool server providing:
- **File I/O**: `read_file` (with line numbers, offset/limit), `write_file` (auto-mkdir), `edit_file` (9-strategy fuzzy matching ported from opencode), `list_directory`, `search_files` (regex)
- **Shell**: `run_command` (timeout, cwd, output truncation)
- **Git**: `undo_changes` (snapshot-based rollback)
- **Meta-agent**: `discover_mcp_tools`, `list_agents`, `list_agent_sessions`, `list_agent_checkpoints`, `get_agent_checkpoint`, `run_agent_tests`
All file operations sandboxed to a configurable project root.
#### Framework changes
- `hive code` CLI command — direct launch shortcut
- `hive tui` — discovers framework agents as a source
- `AgentRuntime` — cron expression support (`croniter`) for async entry points
- `prompt_composer` — appends current datetime to system prompts
- `NodeSpec.max_node_visits` — default changed from 1 to 0 (unbounded), matching forever-alive as the standard pattern
- TUI graph view — cron display and hours in countdown
- CredentialError graceful handling in TUI launch
## Acceptance criteria
- [ ] `hive code` launches Hive Coder in the TUI
- [ ] `hive tui` lists framework agents alongside exports/ and examples/
- [ ] Given "build me a research agent that searches the web and summarizes findings", Hive Coder produces a valid package in `exports/` that passes `AgentRunner.load()`
- [ ] Tool discovery works: agent calls `discover_mcp_tools()` before designing, never fabricates tool names
- [ ] Self-verification: agent runs all 3 validation steps and fixes errors before presenting
- [ ] Cron timers fire on schedule (unit tested)
- [ ] `max_node_visits=0` default does not break existing agents or tests
- [ ] Reference docs are accurate and match current framework behavior
## Non-goals
- Multi-agent orchestration (queen spawning worker agents at runtime) — future work
- GUI/web interface — TUI only for v1
- Auto-publishing to a registry — agents are local packages
-288
@@ -1,288 +0,0 @@
# Plan: Multi-Graph Sessions with Guardian Pattern
## Context
The target experience: hive_coder builds an agent (e.g., email automation), loads it into the same runtime session, and acts as its guardian. The email agent runs autonomously while hive_coder watches for failures. On error, hive_coder asks the user for help if they're around, attempts an autonomous fix if they're away, and escalates catastrophic failures for post-mortem.
This requires multiple agent graphs sharing a single `AgentRuntime` session — shared memory and data, but isolated conversations. The existing runtime already has most of the primitives: `ExecutionStream` accepts its own `graph`, `trigger_type="event"` subscribes entry points to the EventBus, and `_get_primary_session_state()` bridges memory across streams.
## Architecture Overview
```
AgentRuntime (shared EventBus, shared state.json, shared data/)
├── hive_coder graph
│ ├── Stream "default" → coder node (client_facing, manual)
│ └── Stream "guardian" → guardian node (event-driven, subscribes to EXECUTION_FAILED)
└── email_agent graph
└── Stream "email_agent::default" → intake node (client_facing, manual)
```
The guardian entry point on hive_coder fires when email_agent emits `EXECUTION_FAILED`. It receives the failure event in its input, reads shared memory for context, and decides: ask user (if present), auto-fix (if away), or escalate (if catastrophic).
## Gap 1: Event Scoping — `graph_id` on Events
**Problem**: EventBus events carry `stream_id` and `node_id` but no `graph_id`. The guardian needs to subscribe to events from a specific graph (email_agent), not a specific stream name.
**Solution**: Add `graph_id: str | None = None` to `AgentEvent` and `filter_graph` to `Subscription`.
### `core/framework/runtime/event_bus.py`
- `AgentEvent` dataclass: add `graph_id: str | None = None` field, include in `to_dict()`
- `Subscription` dataclass: add `filter_graph: str | None = None`
- `subscribe()`: accept `filter_graph` param, pass to `Subscription`
- `_matches()`: check `filter_graph` against `event.graph_id`
### `core/framework/runtime/execution_stream.py`
- `__init__()`: accept `graph_id: str | None = None`, store as `self.graph_id`
- When emitting events via `_event_bus.publish()`: set `event.graph_id = self.graph_id`
## Gap 2: Multi-Graph Runtime — `add_graph()` / `remove_graph()`
**Problem**: `AgentRuntime.__init__` takes a single `GraphSpec`. We need to add/remove graphs dynamically at runtime.
**Solution**: Keep the primary graph on `__init__`. Add methods to register secondary graphs that create their own `ExecutionStream` instances backed by a different graph.
### `core/framework/runtime/agent_runtime.py`
New instance state:
```python
self._graph_id: str = graph_id or "primary" # ID for the primary graph
self._graphs: dict[str, _GraphRegistration] = {} # graph_id -> registration
self._active_graph_id: str = self._graph_id # TUI focus
```
Where `_GraphRegistration` is a simple dataclass:
```python
@dataclass
class _GraphRegistration:
graph: GraphSpec
goal: Goal
entry_points: dict[str, EntryPointSpec]
streams: dict[str, ExecutionStream]
storage_subpath: str # relative to session root, e.g. "graphs/email_agent"
event_subscriptions: list[str] # EventBus subscription IDs
timer_tasks: list[asyncio.Task]
```
New methods:
- `add_graph(graph_id, graph, goal, entry_points, storage_subpath=None)` — creates streams for the graph using graph-scoped storage, sets up event/timer triggers, stamps `graph_id` on all streams. Can be called while running.
- `remove_graph(graph_id)` — stops streams, cancels timers, unsubscribes events, removes registration. Cannot remove primary graph.
- `list_graphs() -> list[str]` — returns all graph IDs
- `active_graph_id` property with setter — TUI uses this to control which graph's events are displayed
Update existing methods:
- `start()`: stamp `self._graph_id` on primary graph streams (via `ExecutionStream.graph_id`)
- `inject_input(node_id, content)`: search active graph's streams first, then all others
- `_get_primary_session_state()`: search across ALL graphs' streams (not just primary's)
- `stop()`: stop all secondary graph streams/timers/subscriptions too
### Storage Layout
```
~/.hive/agents/hive_coder/sessions/{session_id}/
state.json ← SHARED across all graphs
data/ ← SHARED data directory
conversations/coder/ ← hive_coder conversations
graphs/
email_agent/ ← secondary graph storage root
conversations/
intake/
checkpoints/
```
Secondary graph executors get `storage_path = {session_root}/graphs/{graph_id}/` while `state.json` and `data/` remain at the session root. The `resume_session_id` mechanism in `_get_primary_session_state()` already handles this — secondary executions find the primary session's `state.json`.
**Concurrent state.json writes**: For the guardian pattern (sequential: email_agent fails → guardian triggers), no file lock needed. But since both could technically write concurrently, add a simple `fcntl.flock()` wrapper around `_write_progress()` in the executor. Small, defensive change.
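A defensive-locking sketch of that wrapper, under the assumptions above; the function name and the sidecar `.lock` file are illustrative rather than the executor's actual API, and `fcntl` is POSIX-only:

```python
import fcntl
import json
import os
import tempfile


def write_progress_locked(state_path: str, state: dict) -> None:
    """Write state.json under an exclusive advisory lock (sketch).

    A sidecar .lock file holds the flock so the write itself can stay
    atomic: dump to a temp file, then os.replace() over state.json.
    """
    lock_path = state_path + ".lock"
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(state_path) or ".")
            with os.fdopen(fd, "w") as tmp:
                json.dump(state, tmp)
            os.replace(tmp_path, state_path)  # atomic rename on POSIX
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```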
## Gap 3: Guardian Pattern — User Presence + Autonomous Recovery
**Problem**: When email_agent fails, hive_coder's guardian entry point must decide: ask user or auto-fix.
**Solution**: User presence is a runtime-level signal. The guardian's system prompt and event data give it enough context to decide.
### User Presence Tracking
Add to `AgentRuntime`:
```python
self._last_user_input_time: float = 0.0 # monotonic timestamp
```
Updated in `inject_input()` (called whenever user types in TUI). Exposed as:
```python
@property
def user_idle_seconds(self) -> float:
if self._last_user_input_time == 0:
return float('inf')
return time.monotonic() - self._last_user_input_time
```
The guardian node's system prompt instructs the LLM: "If user_idle_seconds < 120, ask the user for guidance via the client-facing interaction. If user is away, attempt an autonomous fix."
This is NOT framework logic — it's prompt-driven. The guardian node is a regular `event_loop` node with `client_facing=True` and tools for code editing + agent lifecycle. The LLM decides the strategy based on presence info injected as context.
### Escalation Model
Escalation = save a structured log entry. No special framework support needed. The guardian node uses `save_data("escalation_log.jsonl", ...)` via the existing data tools. The LLM writes:
```json
{"timestamp": "...", "severity": "catastrophic", "agent": "email_agent", "error": "...", "attempted_fixes": [...], "recommended_action": "..."}
```
Post-mortem: user opens `/data escalation_log.jsonl` or the TUI shows a notification linking to it.
## Gap 4: Graph Lifecycle Tools — Stop/Reload/Restart
**Problem**: hive_coder needs to programmatically stop a broken agent, fix its code, reload it, and restart it.
**Solution**: MCP tools accessible to the active agent. Uses `ContextVar` to access the runtime (same pattern as `data_dir`).
### `core/framework/tools/session_graph_tools.py` (NEW)
```python
async def load_agent(agent_path: str) -> str:
"""Load an agent graph into the running session."""
async def unload_agent(graph_id: str) -> str:
"""Stop and remove an agent graph from the session."""
async def start_agent(graph_id: str, entry_point: str = "default", input_data: str = "{}") -> str:
"""Trigger an entry point on a loaded agent graph."""
async def restart_agent(graph_id: str) -> str:
"""Unload and re-load an agent (picks up code changes)."""
async def list_agents() -> str:
"""List all agent graphs in the current session with their status."""
async def get_user_presence() -> str:
"""Return user idle time and presence status."""
```
These tools call `runtime.add_graph()`, `runtime.remove_graph()`, `runtime.trigger()`, etc.
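The `ContextVar` injection pattern can be sketched as follows; `current_runtime`, `FakeRuntime`, and the tool body are illustrative stand-ins, not the framework's actual names:

```python
import asyncio
from contextvars import ContextVar

# The executor sets this before invoking a tool; the tool reads it back.
current_runtime: ContextVar = ContextVar("current_runtime", default=None)


class FakeRuntime:
    """Stand-in for AgentRuntime with just enough surface for the sketch."""

    def list_graphs(self) -> list[str]:
        return ["hive_coder", "email_agent"]


async def list_agents() -> str:
    """Tool body: fetch the runtime from the context, never from a global."""
    runtime = current_runtime.get()
    if runtime is None:
        return "error: no runtime available in this context"
    return ", ".join(runtime.list_graphs())


# Executor side: inject the runtime, then call the tool.
current_runtime.set(FakeRuntime())
print(asyncio.run(list_agents()))  # prints "hive_coder, email_agent"
```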
### Registration
These tools are registered via `ToolRegistry` with `CONTEXT_PARAM` for `runtime` (injected by the executor, same as `data_dir`). Only available when the runtime is multi-graph capable (set by `cmd_code()`).
## Gap 5: TUI Integration — Graph Switching + Background Notifications
### `core/framework/tui/app.py`
- `_route_event()`: check `event.graph_id` against `runtime.active_graph_id`
- Events from active graph: route normally (streaming, chat, etc.)
- `CLIENT_INPUT_REQUESTED` from background graph: show notification bar
- `EXECUTION_FAILED` from background graph: show error notification
- `EXECUTION_COMPLETED` from background: show brief completion notice
- Other background events: silent (visible in logs)
- `action_switch_graph(graph_id)`: update `runtime.active_graph_id`, refresh graph view, show header
### `core/framework/tui/widgets/chat_repl.py`
- Track `_input_graph_id: str | None` alongside `_input_node_id`
- `handle_input_requested(node_id, graph_id)`: if background graph, show notification instead of enabling input
- `_submit_input()`: pass `graph_id` to help `inject_input()` route correctly
- New TUI commands:
- `/graphs` — list loaded graphs and their status
- `/graph <id>` — switch active graph focus
- `/load <path>` — load an agent graph into the session
- `/unload <id>` — remove a graph from the session
- On graph switch: flush streaming state, render graph header separator
### `core/framework/tui/widgets/graph_view.py`
- `switch_graph(graph_id)` — re-render the graph visualization for the new active graph
- When multi-graph active: show tab-like header listing all loaded graphs
## Gap 6: CLI + Runner Integration
### `core/framework/runner/cli.py`
- `cmd_code()` creates the hive_coder runtime with `graph_id="hive_coder"`
- Registers `session_graph_tools` with the tool config so hive_coder's LLM can call them
- Sets `runtime._multi_graph_capable = True` flag
### `core/framework/runner/runner.py`
- New method: `setup_as_secondary(runtime, graph_id)` — configures this runner to join an existing `AgentRuntime` as a secondary graph. Uses the existing `AgentRunner.load()` to parse agent.json, then calls `runtime.add_graph()` with the parsed graph/goal/entry_points.
## Gap 7: Reliable Mid-Node Resume
**Problem**: When an EventLoopNode is interrupted (crash, Ctrl+Z, context switch), resume doesn't restore to exactly where execution stopped. Several pieces of in-node state are lost, which changes behavior post-resume. In multi-graph sessions with parallel execution and frequent context switching, these gaps compound.
### What's already restored correctly
- **Conversation history**: All messages persisted to disk immediately via `FileConversationStore._persist()` — one file per message in `parts/NNNNNNNNNN.json`
- **OutputAccumulator values**: Write-through to `cursor.json` on every `accumulator.set()` call
- **Iteration counter**: Written to `cursor.json` at the end of each iteration (step 6g)
- **Orphaned tool calls**: `_repair_orphaned_tool_calls()` patches in-flight tool calls with error messages so the LLM knows to retry
### What's lost — and fixes
#### 1. `user_interaction_count` (CRITICAL)
Resets to 0 on resume. This controls client-facing blocking semantics: before the first interaction, `set_output`-only turns don't prevent blocking (the LLM must present to the user first). After resume, a node that had 3 user interactions behaves as if the user never interacted.
**Fix**: Persist `user_interaction_count` to `cursor.json` alongside `iteration` and `outputs`. Write it in `_write_cursor()` (step 6g), restore in `_restore()`.
**Files**: `core/framework/graph/event_loop_node.py`
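The persist/restore round trip can be sketched like this, assuming a flat JSON `cursor.json`. Field names mirror the plan above but are illustrative; the real schema lives in `event_loop_node.py`:

```python
import json
from pathlib import Path

def write_cursor(path: Path, iteration: int, outputs: dict,
                 user_interaction_count: int,
                 recent_responses: list, recent_tool_fingerprints: list) -> None:
    """Persist in-node state so a resumed EventLoopNode picks up where it stopped."""
    path.write_text(json.dumps({
        "iteration": iteration,
        "outputs": outputs,
        "user_interaction_count": user_interaction_count,
        # Keep only the last N entries so the file stays small.
        "recent_responses": recent_responses[-5:],
        "recent_tool_fingerprints": recent_tool_fingerprints[-5:],
    }))

def restore(path: Path):
    """Return the persisted cursor state, or None if no cursor exists."""
    if not path.exists():
        return None
    data = json.loads(path.read_text())
    # Cursors written before this change lack the new field; default it.
    data.setdefault("user_interaction_count", 0)
    return data
```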
#### 2. Accumulator outputs not in SharedMemory
The `OutputAccumulator` writes to `cursor.json` (durable) but only writes to `SharedMemory` when the judge ACCEPTs. On crash, the CancelledError handler captures `memory.read_all()` — which doesn't include the accumulator's WIP values. On resume, edge conditions checking those memory keys see `None`.
**Fix**: In the executor's `CancelledError` handler, read the interrupted node's `cursor.json` and write any accumulator outputs to `memory` before building `session_state_out`. This ensures resume memory includes WIP output values.
**Files**: `core/framework/graph/executor.py` (CancelledError handler, ~line 1289)
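A sketch of the flush step, under assumptions: `cursor.json` holds the accumulator's values under an `outputs` key, and shared memory behaves like a dict (the real handler works with `SharedMemory` and the node's storage paths):

```python
import json
from pathlib import Path

def flush_wip_outputs_to_memory(cursor_path: Path, memory: dict) -> dict:
    """Merge the interrupted node's durable accumulator outputs into shared
    memory before the executor snapshots session_state_out (sketch)."""
    if cursor_path.exists():
        cursor = json.loads(cursor_path.read_text())
        for key, value in cursor.get("outputs", {}).items():
            memory[key] = value  # WIP value; at least as fresh as any ACCEPTed copy
    return memory
```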
#### 3. Stall/doom-loop detection counters
`recent_responses` and `recent_tool_fingerprints` reset to empty lists. A previously near-stalled node gets a fresh detection budget.
**Fix**: Persist these to `cursor.json`. They're small (last N strings). Write in `_write_cursor()`, restore in `_restore()`.
**Files**: `core/framework/graph/event_loop_node.py`
#### 4. `continuous_conversation` at executor level
In continuous mode, the executor's `continuous_conversation` variable is `None` on resume. The node's `_restore()` recovers messages from disk, but the executor doesn't pre-populate this variable until the node returns.
**Fix**: After a resumed node completes, set `continuous_conversation = result.conversation` (this already happens in the normal path at line 1155 — verify it also runs on the resume path).
**Files**: `core/framework/graph/executor.py`
### Multi-graph specific: independent resume per graph
Each graph in a multi-graph session has its own storage subdirectory (`graphs/{graph_id}/`) with its own `conversations/`, `checkpoints/`, and `cursor.json` files. Resume is already per-executor, so each graph resumes independently. The shared `state.json` at the session root captures the union of all graphs' memory — the `fcntl.flock()` wrapper on `_write_progress()` (Gap 2) ensures concurrent writes don't corrupt it.
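The locking wrapper can be sketched as a read-modify-write of `state.json` under `fcntl.flock()`, assuming a sidecar `.lock` file; the per-graph `graphs` key in the merged state is illustrative, not the real schema:

```python
import fcntl
import json
from pathlib import Path

def write_progress_locked(state_path: Path, graph_id: str, memory: dict) -> None:
    """Serialize concurrent writers to the shared state.json (sketch of the
    Gap 2 flock wrapper). Read-modify-write under an exclusive advisory lock
    so one graph's update never clobbers another's."""
    lock_path = state_path.with_suffix(".lock")
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            state = json.loads(state_path.read_text()) if state_path.exists() else {}
            state.setdefault("graphs", {})[graph_id] = memory
            state_path.write_text(json.dumps(state))
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

`fcntl.flock()` is advisory and Unix-only, so every writer must go through this wrapper for the protection to hold.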
### Implementation
These fixes are prerequisites for multi-graph support and should land as **Phase 0**, before the EventBus changes:
1. Persist `user_interaction_count` + stall/doom counters to `cursor.json`
2. Restore them in `_restore()`
3. Flush accumulator outputs to SharedMemory in executor's CancelledError handler
4. Verify continuous_conversation is set on resume path
## Implementation Phases
### Phase 0: Reliable Mid-Node Resume (prerequisite)
1. `event_loop_node.py` — persist `user_interaction_count`, `recent_responses`, `recent_tool_fingerprints` to `cursor.json` via `_write_cursor()`; restore in `_restore()`
2. `executor.py` — in CancelledError handler, read interrupted node's `cursor.json` accumulator outputs and write to `memory` before building `session_state_out`
3. `executor.py` — verify `continuous_conversation` is populated on resume path
### Phase 1: EventBus Foundation
1. `event_bus.py``graph_id` on `AgentEvent`, `filter_graph` on `Subscription` + `_matches()`
2. `execution_stream.py` — accept and stamp `graph_id` on emitted events
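The filtering rule can be sketched with minimal stand-ins for `AgentEvent` and `Subscription` (the real classes carry more fields; `matches()` here mirrors the planned `_matches()` behavior: a `None` filter matches everything):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AgentEvent:
    type: str
    graph_id: Optional[str] = None  # None = single-graph / unscoped event

@dataclass
class Subscription:
    filter_type: Optional[str] = None
    filter_graph: Optional[str] = None

    def matches(self, event: AgentEvent) -> bool:
        # A None filter is a wildcard; a set filter must match exactly.
        if self.filter_type is not None and event.type != self.filter_type:
            return False
        if self.filter_graph is not None and event.graph_id != self.filter_graph:
            return False
        return True
```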
### Phase 2: Multi-Graph Runtime
3. `agent_runtime.py``_GraphRegistration` dataclass, `add_graph()`, `remove_graph()`, `list_graphs()`, `active_graph_id` property
4. `agent_runtime.py` — update `inject_input()`, `_get_primary_session_state()`, `stop()` for multi-graph
5. `agent_runtime.py` — user presence tracking (`_last_user_input_time`, `user_idle_seconds`)
6. Storage path logic: secondary graphs get `{session_root}/graphs/{graph_id}/`
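The storage rule reduces to a small helper (hypothetical name; per the plan, the primary graph keeps the session root and secondaries nest under `graphs/`):

```python
from pathlib import Path
from typing import Optional

def graph_storage_root(session_root: Path, graph_id: Optional[str] = None) -> Path:
    """Resolve the storage directory for a graph in a (possibly multi-graph) session."""
    if graph_id is None:
        return session_root  # primary graph: unchanged single-graph layout
    return session_root / "graphs" / graph_id  # secondary graph: isolated subtree
```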
### Phase 3: Graph Lifecycle Tools
7. `core/framework/tools/session_graph_tools.py``load_agent`, `unload_agent`, `start_agent`, `restart_agent`, `list_agents`, `get_user_presence`
8. `runner.py``setup_as_secondary()` method
### Phase 4: TUI Integration
9. `app.py``graph_id` event filtering, background notifications, `action_switch_graph`
10. `chat_repl.py``/graphs`, `/graph`, `/load`, `/unload` commands, graph_id tracking
11. `graph_view.py` — multi-graph header, `switch_graph()`
### Phase 5: hive_coder Integration
12. `cli.py``cmd_code()` sets up multi-graph capable runtime, registers graph tools
13. hive_coder's agent config — add guardian entry point with `trigger_type="event"` subscribing to `EXECUTION_FAILED`
14. Guardian node system prompt — presence-aware triage logic (ask user / auto-fix / escalate)
## Backward Compatibility
- Single-graph `hive run exports/my_agent` unchanged: `graph_id` defaults to `None`, no secondary graphs loaded, events carry `graph_id=None`, TUI shows no graph switching UI
- All new fields are optional with `None` defaults
- `_get_primary_session_state()` existing behavior preserved when no secondary graphs exist
## Verification
1. **Unit**: `add_graph()` creates streams with correct `graph_id`, events carry `graph_id`, `filter_graph` works in subscriptions, `inject_input()` routes to correct graph
2. **Integration**: Load hive_coder + email_agent, email_agent fails → guardian fires → reads shared memory → decides action
3. **TUI**: `/graphs` shows both, `/graph` switches, background failure notification appears, input routing works across graphs
4. **Backward compat**: `hive run exports/deep_research_agent --tui` works unchanged
5. **Lifecycle**: `restart_agent` picks up code changes, `unload_agent` cleans up streams and subscriptions