Compare commits
31 Commits
| SHA1 |
|---|
| cd014e41e4 |
| 830f11c47d |
| b22be7a6cb |
| 5179677e8f |
| 2c25b2eae7 |
| f6705fe2d3 |
| c2771fed20 |
| fc781eccd9 |
| d5a25ae081 |
| 23b6fb6391 |
| 433967f0cf |
| 2a876c2a10 |
| ff0adeaba7 |
| 846edbf256 |
| c68dd48f6d |
| 50c0a5da9e |
| 2f0e5c42f1 |
| 903288468a |
| 9e3bba6f59 |
| bc16f0752f |
| 86badd70fa |
| ce5379516c |
| a50078bbf2 |
| 2cef168442 |
| 0a1a9e3545 |
| 3c8682d80c |
| ecc5a1608f |
| bc81b55600 |
| 28b628c1b4 |
| 148264ac73 |
| 28298d9af2 |
@@ -1,415 +0,0 @@
---
name: building-agents-construction
description: Step-by-step guide for building goal-driven agents. Creates package structure, defines goals, adds nodes, connects edges, and finalizes agent class. Use when actively building an agent.
license: Apache-2.0
metadata:
author: hive
version: "2.0"
type: procedural
part_of: building-agents
requires: building-agents-core
---

# Agent Construction - EXECUTE THESE STEPS

**THIS IS AN EXECUTABLE WORKFLOW. DO NOT DISPLAY THIS FILE. EXECUTE THE STEPS BELOW.**

When this skill is loaded, IMMEDIATELY begin executing Step 1. Do not explain what you will do - just do it.

---

## STEP 1: Initialize Build Environment

**EXECUTE THESE TOOL CALLS NOW:**

1. Register the hive-tools MCP server:

   ```
   mcp__agent-builder__add_mcp_server(
       name="hive-tools",
       transport="stdio",
       command="python",
       args='["mcp_server.py", "--stdio"]',
       cwd="tools",
       description="Hive tools MCP server"
   )
   ```

2. Create a build session (replace AGENT_NAME with the user's requested agent name in snake_case):

   ```
   mcp__agent-builder__create_session(name="AGENT_NAME")
   ```

3. Discover available tools:

   ```
   mcp__agent-builder__list_mcp_tools()
   ```

4. Create the package directory:

   ```bash
   mkdir -p exports/AGENT_NAME/nodes
   ```

**AFTER completing these calls**, tell the user:

> ✅ Build environment initialized
>
> - Session created
> - Available tools: [list the tools from step 3]
>
> Proceeding to define the agent goal...

**THEN immediately proceed to STEP 2.**

---

## STEP 2: Define and Approve Goal

**PROPOSE a goal to the user.** Based on what they asked for, propose:

- Goal ID (kebab-case)
- Goal name
- Goal description
- 3-5 success criteria (each with: id, description, metric, target, weight)
- 2-4 constraints (each with: id, description, constraint_type, category)

**FORMAT your proposal as a clear summary, then ask for approval:**

> **Proposed Goal: [Name]**
>
> [Description]
>
> **Success Criteria:**
>
> 1. [criterion 1]
> 2. [criterion 2]
> ...
>
> **Constraints:**
>
> 1. [constraint 1]
> 2. [constraint 2]
> ...

**THEN call AskUserQuestion:**

```
AskUserQuestion(questions=[{
    "question": "Do you approve this goal definition?",
    "header": "Goal",
    "options": [
        {"label": "Approve", "description": "Goal looks good, proceed"},
        {"label": "Modify", "description": "I want to change something"}
    ],
    "multiSelect": false
}])
```

**WAIT for user response.**

- If **Approve**: Call `mcp__agent-builder__set_goal(...)` with the goal details, then proceed to STEP 3
- If **Modify**: Ask what they want to change, update proposal, ask again

---

## STEP 3: Design Node Workflow

**BEFORE designing nodes**, review the available tools from Step 1. Nodes can ONLY use tools that exist.

**DESIGN the workflow** as a series of nodes. For each node, determine:

- node_id (kebab-case)
- name
- description
- node_type: `"event_loop"` (recommended for all LLM work) or `"function"` (deterministic, no LLM)
- input_keys (what data this node receives)
- output_keys (what data this node produces)
- tools (ONLY tools that exist - empty list if no tools needed)
- system_prompt (should mention `set_output` for producing structured outputs)
- client_facing: True if this node interacts with the user
- nullable_output_keys (for mutually exclusive outputs)
- max_node_visits (>1 if this node is a feedback loop target)

**PRESENT the workflow to the user:**

> **Proposed Workflow: [N] nodes**
>
> 1. **[node-id]** - [description]
>
>    - Type: event_loop [client-facing] / function
>    - Input: [keys]
>    - Output: [keys]
>    - Tools: [tools or "none"]
>
> 2. **[node-id]** - [description]
> ...
>
> **Flow:** node1 → node2 → node3 → ...

**THEN call AskUserQuestion:**

```
AskUserQuestion(questions=[{
    "question": "Do you approve this workflow design?",
    "header": "Workflow",
    "options": [
        {"label": "Approve", "description": "Workflow looks good, proceed to build nodes"},
        {"label": "Modify", "description": "I want to change the workflow"}
    ],
    "multiSelect": false
}])
```

**WAIT for user response.**

- If **Approve**: Proceed to STEP 4
- If **Modify**: Ask what they want to change, update design, ask again

---

## STEP 4: Build Nodes One by One

**FOR EACH node in the approved workflow:**

1. **Call** `mcp__agent-builder__add_node(...)` with the node details

   - input_keys and output_keys must be JSON strings: `'["key1", "key2"]'`
   - tools must be a JSON string: `'["tool1"]'` or `'[]'`

2. **Call** `mcp__agent-builder__test_node(...)` to validate:

   ```
   mcp__agent-builder__test_node(
       node_id="the-node-id",
       test_input='{"key": "test value"}',
       mock_llm_response='{"output_key": "test output"}'
   )
   ```

3. **Check result:**

   - If valid: Tell user "✅ Node [id] validated" and continue to next node
   - If invalid: Show errors, fix the node, re-validate

4. **Show progress** after each node:

   ```
   mcp__agent-builder__get_session_status()
   ```

   > ✅ Node [X] of [Y] complete: [node-id]

**AFTER all nodes are added and validated**, proceed to STEP 5.

---

## STEP 5: Connect Edges

**DETERMINE the edges** based on the approved workflow. For each connection:

- edge_id (kebab-case)
- source (node that outputs)
- target (node that receives)
- condition: `"on_success"`, `"always"`, `"on_failure"`, or `"conditional"`
- condition_expr (Python expression using `output.get(...)`, only if conditional)
- priority (positive = forward edge evaluated first, negative = feedback edge)

**FOR EACH edge, call:**

```
mcp__agent-builder__add_edge(
    edge_id="source-to-target",
    source="source-node-id",
    target="target-node-id",
    condition="on_success",
    condition_expr="",
    priority=1
)
```

**AFTER all edges are added, validate the graph:**

```
mcp__agent-builder__validate_graph()
```

- If valid: Tell user "✅ Graph structure validated" and proceed to STEP 6
- If invalid: Show errors, fix edges, re-validate

---

## STEP 6: Generate Agent Package

**EXPORT the graph data:**

```
mcp__agent-builder__export_graph()
```

This returns JSON with the goal, nodes, edges, and MCP server configurations.

**THEN write the Python package files** using the exported data. Create these files in `exports/AGENT_NAME/`:

1. `config.py` - Runtime configuration with model settings
2. `nodes/__init__.py` - All NodeSpec definitions
3. `agent.py` - Goal, edges, graph config, and agent class
4. `__init__.py` - Package exports
5. `__main__.py` - CLI interface
6. `mcp_servers.json` - MCP server configurations
7. `README.md` - Usage documentation

**IMPORTANT entry_points format:**

- MUST be: `{"start": "first-node-id"}`
- NOT: `{"first-node-id": ["input_keys"]}` (WRONG)
- NOT: `{"first-node-id"}` (WRONG - this is a set)

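To make the rule concrete, here is a minimal sketch. The node and key names are hypothetical stand-ins, not part of the framework; only the shape of the dict matters:

```python
# Illustration of the entry_points rule above; "intake" and "topic" are
# hypothetical names, not framework identifiers.
correct = {"start": "intake"}          # literal key "start" maps to the first node id
wrong_mapping = {"intake": ["topic"]}  # WRONG: node id used as the key
wrong_set = {"intake"}                 # WRONG: {...} without a colon creates a set

assert isinstance(correct, dict) and list(correct) == ["start"]
assert isinstance(wrong_set, set)  # an easy mistake to make by accident
```
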
**Use the example agent** at `.claude/skills/building-agents-construction/examples/deep_research_agent/` as a template for file structure and patterns. It demonstrates: STEP 1/STEP 2 prompts, client-facing nodes, feedback loops, nullable_output_keys, and data tools.

**AFTER writing all files, tell the user:**

> ✅ Agent package created: `exports/AGENT_NAME/`
>
> **Files generated:**
>
> - `__init__.py` - Package exports
> - `agent.py` - Goal, nodes, edges, agent class
> - `config.py` - Runtime configuration
> - `__main__.py` - CLI interface
> - `nodes/__init__.py` - Node definitions
> - `mcp_servers.json` - MCP server config
> - `README.md` - Usage documentation
>
> **Test your agent:**
>
> ```bash
> cd /home/timothy/oss/hive
> PYTHONPATH=exports uv run python -m AGENT_NAME validate
> PYTHONPATH=exports uv run python -m AGENT_NAME info
> ```

---

## STEP 7: Verify and Test

**RUN validation:**

```bash
cd /home/timothy/oss/hive && PYTHONPATH=exports uv run python -m AGENT_NAME validate
```

- If valid: Agent is complete!
- If errors: Fix the issues and re-run

**SHOW final session summary:**

```
mcp__agent-builder__get_session_status()
```

**TELL the user the agent is ready** and suggest next steps:

- Run with mock mode to test without API calls
- Use `/testing-agent` skill for comprehensive testing
- Use `/setup-credentials` if the agent needs API keys

---

## REFERENCE: Node Types

| Type | tools param | Use when |
|------|-------------|----------|
| `event_loop` | `'["tool1"]'` or `'[]'` | LLM-powered work with or without tools |
| `function` | N/A | Deterministic Python operations, no LLM |

---

## REFERENCE: NodeSpec New Fields

| Field | Default | Description |
|-------|---------|-------------|
| `client_facing` | `False` | Streams output to user, blocks for input between turns |
| `nullable_output_keys` | `[]` | Output keys that may remain unset (mutually exclusive outputs) |
| `max_node_visits` | `1` | Max executions per run. Set >1 for feedback loop targets. 0 = unlimited |

---

## REFERENCE: Edge Conditions & Priority

| Condition | When edge is followed |
|-----------|--------------------------------------|
| `on_success` | Source node completed successfully |
| `on_failure` | Source node failed |
| `always` | Always, regardless of success/failure |
| `conditional` | When condition_expr evaluates to True |

**Priority:** Positive = forward edge (evaluated first). Negative = feedback edge (loops back to earlier node). Multiple `on_success` edges from the same source = parallel execution (fan-out).
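The condition and priority rules above can be sketched as a small helper. This is an illustration only, not the framework's actual executor: the edge dicts and `pick_next` are assumptions, and real execution also supports fan-out (several matching `on_success` edges run in parallel), which this single-successor sketch ignores:

```python
# Illustrative only: edge dicts and pick_next are assumptions, not hive's API.
def pick_next(edges, succeeded, output):
    # Forward edges (positive priority) are evaluated before feedback edges.
    for edge in sorted(edges, key=lambda e: -e["priority"]):
        cond = edge["condition"]
        if cond == "always":
            return edge["target"]
        if cond == "on_success" and succeeded:
            return edge["target"]
        if cond == "on_failure" and not succeeded:
            return edge["target"]
        if cond == "conditional" and eval(edge["condition_expr"], {"output": output}):
            return edge["target"]
    return None

edges = [
    {"target": "report", "condition": "conditional",
     "condition_expr": "output.get('approved_findings') is not None", "priority": 1},
    {"target": "research", "condition": "conditional",
     "condition_expr": "output.get('feedback') is not None", "priority": -1},
]
print(pick_next(edges, True, {"feedback": "needs more sources"}))  # research
```
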

---

## REFERENCE: System Prompt Best Practice

For **internal** event_loop nodes (not client-facing), instruct the LLM to use `set_output`:

```
Use set_output(key, value) to store your results. For example:
- set_output("search_results", <your results as a JSON string>)

Do NOT return raw JSON. Use the set_output tool to produce outputs.
```

For **client-facing** event_loop nodes, use the STEP 1/STEP 2 pattern:

```
**STEP 1 — Respond to the user (text only, NO tool calls):**
[Present information, ask questions, etc.]

**STEP 2 — After the user responds, call set_output:**
- set_output("key", "value based on user's response")
```

This prevents the LLM from calling `set_output` before the user has had a chance to respond. The "NO tool calls" instruction in STEP 1 ensures the node blocks for user input before proceeding.

---

## EventLoopNode Runtime

EventLoopNodes are **auto-created** by `GraphExecutor` at runtime. Both direct `GraphExecutor` and `AgentRuntime` / `create_agent_runtime()` handle event_loop nodes automatically. No manual `node_registry` setup is needed.

```python
# Direct execution
from pathlib import Path

from framework.graph.executor import GraphExecutor
from framework.runtime.core import Runtime

storage_path = Path.home() / ".hive" / "my_agent"
storage_path.mkdir(parents=True, exist_ok=True)
runtime = Runtime(storage_path)

executor = GraphExecutor(
    runtime=runtime,
    llm=llm,
    tools=tools,
    tool_executor=tool_executor,
    storage_path=storage_path,
)
result = await executor.execute(graph=graph, goal=goal, input_data=input_data)
```

**DO NOT pass `runtime=None` to `GraphExecutor`** — it will crash with `'NoneType' object has no attribute 'start_run'`.

---

## COMMON MISTAKES TO AVOID

1. **Using tools that don't exist** - Always check `mcp__agent-builder__list_mcp_tools()` first
2. **Wrong entry_points format** - Must be `{"start": "node-id"}`, NOT a set or list
3. **Skipping validation** - Always validate nodes and graph before proceeding
4. **Not waiting for approval** - Always ask user before major steps
5. **Displaying this file** - Execute the steps, don't show documentation
6. **Too many thin nodes** - Prefer fewer, richer nodes (4 nodes > 8 nodes)
7. **Missing STEP 1/STEP 2 in client-facing prompts** - Client-facing nodes need explicit phases to prevent premature set_output
8. **Forgetting nullable_output_keys** - Mark input_keys that only arrive on certain edges (e.g., feedback) as nullable on the receiving node
9. **Adding framework gating for LLM behavior** - Fix prompts or use judges, not ad-hoc code

+16
-12
@@ -1,12 +1,12 @@
---
name: building-agents-core
name: hive-concepts
description: Core concepts for goal-driven agents - architecture, node types (event_loop, function), tool discovery, and workflow overview. Use when starting agent development or need to understand agent fundamentals.
license: Apache-2.0
metadata:
author: hive
version: "2.0"
type: foundational
part_of: building-agents
part_of: hive
---

# Building Agents - Core Concepts

@@ -251,6 +251,7 @@ The judge controls when a node's loop exits:
Controls loop behavior:
- `max_iterations` (default 50) — prevents infinite loops
- `max_tool_calls_per_turn` (default 10) — limits tool calls per LLM response
- `tool_call_overflow_margin` (default 0.5) — wiggle room before discarding extra tool calls (50% means hard cutoff at 150% of limit)
- `stall_detection_threshold` (default 3) — detects repeated identical responses
- `max_history_tokens` (default 32000) — triggers conversation compaction

@@ -258,9 +259,12 @@ Controls loop behavior:

When tool results exceed the context window, the framework automatically saves them to a spillover directory and truncates with a hint. Nodes that produce or consume large data should include the data tools:

- `save_data(filename, data, data_dir)` — Write data to a file in the data directory
- `load_data(filename, data_dir, offset=0, limit=50)` — Read data with line-based pagination
- `list_data_files(data_dir)` — List available data files
- `save_data(filename, data)` — Write data to a file in the data directory
- `load_data(filename, offset=0, limit=50)` — Read data with line-based pagination
- `list_data_files()` — List available data files
- `serve_file_to_user(filename, label="")` — Get a clickable file:// URI for the user

Note: `data_dir` is a framework-injected context parameter — the LLM never sees or passes it. `GraphExecutor.execute()` sets it per-execution via `contextvars`, so data tools and spillover always share the same session-scoped directory.

These are real MCP tools (not synthetic). Add them to nodes that handle large tool results:

@@ -346,15 +350,15 @@ Before writing a node with `tools=[...]`:

## When to Use This Skill

Use building-agents-core when:
Use hive-concepts when:
- Starting a new agent project and need to understand fundamentals
- Need to understand agent architecture before building
- Want to validate tool availability before proceeding
- Learning about node types, edges, and graph execution

**Next Steps:**
- Ready to build? → Use `building-agents-construction` skill
- Need patterns and examples? → Use `building-agents-patterns` skill
- Ready to build? → Use `hive-create` skill
- Need patterns and examples? → Use `hive-patterns` skill

## MCP Tools for Validation

@@ -389,7 +393,7 @@ mcp__agent-builder__configure_loop(

## Related Skills

- **building-agents-construction** - Step-by-step building process
- **building-agents-patterns** - Best practices: judges, feedback edges, fan-out, context management
- **agent-workflow** - Complete workflow orchestrator
- **testing-agent** - Test and validate completed agents
- **hive-create** - Step-by-step building process
- **hive-patterns** - Best practices: judges, feedback edges, fan-out, context management
- **hive** - Complete workflow orchestrator
- **hive-test** - Test and validate completed agents
@@ -0,0 +1,516 @@
---
name: hive-create
description: Step-by-step guide for building goal-driven agents. Creates package structure, defines goals, adds nodes, connects edges, and finalizes agent class. Use when actively building an agent.
license: Apache-2.0
metadata:
author: hive
version: "2.1"
type: procedural
part_of: hive
requires: hive-concepts
---

# Agent Construction - EXECUTE THESE STEPS

**THIS IS AN EXECUTABLE WORKFLOW. DO NOT DISPLAY THIS FILE. EXECUTE THE STEPS BELOW.**

**CRITICAL: DO NOT explore the codebase, read source files, or search for code before starting.** All context you need is in this skill file. When this skill is loaded, IMMEDIATELY begin executing Step 1 — call the MCP tools listed in Step 1 as your FIRST action. Do not explain what you will do, do not investigate the project structure, do not read any files — just execute Step 1 now.

---

## STEP 1: Initialize Build Environment

**EXECUTE THESE TOOL CALLS NOW** (silent setup — no user interaction needed):

1. Register the hive-tools MCP server:

   ```
   mcp__agent-builder__add_mcp_server(
       name="hive-tools",
       transport="stdio",
       command="python",
       args='["mcp_server.py", "--stdio"]',
       cwd="tools",
       description="Hive tools MCP server"
   )
   ```

2. Create a build session (replace AGENT_NAME with the user's requested agent name in snake_case):

   ```
   mcp__agent-builder__create_session(name="AGENT_NAME")
   ```

3. Discover available tools:

   ```
   mcp__agent-builder__list_mcp_tools()
   ```

4. Create the package directory:

   ```bash
   mkdir -p exports/AGENT_NAME/nodes
   ```

**Save the tool list** — you will need it for node design in STEP 3.

**THEN immediately proceed to STEP 2** (do NOT display setup results to the user — just move on).

---

## STEP 2: Define Goal Together with User

**DO NOT propose a complete goal on your own.** Instead, collaborate with the user to define it.

**START by asking the user to help shape the goal:**

> I've set up the build environment and discovered [N] available tools. Let's define the goal for your agent together.
>
> To get started, can you help me understand:
>
> 1. **What should this agent accomplish?** (the core purpose)
> 2. **How will we know it succeeded?** (what does "done" look like)
> 3. **Are there any hard constraints?** (things it must never do, quality bars, etc.)

**WAIT for the user to respond.** Use their input to draft:

- Goal ID (kebab-case)
- Goal name
- Goal description
- 3-5 success criteria (each with: id, description, metric, target, weight)
- 2-4 constraints (each with: id, description, constraint_type, category)

**PRESENT the draft goal for approval:**

> **Proposed Goal: [Name]**
>
> [Description]
>
> **Success Criteria:**
>
> 1. [criterion 1]
> 2. [criterion 2]
> ...
>
> **Constraints:**
>
> 1. [constraint 1]
> 2. [constraint 2]
> ...

**THEN call AskUserQuestion:**

```
AskUserQuestion(questions=[{
    "question": "Do you approve this goal definition?",
    "header": "Goal",
    "options": [
        {"label": "Approve", "description": "Goal looks good, proceed to workflow design"},
        {"label": "Modify", "description": "I want to change something"}
    ],
    "multiSelect": false
}])
```

**WAIT for user response.**

- If **Approve**: Call `mcp__agent-builder__set_goal(...)` with the goal details, then proceed to STEP 3
- If **Modify**: Ask what they want to change, update the draft, ask again

---

## STEP 3: Design Conceptual Nodes

**BEFORE designing nodes**, review the available tools from Step 1. Nodes can ONLY use tools that exist.

**DESIGN the workflow** as a series of nodes. For each node, determine:

- node_id (kebab-case)
- name
- description
- node_type: `"event_loop"` (recommended for all LLM work) or `"function"` (deterministic, no LLM)
- input_keys (what data this node receives)
- output_keys (what data this node produces)
- tools (ONLY tools that exist from Step 1 — empty list if no tools needed)
- client_facing: True if this node interacts with the user
- nullable_output_keys (for mutually exclusive outputs or feedback-only inputs)
- max_node_visits (>1 if this node is a feedback loop target)

**Prefer fewer, richer nodes** (4 nodes > 8 thin nodes). Each node boundary requires serializing outputs. A research node that searches, fetches, and analyzes keeps all source material in its conversation history.

**PRESENT the nodes to the user for review:**

> **Proposed Nodes ([N] total):**
>
> | # | Node ID | Type | Description | Tools | Client-Facing |
> | --- | ---------- | ---------- | ----------------------------- | ---------------------- | :-----------: |
> | 1 | `intake` | event_loop | Gather requirements from user | — | Yes |
> | 2 | `research` | event_loop | Search and analyze sources | web_search, web_scrape | No |
> | 3 | `review` | event_loop | Present findings for approval | — | Yes |
> | 4 | `report` | event_loop | Generate final report | save_data | No |
>
> **Data Flow:**
>
> - `intake` produces: `research_brief`
> - `research` receives: `research_brief` → produces: `findings`, `sources`
> - `review` receives: `findings`, `sources` → produces: `approved_findings` or `feedback`
> - `report` receives: `approved_findings` → produces: `final_report`

**THEN call AskUserQuestion:**

```
AskUserQuestion(questions=[{
    "question": "Do you approve these nodes?",
    "header": "Nodes",
    "options": [
        {"label": "Approve", "description": "Nodes look good, proceed to graph design"},
        {"label": "Modify", "description": "I want to change the nodes"}
    ],
    "multiSelect": false
}])
```

**WAIT for user response.**

- If **Approve**: Proceed to STEP 4
- If **Modify**: Ask what they want to change, update design, ask again

---

## STEP 4: Design Full Graph and Review

**DETERMINE the edges** connecting the approved nodes. For each edge:

- edge_id (kebab-case)
- source → target
- condition: `on_success`, `on_failure`, `always`, or `conditional`
- condition_expr (Python expression, only if conditional)
- priority (positive = forward, negative = feedback/loop-back)

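To make `condition_expr` concrete, here is a hedged sketch of how such an expression gates an edge. The helper and names are illustrative assumptions, not the framework's real evaluation code; the only firm points are that the expression is Python and that `output` is in scope:

```python
# Illustrative sketch only; edge_taken is an assumption, not hive's API.
def edge_taken(condition_expr: str, output: dict) -> bool:
    # condition_expr is a Python expression evaluated with `output` in scope
    return bool(eval(condition_expr, {"output": output}))

assert edge_taken("output.get('feedback') is not None", {"feedback": "add sources"})
assert not edge_taken("output.get('approved_findings') is not None", {"feedback": "x"})
```
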

**RENDER the complete graph as ASCII art.** Make it large and clear — the user needs to see and understand the full workflow at a glance.

**IMPORTANT: Make the ASCII art BIG and READABLE.** Use a box-and-arrow style with generous spacing. Do NOT make it tiny or compressed. Example format:

```
┌─────────────────────────────────────────────────────────────────────────┐
│ AGENT: Research Agent                                                   │
│                                                                         │
│ Goal: Thoroughly research technical topics and produce verified reports │
└─────────────────────────────────────────────────────────────────────────┘

          ┌───────────────────────┐
          │        INTAKE         │
          │    (client-facing)    │
          │                       │
          │ in:  topic            │
          │ out: research_brief   │
          └───────────┬───────────┘
                      │ on_success
                      ▼
          ┌───────────────────────┐
          │       RESEARCH        │
          │                       │
          │ tools: web_search,    │
          │        web_scrape     │
          │                       │
          │ in:  research_brief   │
          │      [feedback]       │
          │ out: findings,        │
          │      sources          │
          └───────────┬───────────┘
                      │ on_success
                      ▼
          ┌───────────────────────┐
          │        REVIEW         │
          │    (client-facing)    │
          │                       │
          │ in:  findings,        │
          │      sources          │
          │ out: approved_findings│
          │      OR feedback      │
          └───────┬───────┬───────┘
                  │       │
         approved │       │ feedback (priority: -1)
                  │       │
                  ▼       └──────────────────┐
          ┌───────────────────────┐          │
          │        REPORT         │          │
          │                       │          │
          │ tools: save_data      │          │
          │                       │          │
          │ in:  approved_        │          │
          │      findings         │          │
          │ out: final_report     │          │
          └───────────────────────┘          │
                                             │
                  ┌──────────────────────────┘
                  │ loops back to RESEARCH
                  ▼ (max_node_visits: 3)

EDGES:
──────
1. intake   → research  [on_success, priority: 1]
2. research → review    [on_success, priority: 1]
3. review   → report    [conditional: approved_findings is not None, priority: 1]
4. review   → research  [conditional: feedback is not None, priority: -1]
```

**PRESENT the graph and edges to the user:**

> Here is the complete workflow graph:
>
> [ASCII art above]
>
> **Edge Summary:**
>
> | # | Edge | Condition | Priority |
> | --- | ----------------- | -------------------------------------------- | -------- |
> | 1 | intake → research | on_success | 1 |
> | 2 | research → review | on_success | 1 |
> | 3 | review → report | conditional: `approved_findings is not None` | 1 |
> | 4 | review → research | conditional: `feedback is not None` | -1 |

**THEN call AskUserQuestion:**

```
AskUserQuestion(questions=[{
    "question": "Do you approve this workflow graph?",
    "header": "Graph",
    "options": [
        {"label": "Approve", "description": "Graph looks good, proceed to build the agent"},
        {"label": "Modify", "description": "I want to change the graph"}
    ],
    "multiSelect": false
}])
```

**WAIT for user response.**

- If **Approve**: Proceed to STEP 5
- If **Modify**: Ask what they want to change, update the graph, re-render, ask again

---

## STEP 5: Build the Agent

**NOW — and only now — write the actual code.** The user has approved the goal, nodes, and graph.

### 5a: Register nodes and edges with MCP

**FOR EACH approved node**, call:

```
mcp__agent-builder__add_node(
    node_id="...",
    name="...",
    description="...",
    node_type="event_loop",
    input_keys='["key1", "key2"]',
    output_keys='["key1"]',
    tools='["tool1"]',
    system_prompt="...",
    client_facing=True/False,
    nullable_output_keys='["key"]',
    max_node_visits=1
)
```

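Note that `input_keys`, `output_keys`, `tools`, and `nullable_output_keys` are JSON strings, not Python lists. A small sketch (the key and tool names are placeholders) of building them with `json.dumps` rather than hand-quoting:

```python
import json

# Hypothetical key/tool names; the point is the JSON-string encoding.
input_keys = json.dumps(["research_brief", "feedback"])
tools = json.dumps([])  # a node with no tools still needs '[]', not None

assert input_keys == '["research_brief", "feedback"]'
assert tools == '[]'
```
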
**FOR EACH approved edge**, call:

```
mcp__agent-builder__add_edge(
    edge_id="source-to-target",
    source="source-node-id",
    target="target-node-id",
    condition="on_success",
    condition_expr="",
    priority=1
)
```

**VALIDATE the graph:**

```
mcp__agent-builder__validate_graph()
```

- If invalid: Fix the issues and re-validate
- If valid: Continue to 5b

### 5b: Write Python package files

**EXPORT the graph data:**

```
mcp__agent-builder__export_graph()
```

**THEN write the Python package files** using the exported data. Create these files in `exports/AGENT_NAME/`:

1. `config.py` - Runtime configuration with model settings
2. `nodes/__init__.py` - All NodeSpec definitions
3. `agent.py` - Goal, edges, graph config, and agent class
4. `__init__.py` - Package exports
5. `__main__.py` - CLI interface
6. `mcp_servers.json` - MCP server configurations
7. `README.md` - Usage documentation

**IMPORTANT entry_points format:**

- MUST be: `{"start": "first-node-id"}`
- NOT: `{"first-node-id": ["input_keys"]}` (WRONG)
- NOT: `{"first-node-id"}` (WRONG - this is a set)

**IMPORTANT mcp_servers.json format:**

```json
{
  "hive-tools": {
    "transport": "stdio",
    "command": "python",
    "args": ["mcp_server.py", "--stdio"],
    "cwd": "../../tools",
    "description": "Hive tools MCP server"
  }
}
```

- NO `"mcpServers"` wrapper (that's Claude Desktop format, NOT hive format)
- `cwd` MUST be `"../../tools"` (relative from `exports/AGENT_NAME/` to `tools/`)

|
||||
**Use the example agent** at `.claude/skills/hive-create/examples/deep_research_agent/` as a template for file structure and patterns. It demonstrates: STEP 1/STEP 2 prompts, client-facing nodes, feedback loops, nullable_output_keys, and data tools.

**AFTER writing all files, tell the user:**

> Agent package created: `exports/AGENT_NAME/`
>
> **Files generated:**
>
> - `__init__.py` - Package exports
> - `agent.py` - Goal, nodes, edges, agent class
> - `config.py` - Runtime configuration
> - `__main__.py` - CLI interface
> - `nodes/__init__.py` - Node definitions
> - `mcp_servers.json` - MCP server config
> - `README.md` - Usage documentation

---

## STEP 6: Verify and Test

**RUN validation:**

```bash
cd /home/timothy/oss/hive && PYTHONPATH=exports uv run python -m AGENT_NAME validate
```

- If valid: Agent is complete!
- If errors: Fix the issues and re-run

**TELL the user the agent is ready** and suggest next steps:

- Run with mock mode to test without API calls
- Use `/hive-test` skill for comprehensive testing
- Use `/hive-credentials` if the agent needs API keys

---

## REFERENCE: Node Types

| Type         | tools param             | Use when                                |
| ------------ | ----------------------- | --------------------------------------- |
| `event_loop` | `'["tool1"]'` or `'[]'` | LLM-powered work with or without tools  |
| `function`   | N/A                     | Deterministic Python operations, no LLM |

---

## REFERENCE: NodeSpec Fields

| Field                  | Default | Description                                                           |
| ---------------------- | ------- | --------------------------------------------------------------------- |
| `client_facing`        | `False` | Streams output to user, blocks for input between turns                |
| `nullable_output_keys` | `[]`    | Output keys that may remain unset (mutually exclusive outputs)        |
| `max_node_visits`      | `1`     | Max executions per run. Set >1 for feedback loop targets. 0=unlimited |
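
The defaults above can be pictured as a minimal dataclass (an illustrative sketch only; the real `NodeSpec` lives in the framework and carries more fields):

```python
from dataclasses import dataclass, field


@dataclass
class NodeSpecDefaults:
    """Sketch of the NodeSpec fields listed in the reference table."""
    client_facing: bool = False  # stream to user and block for input?
    nullable_output_keys: list = field(default_factory=list)  # keys allowed to stay unset
    max_node_visits: int = 1  # executions per run; 0 means unlimited


spec = NodeSpecDefaults()
assert spec.client_facing is False
assert spec.nullable_output_keys == []
assert spec.max_node_visits == 1
```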

---

## REFERENCE: Edge Conditions & Priority

| Condition     | When edge is followed                 |
| ------------- | ------------------------------------- |
| `on_success`  | Source node completed successfully    |
| `on_failure`  | Source node failed                    |
| `always`      | Always, regardless of success/failure |
| `conditional` | When condition_expr evaluates to True |

**Priority:** Positive = forward edge (evaluated first). Negative = feedback edge (loops back to earlier node). Multiple ON_SUCCESS edges from same source = parallel execution (fan-out).
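
The forward-then-feedback ordering can be sketched as a small selection routine (a sketch only; the real executor also evaluates edge conditions and handles fan-out, and the higher-priority-first ordering among forward edges is an assumption):

```python
def next_edges(edges, source_id):
    """Pick outgoing edges: forward (priority > 0) first, feedback as fallback."""
    outgoing = [e for e in edges if e["source"] == source_id]
    # Forward edges, assumed highest priority first.
    forward = sorted((e for e in outgoing if e["priority"] > 0),
                     key=lambda e: e["priority"], reverse=True)
    # Feedback edges (negative priority) are only tried when no forward edge applies.
    feedback = sorted((e for e in outgoing if e["priority"] < 0),
                      key=lambda e: e["priority"], reverse=True)
    return forward if forward else feedback


edges = [
    {"source": "judge", "target": "report", "priority": 1},
    {"source": "judge", "target": "extractor", "priority": -1},
]
assert [e["target"] for e in next_edges(edges, "judge")] == ["report"]
```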

---

## REFERENCE: System Prompt Best Practice

For **internal** event_loop nodes (not client-facing), instruct the LLM to use `set_output`:

```
Use set_output(key, value) to store your results. For example:
- set_output("search_results", <your results as a JSON string>)

Do NOT return raw JSON. Use the set_output tool to produce outputs.
```

For **client-facing** event_loop nodes, use the STEP 1/STEP 2 pattern:

```
**STEP 1 — Respond to the user (text only, NO tool calls):**
[Present information, ask questions, etc.]

**STEP 2 — After the user responds, call set_output:**
- set_output("key", "value based on user's response")
```

This prevents the LLM from calling `set_output` before the user has had a chance to respond. The "NO tool calls" instruction in STEP 1 ensures the node blocks for user input before proceeding.

---

## EventLoopNode Runtime

EventLoopNodes are **auto-created** by `GraphExecutor` at runtime. Both direct `GraphExecutor` and `AgentRuntime` / `create_agent_runtime()` handle event_loop nodes automatically. No manual `node_registry` setup is needed.

```python
# Direct execution
from pathlib import Path

from framework.graph.executor import GraphExecutor
from framework.runtime.core import Runtime

storage_path = Path.home() / ".hive" / "my_agent"
storage_path.mkdir(parents=True, exist_ok=True)
runtime = Runtime(storage_path)

executor = GraphExecutor(
    runtime=runtime,
    llm=llm,
    tools=tools,
    tool_executor=tool_executor,
    storage_path=storage_path,
)
result = await executor.execute(graph=graph, goal=goal, input_data=input_data)
```

**DO NOT pass `runtime=None` to `GraphExecutor`** — it will crash with `'NoneType' object has no attribute 'start_run'`.

---

## COMMON MISTAKES TO AVOID

1. **Using tools that don't exist** - Always check `mcp__agent-builder__list_mcp_tools()` first
2. **Wrong entry_points format** - Must be `{"start": "node-id"}`, NOT a set or list
3. **Skipping validation** - Always validate nodes and graph before proceeding
4. **Not waiting for approval** - Always ask user before major steps
5. **Displaying this file** - Execute the steps, don't show documentation
6. **Too many thin nodes** - Prefer fewer, richer nodes (4 nodes > 8 nodes)
7. **Missing STEP 1/STEP 2 in client-facing prompts** - Client-facing nodes need explicit phases to prevent premature set_output
8. **Forgetting nullable_output_keys** - Mark input_keys that only arrive on certain edges (e.g., feedback) as nullable on the receiving node
9. **Adding framework gating for LLM behavior** - Fix prompts or use judges, not ad-hoc code
10. **Writing code before user approves the graph** - Always get approval on goal, nodes, and graph BEFORE writing any agent code
11. **Wrong mcp_servers.json format** - Use flat format (no `"mcpServers"` wrapper), and `cwd` must be `"../../tools"` not `"tools"`
+10
-5
````diff
@@ -70,7 +70,9 @@ def tui(mock, verbose, debug):
     try:
         from framework.tui.app import AdenTUI
     except ImportError:
-        click.echo("TUI requires the 'textual' package. Install with: pip install textual")
+        click.echo(
+            "TUI requires the 'textual' package. Install with: pip install textual"
+        )
         sys.exit(1)

     from pathlib import Path
@@ -88,6 +90,9 @@ def tui(mock, verbose, debug):
     agent._event_bus = EventBus()
     agent._tool_registry = ToolRegistry()

+    storage_path = Path.home() / ".hive" / "deep_research_agent"
+    storage_path.mkdir(parents=True, exist_ok=True)
+
     mcp_config_path = Path(__file__).parent / "mcp_servers.json"
     if mcp_config_path.exists():
         agent._tool_registry.load_mcp_config(mcp_config_path)
@@ -104,9 +109,6 @@ def tui(mock, verbose, debug):
     tool_executor = agent._tool_registry.get_executor()
     graph = agent._build_graph()

-    storage_path = Path.home() / ".hive" / "deep_research_agent"
-    storage_path.mkdir(parents=True, exist_ok=True)
-
     runtime = create_agent_runtime(
         graph=graph,
         goal=agent.goal,
@@ -216,7 +218,9 @@ async def _interactive_shell(verbose=False):
         if "references" in output:
             click.echo("--- References ---\n")
             for ref in output.get("references", []):
-                click.echo(f"  [{ref.get('number', '?')}] {ref.get('title', '')} - {ref.get('url', '')}")
+                click.echo(
+                    f"  [{ref.get('number', '?')}] {ref.get('title', '')} - {ref.get('url', '')}"
+                )
             click.echo("\n")
     else:
         click.echo(f"\nResearch failed: {result.error}\n")
@@ -227,6 +231,7 @@ async def _interactive_shell(verbose=False):
     except Exception as e:
         click.echo(f"Error: {e}", err=True)
         import traceback
+
         traceback.print_exc()
     finally:
         await agent.stop()
````
+6
````diff
@@ -166,6 +166,11 @@ class DeepResearchAgent:
             edges=self.edges,
             default_model=self.config.model,
             max_tokens=self.config.max_tokens,
+            loop_config={
+                "max_iterations": 100,
+                "max_tool_calls_per_turn": 20,
+                "max_history_tokens": 32000,
+            },
         )

     def _setup(self, mock_mode=False) -> GraphExecutor:
@@ -203,6 +208,7 @@ class DeepResearchAgent:
             tool_executor=tool_executor,
             event_bus=self._event_bus,
             storage_path=storage_path,
+            loop_config=self._graph.loop_config,
         )

         return self._executor
````
+1
-1
````diff
@@ -24,7 +24,7 @@ def _load_preferred_model() -> str:
 class RuntimeConfig:
     model: str = field(default_factory=_load_preferred_model)
     temperature: float = 0.7
-    max_tokens: int = 8192
+    max_tokens: int = 40000
     api_key: str | None = None
     api_base: str | None = None

````
+34
-19
````diff
@@ -102,41 +102,56 @@ Should we proceed to writing the final report?
 )

 # Node 4: Report (client-facing)
-# Writes the final report and presents it to the user.
+# Writes an HTML report, serves the link to the user, and answers follow-ups.
 report_node = NodeSpec(
     id="report",
     name="Write & Deliver Report",
-    description="Write a cited report from the findings and present it to the user",
+    description="Write a cited HTML report from the findings and present it to the user",
     node_type="event_loop",
     client_facing=True,
     input_keys=["findings", "sources", "research_brief"],
     output_keys=["delivery_status"],
     system_prompt="""\
-Write a comprehensive research report and present it to the user.
+Write a comprehensive research report as an HTML file and present it to the user.

-**STEP 1 — Write and present the report (text only, NO tool calls):**
+**STEP 1 — Write the HTML report (tool calls, NO text to user yet):**

-Report structure:
-1. **Executive Summary** (2-3 paragraphs)
-2. **Findings** (organized by theme, with [n] citations)
-3. **Analysis** (synthesis, implications, areas of debate)
-4. **Conclusion** (key takeaways, confidence assessment)
-5. **References** (numbered list of sources cited)
+1. Compose a complete, self-contained HTML document with embedded CSS styling.
+   Use a clean, readable design: max-width container, pleasant typography,
+   numbered citation links, a table of contents, and a references section.

-Requirements:
-- Every factual claim must cite its source with [n] notation
-- Be objective — present multiple viewpoints where sources disagree
-- Distinguish well-supported conclusions from speculation
-- Answer the original research questions from the brief
+   Report structure inside the HTML:
+   - Title & date
+   - Executive Summary (2-3 paragraphs)
+   - Table of Contents
+   - Findings (organized by theme, with [n] citation links)
+   - Analysis (synthesis, implications, areas of debate)
+   - Conclusion (key takeaways, confidence assessment)
+   - References (numbered list with clickable URLs)

-End by asking the user if they have questions or want to save the report.
+   Requirements:
+   - Every factual claim must cite its source with [n] notation
+   - Be objective — present multiple viewpoints where sources disagree
+   - Distinguish well-supported conclusions from speculation
+   - Answer the original research questions from the brief

-**STEP 2 — After the user responds:**
+2. Save the HTML file:
+   save_data(filename="report.html", data=<your_html>)
+
+3. Get the clickable link:
+   serve_file_to_user(filename="report.html", label="Research Report")
+
+**STEP 2 — Present the link to the user (text only, NO tool calls):**
+
+Tell the user the report is ready and include the file:// URI from
+serve_file_to_user so they can click it to open. Give a brief summary
+of what the report covers. Ask if they have questions.
+
+**STEP 3 — After the user responds:**
 - Answer follow-up questions from the research material
 - If they want to save, use write_to_file tool
 - When the user is satisfied: set_output("delivery_status", "completed")
 """,
-    tools=["write_to_file"],
+    tools=["save_data", "serve_file_to_user", "load_data", "list_data_files"],
 )

 __all__ = [
````
+91
-140
````diff
@@ -1,10 +1,10 @@
 ---
-name: setup-credentials
+name: hive-credentials
 description: Set up and install credentials for an agent. Detects missing credentials from agent config, collects them from the user, and stores them securely in the local encrypted store at ~/.hive/credentials.
 license: Apache-2.0
 metadata:
   author: hive
-  version: "2.2"
+  version: "2.3"
   type: utility
 ---

````
````diff
@@ -31,95 +31,50 @@ Determine which agent needs credentials. The user will either:

 Locate the agent's directory under `exports/{agent_name}/`.

-### Step 2: Detect Required Credentials (Bash-First)
+### Step 2: Detect Missing Credentials

-Use bash commands to determine what the agent needs and what's already configured. This avoids Python import issues and works even when `HIVE_CREDENTIAL_KEY` is not set.
+Use the `check_missing_credentials` MCP tool to detect what the agent needs and what's already configured. This tool loads the agent, inspects its required tools and node types, maps them to credentials via `CREDENTIAL_SPECS`, and checks both the encrypted store and environment variables.

-#### Step 2a: Read Agent Requirements
-
-Extract `required_tools` and node types from the agent config:
-
-```bash
-# Get required tools
-jq -r '.required_tools[]?' exports/{agent_name}/agent.json 2>/dev/null
-
-# Get node types from graph nodes
-jq -r '.graph.nodes[]?.node_type' exports/{agent_name}/agent.json 2>/dev/null | sort -u
-```
+```
+check_missing_credentials(agent_path="exports/{agent_name}")
+```
````
````diff
-Map the extracted tools and node types to credentials by reading the spec files directly:
+The tool returns a JSON response:

-```bash
-# Read all credential specs — each file defines tools, node_types, env_var, and credential_id
-cat tools/src/aden_tools/credentials/llm.py tools/src/aden_tools/credentials/search.py tools/src/aden_tools/credentials/email.py tools/src/aden_tools/credentials/integrations.py
-```
+```json
+{
+  "agent": "exports/{agent_name}",
+  "missing": [
+    {
+      "credential_name": "brave_search",
+      "env_var": "BRAVE_SEARCH_API_KEY",
+      "description": "Brave Search API key for web search",
+      "help_url": "https://brave.com/search/api/",
+      "tools": ["web_search"]
+    }
+  ],
+  "available": [
+    {
+      "credential_name": "anthropic",
+      "env_var": "ANTHROPIC_API_KEY",
+      "source": "encrypted_store"
+    }
+  ],
+  "total_missing": 1,
+  "ready": false
+}
+```
````
````diff
-For each `CredentialSpec`, match its `tools` and `node_types` lists against the agent's required tools and node types. Extract the `env_var`, `credential_id`, and `credential_group` for every match. This is the list of needed credentials.
-
-#### Step 2b: Check Existing Credential Sources
-
-For each needed credential, check three sources. A credential is "found" if it exists in ANY of them:
-
-**1. Encrypted store metadata index** (unencrypted JSON — no decryption key needed):
-
-```bash
-cat ~/.hive/credentials/metadata/index.json 2>/dev/null | jq -r '.credentials | keys[]'
-```
-
-If a credential ID appears in this list, it is stored in the encrypted store.
-
-**2. Environment variables:**
-
-```bash
-# Check each needed env var, e.g.:
-printenv ANTHROPIC_API_KEY > /dev/null 2>&1 && echo "ANTHROPIC_API_KEY: set" || echo "ANTHROPIC_API_KEY: not set"
-printenv BRAVE_SEARCH_API_KEY > /dev/null 2>&1 && echo "BRAVE_SEARCH_API_KEY: set" || echo "BRAVE_SEARCH_API_KEY: not set"
-```
-
-**3. Project `.env` file:**
-
-```bash
-# Check each needed env var, e.g.:
-grep -q '^ANTHROPIC_API_KEY=' .env 2>/dev/null && echo "ANTHROPIC_API_KEY: in .env" || echo "ANTHROPIC_API_KEY: not in .env"
-grep -q '^BRAVE_SEARCH_API_KEY=' .env 2>/dev/null && echo "BRAVE_SEARCH_API_KEY: in .env" || echo "BRAVE_SEARCH_API_KEY: not in .env"
-```
-
-#### Step 2c: HIVE_CREDENTIAL_KEY Check
-
-If any credentials were found in the encrypted store metadata index, verify the encryption key is available. The key is typically persisted to shell config by a previous setup-credentials run.
-
-Check both the current session AND shell config files:
-
-```bash
-# Check 1: Current session
-printenv HIVE_CREDENTIAL_KEY > /dev/null 2>&1 && echo "session: set" || echo "session: not set"
-
-# Check 2: Shell config files (where setup-credentials persists it)
-# Note: check each file individually to avoid non-zero exit when one doesn't exist
-for f in ~/.zshrc ~/.bashrc ~/.profile; do [ -f "$f" ] && grep -q 'HIVE_CREDENTIAL_KEY' "$f" && echo "$f"; done
-```
-
-Decision logic:
-- **In current session** — no action needed, credentials in the store are usable
-- **In shell config but NOT in current session** — the key is persisted but this shell hasn't sourced it. Run `source ~/.zshrc` (or `~/.bashrc`), then re-check. Credentials in the store are usable after sourcing.
-- **Not in session AND not in shell config** — the key was never persisted. Warn the user that credentials in the store cannot be decrypted. Help fix the key situation (recover/re-persist), do NOT re-collect credential values that are already stored.
-
-#### Step 2d: Compute Missing & Group
-
-Diff the "needed" credentials against the "found" credentials to get the truly missing list.
-
-Group related credentials by their `credential_group` field from the spec files. Credentials that share the same non-empty `credential_group` value should be presented as a single setup step rather than asking for each one individually.
````
````diff
-**If nothing is missing and there's no HIVE_CREDENTIAL_KEY issue:** Report all credentials as configured and skip Steps 3-5. Example:
+**If `ready` is true (nothing missing):** Report all credentials as configured and skip Steps 3-5. Example:

 ```
 All required credentials are already configured:
-✓ anthropic (ANTHROPIC_API_KEY) — found in encrypted store
-✓ brave_search (BRAVE_SEARCH_API_KEY) — found in environment
+✓ anthropic (ANTHROPIC_API_KEY)
+✓ brave_search (BRAVE_SEARCH_API_KEY)
 Your agent is ready to run!
 ```

-**If credentials are missing:** Continue to Step 3 with only the missing ones.
+**If credentials are missing:** Continue to Step 3 with the `missing` list.

 ### Step 3: Present Auth Options for Each Missing Credential

````
````diff
@@ -153,7 +108,7 @@ Present the available options using AskUserQuestion:
 Choose how to configure HUBSPOT_ACCESS_TOKEN:

 1) Aden Platform (OAuth) (Recommended)
-   Secure OAuth2 flow via integration.adenhq.com
+   Secure OAuth2 flow via hive.adenhq.com
    - Quick setup with automatic token refresh
    - No need to manage API keys manually

@@ -170,6 +125,22 @@ Choose how to configure HUBSPOT_ACCESS_TOKEN:

 ### Step 4: Execute Auth Flow Based on User Choice

+#### Prerequisite: Ensure HIVE_CREDENTIAL_KEY Is Available
+
+Before storing any credentials, verify `HIVE_CREDENTIAL_KEY` is set (needed to encrypt/decrypt the local store). Check both the current session and shell config:
+
+```bash
+# Check current session
+printenv HIVE_CREDENTIAL_KEY > /dev/null 2>&1 && echo "session: set" || echo "session: not set"
+
+# Check shell config files
+for f in ~/.zshrc ~/.bashrc ~/.profile; do [ -f "$f" ] && grep -q 'HIVE_CREDENTIAL_KEY' "$f" && echo "$f"; done
+```
+
+- **In current session** — proceed to store credentials
+- **In shell config but NOT in current session** — run `source ~/.zshrc` (or `~/.bashrc`) first, then proceed
+- **Not set anywhere** — `EncryptedFileStorage` will auto-generate one. After storing, tell the user to persist it: `export HIVE_CREDENTIAL_KEY="{generated_key}"` in their shell profile
+
 #### Option 1: Aden Platform (OAuth)

 This is the recommended flow for supported integrations (HubSpot, etc.).
@@ -195,7 +166,7 @@ If not set, guide user to get one from Aden (this is where they do OAuth):
 from aden_tools.credentials import open_browser, get_aden_setup_url

 # Open browser to Aden - user will sign up and connect integrations there
-url = get_aden_setup_url()  # https://integration.adenhq.com/setup
+url = get_aden_setup_url()  # https://hive.adenhq.com
 success, msg = open_browser(url)

 print("Please sign in to Aden and connect your integrations (HubSpot, etc.).")
````
````diff
@@ -272,7 +243,7 @@ print(f"Synced credentials: {synced}")
 # If the required credential wasn't synced, the user hasn't authorized it on Aden yet
 if "hubspot" not in synced:
     print("HubSpot not found in your Aden account.")
-    print("Please visit https://integration.adenhq.com to connect HubSpot, then try again.")
+    print("Please visit https://hive.adenhq.com to connect HubSpot, then try again.")
 ```

 For more control over the sync process:
````
````diff
@@ -442,28 +413,38 @@ config_path.write_text(json.dumps(config, indent=2))

 ### Step 6: Verify All Credentials

-Run validation again to confirm everything is set:
+Use the `verify_credentials` MCP tool to confirm everything is properly configured:

-```python
-runner = AgentRunner.load("exports/{agent_name}")
-validation = runner.validate()
-assert not validation.missing_credentials, "Still missing credentials!"
-```
+```
+verify_credentials(agent_path="exports/{agent_name}")
+```

-Report the result to the user.
+The tool returns:
+
+```json
+{
+  "agent": "exports/{agent_name}",
+  "ready": true,
+  "missing_credentials": [],
+  "warnings": [],
+  "errors": []
+}
+```
+
+If `ready` is true, report success. If `missing_credentials` is non-empty, identify what failed and loop back to Step 3 for the remaining credentials.

 ## Health Check Reference

 Health checks validate credentials by making lightweight API calls:

-| Credential      | Endpoint                                | What It Checks                     |
-| --------------- | --------------------------------------- | ---------------------------------- |
-| `anthropic`     | `POST /v1/messages`                     | API key validity                   |
-| `brave_search`  | `GET /res/v1/web/search?q=test&count=1` | API key validity                   |
-| `google_search` | `GET /customsearch/v1?q=test&num=1`     | API key + CSE ID validity          |
-| `github`        | `GET /user`                             | Token validity, user identity      |
-| `hubspot`       | `GET /crm/v3/objects/contacts?limit=1`  | Bearer token validity, CRM scopes  |
-| `resend`        | `GET /domains`                          | API key validity                   |
+| Credential      | Endpoint                                | What It Checks                    |
+| --------------- | --------------------------------------- | --------------------------------- |
+| `anthropic`     | `POST /v1/messages`                     | API key validity                  |
+| `brave_search`  | `GET /res/v1/web/search?q=test&count=1` | API key validity                  |
+| `google_search` | `GET /customsearch/v1?q=test&num=1`     | API key + CSE ID validity         |
+| `github`        | `GET /user`                             | Token validity, user identity     |
+| `hubspot`       | `GET /crm/v3/objects/contacts?limit=1`  | Bearer token validity, CRM scopes |
+| `resend`        | `GET /domains`                          | API key validity                  |

 ```python
 from aden_tools.credentials import check_credential_health, HealthCheckResult
````
````diff
@@ -560,60 +541,27 @@ token = store.get_key("hubspot", "access_token")
 ## Example Session

 ```
-User: /setup-credentials for my research-agent
+User: /hive-credentials for my research-agent

 Agent: Let me check what credentials your research-agent needs.

-[Reads agent config]
-$ jq -r '.required_tools[]?' exports/research-agent/agent.json
-web_search
-google_search
-
-$ jq -r '.graph.nodes[]?.node_type' exports/research-agent/agent.json | sort -u
-event_loop
-
-[Maps tools/nodes to credentials using lookup table]
-Needed: anthropic, brave_search, google_search, google_cse
-
-[Checks encrypted store metadata index]
-$ cat ~/.hive/credentials/metadata/index.json | jq -r '.credentials | keys[]'
-anthropic
-brave_search
-
-[Checks environment variables]
-$ printenv ANTHROPIC_API_KEY > /dev/null 2>&1 && echo "set" || echo "not set"
-not set
-$ printenv BRAVE_SEARCH_API_KEY > /dev/null 2>&1 && echo "set" || echo "not set"
-not set
-$ printenv GOOGLE_API_KEY > /dev/null 2>&1 && echo "set" || echo "not set"
-not set
-$ printenv GOOGLE_CSE_ID > /dev/null 2>&1 && echo "set" || echo "not set"
-not set
-
-[Checks HIVE_CREDENTIAL_KEY since credentials found in store]
-$ printenv HIVE_CREDENTIAL_KEY > /dev/null 2>&1 && echo "session: set" || echo "session: not set"
-session: not set
-$ for f in ~/.zshrc ~/.bashrc ~/.profile; do [ -f "$f" ] && grep -q 'HIVE_CREDENTIAL_KEY' "$f" && echo "$f"; done
-/Users/user/.zshrc
-
-[Key is in shell config but not current session — sourcing it]
-$ source ~/.zshrc
-
-[Computes missing credentials]
-Found:
-✓ anthropic (ANTHROPIC_API_KEY) — in encrypted store
-✓ brave_search (BRAVE_SEARCH_API_KEY) — in encrypted store
-Missing:
-✗ google_search (GOOGLE_API_KEY)
-✗ google_cse (GOOGLE_CSE_ID)
+[Calls check_missing_credentials(agent_path="exports/research-agent")]
+→ Returns:
+   available: anthropic (encrypted_store), brave_search (encrypted_store)
+   missing: google_search (GOOGLE_API_KEY), google_cse (GOOGLE_CSE_ID)
+   ready: false

 Agent: 2 of 4 required credentials are already configured. Only Google Custom
-Search needs setup (2 values as a single group).
+Search needs setup (2 values).

 --- Setting up Google Custom Search (google_search + google_cse) ---

 This requires two values that work together.

-[Checks HIVE_CREDENTIAL_KEY before storing]
-$ printenv HIVE_CREDENTIAL_KEY > /dev/null 2>&1 && echo "set" || echo "not set"
-set
-
 First, the Google API Key:
 1. Go to https://console.cloud.google.com/apis/credentials
 2. Create a new project (or select an existing one)
@@ -640,10 +588,13 @@ Now, the Custom Search Engine ID:

 ✓ Google Custom Search credentials valid

+[Calls verify_credentials(agent_path="exports/research-agent")]
+→ Returns: ready: true, missing_credentials: []
+
 All credentials are now configured:
 ✓ anthropic (ANTHROPIC_API_KEY) — already in encrypted store
 ✓ brave_search (BRAVE_SEARCH_API_KEY) — already in encrypted store
 ✓ google_search (GOOGLE_API_KEY) — stored in encrypted store
 ✓ google_cse (GOOGLE_CSE_ID) — stored in encrypted store
-Your agent is ready to run!
+Your agent is ready to run!
 ```
````
+52
-41
````diff
@@ -1,19 +1,19 @@
 ---
-name: building-agents-patterns
+name: hive-patterns
 description: Best practices, patterns, and examples for building goal-driven agents. Includes client-facing interaction, feedback edges, judge patterns, fan-out/fan-in, context management, and anti-patterns.
 license: Apache-2.0
 metadata:
   author: hive
   version: "2.0"
   type: reference
-  part_of: building-agents
+  part_of: hive
 ---

 # Building Agents - Patterns & Best Practices

 Design patterns, examples, and best practices for building robust goal-driven agents.

-**Prerequisites:** Complete agent structure using `building-agents-construction`.
+**Prerequisites:** Complete agent structure using `hive-create`.

 ## Practical Example: Hybrid Workflow

````
````diff
@@ -97,6 +97,7 @@ research_node = NodeSpec(
 ```

 **How it works:**
+
 - Client-facing nodes stream LLM text to the user and block for input after each response
 - User input is injected via `node.inject_event(text)`
 - When the LLM calls `set_output` to produce structured outputs, the judge evaluates and ACCEPTs
````
````diff
@@ -107,13 +108,13 @@ research_node = NodeSpec(

 ### When to Use client_facing

-| Scenario | client_facing | Why |
-|----------|:---:|-----|
-| Gathering user requirements | Yes | Need user input |
-| Human review/approval checkpoint | Yes | Need human decision |
-| Data processing (scanning, scoring) | No | Runs autonomously |
-| Report generation | No | No user input needed |
-| Final confirmation before action | Yes | Need explicit approval |
+| Scenario                            | client_facing | Why                    |
+| ----------------------------------- | :-----------: | ---------------------- |
+| Gathering user requirements         |      Yes      | Need user input        |
+| Human review/approval checkpoint    |      Yes      | Need human decision    |
+| Data processing (scanning, scoring) |      No       | Runs autonomously      |
+| Report generation                   |      No       | No user input needed   |
+| Final confirmation before action    |      Yes      | Need explicit approval |

 > **Legacy Note:** The `pause_nodes` / `entry_points` pattern still works for backward compatibility but `client_facing=True` is preferred for new agents.

````
@@ -158,22 +159,24 @@ EdgeSpec(
|
||||
```
|
||||
|
||||
**Key concepts:**
|
||||
|
||||
- `nullable_output_keys`: Lists output keys that may remain unset. The node sets exactly one of the mutually exclusive keys per execution.
|
||||
- `max_node_visits`: Must be >1 on the feedback target (extractor) so it can re-execute. Default is 1.
|
||||
- `priority`: Positive = forward edge (evaluated first). Negative = feedback edge. The executor tries forward edges first; if none match, falls back to feedback edges.
|
||||
|
||||
### Routing Decision Table

| Pattern                | Old Approach            | New Approach                                  |
| ---------------------- | ----------------------- | --------------------------------------------- |
| Conditional branching  | `router` node           | Conditional edges with `condition_expr`       |
| Binary approve/reject  | `pause_nodes` + resume  | `client_facing=True` + `nullable_output_keys` |
| Loop-back on rejection | Manual entry_points     | Feedback edge with `priority=-1`              |
| Multi-way routing      | Router with routes dict | Multiple conditional edges with priorities    |
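
As a rough picture of how conditional edges replace a router node, a `condition_expr` can be modeled as an expression evaluated against a node's outputs. The helper below is hypothetical — the framework's actual expression evaluation may differ:

```python
# Hypothetical model of condition_expr evaluation against node outputs.
def eval_condition(condition_expr: str, outputs: dict) -> bool:
    # Restricted eval over output keys; a real framework would typically
    # use a safer, purpose-built expression language.
    return bool(eval(condition_expr, {"__builtins__": {}}, dict(outputs)))

outputs = {"decision": "approved", "score": 0.9}
print(eval_condition("decision == 'approved' and score > 0.5", outputs))  # True
```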
## Judge Patterns

**Core Principle: The judge is the SOLE mechanism for acceptance decisions.** Never add ad-hoc framework gating to compensate for LLM behavior. If the LLM calls `set_output` prematurely, fix the system prompt or use a custom judge. Anti-patterns to avoid:

- Output rollback logic
- `_user_has_responded` flags
- Premature `set_output` rejection
@@ -184,6 +187,7 @@ Judges control when an event_loop node's loop exits. Choose based on validation

### Implicit Judge (Default)

When no judge is configured, the implicit judge ACCEPTs when:

- The LLM finishes its response with no tool calls
- All required output keys have been set via `set_output`
@@ -219,11 +223,11 @@ class SchemaJudge:

### When to Use Which Judge

| Judge           | Use When                              | Example                |
| --------------- | ------------------------------------- | ---------------------- |
| Implicit (None) | Output keys are sufficient validation | Simple data extraction |
| SchemaJudge     | Need structural validation of outputs | API response parsing   |
| Custom          | Domain-specific validation logic      | Score must be 0.0-1.0  |
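
A custom judge for the 0.0-1.0 score case in the table might look like the sketch below. The `judge` method signature here is an assumption for illustration, not the framework's exact protocol:

```python
# Sketch of a domain-specific judge: ACCEPT only when score is in [0.0, 1.0].
class ScoreRangeJudge:
    def judge(self, outputs: dict) -> tuple[bool, str]:
        score = outputs.get("score")
        if score is None:
            return False, "score not set"
        if not (0.0 <= score <= 1.0):
            return False, f"score {score} out of range [0.0, 1.0]"
        return True, "accepted"

judge = ScoreRangeJudge()
print(judge.judge({"score": 0.7}))  # (True, 'accepted')
print(judge.judge({"score": 1.5}))  # (False, 'score 1.5 out of range [0.0, 1.0]')
```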
## Fan-Out / Fan-In (Parallel Execution)

@@ -244,6 +248,7 @@ EdgeSpec(id="scorer-to-extractor", source="scorer", target="extractor",
```

**Requirements:**

- Parallel event_loop nodes must have **disjoint output_keys** (no key written by both)
- Only one parallel branch may contain a `client_facing` node
- Fan-in node receives outputs from all completed branches in shared memory
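
The disjoint-output-keys requirement can be checked mechanically. A small validation sketch (the branch names and helper are illustrative, not part of the framework):

```python
from itertools import combinations

def check_disjoint_output_keys(branches: dict) -> list:
    """Return conflicts where two parallel branches write the same output key."""
    conflicts = []
    for (a, keys_a), (b, keys_b) in combinations(branches.items(), 2):
        for key in sorted(keys_a & keys_b):
            conflicts.append(f"{a} and {b} both write '{key}'")
    return conflicts

branches = {"scorer": {"score"}, "extractor": {"entities"}}
print(check_disjoint_output_keys(branches))  # [] — valid fan-out
```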
@@ -253,6 +258,7 @@ EdgeSpec(id="scorer-to-extractor", source="scorer", target="extractor",

### Tiered Compaction

EventLoopNode automatically manages context window usage with tiered compaction:

1. **Pruning** — Old tool results replaced with compact placeholders (zero-cost, no LLM call)
2. **Normal compaction** — LLM summarizes older messages
3. **Aggressive compaction** — Keeps only recent messages + summary
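
The three tiers can be pictured as an escalating policy keyed to context usage. This sketch models only the tier decision; the thresholds are invented for illustration and are not the framework's actual values:

```python
# Illustrative tier selection for context compaction (thresholds invented).
def compaction_tier(used_tokens: int, budget: int) -> str:
    ratio = used_tokens / budget
    if ratio < 0.7:
        return "none"
    if ratio < 0.8:
        return "prune"        # replace old tool results with placeholders
    if ratio < 0.9:
        return "normal"       # LLM summarizes older messages
    return "aggressive"       # keep only recent messages + summary

print(compaction_tier(60_000, 100_000))  # none
print(compaction_tier(85_000, 100_000))  # normal
```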
@@ -265,17 +271,20 @@ The framework automatically truncates large tool results and saves full content

For explicit data management, use the data tools (real MCP tools, not synthetic):

```python
# save_data, load_data, list_data_files, serve_file_to_user are real MCP tools
# data_dir is auto-injected by the framework — the LLM never sees it

# Saving large results
save_data(filename="sources.json", data=large_json_string)

# Reading with pagination (line-based offset/limit)
load_data(filename="sources.json", offset=0, limit=50)

# Listing available files
list_data_files()

# Serving a file to the user as a clickable link
serve_file_to_user(filename="report.html", label="Research Report")
```

Add data tools to nodes that handle large tool results:

@@ -287,7 +296,7 @@ research_node = NodeSpec(
)
```

`data_dir` is a framework context parameter — auto-injected at call time. `GraphExecutor.execute()` sets it per-execution via `ToolRegistry.set_execution_context(data_dir=...)` (using `contextvars` for concurrency safety), ensuring it matches the session-scoped spillover directory.
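
The `contextvars` mechanism described above can be sketched as follows. This is a simplified model — the real `ToolRegistry` is more involved, and the function names other than `set_execution_context` are assumptions:

```python
import contextvars

# Each concurrent execution sees its own data_dir, even on shared tool objects,
# because ContextVar values are isolated per asyncio task / context.
_data_dir: contextvars.ContextVar = contextvars.ContextVar("data_dir", default=None)

def set_execution_context(data_dir: str) -> None:
    _data_dir.set(data_dir)

def save_data(filename: str, data: str) -> str:
    # data_dir is injected here; the LLM never passes it explicitly.
    directory = _data_dir.get()
    return f"{directory}/{filename}"

set_execution_context("/tmp/session-123/spillover")
print(save_data("sources.json", "{}"))  # /tmp/session-123/spillover/sources.json
```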
## Anti-Patterns

@@ -304,18 +313,19 @@ The `data_dir` is passed by the framework (from the node's spillover directory).

A common mistake is splitting work into too many small single-purpose nodes. Each node boundary requires serializing outputs, losing in-context information, and adding edge complexity.

| Bad (8 thin nodes)  | Good (4 rich nodes)                 |
| ------------------- | ----------------------------------- |
| parse-query         | intake (client-facing)              |
| search-sources      | research (search + fetch + analyze) |
| fetch-content       | review (client-facing)              |
| evaluate-sources    | report (write + deliver)            |
| synthesize-findings |                                     |
| write-report        |                                     |
| quality-check       |                                     |
| save-report         |                                     |

**Why fewer nodes are better:**

- The LLM retains full context of its work within a single node
- A research node that searches, fetches, and analyzes keeps all source material in its conversation history
- Fewer edges mean a simpler graph and fewer failure points
@@ -324,6 +334,7 @@ A common mistake is splitting work into too many small single-purpose nodes. Eac
### MCP Tools - Correct Usage

**MCP tools OK for:**

- `test_node` — Validate node configuration with mock inputs
- `validate_graph` — Check graph structure
- `configure_loop` — Set event loop parameters
@@ -356,7 +367,7 @@ When agent is complete, transition to testing phase:
### Pre-Testing Checklist

- [ ] Agent structure validates: `uv run python -m agent_name validate`
- [ ] All nodes defined in `nodes/__init__.py`
- [ ] All edges connect valid nodes with correct priorities
- [ ] Feedback edge targets have `max_node_visits > 1`
- [ ] Client-facing nodes have meaningful system prompts
@@ -364,10 +375,10 @@ When agent is complete, transition to testing phase:

## Related Skills

- **hive-concepts** — Fundamental concepts (node types, edges, event loop architecture)
- **hive-create** — Step-by-step building process
- **hive-test** — Test and validate agents
- **hive** — Complete workflow orchestrator

---

@@ -1,11 +1,11 @@

---
name: hive-test
description: Run goal-based evaluation tests for agents. Use when you need to verify an agent meets its goals, debug failing tests, or iterate on agent improvements based on test results.
---

# Testing Workflow

This skill provides tools for testing agents built with the hive-create skill.

## Workflow Overview

@@ -61,7 +61,7 @@ mcp__agent-builder__debug_test(

# Testing Agents with MCP Tools

Run goal-based evaluation tests for agents built with the hive-create skill.

**Key Principle: MCP tools provide guidelines, Claude writes tests directly**

- ✅ Get guidelines: `generate_constraint_tests`, `generate_success_tests` → return templates and guidelines
@@ -279,7 +279,7 @@ if missing_creds:

```
┌─────────────────────────────────────────────────────────────────────────┐
│ GOAL STAGE │
│ (hive-create skill) │
│ │
│ 1. User defines goal with success_criteria and constraints │
│ 2. Goal written to agent.py immediately │
@@ -289,7 +289,7 @@ if missing_creds:
↓
┌─────────────────────────────────────────────────────────────────────────┐
│ AGENT STAGE │
│ (hive-create skill) │
│ │
│ Build nodes + edges, written immediately to files │
│ Constraint tests can run during development: │
@@ -608,7 +608,7 @@ Edit(
)

# 4. May need to regenerate agent nodes if goal changed significantly
# This requires going back to hive-create skill
```
#### EDGE_CASE → Add Test and Fix

@@ -1027,17 +1027,17 @@ async def test_client_facing_node(mock_mode):
    assert result.success or result.paused_at is not None
```

## Integration with hive-create

### Handoff Points

| Scenario | From | To | Action |
|----------|------|-----|--------|
| Agent built, ready to test | hive-create | hive-test | Generate success tests |
| LOGIC_ERROR found | hive-test | hive-create | Update goal, rebuild |
| IMPLEMENTATION_ERROR found | hive-test | Direct fix | Edit agent files, re-run tests |
| EDGE_CASE found | hive-test | hive-test | Add edge case test |
| All tests pass | hive-test | Done | Agent validated ✅ |

### Iteration Speed Comparison

@@ -4,7 +4,7 @@ This example walks through testing a YouTube research agent that finds relevant

## Prerequisites

- Agent built with hive-create skill at `exports/youtube-research/`
- Goal defined with success criteria and constraints

## Step 1: Load the Goal

@@ -283,11 +283,11 @@ result = debug_test(

Since this is an **IMPLEMENTATION_ERROR**, we:

1. **Don't restart** the Goal → Agent → Eval flow
2. **Fix the agent** using the hive-create skill:
   - Modify `filter_node` to handle null results
3. **Re-run Eval** (tests only)

### Fix in hive-create:

```python
# Update the filter_node to handle null
@@ -1,32 +1,46 @@

---
name: hive
description: Complete workflow for building, implementing, and testing goal-driven agents. Orchestrates hive-* skills. Use when starting a new agent project, unsure which skill to use, or need end-to-end guidance.
license: Apache-2.0
metadata:
  author: hive
  version: "2.0"
  type: workflow-orchestrator
  orchestrates:
    - hive-concepts
    - hive-create
    - hive-patterns
    - hive-test
    - hive-credentials
---

# Agent Development Workflow

**THIS IS AN EXECUTABLE WORKFLOW. DO NOT explore the codebase or read source files. ROUTE to the correct skill IMMEDIATELY.**

When this skill is loaded, determine what the user needs and invoke the appropriate skill NOW:

- **User wants to build an agent** → Invoke `/hive-create` immediately
- **User wants to test an agent** → Invoke `/hive-test` immediately
- **User wants to learn concepts** → Invoke `/hive-concepts` immediately
- **User wants patterns/optimization** → Invoke `/hive-patterns` immediately
- **User wants to set up credentials** → Invoke `/hive-credentials` immediately
- **Unclear what user needs** → Ask the user (do NOT explore the codebase to figure it out)

**DO NOT:** Read source files, explore the codebase, search for code, or do any investigation before routing. The sub-skills handle all of that.

---

Complete Standard Operating Procedure (SOP) for building production-ready goal-driven agents.

## Overview

This workflow orchestrates specialized skills to take you from initial concept to production-ready agent:

1. **Understand Concepts** → `/hive-concepts` (optional)
2. **Build Structure** → `/hive-create`
3. **Optimize Design** → `/hive-patterns` (optional)
4. **Setup Credentials** → `/hive-credentials` (if agent uses tools requiring API keys)
5. **Test & Validate** → `/hive-test`

## When to Use This Workflow

@@ -37,18 +51,18 @@ Use this meta-skill when:

- Want consistent, repeatable agent builds

**Skip this workflow** if:

- You only need to test an existing agent → use `/hive-test` directly
- You know exactly which phase you're in → use the specific skill directly

## Quick Decision Tree

```
"Need to understand agent concepts" → hive-concepts
"Build a new agent" → hive-create
"Optimize my agent design" → hive-patterns
"Need client-facing nodes or feedback loops" → hive-patterns
"Set up API keys for my agent" → hive-credentials
"Test my agent" → hive-test
"Not sure what I need" → Read phases below, then decide
"Agent has structure but needs implementation" → See agent directory STATUS.md
```
@@ -56,7 +70,7 @@ Use this meta-skill when:

## Phase 0: Understand Concepts (Optional)

**Duration**: 5-10 minutes
**Skill**: `/hive-concepts`
**Input**: Questions about agent architecture

### When to Use
@@ -78,7 +92,7 @@ Use this meta-skill when:

## Phase 1: Build Agent Structure

**Duration**: 15-30 minutes
**Skill**: `/hive-create`
**Input**: User requirements ("Build an agent that...")

### What This Phase Does
@@ -121,7 +135,7 @@ You're ready for Phase 2 when:

### Common Outputs

The hive-create skill produces:

```
exports/agent_name/
├── __init__.py (package exports)
@@ -141,7 +155,7 @@ exports/agent_name/

→ You may need to add Python functions or MCP tools (not covered by current skills)

**If you want to optimize the design:**
→ Proceed to Phase 1.5 (hive-patterns)

**If ready to test:**
→ Proceed to Phase 2
@@ -149,7 +163,7 @@ exports/agent_name/

## Phase 1.5: Optimize Design (Optional)

**Duration**: 10-15 minutes
**Skill**: `/hive-patterns`
**Input**: Completed agent structure

### When to Use
@@ -174,7 +188,7 @@ exports/agent_name/

## Phase 2: Test & Validate

**Duration**: 20-40 minutes
**Skill**: `/hive-test`
**Input**: Working agent from Phase 1

### What This Phase Does
@@ -251,9 +265,9 @@ You're done when:

```
User: "Build an agent that monitors files"
→ Use /hive-create
→ Agent structure created
→ Use /hive-test
→ Tests created and passing
→ Done: Production-ready agent
```
@@ -262,10 +276,10 @@ User: "Build an agent that monitors files"

```
User: "Build an agent (first time)"
→ Use /hive-concepts (understand concepts)
→ Use /hive-create (build structure)
→ Use /hive-patterns (optimize design)
→ Use /hive-test (validate)
→ Done: Production-ready agent
```

@@ -274,7 +288,7 @@ User: "Build an agent (first time)"

```
User: "Test my agent at exports/my_agent"
→ Skip Phase 1
→ Use /hive-test directly
→ Tests created
→ Done: Validated agent
```
@@ -283,10 +297,10 @@ User: "Test my agent at exports/my_agent"

```
User: "Build an agent"
→ Use /hive-create (Phase 1)
→ Implementation needed (see STATUS.md)
→ [User implements functions]
→ Use /hive-test (Phase 2)
→ Tests reveal bugs
→ [Fix bugs manually]
→ Re-run tests
@@ -297,41 +311,41 @@ User: "Build an agent"

```
User: "Build an agent with human review and feedback loops"
→ Use /hive-concepts (learn event loop, client-facing nodes)
→ Use /hive-create (build structure with feedback edges)
→ Use /hive-patterns (implement client-facing + feedback patterns)
→ Use /hive-test (validate review flows and edge routing)
→ Done: Agent with HITL checkpoints and review loops
```

## Skill Dependencies

```
hive (meta-skill)
│
├── hive-concepts (foundational)
│   ├── Architecture concepts (event loop, judges)
│   ├── Node types (event_loop, function)
│   ├── Edge routing and priority
│   ├── Tool discovery procedures
│   └── Workflow overview
│
├── hive-create (procedural)
│   ├── Creates package structure
│   ├── Defines goal
│   ├── Adds nodes (event_loop, function)
│   ├── Connects edges with priority routing
│   ├── Finalizes agent class
│   └── Requires: hive-concepts
│
├── hive-patterns (reference)
│   ├── Client-facing interaction patterns
│   ├── Feedback edges and review loops
│   ├── Judge patterns (implicit, SchemaJudge)
│   ├── Fan-out/fan-in parallel execution
│   └── Context management and anti-patterns
│
└── hive-test
    ├── Reads agent goal
    ├── Generates tests
    ├── Runs evaluation
```
@@ -351,7 +365,7 @@ agent-workflow (meta-skill)

- Check for STATUS.md or IMPLEMENTATION_GUIDE.md in agent directory
- Implementation may be needed (Python functions or MCP tools)
- This is expected - hive-create creates structure, not implementation
- See implementation guide for completion options

### "Tests are failing"

@@ -359,7 +373,7 @@ agent-workflow (meta-skill)

- Review test output for specific failures
- Check agent goal and success criteria
- Verify constraints are met
- Use `/hive-test` to debug and iterate
- Fix agent code and re-run tests

### "Not sure which phase I'm in"
@@ -420,10 +434,10 @@ You're done with the workflow when:

## Additional Resources

- **hive-concepts**: See `.claude/skills/hive-concepts/SKILL.md`
- **hive-create**: See `.claude/skills/hive-create/SKILL.md`
- **hive-patterns**: See `.claude/skills/hive-patterns/SKILL.md`
- **hive-test**: See `.claude/skills/hive-test/SKILL.md`
- **Agent framework docs**: See `core/README.md`
- **Example agents**: See `exports/` directory

@@ -431,35 +445,35 @@ You're done with the workflow when:

This workflow provides a proven path from concept to production-ready agent:

1. **Learn** with `/hive-concepts` → Understand fundamentals (optional)
2. **Build** with `/hive-create` → Get validated structure
3. **Optimize** with `/hive-patterns` → Apply best practices (optional)
4. **Test** with `/hive-test` → Get verified functionality

The workflow is **flexible** - skip phases as needed, iterate freely, and adapt to your specific requirements. The goal is **production-ready agents** built with **consistent, repeatable processes**.

## Skill Selection Guide

**Choose hive-concepts when:**

- First time building agents
- Need to understand event loop architecture
- Validating tool availability
- Learning about node types, edges, and judges

**Choose hive-create when:**

- Actually building an agent
- Have clear requirements
- Ready to write code
- Want step-by-step guidance

**Choose hive-patterns when:**

- Agent structure complete
- Need client-facing nodes or feedback edges
- Implementing review loops or fan-out/fan-in
- Want judge patterns or context management
- Want best practices

**Choose hive-test when:**

- Agent structure complete
- Ready to validate functionality
- Need comprehensive test coverage

@@ -1,6 +1,6 @@

# Example: File Monitor Agent

This example shows the complete /hive workflow in action for building a file monitoring agent.

## Initial Request

@@ -12,7 +12,7 @@ User: "Build an agent that monitors ~/Downloads and copies new files to ~/Docume

### Step 1: Create Structure

Agent invokes `/hive-create` skill and:

1. Creates `exports/file_monitor_agent/` package
2. Writes skeleton files (`__init__.py`, `__main__.py`, `agent.py`, etc.)

@@ -107,7 +107,7 @@ exports/file_monitor_agent/

### Step 1: Analyze Agent

Agent invokes `/hive-test` skill and:

1. Reads goal from `exports/file_monitor_agent/agent.py`
2. Identifies 4 success criteria to test
@@ -1 +0,0 @@
|
||||
../../.claude/skills/agent-workflow
|
||||
@@ -1 +0,0 @@
|
||||
../../.claude/skills/building-agents-construction
|
||||
@@ -1 +0,0 @@
|
||||
../../.claude/skills/building-agents-core
|
||||
@@ -1 +0,0 @@
|
||||
../../.claude/skills/building-agents-patterns
|
||||
Symlink
+1
@@ -0,0 +1 @@
|
||||
../../.claude/skills/hive
|
||||
Symlink
+1
@@ -0,0 +1 @@
|
||||
../../.claude/skills/hive-concepts
|
||||
Symlink
+1
@@ -0,0 +1 @@
|
||||
../../.claude/skills/hive-create
|
||||
Symlink
+1
@@ -0,0 +1 @@
|
||||
../../.claude/skills/hive-credentials
|
||||
Symlink
+1
@@ -0,0 +1 @@
|
||||
../../.claude/skills/hive-patterns
|
||||
Symlink
+1
@@ -0,0 +1 @@
|
||||
../../.claude/skills/hive-test
|
||||
@@ -1 +0,0 @@
|
||||
../../.claude/skills/testing-agent
|
||||
@@ -1,47 +0,0 @@

## Summary

- **Added HubSpot integration** — new HubSpot MCP tool with search, get, create, and update operations for contacts, companies, and deals. Includes an OAuth2 provider for HubSpot credentials and a credential store adapter for the tools layer.
- **Replaced web_scrape tool with Playwright + stealth** — swapped httpx/BeautifulSoup for a headless Chromium browser using `playwright` (async API) and `playwright-stealth`, enabling JS-rendered page scraping and bot-detection evasion.
- **Added empty-response retry logic** — the LLM provider now detects empty responses (e.g. Gemini returning 200 with no content on rate limit) and retries with exponential backoff, preventing hallucinated output from the cleanup LLM.
- **Added context-aware input compaction** — LLM nodes now estimate input token count before calling the model and progressively truncate the largest values if they exceed the context window budget.
- **Increased rate-limit retries to 10** — with verbose `[retry]` and `[compaction]` logging that includes model name, finish reason, and attempt count.
- **Interactive quickstart onboarding** — `quickstart.sh` rewritten as a bee-themed interactive wizard that detects existing API keys (including a Claude Code subscription), lets the user pick ONE default LLM provider, and saves configuration to `~/.hive/configuration.json`.
- **Fixed lint errors** across `hubspot_tool.py` (line length) and `agent_builder_server.py` (unused variable).

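The empty-response retry described in the summary follows a standard exponential-backoff shape. A generic sketch — this is not the `litellm.py` implementation, and the function names are made up for illustration:

```python
import time

def call_with_retry(call, max_retries: int = 10, base_delay: float = 0.5):
    """Retry when the provider returns an empty response, backing off exponentially."""
    for attempt in range(1, max_retries + 1):
        response = call()
        if response:  # non-empty content: done
            return response
        if attempt < max_retries:
            delay = base_delay * (2 ** (attempt - 1))
            print(f"[retry] empty response, attempt {attempt}, sleeping {delay}s")
            time.sleep(delay)
    raise RuntimeError("empty response after all retries")

# Fake provider that fails twice, then succeeds.
responses = iter(["", "", "ok"])
print(call_with_retry(lambda: next(responses), base_delay=0.01))  # ok
```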
## Changed files

### HubSpot Integration

- `tools/src/aden_tools/tools/hubspot_tool/` — New MCP tool: contacts, companies, and deals CRUD
- `tools/src/aden_tools/tools/__init__.py` — Registered HubSpot tools
- `tools/src/aden_tools/credentials/integrations.py` — HubSpot credential integration
- `tools/src/aden_tools/credentials/__init__.py` — Updated credential exports
- `core/framework/credentials/oauth2/hubspot_provider.py` — HubSpot OAuth2 provider
- `core/framework/credentials/oauth2/__init__.py` — Registered HubSpot OAuth2 provider
- `core/framework/runner/runner.py` — Updated runner for credential support

### Web Scrape Rewrite

- `tools/src/aden_tools/tools/web_scrape_tool/web_scrape_tool.py` — Playwright async rewrite
- `tools/src/aden_tools/tools/web_scrape_tool/README.md` — Updated docs
- `tools/pyproject.toml` — Added `playwright`, `playwright-stealth` deps
- `tools/Dockerfile` — Added `playwright install chromium --with-deps`

### LLM Reliability

- `core/framework/llm/litellm.py` — Empty response retry + max retries 10 + verbose logging
- `core/framework/graph/node.py` — Input compaction via `_compact_inputs()`, `_estimate_tokens()`, `_get_context_limit()`

### Quickstart & Setup

- `quickstart.sh` — Interactive bee-themed onboarding wizard with single provider selection
- `~/.hive/configuration.json` — New user config file for default LLM provider/model

### Fixes

- `core/framework/mcp/agent_builder_server.py` — Removed unused variable
- `tools/src/aden_tools/tools/hubspot_tool/hubspot_tool.py` — Fixed E501 line-length violations

## Test plan

- [ ] Run `make lint` — passes clean
- [ ] Run `./quickstart.sh` and verify interactive flow works, config saved to `~/.hive/configuration.json`
- [ ] Run `pytest tests/tools/test_web_scrape_tool.py -v`
- [ ] Run agent against a JS-heavy site and verify `web_scrape` returns rendered content
- [ ] Set `HUBSPOT_ACCESS_TOKEN` and verify HubSpot tool CRUD operations work
- [ ] Trigger rate limit and verify `[retry]` logs appear with correct attempt counts
- [ ] Run agent with large inputs and verify `[compaction]` logs show truncation

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@@ -63,7 +63,6 @@ Use Hive when you need:

- Strong monitoring, safety, and budget controls
- A framework that evolves with your goals

## What is Aden

<p align="center">

@@ -107,6 +106,7 @@ cd hive
```

This sets up:

- **framework** - Core agent runtime and graph executor (in `core/.venv`)
- **aden_tools** - MCP tools for agent capabilities (in `tools/.venv`)
- All required Python dependencies
@@ -115,16 +115,16 @@ This sets up:

```bash
# Build an agent using Claude Code
claude> /hive

# Test your agent
claude> /hive-test

# Run your agent
PYTHONPATH=exports uv run python -m your_agent_name run --input '{...}'
```

**[📖 Complete Setup Guide](docs/environment-setup.md)** - Detailed instructions for agent development

### Cursor IDE Support
|
||||
|
||||
@@ -133,22 +133,22 @@ Skills are also available in Cursor. To enable:

1. Open Command Palette (`Cmd+Shift+P` / `Ctrl+Shift+P`)
2. Run `MCP: Enable` to enable MCP servers
3. Restart Cursor to load the MCP servers from `.cursor/mcp.json`
4. Type `/` in Agent chat and search for skills (e.g., `/building-agents-construction`)
4. Type `/` in Agent chat and search for skills (e.g., `/hive-create`)

## Features

- **Goal-Driven Development** - Define objectives in natural language; the coding agent generates the agent graph and connection code to achieve them
- **Adaptiveness** - Framework captures failures, calibrates according to the objectives, and evolves the agent graph
- **Dynamic Node Connections** - No predefined edges; connection code is generated by any capable LLM based on your goals
- **[Goal-Driven Development](docs/key_concepts/goals_outcome.md)** - Define objectives in natural language; the coding agent generates the agent graph and connection code to achieve them
- **[Adaptiveness](docs/key_concepts/evolution.md)** - Framework captures failures, calibrates according to the objectives, and evolves the agent graph
- **[Dynamic Node Connections](docs/key_concepts/graph.md)** - No predefined edges; connection code is generated by any capable LLM based on your goals
- **SDK-Wrapped Nodes** - Every node gets shared memory, local RLM memory, monitoring, tools, and LLM access out of the box
- **Human-in-the-Loop** - Intervention nodes that pause execution for human input with configurable timeouts and escalation
- **[Human-in-the-Loop](docs/key_concepts/graph.md#human-in-the-loop)** - Intervention nodes that pause execution for human input with configurable timeouts and escalation
- **Real-time Observability** - WebSocket streaming for live monitoring of agent execution, decisions, and node-to-node communication
- **Cost & Budget Control** - Set spending limits, throttles, and automatic model degradation policies
- **Production-Ready** - Self-hostable, built for scale and reliability

## Why Aden

Hive focuses on generating agents that run real business processes rather than generic agents. Instead of requiring you to manually design workflows, define agent interactions, and handle failures reactively, Hive flips the paradigm: **you describe outcomes, and the system builds itself**—delivering an outcome-driven, adaptive experience with an easy-to-use set of tools and integrations.
Hive focuses on generating agents that run real business processes rather than generic agents. Instead of requiring you to manually design workflows, define agent interactions, and handle failures reactively, Hive flips the paradigm: **you describe [outcomes](docs/key_concepts/goals_outcome.md), and the system builds itself**—delivering an outcome-driven, [adaptive](docs/key_concepts/evolution.md) experience with an easy-to-use set of tools and integrations.

```mermaid
flowchart LR
@@ -188,27 +188,28 @@ flowchart LR

| -------------------------- | -------------------------------------- |
| Hardcode agent workflows | Describe goals in natural language |
| Manual graph definition | Auto-generated agent graphs |
| Reactive error handling | Outcome-evaluation and adaptiveness |
| Static tool configurations | Dynamic SDK-wrapped nodes |
| Separate monitoring setup | Built-in real-time observability |
| DIY budget management | Integrated cost controls & degradation |
### How It Works

1. **Define Your Goal** → Describe what you want to achieve in plain English
2. **Coding Agent Generates** → Creates the agent graph, connection code, and test cases
3. **Workers Execute** → SDK-wrapped nodes run with full observability and tool access
1. **[Define Your Goal](docs/key_concepts/goals_outcome.md)** → Describe what you want to achieve in plain English
2. **Coding Agent Generates** → Creates the [agent graph](docs/key_concepts/graph.md), connection code, and test cases
3. **[Workers Execute](docs/key_concepts/worker_agent.md)** → SDK-wrapped nodes run with full observability and tool access
4. **Control Plane Monitors** → Real-time metrics, budget enforcement, policy management
5. **Adaptiveness** → On failure, the system evolves the graph and redeploys automatically
5. **[Adaptiveness](docs/key_concepts/evolution.md)** → On failure, the system evolves the graph and redeploys automatically

## Run pre-built Agents (Coming Soon)

### Run a sample agent

Aden Hive provides a list of featured agents that you can use and build on top of.

### Run an agent shared by others

Put the agent in `exports/` and run `PYTHONPATH=exports uv run python -m your_agent_name run --input '{...}'`

For building and running goal-driven agents with the framework:
@@ -221,28 +222,32 @@ For building and running goal-driven agents with the framework:

# - aden_tools package (MCP tools)
# - All Python dependencies

# Build new agents using Claude Code skills
claude> /building-agents-construction

# Test agents
claude> /testing-agent
# Build new agents using Agent Skills
claude> /hive

# Run agents
PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
```

See [ENVIRONMENT_SETUP.md](ENVIRONMENT_SETUP.md) for complete setup instructions.
See [environment-setup.md](docs/environment-setup.md) for complete setup instructions.

## Documentation

- **[Developer Guide](DEVELOPER.md)** - Comprehensive guide for developers
- **[Developer Guide](docs/developer-guide.md)** - Comprehensive guide for developers
- [Getting Started](docs/getting-started.md) - Quick setup instructions
- [Configuration Guide](docs/configuration.md) - All configuration options
- [Architecture Overview](docs/architecture/README.md) - System design and structure

### Key Concepts

- [Goals & Outcome-Driven Development](docs/key_concepts/goals_outcome.md) - Why Hive is outcome-driven and how goals define success
- [The Agent Graph](docs/key_concepts/graph.md) - Nodes, edges, shared memory, and how agents execute
- [The Worker Agent](docs/key_concepts/worker_agent.md) - Sessions, iterations, headless execution, and the runtime
- [Evolution](docs/key_concepts/evolution.md) - How agents improve across generations through failure data

## Roadmap

Aden Hive Agent Framework aims to help developers build outcome-oriented, self-adaptive agents. See [ROADMAP.md](ROADMAP.md) for details.
Aden Hive Agent Framework aims to help developers build outcome-oriented, self-adaptive agents. See [roadmap.md](docs/roadmap.md) for details.

```mermaid
flowchart TD
@@ -332,11 +337,12 @@ end

classDef done fill:#9e9e9e,color:#fff,stroke:#757575
```

## Contributing

We welcome contributions from the community! We’re especially looking for help building tools, integrations, and example agents for the framework ([check #2805](https://github.com/adenhq/hive/issues/2805)). If you’re interested in extending its functionality, this is the perfect place to start. Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

**Important:** Please get assigned to an issue before submitting a PR. Comment on an issue to claim it, and a maintainer will assign you. Issues with reproducible steps and proposals are prioritized. This helps prevent duplicate work.

1. Find or create an issue and get assigned
2. Fork the repository
@@ -383,7 +389,7 @@ Yes! Hive supports local models through LiteLLM. Simply use the model name forma

**Q: What makes Hive different from other agent frameworks?**

Hive generates your entire agent system from natural language goals using a coding agent—you don't hardcode workflows or manually define graphs. When agents fail, the framework automatically captures failure data, evolves the agent graph, and redeploys. This self-improving loop is unique to Aden.
Hive generates your entire agent system from natural language [goals](docs/key_concepts/goals_outcome.md) using a coding agent—you don't hardcode workflows or manually define graphs. When agents fail, the framework automatically captures failure data, [evolves the agent graph](docs/key_concepts/evolution.md), and redeploys. This self-improving loop is unique to Aden.

**Q: Is Hive open-source?**

@@ -395,7 +401,7 @@ Hive collects telemetry data for monitoring and observability purposes, includin

**Q: What deployment options does Hive support?**

Hive supports self-hosted deployments via Python packages. See the [Environment Setup Guide](ENVIRONMENT_SETUP.md) for installation instructions. Cloud deployment options and Kubernetes-ready configurations are on the roadmap.
Hive supports self-hosted deployments via Python packages. See the [Environment Setup Guide](docs/environment-setup.md) for installation instructions. Cloud deployment options and Kubernetes-ready configurations are on the roadmap.

**Q: Can Hive handle complex, production-scale use cases?**

@@ -403,7 +409,7 @@ Yes. Hive is explicitly designed for production environments with features like

**Q: Does Hive support human-in-the-loop workflows?**

Yes, Hive fully supports human-in-the-loop workflows through intervention nodes that pause execution for human input. These include configurable timeouts and escalation policies, allowing seamless collaboration between human experts and AI agents.
Yes, Hive fully supports [human-in-the-loop](docs/key_concepts/graph.md#human-in-the-loop) workflows through intervention nodes that pause execution for human input. These include configurable timeouts and escalation policies, allowing seamless collaboration between human experts and AI agents.

**Q: What monitoring and debugging tools does Hive provide?**

@@ -423,7 +429,7 @@ Hive provides granular budget controls including spending limits, throttles, and

**Q: Where can I find examples and documentation?**

Visit [docs.adenhq.com](https://docs.adenhq.com/) for complete guides, API reference, and getting started tutorials. The repository also includes documentation in the `docs/` folder and a comprehensive [DEVELOPER.md](DEVELOPER.md) guide.
Visit [docs.adenhq.com](https://docs.adenhq.com/) for complete guides, API reference, and getting started tutorials. The repository also includes documentation in the `docs/` folder and a comprehensive [developer guide](docs/developer-guide.md).

**Q: How can I contribute to Aden?**

@@ -431,7 +437,7 @@ Contributions are welcome! Fork the repository, create your feature branch, impl

**Q: When will my team start seeing results from Aden's adaptive agents?**

Aden's adaptation loop begins working from the first execution. When an agent fails, the framework captures the failure data, helping developers evolve the agent graph through the coding agent. How quickly this translates to measurable results depends on the complexity of your use case, the quality of your goal definitions, and the volume of executions generating feedback.
Aden's [adaptation loop](docs/key_concepts/evolution.md) begins working from the first execution. When an agent fails, the framework captures the failure data, helping developers evolve the agent graph through the coding agent. How quickly this translates to measurable results depends on the complexity of your use case, the quality of your [goal definitions](docs/key_concepts/goals_outcome.md), and the volume of executions generating feedback.

**Q: How does Hive compare to other agent frameworks?**
@@ -145,7 +145,7 @@ uv run python -m framework test-debug <agent_path> <test_name>

uv run python -m framework test-list <goal_id>
```

For detailed testing workflows, see the [testing-agent skill](../.claude/skills/testing-agent/SKILL.md).
For detailed testing workflows, see the [hive-test skill](../.claude/skills/hive-test/SKILL.md).

### Analyzing Agent Behavior with Builder
@@ -156,6 +156,10 @@ class EdgeSpec(BaseModel):
        memory: dict[str, Any],
    ) -> bool:
        """Evaluate a conditional expression."""
        import logging

        logger = logging.getLogger(__name__)

        if not self.condition_expr:
            return True

@@ -172,12 +176,24 @@ class EdgeSpec(BaseModel):

        try:
            # Safe evaluation using AST-based whitelist
            return bool(safe_eval(self.condition_expr, context))
            result = bool(safe_eval(self.condition_expr, context))
            # Log the evaluation for visibility
            # Extract the variable names used in the expression for debugging
            expr_vars = {
                k: repr(context[k])
                for k in context
                if k not in ("output", "memory", "result", "true", "false")
                and k in self.condition_expr
            }
            logger.info(
                " Edge %s: condition '%s' → %s (vars: %s)",
                self.id,
                self.condition_expr,
                result,
                expr_vars or "none matched",
            )
            return result
        except Exception as e:
            # Log the error for debugging
            import logging

            logger = logging.getLogger(__name__)
            logger.warning(f" ⚠ Condition evaluation failed: {self.condition_expr}")
            logger.warning(f" Error: {e}")
            logger.warning(f" Available context keys: {list(context.keys())}")
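The hunk above relies on an AST-based whitelist evaluator named `safe_eval`. As a point of reference, here is a minimal sketch of how such an evaluator can be built; this is an assumption-laden illustration, not the framework's actual implementation, which may allow a different node set.

```python
import ast

# Sketch of an AST-whitelist expression evaluator (hypothetical — the
# framework's real safe_eval may whitelist a different set of nodes).
_ALLOWED = (
    ast.Expression, ast.BoolOp, ast.UnaryOp, ast.Compare, ast.Name,
    ast.Load, ast.Constant, ast.And, ast.Or, ast.Not, ast.Eq, ast.NotEq,
    ast.Lt, ast.LtE, ast.Gt, ast.GtE, ast.In, ast.NotIn,
)

def safe_eval(expr: str, context: dict) -> object:
    tree = ast.parse(expr, mode="eval")
    # Reject any syntax node outside the whitelist (no calls, attributes,
    # subscripts, imports, etc.).
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
    # Names resolve only against the provided context, never builtins.
    return eval(compile(tree, "<condition>", "eval"), {"__builtins__": {}}, context)
```

The point of the whitelist is that function calls and attribute access never reach `eval`, so the condition string can only compare values already present in the edge's context dict.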
@@ -419,6 +435,12 @@ class GraphSpec(BaseModel):
    max_steps: int = Field(default=100, description="Maximum node executions before timeout")
    max_retries_per_node: int = 3

    # EventLoopNode configuration (from configure_loop)
    loop_config: dict[str, Any] = Field(
        default_factory=dict,
        description="EventLoopNode configuration (max_iterations, max_tool_calls_per_turn, etc.)",
    )

    # Metadata
    description: str = ""
    created_by: str = ""  # "human" or "builder_agent"
@@ -74,6 +74,11 @@ class LoopConfig:
    max_history_tokens: int = 32_000
    store_prefix: str = ""

    # Overflow margin for max_tool_calls_per_turn. Tool calls are only
    # discarded when the count exceeds max_tool_calls_per_turn * (1 + margin).
    # Default 0.5 means 50% wiggle room (e.g. limit=10 → hard cutoff at 15).
    tool_call_overflow_margin: float = 0.5

    # --- Tool result context management ---
    # When a tool result exceeds this character count, it is truncated in the
    # conversation context. If *spillover_dir* is set the full result is
@@ -617,9 +622,12 @@ class EventLoopNode(NodeProtocol):
        real_tool_results: list[dict] = []
        limit_hit = False
        executed_in_batch = 0
        hard_limit = int(
            self._config.max_tool_calls_per_turn * (1 + self._config.tool_call_overflow_margin)
        )
        for tc in tool_calls:
            tool_call_count += 1
            if tool_call_count > self._config.max_tool_calls_per_turn:
            if tool_call_count > hard_limit:
                limit_hit = True
                break
            executed_in_batch += 1
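The hard cutoff in this hunk is derived from the soft limit plus the overflow margin, truncated to an integer. A standalone restatement of that arithmetic (names mirror the diff):

```python
def hard_limit(max_tool_calls_per_turn: int, overflow_margin: float) -> int:
    # Hard cutoff = soft limit * (1 + margin), truncated toward zero by int().
    # With the diff's defaults (limit=10, margin=0.5) the cutoff is 15.
    return int(max_tool_calls_per_turn * (1 + overflow_margin))
```

Note the truncation: a limit of 7 with a 0.5 margin gives `int(10.5)`, i.e. 10, not 11.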
@@ -652,7 +660,7 @@ class EventLoopNode(NodeProtocol):
        if isinstance(value, str):
            try:
                parsed = json.loads(value)
                if isinstance(parsed, (list, dict)):
                if isinstance(parsed, (list, dict, bool, int, float)):
                    value = parsed
            except (json.JSONDecodeError, TypeError):
                pass
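The widened `isinstance` check above means JSON scalars (numbers and booleans), not just containers, now survive the round-trip. A small standalone illustration of the same coercion rule:

```python
import json

def coerce(value: str):
    """Mirror the diff's coercion: parse a JSON string and keep the parsed
    value only when it is a container or numeric/boolean scalar."""
    try:
        parsed = json.loads(value)
    except (json.JSONDecodeError, TypeError):
        return value
    if isinstance(parsed, (list, dict, bool, int, float)):
        return parsed
    return value  # JSON null and JSON strings stay as the raw input
```

This is why the test assertions later in the diff change from `"87"` to `87`: numeric strings set via `set_output` now come back as real numbers.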
@@ -695,17 +703,16 @@ class EventLoopNode(NodeProtocol):
        # corresponding tool results, causing the LLM to repeat them
        # in the next turn (infinite loop).
        if limit_hit:
            max_tc = self._config.max_tool_calls_per_turn
            skipped = tool_calls[executed_in_batch:]
            logger.warning(
                "Max tool calls per turn (%d) exceeded — discarding %d remaining call(s): %s",
                max_tc,
                "Hard tool call limit (%d) exceeded — discarding %d remaining call(s): %s",
                hard_limit,
                len(skipped),
                ", ".join(tc.tool_name for tc in skipped),
            )
            discard_msg = (
                f"Tool call discarded: max tool calls per turn "
                f"({max_tc}) exceeded. Consolidate your work and "
                f"Tool call discarded: hard limit of {hard_limit} tool calls "
                f"per turn exceeded. Consolidate your work and "
                f"use fewer tool calls."
            )
            for tc in skipped:
@@ -1048,7 +1055,7 @@ class EventLoopNode(NodeProtocol):
        truncated = (
            f"[Result from {tool_name}: {len(result.content)} chars — "
            f"too large for context, saved to '(unknown)'. "
            f"Use load_data(filename='(unknown)', data_dir='{spill_dir}') "
            f"Use load_data(filename='(unknown)') "
            f"to read the full result.]\n\n"
            f"Preview:\n{preview}…"
        )
@@ -1268,11 +1275,9 @@ class EventLoopNode(NodeProtocol):

        # 5. Spillover files hint
        if self._config.spillover_dir:
            spill = self._config.spillover_dir
            parts.append(
                "NOTE: Large tool results were saved to files. "
                f"Use load_data(filename='<filename>', data_dir='{spill}') "
                "to read them."
                "Use load_data(filename='<filename>') to read them."
            )

        # 6. Tool call history (prevent re-calling tools)
@@ -132,6 +132,7 @@ class GraphExecutor:
        event_bus: Any | None = None,
        stream_id: str = "",
        storage_path: str | Path | None = None,
        loop_config: dict[str, Any] | None = None,
    ):
        """
        Initialize the executor.

@@ -149,6 +150,7 @@ class GraphExecutor:
            event_bus: Optional event bus for emitting node lifecycle events
            stream_id: Stream ID for event correlation
            storage_path: Optional base path for conversation persistence
            loop_config: Optional EventLoopNode configuration (max_iterations, etc.)
        """
        self.runtime = runtime
        self.llm = llm

@@ -161,6 +163,7 @@ class GraphExecutor:
        self._event_bus = event_bus
        self._stream_id = stream_id
        self._storage_path = Path(storage_path) if storage_path else None
        self._loop_config = loop_config or {}

        # Initialize output cleaner
        self.cleansing_config = cleansing_config or CleansingConfig()
@@ -285,6 +288,16 @@ class GraphExecutor:
        self.logger.info(f" Goal: {goal.description}")
        self.logger.info(f" Entry node: {graph.entry_node}")

        # Set per-execution data_dir so data tools (save_data, load_data, etc.)
        # and spillover files share the same session-scoped directory.
        _ctx_token = None
        if self._storage_path:
            from framework.runner.tool_registry import ToolRegistry

            _ctx_token = ToolRegistry.set_execution_context(
                data_dir=str(self._storage_path / "data"),
            )

        try:
            while steps < graph.max_steps:
                steps += 1

@@ -720,6 +733,12 @@ class GraphExecutor:
                node_visit_counts=dict(node_visit_counts),
            )

        finally:
            if _ctx_token is not None:
                from framework.runner.tool_registry import ToolRegistry

                ToolRegistry.reset_execution_context(_ctx_token)

    def _build_context(
        self,
        node_spec: NodeSpec,
@@ -845,19 +864,24 @@ class GraphExecutor:
        # When a tool result exceeds max_tool_result_chars, the full
        # content is written to spillover_dir and the agent gets a
        # truncated preview with instructions to use load_data().
        # Uses storage_path/data which is session-scoped, matching the
        # data_dir set via execution context for data tools.
        spillover = None
        if self._storage_path:
            spillover = str(self._storage_path / "data")

        lc = self._loop_config
        default_max_iter = 100 if node_spec.client_facing else 50
        node = EventLoopNode(
            event_bus=self._event_bus,
            judge=None,  # implicit judge: accept when output_keys are filled
            config=LoopConfig(
                max_iterations=100 if node_spec.client_facing else 50,
                max_tool_calls_per_turn=10,
                stall_detection_threshold=3,
                max_history_tokens=32000,
                max_tool_result_chars=3_000,
                max_iterations=lc.get("max_iterations", default_max_iter),
                max_tool_calls_per_turn=lc.get("max_tool_calls_per_turn", 10),
                tool_call_overflow_margin=lc.get("tool_call_overflow_margin", 0.5),
                stall_detection_threshold=lc.get("stall_detection_threshold", 3),
                max_history_tokens=lc.get("max_history_tokens", 32000),
                max_tool_result_chars=lc.get("max_tool_result_chars", 3_000),
                spillover_dir=spillover,
            ),
            tool_executor=self.tool_executor,
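The `lc.get(...)` pattern in this hunk overlays the user-supplied `loop_config` dict onto per-node defaults, with one default (`max_iterations`) depending on whether the node is client-facing. Reduced to a standalone sketch (the function name is mine, not the framework's):

```python
def resolve_loop_config(lc: dict, client_facing: bool) -> dict:
    # Per-node default first; any key present in lc (from configure_loop) wins.
    default_max_iter = 100 if client_facing else 50
    return {
        "max_iterations": lc.get("max_iterations", default_max_iter),
        "max_tool_calls_per_turn": lc.get("max_tool_calls_per_turn", 10),
        "tool_call_overflow_margin": lc.get("tool_call_overflow_margin", 0.5),
        "stall_detection_threshold": lc.get("stall_detection_threshold", 3),
        "max_history_tokens": lc.get("max_history_tokens", 32_000),
        "max_tool_result_chars": lc.get("max_tool_result_chars", 3_000),
    }
```

An empty `loop_config` therefore reproduces the old hardcoded values exactly, which is what keeps this change backward compatible.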
@@ -9,20 +9,36 @@ Usage:

import json
import os
import sys
from datetime import datetime
from pathlib import Path
from typing import Annotated

from mcp.server import FastMCP
# Ensure exports/ is on sys.path so AgentRunner can import agent modules.
_framework_dir = Path(__file__).resolve().parent.parent  # core/framework/ -> core/
_project_root = _framework_dir.parent  # core/ -> project root
_exports_dir = _project_root / "exports"
if _exports_dir.is_dir() and str(_exports_dir) not in sys.path:
    sys.path.insert(0, str(_exports_dir))
del _framework_dir, _project_root, _exports_dir

from framework.graph import Constraint, EdgeCondition, EdgeSpec, Goal, NodeSpec, SuccessCriterion
from framework.graph.plan import Plan
from mcp.server import FastMCP  # noqa: E402

from framework.graph import (  # noqa: E402
    Constraint,
    EdgeCondition,
    EdgeSpec,
    Goal,
    NodeSpec,
    SuccessCriterion,
)
from framework.graph.plan import Plan  # noqa: E402

# Testing framework imports
from framework.testing.prompts import (
from framework.testing.prompts import (  # noqa: E402
    PYTEST_TEST_FILE_HEADER,
)
from framework.utils.io import atomic_write
from framework.utils.io import atomic_write  # noqa: E402

# Initialize MCP server
mcp = FastMCP("agent-builder")
@@ -1853,12 +1869,19 @@ def configure_loop(
    max_history_tokens: Annotated[
        int, "Maximum conversation history tokens before compaction (default 32000)"
    ] = 32000,
    tool_call_overflow_margin: Annotated[
        float,
        "Overflow margin for max_tool_calls_per_turn. "
        "Tool calls are only discarded when count exceeds "
        "max_tool_calls_per_turn * (1 + margin). Default 0.5 (50% wiggle room)",
    ] = 0.5,
) -> str:
    """Configure event loop parameters for EventLoopNode execution.

    These settings control how EventLoopNodes behave at runtime:
    - max_iterations: prevents infinite loops
    - max_tool_calls_per_turn: limits tool calls per LLM response
    - tool_call_overflow_margin: wiggle room before tool calls are discarded (default 50%)
    - stall_detection_threshold: detects when LLM repeats itself
    - max_history_tokens: triggers conversation compaction
    """

@@ -1867,6 +1890,7 @@ def configure_loop(
    session.loop_config = {
        "max_iterations": max_iterations,
        "max_tool_calls_per_turn": max_tool_calls_per_turn,
        "tool_call_overflow_margin": tool_call_overflow_margin,
        "stall_detection_threshold": stall_detection_threshold,
        "max_history_tokens": max_history_tokens,
    }
@@ -698,6 +698,7 @@ class AgentRunner:
            tools=tools,
            tool_executor=tool_executor,
            approval_callback=self._approval_callback,
            loop_config=self.graph.loop_config,
        )

    def _setup_agent_runtime(self, tools: list, tool_executor: Callable | None) -> None:
@@ -1,5 +1,6 @@
"""Tool discovery and registration for agent runner."""

import contextvars
import importlib.util
import inspect
import json

@@ -13,6 +14,13 @@ from framework.llm.provider import Tool, ToolResult, ToolUse

logger = logging.getLogger(__name__)

# Per-execution context overrides. Each asyncio task (and thus each
# concurrent graph execution) gets its own copy, so there are no races
# when multiple ExecutionStreams run in parallel.
_execution_context: contextvars.ContextVar[dict[str, Any] | None] = contextvars.ContextVar(
    "_execution_context", default=None
)


@dataclass
class RegisteredTool:

@@ -36,7 +44,7 @@ class ToolRegistry:
    # Framework-internal context keys injected into tool calls.
    # Stripped from LLM-facing schemas (the LLM doesn't know these values)
    # and auto-injected at call time for tools that accept them.
    CONTEXT_PARAMS = frozenset({"workspace_id", "agent_id", "session_id"})
    CONTEXT_PARAMS = frozenset({"workspace_id", "agent_id", "session_id", "data_dir"})

    def __init__(self):
        self._tools: dict[str, RegisteredTool] = {}

@@ -262,6 +270,24 @@ class ToolRegistry:
        """
        self._session_context.update(context)

    @staticmethod
    def set_execution_context(**context) -> contextvars.Token:
        """Set per-execution context overrides (concurrency-safe via contextvars).

        Values set here take precedence over session context. Each asyncio
        task gets its own copy, so concurrent executions don't interfere.

        Returns a token that must be passed to :meth:`reset_execution_context`
        to restore the previous state.
        """
        current = _execution_context.get() or {}
        return _execution_context.set({**current, **context})

    @staticmethod
    def reset_execution_context(token: contextvars.Token) -> None:
        """Restore execution context to its previous state."""
        _execution_context.reset(token)

    def load_mcp_config(self, config_path: Path) -> None:
        """
        Load and register MCP servers from a config file.
@@ -359,11 +385,15 @@ class ToolRegistry:
        ):
            def executor(inputs: dict) -> Any:
                try:
                    # Only inject session context params the tool accepts
                    # Build base context: session < execution (execution wins)
                    base_context = dict(registry_ref._session_context)
                    exec_ctx = _execution_context.get()
                    if exec_ctx:
                        base_context.update(exec_ctx)

                    # Only inject context params the tool accepts
                    filtered_context = {
                        k: v
                        for k, v in registry_ref._session_context.items()
                        if k in tool_params
                        k: v for k, v in base_context.items() if k in tool_params
                    }
                    merged_inputs = {**filtered_context, **inputs}
                    result = client_ref.call_tool(tool_name, merged_inputs)
@@ -331,8 +331,11 @@ class ExecutionStream:
        runtime_adapter = StreamRuntimeAdapter(self._runtime, execution_id)

        # Create executor for this execution.
        # Scope storage by execution_id so each execution gets
        # fresh conversations and spillover directories.
        # Each execution gets its own storage under sessions/{exec_id}/
        # so conversations, spillover, and data files are all scoped
        # to this execution. The executor sets data_dir via execution
        # context (contextvars) so data tools and spillover share the
        # same session-scoped directory.
        exec_storage = self._storage.base_path / "sessions" / execution_id
        executor = GraphExecutor(
            runtime=runtime_adapter,

@@ -342,6 +345,7 @@ class ExecutionStream:
            event_bus=self._event_bus,
            stream_id=self.stream_id,
            storage_path=exec_storage,
            loop_config=self.graph.loop_config,
        )
        # Track executor so inject_input() can reach EventLoopNode instances
        self._active_executors[execution_id] = executor
@@ -15,6 +15,7 @@ Client-facing input:
"""

import asyncio
import re
import threading
from typing import Any

@@ -91,11 +92,18 @@ class ChatRepl(Vertical):
        yield Label("Agent is processing...", id="processing-indicator")
        yield Input(placeholder="Enter input for agent...", id="chat-input")

    # Regex for file:// URIs that are NOT already inside Rich [link=...] markup
    _FILE_URI_RE = re.compile(r"(?<!\[link=)(file://\S+)")

    def _linkify(self, text: str) -> str:
        """Convert bare file:// URIs to clickable Rich [link=...] markup."""
        return self._FILE_URI_RE.sub(r"[link=\1]\1[/link]", text)

    def _write_history(self, content: str) -> None:
        """Write to chat history, only auto-scrolling if user is at the bottom."""
        history = self.query_one("#chat-history", RichLog)
        was_at_bottom = history.is_vertical_scroll_end
        history.write(content)
        history.write(self._linkify(content))
        if was_at_bottom:
            history.scroll_end(animate=False)
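The regex in this hunk uses a fixed-width negative lookbehind so that a URI immediately preceded by `[link=` is not wrapped a second time. Stripped of the Textual widget, the transformation is:

```python
import re

# Match file:// URIs not already preceded by Rich "[link=" markup.
# The lookbehind is fixed-width (6 chars), which Python's re requires.
_FILE_URI_RE = re.compile(r"(?<!\[link=)(file://\S+)")

def linkify(text: str) -> str:
    # Wrap each bare URI in Rich link markup: [link=URI]URI[/link]
    return _FILE_URI_RE.sub(r"[link=\1]\1[/link]", text)
```

One caveat worth noting: because `\S+` is greedy, a URI already followed by `[/link]` markup in the same whitespace-free run can still partially match, so the lookbehind guards only the common case of freshly written text.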
@@ -508,7 +508,7 @@ async def test_event_loop_set_output():

    assert result.success
    if USE_MOCK_LLM:
        assert result.output == {"lead_score": "87", "company": "TechCorp"}
        assert result.output == {"lead_score": 87, "company": "TechCorp"}
    else:
        assert "lead_score" in result.output
        assert "company" in result.output

@@ -549,7 +549,7 @@ async def test_event_loop_missing_output_keys_retried():
    assert "score" in result.output
    assert "reason" in result.output
    if USE_MOCK_LLM:
        assert result.output["score"] == "87"
        assert result.output["score"] == 87
        assert result.output["reason"] == "good fit"


@@ -920,7 +920,7 @@ async def test_context_handoff_between_nodes(runtime):
    assert "lead_score" in result.output
    assert "strategy" in result.output
    if USE_MOCK_LLM:
        assert result.output["lead_score"] == "92"
        assert result.output["lead_score"] == 92
        assert result.output["strategy"] == "premium"


@@ -316,7 +316,7 @@ class TestSetOutput:
        result = await node.execute(ctx)

        assert result.success is True
        assert result.output["result"] == "42"
        assert result.output["result"] == 42

    @pytest.mark.asyncio
    async def test_set_output_rejects_invalid_key(self, runtime, node_spec, memory):
@@ -187,4 +187,4 @@ Run from the project root with PYTHONPATH:
 PYTHONPATH=exports uv run python -m my_agent validate
 ```

-See [Environment Setup](../ENVIRONMENT_SETUP.md) for detailed installation instructions.
+See [Environment Setup](./environment-setup.md) for detailed installation instructions.
@@ -20,12 +20,12 @@ This guide covers everything you need to know to develop with the Aden Agent Fra

 Aden Agent Framework is a Python-based system for building goal-driven, self-improving AI agents.

-| Package       | Directory | Description                             | Tech Stack   |
-| ------------- | --------- | --------------------------------------- | ------------ |
-| **framework** | `/core`   | Core runtime, graph executor, protocols | Python 3.11+ |
-| **tools**     | `/tools`  | MCP tools for agent capabilities        | Python 3.11+ |
-| **skills**    | `.claude` | Claude Code skills for building/testing | Markdown     |
+| Package       | Directory  | Description                               | Tech Stack   |
+| ------------- | ---------- | ----------------------------------------- | ------------ |
+| **framework** | `/core`    | Core runtime, graph executor, protocols   | Python 3.11+ |
+| **tools**     | `/tools`   | MCP tools for agent capabilities          | Python 3.11+ |
+| **exports**   | `/exports` | Agent packages (user-created, gitignored) | Python 3.11+ |
+| **skills**    | `.claude`  | Claude Code skills for building/testing   | Markdown     |

 ### Key Principles
@@ -101,11 +101,11 @@ Get API keys:

 This installs agent-related Claude Code skills:

-- `/building-agents-core` - Fundamental agent concepts
-- `/building-agents-construction` - Step-by-step agent building
-- `/building-agents-patterns` - Best practices and design patterns
-- `/testing-agent` - Test and validate agents
-- `/agent-workflow` - End-to-end guided workflow
+- `/hive` - Complete workflow for building agents
+- `/hive-create` - Step-by-step agent building
+- `/hive-concepts` - Fundamental agent concepts
+- `/hive-patterns` - Best practices and design patterns
+- `/hive-test` - Test and validate agents

 ### Verify Setup
@@ -115,7 +115,7 @@ uv run python -c "import framework; print('✓ framework OK')"
 uv run python -c "import aden_tools; print('✓ aden_tools OK')"
 uv run python -c "import litellm; print('✓ litellm OK')"

-# Run an agent (after building one via /building-agents-construction)
+# Run an agent (after building one via /hive-create)
 PYTHONPATH=exports uv run python -m your_agent_name validate
 ```
@@ -140,21 +140,11 @@ hive/                          # Repository root
 │
 ├── .claude/                   # Claude Code Skills
 │   └── skills/                # Skills for building
-│       ├── building-agents-core/
-│       │   ├── SKILL.md       # Main skill definition
-│       │   └── examples
-│       ├── building-agents-patterns/
-│       │   ├── SKILL.md
-│       │   └── examples
-│       ├── building-agents-construction/
-│       │   ├── SKILL.md
-│       │   └── examples
-│       ├── testing-agent/     # Skills for testing agents
-│       │   ├── SKILL.md
-│       │   └── examples
-│       └── agent-workflow/    # Complete workflow
-│           ├── SKILL.md
-│           └── examples
+│       ├── hive/              # Complete workflow
+│       ├── hive-create/       # Step-by-step build guide
+│       ├── hive-concepts/     # Fundamental concepts
+│       ├── hive-patterns/     # Best practices
+│       └── hive-test/         # Test and validate agents
 │
 ├── core/                      # CORE FRAMEWORK PACKAGE
 │   ├── framework/             # Main package code
@@ -188,7 +178,7 @@ hive/                          # Repository root
 │   └── README.md              # Tools documentation
 │
 ├── exports/                   # AGENT PACKAGES (user-created, gitignored)
-│   └── your_agent_name/       # Created via /building-agents-construction
+│   └── your_agent_name/       # Created via /hive-create
 │
 ├── docs/                      # Documentation
 │   ├── getting-started.md     # Quick start guide
@@ -202,12 +192,9 @@ hive/                          # Repository root
 │   └── auto-close-duplicates.ts  # GitHub duplicate issue closer
 │
 ├── quickstart.sh              # Interactive setup wizard
-├── ENVIRONMENT_SETUP.md       # Complete Python setup guide
 ├── README.md                  # Project overview
-├── DEVELOPER.md               # This file
 ├── CONTRIBUTING.md            # Contribution guidelines
 ├── CHANGELOG.md               # Version history
-├── ROADMAP.md                 # Product roadmap
 ├── LICENSE                    # Apache 2.0 License
 ├── CODE_OF_CONDUCT.md         # Community guidelines
 └── SECURITY.md                # Security policy
@@ -226,10 +213,10 @@ The fastest way to build agents is using the Claude Code skills:
 ./quickstart.sh

 # Build a new agent
-claude> /building-agents-construction
+claude> /hive

 # Test the agent
-claude> /testing-agent
+claude> /hive-test
 ```
 ### Agent Development Workflow

@@ -237,7 +224,7 @@ claude> /testing-agent
 1. **Define Your Goal**

    ```
-   claude> /building-agents-construction
+   claude> /hive
    Enter goal: "Build an agent that processes customer support tickets"
    ```
@@ -260,7 +247,7 @@ claude> /testing-agent

 5. **Test the Agent**

    ```
-   claude> /testing-agent
+   claude> /hive-test
    ```

 ### Manual Agent Development
@@ -324,7 +311,7 @@ PYTHONPATH=exports uv run python -m agent_name run --mock --input '{...}'

 ```bash
 # Run tests for an agent
-claude> /testing-agent
+claude> /hive-test
 ```

 This generates and runs:
@@ -542,7 +529,7 @@ uv add <package>

 ```bash
 # Option 1: Use Claude Code skill (recommended)
-claude> /building-agents-construction
+claude> /hive

 # Option 2: Create manually
 # Note: exports/ is initially empty (gitignored). Create your agent directory:
@@ -657,8 +644,6 @@ kill -9 <PID>
 # Or change ports in config.yaml and regenerate
 ```

 ### Environment Variables Not Loading

 ```bash
@@ -672,8 +657,6 @@ echo $ANTHROPIC_API_KEY
 # Then add your API keys
 ```

 ---

 ## Getting Help
@@ -9,8 +9,8 @@ Complete setup guide for building and running goal-driven agents with the Aden A
 ./quickstart.sh
 ```

 > **Note for Windows Users:**
 > Running the setup script on native Windows shells (PowerShell / Git Bash) may sometimes fail due to Python App Execution Aliases.
+> It is **strongly recommended to use WSL (Windows Subsystem for Linux)** for a smoother setup experience.

 This will:
@@ -39,17 +39,22 @@ Windows users should use **WSL (Windows Subsystem for Linux)** to set up and run
 If you are using Alpine Linux (e.g., inside a Docker container), you must install system dependencies and use a virtual environment before running the setup script:

 1. Install System Dependencies:

    ```bash
    apk update
    apk add bash git python3 py3-pip nodejs npm curl build-base python3-dev linux-headers libffi-dev
    ```

 2. Set up Virtual Environment (Required for Python 3.12+):

    ```
    uv venv
    source .venv/bin/activate
    # uv handles pip/setuptools/wheel automatically
    ```

 3. Run the Quickstart Script:

    ```
    ./quickstart.sh
    ```
@@ -87,7 +92,7 @@ uv run python -c "import aden_tools; print('✓ aden_tools OK')"
 uv run python -c "import litellm; print('✓ litellm OK')"
 ```

 > **Windows Tip:**
 > On Windows, if the verification commands fail, ensure you are running them in **WSL** or after **disabling Python App Execution Aliases** in Windows Settings → Apps → App Execution Aliases.

 ## Requirements
@@ -165,16 +170,16 @@ Build and run an agent using Claude Code CLI with the agent building skills:

 This verifies agent-related Claude Code skills are available:

-- `/building-agents-construction` - Step-by-step build guide
-- `/building-agents-core` - Fundamental concepts
-- `/building-agents-patterns` - Best practices
-- `/testing-agent` - Test and validate agents
-- `/agent-workflow` - Complete workflow
+- `/hive` - Complete workflow for building agents
+- `/hive-create` - Step-by-step build guide
+- `/hive-concepts` - Fundamental concepts
+- `/hive-patterns` - Best practices
+- `/hive-test` - Test and validate agents
 ### 2. Build an Agent

 ```
-claude> /building-agents-construction
+claude> /hive
 ```

 Follow the prompts to:
@@ -189,7 +194,7 @@ This step creates the initial agent structure required for further development.

 ### 3. Define Agent Logic

 ```
-claude> /building-agents-core
+claude> /hive-concepts
 ```

 Follow the prompts to:
@@ -204,7 +209,7 @@ This step establishes the core concepts and rules needed before building an agen

 ### 4. Apply Agent Patterns

 ```
-claude> /building-agents-patterns
+claude> /hive-patterns
 ```

 Follow the prompts to:
@@ -219,8 +224,9 @@ This step helps optimize agent design before final testing.

 ### 5. Test Your Agent

 ```
-claude> /testing-agent
+claude> /hive-test
 ```

 Follow the prompts to:

 1. Generate test guidelines for constraints and success criteria
@@ -230,21 +236,6 @@ Follow the prompts to:

 This step verifies that the agent meets its goals before production use.

-### 6. Agent Development Workflow (End-to-End)
-
-```
-claude> /agent-workflow
-```
-
-Follow the guided flow to:
-
-1. Understand core agent concepts (optional)
-2. Build the agent structure step by step
-3. Apply best-practice design patterns (optional)
-4. Test and validate the agent against its goals
-
-This workflow orchestrates all agent-building skills to take you from idea → production-ready agent.

 ## Troubleshooting

 ### "externally-managed-environment" error (PEP 668)
@@ -363,7 +354,7 @@ hive/
 │   └── pyproject.toml
 │
 └── exports/                   # Agent packages (user-created, gitignored)
-    └── your_agent_name/       # Created via /building-agents-construction
+    └── your_agent_name/       # Created via /hive-create
 ```

 ## Separate Virtual Environments
@@ -446,7 +437,7 @@ This design allows agents in `exports/` to be:

 ### 2. Build Agent (Claude Code)

 ```
-claude> /building-agents-construction
+claude> /hive
 Enter goal: "Build an agent that processes customer support tickets"
 ```
@@ -459,7 +450,7 @@ PYTHONPATH=exports uv run python -m your_agent_name validate

 ### 4. Test Agent

 ```
-claude> /testing-agent
+claude> /hive-test
 ```

 ### 5. Run Agent
@@ -513,11 +504,11 @@ export AGENT_STORAGE_PATH="/custom/storage"

 ## Additional Resources

-- **Framework Documentation:** [core/README.md](core/README.md)
-- **Tools Documentation:** [tools/README.md](tools/README.md)
-- **Example Agents:** [exports/](exports/)
-- **Agent Building Guide:** [.claude/skills/building-agents-construction/SKILL.md](.claude/skills/building-agents-construction/SKILL.md)
-- **Testing Guide:** [.claude/skills/testing-agent/SKILL.md](.claude/skills/testing-agent/SKILL.md)
+- **Framework Documentation:** [core/README.md](../core/README.md)
+- **Tools Documentation:** [tools/README.md](../tools/README.md)
+- **Example Agents:** [exports/](../exports/)
+- **Agent Building Guide:** [.claude/skills/hive-create/SKILL.md](../.claude/skills/hive-create/SKILL.md)
+- **Testing Guide:** [.claude/skills/hive-test/SKILL.md](../.claude/skills/hive-test/SKILL.md)

 ## Contributing
@@ -526,7 +517,7 @@ When contributing agent packages:
 1. Place agents in `exports/agent_name/`
 2. Follow the standard agent structure (see existing agents)
 3. Include README.md with usage instructions
-4. Add tests if using `/testing-agent`
+4. Add tests if using `/hive-test`
 5. Document required environment variables

 ## Support

+15 -13
@@ -33,10 +33,11 @@ uv run python -c "import framework; import aden_tools; print('✓ Setup complete
 # Setup already done via quickstart.sh above

 # Start Claude Code and build an agent
-claude> /building-agents-construction
+claude> /hive
 ```

 Follow the interactive prompts to:

 1. Define your agent's goal
 2. Design the workflow (nodes and edges)
 3. Generate the agent package
@@ -52,7 +53,7 @@ mkdir -p exports/my_agent

 # Create your agent structure
 cd exports/my_agent
-# Create agent.json, tools.py, README.md (see DEVELOPER.md for structure)
+# Create agent.json, tools.py, README.md (see developer-guide.md for structure)

 # Validate the agent
 PYTHONPATH=exports uv run python -m my_agent validate
@@ -99,15 +100,15 @@ hive/
 │   └── mcp_server.py          # HTTP MCP server
 │
 ├── exports/                   # Agent Packages (user-generated, not in repo)
-│   └── your_agent/            # Your agents created via /building-agents
+│   └── your_agent/            # Your agents created via /hive
 │
 ├── .claude/                   # Claude Code Skills
 │   └── skills/
-│       ├── agent-workflow/
-│       ├── building-agents-construction/
-│       ├── building-agents-core/
-│       ├── building-agents-patterns/
-│       └── testing-agent/
+│       ├── hive/
+│       ├── hive-create/
+│       ├── hive-concepts/
+│       ├── hive-patterns/
+│       └── hive-test/
 │
 └── docs/                      # Documentation
 ```
@@ -142,6 +143,7 @@ export BRAVE_SEARCH_API_KEY="your-key-here" # Optional, for web search
 ```

 Get your API keys:

 - **Anthropic**: [console.anthropic.com](https://console.anthropic.com/)
 - **OpenAI**: [platform.openai.com](https://platform.openai.com/)
 - **Brave Search**: [brave.com/search/api](https://brave.com/search/api/)
@@ -150,7 +152,7 @@ Get your API keys:

 ```bash
 # Using Claude Code
-claude> /testing-agent
+claude> /hive-test

 # Or manually
 PYTHONPATH=exports uv run python -m my_agent test
@@ -162,9 +164,9 @@ PYTHONPATH=exports uv run python -m my_agent test --type success

 ## Next Steps

-1. **Detailed Setup**: See [ENVIRONMENT_SETUP.md](../ENVIRONMENT_SETUP.md)
-2. **Developer Guide**: See [DEVELOPER.md](../DEVELOPER.md)
-3. **Build Agents**: Use `/building-agents` skill in Claude Code
+1. **Detailed Setup**: See [environment-setup.md](./environment-setup.md)
+2. **Developer Guide**: See [developer-guide.md](./developer-guide.md)
+3. **Build Agents**: Use `/hive` skill in Claude Code
 4. **Custom Tools**: Learn to integrate MCP servers
 5. **Join Community**: [Discord](https://discord.com/invite/MXE49hrKDk)

@@ -209,4 +211,4 @@ pip uninstall -y framework tools
 - **Documentation**: Check the `/docs` folder
 - **Issues**: [github.com/adenhq/hive/issues](https://github.com/adenhq/hive/issues)
 - **Discord**: [discord.com/invite/MXE49hrKDk](https://discord.com/invite/MXE49hrKDk)
-- **Build Agents**: Use `/building-agents` skill to create agents
+- **Build Agents**: Use `/hive` skill to create agents

+17 -19
@@ -78,6 +78,7 @@ cd hive
 ```

 This installs:

 - **framework** - Core agent runtime and graph executor
 - **aden_tools** - 19 MCP tools for agent capabilities
 - All required dependencies

@@ -89,16 +90,16 @@ This installs:
 ./quickstart.sh

 # Build an agent using Claude Code
-claude> /building-agents-construction
+claude> /hive

 # Test your agent
-claude> /testing-agent
+claude> /hive-test

 # Run your agent
 PYTHONPATH=exports uv run python -m your_agent_name run --input '{...}'
 ```

-**[📖 Complete Setup Guide](ENVIRONMENT_SETUP.md)** - Detailed instructions for agent development
+**[📖 Complete Setup Guide](../environment-setup.md)** - Detailed instructions for agent development
 ## Features

@@ -162,14 +163,14 @@ flowchart LR

 ### The Aden Advantage

 | Traditional Frameworks     | Aden                                     |
 | -------------------------- | ---------------------------------------- |
 | Hand-code agent workflows  | Describe goals in natural language       |
 | Manual graph definition    | Auto-generated agent graphs              |
 | Reactive error handling    | Proactive self-evolution                 |
 | Static tool configurations | Dynamic SDK-wrapped nodes                |
 | Separate monitoring setup  | Built-in real-time observability         |
 | DIY budget management      | Integrated cost controls and degradation |

 ### How It Works
@@ -213,10 +214,7 @@ hive/
 ├── docs/                 # Documentation and guides
 ├── scripts/              # Build scripts and utilities
 ├── .claude/              # Claude Code skills for building agents
-├── ENVIRONMENT_SETUP.md  # Python setup guide for agent development
-├── DEVELOPER.md          # Developer guide
 ├── CONTRIBUTING.md       # Contribution guidelines
-└── ROADMAP.md            # Product roadmap
 ```

 ## Development
@@ -235,20 +233,20 @@ To build and run goal-driven agents with the framework:
 # - All dependencies

 # Build new agents using Claude Code skills
-claude> /building-agents-construction
+claude> /hive

 # Test agents
-claude> /testing-agent
+claude> /hive-test

 # Run agents
 PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
 ```

-See [ENVIRONMENT_SETUP.md](ENVIRONMENT_SETUP.md) for complete setup instructions.
+See [environment-setup.md](../environment-setup.md) for complete setup instructions.

 ## Documentation
-- **[Developer Guide](DEVELOPER.md)** - Complete guide for developers
+- **[Developer Guide](../developer-guide.md)** - Complete guide for developers
 - [Getting Started](docs/getting-started.md) - Quick setup instructions
 - [Configuration Guide](docs/configuration.md) - All configuration options
 - [Architecture Overview](docs/architecture/README.md) - System design and structure

@@ -257,7 +255,7 @@ See [ENVIRONMENT_SETUP.md](ENVIRONMENT_SETUP.md) for complete setup instruc…

 The Aden Agent Framework aims to help developers build outcome-oriented, self-adapting agents. Find our roadmap here:

-[ROADMAP.md](ROADMAP.md)
+[roadmap.md](../roadmap.md)

 ```mermaid
 timeline
+20 -23
@@ -62,8 +62,8 @@ Aden is a platform that helps AI age…
 # Quick Links

 - **[Documentation](https://docs.adenhq.com/)** - Complete guides and API reference
 - **[Self-Hosting Guide](https://docs.adenhq.com/getting-started/quickstart)** - Deploy Hive on your own infrastructure
 - **[Changelog](https://github.com/adenhq/hive/releases)** - Latest updates and releases
 <!-- - **[Roadmap](https://adenhq.com/roadmap)** - Upcoming features and plans -->
 - **[Report an Issue](https://github.com/adenhq/hive/issues)** - Bug reports and feature requests
@@ -87,6 +87,7 @@ cd hive
 ```

 This installs:

 - **framework** - Core agent runtime and graph executor
 - **aden_tools** - 19 MCP tools for agent capabilities
 - All required dependencies

@@ -98,16 +99,16 @@ Install the Claude Code skills (…
 ./quickstart.sh

 # Build an agent using Claude Code
-claude> /building-agents-construction
+claude> /hive

 # Test your agent
-claude> /testing-agent
+claude> /hive-test

 # Run your agent
 PYTHONPATH=exports uv run python -m your_agent_name run --input '{...}'
 ```

-**[📖 Complete Setup Guide](ENVIRONMENT_SETUP.md)** - Detailed instructions for agent development
+**[📖 Complete Setup Guide](../environment-setup.md)** - Detailed instructions for agent development
 ## Features

@@ -171,14 +172,14 @@ flowchart LR

 ### Aden's Edge

 | Traditional Frameworks     | Aden                                              |
 | -------------------------- | ------------------------------------------------- |
 | Hardcode agent workflows   | Describe goals in natural language                |
 | Manual graph definition    | Auto-generated agent graphs                       |
 | Reactive error handling    | Proactive self-evolution                          |
 | Static tool configurations | SDK-wrapped dynamic nodes                         |
 | Separate monitoring setup  | Integrated real-time observability                |
 | DIY budget management      | Integrated cost controls and degradation policies |
 ### How It Works

@@ -222,10 +223,7 @@ hive/
 ├── docs/                 # Documents and guides
 ├── scripts/              # Build scripts and utilities
 ├── .claude/              # Claude Code skills for building agents
-├── ENVIRONMENT_SETUP.md  # Python setup guide for agent development
-├── DEVELOPER.md          # Developer guide
 ├── CONTRIBUTING.md       # Contribution guidelines
-└── ROADMAP.md            # Product roadmap
 ```

 ## Development
@@ -244,20 +242,20 @@ hive/
 # - All dependencies

 # Build new agents using Claude Code skills
-claude> /building-agents-construction
+claude> /hive

 # Test the agent
-claude> /testing-agent
+claude> /hive-test

 # Run the agent
 PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
 ```

-See ENVIRONMENT_SETUP.md for complete setup instructions.
+See [environment-setup.md](../environment-setup.md) for complete setup instructions.
 ## Documentation

-- **[Developer Guide](DEVELOPER.md)** - Complete guide for developers
+- **[Developer Guide](../developer-guide.md)** - Complete guide for developers
 - [Getting Started](docs/getting-started.md) - Quick setup instructions
 - [Configuration Guide](docs/configuration.md) - All configuration options
 - [Architecture Overview](docs/architecture/README.md) - System design and structure

@@ -266,7 +264,7 @@ PYTHONPATH=exports uv run python -m agent_name run --input '{...}'

 The Aden Agent Framework aims to help developers build outcome-oriented, self-adapting agents. See our roadmap here.

-[ROADMAP.md](ROADMAP.md)
+[roadmap.md](../roadmap.md)

 ```mermaid
 timeline
@@ -293,6 +291,7 @@ timeline
 - LinkedIn - [Company Page](https://www.linkedin.com/company/teamaden/)

 ## Contributing

 We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

 **Important:** Please ask to have an issue assigned to you before submitting a PR. Comment on the issue to claim it, and a maintainer will assign it to you within 24 hours. This avoids duplicate work.

@@ -352,5 +351,3 @@ timeline
 <p align="center">
 Built with 🔥 passion in San Francisco
 </p>
+52 -54
@@ -35,28 +35,28 @@

 ## Overview

 Build reliable, self-improving AI agents without hardcoding workflows. Define goals through conversation with a coding agent, and the framework generates a node graph with dynamically created connection code. When problems arise, the framework captures the failure data, evolves the agent through the coding agent, and redeploys it. Built-in human-in-the-loop nodes, credential management, and real-time monitoring let you keep control without sacrificing adaptability.

 See [adenhq.com](https://adenhq.com) for complete documentation, examples, and guides.

 ## What is Aden

 <p align="center">
 <img width="100%" alt="Aden Architecture" src="../assets/aden-architecture-diagram.jpg" />
 </p>

 Aden is a platform for building, deploying, operating, and adapting AI agents:

 - **Build** - A coding agent generates specialized worker agents (sales, marketing, operations) from natural-language goals
 - **Deploy** - Headless deployment with CI/CD integration and full API lifecycle management
 - **Operate** - Real-time monitoring, observability, and runtime guardrails keep agents reliable
 - **Adapt** - Continuous evaluation, supervision, and adaptation improve agents over time
 - **Infrastructure** - Shared memory, LLM integrations, tools, and skills support every agent

 ## Quick Links

 - **[Documentation](https://docs.adenhq.com/)** - Complete guides and API reference
 - **[Self-Hosting Guide](https://docs.adenhq.com/getting-started/quickstart)** - Deploy Hive on your own infrastructure
 - **[Changelog](https://github.com/adenhq/hive/releases)** - Latest updates and releases
 <!-- - **[Roadmap](https://adenhq.com/roadmap)** - Upcoming features and plans -->
 - **[Report an Issue](https://github.com/adenhq/hive/issues)** - Bug reports and feature requests
@@ -80,8 +80,9 @@ cd hive
 ```

 This installs:

 - **framework** - Core agent runtime and graph executor
 - **aden_tools** - 19 MCP tools for agent capabilities
 - All required dependencies

 ### Build Your First Agent
@@ -91,31 +92,31 @@ cd hive
 ./quickstart.sh

 # Build an agent using Claude Code
-claude> /building-agents-construction
+claude> /hive

 # Test the agent
-claude> /testing-agent
+claude> /hive-test

 # Run the agent
 PYTHONPATH=exports uv run python -m your_agent_name run --input '{...}'
 ```

-**[📖 Complete Setup Guide](ENVIRONMENT_SETUP.md)** - Detailed steps for agent development
+**[📖 Complete Setup Guide](../environment-setup.md)** - Detailed steps for agent development
 ## Features

 - **Goal-driven development** - Define goals in natural language; the coding agent generates the agent graph and connection code to achieve them
 - **Self-adapting agents** - The framework captures failures, updates goals, and updates the agent graph
 - **Dynamic node connections** - No predefined edges; connection code is generated by any supported LLM based on the goal
 - **SDK-wrapped nodes** - Every node ships with shared memory, local RLM memory, monitoring, tools, and LLM access as standard
 - **Human-in-the-loop** - Intervention nodes pause execution for human input, with configurable timeouts and escalation
 - **Real-time observability** - WebSocket streaming for live monitoring of agent execution, decisions, and inter-node communication
 - **Cost and budget management** - Set spending limits, throttling, and automatic model degradation policies
 - **Production-ready** - Self-hostable, built for scale and reliability

 ## Why Aden

 Traditional agent frameworks require you to design workflows manually, define agent interactions, and handle failures after the fact. Aden inverts this paradigm: **describe the outcome, and the system builds itself**.

 ```mermaid
 flowchart LR
@@ -162,34 +163,34 @@ flowchart LR
 style STORE fill:#ed8c00,stroke:#cc5d00,stroke-width:2px,color:#fff
 ```

 ### The Aden Advantage

 | Traditional Frameworks     | Aden                                     |
 | -------------------------- | ---------------------------------------- |
 | Hardcode agent workflows   | Describe goals in natural language       |
 | Define graphs manually     | Auto-generated agent graphs              |
 | Reactive error handling    | Proactive self-evolution                 |
 | Static tool configurations | Dynamic SDK-wrapped nodes                |
 | Separate monitoring setup  | Built-in real-time observability         |
 | DIY budget management      | Integrated cost controls and degradation |

 ### How It Works

 1. **Define a goal** → Describe what you want to achieve in plain language
 2. **The coding agent generates** → Creates the agent graph, connection code, and test cases
 3. **Workers execute** → SDK-wrapped nodes run with full observability and tool access
 4. **The control plane monitors** → Real-time metrics, budget enforcement, and policy management
 5. **Self-improvement** → On failure, the system evolves the graph and redeploys automatically
 ## How Aden Compares

 Aden takes a fundamentally different approach to agent development. Where most frameworks hardcode workflows or require you to define agent graphs by hand, Aden **uses a coding agent to generate the entire agent system from natural-language goals**. When an agent fails, the framework does not just log the error: it **automatically evolves the agent graph** and redeploys it.

 > **Note:** For a detailed framework comparison table and FAQ, see the English [README.md](README.md).

 ### When to Choose Aden

 Choose Aden if you need:

 - Agents that **self-improve from failures** without manual intervention
 - **Goal-driven development** where you describe outcomes, not workflows

@@ -200,7 +201,7 @@ Choose Aden if you need:
 Choose another framework for:

 - **Type-safe, predictable workflows** (PydanticAI, Mastra)
 - **RAG and document processing** (LlamaIndex, Haystack)
 - **Research on agent emergence** (CAMEL)
 - **Real-time voice/multimodal** (TEN Framework)
 - **Simple component chaining** (LangChain, Swarm)
@@ -215,15 +216,12 @@ hive/
|
||||
├── docs/ # ドキュメントとガイド
|
||||
├── scripts/ # ビルドとユーティリティスクリプト
|
||||
├── .claude/ # エージェント構築用のClaude Codeスキル
|
||||
├── ENVIRONMENT_SETUP.md # エージェント開発用のPythonセットアップガイド
|
||||
├── DEVELOPER.md # 開発者ガイド
|
||||
├── CONTRIBUTING.md # 貢献ガイドライン
|
||||
└── ROADMAP.md # プロダクトロードマップ
|
||||
```
|
||||
|
||||
## 開発
|
||||
|
||||
### Pythonエージェント開発
|
||||
### Python エージェント開発
|
||||
|
||||
フレームワークで目標駆動エージェントを構築および実行するには:
|
||||
|
||||
@@ -237,29 +235,29 @@ hive/
|
||||
# - すべての依存関係
|
||||
|
||||
# Claude Codeスキルを使用して新しいエージェントを構築
|
||||
claude> /building-agents-construction
|
||||
claude> /hive
|
||||
|
||||
# エージェントをテスト
|
||||
claude> /testing-agent
|
||||
claude> /hive-test
|
||||
|
||||
# エージェントを実行
|
||||
PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
|
||||
```
|
||||
|
||||
完全なセットアップ手順については、[ENVIRONMENT_SETUP.md](ENVIRONMENT_SETUP.md)を参照してください。
|
||||
完全なセットアップ手順については、[environment-setup.md](../environment-setup.md)を参照してください。
|
||||
|
||||
## ドキュメント
|
||||
|
||||
- **[開発者ガイド](DEVELOPER.md)** - 開発者向け総合ガイド
|
||||
- **[開発者ガイド](../developer-guide.md)** - 開発者向け総合ガイド
|
||||
- [はじめに](docs/getting-started.md) - クイックセットアップ手順
|
||||
- [設定ガイド](docs/configuration.md) - すべての設定オプション
|
||||
- [アーキテクチャ概要](docs/architecture/README.md) - システム設計と構造
|
||||
|
||||
## ロードマップ
|
||||
|
||||
Adenエージェントフレームワークは、開発者が結果志向で自己適応するエージェントを構築できるよう支援することを目指しています。ロードマップはこちらをご覧ください
|
||||
Aden エージェントフレームワークは、開発者が結果志向で自己適応するエージェントを構築できるよう支援することを目指しています。ロードマップはこちらをご覧ください
|
||||
|
||||
[ROADMAP.md](ROADMAP.md)
|
||||
[roadmap.md](../roadmap.md)
|
||||
|
||||
```mermaid
|
||||
timeline
|
||||
@@ -289,9 +287,9 @@ timeline
|
||||
|
||||
貢献を歓迎します!ガイドラインについては[CONTRIBUTING.md](CONTRIBUTING.md)をご覧ください。
|
||||
|
||||
**重要:** PRを提出する前に、まずIssueにアサインされてください。Issueにコメントして担当を申請すると、メンテナーが24時間以内にアサインします。これにより重複作業を防ぐことができます。
|
||||
**重要:** PR を提出する前に、まず Issue にアサインされてください。Issue にコメントして担当を申請すると、メンテナーが 24 時間以内にアサインします。これにより重複作業を防ぐことができます。
|
||||
|
||||
1. Issueを見つけるか作成し、アサインを受ける
|
||||
1. Issue を見つけるか作成し、アサインを受ける
|
||||
2. リポジトリをフォーク
|
||||
3. 機能ブランチを作成 (`git checkout -b feature/amazing-feature`)
|
||||
4. 変更をコミット (`git commit -m 'Add amazing feature'`)
|
||||
@@ -310,31 +308,31 @@ timeline
|
||||
|
||||
## ライセンス
|
||||
|
||||
このプロジェクトはApache License 2.0の下でライセンスされています - 詳細は[LICENSE](LICENSE)ファイルをご覧ください。
|
||||
このプロジェクトは Apache License 2.0 の下でライセンスされています - 詳細は[LICENSE](LICENSE)ファイルをご覧ください。
|
||||
|
||||
## よくある質問 (FAQ)
|
||||
|
||||
> **注意:** よくある質問の完全版については、英語の[README.md](README.md)を参照してください。
|
||||
|
||||
**Q: AdenはLangChainや他のエージェントフレームワークに依存していますか?**
|
||||
**Q: Aden は LangChain や他のエージェントフレームワークに依存していますか?**
|
||||
|
||||
いいえ。AdenはLangChain、CrewAI、その他のエージェントフレームワークに依存せずにゼロから構築されています。フレームワークは軽量で柔軟に設計されており、事前定義されたコンポーネントに依存するのではなく、エージェントグラフを動的に生成します。
|
||||
いいえ。Aden は LangChain、CrewAI、その他のエージェントフレームワークに依存せずにゼロから構築されています。フレームワークは軽量で柔軟に設計されており、事前定義されたコンポーネントに依存するのではなく、エージェントグラフを動的に生成します。
|
||||
|
||||
**Q: AdenはどのLLMプロバイダーをサポートしていますか?**
|
||||
**Q: Aden はどの LLM プロバイダーをサポートしていますか?**
|
||||
|
||||
AdenはLiteLLM統合を通じて100以上のLLMプロバイダーをサポートしており、OpenAI(GPT-4、GPT-4o)、Anthropic(Claudeモデル)、Google Gemini、Mistral、Groqなどが含まれます。適切なAPIキー環境変数を設定し、モデル名を指定するだけです。
|
||||
Aden は LiteLLM 統合を通じて 100 以上の LLM プロバイダーをサポートしており、OpenAI(GPT-4、GPT-4o)、Anthropic(Claude モデル)、Google Gemini、Mistral、Groq などが含まれます。適切な API キー環境変数を設定し、モデル名を指定するだけです。
|
||||
|
||||
**Q: Adenはオープンソースですか?**
|
||||
**Q: Aden はオープンソースですか?**
|
||||
|
||||
はい、AdenはApache License 2.0の下で完全にオープンソースです。コミュニティの貢献とコラボレーションを積極的に奨励しています。
|
||||
はい、Aden は Apache License 2.0 の下で完全にオープンソースです。コミュニティの貢献とコラボレーションを積極的に奨励しています。
|
||||
|
||||
**Q: Adenは他のエージェントフレームワークと何が違いますか?**
|
||||
**Q: Aden は他のエージェントフレームワークと何が違いますか?**
|
||||
|
||||
Adenはコーディングエージェントを使用して自然言語の目標からエージェントシステム全体を生成します—ワークフローをハードコードしたり、グラフを手動で定義したりする必要はありません。エージェントが失敗すると、フレームワークは自動的に障害データをキャプチャし、エージェントグラフを進化させ、再デプロイします。この自己改善ループはAden独自のものです。
|
||||
Aden はコーディングエージェントを使用して自然言語の目標からエージェントシステム全体を生成します—ワークフローをハードコードしたり、グラフを手動で定義したりする必要はありません。エージェントが失敗すると、フレームワークは自動的に障害データをキャプチャし、エージェントグラフを進化させ、再デプロイします。この自己改善ループは Aden 独自のものです。
|
||||
|
||||
**Q: Adenはヒューマンインザループワークフローをサポートしていますか?**
|
||||
**Q: Aden はヒューマンインザループワークフローをサポートしていますか?**
|
||||
|
||||
はい、Adenは人間の入力のために実行を一時停止する介入ノードを通じて、ヒューマンインザループワークフローを完全にサポートしています。設定可能なタイムアウトとエスカレーションポリシーが含まれており、人間の専門家とAIエージェントのシームレスなコラボレーションを可能にします。
|
||||
はい、Aden は人間の入力のために実行を一時停止する介入ノードを通じて、ヒューマンインザループワークフローを完全にサポートしています。設定可能なタイムアウトとエスカレーションポリシーが含まれており、人間の専門家と AI エージェントのシームレスなコラボレーションを可能にします。
|
||||
|
||||
---
|
||||
|
||||
|
||||
+10
-13
@@ -91,16 +91,16 @@ cd hive
|
||||
./quickstart.sh
|
||||
|
||||
# Claude Code를 사용해 에이전트 빌드
|
||||
claude> /building-agents
|
||||
claude> /hive
|
||||
|
||||
# 에이전트 테스트
|
||||
claude> /testing-agent
|
||||
claude> /hive-test
|
||||
|
||||
# 에이전트 실행
|
||||
PYTHONPATH=exports uv run python -m your_agent_name run --input '{...}'
|
||||
```
|
||||
|
||||
**[📖 전체 설정 가이드](ENVIRONMENT_SETUP.md)** - 에이전트 개발을 위한 상세한 설명
|
||||
**[📖 전체 설정 가이드](../environment-setup.md)** - 에이전트 개발을 위한 상세한 설명
|
||||
|
||||
## 주요 기능
|
||||
|
||||
@@ -226,10 +226,7 @@ hive/
|
||||
├── docs/ # 문서 및 가이드
|
||||
├── scripts/ # 빌드 및 유틸리티 스크립트
|
||||
├── .claude/ # 에이전트 생성을 위한 Claude Code 스킬
|
||||
├── ENVIRONMENT_SETUP.md # 에이전트 개발을 위한 Python 환경 설정 가이드
|
||||
├── DEVELOPER.md # 개발자 가이드
|
||||
├── CONTRIBUTING.md # 기여 가이드라인
|
||||
└── ROADMAP.md # 제품 로드맵
|
||||
```
|
||||
|
||||
## 개발
|
||||
@@ -248,20 +245,20 @@ hive/
|
||||
# - 모든 의존성
|
||||
|
||||
# Claude Code 스킬을 사용해 새 에이전트 생성
|
||||
claude> /building-agents
|
||||
claude> /hive
|
||||
|
||||
# 에이전트 테스트
|
||||
claude> /testing-agent
|
||||
claude> /hive-test
|
||||
|
||||
# 에이전트 실행
|
||||
PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
|
||||
```
|
||||
|
||||
전체 설정 방법은 [ENVIRONMENT_SETUP.md](ENVIRONMENT_SETUP.md) 를 참고하세요.
|
||||
전체 설정 방법은 [environment-setup.md](../environment-setup.md) 를 참고하세요.
|
||||
|
||||
## 문서
|
||||
|
||||
- **[개발자 가이드](DEVELOPER.md)** - 개발자를 위한 종합 가이드
|
||||
- **[개발자 가이드](../developer-guide.md)** - 개발자를 위한 종합 가이드
|
||||
- [시작하기](docs/getting-started.md) - 빠른 설정 방법
|
||||
- [설정 가이드](docs/configuration.md) - 모든 설정 옵션 안내
|
||||
- [아키텍처 개요](docs/architecture/README.md) - 시스템 설계 및 구조
|
||||
@@ -271,7 +268,7 @@ PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
|
||||
Aden Agent Framework는 개발자가 결과 중심(outcome-oriented) 이며 자기 적응형(self-adaptive) 에이전트를 구축할 수 있도록 돕는 것을 목표로 합니다.
|
||||
자세한 로드맵은 아래 문서에서 확인할 수 있습니다.
|
||||
|
||||
[ROADMAP.md](ROADMAP.md)
|
||||
[roadmap.md](../roadmap.md)
|
||||
|
||||
```mermaid
|
||||
timeline
|
||||
@@ -352,7 +349,7 @@ Aden은 모니터링과 관측성을 위해 토큰 사용량, 지연 시간 메
|
||||
|
||||
**Q: Aden은 어떤 배포 방식을 지원하나요?**
|
||||
|
||||
Aden은 Python 패키지를 통한 셀프 호스팅 배포를 지원합니다. 설치 방법은 [환경 설정 가이드](ENVIRONMENT_SETUP.md)를 참조하세요. 클라우드 배포 옵션과 Kubernetes 대응 설정은 로드맵에 포함되어 있습니다.
|
||||
Aden은 Python 패키지를 통한 셀프 호스팅 배포를 지원합니다. 설치 방법은 [환경 설정 가이드](../environment-setup.md)를 참조하세요. 클라우드 배포 옵션과 Kubernetes 대응 설정은 로드맵에 포함되어 있습니다.
|
||||
|
||||
**Q: Aden은 복잡한 프로덕션 규모의 사용 사례도 처리할 수 있나요?**
|
||||
|
||||
@@ -380,7 +377,7 @@ Aden은 지출 한도, 호출 제한, 자동 모델 다운그레이드 정책
|
||||
|
||||
**Q: 예제와 문서는 어디에서 확인할 수 있나요?**
|
||||
|
||||
전체 가이드, API 레퍼런스, 시작 튜토리얼은 [docs.adenhq.com](https://docs.adenhq.com/) 에서 확인하실 수 있습니다. 또한 저장소의 `docs/` 디렉터리와 종합적인 [DEVELOPER.md](DEVELOPER.md) 가이드도 함께 제공됩니다.
|
||||
전체 가이드, API 레퍼런스, 시작 튜토리얼은 [docs.adenhq.com](https://docs.adenhq.com/) 에서 확인하실 수 있습니다. 또한 저장소의 `docs/` 디렉터리와 종합적인 [developer-guide.md](../developer-guide.md) 가이드도 함께 제공됩니다.
|
||||
|
||||
**Q: Aden에 기여하려면 어떻게 해야 하나요?**
|
||||
|
||||
|
||||
+17
-19
@@ -80,6 +80,7 @@ cd hive
|
||||
```
|
||||
|
||||
Isto instala:
|
||||
|
||||
- **framework** - Runtime do agente principal e executor de grafos
|
||||
- **aden_tools** - 19 ferramentas MCP para capacidades de agentes
|
||||
- Todas as dependências necessárias
|
||||
@@ -91,16 +92,16 @@ Isto instala:
|
||||
./quickstart.sh
|
||||
|
||||
# Construir um agente usando Claude Code
|
||||
claude> /building-agents-construction
|
||||
claude> /hive
|
||||
|
||||
# Testar seu agente
|
||||
claude> /testing-agent
|
||||
claude> /hive-test
|
||||
|
||||
# Executar seu agente
|
||||
PYTHONPATH=exports uv run python -m your_agent_name run --input '{...}'
|
||||
```
|
||||
|
||||
**[📖 Guia Completo de Configuração](ENVIRONMENT_SETUP.md)** - Instruções detalhadas para desenvolvimento de agentes
|
||||
**[📖 Guia Completo de Configuração](../environment-setup.md)** - Instruções detalhadas para desenvolvimento de agentes
|
||||
|
||||
## Funcionalidades
|
||||
|
||||
@@ -164,14 +165,14 @@ flowchart LR
|
||||
|
||||
### A Vantagem Aden
|
||||
|
||||
| Frameworks Tradicionais | Aden |
|
||||
|-------------------------|------|
|
||||
| Codificar fluxos de trabalho de agentes | Descrever objetivos em linguagem natural |
|
||||
| Definição manual de grafos | Grafos de agentes auto-gerados |
|
||||
| Tratamento reativo de erros | Auto-evolução proativa |
|
||||
| Configurações de ferramentas estáticas | Nós dinâmicos envolvidos em SDK |
|
||||
| Configuração de monitoramento separada | Observabilidade em tempo real integrada |
|
||||
| Gerenciamento de orçamento DIY | Controles de custo e degradação integrados |
|
||||
| Frameworks Tradicionais | Aden |
|
||||
| --------------------------------------- | ------------------------------------------ |
|
||||
| Codificar fluxos de trabalho de agentes | Descrever objetivos em linguagem natural |
|
||||
| Definição manual de grafos | Grafos de agentes auto-gerados |
|
||||
| Tratamento reativo de erros | Auto-evolução proativa |
|
||||
| Configurações de ferramentas estáticas | Nós dinâmicos envolvidos em SDK |
|
||||
| Configuração de monitoramento separada | Observabilidade em tempo real integrada |
|
||||
| Gerenciamento de orçamento DIY | Controles de custo e degradação integrados |
|
||||
|
||||
### Como Funciona
|
||||
|
||||
@@ -215,10 +216,7 @@ hive/
|
||||
├── docs/ # Documentação e guias
|
||||
├── scripts/ # Scripts de build e utilitários
|
||||
├── .claude/ # Habilidades Claude Code para construir agentes
|
||||
├── ENVIRONMENT_SETUP.md # Guia de configuração Python para desenvolvimento de agentes
|
||||
├── DEVELOPER.md # Guia do desenvolvedor
|
||||
├── CONTRIBUTING.md # Diretrizes de contribuição
|
||||
└── ROADMAP.md # Roadmap do produto
|
||||
```
|
||||
|
||||
## Desenvolvimento
|
||||
@@ -237,20 +235,20 @@ Para construir e executar agentes orientados a objetivos com o framework:
|
||||
# - Todas as dependências
|
||||
|
||||
# Construir novos agentes usando habilidades Claude Code
|
||||
claude> /building-agents-construction
|
||||
claude> /hive
|
||||
|
||||
# Testar agentes
|
||||
claude> /testing-agent
|
||||
claude> /hive-test
|
||||
|
||||
# Executar agentes
|
||||
PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
|
||||
```
|
||||
|
||||
Consulte [ENVIRONMENT_SETUP.md](ENVIRONMENT_SETUP.md) para instruções completas de configuração.
|
||||
Consulte [environment-setup.md](../environment-setup.md) para instruções completas de configuração.
|
||||
|
||||
## Documentação
|
||||
|
||||
- **[Guia do Desenvolvedor](DEVELOPER.md)** - Guia abrangente para desenvolvedores
|
||||
- **[Guia do Desenvolvedor](../developer-guide.md)** - Guia abrangente para desenvolvedores
|
||||
- [Começando](docs/getting-started.md) - Instruções de configuração rápida
|
||||
- [Guia de Configuração](docs/configuration.md) - Todas as opções de configuração
|
||||
- [Visão Geral da Arquitetura](docs/architecture/README.md) - Design e estrutura do sistema
|
||||
@@ -259,7 +257,7 @@ Consulte [ENVIRONMENT_SETUP.md](ENVIRONMENT_SETUP.md) para instruções completa
|
||||
|
||||
O Aden Agent Framework visa ajudar desenvolvedores a construir agentes auto-adaptativos orientados a resultados. Encontre nosso roadmap aqui
|
||||
|
||||
[ROADMAP.md](ROADMAP.md)
|
||||
[roadmap.md](../roadmap.md)
|
||||
|
||||
```mermaid
|
||||
timeline
|
||||
|
||||
+17
-19
@@ -80,6 +80,7 @@ cd hive
|
||||
```
|
||||
|
||||
Это установит:
|
||||
|
||||
- **framework** - Основная среда выполнения агентов и исполнитель графов
|
||||
- **aden_tools** - 19 инструментов MCP для возможностей агентов
|
||||
- Все необходимые зависимости
|
||||
@@ -91,16 +92,16 @@ cd hive
|
||||
./quickstart.sh
|
||||
|
||||
# Создать агента с помощью Claude Code
|
||||
claude> /building-agents-construction
|
||||
claude> /hive
|
||||
|
||||
# Протестировать агента
|
||||
claude> /testing-agent
|
||||
claude> /hive-test
|
||||
|
||||
# Запустить агента
|
||||
PYTHONPATH=exports uv run python -m your_agent_name run --input '{...}'
|
||||
```
|
||||
|
||||
**[📖 Полное руководство по настройке](ENVIRONMENT_SETUP.md)** - Подробные инструкции для разработки агентов
|
||||
**[📖 Полное руководство по настройке](../environment-setup.md)** - Подробные инструкции для разработки агентов
|
||||
|
||||
## Функции
|
||||
|
||||
@@ -164,14 +165,14 @@ flowchart LR
|
||||
|
||||
### Преимущество Aden
|
||||
|
||||
| Традиционные фреймворки | Aden |
|
||||
|-------------------------|------|
|
||||
| Жёсткое кодирование рабочих процессов | Описание целей на естественном языке |
|
||||
| Ручное определение графов | Автоматически генерируемые графы агентов |
|
||||
| Реактивная обработка ошибок | Проактивная самоэволюция |
|
||||
| Статические конфигурации инструментов | Динамические узлы, обёрнутые SDK |
|
||||
| Отдельная настройка мониторинга | Встроенная наблюдаемость в реальном времени |
|
||||
| DIY управление бюджетом | Интегрированный контроль затрат и деградация |
|
||||
| Традиционные фреймворки | Aden |
|
||||
| ------------------------------------- | -------------------------------------------- |
|
||||
| Жёсткое кодирование рабочих процессов | Описание целей на естественном языке |
|
||||
| Ручное определение графов | Автоматически генерируемые графы агентов |
|
||||
| Реактивная обработка ошибок | Проактивная самоэволюция |
|
||||
| Статические конфигурации инструментов | Динамические узлы, обёрнутые SDK |
|
||||
| Отдельная настройка мониторинга | Встроенная наблюдаемость в реальном времени |
|
||||
| DIY управление бюджетом | Интегрированный контроль затрат и деградация |
|
||||
|
||||
### Как это работает
|
||||
|
||||
@@ -215,10 +216,7 @@ hive/
|
||||
├── docs/ # Документация и руководства
|
||||
├── scripts/ # Скрипты сборки и утилиты
|
||||
├── .claude/ # Навыки Claude Code для создания агентов
|
||||
├── ENVIRONMENT_SETUP.md # Руководство по настройке Python для разработки агентов
|
||||
├── DEVELOPER.md # Руководство разработчика
|
||||
├── CONTRIBUTING.md # Руководство по участию
|
||||
└── ROADMAP.md # Дорожная карта продукта
|
||||
```
|
||||
|
||||
## Разработка
|
||||
@@ -237,20 +235,20 @@ hive/
|
||||
# - Все зависимости
|
||||
|
||||
# Создать новых агентов с помощью навыков Claude Code
|
||||
claude> /building-agents-construction
|
||||
claude> /hive
|
||||
|
||||
# Протестировать агентов
|
||||
claude> /testing-agent
|
||||
claude> /hive-test
|
||||
|
||||
# Запустить агентов
|
||||
PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
|
||||
```
|
||||
|
||||
Обратитесь к [ENVIRONMENT_SETUP.md](ENVIRONMENT_SETUP.md) для полных инструкций по настройке.
|
||||
Обратитесь к [environment-setup.md](../environment-setup.md) для полных инструкций по настройке.
|
||||
|
||||
## Документация
|
||||
|
||||
- **[Руководство разработчика](DEVELOPER.md)** - Полное руководство для разработчиков
|
||||
- **[Руководство разработчика](../developer-guide.md)** - Полное руководство для разработчиков
|
||||
- [Начало работы](docs/getting-started.md) - Инструкции по быстрой настройке
|
||||
- [Руководство по конфигурации](docs/configuration.md) - Все опции конфигурации
|
||||
- [Обзор архитектуры](docs/architecture/README.md) - Дизайн и структура системы
|
||||
@@ -259,7 +257,7 @@ PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
|
||||
|
||||
Aden Agent Framework призван помочь разработчикам создавать самоадаптирующихся агентов, ориентированных на результат. Найдите нашу дорожную карту здесь
|
||||
|
||||
[ROADMAP.md](ROADMAP.md)
|
||||
[roadmap.md](../roadmap.md)
|
||||
|
||||
```mermaid
|
||||
timeline
|
||||
|
||||
+16
-18
@@ -80,6 +80,7 @@ cd hive
|
||||
```
|
||||
|
||||
这将安装:
|
||||
|
||||
- **framework** - 核心智能体运行时和图执行器
|
||||
- **aden_tools** - 19 个 MCP 工具提供智能体能力
|
||||
- 所有必需的依赖项
|
||||
@@ -91,16 +92,16 @@ cd hive
|
||||
./quickstart.sh
|
||||
|
||||
# 使用 Claude Code 构建智能体
|
||||
claude> /building-agents-construction
|
||||
claude> /hive
|
||||
|
||||
# 测试您的智能体
|
||||
claude> /testing-agent
|
||||
claude> /hive-test
|
||||
|
||||
# 运行您的智能体
|
||||
PYTHONPATH=exports uv run python -m your_agent_name run --input '{...}'
|
||||
```
|
||||
|
||||
**[📖 完整设置指南](ENVIRONMENT_SETUP.md)** - 智能体开发的详细说明
|
||||
**[📖 完整设置指南](../environment-setup.md)** - 智能体开发的详细说明
|
||||
|
||||
## 功能特性
|
||||
|
||||
@@ -164,14 +165,14 @@ flowchart LR
|
||||
|
||||
### Aden 的优势
|
||||
|
||||
| 传统框架 | Aden |
|
||||
|----------|------|
|
||||
| 传统框架 | Aden |
|
||||
| ------------------ | ------------------ |
|
||||
| 硬编码智能体工作流 | 用自然语言描述目标 |
|
||||
| 手动图定义 | 自动生成智能体图 |
|
||||
| 被动错误处理 | 主动自我进化 |
|
||||
| 静态工具配置 | 动态 SDK 封装节点 |
|
||||
| 单独设置监控 | 内置实时可观测性 |
|
||||
| DIY 预算管理 | 集成成本控制和降级 |
|
||||
| 手动图定义 | 自动生成智能体图 |
|
||||
| 被动错误处理 | 主动自我进化 |
|
||||
| 静态工具配置 | 动态 SDK 封装节点 |
|
||||
| 单独设置监控 | 内置实时可观测性 |
|
||||
| DIY 预算管理 | 集成成本控制和降级 |
|
||||
|
||||
### 工作原理
|
||||
|
||||
@@ -215,10 +216,7 @@ hive/
|
||||
├── docs/ # 文档和指南
|
||||
├── scripts/ # 构建和实用脚本
|
||||
├── .claude/ # Claude Code 技能用于构建智能体
|
||||
├── ENVIRONMENT_SETUP.md # 智能体开发的 Python 设置指南
|
||||
├── DEVELOPER.md # 开发者指南
|
||||
├── CONTRIBUTING.md # 贡献指南
|
||||
└── ROADMAP.md # 产品路线图
|
||||
```
|
||||
|
||||
## 开发
|
||||
@@ -237,20 +235,20 @@ hive/
|
||||
# - 所有依赖项
|
||||
|
||||
# 使用 Claude Code 技能构建新智能体
|
||||
claude> /building-agents-construction
|
||||
claude> /hive
|
||||
|
||||
# 测试智能体
|
||||
claude> /testing-agent
|
||||
claude> /hive-test
|
||||
|
||||
# 运行智能体
|
||||
PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
|
||||
```
|
||||
|
||||
完整设置说明请参阅 [ENVIRONMENT_SETUP.md](ENVIRONMENT_SETUP.md)。
|
||||
完整设置说明请参阅 [environment-setup.md](../environment-setup.md)。
|
||||
|
||||
## 文档
|
||||
|
||||
- **[开发者指南](DEVELOPER.md)** - 开发者综合指南
|
||||
- **[开发者指南](../developer-guide.md)** - 开发者综合指南
|
||||
- [入门指南](docs/getting-started.md) - 快速设置说明
|
||||
- [配置指南](docs/configuration.md) - 所有配置选项
|
||||
- [架构概述](docs/architecture/README.md) - 系统设计和结构
|
||||
@@ -259,7 +257,7 @@ PYTHONPATH=exports uv run python -m agent_name run --input '{...}'
|
||||
|
||||
Aden 智能体框架旨在帮助开发者构建面向结果的、自适应的智能体。请在此查看我们的路线图
|
||||
|
||||
[ROADMAP.md](ROADMAP.md)
|
||||
[roadmap.md](../roadmap.md)
|
||||
|
||||
```mermaid
|
||||
timeline
|
||||
|
||||
@@ -0,0 +1,49 @@
# Evolution

## Evolution Is the Mechanism; Adaptiveness Is the Result

Agents don't just fail; they fail inevitably. Real-world variables—private LinkedIn profiles, shifting API schemas, or LLM hallucinations—are impossible to predict in a vacuum. The first version of any agent is merely a "happy path" draft.

Evolution is how Hive handles this. When an agent fails, the framework captures what went wrong — which node failed, which success criteria weren't met, what the agent tried and why it didn't work. Then a coding agent (Claude Code, Cursor, or similar) uses that failure data to generate an improved version of the agent. The new version gets deployed, runs, encounters new edge cases, and the cycle continues.

Over generations, the agent gets more reliable. Not because someone sat down and anticipated every possible failure, but because each failure teaches the next version something specific.

## How It Works

The evolution loop has four stages:

**1. Execute** — The worker agent runs against real inputs. Sessions produce outcomes, decisions, and metrics.

**2. Evaluate** — The framework checks outcomes against the goal's success criteria and constraints. Did the agent produce the desired result? Which criteria were satisfied and which weren't? Were any constraints violated?

**3. Diagnose** — Failure data is structured and specific. It's not just "the agent failed" — it's "node `draft_message` failed to produce personalized content because the research node returned insufficient data about the prospect's recent activity." The decision log, problem reports, and execution trace provide the full picture.

**4. Regenerate** — A coding agent receives the diagnosis and the current agent code. It modifies the graph — adding nodes, adjusting prompts, changing edge conditions, adding tools — to address the specific failure. The new version is deployed and the cycle restarts.
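The four stages above can be sketched as a loop. Everything here is illustrative: the helper names (`run_session`, `evaluate`, `diagnose`, `regenerate`) and the toy stand-in logic are assumptions for this sketch, not Hive's actual API.

```python
from dataclasses import dataclass

@dataclass
class Diagnosis:
    failed_node: str
    unmet_criteria: list

# Toy stand-ins: a real system would run the agent and judge its output here.
def run_session(agent_code, inp):
    # "Succeeds" only if the agent code already handles this input's edge case.
    return {"ok": inp["case"] in agent_code, "case": inp["case"]}

def evaluate(outcome):
    return outcome["ok"]

def diagnose(outcome):
    return Diagnosis(failed_node="draft_message", unmet_criteria=[outcome["case"]])

def regenerate(agent_code, diagnosis):
    # A coding agent would rewrite the graph; here we just "patch in" the case.
    return agent_code + " " + diagnosis.unmet_criteria[0]

def evolution_loop(agent_code, inputs, max_generations=5):
    for _ in range(max_generations):
        outcomes = [run_session(agent_code, i) for i in inputs]       # 1. Execute
        failures = [o for o in outcomes if not evaluate(o)]           # 2. Evaluate
        if not failures:
            break                                                     # all criteria met
        agent_code = regenerate(agent_code, diagnose(failures[0]))    # 3-4. Diagnose + Regenerate
    return agent_code

# Each generation fixes one observed failure and redeploys:
evolved = evolution_loop("handles: base",
                         [{"case": "private_profile"}, {"case": "schema_drift"}])
```

The point of the sketch is the shape of the loop: failures are not discarded, they are the input to the next generation.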
## Adaptiveness ≠ Intelligence or Intent

An important distinction: evolution makes agents more adaptive, but not more intelligent in any general sense. The agent isn't learning to reason better — it's being rewritten to handle more situations correctly.

This is closer to how biological evolution works than how learning works. A species doesn't "learn" to survive winter — individuals that happen to have thicker fur survive, and that trait gets selected for. Similarly, agent versions that handle more edge cases correctly survive in production, and the patterns that made them successful get carried forward.

The practical implication: don't expect evolution to make an agent smarter about problems it's never seen. Evolution improves reliability on the *kinds* of problems the agent has already encountered. For genuinely novel situations, that's what human-in-the-loop is for — and every time a human steps in, that interaction becomes potential fuel for the next evolution cycle.

## What Gets Evolved

Evolution can change almost anything about an agent:

**Prompts** — The most common fix. A node's system prompt gets refined based on the specific ways the LLM misunderstood its instructions.

**Graph structure** — Adding a validation node before a critical step, splitting a node that's trying to do too much, adding a fallback path for a common failure mode.

**Edge conditions** — Adjusting routing logic based on observed patterns. If low-confidence research results consistently lead to bad drafts, add a conditional edge that routes them back for another research pass.

**Tool selection** — Swapping in a better tool, adding a new one, or removing one that causes more problems than it solves.

**Constraints and criteria** — Tightening or loosening based on what's actually achievable and what matters in practice.

## The Role of Decision Logging

Evolution depends on good data. The runtime captures every decision an agent makes: what it was trying to do, what options it considered, what it chose, and what happened as a result. This isn't overhead — it's the signal that makes evolution possible.

Without decision logging, failure analysis is guesswork. With it, the coding agent can trace a failure back to its root cause and make a targeted fix rather than a blind change.
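Concretely, a decision-log record might look like the following. The schema here is a hypothetical sketch for illustration, not Hive's actual log format.

```python
import json
import time

# One illustrative decision-log record; every field name is an assumption.
entry = {
    "timestamp": time.time(),
    "node": "draft_message",
    "intent": "produce a personalized opening line",
    "options_considered": ["use recent tweet", "use bio", "generic opener"],
    "chosen": "use bio",
    "outcome": {"accepted": False, "reason": "no specific detail referenced"},
}

# Serialized records like this are what a coding agent later mines
# to trace a failure back to the decision that caused it.
line = json.dumps(entry)
```

Because the record captures options considered as well as the choice made, a failure can be traced to a specific decision rather than to the run as a whole.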
@@ -0,0 +1,101 @@
# Goals & Outcome-Driven Development

## The Core Idea

Business processes are outcome-driven. A sales team doesn't follow a rigid script — they adapt their approach until the deal closes. A support agent doesn't execute a flowchart — they resolve the customer's issue. The outcome is what matters, not the specific steps taken to get there.

Hive is built on this principle. Instead of hardcoding agent workflows step by step, you define the outcome you want, and the framework figures out how to get there. We call this **Outcome-Driven Development (ODD)**.

## Task-Driven vs Goal-Driven vs Outcome-Driven

These three paradigms represent different levels of abstraction for building agents:

**Task-Driven Development (TDD)** asks: *"Is the code correct?"*

You define explicit steps. The agent follows them. Success means the steps ran without errors. The problem: an agent can execute every step perfectly and still produce a useless result. The steps become the goal, not the actual outcome.

**Goal-Driven Development (GDD)** asks: *"Are we solving the right problem?"*

You define what you want to achieve. The agent plans and executes toward that goal. Better than TDD because it captures intent. But goals can be vague — "improve customer satisfaction" doesn't tell you when you're done.

**Outcome-Driven Development (ODD)** asks: *"Did the system produce the desired result?"*

You define measurable success criteria, hard constraints, and the context the agent needs. The agent is evaluated against the actual outcome, not whether it followed the right steps or aimed at the right goal. This is what Hive implements.

## Goals as First-Class Citizens

In Hive, a `Goal` is not a string description. It's a structured object with three components:

### Success Criteria

Each goal has weighted success criteria that define what "done" looks like. These aren't binary pass/fail checks — they're multi-dimensional measures of quality.

```python
Goal(
    id="twitter-outreach",
    name="Personalized Twitter Outreach",
    success_criteria=[
        SuccessCriterion(
            id="personalized",
            description="Messages reference specific details from the prospect's profile",
            metric="llm_judge",
            weight=0.4
        ),
        SuccessCriterion(
            id="compliant",
            description="Messages follow brand voice guidelines",
            metric="llm_judge",
            weight=0.3
        ),
        SuccessCriterion(
            id="actionable",
            description="Each message includes a clear call to action",
            metric="output_contains",
            target="CTA",
            weight=0.3
        ),
    ],
    ...
)
```

Metrics can be `output_contains`, `output_equals`, `llm_judge`, or `custom`. Weights let you express what matters most — a perfectly compliant message that isn't personalized still falls short.

### Constraints

Constraints define what must **not** happen. They're the guardrails.

```python
constraints=[
    Constraint(
        id="no_spam",
        description="Never send more than 3 messages to the same person per week",
        constraint_type="hard",  # Violation = immediate escalation
        category="safety"
    ),
    Constraint(
        id="budget_limit",
        description="Total LLM cost must not exceed $5 per run",
        constraint_type="soft",  # Violation = warning, not a hard stop
        category="cost"
    ),
]
```

Hard constraints are non-negotiable — violating one triggers escalation or failure. Soft constraints are preferences that the agent should respect but can bend when necessary. Constraint categories include `time`, `cost`, `safety`, `scope`, and `quality`.

### Context

Goals carry context — domain knowledge, preferences, background information that the agent needs to make good decisions. This context is injected into every LLM call the agent makes, so the agent is always reasoning with the full picture.

## Why This Matters

When you define goals with weighted criteria and constraints, three things happen:

1. **The agent can self-correct.** Goals are injected into every LLM call, so the agent is always reasoning against its success criteria. Within a [graph execution](./graph.md), nodes use these criteria to decide whether to accept their output, retry, or escalate — self-correction in real time.

2. **Evolution has a target.** When an agent fails, the framework knows *which criteria* it fell short on, which gives the coding agent specific information to improve the next generation (see [Evolution](./evolution.md)).

3. **Humans stay in control.** Constraints define the boundaries. The agent has freedom to find creative solutions within those boundaries, but it can't cross the lines you've drawn.

The goal lifecycle flows through `DRAFT → READY → ACTIVE → COMPLETED / FAILED / SUSPENDED`, giving you visibility into where each objective stands at any point during execution.
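As a worked example of how weighted criteria combine, here is a toy roll-up of the `twitter-outreach` criteria from above. The scoring is illustrative only; Hive's evaluator may compute this differently.

```python
# Illustrative roll-up: sum the weights of the criteria that passed.
# The pass/fail flags stand in for real metric checks (llm_judge, output_contains).
criteria = [
    {"id": "personalized", "weight": 0.4, "passed": True},
    {"id": "compliant",    "weight": 0.3, "passed": True},
    {"id": "actionable",   "weight": 0.3, "passed": False},  # no CTA found
]

score = sum(c["weight"] for c in criteria if c["passed"])
# A compliant, personalized message with no call to action scores 0.7:
# good, but the missing CTA still costs 30% of the total.
```

This is why weights matter: a single failed criterion degrades the score in proportion to how much it was declared to matter, rather than flipping the whole run to "failed".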
@@ -0,0 +1,78 @@
|
||||
# The Agent Graph
|
||||
|
||||
## Why a Graph
|
||||
|
||||
Real business processes aren't linear. A sales outreach might go: research a prospect, draft a message, realize the research is thin, go back and dig deeper, draft again, get human approval, send. There are loops, branches, fallbacks, and decision points.
|
||||
|
||||
Hive models this as a directed graph. Nodes do work, edges connect them, and shared memory lets them pass data. The framework walks this structure — running nodes, following edges, managing retries — until the agent reaches its goal or exhausts its step budget.
|
||||
|
||||
Edges can loop back, creating feedback cycles where an agent retries a step or takes a different path. That's intentional. A graph that only moves forward can't self-correct.
|
||||
|
||||
## Nodes
|
||||
|
||||
A node is a unit of work. Each node reads inputs from shared memory, does something, and writes outputs back. There are a handful of node types, each suited to a different kind of work:
|
||||
|
||||
**`event_loop`** — The workhorse. This is a multi-turn LLM loop: the model reasons about the current state, calls tools, observes results, and keeps going until it has produced the required outputs. Most of the interesting agent behavior happens in these nodes. They handle long-running tasks, manage their own context window, and can recover from crashes mid-conversation.
|
||||
|
||||
**`function`** — A plain Python function. No LLM involved. Use these for anything deterministic: data transformation, API calls with known parameters, validation logic, or any step where you don't want a language model making judgment calls.
|
||||
|
||||
**`router`** — A decision point that directs execution down different paths. Can be rule-based ("if confidence is high, go left; otherwise, go right") or LLM-powered ("given the goal and what we know so far, which path makes sense?").
|
||||
|
||||
**`human_input`** — A pause point where the agent stops and asks a human for input before continuing. See [Human-in-the-Loop](#human-in-the-loop) below.
|
||||
|
||||
There are also simpler LLM node types (`llm_tool_use` for a single LLM call with tools, `llm_generate` for pure text generation) for steps that don't need the full event loop.
|
||||
|
||||
### Self-Correction Within a Node

The most important behavior in an `event_loop` node is the ability to self-correct. After each iteration, the node evaluates its own output: did it produce what was needed? If yes, it's done. If not, it tries again — but this time it sees what went wrong and adjusts.

This is the **reflexion pattern**: try, evaluate, learn from the result, try again. It's cheaper and more effective than starting over. An agent that takes three attempts to get something right is still more useful than one that fails on the first try and gives up.

Within a single node, the outcomes are:

- **Accept** — Output meets the bar. Move on.
- **Retry** — Not good enough, but recoverable. Try again with feedback.
- **Escalate** — Something is fundamentally broken. Hand off to error handling.

This is self-correction *within a session* — the agent adapting in real time. It's different from [evolution](./evolution.md), which improves the agent *across sessions* by rewriting its code between generations. Both matter: reflexion handles the bumps in a single run, evolution handles the patterns that keep recurring across many runs.

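The accept/retry/escalate loop can be sketched in a few lines. This is an illustrative model only, not the framework's actual judge API; `Verdict`, `attempt`, and `judge` are hypothetical names standing in for the real LLM turn and evaluator:

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    RETRY = "retry"
    ESCALATE = "escalate"


def run_with_reflexion(attempt, judge, max_retries=3):
    """Try, evaluate, feed the critique back, and try again."""
    feedback = None
    for _ in range(max_retries):
        output = attempt(feedback)          # produce an output, seeing the prior critique
        verdict, feedback = judge(output)   # evaluate it
        if verdict is Verdict.ACCEPT:
            return output
        if verdict is Verdict.ESCALATE:
            raise RuntimeError(f"escalated: {feedback}")
    raise RuntimeError("retry budget exhausted")
```

The key design point is that `attempt` receives the judge's feedback from the previous round, which is what makes a second attempt better than a blind restart.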
## Edges

Edges control flow between nodes. Each edge has a condition:

- **On success** — follow this edge if the source node succeeded
- **On failure** — follow if the source failed (this is how you wire up fallback paths and error recovery)
- **Conditional** — follow if an expression is true (e.g., route high-confidence results one way, low-confidence results another)
- **LLM-decided** — let the LLM choose which path based on the [goal](./goals_outcome.md) and current context

Edges also handle data plumbing between nodes — mapping one node's outputs to another node's expected inputs, so each node has a clean interface without needing to know where its data came from.

When a node has multiple outgoing edges, the framework can run those branches in parallel and reconverge when they're all done. This is useful for tasks like researching a prospect from multiple sources simultaneously.

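A minimal, self-contained model of that edge-selection logic (illustrative only; the framework's real `EdgeSpec` carries more fields, such as output-to-input mappings and priorities):

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Edge:
    source: str
    target: str
    on_failure: bool = False                        # failure edges wire up recovery paths
    guard: Optional[Callable[[dict], bool]] = None  # conditional edges check shared memory


def next_nodes(edges, current, succeeded, memory):
    """Return the targets of every outgoing edge whose condition holds."""
    chosen = []
    for e in edges:
        if e.source != current:
            continue
        if e.on_failure != (not succeeded):
            continue                        # success edges fire on success, failure edges on failure
        if e.guard and not e.guard(memory):
            continue                        # conditional edge whose expression is false
        chosen.append(e.target)
    return chosen
```

When `next_nodes` returns more than one target, that is exactly the fan-out case the framework can execute in parallel.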
## Shared Memory

Shared memory is how nodes communicate. It's a key-value store scoped to a single [session](./worker_agent.md). Every node declares which keys it reads and which it writes, and the framework enforces those boundaries — a node can't quietly access data it hasn't declared.

Data flows through the graph in a natural way: input arrives at the start, each node reads what it needs and writes what it produces, and edges map outputs to inputs as data moves between nodes. At the end, the full memory state is the execution result.

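A toy version of that enforcement, assuming nodes declare `input_keys` and `output_keys` the way the node specs in this repository do (the real store has more machinery; this just shows the boundary check):

```python
class SharedMemory:
    """Session-scoped key-value store that enforces declared access."""

    def __init__(self):
        self._data = {}

    def read(self, node, key):
        if key not in node["input_keys"]:
            raise PermissionError(f"{node['id']} did not declare read access to '{key}'")
        return self._data.get(key)

    def write(self, node, key, value):
        if key not in node["output_keys"]:
            raise PermissionError(f"{node['id']} did not declare write access to '{key}'")
        self._data[key] = value
```

Because access is declared up front, the framework can also validate the whole graph statically: every key a node reads must be written by some upstream node.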
## Human-in-the-Loop

Human-in-the-loop (HITL) nodes are where the agent pauses and asks a person for input. This isn't a blunt "stop everything" — the framework supports structured questions: open-ended text, multiple choice, yes/no approvals, and multi-field forms.

When the agent hits a HITL node, it saves its entire state and presents the questions. The session can sit paused for minutes, hours, or days. When the human responds, execution picks up exactly where it left off.

This is what makes Hive agents supervisable in production. You place HITL nodes at critical decision points — before sending a message, before making a purchase, before any action that's hard to undo. The agent handles the routine work autonomously; humans weigh in on the decisions that matter. And every time a human provides input, that decision becomes data the [evolution](./evolution.md) process can learn from.

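One way to picture the pause/resume mechanics (a simplified sketch; the real runtime persists far more state than this, and the function and field names here are hypothetical):

```python
import json


def pause_for_human(session, questions, store):
    """Persist the session and surface structured questions to the human."""
    session["status"] = "paused"
    session["pending_questions"] = questions
    store[session["id"]] = json.dumps(session)  # durable, so it survives restarts


def resume_with_answers(session_id, answers, store):
    """Pick up exactly where execution left off, with the answers in memory."""
    session = json.loads(store[session_id])
    session["memory"].update(answers)
    session["status"] = "running"
    session.pop("pending_questions")
    return session
```

The important property is that nothing lives only in process memory between the pause and the resume, which is why the gap can be days long.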
## The Shape of an Agent

A typical agent graph looks something like this:

```
intake → research → draft → [human review] → send → done
            ↑                                  |
            └──────────── on failure ──────────┘
```

An entry node where work begins. A chain of nodes that do the real work. HITL nodes at approval gates. Failure edges that loop back for another attempt. Terminal nodes where execution ends.

The framework tracks everything as it walks the graph: which nodes ran, how many retries each needed, how much the LLM calls cost, how long each step took. This metadata feeds into the [worker agent runtime](./worker_agent.md) for monitoring and into the [evolution](./evolution.md) process for improvement.

@@ -0,0 +1,51 @@

# The Worker Agent

## What a Worker Agent Is

A worker agent is a specialized AI agent built to perform a specific business process. It's not a general-purpose assistant — it's purpose-built, like hiring someone for a defined role. A sales outreach agent knows how to research prospects, craft personalized messages, and follow up. A support triage agent knows how to categorize tickets, pull customer context, and route to the right team.

In Hive, a **Coding Agent** (like Claude Code or Cursor) generates worker agents from a natural language goal description. You describe what you want the agent to do, and the coding agent produces the graph, nodes, edges, and configuration. The worker agent is the thing that actually runs.

## Sessions

A session is a single execution of a worker agent against a specific input. If your outreach agent processes 50 prospects, that's 50 sessions.

Each session is isolated — it has its own shared memory, its own execution state, and its own history. This matters because sessions can be long-running. An agent might start researching a prospect, pause for human approval, wait hours or days, and then resume to send the message. The session preserves everything across that gap.

Sessions also make debugging straightforward. Every decision the agent made, every tool it called, every retry it attempted — it's all captured in the session. When something goes wrong, you can trace exactly what happened.

## Iterations

Within a session, nodes (especially `event_loop` nodes) work in iterations. An iteration is one turn of the loop: the LLM reasons about the current state, possibly calls tools, observes results, and produces output. Then the judge evaluates: is this good enough?

If not, the node iterates again. The LLM sees what went wrong and adjusts its approach. This is how agents self-correct without human intervention — through rapid iteration within a single node, not by restarting the whole process.

Iterations have limits. You set a maximum per node to prevent runaway loops. If a node can't produce acceptable output within its iteration budget, it fails and the graph's error-handling edges take over.

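The iteration budget can be sketched as follows (illustrative; `llm_step` and `judge` are hypothetical stand-ins for the real LLM turn and evaluator):

```python
def run_node(llm_step, judge, max_iterations=5):
    """One event-loop node: iterate until the judge accepts or the budget runs out."""
    for i in range(1, max_iterations + 1):
        output = llm_step(i)
        if judge(output):
            return {"success": True, "output": output, "iterations": i}
    # Budget exhausted: the node fails so the graph's error-handling edges take over.
    return {"success": False, "error": "iteration budget exhausted",
            "iterations": max_iterations}
```

Note that exhausting the budget is a normal, reportable outcome, not an exception: failure is routed through the graph rather than crashing the session.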
## Headless Execution

A lot of business processes need to run continuously — monitoring inboxes, processing incoming leads, watching for events. These agents run **headless**: no UI, no human sitting at a terminal, just the agent doing its job in the background.

Headless doesn't mean unsupervised. HITL (human-in-the-loop) nodes still pause execution and wait for human input when the agent hits a decision it shouldn't make alone. The difference is that instead of a live conversation, the agent sends a notification, waits for a response through whatever channel you've configured, and resumes when the human weighs in.

This is the operational model Hive is designed for: agents that run 24/7 as part of your business infrastructure, with humans stepping in only when needed. The goal is to automate the routine and escalate the exceptions.

## The Runtime

The worker agent runtime manages the lifecycle: starting sessions, executing the graph, handling pauses and resumes, tracking costs, and collecting metrics. It coordinates everything the agent needs — LLM access, tool execution, shared memory, credential management — so individual nodes can focus on their specific job.

Key things the runtime handles:

**Cost tracking** — Every LLM call is metered. You set budget constraints on the goal, and the runtime enforces them. An agent can't silently burn through your API credits.

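A sketch of that metering (the per-token price below is a made-up placeholder, not a real rate, and the class is illustrative rather than the runtime's actual API):

```python
class CostMeter:
    """Meters LLM spend against a per-session budget."""

    def __init__(self, budget_usd):
        self.budget_usd = budget_usd
        self.spent_usd = 0.0

    def charge(self, prompt_tokens, completion_tokens, usd_per_1k=0.01):
        """Record one LLM call; refuse it if it would blow the budget."""
        call_cost = (prompt_tokens + completion_tokens) / 1000 * usd_per_1k
        if self.spent_usd + call_cost > self.budget_usd:
            raise RuntimeError("budget exceeded: refusing LLM call")
        self.spent_usd += call_cost
        return call_cost
```

Checking the budget *before* the call is the design choice that matters: the agent is stopped at the boundary rather than discovered over budget after the fact.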
**Decision logging** — Every meaningful choice the agent makes is recorded: what it was trying to do, what options it considered, what it chose, and what happened. This isn't just for debugging — it's the raw material that evolution uses to improve future generations.

**Event streaming** — The runtime emits events as the agent works. You can wire these up to dashboards, logs, or alerting systems to monitor agents in real time.

**Crash recovery** — If execution is interrupted (process crash, deployment, anything), the runtime can resume from the last checkpoint. Conversation state and memory are persisted, so the agent picks up where it left off rather than starting over.

## The Big Picture

The worker agent model is Hive's answer to a simple question: how do you run AI agents like you'd run a team?

You hire for a role (define the goal), you onboard them with context (provide tools, credentials, domain knowledge), you set expectations (success criteria and constraints), you let them work independently (headless execution), and you check in when something unusual comes up (HITL). When they're not performing well, you don't debug them line by line — you evolve them (see [Evolution](./evolution.md)).

@@ -37,5 +37,5 @@ uv run python -m exports.my_agent --help

## How to use a recipe

1. Read the recipe markdown file
2. Use the patterns described to build your own agent — either manually or with the builder agent (`/hive`)
3. Refer to the [core README](../core/README.md) for framework API details

@@ -0,0 +1,24 @@
"""
Deep Research Agent - Interactive, rigorous research with TUI conversation.

Research any topic through multi-source web search, quality evaluation,
and synthesis. Features client-facing TUI interaction at key checkpoints
for user guidance and iterative deepening.
"""

from .agent import DeepResearchAgent, default_agent, goal, nodes, edges
from .config import RuntimeConfig, AgentMetadata, default_config, metadata

__version__ = "1.0.0"

__all__ = [
    "DeepResearchAgent",
    "default_agent",
    "goal",
    "nodes",
    "edges",
    "RuntimeConfig",
    "AgentMetadata",
    "default_config",
    "metadata",
]
@@ -0,0 +1,241 @@
"""
CLI entry point for Deep Research Agent.

Uses AgentRuntime for multi-entrypoint support with HITL pause/resume.
"""

import asyncio
import json
import logging
import sys

import click

from .agent import default_agent, DeepResearchAgent


def setup_logging(verbose=False, debug=False):
    """Configure logging for execution visibility."""
    if debug:
        level, fmt = logging.DEBUG, "%(asctime)s %(name)s: %(message)s"
    elif verbose:
        level, fmt = logging.INFO, "%(message)s"
    else:
        level, fmt = logging.WARNING, "%(levelname)s: %(message)s"
    logging.basicConfig(level=level, format=fmt, stream=sys.stderr)
    logging.getLogger("framework").setLevel(level)


@click.group()
@click.version_option(version="1.0.0")
def cli():
    """Deep Research Agent - Interactive, rigorous research with TUI conversation."""
    pass


@cli.command()
@click.option("--topic", "-t", type=str, required=True, help="Research topic")
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--quiet", "-q", is_flag=True, help="Only output result JSON")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def run(topic, mock, quiet, verbose, debug):
    """Execute research on a topic."""
    if not quiet:
        setup_logging(verbose=verbose, debug=debug)

    context = {"topic": topic}

    result = asyncio.run(default_agent.run(context, mock_mode=mock))

    output_data = {
        "success": result.success,
        "steps_executed": result.steps_executed,
        "output": result.output,
    }
    if result.error:
        output_data["error"] = result.error

    click.echo(json.dumps(output_data, indent=2, default=str))
    sys.exit(0 if result.success else 1)


@cli.command()
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def tui(mock, verbose, debug):
    """Launch the TUI dashboard for interactive research."""
    setup_logging(verbose=verbose, debug=debug)

    try:
        from framework.tui.app import AdenTUI
    except ImportError:
        click.echo(
            "TUI requires the 'textual' package. Install with: pip install textual"
        )
        sys.exit(1)

    from pathlib import Path

    from framework.llm import LiteLLMProvider
    from framework.runner.tool_registry import ToolRegistry
    from framework.runtime.agent_runtime import create_agent_runtime
    from framework.runtime.event_bus import EventBus
    from framework.runtime.execution_stream import EntryPointSpec

    async def run_with_tui():
        agent = DeepResearchAgent()

        # Build graph and tools
        agent._event_bus = EventBus()
        agent._tool_registry = ToolRegistry()

        storage_path = Path.home() / ".hive" / "deep_research_agent"
        storage_path.mkdir(parents=True, exist_ok=True)

        mcp_config_path = Path(__file__).parent / "mcp_servers.json"
        if mcp_config_path.exists():
            agent._tool_registry.load_mcp_config(mcp_config_path)

        llm = None
        if not mock:
            llm = LiteLLMProvider(
                model=agent.config.model,
                api_key=agent.config.api_key,
                api_base=agent.config.api_base,
            )

        tools = list(agent._tool_registry.get_tools().values())
        tool_executor = agent._tool_registry.get_executor()
        graph = agent._build_graph()

        runtime = create_agent_runtime(
            graph=graph,
            goal=agent.goal,
            storage_path=storage_path,
            entry_points=[
                EntryPointSpec(
                    id="start",
                    name="Start Research",
                    entry_node="intake",
                    trigger_type="manual",
                    isolation_level="isolated",
                ),
            ],
            llm=llm,
            tools=tools,
            tool_executor=tool_executor,
        )

        await runtime.start()

        try:
            app = AdenTUI(runtime)
            await app.run_async()
        finally:
            await runtime.stop()

    asyncio.run(run_with_tui())


@cli.command()
@click.option("--json", "output_json", is_flag=True)
def info(output_json):
    """Show agent information."""
    info_data = default_agent.info()
    if output_json:
        click.echo(json.dumps(info_data, indent=2))
    else:
        click.echo(f"Agent: {info_data['name']}")
        click.echo(f"Version: {info_data['version']}")
        click.echo(f"Description: {info_data['description']}")
        click.echo(f"\nNodes: {', '.join(info_data['nodes'])}")
        click.echo(f"Client-facing: {', '.join(info_data['client_facing_nodes'])}")
        click.echo(f"Entry: {info_data['entry_node']}")
        click.echo(f"Terminal: {', '.join(info_data['terminal_nodes'])}")


@cli.command()
def validate():
    """Validate agent structure."""
    validation = default_agent.validate()
    if validation["valid"]:
        click.echo("Agent is valid")
        if validation["warnings"]:
            for warning in validation["warnings"]:
                click.echo(f"  WARNING: {warning}")
    else:
        click.echo("Agent has errors:")
        for error in validation["errors"]:
            click.echo(f"  ERROR: {error}")
    sys.exit(0 if validation["valid"] else 1)


@cli.command()
@click.option("--verbose", "-v", is_flag=True)
def shell(verbose):
    """Interactive research session (CLI, no TUI)."""
    asyncio.run(_interactive_shell(verbose))


async def _interactive_shell(verbose=False):
    """Async interactive shell."""
    setup_logging(verbose=verbose)

    click.echo("=== Deep Research Agent ===")
    click.echo("Enter a topic to research (or 'quit' to exit):\n")

    agent = DeepResearchAgent()
    await agent.start()

    try:
        while True:
            try:
                topic = await asyncio.get_event_loop().run_in_executor(
                    None, input, "Topic> "
                )
                if topic.lower() in ["quit", "exit", "q"]:
                    click.echo("Goodbye!")
                    break

                if not topic.strip():
                    continue

                click.echo("\nResearching...\n")

                result = await agent.trigger_and_wait("start", {"topic": topic})

                if result is None:
                    click.echo("\n[Execution timed out]\n")
                    continue

                if result.success:
                    output = result.output
                    if "report_content" in output:
                        click.echo("\n--- Report ---\n")
                        click.echo(output["report_content"])
                        click.echo("\n")
                    if "references" in output:
                        click.echo("--- References ---\n")
                        for ref in output.get("references", []):
                            click.echo(
                                f"  [{ref.get('number', '?')}] {ref.get('title', '')} - {ref.get('url', '')}"
                            )
                        click.echo("\n")
                else:
                    click.echo(f"\nResearch failed: {result.error}\n")

            except KeyboardInterrupt:
                click.echo("\nGoodbye!")
                break
            except Exception as e:
                click.echo(f"Error: {e}", err=True)
                import traceback

                traceback.print_exc()
    finally:
        await agent.stop()


if __name__ == "__main__":
    cli()
@@ -0,0 +1,311 @@
"""Agent graph construction for Deep Research Agent."""

from framework.graph import EdgeSpec, EdgeCondition, Goal, SuccessCriterion, Constraint
from framework.graph.edge import GraphSpec
from framework.graph.executor import ExecutionResult, GraphExecutor
from framework.runtime.event_bus import EventBus
from framework.runtime.core import Runtime
from framework.llm import LiteLLMProvider
from framework.runner.tool_registry import ToolRegistry

from .config import default_config, metadata
from .nodes import (
    intake_node,
    research_node,
    review_node,
    report_node,
)

# Goal definition
goal = Goal(
    id="rigorous-interactive-research",
    name="Rigorous Interactive Research",
    description=(
        "Research any topic by searching diverse sources, analyzing findings, "
        "and producing a cited report — with user checkpoints to guide direction."
    ),
    success_criteria=[
        SuccessCriterion(
            id="source-diversity",
            description="Use multiple diverse, authoritative sources",
            metric="source_count",
            target=">=5",
            weight=0.25,
        ),
        SuccessCriterion(
            id="citation-coverage",
            description="Every factual claim in the report cites its source",
            metric="citation_coverage",
            target="100%",
            weight=0.25,
        ),
        SuccessCriterion(
            id="user-satisfaction",
            description="User reviews findings before report generation",
            metric="user_approval",
            target="true",
            weight=0.25,
        ),
        SuccessCriterion(
            id="report-completeness",
            description="Final report answers the original research questions",
            metric="question_coverage",
            target="90%",
            weight=0.25,
        ),
    ],
    constraints=[
        Constraint(
            id="no-hallucination",
            description="Only include information found in fetched sources",
            constraint_type="quality",
            category="accuracy",
        ),
        Constraint(
            id="source-attribution",
            description="Every claim must cite its source with a numbered reference",
            constraint_type="quality",
            category="accuracy",
        ),
        Constraint(
            id="user-checkpoint",
            description="Present findings to the user before writing the final report",
            constraint_type="functional",
            category="interaction",
        ),
    ],
)

# Node list
nodes = [
    intake_node,
    research_node,
    review_node,
    report_node,
]

# Edge definitions
edges = [
    # intake -> research
    EdgeSpec(
        id="intake-to-research",
        source="intake",
        target="research",
        condition=EdgeCondition.ON_SUCCESS,
        priority=1,
    ),
    # research -> review
    EdgeSpec(
        id="research-to-review",
        source="research",
        target="review",
        condition=EdgeCondition.ON_SUCCESS,
        priority=1,
    ),
    # review -> research (feedback loop)
    EdgeSpec(
        id="review-to-research-feedback",
        source="review",
        target="research",
        condition=EdgeCondition.CONDITIONAL,
        condition_expr="needs_more_research == True",
        priority=1,
    ),
    # review -> report (user satisfied)
    EdgeSpec(
        id="review-to-report",
        source="review",
        target="report",
        condition=EdgeCondition.CONDITIONAL,
        condition_expr="needs_more_research == False",
        priority=2,
    ),
]

# Graph configuration
entry_node = "intake"
entry_points = {"start": "intake"}
pause_nodes = []
terminal_nodes = ["report"]


class DeepResearchAgent:
    """
    Deep Research Agent — 4-node pipeline with user checkpoints.

    Flow: intake -> research -> review -> report
                        ^         |
                        +---------+  feedback loop (if user wants more)
    """

    def __init__(self, config=None):
        self.config = config or default_config
        self.goal = goal
        self.nodes = nodes
        self.edges = edges
        self.entry_node = entry_node
        self.entry_points = entry_points
        self.pause_nodes = pause_nodes
        self.terminal_nodes = terminal_nodes
        self._executor: GraphExecutor | None = None
        self._graph: GraphSpec | None = None
        self._event_bus: EventBus | None = None
        self._tool_registry: ToolRegistry | None = None

    def _build_graph(self) -> GraphSpec:
        """Build the GraphSpec."""
        return GraphSpec(
            id="deep-research-agent-graph",
            goal_id=self.goal.id,
            version="1.0.0",
            entry_node=self.entry_node,
            entry_points=self.entry_points,
            terminal_nodes=self.terminal_nodes,
            pause_nodes=self.pause_nodes,
            nodes=self.nodes,
            edges=self.edges,
            default_model=self.config.model,
            max_tokens=self.config.max_tokens,
            loop_config={
                "max_iterations": 100,
                "max_tool_calls_per_turn": 20,
                "max_history_tokens": 32000,
            },
        )

    def _setup(self, mock_mode=False) -> GraphExecutor:
        """Set up the executor with all components."""
        from pathlib import Path

        storage_path = Path.home() / ".hive" / "deep_research_agent"
        storage_path.mkdir(parents=True, exist_ok=True)

        self._event_bus = EventBus()
        self._tool_registry = ToolRegistry()

        mcp_config_path = Path(__file__).parent / "mcp_servers.json"
        if mcp_config_path.exists():
            self._tool_registry.load_mcp_config(mcp_config_path)

        llm = None
        if not mock_mode:
            llm = LiteLLMProvider(
                model=self.config.model,
                api_key=self.config.api_key,
                api_base=self.config.api_base,
            )

        tool_executor = self._tool_registry.get_executor()
        tools = list(self._tool_registry.get_tools().values())

        self._graph = self._build_graph()
        runtime = Runtime(storage_path)

        self._executor = GraphExecutor(
            runtime=runtime,
            llm=llm,
            tools=tools,
            tool_executor=tool_executor,
            event_bus=self._event_bus,
            storage_path=storage_path,
            loop_config=self._graph.loop_config,
        )

        return self._executor

    async def start(self, mock_mode=False) -> None:
        """Set up the agent (initialize executor and tools)."""
        if self._executor is None:
            self._setup(mock_mode=mock_mode)

    async def stop(self) -> None:
        """Clean up resources."""
        self._executor = None
        self._event_bus = None

    async def trigger_and_wait(
        self,
        entry_point: str,
        input_data: dict,
        timeout: float | None = None,
        session_state: dict | None = None,
    ) -> ExecutionResult | None:
        """Execute the graph and wait for completion."""
        if self._executor is None:
            raise RuntimeError("Agent not started. Call start() first.")
        if self._graph is None:
            raise RuntimeError("Graph not built. Call start() first.")

        return await self._executor.execute(
            graph=self._graph,
            goal=self.goal,
            input_data=input_data,
            session_state=session_state,
        )

    async def run(
        self, context: dict, mock_mode=False, session_state=None
    ) -> ExecutionResult:
        """Run the agent (convenience method for single execution)."""
        await self.start(mock_mode=mock_mode)
        try:
            result = await self.trigger_and_wait(
                "start", context, session_state=session_state
            )
            return result or ExecutionResult(success=False, error="Execution timeout")
        finally:
            await self.stop()

    def info(self):
        """Get agent information."""
        return {
            "name": metadata.name,
            "version": metadata.version,
            "description": metadata.description,
            "goal": {
                "name": self.goal.name,
                "description": self.goal.description,
            },
            "nodes": [n.id for n in self.nodes],
            "edges": [e.id for e in self.edges],
            "entry_node": self.entry_node,
            "entry_points": self.entry_points,
            "pause_nodes": self.pause_nodes,
            "terminal_nodes": self.terminal_nodes,
            "client_facing_nodes": [n.id for n in self.nodes if n.client_facing],
        }

    def validate(self):
        """Validate agent structure."""
        errors = []
        warnings = []

        node_ids = {node.id for node in self.nodes}
        for edge in self.edges:
            if edge.source not in node_ids:
                errors.append(f"Edge {edge.id}: source '{edge.source}' not found")
            if edge.target not in node_ids:
                errors.append(f"Edge {edge.id}: target '{edge.target}' not found")

        if self.entry_node not in node_ids:
            errors.append(f"Entry node '{self.entry_node}' not found")

        for terminal in self.terminal_nodes:
            if terminal not in node_ids:
                errors.append(f"Terminal node '{terminal}' not found")

        for ep_id, node_id in self.entry_points.items():
            if node_id not in node_ids:
                errors.append(
                    f"Entry point '{ep_id}' references unknown node '{node_id}'"
                )

        return {
            "valid": len(errors) == 0,
            "errors": errors,
            "warnings": warnings,
        }


# Create default instance
default_agent = DeepResearchAgent()
@@ -0,0 +1,46 @@
"""Runtime configuration."""

import json
from dataclasses import dataclass, field
from pathlib import Path


def _load_preferred_model() -> str:
    """Load preferred model from ~/.hive/configuration.json."""
    config_path = Path.home() / ".hive" / "configuration.json"
    if config_path.exists():
        try:
            with open(config_path) as f:
                config = json.load(f)
            llm = config.get("llm", {})
            if llm.get("provider") and llm.get("model"):
                return f"{llm['provider']}/{llm['model']}"
        except Exception:
            pass
    return "anthropic/claude-sonnet-4-20250514"


@dataclass
class RuntimeConfig:
    model: str = field(default_factory=_load_preferred_model)
    temperature: float = 0.7
    max_tokens: int = 40000
    api_key: str | None = None
    api_base: str | None = None


default_config = RuntimeConfig()


@dataclass
class AgentMetadata:
    name: str = "Deep Research Agent"
    version: str = "1.0.0"
    description: str = (
        "Interactive research agent that rigorously investigates topics through "
        "multi-source search, quality evaluation, and synthesis - with TUI conversation "
        "at key checkpoints for user guidance and feedback."
    )


metadata = AgentMetadata()
@@ -0,0 +1,9 @@
{
  "hive-tools": {
    "transport": "stdio",
    "command": "python",
    "args": ["mcp_server.py", "--stdio"],
    "cwd": "../../tools",
    "description": "Hive tools MCP server providing web_search, web_scrape, and write_to_file"
  }
}
@@ -0,0 +1,162 @@
|
||||
"""Node definitions for Deep Research Agent."""
|
||||
|
||||
from framework.graph import NodeSpec
|
||||
|
||||
# Node 1: Intake (client-facing)
|
||||
# Brief conversation to clarify what the user wants researched.
|
||||
intake_node = NodeSpec(
|
||||
id="intake",
|
||||
name="Research Intake",
|
||||
description="Discuss the research topic with the user, clarify scope, and confirm direction",
|
||||
node_type="event_loop",
|
||||
client_facing=True,
|
||||
input_keys=["topic"],
|
||||
output_keys=["research_brief"],
|
||||
system_prompt="""\
|
||||
You are a research intake specialist. The user wants to research a topic.
Have a brief conversation to clarify what they need.

**STEP 1 — Read and respond (text only, NO tool calls):**
1. Read the topic provided
2. If it's vague, ask 1-2 clarifying questions (scope, angle, depth)
3. If it's already clear, confirm your understanding and ask the user to confirm

Keep it short. Don't over-ask.

**STEP 2 — After the user confirms, call set_output:**
- set_output("research_brief", "A clear paragraph describing exactly what to research, \
what questions to answer, what scope to cover, and how deep to go.")
""",
    tools=[],
)


# Node 2: Research
# The workhorse — searches the web, fetches content, analyzes sources.
# One node with both tools avoids the context-passing overhead of 5 separate nodes.
research_node = NodeSpec(
    id="research",
    name="Research",
    description="Search the web, fetch source content, and compile findings",
    node_type="event_loop",
    max_node_visits=3,
    input_keys=["research_brief", "feedback"],
    output_keys=["findings", "sources", "gaps"],
    nullable_output_keys=["feedback"],
    system_prompt="""\
You are a research agent. Given a research brief, find and analyze sources.

If feedback is provided, this is a follow-up round — focus on the gaps identified.

Work in phases:
1. **Search**: Use web_search with 3-5 diverse queries covering different angles.
   Prioritize authoritative sources (.edu, .gov, established publications).
2. **Fetch**: Use web_scrape on the most promising URLs (aim for 5-8 sources).
   Skip URLs that fail. Extract the substantive content.
3. **Analyze**: Review what you've collected. Identify key findings, themes,
   and any contradictions between sources.

Important:
- Work in batches of 3-4 tool calls at a time to manage context
- After each batch, assess whether you have enough material
- Prefer quality over quantity — 5 good sources beat 15 thin ones
- Track which URL each finding comes from (you'll need citations later)

When done, use set_output:
- set_output("findings", "Structured summary: key findings with source URLs for each claim. \
Include themes, contradictions, and confidence levels.")
- set_output("sources", [{"url": "...", "title": "...", "summary": "..."}])
- set_output("gaps", "What aspects of the research brief are NOT well-covered yet, if any.")
""",
    tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
)

# Node 3: Review (client-facing)
|
||||
# Shows the user what was found and asks whether to dig deeper or proceed.
|
||||
review_node = NodeSpec(
|
||||
id="review",
|
||||
name="Review Findings",
|
||||
description="Present findings to user and decide whether to research more or write the report",
|
||||
node_type="event_loop",
|
||||
client_facing=True,
|
||||
max_node_visits=3,
|
||||
input_keys=["findings", "sources", "gaps", "research_brief"],
|
||||
output_keys=["needs_more_research", "feedback"],
|
||||
system_prompt="""\
|
||||
Present the research findings to the user clearly and concisely.
|
||||
|
||||
**STEP 1 — Present (your first message, text only, NO tool calls):**
|
||||
1. **Summary** (2-3 sentences of what was found)
|
||||
2. **Key Findings** (bulleted, with confidence levels)
|
||||
3. **Sources Used** (count and quality assessment)
|
||||
4. **Gaps** (what's still unclear or under-covered)
|
||||
|
||||
End by asking: Are they satisfied, or do they want deeper research? \
|
||||
Should we proceed to writing the final report?
|
||||
|
||||
**STEP 2 — After the user responds, call set_output:**
|
||||
- set_output("needs_more_research", "true") — if they want more
|
||||
- set_output("needs_more_research", "false") — if they're satisfied
|
||||
- set_output("feedback", "What the user wants explored further, or empty string")
|
||||
""",
|
||||
tools=[],
|
||||
)
|
||||
|
||||
# Node 4: Report (client-facing)
|
||||
# Writes an HTML report, serves the link to the user, and answers follow-ups.
|
||||
report_node = NodeSpec(
|
||||
id="report",
|
||||
name="Write & Deliver Report",
|
||||
description="Write a cited HTML report from the findings and present it to the user",
|
||||
node_type="event_loop",
|
||||
client_facing=True,
|
||||
input_keys=["findings", "sources", "research_brief"],
|
||||
output_keys=["delivery_status"],
|
||||
system_prompt="""\
|
||||
Write a comprehensive research report as an HTML file and present it to the user.
|
||||
|
||||
**STEP 1 — Write the HTML report (tool calls, NO text to user yet):**
|
||||
|
||||
1. Compose a complete, self-contained HTML document with embedded CSS styling.
|
||||
Use a clean, readable design: max-width container, pleasant typography,
|
||||
numbered citation links, a table of contents, and a references section.
|
||||
|
||||
Report structure inside the HTML:
|
||||
- Title & date
|
||||
- Executive Summary (2-3 paragraphs)
|
||||
- Table of Contents
|
||||
- Findings (organized by theme, with [n] citation links)
|
||||
- Analysis (synthesis, implications, areas of debate)
|
||||
- Conclusion (key takeaways, confidence assessment)
|
||||
- References (numbered list with clickable URLs)
|
||||
|
||||
Requirements:
|
||||
- Every factual claim must cite its source with [n] notation
|
||||
- Be objective — present multiple viewpoints where sources disagree
|
||||
- Distinguish well-supported conclusions from speculation
|
||||
- Answer the original research questions from the brief
|
||||
|
||||
2. Save the HTML file:
|
||||
save_data(filename="report.html", data=<your_html>)
|
||||
|
||||
3. Get the clickable link:
|
||||
serve_file_to_user(filename="report.html", label="Research Report")
|
||||
|
||||
**STEP 2 — Present the link to the user (text only, NO tool calls):**
|
||||
|
||||
Tell the user the report is ready and include the file:// URI from
|
||||
serve_file_to_user so they can click it to open. Give a brief summary
|
||||
of what the report covers. Ask if they have questions.
|
||||
|
||||
**STEP 3 — After the user responds:**
|
||||
- Answer follow-up questions from the research material
|
||||
- When the user is satisfied: set_output("delivery_status", "completed")
|
||||
""",
|
||||
tools=["save_data", "serve_file_to_user", "load_data", "list_data_files"],
|
||||
)
|
||||
|
||||
__all__ = [
|
||||
"intake_node",
|
||||
"research_node",
|
||||
"review_node",
|
||||
"report_node",
|
||||
]
|
||||
@@ -0,0 +1,57 @@
# Twitter Outreach Agent

Personalized email outreach powered by Twitter/X research.

## What it does

1. **Intake** — Collects the target's Twitter handle, outreach purpose, and recipient email
2. **Research** — Searches and scrapes the target's Twitter/X profile for bio, tweets, interests
3. **Draft & Review** — Crafts a personalized email and presents it for your approval (with iteration)
4. **Send** — Sends the approved email
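The four steps form a simple data-flow contract: each node may only consume keys that an earlier node produced. A minimal plain-Python sketch of that check, using the input/output key names from the exported agent spec (independent of the framework itself):

```python
# Input/output keys per node, as declared in the agent spec.
NODE_IO = {
    "intake":       {"in": [], "out": ["twitter_handle", "outreach_context", "recipient_email"]},
    "research":     {"in": ["twitter_handle"], "out": ["profile_summary"]},
    "draft-review": {"in": ["outreach_context", "recipient_email", "profile_summary"],
                     "out": ["approved_email"]},
    "send":         {"in": ["approved_email", "recipient_email"], "out": ["delivery_status"]},
}

PIPELINE = ["intake", "research", "draft-review", "send"]

def check_wiring(order, node_io):
    """Verify every node's inputs are produced by some earlier node."""
    available = set()
    for node in order:
        missing = [k for k in node_io[node]["in"] if k not in available]
        if missing:
            return False, (node, missing)
        available.update(node_io[node]["out"])
    return True, None

ok, problem = check_wiring(PIPELINE, NODE_IO)
print(ok)  # True: the linear order satisfies every dependency
```

Reordering the pipeline (say, putting `send` first) makes the check fail, which is why the graph below is strictly linear.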
## Usage

```bash
# Validate the agent structure
PYTHONPATH=core:exports uv run python -m twitter_outreach validate

# Show agent info
PYTHONPATH=core:exports uv run python -m twitter_outreach info

# Run in mock mode (no API calls)
PYTHONPATH=core:exports uv run python -m twitter_outreach run --mock

# Launch the TUI
PYTHONPATH=core:exports uv run python -m twitter_outreach tui

# Interactive shell
PYTHONPATH=core:exports uv run python -m twitter_outreach shell
```

## Architecture

```
intake → research → draft-review → send
```
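Each arrow above is an on-success edge between adjacent nodes, so the edge list can be derived mechanically from the pipeline order. A plain-Python sketch (the real agent wires these up as framework `EdgeSpec` objects in `agent.py`; the dict fields here mirror the exported JSON spec):

```python
# One on-success edge per adjacent pair in the linear pipeline.
PIPELINE = ["intake", "research", "draft-review", "send"]

edges = [
    {
        "id": f"{src}-to-{dst}",
        "source": src,
        "target": dst,
        "condition": "on_success",
        "priority": 1,
    }
    for src, dst in zip(PIPELINE, PIPELINE[1:])
]

print([e["id"] for e in edges])
# ['intake-to-research', 'research-to-draft-review', 'draft-review-to-send']
```

The generated ids match the `edges` section of the exported spec exactly.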
## Tools Used

- `web_search` — Search for Twitter profiles and public info
- `web_scrape` — Read Twitter/X profile pages
- `send_email` — Send the approved outreach email

## Nodes

| Node | Type | Client-Facing | Description |
|------|------|:---:|-------------|
| `intake` | event_loop | Yes | Collect target info from user |
| `research` | event_loop | No | Research Twitter/X profile |
| `draft-review` | event_loop | Yes | Draft email, iterate with user |
| `send` | event_loop | No | Send approved email |
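Only the non-client-facing nodes use tools (`research` and `send`); the exported spec's `required_tools` list is simply the union of per-node tool lists. A quick sanity check in plain Python, using the tool assignments from the spec:

```python
# Per-node tool lists, as declared in the agent spec.
NODE_TOOLS = {
    "intake": [],
    "research": ["web_search", "web_scrape"],
    "draft-review": [],
    "send": ["send_email"],
}

# required_tools is the deduplicated union across all nodes.
required_tools = sorted({t for tools in NODE_TOOLS.values() for t in tools})
print(required_tools)  # ['send_email', 'web_scrape', 'web_search']
```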
## Constraints

- **No Spam** — No spammy language, clickbait, or aggressive sales tactics
- **Approval Required** — Never sends without explicit user approval
- **Tone** — Professional, authentic, conversational
- **Privacy** — Only uses publicly available information
@@ -0,0 +1,23 @@
"""
Twitter Outreach Agent - Personalized email outreach powered by Twitter/X research.

Reads a target's Twitter/X profile, crafts a personalized outreach email
referencing their specific activity, and sends it after user approval.
"""

from .agent import TwitterOutreachAgent, default_agent, goal, nodes, edges
from .config import RuntimeConfig, AgentMetadata, default_config, metadata

__version__ = "1.0.0"

__all__ = [
    "TwitterOutreachAgent",
    "default_agent",
    "goal",
    "nodes",
    "edges",
    "RuntimeConfig",
    "AgentMetadata",
    "default_config",
    "metadata",
]
@@ -0,0 +1,210 @@
"""
CLI entry point for Twitter Outreach Agent.

Uses AgentRuntime for TUI support with client-facing interaction.
"""

import asyncio
import json
import logging
import sys

import click

from .agent import default_agent, TwitterOutreachAgent


def setup_logging(verbose=False, debug=False):
    """Configure logging for execution visibility."""
    if debug:
        level, fmt = logging.DEBUG, "%(asctime)s %(name)s: %(message)s"
    elif verbose:
        level, fmt = logging.INFO, "%(message)s"
    else:
        level, fmt = logging.WARNING, "%(levelname)s: %(message)s"
    logging.basicConfig(level=level, format=fmt, stream=sys.stderr)
    logging.getLogger("framework").setLevel(level)


@click.group()
@click.version_option(version="1.0.0")
def cli():
    """Twitter Outreach Agent - Personalized email outreach powered by Twitter/X research."""
    pass


@cli.command()
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--quiet", "-q", is_flag=True, help="Only output result JSON")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def run(mock, quiet, verbose, debug):
    """Execute the outreach workflow."""
    if not quiet:
        setup_logging(verbose=verbose, debug=debug)

    result = asyncio.run(default_agent.run({}, mock_mode=mock))

    output_data = {
        "success": result.success,
        "steps_executed": result.steps_executed,
        "output": result.output,
    }
    if result.error:
        output_data["error"] = result.error

    click.echo(json.dumps(output_data, indent=2, default=str))
    sys.exit(0 if result.success else 1)


@cli.command()
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def tui(mock, verbose, debug):
    """Launch the TUI dashboard for interactive outreach."""
    setup_logging(verbose=verbose, debug=debug)

    try:
        from framework.tui.app import AdenTUI
    except ImportError:
        click.echo(
            "TUI requires the 'textual' package. Install with: pip install textual"
        )
        sys.exit(1)

    from pathlib import Path

    from framework.llm import LiteLLMProvider
    from framework.runner.tool_registry import ToolRegistry
    from framework.runtime.agent_runtime import create_agent_runtime
    from framework.runtime.event_bus import EventBus
    from framework.runtime.execution_stream import EntryPointSpec

    async def run_with_tui():
        agent = TwitterOutreachAgent()

        agent._event_bus = EventBus()
        agent._tool_registry = ToolRegistry()

        storage_path = Path.home() / ".hive" / "twitter_outreach"
        storage_path.mkdir(parents=True, exist_ok=True)

        mcp_config_path = Path(__file__).parent / "mcp_servers.json"
        if mcp_config_path.exists():
            agent._tool_registry.load_mcp_config(mcp_config_path)

        llm = None
        if not mock:
            llm = LiteLLMProvider(
                model=agent.config.model,
                api_key=agent.config.api_key,
                api_base=agent.config.api_base,
            )

        tools = list(agent._tool_registry.get_tools().values())
        tool_executor = agent._tool_registry.get_executor()
        graph = agent._build_graph()

        runtime = create_agent_runtime(
            graph=graph,
            goal=agent.goal,
            storage_path=storage_path,
            entry_points=[
                EntryPointSpec(
                    id="start",
                    name="Start Outreach",
                    entry_node="intake",
                    trigger_type="manual",
                    isolation_level="isolated",
                ),
            ],
            llm=llm,
            tools=tools,
            tool_executor=tool_executor,
        )

        await runtime.start()

        try:
            app = AdenTUI(runtime)
            await app.run_async()
        finally:
            await runtime.stop()

    asyncio.run(run_with_tui())


@cli.command()
@click.option("--json", "output_json", is_flag=True)
def info(output_json):
    """Show agent information."""
    info_data = default_agent.info()
    if output_json:
        click.echo(json.dumps(info_data, indent=2))
    else:
        click.echo(f"Agent: {info_data['name']}")
        click.echo(f"Version: {info_data['version']}")
        click.echo(f"Description: {info_data['description']}")
        click.echo(f"\nNodes: {', '.join(info_data['nodes'])}")
        click.echo(f"Client-facing: {', '.join(info_data['client_facing_nodes'])}")
        click.echo(f"Entry: {info_data['entry_node']}")
        click.echo(f"Terminal: {', '.join(info_data['terminal_nodes'])}")


@cli.command()
def validate():
    """Validate agent structure."""
    validation = default_agent.validate()
    if validation["valid"]:
        click.echo("Agent is valid")
        if validation["warnings"]:
            for warning in validation["warnings"]:
                click.echo(f"  WARNING: {warning}")
    else:
        click.echo("Agent has errors:")
        for error in validation["errors"]:
            click.echo(f"  ERROR: {error}")
    sys.exit(0 if validation["valid"] else 1)


@cli.command()
@click.option("--verbose", "-v", is_flag=True)
def shell(verbose):
    """Interactive outreach session (CLI, no TUI)."""
    asyncio.run(_interactive_shell(verbose))


async def _interactive_shell(verbose=False):
    """Async interactive shell."""
    setup_logging(verbose=verbose)

    click.echo("=== Twitter Outreach Agent ===")
    click.echo("Starting outreach workflow...\n")

    agent = TwitterOutreachAgent()
    await agent.start()

    try:
        result = await agent.trigger_and_wait("start", {})

        if result is None:
            click.echo("\n[Execution timed out]\n")
        elif result.success:
            output = result.output
            status = output.get("delivery_status", "unknown")
            click.echo(f"\nOutreach complete! Delivery status: {status}")
        else:
            click.echo(f"\nOutreach failed: {result.error}")
    except KeyboardInterrupt:
        click.echo("\nGoodbye!")
    except Exception as e:
        click.echo(f"Error: {e}", err=True)
        import traceback

        traceback.print_exc()
    finally:
        await agent.stop()


if __name__ == "__main__":
    cli()
@@ -0,0 +1,265 @@
{
  "agent": {
    "id": "twitter_outreach",
    "name": "Personalized Twitter Outreach",
    "version": "1.0.0",
    "description": "Given a Twitter/X handle and outreach context, research the target's profile (bio, tweets, interests), craft a personalized outreach email referencing their specific activity, and send it after user approval."
  },
  "graph": {
    "id": "twitter_outreach-graph",
    "goal_id": "twitter-outreach",
    "version": "1.0.0",
    "entry_node": "intake",
    "entry_points": {
      "start": "intake"
    },
    "pause_nodes": [],
    "terminal_nodes": [
      "send"
    ],
    "nodes": [
      {
        "id": "intake",
        "name": "Intake",
        "description": "Collect the target Twitter handle, outreach purpose, and recipient email from the user",
        "node_type": "event_loop",
        "input_keys": [],
        "output_keys": [
          "twitter_handle",
          "outreach_context",
          "recipient_email"
        ],
        "nullable_output_keys": [],
        "input_schema": {},
        "output_schema": {},
        "system_prompt": "You are the intake assistant for a personalized Twitter outreach agent.\n\n**STEP 1 \u2014 Respond to the user (text only, NO tool calls):**\nGreet the user and ask them to provide:\n1. The Twitter/X handle of the person they want to reach out to\n2. The purpose/context of the outreach (e.g., partnership opportunity, hiring, collaboration, introduction)\n3. The recipient's email address\n\nBe friendly and concise. If the user provides partial info, ask for what's missing.\n\n**STEP 2 \u2014 After the user provides ALL three pieces of information, call set_output:**\n- set_output(\"twitter_handle\", \"<the Twitter handle, including @>\")\n- set_output(\"outreach_context\", \"<the outreach purpose/context>\")\n- set_output(\"recipient_email\", \"<the email address>\")",
        "tools": [],
        "model": null,
        "function": null,
        "routes": {},
        "max_retries": 3,
        "retry_on": [],
        "max_node_visits": 1,
        "output_model": null,
        "max_validation_retries": 2,
        "client_facing": true
      },
      {
        "id": "research",
        "name": "Research",
        "description": "Research the target's Twitter/X profile \u2014 bio, recent tweets, interests, and topics they engage with",
        "node_type": "event_loop",
        "input_keys": [
          "twitter_handle"
        ],
        "output_keys": [
          "profile_summary"
        ],
        "nullable_output_keys": [],
        "input_schema": {},
        "output_schema": {},
        "system_prompt": "You are a Twitter/X profile researcher. Your job is to thoroughly research a person's public Twitter/X presence.\n\nGiven the Twitter handle provided in your inputs, do the following:\n\n1. Use web_search to find their Twitter/X profile and any relevant public information about them.\n2. Use web_scrape to read their Twitter/X profile page (try https://x.com/{handle} or https://twitter.com/{handle}).\n3. Extract and analyze:\n   - Their bio and self-description\n   - Recent tweets and topics they post about\n   - Professional interests, projects, or accomplishments\n   - Any recurring themes or passions\n   - Specific tweets worth referencing in outreach\n4. Look for additional context (personal website, blog, other social profiles mentioned in bio).\n\nCompile a comprehensive profile summary that would help someone write a highly personalized outreach email.\n\nUse set_output(\"profile_summary\", <your detailed summary as a string>) to store your findings.\n\nDo NOT return raw JSON. Use the set_output tool to produce outputs.",
        "tools": [
          "web_search",
          "web_scrape"
        ],
        "model": null,
        "function": null,
        "routes": {},
        "max_retries": 3,
        "retry_on": [],
        "max_node_visits": 1,
        "output_model": null,
        "max_validation_retries": 2,
        "client_facing": false
      },
      {
        "id": "draft-review",
        "name": "Draft & Review",
        "description": "Draft a personalized outreach email using profile research, present to user for review, and iterate until approved",
        "node_type": "event_loop",
        "input_keys": [
          "outreach_context",
          "recipient_email",
          "profile_summary"
        ],
        "output_keys": [
          "approved_email"
        ],
        "nullable_output_keys": [],
        "input_schema": {},
        "output_schema": {},
        "system_prompt": "You are an expert email copywriter specializing in personalized outreach.\n\nYou have been given:\n- A profile summary of the target person (from their Twitter/X)\n- The outreach context/purpose\n- The recipient's email address\n\n**STEP 1 \u2014 Draft and present the email (text only, NO tool calls):**\n\nUsing the profile research, draft a personalized outreach email that:\n- References at least 2 specific details from their Twitter profile (tweets, interests, projects)\n- Clearly connects to the outreach purpose\n- Includes a specific, relevant call to action\n- Is professional but conversational and authentic \u2014 NOT spammy, robotic, or overly formal\n- Is concise (under 300 words)\n\nPresent the complete email draft to the user, formatted clearly with Subject line and Body.\nThen ask: \"Would you like any changes, or shall I send this?\"\n\nIf the user requests changes, revise the email and present the updated version. Keep iterating until the user is satisfied.\n\n**STEP 2 \u2014 After the user explicitly approves the email, call set_output:**\n- set_output(\"approved_email\", \"<the final approved email text including subject line>\")",
        "tools": [],
        "model": null,
        "function": null,
        "routes": {},
        "max_retries": 3,
        "retry_on": [],
        "max_node_visits": 1,
        "output_model": null,
        "max_validation_retries": 2,
        "client_facing": true
      },
      {
        "id": "send",
        "name": "Send",
        "description": "Send the approved outreach email to the recipient",
        "node_type": "event_loop",
        "input_keys": [
          "approved_email",
          "recipient_email"
        ],
        "output_keys": [
          "delivery_status"
        ],
        "nullable_output_keys": [],
        "input_schema": {},
        "output_schema": {},
        "system_prompt": "You are responsible for sending the approved outreach email.\n\nYou have the approved email text and the recipient's email address in your inputs.\n\nParse the subject line and body from the approved_email, then use the send_email tool to send it to the recipient_email address.\n\nAfter sending successfully, call:\n- set_output(\"delivery_status\", \"sent\")\n\nIf there is an error sending, call:\n- set_output(\"delivery_status\", \"failed: <error details>\")\n\nDo NOT return raw JSON. Use the set_output tool to produce outputs.",
        "tools": [
          "send_email"
        ],
        "model": null,
        "function": null,
        "routes": {},
        "max_retries": 3,
        "retry_on": [],
        "max_node_visits": 1,
        "output_model": null,
        "max_validation_retries": 2,
        "client_facing": false
      }
    ],
    "edges": [
      {
        "id": "intake-to-research",
        "source": "intake",
        "target": "research",
        "condition": "on_success",
        "condition_expr": null,
        "priority": 1,
        "input_mapping": {}
      },
      {
        "id": "research-to-draft-review",
        "source": "research",
        "target": "draft-review",
        "condition": "on_success",
        "condition_expr": null,
        "priority": 1,
        "input_mapping": {}
      },
      {
        "id": "draft-review-to-send",
        "source": "draft-review",
        "target": "send",
        "condition": "on_success",
        "condition_expr": null,
        "priority": 1,
        "input_mapping": {}
      }
    ],
    "max_steps": 100,
    "max_retries_per_node": 3,
    "description": "Given a Twitter/X handle and outreach context, research the target's profile (bio, tweets, interests), craft a personalized outreach email referencing their specific activity, and send it after user approval.",
    "created_at": "2026-02-05T13:32:44.573661"
  },
  "goal": {
    "id": "twitter-outreach",
    "name": "Personalized Twitter Outreach",
    "description": "Given a Twitter/X handle and outreach context, research the target's profile (bio, tweets, interests), craft a personalized outreach email referencing their specific activity, and send it after user approval.",
    "status": "draft",
    "success_criteria": [
      {
        "id": "profile-research",
        "description": "Agent extracts meaningful information from target's Twitter profile including bio, recent tweets, interests, and topics they engage with",
        "metric": "research_quality",
        "target": "Identifies at least 3 distinct profile details",
        "weight": 0.25,
        "met": false
      },
      {
        "id": "email-personalization",
        "description": "Drafted email references at least 2 specific details from the target's Twitter profile",
        "metric": "personalization_score",
        "target": "At least 2 specific references to profile content",
        "weight": 0.25,
        "met": false
      },
      {
        "id": "clear-cta",
        "description": "Email includes a specific relevant call to action",
        "metric": "cta_present",
        "target": "Email contains clear call to action",
        "weight": 0.15,
        "met": false
      },
      {
        "id": "user-approval-gate",
        "description": "Email is presented to user for review and only sent after explicit approval with opportunity to request edits",
        "metric": "approval_obtained",
        "target": "User explicitly approves before send",
        "weight": 0.2,
        "met": false
      },
      {
        "id": "successful-delivery",
        "description": "Email is sent successfully via the send_email tool",
        "metric": "delivery_status",
        "target": "Email sent without errors",
        "weight": 0.15,
        "met": false
      }
    ],
    "constraints": [
      {
        "id": "no-spam",
        "description": "Email must not use spammy language, clickbait, or aggressive sales tactics",
        "constraint_type": "quality",
        "category": "content",
        "check": ""
      },
      {
        "id": "approval-required",
        "description": "Must never send an email without explicit user approval",
        "constraint_type": "safety",
        "category": "process",
        "check": ""
      },
      {
        "id": "tone-appropriate",
        "description": "Email tone must be professional, authentic, and conversational \u2014 not robotic or overly formal",
        "constraint_type": "quality",
        "category": "content",
        "check": ""
      },
      {
        "id": "privacy-respect",
        "description": "Only use publicly available information from the target's Twitter profile",
        "constraint_type": "safety",
        "category": "ethics",
        "check": ""
      }
    ],
    "context": {},
    "required_capabilities": [],
    "input_schema": {},
    "output_schema": {},
    "version": "1.0.0",
    "parent_version": null,
    "evolution_reason": null,
    "created_at": "2026-02-05 13:30:59.934460",
    "updated_at": "2026-02-05 13:30:59.934462"
  },
  "required_tools": [
    "web_scrape",
    "send_email",
    "web_search"
  ],
  "metadata": {
    "created_at": "2026-02-05T13:32:44.573712",
    "node_count": 4,
    "edge_count": 3
  }
}
@@ -0,0 +1,310 @@
"""Agent graph construction for Twitter Outreach Agent."""

from framework.graph import EdgeSpec, EdgeCondition, Goal, SuccessCriterion, Constraint
from framework.graph.edge import GraphSpec
from framework.graph.executor import ExecutionResult, GraphExecutor
from framework.runtime.event_bus import EventBus
from framework.runtime.core import Runtime
from framework.llm import LiteLLMProvider
from framework.runner.tool_registry import ToolRegistry

from .config import default_config, metadata
from .nodes import (
    intake_node,
    research_node,
    draft_review_node,
    send_node,
)

# Goal definition
goal = Goal(
    id="twitter-outreach",
    name="Personalized Twitter Outreach",
    description=(
        "Given a Twitter/X handle and outreach context, research the target's profile "
        "(bio, tweets, interests), craft a personalized outreach email referencing their "
        "specific activity, and send it after user approval."
    ),
    success_criteria=[
        SuccessCriterion(
            id="profile-research",
            description="Agent extracts meaningful information from target's Twitter profile including bio, recent tweets, interests, and topics they engage with",
            metric="research_quality",
            target="Identifies at least 3 distinct profile details",
            weight=0.25,
        ),
        SuccessCriterion(
            id="email-personalization",
            description="Drafted email references at least 2 specific details from the target's Twitter profile",
            metric="personalization_score",
            target="At least 2 specific references to profile content",
            weight=0.25,
        ),
        SuccessCriterion(
            id="clear-cta",
            description="Email includes a specific relevant call to action",
            metric="cta_present",
            target="Email contains clear call to action",
            weight=0.15,
        ),
        SuccessCriterion(
            id="user-approval-gate",
            description="Email is presented to user for review and only sent after explicit approval with opportunity to request edits",
            metric="approval_obtained",
            target="User explicitly approves before send",
            weight=0.2,
        ),
        SuccessCriterion(
            id="successful-delivery",
            description="Email is sent successfully via the send_email tool",
            metric="delivery_status",
            target="Email sent without errors",
            weight=0.15,
        ),
    ],
    constraints=[
        Constraint(
            id="no-spam",
            description="Email must not use spammy language, clickbait, or aggressive sales tactics",
            constraint_type="quality",
            category="content",
        ),
        Constraint(
            id="approval-required",
            description="Must never send an email without explicit user approval",
            constraint_type="safety",
            category="process",
        ),
        Constraint(
            id="tone-appropriate",
            description="Email tone must be professional, authentic, and conversational — not robotic or overly formal",
            constraint_type="quality",
            category="content",
        ),
        Constraint(
            id="privacy-respect",
            description="Only use publicly available information from the target's Twitter profile",
            constraint_type="safety",
            category="ethics",
        ),
    ],
)

# Node list
nodes = [
    intake_node,
    research_node,
    draft_review_node,
    send_node,
]

# Edge definitions
edges = [
    EdgeSpec(
        id="intake-to-research",
        source="intake",
        target="research",
        condition=EdgeCondition.ON_SUCCESS,
        priority=1,
    ),
    EdgeSpec(
        id="research-to-draft-review",
        source="research",
        target="draft-review",
        condition=EdgeCondition.ON_SUCCESS,
        priority=1,
    ),
    EdgeSpec(
        id="draft-review-to-send",
        source="draft-review",
        target="send",
        condition=EdgeCondition.ON_SUCCESS,
        priority=1,
    ),
]

# Graph configuration
entry_node = "intake"
entry_points = {"start": "intake"}
pause_nodes = []
terminal_nodes = ["send"]


class TwitterOutreachAgent:
    """
    Twitter Outreach Agent — 4-node pipeline with user approval checkpoint.

    Flow: intake -> research -> draft-review -> send
    """

    def __init__(self, config=None):
        self.config = config or default_config
        self.goal = goal
        self.nodes = nodes
        self.edges = edges
        self.entry_node = entry_node
        self.entry_points = entry_points
        self.pause_nodes = pause_nodes
        self.terminal_nodes = terminal_nodes
        self._executor: GraphExecutor | None = None
        self._graph: GraphSpec | None = None
        self._event_bus: EventBus | None = None
        self._tool_registry: ToolRegistry | None = None

    def _build_graph(self) -> GraphSpec:
        """Build the GraphSpec."""
        return GraphSpec(
            id="twitter-outreach-graph",
            goal_id=self.goal.id,
            version="1.0.0",
            entry_node=self.entry_node,
            entry_points=self.entry_points,
            terminal_nodes=self.terminal_nodes,
            pause_nodes=self.pause_nodes,
            nodes=self.nodes,
            edges=self.edges,
            default_model=self.config.model,
            max_tokens=self.config.max_tokens,
            loop_config={
                "max_iterations": 50,
                "max_tool_calls_per_turn": 10,
                "max_history_tokens": 32000,
            },
        )

    def _setup(self, mock_mode=False) -> GraphExecutor:
        """Set up the executor with all components."""
        from pathlib import Path

        storage_path = Path.home() / ".hive" / "twitter_outreach"
        storage_path.mkdir(parents=True, exist_ok=True)

        self._event_bus = EventBus()
        self._tool_registry = ToolRegistry()

        mcp_config_path = Path(__file__).parent / "mcp_servers.json"
        if mcp_config_path.exists():
            self._tool_registry.load_mcp_config(mcp_config_path)

        llm = None
        if not mock_mode:
            llm = LiteLLMProvider(
                model=self.config.model,
                api_key=self.config.api_key,
                api_base=self.config.api_base,
            )

        tool_executor = self._tool_registry.get_executor()
        tools = list(self._tool_registry.get_tools().values())

        self._graph = self._build_graph()
        runtime = Runtime(storage_path)

        self._executor = GraphExecutor(
            runtime=runtime,
            llm=llm,
            tools=tools,
            tool_executor=tool_executor,
            event_bus=self._event_bus,
            storage_path=storage_path,
            loop_config=self._graph.loop_config,
        )

        return self._executor

    async def start(self, mock_mode=False) -> None:
        """Set up the agent (initialize executor and tools)."""
        if self._executor is None:
            self._setup(mock_mode=mock_mode)

    async def stop(self) -> None:
        """Clean up resources."""
        self._executor = None
        self._event_bus = None

    async def trigger_and_wait(
        self,
        entry_point: str,
        input_data: dict,
        timeout: float | None = None,
        session_state: dict | None = None,
    ) -> ExecutionResult | None:
        """Execute the graph and wait for completion."""
        if self._executor is None:
            raise RuntimeError("Agent not started. Call start() first.")
        if self._graph is None:
            raise RuntimeError("Graph not built. Call start() first.")
return await self._executor.execute(
|
||||
graph=self._graph,
|
||||
goal=self.goal,
|
||||
input_data=input_data,
|
||||
session_state=session_state,
|
||||
)
|
||||
|
||||
async def run(
|
||||
self, context: dict, mock_mode=False, session_state=None
|
||||
) -> ExecutionResult:
|
||||
"""Run the agent (convenience method for single execution)."""
|
||||
await self.start(mock_mode=mock_mode)
|
||||
try:
|
||||
result = await self.trigger_and_wait(
|
||||
"start", context, session_state=session_state
|
||||
)
|
||||
return result or ExecutionResult(success=False, error="Execution timeout")
|
||||
finally:
|
||||
await self.stop()
|
||||
|
||||
def info(self):
|
||||
"""Get agent information."""
|
||||
return {
|
||||
"name": metadata.name,
|
||||
"version": metadata.version,
|
||||
"description": metadata.description,
|
||||
"goal": {
|
||||
"name": self.goal.name,
|
||||
"description": self.goal.description,
|
||||
},
|
||||
"nodes": [n.id for n in self.nodes],
|
||||
"edges": [e.id for e in self.edges],
|
||||
"entry_node": self.entry_node,
|
||||
"entry_points": self.entry_points,
|
||||
"pause_nodes": self.pause_nodes,
|
||||
"terminal_nodes": self.terminal_nodes,
|
||||
"client_facing_nodes": [n.id for n in self.nodes if n.client_facing],
|
||||
}
|
||||
|
||||
def validate(self):
|
||||
"""Validate agent structure."""
|
||||
errors = []
|
||||
warnings = []
|
||||
|
||||
node_ids = {node.id for node in self.nodes}
|
||||
for edge in self.edges:
|
||||
if edge.source not in node_ids:
|
||||
errors.append(f"Edge {edge.id}: source '{edge.source}' not found")
|
||||
if edge.target not in node_ids:
|
||||
errors.append(f"Edge {edge.id}: target '{edge.target}' not found")
|
||||
|
||||
if self.entry_node not in node_ids:
|
||||
errors.append(f"Entry node '{self.entry_node}' not found")
|
||||
|
||||
for terminal in self.terminal_nodes:
|
||||
if terminal not in node_ids:
|
||||
errors.append(f"Terminal node '{terminal}' not found")
|
||||
|
||||
for ep_id, node_id in self.entry_points.items():
|
||||
if node_id not in node_ids:
|
||||
errors.append(
|
||||
f"Entry point '{ep_id}' references unknown node '{node_id}'"
|
||||
)
|
||||
|
||||
return {
|
||||
"valid": len(errors) == 0,
|
||||
"errors": errors,
|
||||
"warnings": warnings,
|
||||
}
|
||||
|
||||
|
||||
# Create default instance
|
||||
default_agent = TwitterOutreachAgent()
|
||||
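The `validate()` method above is plain set arithmetic over node and edge ids, so it can be exercised without the framework. A minimal standalone sketch, using hypothetical stand-in classes in place of `NodeSpec`/`EdgeSpec`:

```python
from dataclasses import dataclass


# Hypothetical stand-ins, just enough structure for the validation walk.
@dataclass
class Node:
    id: str


@dataclass
class Edge:
    id: str
    source: str
    target: str


def validate(nodes, edges, entry_node, terminal_nodes):
    """Check that every edge endpoint, the entry node, and all terminals exist."""
    errors = []
    node_ids = {n.id for n in nodes}
    for e in edges:
        if e.source not in node_ids:
            errors.append(f"Edge {e.id}: source '{e.source}' not found")
        if e.target not in node_ids:
            errors.append(f"Edge {e.id}: target '{e.target}' not found")
    if entry_node not in node_ids:
        errors.append(f"Entry node '{entry_node}' not found")
    for t in terminal_nodes:
        if t not in node_ids:
            errors.append(f"Terminal node '{t}' not found")
    return {"valid": len(errors) == 0, "errors": errors}


nodes = [Node("intake"), Node("research"), Node("draft-review"), Node("send")]
edges = [Edge("research-to-draft-review", "research", "draft-review")]
result = validate(nodes, edges, entry_node="intake", terminal_nodes=["send"])
```

With the four-node graph above, `result["valid"]` is true; pointing an edge at an unknown node id produces a matching error string instead.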
@@ -0,0 +1,45 @@
"""Runtime configuration."""

import json
from dataclasses import dataclass, field
from pathlib import Path


def _load_preferred_model() -> str:
    """Load preferred model from ~/.hive/configuration.json."""
    config_path = Path.home() / ".hive" / "configuration.json"
    if config_path.exists():
        try:
            with open(config_path) as f:
                config = json.load(f)
            llm = config.get("llm", {})
            if llm.get("provider") and llm.get("model"):
                return f"{llm['provider']}/{llm['model']}"
        except Exception:
            pass
    return "anthropic/claude-sonnet-4-20250514"


@dataclass
class RuntimeConfig:
    model: str = field(default_factory=_load_preferred_model)
    temperature: float = 0.7
    max_tokens: int = 40000
    api_key: str | None = None
    api_base: str | None = None


default_config = RuntimeConfig()


@dataclass
class AgentMetadata:
    name: str = "Twitter Outreach Agent"
    version: str = "1.0.0"
    description: str = (
        "Reads a target's Twitter/X profile, crafts a personalized outreach email "
        "referencing their specific activity, and sends it after user approval."
    )


metadata = AgentMetadata()
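The fallback logic in `_load_preferred_model` can be checked in isolation; this sketch parameterizes the path (a testing convenience not present in the file) and exercises both the missing-file and configured branches:

```python
import json
import tempfile
from pathlib import Path


def load_preferred_model(config_path: Path) -> str:
    """Mirror of _load_preferred_model, with the path injected for testing."""
    if config_path.exists():
        try:
            config = json.loads(config_path.read_text())
            llm = config.get("llm", {})
            if llm.get("provider") and llm.get("model"):
                return f"{llm['provider']}/{llm['model']}"
        except (OSError, json.JSONDecodeError):
            pass
    return "anthropic/claude-sonnet-4-20250514"


with tempfile.TemporaryDirectory() as d:
    path = Path(d) / "configuration.json"
    # File absent: falls back to the hardcoded default model string.
    missing = load_preferred_model(path)
    # File present with provider and model: returns "provider/model".
    path.write_text(json.dumps({"llm": {"provider": "openai", "model": "gpt-4o"}}))
    configured = load_preferred_model(path)
```

A config missing either `provider` or `model` also falls through to the default, which is why both keys are checked before the join.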
@@ -0,0 +1,9 @@
{
  "hive-tools": {
    "transport": "stdio",
    "command": "python",
    "args": ["mcp_server.py", "--stdio"],
    "cwd": "../../tools",
    "description": "Hive tools MCP server providing web_search, web_scrape, and send_email"
  }
}
@@ -0,0 +1,137 @@
"""Node definitions for Twitter Outreach Agent."""

from framework.graph import NodeSpec

# Node 1: Intake (client-facing)
# Collect the target Twitter handle, outreach purpose, and recipient email.
intake_node = NodeSpec(
    id="intake",
    name="Intake",
    description="Collect the target Twitter handle, outreach purpose, and recipient email from the user",
    node_type="event_loop",
    client_facing=True,
    input_keys=[],
    output_keys=["twitter_handle", "outreach_context", "recipient_email"],
    system_prompt="""\
You are the intake assistant for a personalized Twitter outreach agent.

**STEP 1 — Respond to the user (text only, NO tool calls):**
Greet the user and ask them to provide:
1. The Twitter/X handle of the person they want to reach out to
2. The purpose/context of the outreach (e.g., partnership opportunity, hiring, collaboration, introduction)
3. The recipient's email address

Be friendly and concise. If the user provides partial info, ask for what's missing.

**STEP 2 — After the user provides ALL three pieces of information, call set_output:**
- set_output("twitter_handle", "<the Twitter handle, including @>")
- set_output("outreach_context", "<the outreach purpose/context>")
- set_output("recipient_email", "<the email address>")
""",
    tools=[],
)

# Node 2: Research
# Searches the web and scrapes the target's Twitter/X profile to build a comprehensive summary.
research_node = NodeSpec(
    id="research",
    name="Research",
    description="Research the target's Twitter/X profile — bio, recent tweets, interests, and topics they engage with",
    node_type="event_loop",
    input_keys=["twitter_handle"],
    output_keys=["profile_summary"],
    system_prompt="""\
You are a Twitter/X profile researcher. Your job is to thoroughly research a person's public Twitter/X presence.

Given the Twitter handle provided in your inputs, do the following:

1. Use web_search to find their Twitter/X profile and any relevant public information about them.
2. Use web_scrape to read their Twitter/X profile page (try https://x.com/{handle} or https://twitter.com/{handle}).
3. Extract and analyze:
   - Their bio and self-description
   - Recent tweets and topics they post about
   - Professional interests, projects, or accomplishments
   - Any recurring themes or passions
   - Specific tweets worth referencing in outreach
4. Look for additional context (personal website, blog, other social profiles mentioned in bio).

Compile a comprehensive profile summary that would help someone write a highly personalized outreach email.

Use set_output("profile_summary", <your detailed summary as a string>) to store your findings.

Do NOT return raw JSON. Use the set_output tool to produce outputs.
""",
    tools=["web_search", "web_scrape"],
)

# Node 3: Draft & Review (client-facing)
# Drafts a personalized email, presents to user, iterates until approved.
draft_review_node = NodeSpec(
    id="draft-review",
    name="Draft & Review",
    description="Draft a personalized outreach email using profile research, present to user for review, and iterate until approved",
    node_type="event_loop",
    client_facing=True,
    input_keys=["outreach_context", "recipient_email", "profile_summary"],
    output_keys=["approved_email"],
    system_prompt="""\
You are an expert email copywriter specializing in personalized outreach.

You have been given:
- A profile summary of the target person (from their Twitter/X)
- The outreach context/purpose
- The recipient's email address

**STEP 1 — Draft and present the email (text only, NO tool calls):**

Using the profile research, draft a personalized outreach email that:
- References at least 2 specific details from their Twitter profile (tweets, interests, projects)
- Clearly connects to the outreach purpose
- Includes a specific, relevant call to action
- Is professional but conversational and authentic — NOT spammy, robotic, or overly formal
- Is concise (under 300 words)

Present the complete email draft to the user, formatted clearly with Subject line and Body.
Then ask: "Would you like any changes, or shall I send this?"

If the user requests changes, revise the email and present the updated version. Keep iterating until the user is satisfied.

**STEP 2 — After the user explicitly approves the email, call set_output:**
- set_output("approved_email", "<the final approved email text including subject line>")
""",
    tools=[],
)

# Node 4: Send
# Sends the approved email using the send_email tool.
send_node = NodeSpec(
    id="send",
    name="Send",
    description="Send the approved outreach email to the recipient",
    node_type="event_loop",
    input_keys=["approved_email", "recipient_email"],
    output_keys=["delivery_status"],
    system_prompt="""\
You are responsible for sending the approved outreach email.

You have the approved email text and the recipient's email address in your inputs.

Parse the subject line and body from the approved_email, then use the send_email tool to send it to the recipient_email address.

After sending successfully, call:
- set_output("delivery_status", "sent")

If there is an error sending, call:
- set_output("delivery_status", "failed: <error details>")

Do NOT return raw JSON. Use the set_output tool to produce outputs.
""",
    tools=["send_email"],
)

__all__ = [
    "intake_node",
    "research_node",
    "draft_review_node",
    "send_node",
]
@@ -88,5 +88,5 @@ Implement **Option A**. The MCP server should be a thin utility layer for test e
## Related Files

- `core/framework/mcp/agent_builder_server.py` - Main file to modify
- `.claude/skills/testing-agent/SKILL.md` - Update documentation if tools change
- `.claude/skills/hive-test/SKILL.md` - Update documentation if tools change
- `core/framework/testing/` - Test generation utilities that could be removed
@@ -1,31 +0,0 @@
## Summary

Add Cursor IDE support for existing Claude Code skills and MCP servers.

## Changes

- Created `.cursor/skills/` directory with symlinks to all 5 existing skills:
  - `agent-workflow`
  - `building-agents-core`
  - `building-agents-construction`
  - `building-agents-patterns`
  - `testing-agent`
- Added `.cursor/mcp.json` with MCP server configuration (same as `.mcp.json`)

## Why symlinks for skills?

- Single source of truth - updates to `.claude/skills/` are reflected in both IDEs
- No duplication or sync issues
- Cursor automatically loads skills from `.cursor/skills/`, `.claude/skills/`, and `.codex/skills/`

## MCP Configuration

Cursor requires `.cursor/mcp.json` for project-level MCP servers. This enables:
- `agent-builder` - Agent building MCP server
- `tools` - Hive tools MCP server

## Setup in Cursor

1. **Enable MCP**: Open Command Palette (`Cmd+Shift+P` / `Ctrl+Shift+P`) and run `MCP: Enable`
2. **Restart Cursor** to load the MCP servers from `.cursor/mcp.json`
3. **Skills**: Type `/` in Agent chat and search for the skill name
@@ -746,10 +746,10 @@ echo -e " 1. Open Claude Code in this directory:"
echo -e "    ${CYAN}claude${NC}"
echo ""
echo -e " 2. Build a new agent:"
echo -e "    ${CYAN}/agent-workflow${NC}"
echo -e "    ${CYAN}/hive${NC}"
echo ""
echo -e " 3. Test an existing agent:"
echo -e "    ${CYAN}/testing-agent${NC}"
echo -e "    ${CYAN}/hive-test${NC}"
echo ""
echo -e "${BOLD}Skills:${NC}"
if [ -d "$SCRIPT_DIR/.claude/skills" ]; then
@@ -86,63 +86,53 @@ _TOOL_MODULES = [

## Credential Management

For tools requiring API keys, use the centralized `CredentialManager`. This enables:
- **Agent-aware validation**: Credentials are checked when an agent loads, not at server startup
- **Better error messages**: Users see exactly which credentials are missing and how to get them
- **Easy testing**: Use `CredentialManager.for_testing()` to mock credentials
Tools fall into two categories based on whether they need external API credentials:

### Adding a New Credential
| Signature | Meaning | CI Enforcement |
|-----------|---------|----------------|
| `register_tools(mcp)` | No credentials needed | ✅ Just works |
| `register_tools(mcp, credentials=None)` | Requires credentials | ⚠️ Must have `CredentialSpec` |

1. Find the appropriate category file in `src/aden_tools/credentials/`:
   - `llm.py` - LLM providers (anthropic, openai, etc.)
   - `search.py` - Search tools (brave_search, google_search, etc.)
   - Or create a new category file for integrations
**This is enforced by CI** — if your `register_tools` accepts a `credentials` parameter, every tool it registers must appear in a `CredentialSpec.tools` list. Otherwise, CI will fail with a clear error message.

2. Add the credential spec to the category's dict:
### Tools WITHOUT Credentials (Simple Case)

If your tool doesn't need external API keys (file operations, local processing, etc.), just use the simple signature:

```python
# In credentials/search.py
SEARCH_CREDENTIALS = {
    # ... existing credentials
    "my_api": CredentialSpec(
        env_var="MY_API_KEY",
        tools=["my_api_tool"],  # Which tools need this credential
        required=True,  # or False for optional
        help_url="https://example.com/api-keys",
        description="API key for My Service",
    ),
}
def register_tools(mcp: FastMCP) -> None:
    """Register tools that don't need credentials."""

    @mcp.tool()
    def my_local_tool(path: str) -> dict:
        """Process a local file."""
        # No credentials needed - just do the work
        return {"result": process_file(path)}
```

3. If you created a new category file, import and merge it in `credentials/__init__.py`:
That's it! No additional configuration needed.

### Tools WITH Credentials (Integration Case)

For tools requiring API keys, follow these steps:

#### Step 1: Add the `credentials` parameter

```python
from .my_category import MY_CATEGORY_CREDENTIALS

CREDENTIAL_SPECS = {
    **LLM_CREDENTIALS,
    **SEARCH_CREDENTIALS,
    **MY_CATEGORY_CREDENTIALS,  # Add new category
}
```

4. Update your tool to accept the optional `credentials` parameter:

```python
from typing import Optional, TYPE_CHECKING
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from aden_tools.credentials import CredentialManager
    from aden_tools.credentials import CredentialStoreAdapter


def register_tools(
    mcp: FastMCP,
    credentials: Optional["CredentialManager"] = None,
    credentials: CredentialStoreAdapter | None = None,
) -> None:
    @mcp.tool()
    def my_api_tool(query: str) -> dict:
        """Tool that requires an API key."""
        # Use CredentialManager if provided, fallback to direct env access
        # Use credentials adapter if provided, fallback to direct env access
        if credentials is not None:
            api_key = credentials.get("my_api")
        else:
```
@@ -157,15 +147,108 @@ def register_tools(

```python
        # Use the API key...
```

5. Update `register_all_tools()` in `tools/__init__.py` to pass credentials to your tool.
#### Step 2: Create a CredentialSpec

Find the appropriate category file in `src/aden_tools/credentials/` or create a new one:

| Category | File | Examples |
|----------|------|----------|
| LLM providers | `llm.py` | anthropic, openai |
| Search tools | `search.py` | brave_search, google_search |
| Email providers | `email.py` | resend, google/gmail |
| GitHub | `github.py` | github |
| CRM | `hubspot.py` | hubspot |
| Messaging | `slack.py` | slack |

Add your credential spec:

```python
# In credentials/<category>.py
from .base import CredentialSpec

MY_CREDENTIALS = {
    "my_api": CredentialSpec(
        env_var="MY_API_KEY",
        tools=["my_api_tool"],  # IMPORTANT: List ALL tool names this credential covers
        required=True,
        help_url="https://example.com/api-keys",
        description="API key for My Service",
        # Credential store mapping
        credential_id="my_api",
        credential_key="api_key",
    ),
}
```

**Important:** The `tools` list must include every tool name that your `register_tools` function creates. CI will fail if any tool is missing.

#### Step 3: Merge into CREDENTIAL_SPECS

If you created a new category file, import and merge it in `credentials/__init__.py`:

```python
from .my_category import MY_CREDENTIALS

CREDENTIAL_SPECS = {
    **LLM_CREDENTIALS,
    **SEARCH_CREDENTIALS,
    **MY_CREDENTIALS,  # Add new category
}

__all__ = [
    # ... existing exports
    "MY_CREDENTIALS",
]
```

#### Step 4: Update register_all_tools

In `tools/__init__.py`, add your tool registration with credentials:

```python
from .my_tool import register_tools as register_my_tool

def register_all_tools(mcp: FastMCP, credentials=None) -> list[str]:
    # ... existing registrations

    # Tools that need credentials
    register_my_tool(mcp, credentials=credentials)

    return [
        # ... existing tool names
        "my_api_tool",
    ]
```

### CI Enforcement Rules

The following conformance tests run in CI (`tests/integrations/test_spec_conformance.py`):

| Test | What It Checks |
|------|----------------|
| `TestModuleStructure` | Every tool module exports `register_tools` |
| `TestRegisterToolsSignature` | Correct function signature (`mcp` param, optional `credentials`) |
| `TestCredentialSpecFields` | All CredentialSpec fields are complete (`env_var`, `help_url`, `description`, `credential_id`, `credential_key`) |
| `TestSpecToolsMatchRegistered` | Tool names in `spec.tools` actually exist |
| `TestCredentialCoverage` | **Every tool from a module with `credentials` param has a spec** |

If `TestCredentialCoverage` fails, you'll see:

```
Tool 'my_new_tool' from module 'my_tool' accepts credentials but has no CredentialSpec.

Fix by either:
1. Adding a CredentialSpec in credentials/<category>.py with tools=['my_new_tool'], or
2. Removing 'credentials' param from register_tools() if this tool doesn't need credentials
```
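The coverage rule behind `TestCredentialCoverage` amounts to a set difference between registered tool names and the names listed across `CredentialSpec.tools`. A self-contained sketch of that check, with hypothetical data rather than the real CI fixtures:

```python
def find_uncovered_tools(registered_by_module, specs):
    """Return tools from credential-aware modules that no spec's tools list covers."""
    covered = {tool for spec in specs for tool in spec["tools"]}
    return sorted(
        tool
        for tools in registered_by_module.values()
        for tool in tools
        if tool not in covered
    )


# Hypothetical example: one spec covers my_api_tool; my_new_tool has no spec.
specs = [{"tools": ["my_api_tool"]}]
registered = {"my_tool": ["my_api_tool", "my_new_tool"]}
uncovered = find_uncovered_tools(registered, specs)
```

Any name in `uncovered` corresponds to the CI failure message above: either add it to a spec's `tools` list or drop the `credentials` parameter from the module.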
### Testing with Mock Credentials

```python
from aden_tools.credentials import CredentialManager
from aden_tools.credentials import CredentialStoreAdapter

def test_my_tool_with_valid_key(mcp):
    creds = CredentialManager.for_testing({"my_api": "test-key"})
    creds = CredentialStoreAdapter.for_testing({"my_api": "test-key"})
    register_tools(mcp, credentials=creds)
    tool_fn = mcp._tool_manager._tools["my_api_tool"].fn
```

@@ -194,29 +277,6 @@ The following tools require credentials that are not set:
```
Set these environment variables and re-run the agent.
```

## Environment Variables (Legacy)

For simple cases or backward compatibility, you can still check environment variables directly:

```python
import os

def register_tools(mcp: FastMCP) -> None:
    @mcp.tool()
    def my_api_tool(query: str) -> dict:
        """Tool that requires an API key."""
        api_key = os.getenv("MY_API_KEY")
        if not api_key:
            return {
                "error": "MY_API_KEY environment variable not set",
                "help": "Get an API key at https://example.com/api",
            }

        # Use the API key...
```

However, using `CredentialManager` is recommended for new tools as it provides better validation and testing support.

## Best Practices

### Error Handling
@@ -35,7 +35,13 @@ Usage:
Credential categories:
- llm.py: LLM provider credentials (anthropic, openai, etc.)
- search.py: Search tool credentials (brave_search, google_search, etc.)
- integrations.py: Third-party integrations (hubspot, etc.)
- email.py: Email provider credentials (resend, google/gmail)
- github.py: GitHub API credentials
- hubspot.py: HubSpot CRM credentials
- slack.py: Slack workspace credentials

Note: Tools that don't need credentials simply omit the 'credentials' parameter
from their register_tools() function. This convention is enforced by CI tests.

To add a new credential:
1. Find the appropriate category file (or create a new one)
@@ -46,8 +52,9 @@ To add a new credential:
from .base import CredentialError, CredentialSpec
from .browser import get_aden_auth_url, get_aden_setup_url, open_browser
from .email import EMAIL_CREDENTIALS
from .github import GITHUB_CREDENTIALS
from .health_check import HealthCheckResult, check_credential_health
from .integrations import INTEGRATION_CREDENTIALS
from .hubspot import HUBSPOT_CREDENTIALS
from .llm import LLM_CREDENTIALS
from .search import SEARCH_CREDENTIALS
from .shell_config import (
@@ -56,6 +63,7 @@ from .shell_config import (
    get_shell_config_path,
    get_shell_source_command,
)
from .slack import SLACK_CREDENTIALS
from .store_adapter import CredentialStoreAdapter

# Merged registry of all credentials
@@ -63,7 +71,9 @@ CREDENTIAL_SPECS = {
    **LLM_CREDENTIALS,
    **SEARCH_CREDENTIALS,
    **EMAIL_CREDENTIALS,
    **INTEGRATION_CREDENTIALS,
    **GITHUB_CREDENTIALS,
    **HUBSPOT_CREDENTIALS,
    **SLACK_CREDENTIALS,
}

__all__ = [
@@ -91,5 +101,7 @@ __all__ = [
    "LLM_CREDENTIALS",
    "SEARCH_CREDENTIALS",
    "EMAIL_CREDENTIALS",
    "INTEGRATION_CREDENTIALS",
    "GITHUB_CREDENTIALS",
    "HUBSPOT_CREDENTIALS",
    "SLACK_CREDENTIALS",
]
@@ -28,7 +28,7 @@ def open_browser(url: str) -> tuple[bool, str]:
        Tuple of (success, message)

    Example:
        >>> success, msg = open_browser("https://integration.adenhq.com/connect/hubspot")
        >>> success, msg = open_browser("https://hive.adenhq.com/connect/hubspot")
        >>> if success:
        ...     print("Browser opened!")
    """
@@ -75,7 +75,7 @@ def open_browser(url: str) -> tuple[bool, str]:
        return False, f"Failed to open browser: {e}"


def get_aden_auth_url(provider_name: str, base_url: str = "https://integration.adenhq.com") -> str:
def get_aden_auth_url(provider_name: str, base_url: str = "https://hive.adenhq.com") -> str:
    """
    Get the Aden authorization URL for a provider.

@@ -89,7 +89,7 @@ def get_aden_auth_url(provider_name: str, base_url: str = "https://integration.a
    return f"{base_url}/connect/{provider_name}"


def get_aden_setup_url(base_url: str = "https://integration.adenhq.com") -> str:
def get_aden_setup_url(base_url: str = "https://hive.adenhq.com") -> str:
    """
    Get the Aden setup URL for creating an API key.

@@ -33,21 +33,21 @@ EMAIL_CREDENTIALS = {
        credential_id="resend",
        credential_key="api_key",
    ),
    "gmail": CredentialSpec(
        env_var="GMAIL_ACCESS_TOKEN",
    "google": CredentialSpec(
        env_var="GOOGLE_ACCESS_TOKEN",
        tools=["send_email", "send_budget_alert_email"],
        node_types=[],
        required=False,
        startup_required=False,
        help_url="https://hive.adenhq.com",
        description="Gmail OAuth2 access token (via Aden)",
        description="Google OAuth2 access token (via Aden) - used for Gmail",
        aden_supported=True,
        aden_provider_name="gmail",
        aden_provider_name="google",
        direct_api_key_supported=False,
        api_key_instructions="Gmail requires OAuth2. Connect via hive.adenhq.com",
        api_key_instructions="Google OAuth requires OAuth2. Connect via hive.adenhq.com",
        health_check_endpoint="https://gmail.googleapis.com/gmail/v1/users/me/profile",
        health_check_method="GET",
        credential_id="gmail",
        credential_id="google",
        credential_key="access_token",
    ),
}
@@ -0,0 +1,54 @@
"""
GitHub tool credentials.

Contains credentials for GitHub API integration.
"""

from .base import CredentialSpec

GITHUB_CREDENTIALS = {
    "github": CredentialSpec(
        env_var="GITHUB_TOKEN",
        tools=[
            "github_list_repos",
            "github_get_repo",
            "github_search_repos",
            "github_list_issues",
            "github_get_issue",
            "github_create_issue",
            "github_update_issue",
            "github_list_pull_requests",
            "github_get_pull_request",
            "github_create_pull_request",
            "github_search_code",
            "github_list_branches",
            "github_get_branch",
            "github_list_stargazers",
            "github_get_user_profile",
            "github_get_user_emails",
        ],
        required=True,
        startup_required=False,
        help_url="https://github.com/settings/tokens",
        description="GitHub Personal Access Token (classic)",
        # Auth method support
        aden_supported=False,
        direct_api_key_supported=True,
        api_key_instructions="""To get a GitHub Personal Access Token:
1. Go to GitHub Settings > Developer settings > Personal access tokens
2. Click "Generate new token" > "Generate new token (classic)"
3. Give your token a descriptive name (e.g., "Hive Agent")
4. Select the following scopes:
   - repo (Full control of private repositories)
   - read:org (Read org and team membership - optional)
   - user (Read user profile data - optional)
5. Click "Generate token" and copy the token (starts with ghp_)
6. Store it securely - you won't be able to see it again!""",
        # Health check configuration
        health_check_endpoint="https://api.github.com/user",
        health_check_method="GET",
        # Credential store mapping
        credential_id="github",
        credential_key="access_token",
    ),
}
@@ -0,0 +1,53 @@
"""
HubSpot tool credentials.

Contains credentials for HubSpot CRM integration.
"""

from .base import CredentialSpec

HUBSPOT_CREDENTIALS = {
    "hubspot": CredentialSpec(
        env_var="HUBSPOT_ACCESS_TOKEN",
        tools=[
            "hubspot_search_contacts",
            "hubspot_get_contact",
            "hubspot_create_contact",
            "hubspot_update_contact",
            "hubspot_search_companies",
            "hubspot_get_company",
            "hubspot_create_company",
            "hubspot_update_company",
            "hubspot_search_deals",
            "hubspot_get_deal",
            "hubspot_create_deal",
            "hubspot_update_deal",
        ],
        required=True,
        startup_required=False,
        help_url="https://developers.hubspot.com/docs/api/private-apps",
        description="HubSpot access token (Private App or OAuth2)",
        # Auth method support
        aden_supported=True,
        aden_provider_name="hubspot",
        direct_api_key_supported=True,
        api_key_instructions="""To get a HubSpot Private App token:
1. Go to HubSpot Settings > Integrations > Private Apps
2. Click "Create a private app"
3. Name your app (e.g., "Hive Agent")
4. Go to the "Scopes" tab and enable:
   - crm.objects.contacts.read
   - crm.objects.contacts.write
   - crm.objects.companies.read
   - crm.objects.companies.write
   - crm.objects.deals.read
   - crm.objects.deals.write
5. Click "Create app" and copy the access token""",
        # Health check configuration
        health_check_endpoint="https://api.hubapi.com/crm/v3/objects/contacts?limit=1",
        health_check_method="GET",
        # Credential store mapping
        credential_id="hubspot",
        credential_key="access_token",
    ),
}
@@ -1,94 +0,0 @@
"""
Integration credentials.

Contains credentials for third-party service integrations (HubSpot, etc.).
"""

from .base import CredentialSpec

INTEGRATION_CREDENTIALS = {
    "github": CredentialSpec(
        env_var="GITHUB_TOKEN",
        tools=[
            "github_list_repos",
            "github_get_repo",
            "github_search_repos",
            "github_list_issues",
            "github_get_issue",
            "github_create_issue",
            "github_update_issue",
            "github_list_pull_requests",
            "github_get_pull_request",
            "github_create_pull_request",
            "github_search_code",
            "github_list_branches",
            "github_get_branch",
        ],
        required=True,
        startup_required=False,
        help_url="https://github.com/settings/tokens",
        description="GitHub Personal Access Token (classic)",
        # Auth method support
        aden_supported=False,
        direct_api_key_supported=True,
        api_key_instructions="""To get a GitHub Personal Access Token:
1. Go to GitHub Settings > Developer settings > Personal access tokens
2. Click "Generate new token" > "Generate new token (classic)"
3. Give your token a descriptive name (e.g., "Hive Agent")
4. Select the following scopes:
   - repo (Full control of private repositories)
   - read:org (Read org and team membership - optional)
   - user (Read user profile data - optional)
5. Click "Generate token" and copy the token (starts with ghp_)
6. Store it securely - you won't be able to see it again!""",
        # Health check configuration
        health_check_endpoint="https://api.github.com/user",
        health_check_method="GET",
        # Credential store mapping
        credential_id="github",
        credential_key="access_token",
    ),
    "hubspot": CredentialSpec(
        env_var="HUBSPOT_ACCESS_TOKEN",
        tools=[
            "hubspot_search_contacts",
            "hubspot_get_contact",
            "hubspot_create_contact",
            "hubspot_update_contact",
            "hubspot_search_companies",
            "hubspot_get_company",
            "hubspot_create_company",
            "hubspot_update_company",
            "hubspot_search_deals",
            "hubspot_get_deal",
            "hubspot_create_deal",
            "hubspot_update_deal",
        ],
        required=True,
        startup_required=False,
        help_url="https://developers.hubspot.com/docs/api/private-apps",
        description="HubSpot access token (Private App or OAuth2)",
        # Auth method support
        aden_supported=True,
        aden_provider_name="hubspot",
        direct_api_key_supported=True,
        api_key_instructions="""To get a HubSpot Private App token:
1. Go to HubSpot Settings > Integrations > Private Apps
2. Click "Create a private app"
3. Name your app (e.g., "Hive Agent")
4. Go to the "Scopes" tab and enable:
   - crm.objects.contacts.read
   - crm.objects.contacts.write
   - crm.objects.companies.read
   - crm.objects.companies.write
   - crm.objects.deals.read
   - crm.objects.deals.write
5. Click "Create app" and copy the access token""",
        # Health check configuration
        health_check_endpoint="https://api.hubapi.com/crm/v3/objects/contacts?limit=1",
        health_check_method="GET",
        # Credential store mapping
        credential_id="hubspot",
        credential_key="access_token",
    ),
}
@@ -0,0 +1,95 @@
"""
Slack tool credentials.

Contains credentials for Slack workspace integration.
"""

from .base import CredentialSpec

SLACK_CREDENTIALS = {
    "slack": CredentialSpec(
        env_var="SLACK_BOT_TOKEN",
        tools=[
            "slack_send_message",
            "slack_list_channels",
            "slack_get_channel_history",
            "slack_add_reaction",
            "slack_get_user_info",
            "slack_update_message",
            "slack_delete_message",
            "slack_schedule_message",
            "slack_create_channel",
            "slack_archive_channel",
            "slack_invite_to_channel",
            "slack_set_channel_topic",
            "slack_remove_reaction",
            "slack_list_users",
            "slack_upload_file",
            "slack_search_messages",
            "slack_get_thread_replies",
            "slack_pin_message",
            "slack_unpin_message",
            "slack_list_pins",
            "slack_add_bookmark",
            "slack_list_scheduled_messages",
            "slack_delete_scheduled_message",
            "slack_send_dm",
            "slack_get_permalink",
            "slack_send_ephemeral",
            "slack_post_blocks",
            "slack_open_modal",
            "slack_update_home_tab",
            "slack_set_status",
            "slack_set_presence",
            "slack_get_presence",
            "slack_create_reminder",
            "slack_list_reminders",
            "slack_delete_reminder",
            "slack_create_usergroup",
            "slack_update_usergroup_members",
            "slack_list_usergroups",
            "slack_list_emoji",
            "slack_create_canvas",
            "slack_edit_canvas",
            "slack_get_messages_for_analysis",
            "slack_trigger_workflow",
            "slack_get_conversation_context",
            "slack_find_user_by_email",
            "slack_kick_user_from_channel",
            "slack_delete_file",
            "slack_get_team_stats",
        ],
        required=True,
        startup_required=False,
        help_url="https://api.slack.com/apps",
        description="Slack Bot Token (starts with xoxb-)",
        # Auth method support
        aden_supported=True,
        aden_provider_name="slack",
        direct_api_key_supported=True,
        api_key_instructions="""To get a Slack Bot Token:
1. Go to https://api.slack.com/apps and click "Create New App"
2. Choose "From scratch" and give your app a name
3. Select the workspace where you want to install the app
4. Go to "OAuth & Permissions" in the sidebar
5. Add the following Bot Token Scopes:
   - channels:read, channels:write, channels:history
   - chat:write, chat:write.public
   - users:read, users:read.email
   - reactions:read, reactions:write
   - files:read, files:write
   - search:read (requires user token)
   - pins:read, pins:write
   - bookmarks:read, bookmarks:write
   - reminders:read, reminders:write
   - usergroups:read, usergroups:write
6. Click "Install to Workspace" and authorize
7. Copy the "Bot User OAuth Token" (starts with xoxb-)""",
        # Health check configuration
        health_check_endpoint="https://slack.com/api/auth.test",
        health_check_method="POST",
        # Credential store mapping
        credential_id="slack",
        credential_key="access_token",
    ),
}
@@ -104,6 +104,7 @@ def register_all_tools(
        "load_data",
        "save_data",
        "list_data_files",
        "serve_file_to_user",
        "csv_read",
        "csv_write",
        "csv_append",
@@ -180,6 +181,7 @@ def register_all_tools(
        "slack_delete_reminder",
        # Phase 2: User Groups
        "slack_create_usergroup",
        "slack_update_usergroup_members",
        "slack_list_usergroups",
        # Phase 2: Emoji
        "slack_list_emoji",
@@ -2,7 +2,7 @@
 Email Tool - Send emails using multiple providers.

 Supports:
-- Gmail (GMAIL_ACCESS_TOKEN, via Aden OAuth2)
+- Gmail (GOOGLE_ACCESS_TOKEN, via Aden OAuth2)
 - Resend (RESEND_API_KEY)

 Auto-detection: If provider="auto", tries Gmail first, then Resend.
@@ -121,11 +121,11 @@ def register_tools(
     if credentials is not None:
         return {
             "resend_api_key": credentials.get("resend"),
-            "gmail_access_token": credentials.get("gmail"),
+            "gmail_access_token": credentials.get("google"),  # Google OAuth for Gmail
         }
     return {
         "resend_api_key": os.getenv("RESEND_API_KEY"),
-        "gmail_access_token": os.getenv("GMAIL_ACCESS_TOKEN"),
+        "gmail_access_token": os.getenv("GOOGLE_ACCESS_TOKEN"),
     }

 def _resolve_from_email(from_email: str | None) -> str | None:
@@ -141,6 +141,52 @@ def register_tools(mcp: FastMCP) -> None:
        except Exception as e:
            return {"error": f"Failed to load data: {str(e)}"}

    @mcp.tool()
    def serve_file_to_user(filename: str, data_dir: str, label: str = "") -> dict:
        """
        Purpose
            Resolve a sandboxed file path to a fully qualified file URI
            that the user can click to open in their system viewer.

        When to use
            After saving a file (HTML report, CSV export, etc.) with save_data,
            call this to give the user a clickable link to open it.
            The TUI will render the file:// URI as a clickable link.

        Rules & Constraints
            filename must be a simple name — no paths or '..'
            The file must already exist in data_dir
            Returns a file:// URI the agent should include in its response

        Args:
            filename: The filename to serve (must exist in data_dir).
            data_dir: Absolute path to the data directory.
            label: Optional display label (defaults to filename).

        Returns:
            Dict with file_uri, file_path, and label
        """
        if not filename or ".." in filename or "/" in filename or "\\" in filename:
            return {"error": "Invalid filename. Use simple names like 'report.html'"}
        if not data_dir:
            return {"error": "data_dir is required"}

        try:
            path = Path(data_dir) / filename
            if not path.exists():
                return {"error": f"File not found: {filename}"}

            full_path = str(path.resolve())
            file_uri = f"file://{full_path}"
            return {
                "success": True,
                "file_uri": file_uri,
                "file_path": full_path,
                "label": label or filename,
            }
        except Exception as e:
            return {"error": f"Failed to serve file: {str(e)}"}

    @mcp.tool()
    def list_data_files(data_dir: str) -> dict:
        """
@@ -0,0 +1,8 @@
"""Stage 1: Offline conformance tests for tool modules.

Runs in CI on every PR. No credentials, no network.
Verifies that tool modules follow codebase conventions:
- 1a: Spec conformance (structure, signatures, credential specs)
- 1b: Registration (register_tools doesn't raise, tools exist)
- 1c: Input validation (credential errors, required params)
"""
@@ -0,0 +1,163 @@
"""Shared fixtures and discovery utilities for Stage 1 tests.

Discovers all tool modules under aden_tools.tools and provides
parameterization data for conformance testing.
"""

from __future__ import annotations

import importlib
import inspect
from pathlib import Path
from typing import Any

from fastmcp import FastMCP

from aden_tools.credentials import CREDENTIAL_SPECS

# --- Known Issues ---
# google_search and google_cse specs use tools=["google_search"] but
# the actual MCP tool is "web_search" (multi-provider). This is because
# _tool_to_cred is 1:1 and web_search already maps to brave_search.
# These specs use a phantom tool name for credential grouping.
KNOWN_PHANTOM_TOOLS: set[str] = {"google_search"}

# --- Tool Module Discovery ---

TOOLS_SRC = Path(__file__).resolve().parent.parent.parent / "src" / "aden_tools" / "tools"


def _discover_tool_modules() -> list[tuple[str, str]]:
    """Discover all tool module import paths and short names.

    Scans aden_tools/tools/ for packages that re-export ``register_tools``
    in their ``__init__.py``.

    Returns:
        List of (import_path, short_name) tuples.
        E.g. ("aden_tools.tools.web_search_tool", "web_search_tool")
    """
    modules: list[tuple[str, str]] = []

    for item in sorted(TOOLS_SRC.iterdir()):
        if item.name.startswith("_") or item.name == "__pycache__":
            continue

        if item.is_dir() and (item / "__init__.py").exists():
            init_text = (item / "__init__.py").read_text()

            if "register_tools" in init_text:
                # Direct tool package (e.g., web_search_tool, email_tool)
                modules.append((f"aden_tools.tools.{item.name}", item.name))
            else:
                # Toolkit directory (e.g., file_system_toolkits) — scan sub-packages
                for sub in sorted(item.iterdir()):
                    if sub.name.startswith("_") or sub.name == "__pycache__":
                        continue
                    if sub.is_dir() and (sub / "__init__.py").exists():
                        sub_init_text = (sub / "__init__.py").read_text()
                        if "register_tools" in sub_init_text:
                            modules.append(
                                (
                                    f"aden_tools.tools.{item.name}.{sub.name}",
                                    f"{item.name}/{sub.name}",
                                )
                            )

    return modules


# Computed once at import time
TOOL_MODULES: list[tuple[str, str]] = _discover_tool_modules()
TOOL_MODULE_IDS: list[str] = [name for _, name in TOOL_MODULES]


def _get_credential_tool_modules() -> list[tuple[str, str]]:
    """Return tool modules that accept a ``credentials`` parameter."""
    result = []
    for import_path, short_name in TOOL_MODULES:
        mod = importlib.import_module(import_path)
        register_fn = getattr(mod, "register_tools", None)
        if register_fn is None:
            continue
        sig = inspect.signature(register_fn)
        if "credentials" in sig.parameters:
            result.append((import_path, short_name))
    return result


CREDENTIAL_TOOL_MODULES: list[tuple[str, str]] = _get_credential_tool_modules()
CREDENTIAL_TOOL_MODULE_IDS: list[str] = [name for _, name in CREDENTIAL_TOOL_MODULES]


def _get_module_to_tools_mapping() -> dict[str, list[str]]:
    """Map each tool module to the tool names it registers.

    Registers each module's tools individually into a fresh FastMCP instance
    and collects the tool names that appear.
    """
    mapping: dict[str, list[str]] = {}

    for import_path, short_name in TOOL_MODULES:
        mod = importlib.import_module(import_path)
        register_fn = getattr(mod, "register_tools", None)
        if register_fn is None:
            continue

        mcp = FastMCP("discovery")
        sig = inspect.signature(register_fn)
        if "credentials" in sig.parameters:
            register_fn(mcp, credentials=None)
        else:
            register_fn(mcp)

        mapping[short_name] = list(mcp._tool_manager._tools.keys())

    return mapping


# Computed once at import time
MODULE_TO_TOOLS: dict[str, list[str]] = _get_module_to_tools_mapping()


def get_all_credential_tool_names() -> list[str]:
    """Get all tool names that have associated CredentialSpecs."""
    names: list[str] = []
    for spec in CREDENTIAL_SPECS.values():
        names.extend(spec.tools)
    return names


def get_minimal_args(fn: Any) -> dict[str, Any]:
    """Build minimal keyword arguments for a tool function.

    Uses the function signature to determine required parameters and
    provides sensible minimal values for common types.
    """
    sig = inspect.signature(fn)
    args: dict[str, Any] = {}

    for name, param in sig.parameters.items():
        if param.default is not inspect.Parameter.empty:
            continue  # Skip optional params

        # Infer a minimal value from annotation
        annotation = param.annotation
        annotation_str = str(annotation)

        if annotation is str or "str" in annotation_str:
            args[name] = "test"
        elif annotation is int or annotation_str == "int":
            args[name] = 1
        elif annotation is float or annotation_str == "float":
            args[name] = 1.0
        elif annotation is bool or annotation_str == "bool":
            args[name] = True
        elif "list" in annotation_str.lower():
            args[name] = ["test@example.com"]
        elif "dict" in annotation_str.lower():
            args[name] = {}
        else:
            args[name] = "test"

    return args
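The `get_minimal_args` helper in the conftest above fills each required parameter with a placeholder inferred from its annotation. The same trick in isolation, as a standalone sketch (hypothetical `minimal_args`/`example` names, not importing the conftest):

```python
import inspect
from typing import Any


def minimal_args(fn) -> dict[str, Any]:
    """Give every required parameter a placeholder value based on its annotation."""
    defaults = {str: "test", int: 1, float: 1.0, bool: True}
    args: dict[str, Any] = {}
    for name, param in inspect.signature(fn).parameters.items():
        if param.default is not inspect.Parameter.empty:
            continue  # params with defaults are left to their defaults
        # Fall back to a string placeholder for unannotated/unknown types.
        args[name] = defaults.get(param.annotation, "test")
    return args


def example(channel: str, limit: int, verbose: bool = False) -> None:
    pass
```

Because `verbose` has a default, it is skipped; only `channel` and `limit` receive placeholders. This is enough to invoke a tool far enough to hit its credential check, which is all the Stage 1c tests below need.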
@@ -0,0 +1,171 @@
"""Stage 1c: Input validation and error handling tests.

Generic tests parameterized over credential-requiring tools:
- Missing credentials returns {"error": "...", "help": "..."} — both keys
- Missing required params returns {"error": "..."}
"""

from __future__ import annotations

import importlib
import inspect

import pytest
from fastmcp import FastMCP

from aden_tools.credentials import CREDENTIAL_SPECS

from .conftest import (
    CREDENTIAL_TOOL_MODULES,
    MODULE_TO_TOOLS,
    get_minimal_args,
)

# ---------------------------------------------------------------------------
# Build parameterization data for credential-requiring tools
# ---------------------------------------------------------------------------

# Map of tool_name -> (module_import_path, tool_fn_name)
# Only includes tools that have a CredentialSpec with non-empty tools list
_CRED_TOOL_ENTRIES: list[tuple[str, str]] = []

for _spec_name, _spec in CREDENTIAL_SPECS.items():
    for _tool_name in _spec.tools:
        _CRED_TOOL_ENTRIES.append((_spec_name, _tool_name))

_CRED_TOOL_IDS = [f"{spec}:{tool}" for spec, tool in _CRED_TOOL_ENTRIES]


def _find_module_for_tool(tool_name: str) -> str | None:
    """Find the module import path that registers a given tool."""
    for short_name, tools in MODULE_TO_TOOLS.items():
        if tool_name in tools:
            # Reconstruct import path from short_name
            for import_path, sn in CREDENTIAL_TOOL_MODULES:
                if sn == short_name:
                    return import_path
    return None


def _register_and_get_fn(tool_name: str):
    """Register the tool's module and return the tool function."""
    # Find the module that provides this tool
    module_path = _find_module_for_tool(tool_name)
    if module_path is None:
        pytest.skip(f"Could not find module for tool '{tool_name}'")

    mod = importlib.import_module(module_path)
    mcp = FastMCP("test-validation")

    sig = inspect.signature(mod.register_tools)
    if "credentials" in sig.parameters:
        mod.register_tools(mcp, credentials=None)
    else:
        mod.register_tools(mcp)

    tool_entry = mcp._tool_manager._tools.get(tool_name)
    if tool_entry is None:
        pytest.skip(f"Tool '{tool_name}' not found after registration")

    return tool_entry.fn


# --- Env vars to clear for each credential spec ---

_ENV_VARS_TO_CLEAR: dict[str, list[str]] = {}
for _spec_name, _spec in CREDENTIAL_SPECS.items():
    _ENV_VARS_TO_CLEAR[_spec_name] = [_spec.env_var]

# Also clear related env vars (e.g., EMAIL_FROM for email tools)
_EXTRA_ENV_VARS: dict[str, list[str]] = {
    "resend": ["EMAIL_FROM"],
}


# ---------------------------------------------------------------------------
# 1c-1: Missing credentials returns {"error": ..., "help": ...}
# ---------------------------------------------------------------------------


class TestMissingCredentialsError:
    """Tools called without credentials must return both 'error' and 'help' keys."""

    @pytest.mark.parametrize(
        "spec_name,tool_name",
        _CRED_TOOL_ENTRIES,
        ids=_CRED_TOOL_IDS,
    )
    def test_missing_credentials_returns_error_and_help(
        self, spec_name: str, tool_name: str, monkeypatch: pytest.MonkeyPatch
    ):
        """Calling a tool without credentials returns {error, help}."""
        # Clear all credential env vars
        for env_var in _ENV_VARS_TO_CLEAR.get(spec_name, []):
            monkeypatch.delenv(env_var, raising=False)
        for env_var in _EXTRA_ENV_VARS.get(spec_name, []):
            monkeypatch.delenv(env_var, raising=False)

        # Also clear all other credential env vars to ensure clean state
        for other_spec in CREDENTIAL_SPECS.values():
            monkeypatch.delenv(other_spec.env_var, raising=False)

        fn = _register_and_get_fn(tool_name)
        args = get_minimal_args(fn)

        result = fn(**args)

        assert isinstance(result, dict), (
            f"Tool '{tool_name}' should return a dict, got {type(result)}"
        )
        assert "error" in result, (
            f"Tool '{tool_name}' missing credentials should return {{'error': ...}}, got {result}"
        )
        assert "help" in result, (
            f"Tool '{tool_name}' missing credentials should return {{'help': ...}}, got {result}"
        )


# ---------------------------------------------------------------------------
# 1c-2: Missing required params returns error
# ---------------------------------------------------------------------------


class TestMissingRequiredParams:
    """Calling a tool without required params should return an error or raise TypeError."""

    @pytest.mark.parametrize(
        "spec_name,tool_name",
        _CRED_TOOL_ENTRIES,
        ids=_CRED_TOOL_IDS,
    )
    def test_missing_required_params_returns_error(
        self, spec_name: str, tool_name: str, monkeypatch: pytest.MonkeyPatch
    ):
        """Calling a tool with no args raises TypeError or returns error dict."""
        # Set credential so we can test param validation separately
        spec = CREDENTIAL_SPECS[spec_name]
        monkeypatch.setenv(spec.env_var, "test-key")

        fn = _register_and_get_fn(tool_name)

        sig = inspect.signature(fn)
        required_params = [
            name
            for name, param in sig.parameters.items()
            if param.default is inspect.Parameter.empty
        ]

        if not required_params:
            pytest.skip(f"Tool '{tool_name}' has no required params")

        # Calling with no args should fail
        try:
            result = fn()
            # If it returns (doesn't raise), it should be an error dict
            if isinstance(result, dict):
                assert "error" in result, (
                    f"Tool '{tool_name}' called with no args returned success: {result}"
                )
        except TypeError:
            # TypeError from missing positional args is acceptable
            pass
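The contract exercised by test 1c-1 above (a credential-gated tool must return both an `error` and a `help` key when its token is absent) looks like this from the tool side. This is a hypothetical `send_message` example written for illustration, not a module from the suite:

```python
import os


def send_message(channel: str, text: str) -> dict:
    token = os.getenv("SLACK_BOT_TOKEN")
    if not token:
        # Both keys are required: "error" says what failed,
        # "help" tells the user how to fix it.
        return {
            "error": "SLACK_BOT_TOKEN is not set",
            "help": "Create a bot token at https://api.slack.com/apps and export it.",
        }
    # Real implementation would call the Slack API here.
    return {"ok": True, "channel": channel}
```

Returning a structured error dict instead of raising keeps the agent loop alive: the model sees the `help` text and can relay it to the user rather than crashing the tool call.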
@@ -0,0 +1,149 @@
"""Stage 1b: Registration tests.

Verifies that tool registration works correctly:
- register_tools(mcp) doesn't raise
- register_tools(mcp, credentials=mock_credentials) doesn't raise
- Expected tool names exist in mcp._tool_manager._tools
"""

from __future__ import annotations

import importlib
import inspect

import pytest
from fastmcp import FastMCP

from aden_tools.credentials import CredentialStoreAdapter

from .conftest import (
    CREDENTIAL_TOOL_MODULE_IDS,
    CREDENTIAL_TOOL_MODULES,
    MODULE_TO_TOOLS,
    TOOL_MODULE_IDS,
    TOOL_MODULES,
)

# ---------------------------------------------------------------------------
# 1b-1: register_tools(mcp) doesn't raise
# ---------------------------------------------------------------------------


class TestRegisterWithoutCredentials:
    """register_tools(mcp) must not raise for any tool module."""

    @pytest.mark.parametrize(
        "import_path,short_name",
        TOOL_MODULES,
        ids=TOOL_MODULE_IDS,
    )
    def test_register_tools_no_raise(self, import_path: str, short_name: str):
        """Calling register_tools(mcp) does not raise."""
        mod = importlib.import_module(import_path)
        mcp = FastMCP("test-reg")

        sig = inspect.signature(mod.register_tools)
        if "credentials" in sig.parameters:
            mod.register_tools(mcp, credentials=None)
        else:
            mod.register_tools(mcp)

        # Should complete without exception


# ---------------------------------------------------------------------------
# 1b-2: register_tools(mcp, credentials=mock) doesn't raise
# ---------------------------------------------------------------------------


class TestRegisterWithMockCredentials:
    """register_tools(mcp, credentials=mock) must not raise for credential tools."""

    @pytest.fixture
    def mock_credentials(self) -> CredentialStoreAdapter:
        """Create a CredentialStoreAdapter with all mock credentials."""
        return CredentialStoreAdapter.for_testing(
            {
                "anthropic": "test-anthropic-key",
                "brave_search": "test-brave-key",
                "google_search": "test-google-key",
                "google_cse": "test-google-cse-id",
                "resend": "test-resend-key",
                "github": "test-github-token",
                "hubspot": "test-hubspot-token",
            }
        )

    @pytest.mark.parametrize(
        "import_path,short_name",
        CREDENTIAL_TOOL_MODULES,
        ids=CREDENTIAL_TOOL_MODULE_IDS,
    )
    def test_register_tools_with_credentials_no_raise(
        self,
        import_path: str,
        short_name: str,
        mock_credentials: CredentialStoreAdapter,
    ):
        """Calling register_tools(mcp, credentials=mock) does not raise."""
        mod = importlib.import_module(import_path)
        mcp = FastMCP("test-reg-cred")
        mod.register_tools(mcp, credentials=mock_credentials)

        # Should complete without exception


# ---------------------------------------------------------------------------
# 1b-3: Expected tool names exist in mcp._tool_manager._tools
# ---------------------------------------------------------------------------


class TestExpectedToolsRegistered:
    """After registration, expected tool names must exist in the MCP instance."""

    @pytest.mark.parametrize(
        "import_path,short_name",
        TOOL_MODULES,
        ids=TOOL_MODULE_IDS,
    )
    def test_tools_registered_in_mcp(self, import_path: str, short_name: str):
        """The tool names registered by a module match expectations."""
        expected_tools = MODULE_TO_TOOLS.get(short_name, [])
        if not expected_tools:
            pytest.skip(f"No expected tools mapped for {short_name}")

        mod = importlib.import_module(import_path)
        mcp = FastMCP("test-tools")

        sig = inspect.signature(mod.register_tools)
        if "credentials" in sig.parameters:
            mod.register_tools(mcp, credentials=None)
        else:
            mod.register_tools(mcp)

        registered = set(mcp._tool_manager._tools.keys())
        for tool_name in expected_tools:
            assert tool_name in registered, (
                f"Tool '{tool_name}' expected from {short_name} "
                f"but not found. Registered: {sorted(registered)}"
            )

    def test_register_all_tools_returns_complete_list(self):
        """register_all_tools() return list matches actually registered tools."""
        from aden_tools.tools import register_all_tools

        mcp = FastMCP("test-all")
        returned_names = register_all_tools(mcp, credentials=None)
        registered = set(mcp._tool_manager._tools.keys())

        # Every returned name must actually be registered
        for name in returned_names:
            assert name in registered, (
                f"register_all_tools() lists '{name}' but it was not registered"
            )

        # Every registered tool must be in the return list
        for name in registered:
            assert name in returned_names, (
                f"Tool '{name}' is registered but not in register_all_tools() return list"
            )
@@ -0,0 +1,305 @@
|
||||
"""Stage 1a: Spec conformance tests.
|
||||
|
||||
Verifies that every tool module follows codebase structural conventions:
|
||||
- __init__.py re-exports register_tools
|
||||
- register_tools has the correct signature
|
||||
- CredentialSpec fields are complete
|
||||
- spec.tools match actual @mcp.tool() functions
|
||||
- Specs are merged into CREDENTIAL_SPECS
|
||||
- Tool names appear in register_all_tools() return list
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import importlib
|
||||
import inspect
|
||||
|
||||
import pytest
|
||||
from fastmcp import FastMCP
|
||||
|
||||
from aden_tools.credentials import (
|
||||
CREDENTIAL_SPECS,
|
||||
EMAIL_CREDENTIALS,
|
||||
GITHUB_CREDENTIALS,
|
||||
HUBSPOT_CREDENTIALS,
|
||||
LLM_CREDENTIALS,
|
||||
SEARCH_CREDENTIALS,
|
||||
SLACK_CREDENTIALS,
|
||||
)
|
||||
from aden_tools.tools import register_all_tools
|
||||
|
||||
from .conftest import (
|
||||
CREDENTIAL_TOOL_MODULE_IDS,
|
||||
CREDENTIAL_TOOL_MODULES,
|
||||
KNOWN_PHANTOM_TOOLS,
|
||||
MODULE_TO_TOOLS,
|
||||
TOOL_MODULE_IDS,
|
||||
TOOL_MODULES,
|
||||
)
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# 1a-1: Module has __init__.py re-exporting register_tools
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestModuleStructure:
|
||||
"""Every tool module must export register_tools from its __init__.py."""
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
"import_path,short_name",
|
||||
TOOL_MODULES,
|
||||
ids=TOOL_MODULE_IDS,
|
||||
)
|
||||
def test_module_exports_register_tools(self, import_path: str, short_name: str):
|
||||
"""register_tools is importable from the module's package."""
|
||||
mod = importlib.import_module(import_path)
|
||||
assert hasattr(mod, "register_tools"), (
|
||||
f"Module {import_path} does not export 'register_tools'"
|
||||
)
|
||||
assert callable(mod.register_tools), f"{import_path}.register_tools is not callable"
|
||||
|
||||
@pytest.mark.parametrize(
|
||||
"import_path,short_name",
|
||||
TOOL_MODULES,
|
||||
ids=TOOL_MODULE_IDS,
|
||||
)
|
||||
def test_register_tools_in_all(self, import_path: str, short_name: str):
|
||||
"""register_tools appears in __all__ if __all__ is defined."""
|
||||
mod = importlib.import_module(import_path)
|
||||
all_list = getattr(mod, "__all__", None)
|
||||
if all_list is not None:
|
||||
assert "register_tools" in all_list, (
|
||||
f"{import_path}.__all__ does not include 'register_tools'"
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# 1a-2: register_tools signature
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
|
||||
class TestRegisterToolsSignature:
    """register_tools must have the correct signature."""

    @pytest.mark.parametrize(
        "import_path,short_name",
        TOOL_MODULES,
        ids=TOOL_MODULE_IDS,
    )
    def test_accepts_mcp_param(self, import_path: str, short_name: str):
        """All register_tools functions must accept an 'mcp' parameter."""
        mod = importlib.import_module(import_path)
        sig = inspect.signature(mod.register_tools)
        params = list(sig.parameters.keys())
        assert len(params) >= 1, f"{import_path}.register_tools has no parameters"
        assert params[0] == "mcp", (
            f"{import_path}.register_tools first param should be 'mcp', got '{params[0]}'"
        )

    @pytest.mark.parametrize(
        "import_path,short_name",
        CREDENTIAL_TOOL_MODULES,
        ids=CREDENTIAL_TOOL_MODULE_IDS,
    )
    def test_credential_tools_accept_credentials_param(self, import_path: str, short_name: str):
        """Tools with CredentialSpecs must accept a 'credentials' parameter."""
        mod = importlib.import_module(import_path)
        sig = inspect.signature(mod.register_tools)
        assert "credentials" in sig.parameters, (
            f"{import_path}.register_tools should accept 'credentials' param"
        )

        param = sig.parameters["credentials"]
        assert param.default is None, (
            f"{import_path}.register_tools 'credentials' param should default to None"
        )

# ---------------------------------------------------------------------------
# 1a-3: CredentialSpec field completeness
# ---------------------------------------------------------------------------


class TestCredentialSpecFields:
    """Every CredentialSpec must have non-empty required fields."""

    @pytest.mark.parametrize("spec_name", list(CREDENTIAL_SPECS.keys()))
    def test_env_var_non_empty(self, spec_name: str):
        """CredentialSpec.env_var must be non-empty."""
        spec = CREDENTIAL_SPECS[spec_name]
        assert spec.env_var, f"Spec '{spec_name}' has empty env_var"

    @pytest.mark.parametrize("spec_name", list(CREDENTIAL_SPECS.keys()))
    def test_tools_or_node_types_non_empty(self, spec_name: str):
        """CredentialSpec must have non-empty tools or node_types."""
        spec = CREDENTIAL_SPECS[spec_name]
        assert spec.tools or spec.node_types, (
            f"Spec '{spec_name}' has both empty tools and empty node_types"
        )

    @pytest.mark.parametrize("spec_name", list(CREDENTIAL_SPECS.keys()))
    def test_help_url_non_empty(self, spec_name: str):
        """CredentialSpec.help_url must be non-empty."""
        spec = CREDENTIAL_SPECS[spec_name]
        assert spec.help_url, f"Spec '{spec_name}' has empty help_url"

    @pytest.mark.parametrize("spec_name", list(CREDENTIAL_SPECS.keys()))
    def test_description_non_empty(self, spec_name: str):
        """CredentialSpec.description must be non-empty."""
        spec = CREDENTIAL_SPECS[spec_name]
        assert spec.description, f"Spec '{spec_name}' has empty description"

    @pytest.mark.parametrize("spec_name", list(CREDENTIAL_SPECS.keys()))
    def test_credential_id_non_empty(self, spec_name: str):
        """CredentialSpec.credential_id must be non-empty."""
        spec = CREDENTIAL_SPECS[spec_name]
        assert spec.credential_id, f"Spec '{spec_name}' has empty credential_id"

    @pytest.mark.parametrize("spec_name", list(CREDENTIAL_SPECS.keys()))
    def test_credential_key_non_empty(self, spec_name: str):
        """CredentialSpec.credential_key must be non-empty."""
        spec = CREDENTIAL_SPECS[spec_name]
        assert spec.credential_key, f"Spec '{spec_name}' has empty credential_key"

# ---------------------------------------------------------------------------
# 1a-4: spec.tools match actual registered @mcp.tool() functions
# ---------------------------------------------------------------------------


class TestSpecToolsMatchRegistered:
    """Every tool name in a CredentialSpec.tools must be a real registered tool."""

    @pytest.fixture(scope="class")
    def registered_tools(self) -> set[str]:
        """Register all tools and return the set of registered tool names."""
        mcp = FastMCP("spec-check")
        register_all_tools(mcp, credentials=None)
        return set(mcp._tool_manager._tools.keys())

    @pytest.mark.parametrize("spec_name", list(CREDENTIAL_SPECS.keys()))
    def test_spec_tools_are_registered(self, spec_name: str, registered_tools: set[str]):
        """Every name in spec.tools must exist in the registered tools.

        Known phantom tool names (used for multi-provider credential grouping)
        are excluded — see KNOWN_PHANTOM_TOOLS in conftest.py.
        """
        spec = CREDENTIAL_SPECS[spec_name]
        for tool_name in spec.tools:
            if tool_name in KNOWN_PHANTOM_TOOLS:
                continue
            assert tool_name in registered_tools, (
                f"Spec '{spec_name}' references tool '{tool_name}' "
                f"which is not registered. Registered tools: {sorted(registered_tools)}"
            )

# ---------------------------------------------------------------------------
# 1a-5: All credential category dicts are merged into CREDENTIAL_SPECS
# ---------------------------------------------------------------------------


class TestSpecsMergedIntoCredentialSpecs:
    """All category credential dicts must be merged into the global CREDENTIAL_SPECS."""

    CATEGORY_DICTS = {
        "LLM_CREDENTIALS": LLM_CREDENTIALS,
        "SEARCH_CREDENTIALS": SEARCH_CREDENTIALS,
        "EMAIL_CREDENTIALS": EMAIL_CREDENTIALS,
        "GITHUB_CREDENTIALS": GITHUB_CREDENTIALS,
        "HUBSPOT_CREDENTIALS": HUBSPOT_CREDENTIALS,
        "SLACK_CREDENTIALS": SLACK_CREDENTIALS,
    }

    @pytest.mark.parametrize("category_name", list(CATEGORY_DICTS.keys()))
    def test_category_merged(self, category_name: str):
        """Every key in the category dict must exist in CREDENTIAL_SPECS."""
        category = self.CATEGORY_DICTS[category_name]
        for spec_name, spec in category.items():
            assert spec_name in CREDENTIAL_SPECS, (
                f"'{spec_name}' from {category_name} is not in CREDENTIAL_SPECS"
            )
            assert CREDENTIAL_SPECS[spec_name] is spec, (
                f"'{spec_name}' in CREDENTIAL_SPECS is not the same object as in {category_name}"
            )

# ---------------------------------------------------------------------------
# 1a-6: Tool names appear in register_all_tools() return list
# ---------------------------------------------------------------------------


class TestToolNamesInReturnList:
    """Tool names from CredentialSpecs must appear in register_all_tools() return."""

    @pytest.fixture(scope="class")
    def all_tools_return(self) -> list[str]:
        """Call register_all_tools and return the tool name list."""
        mcp = FastMCP("return-check")
        return register_all_tools(mcp, credentials=None)

    @pytest.mark.parametrize("spec_name", list(CREDENTIAL_SPECS.keys()))
    def test_spec_tools_in_return_list(self, spec_name: str, all_tools_return: list[str]):
        """Every tool name in spec.tools appears in register_all_tools() return.

        Known phantom tool names are excluded — see KNOWN_PHANTOM_TOOLS.
        """
        spec = CREDENTIAL_SPECS[spec_name]
        for tool_name in spec.tools:
            if tool_name in KNOWN_PHANTOM_TOOLS:
                continue
            assert tool_name in all_tools_return, (
                f"Tool '{tool_name}' (from spec '{spec_name}') "
                f"not in register_all_tools() return list"
            )

# ---------------------------------------------------------------------------
# 1a-7: Credential coverage - tools accepting credentials must have specs
# ---------------------------------------------------------------------------


class TestCredentialCoverage:
    """Every tool that accepts credentials must have a corresponding CredentialSpec.

    This enforces the convention:
      - register_tools(mcp) -> no credentials needed
      - register_tools(mcp, credentials=None) -> must have CredentialSpec entries

    This eliminates the need for a separate "no_credentials" list.
    """

    @pytest.fixture(scope="class")
    def all_spec_tools(self) -> set[str]:
        """Collect all tool names referenced in CREDENTIAL_SPECS."""
        tools: set[str] = set()
        for spec in CREDENTIAL_SPECS.values():
            tools.update(spec.tools)
        tools.update(KNOWN_PHANTOM_TOOLS)
        return tools

    @pytest.mark.parametrize(
        "import_path,short_name",
        CREDENTIAL_TOOL_MODULES,
        ids=CREDENTIAL_TOOL_MODULE_IDS,
    )
    def test_credential_tools_have_specs(
        self, import_path: str, short_name: str, all_spec_tools: set[str]
    ):
        """Every tool from a module with credentials param must have a spec.

        If this test fails, you have two options:
          1. Add a CredentialSpec in credentials/<category>.py for your tool
          2. Remove the 'credentials' param from register_tools() if no credentials needed
        """
        tools_in_module = MODULE_TO_TOOLS.get(short_name, [])
        for tool_name in tools_in_module:
            assert tool_name in all_spec_tools, (
                f"Tool '{tool_name}' from module '{short_name}' accepts credentials "
                f"but has no CredentialSpec.\n\n"
                f"Fix by either:\n"
                f"  1. Adding a CredentialSpec in credentials/<category>.py with "
                f"tools=['{tool_name}'], or\n"
                f"  2. Removing 'credentials' param from register_tools() if this "
                f"tool doesn't need credentials"
            )
@@ -28,7 +28,7 @@ class TestSendEmail:
    def test_no_credentials_returns_error(self, send_email_fn, monkeypatch):
        """Send without credentials returns helpful error."""
        monkeypatch.delenv("RESEND_API_KEY", raising=False)
        monkeypatch.delenv("GMAIL_ACCESS_TOKEN", raising=False)
        monkeypatch.delenv("GOOGLE_ACCESS_TOKEN", raising=False)
        monkeypatch.setenv("EMAIL_FROM", "test@example.com")

        result = send_email_fn(to="test@example.com", subject="Test", html="<p>Hi</p>")
@@ -40,7 +40,7 @@ class TestSendEmail:
    def test_resend_explicit_missing_key(self, send_email_fn, monkeypatch):
        """Explicit resend provider without key returns error."""
        monkeypatch.delenv("RESEND_API_KEY", raising=False)
        monkeypatch.delenv("GMAIL_ACCESS_TOKEN", raising=False)
        monkeypatch.delenv("GOOGLE_ACCESS_TOKEN", raising=False)
        monkeypatch.setenv("EMAIL_FROM", "test@example.com")

        result = send_email_fn(
@@ -54,7 +54,7 @@ class TestSendEmail:
    def test_missing_from_email_returns_error(self, send_email_fn, monkeypatch):
        """No from_email and no EMAIL_FROM env var returns error when using Resend."""
        monkeypatch.setenv("RESEND_API_KEY", "re_test_key")
        monkeypatch.delenv("GMAIL_ACCESS_TOKEN", raising=False)
        monkeypatch.delenv("GOOGLE_ACCESS_TOKEN", raising=False)
        monkeypatch.delenv("EMAIL_FROM", raising=False)

        result = send_email_fn(to="test@example.com", subject="Test", html="<p>Hi</p>")
@@ -336,7 +336,7 @@ class TestSendBudgetAlertEmail:
    def test_no_credentials_returns_error(self, send_budget_alert_fn, monkeypatch):
        """Budget alert without credentials returns error."""
        monkeypatch.delenv("RESEND_API_KEY", raising=False)
        monkeypatch.delenv("GMAIL_ACCESS_TOKEN", raising=False)
        monkeypatch.delenv("GOOGLE_ACCESS_TOKEN", raising=False)
        monkeypatch.setenv("EMAIL_FROM", "test@example.com")

        result = send_budget_alert_fn(
@@ -455,7 +455,7 @@ class TestGmailProvider:

    def test_gmail_success(self, send_email_fn, monkeypatch):
        """Successful Gmail send returns success dict with message ID."""
        monkeypatch.setenv("GMAIL_ACCESS_TOKEN", "test_gmail_token")
        monkeypatch.setenv("GOOGLE_ACCESS_TOKEN", "test_gmail_token")
        monkeypatch.delenv("RESEND_API_KEY", raising=False)
        monkeypatch.setenv("EMAIL_FROM", "user@gmail.com")

@@ -487,7 +487,7 @@ class TestGmailProvider:

    def test_gmail_missing_credentials(self, send_email_fn, monkeypatch):
        """Explicit Gmail provider without token returns error."""
        monkeypatch.delenv("GMAIL_ACCESS_TOKEN", raising=False)
        monkeypatch.delenv("GOOGLE_ACCESS_TOKEN", raising=False)
        monkeypatch.delenv("RESEND_API_KEY", raising=False)
        monkeypatch.setenv("EMAIL_FROM", "test@example.com")

@@ -504,7 +504,7 @@ class TestGmailProvider:

    def test_gmail_api_error(self, send_email_fn, monkeypatch):
        """Gmail API non-200 response returns error dict."""
        monkeypatch.setenv("GMAIL_ACCESS_TOKEN", "test_gmail_token")
        monkeypatch.setenv("GOOGLE_ACCESS_TOKEN", "test_gmail_token")
        monkeypatch.delenv("RESEND_API_KEY", raising=False)
        monkeypatch.setenv("EMAIL_FROM", "user@gmail.com")

@@ -525,7 +525,7 @@ class TestGmailProvider:

    def test_gmail_token_expired(self, send_email_fn, monkeypatch):
        """Gmail 401 response returns token expiry error with help."""
        monkeypatch.setenv("GMAIL_ACCESS_TOKEN", "expired_token")
        monkeypatch.setenv("GOOGLE_ACCESS_TOKEN", "expired_token")
        monkeypatch.delenv("RESEND_API_KEY", raising=False)
        monkeypatch.setenv("EMAIL_FROM", "user@gmail.com")

@@ -547,7 +547,7 @@ class TestGmailProvider:

    def test_auto_prefers_gmail_over_resend(self, send_email_fn, monkeypatch):
        """Auto mode uses Gmail when both Gmail and Resend are available."""
        monkeypatch.setenv("GMAIL_ACCESS_TOKEN", "test_gmail_token")
        monkeypatch.setenv("GOOGLE_ACCESS_TOKEN", "test_gmail_token")
        monkeypatch.setenv("RESEND_API_KEY", "re_test_key")
        monkeypatch.setenv("EMAIL_FROM", "user@gmail.com")

@@ -571,7 +571,7 @@ class TestGmailProvider:

    def test_auto_falls_back_to_resend(self, send_email_fn, monkeypatch):
        """Auto mode falls back to Resend when Gmail is not available."""
        monkeypatch.delenv("GMAIL_ACCESS_TOKEN", raising=False)
        monkeypatch.delenv("GOOGLE_ACCESS_TOKEN", raising=False)
        monkeypatch.setenv("RESEND_API_KEY", "re_test_key")
        monkeypatch.setenv("EMAIL_FROM", "test@example.com")

@@ -588,7 +588,7 @@ class TestGmailProvider:

    def test_gmail_no_from_email_ok(self, send_email_fn, monkeypatch):
        """Gmail works without from_email (defaults to authenticated user)."""
        monkeypatch.setenv("GMAIL_ACCESS_TOKEN", "test_gmail_token")
        monkeypatch.setenv("GOOGLE_ACCESS_TOKEN", "test_gmail_token")
        monkeypatch.delenv("RESEND_API_KEY", raising=False)
        monkeypatch.delenv("EMAIL_FROM", raising=False)