Compare commits

...

737 Commits

Author SHA1 Message Date
Richard Tang b42a3293f1 docs: change docs tone 2026-02-10 19:37:14 -08:00
Richard Tang ba02e53bdd docs: update the use cases 2026-02-10 18:15:40 -08:00
Richard Tang 40d32f2e01 docs: deployment strategies 2026-02-10 18:02:08 -08:00
Richard Tang 7779bc5336 docs: use cases for first success 2026-02-10 17:42:05 -08:00
Richard Tang a2d21ec7bc docs: update the developer profiles 2026-02-10 13:10:59 -08:00
Richard Tang 06ccc853ee docs: explain developer success as our principle 2026-02-10 13:05:54 -08:00
Richard Tang 4847332161 add placeholders 2026-02-10 13:01:22 -08:00
Richard Tang 8c1ee54725 docs: publish our developer success roadmap 2026-02-10 12:56:54 -08:00
Timothy @aden a12163d63f Merge pull request #4304 from adenhq/fix/init-config
model selection + max_tokens in quickstart
2026-02-09 20:11:55 -08:00
RichardTang-Aden 0cd6f21980 Merge pull request #4270 from TimothyZhang7/feature/hard-goal-negotiation
Feature/hard goal negotiation
2026-02-09 20:04:20 -08:00
Richard Tang a88fc1d75c fix: remove the unnecessary summary before checking capabilities and gaps 2026-02-09 19:59:49 -08:00
Richard Tang e9bde26611 fix: fixed minor issues introduced by the merge 2026-02-09 19:45:55 -08:00
Richard Tang c02f40622c Merge remote-tracking branch 'upstream/main' into feature/hard-goal-negotiation 2026-02-09 19:42:55 -08:00
Timothy @aden 3328a388b3 Merge pull request #3877 from adenhq/fix/oauth-refresh
(micro-fix): update oauth to refresh token
2026-02-09 19:30:49 -08:00
Richard Tang 8f632eb005 feat: add communication style guideline 2026-02-09 19:28:48 -08:00
Richard Tang c8ee961436 fix: update the step label to avoid confusion 2026-02-09 19:04:05 -08:00
Richard Tang bc9f6b0af8 feat: update goal negotiation for a more conversational negotiation 2026-02-09 18:52:07 -08:00
bryan 7d48f17867 model selection + max_tokens in quickstart 2026-02-09 18:07:57 -08:00
RichardTang-Aden 736ae65a1d Merge pull request #4262 from adenhq/feat/build-from-sample
Build from Sample Agent
2026-02-09 16:05:42 -08:00
Bryan @ Aden 76c9f7c9a9 Merge pull request #1834 from fermano/feat/observability-trace-context
feat(observability): structured logging for trace context
2026-02-09 15:25:51 -08:00
Fernando Mano 32ad225d7f feat(observability): Adding OTel-compliant logging to L3 tool logs as introduced by #3715. -- remove redundant text from readme.md 2026-02-09 19:56:17 -03:00
bryan 7ae6f67470 updates to skills, renaming, suggested agents, remove changelog 2026-02-09 13:49:36 -08:00
Timothy @aden 594bceb8f5 Merge branch 'adenhq:main' into feature/hard-goal-negotiation 2026-02-09 12:28:19 -08:00
bryan 9dc0f48ec9 implemented building from sample agent template and updated deep research agent 2026-02-09 12:13:41 -08:00
Fernando Mano ce5a2d4a81 feat(observability): Adding OTel-compliant logging to L3 tool logs as introduced by #3715. -- remove line that would cause third-party loggers to log twice 2026-02-09 09:36:25 -03:00
Fernando Mano 7f489cee46 Merge branch 'main' into feat/observability-trace-context 2026-02-09 09:25:51 -03:00
Anjali Yadav 3c2d669a2f fix(credentials): correctly resolve integration_id in AdenCredentialResponse.from_dict (#3965)
* fix(credentials): respect integration_id in AdenCredentialResponse.from_dict

* style: fix forward reference annotation for Ruff
2026-02-09 17:52:55 +08:00
Timothy @aden ec36e96499 Merge pull request #4146 from TimothyZhang7/main
docs(release): release v0.4.2 - resumable sessions
2026-02-08 20:49:59 -08:00
Timothy 9ecd4980e4 chore: release v0.4.2 - resumable sessions
- Add comprehensive resumable session functionality
- Immediate pause with Ctrl+Z and /pause command
- Auto-save state on quit
- Session management with /resume and /sessions commands
- Full memory and conversation history restoration
- See CHANGELOG.md for complete list of changes
2026-02-08 20:44:36 -08:00
Timothy @aden 64446ff9b6 Merge pull request #4141 from TimothyZhang7/feature/resumable-sessions
Feature/resumable sessions

Release candidate for v0.4.2
2026-02-08 20:40:33 -08:00
Timothy e3d2262292 fix: quit timeout, and tui interactions 2026-02-08 20:30:30 -08:00
Timothy 891cfa387a Merge branch 'main' into feature/resumable-sessions 2026-02-08 19:46:30 -08:00
Timothy f0243fddf2 feat: session resumable states and checkpoint system 2026-02-08 19:42:02 -08:00
Bryan @ Aden 85ff8e364b Merge pull request #3828 from Sandeepa-git/docs/fix-contributing-typo
docs(contributing): fix formatting typo in issue link
2026-02-08 19:07:48 -08:00
Bryan @ Aden 75f1afe8e3 Merge pull request #3857 from Manudeserti/docs/add-deep-research-readme
docs: add missing README for Deep Research Agent
2026-02-08 19:07:40 -08:00
Bryan @ Aden 7b660311e5 Merge pull request #4025 from hamzanajam7/docs/fix-getting-started-project-structure
docs(getting-started): fix project structure tree for tools and mcp_server location
2026-02-08 18:44:24 -08:00
Bryan @ Aden 98a493296d Merge pull request #4026 from hamzanajam7/docs/add-contributing-link-readme
docs(readme): add Contributing link to Quick Links section
2026-02-08 18:43:23 -08:00
RichardTang-Aden bc2a42aed2 Merge pull request #3901 from Templar121/docs/clarify-hive-test-generation
docs: clarify test generation responsibility in hive skill
2026-02-08 14:22:31 -08:00
Gaurav kapur 8b501d9091 fix: write node outputs to memory before edge evaluation (#3599) (#3694)
* fix: write node outputs to memory before edge evaluation (#3599)

* test: add regression tests for conditional edge direct key access
2026-02-08 23:23:37 +08:00
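A minimal sketch of the ordering this fix enforces (all names are illustrative, not the framework's actual API): node outputs must land in memory before conditional edges are evaluated, so an edge that reads an output key directly sees the value the node just produced.

```python
def run_node_and_route(node, memory: dict, edges: list):
    # Sketch of the fix above: persist outputs BEFORE edge evaluation,
    # so a conditional edge reading memory["some_key"] directly sees
    # fresh values rather than the previous iteration's state.
    outputs = node.run(memory)
    memory.update(outputs)                             # write first...
    return [e for e in edges if e.condition(memory)]   # ...then evaluate
```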
Fernando Mano 0304b392b2 feat(observability): Adding OTel-compliant logging to L3 tool logs as introduced by #3715. 2026-02-07 19:52:03 -03:00
hamzanajam7 ae9b4e82fe docs(readme): add Contributing link to Quick Links section
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-07 14:52:50 -05:00
hamzanajam7 4bac5e4c46 docs(getting-started): fix project structure tree for tools and mcp_server location
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-07 14:49:04 -05:00
Fernando Mano c4d3400ec4 Merge main into feat/observability-trace-context; resolve execution_stream conflicts 2026-02-07 16:49:04 -03:00
Amit Kumar 6d0a3b952a feat(tools): add Apollo.io contact and company data enrichment integration (#3167)
Add Apollo.io MCP tool integration for B2B contact and company data
enrichment. Implements 4 MCP tools:
- apollo_enrich_person: Enrich contact by email, LinkedIn URL, or name+domain
- apollo_enrich_company: Enrich company by domain
- apollo_search_people: Search contacts with filters (titles, seniorities, etc.)
- apollo_search_companies: Search companies with filters (industries, size, etc.)

Features:
- Authentication via X-Api-Key header (APOLLO_API_KEY env var)
- Credential spec in dedicated apollo.py (follows repo pattern)
- Comprehensive error handling (401, 403, 404, 422, 429)
- Full test coverage (36 tests)

Closes #3061
2026-02-07 21:57:13 +08:00
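The auth pattern named above, as a hedged sketch: the key travels in an X-Api-Key header and is read from the APOLLO_API_KEY environment variable. The endpoint path, payload shape, and helper name are assumptions, not the integration's actual surface.

```python
import os

import httpx

def apollo_post(path: str, payload: dict) -> dict:
    # Hypothetical helper: X-Api-Key header auth via APOLLO_API_KEY,
    # per the commit above; the base URL and payload shape are assumed.
    resp = httpx.post(
        f"https://api.apollo.io/api/v1{path}",
        json=payload,
        headers={"X-Api-Key": os.environ["APOLLO_API_KEY"]},
        timeout=30.0,
    )
    resp.raise_for_status()  # surfaces 401/403/404/422/429 as HTTPStatusError
    return resp.json()
```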
Subhayan Mukherjee 873fcd5822 docs: clarify test generation responsibility in hive skill 2026-02-07 11:39:52 +05:30
RichardTang-Aden 2a98d3a489 Merge pull request #3890 from RichardTang-Aden/update-readme-gifs
docs(readme): quick fix for the doc links
2026-02-06 20:34:34 -08:00
Richard Tang b681ba03b1 chore: quick fix for the doc links 2026-02-06 20:32:20 -08:00
RichardTang-Aden fe775a36c0 Merge pull request #3887 from RichardTang-Aden/update-readme-gifs
feat: add video in the README
2026-02-06 20:21:48 -08:00
Timothy @aden 2df9adcb43 Merge pull request #3886 from TimothyZhang7/fix/quickstart-secret-key
fix(micro-fix): quickstart secret key setup
2026-02-06 20:21:06 -08:00
Richard Tang c756cbf6d5 feat: add video in the README 2026-02-06 20:20:53 -08:00
Timothy d0ac67c9d3 fix: quickstart secret key setup 2026-02-06 20:18:12 -08:00
Timothy 47cd55052f feat: hive-create needs to do some hard negotiation 2026-02-06 19:56:05 -08:00
bryan fb203b5bdf update oauth to refresh token 2026-02-06 19:43:30 -08:00
RichardTang-Aden 6ee47e243d Merge pull request #3876 from RichardTang-Aden/update-readme
Docs Update readme
2026-02-06 19:39:16 -08:00
Richard Tang c1844b7a9d docs: improve readme 2026-02-06 19:30:16 -08:00
Richard Tang 99a29e79e5 fix: update documentation from python run to uv run 2026-02-06 19:22:16 -08:00
Richard Tang 589a66ef26 docs: remove unused docs 2026-02-06 19:19:49 -08:00
RichardTang-Aden 3f960763cb Merge pull request #3875 from RichardTang-Aden/update-readme
Update readme images
2026-02-06 19:08:46 -08:00
Richard Tang 15f8f3783c chore: update images 2026-02-06 19:07:47 -08:00
Richard Tang a2b045c7e3 chore: remove unnecessary links 2026-02-06 18:18:50 -08:00
Richard Tang 055cef2fdc feat: improve quickstart.sh messages 2026-02-06 18:15:13 -08:00
Timothy @aden 6c6c69cbc3 Merge pull request #3872 from TimothyZhang7/refactor/consolidate-multi-level-log-for-tui
docs(path): Align Agent Storage Path to .hive/agents/{agent_name}/
2026-02-06 17:40:39 -08:00
Timothy 6fe0062e6e refactor(path): consolidate tui runner log path 2026-02-06 17:33:32 -08:00
Richard Tang 26b8b2f448 chore: move unused docs 2026-02-06 17:11:13 -08:00
Timothy @aden 7e40d6950a Merge pull request #3871 from TimothyZhang7/main
fix(micro-fix): uv paths in templates
2026-02-06 17:07:19 -08:00
Timothy 590bfa92cb chore: fix mcp server default config 2026-02-06 17:04:03 -08:00
Timothy f0e89a1720 fix: mcp server config with uv 2026-02-06 17:01:42 -08:00
Timothy @aden 575563b1e8 Merge pull request #3870 from adenhq/feat/multi-level-logging
fix: hardening hive cli setup
2026-02-06 16:37:37 -08:00
Timothy 82ea0e47ce fix: hardening hive cli setup 2026-02-06 16:31:31 -08:00
RichardTang-Aden 2f57ca10f7 Merge pull request #3862 from adenhq/feat/hive-tui
(micro-fix): documentation update
2026-02-06 16:19:46 -08:00
RichardTang-Aden 75c2d541c4 Merge branch 'main' into feat/hive-tui 2026-02-06 16:19:30 -08:00
Richard Tang b666f8b50b docs: minor doc update 2026-02-06 16:16:56 -08:00
RichardTang-Aden 09f9322676 Merge pull request #3863 from RichardTang-Aden/fix-remove-old-mock-mode
Fix remove old mock mode
2026-02-06 16:02:01 -08:00
Richard Tang f9a864ef93 fix: remove mock mode in the template 2026-02-06 15:59:48 -08:00
Richard Tang 27f28afe9c fix: remove --mock in the codebase + documentation 2026-02-06 15:59:22 -08:00
Timothy @aden 8f85722fef Merge pull request #3715 from adenhq/feat/multi-level-logging
Feat/multi level logging
2026-02-06 15:59:16 -08:00
bryan 5588445a01 documentation update 2026-02-06 15:59:01 -08:00
Timothy 40529b5722 fix: debugger to instruct on hive tui 2026-02-06 15:56:13 -08:00
Timothy @aden cee632f50c Merge pull request #3855 from adenhq/feat/hive-tui
update tui to support menu, highlight/copy, update quickstart
2026-02-06 15:24:10 -08:00
bryan 3453e3aa05 Merge branch 'feat/hive-tui' into feat/multi-level-logging 2026-02-06 15:21:52 -08:00
Timothy 8de637c421 fix: deprecated tests 2026-02-06 14:00:31 -08:00
Timothy 6c75de862c fix: skip outdated tests 2026-02-06 13:46:12 -08:00
Timothy 2971134882 docs: runtime logging structure 2026-02-06 13:26:53 -08:00
Timothy 6e79860b43 feat: hive debugger skill 2026-02-06 13:22:25 -08:00
Manudeserti 3f6bdda2a0 docs: add missing README for deep_research_agent 2026-02-06 18:11:00 -03:00
bryan 74d0287ec5 update tui to support menu, highlight/copy, update quickstart to include hive tui 2026-02-06 13:10:04 -08:00
RichardTang-Aden 51e81d80fc Merge pull request #3853 from adenhq/docs-key-concepts
Docs key concepts
2026-02-06 12:45:16 -08:00
Richard Tang cd014e41e4 docs: update links in the README.md 2026-02-06 12:44:34 -08:00
Richard Tang 830f11c47d docs: add key concept section 2026-02-06 12:41:22 -08:00
Timothy a73239dd98 feat: runtime log tools 2026-02-06 12:37:18 -08:00
Timothy d68783a612 refactor: unify storage layer for agent runtime 2026-02-06 12:20:46 -08:00
Timothy a28ea40a7d fix: execution log details in error trace 2026-02-06 11:03:19 -08:00
Sandeepa f2492bd4d4 docs(contributing): fix formatting typo in issue link 2026-02-07 00:22:48 +05:30
Timothy @aden b22be7a6cb Merge pull request #3818 from TimothyZhang7/main
(micro-fix)(skills): cursor skill symlinks to claude skill
2026-02-06 09:32:23 -08:00
bryan 5b00445c05 Merge branch 'main' into feat/multi-level-logging 2026-02-05 19:09:18 -08:00
Timothy @aden 5179677e8f Merge pull request #3744 from adenhq/chore/update-hive-credential
(micro-fix): update hive-credentials
2026-02-05 18:55:19 -08:00
bryan 2c25b2eae7 Merge branch 'main' into chore/update-hive-credential 2026-02-05 18:45:11 -08:00
RichardTang-Aden f6705fe2d3 Merge pull request #3746 from RichardTang-Aden/integration-ci
(micro-fix)(chore): fix format
2026-02-05 18:36:32 -08:00
Richard Tang c2771fed20 chore: fix format 2026-02-05 18:30:50 -08:00
RichardTang-Aden fc781eccd9 Merge pull request #3745 from RichardTang-Aden/integration-ci
(micro-fix)(chore): fix lint
2026-02-05 18:15:38 -08:00
bryan d5a25ae081 update hive-credentials 2026-02-05 18:13:25 -08:00
Richard Tang 23b6fb6391 chore: fix lint 2026-02-05 18:12:47 -08:00
Timothy 433967f0cf fix: cursor skill symlinks to claude skill 2026-02-05 18:11:24 -08:00
RichardTang-Aden 2a876c2a10 Merge pull request #3743 from RichardTang-Aden/integration-ci
feat(ci): add integration credential specs and CI validation
2026-02-05 18:06:22 -08:00
Richard Tang ff0adeaba7 docs: update outdated skill references 2026-02-05 18:00:06 -08:00
Richard Tang 846edbf256 docs: update documentation structure 2026-02-05 18:00:04 -08:00
Richard Tang c68dd48f6d feat: add slack credential spec and contribution doc 2026-02-05 17:39:44 -08:00
bryan 8b828dd139 Merge branch 'main' into feat/multi-level-logging 2026-02-05 17:19:17 -08:00
Richard Tang 50c0a5da9e feat: integration credentials implementation check 2026-02-05 17:06:34 -08:00
Timothy @aden 2f0e5c42f1 Merge pull request #3724 from TimothyZhang7/main
docs(hive): hive commands rebrand
2026-02-05 15:06:25 -08:00
Timothy @aden 903288468a Merge pull request #3725 from adenhq/chore/gmail-to-google
(micro-fix): changing gmail to google
2026-02-05 14:54:18 -08:00
bryan 9e3bba6f59 updated tests 2026-02-05 14:52:19 -08:00
bryan bc16f0752f changing gmail to google 2026-02-05 14:46:38 -08:00
Timothy 86badd70fa docs(hive): hive commands rebrand 2026-02-05 14:35:50 -08:00
Timothy @aden ce5379516c Merge pull request #3722 from TimothyZhang7/main
docs(templates): put example templates in there
2026-02-05 14:31:50 -08:00
Timothy a50078bbf2 chore: moves the templates 2026-02-05 14:25:49 -08:00
Timothy 2cef168442 fix: aden hive url 2026-02-05 14:08:18 -08:00
Timothy @aden 0a1a9e3545 Merge pull request #3720 from TimothyZhang7/feature/example-agent-registry
docs(skills): Rename skills to hive-* namespace and improve create workflow
2026-02-05 13:59:45 -08:00
Timothy 3c8682d80c fix: mention of skill in readme 2026-02-05 13:59:02 -08:00
Timothy ecc5a1608f fix: make sure of the skill ordering 2026-02-05 13:54:20 -08:00
RichardTang-Aden bc81b55600 Merge pull request #3713 from adenhq/update/gmail-send-tool
(micro-fix): created gmail send tool
2026-02-05 13:15:08 -08:00
Timothy 28b628c1b4 fix: update skill names and examples 2026-02-05 13:13:19 -08:00
Timothy 148264ac73 fix: skill problems 2026-02-05 13:11:18 -08:00
bryan 4046e4e379 created gmail send tool 2026-02-05 13:10:47 -08:00
Timothy 28298d9af2 fix: streamline the executor configuration and data tool usage 2026-02-05 12:50:00 -08:00
Fernando Mano 9d156325e0 Merge branch 'main' into feat/observability-trace-context 2026-02-05 17:06:07 -03:00
bryan 221712128d bug fix for crashing agent 2026-02-05 11:59:57 -08:00
bryan e9fc36f2d3 Merge branch 'main' into feat/multi-level-logging 2026-02-05 09:10:56 -08:00
bryan 305b880b1d including missing tool log inputs 2026-02-05 09:08:42 -08:00
Anshumaan Saraf 34782a6b85 docs(CONTRIBUTING): add upstream sync steps (#3477)
Fixes #2692

Added steps to configure the upstream remote and sync the main branch
before creating a feature branch. This helps contributors avoid starting
from stale code and reduces merge conflicts.
2026-02-05 16:28:07 +08:00
Patrick d25d94e71b docs(aden-credential-sync): typo (#3601) 2026-02-05 16:11:13 +08:00
Timothy @aden 51f1b449cd Merge pull request #3584 from TimothyZhang7/main
fix: gap between lint and format
2026-02-04 21:05:22 -08:00
Timothy 804e47dde4 fix: gap between lint and format 2026-02-04 21:02:50 -08:00
Timothy @aden 582c810d15 Merge pull request #3583 from TimothyZhang7/main
fix: test case
2026-02-04 20:59:58 -08:00
Timothy cede629718 fix: test case 2026-02-04 20:53:53 -08:00
Timothy @aden 10941dc7fc Merge pull request #3579 from TimothyZhang7/fix/do-not-mention-deprecated-nodes
fix: mentions of deprecated nodes in agent builder
2026-02-04 20:46:10 -08:00
Timothy c1c16878e4 fix: mentions of deprecated nodes in agent builder 2026-02-04 20:42:16 -08:00
Timothy @aden 80a41b434b Merge pull request #3240 from levxn/main
Integration: Advanced Slack MCP tools integration (45+ tools), compatibility verified with all other existing tools
2026-02-04 20:34:24 -08:00
Timothy 9a8e117f1d chore: fix lint and tests 2026-02-04 20:30:47 -08:00
Bryan @ Aden 878603033a Merge pull request #3573 from TimothyZhang7/feature/quickstart-credential-store
feat: credential store init, add textual dep, standardize uv commands
2026-02-04 20:20:52 -08:00
Timothy 1c6f17e8db Merge remote-tracking branch 'origin/main' into feature/quickstart-credential-store 2026-02-04 20:03:44 -08:00
Timothy 8f32ef8064 chore: uv for all 2026-02-04 19:57:41 -08:00
bryan 7519c73f2a Merge branch 'main' into feat/multi-level-logging 2026-02-04 19:34:01 -08:00
Bryan @ Aden e12bc96e21 Merge pull request #3557 from TimothyZhang7/feature/tui-dashboard
feat: TUI dashboard, EventLoopNode refinements, auto-creation, data tools, and runner overhaul
2026-02-04 19:04:41 -08:00
bryan bf402aaa18 initial multi-level logging 2026-02-04 17:26:58 -08:00
RichardTang-Aden 2355d3d729 Merge pull request #3525 from Acid-OP/fix/on-failure-edge-routing
fix: follow ON_FAILURE edges when node fails after max retries
2026-02-04 17:00:35 -08:00
Richard Tang a093a59cb0 test: add tests for ON_FAILURE edge routing after max retries
Covers:
- ON_FAILURE edge followed when node fails after max retries
- Original termination behavior preserved when no ON_FAILURE edge exists
- ON_FAILURE edge not followed on success (only ON_SUCCESS fires)
- ON_FAILURE routing with max_retries=0 (no retries)
- Failure handler appears in execution path and node_visit_counts
2026-02-04 16:58:30 -08:00
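In miniature, the routing rule those tests pin down (a sketch only; the executor's real types differ, and edge objects are assumed to expose .kind and .target): retry until max_retries is exhausted, then follow an ON_FAILURE edge if one exists, otherwise terminate as before.

```python
def route_after_attempt(succeeded: bool, attempts: int,
                        max_retries: int, edges: list) -> str:
    # Illustrative routing rule matching the test cases above.
    if succeeded:
        # Only ON_SUCCESS fires on success, never ON_FAILURE.
        return next((e.target for e in edges if e.kind == "ON_SUCCESS"), "done")
    if attempts <= max_retries:
        return "retry"
    # Max retries exhausted: prefer an ON_FAILURE handler if present,
    # preserving the original terminate-on-failure behavior otherwise.
    failure = [e for e in edges if e.kind == "ON_FAILURE"]
    return failure[0].target if failure else "terminate"
```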
Gaurav Kapur d7917988c3 fix: follow ON_FAILURE edges when node fails after max retries 2026-02-04 16:51:46 -08:00
Timothy @aden ae566a2027 Merge pull request #2652 from mubarakar95/feature/tui-dashboard
feat: Implement production-ready TUI with interactive agent execution, real-time monitoring, and screenshot support
2026-02-04 15:43:54 -08:00
Timothy b15473d3f3 fix: graph validation 2026-02-04 15:32:53 -08:00
RichardTang-Aden 265bf885ec Merge pull request #3556 from RichardTang-Aden/remove-unused-scripts
chore(scripts): remove deprecated setup scripts
2026-02-04 15:07:12 -08:00
Richard Tang e318281989 refactor: clean up the use of setup-python 2026-02-04 14:49:39 -08:00
Timothy 3e2a11d60d feat: integrate agent builder with tui 2026-02-04 14:47:28 -08:00
Timothy @aden 4b9f73310e Merge branch 'main' into feature/tui-dashboard 2026-02-04 10:53:43 -08:00
Levin b17c26116d Merge branch 'adenhq:main' into main 2026-02-04 23:46:35 +05:30
Timothy @aden 3114af75e4 Merge pull request #3536 from TimothyZhang7/feature/coding-agent-reskill
feat: update skills and agent builder tools, bump pinned ruff version
2026-02-04 10:11:58 -08:00
Levin 7a6d10639b Merge branch 'main' into main 2026-02-04 23:39:17 +05:30
Timothy 6ff29ea6aa fix: max retry goes back to 0 for event loop node 2026-02-04 10:08:40 -08:00
Timothy a23f01973a feat: update skills and agent builder tools, bump pinned ruff version 2026-02-04 09:52:28 -08:00
bryan 0aaa3a3eca uv.lock update 2026-02-04 07:57:22 -08:00
bryan 82f05d1102 Merge branch 'main' into feature/tui-dashboard 2026-02-04 07:49:08 -08:00
Bryan @ Aden 8ff6d9c8bd Merge pull request #3423 from adenhq/event-loop-arch
Event Loop Architecture: Streaming Multi-Turn Agent Nodes
2026-02-04 07:43:56 -08:00
bryan a2e102fe15 windows lint fix 2026-02-03 20:00:11 -08:00
Timothy 119280da1a Merge remote-tracking branch 'upstream/main' into event-loop-arch
Resolve conflict in tools/mcp_server.py: take main's
CredentialStoreAdapter.default() which encapsulates the same
CompositeStorage logic our branch had inline.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 19:42:47 -08:00
RichardTang-Aden 4d49f74d5a Merge pull request #3372 from ranjithkumar9343/ranjithkumar9343-patch-1
docs: add Windows compatibility warning
2026-02-03 19:36:50 -08:00
Timothy 6a42b9c66b fix: resolve CI failures in lint and tests
- Fix max_node_visits blocking executor retries: the visit count was
  incremented on every loop iteration including retries, causing nodes
  with max_node_visits=1 (default) to be skipped on retry. Added
  _is_retry flag to distinguish retries from new visits via edge
  traversal.

- Fix 20 UP042 lint errors: replace (str, Enum) with StrEnum across
  14 files. Python 3.11+ StrEnum is preferred and enforced by ruff.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 19:36:35 -08:00
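The UP042 half of this fix is mechanical; a small before/after sketch (the enum name here is made up):

```python
from enum import Enum, StrEnum  # StrEnum requires Python 3.11+

class NodeStateOld(str, Enum):   # the (str, Enum) mixin ruff's UP042 flags
    RUNNING = "running"

class NodeState(StrEnum):        # the preferred replacement
    RUNNING = "running"

# String comparison behavior is preserved across the migration:
assert NodeState.RUNNING == "running"
```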
RichardTang-Aden fc4a39480a Merge pull request #3455 from adenhq/feat/non-oauth-setup
update credentials store to work with non-oauth keys
2026-02-03 19:31:46 -08:00
Timothy b98afb01c8 chore: lint 2026-02-03 18:01:39 -08:00
Timothy ccd6bb7656 chore: lint 2026-02-03 17:57:02 -08:00
bryan ea30e5c631 consolidate workspace to uv monorepo 2026-02-03 17:47:57 -08:00
Timothy d16a3c3b22 chore: give command line to demo 2026-02-03 17:38:21 -08:00
Timothy a03bd78c2e fix: formulate tool call results 2026-02-03 17:25:44 -08:00
bryan 3cca41aab1 updating tests 2026-02-03 15:17:19 -08:00
bryan d19aaed946 fixing linter issues 2026-02-03 14:55:51 -08:00
RichardTang-Aden 9a7db8cf94 Merge pull request #3450 from adenhq/adenhq-patch-1
(micro-fix): Update issue templates
2026-02-03 14:53:09 -08:00
bryan f50630c551 update credentials store to work with non-oauth keys 2026-02-03 14:49:12 -08:00
Timothy 0ef2e64733 fix: use mcp tools properly 2026-02-03 14:36:09 -08:00
Aden HQ 3a8e121d43 Update issue templates 2026-02-03 14:24:13 -08:00
bryan 23e249144d Merge remote-tracking branch 'origin/main' into feature/tui-dashboard
# Conflicts:
#	.github/workflows/ci.yml
2026-02-03 12:26:51 -08:00
bryan 25014bfa89 update ci to use uv, updated linting 2026-02-03 12:14:13 -08:00
bryan 78ea585779 tui upgrades 2026-02-03 11:55:22 -08:00
bryan ac13c11f89 Merge remote-tracking branch 'origin/event-loop-arch' into feature/tui-dashboard 2026-02-03 11:06:44 -08:00
Timothy 51d341b88c fix: tool pruning logic 2026-02-03 10:29:35 -08:00
Timothy 7dd70b8e31 feat: tool truncation 2026-02-03 08:50:14 -08:00
ranjithkumar9343 84b332d989 docs: add Windows compatibility warning
Added a note for Windows users to use WSL/Git Bash to prevent setup errors.
2026-02-03 17:43:36 +05:30
Levin fd1826a267 Merge branch 'adenhq:main' into main 2026-02-03 16:11:05 +05:30
Anjali Yadav bcc6848275 fix(mcp): handle missing exports directory in test generation tools (#3066)
* fix(mcp): handle missing exports path in test generation tools

* fix(mcp): centralize agent path validation across test tools

* fix: remove duplicate if blocks and improve error hint message

---------

Co-authored-by: hundao <alchemy_wimp@hotmail.com>
2026-02-03 18:15:34 +08:00
Hundao 75dd053a40 fix(ci): migrate remaining CI jobs from pip to uv (#3366)
Closes #3363
2026-02-03 18:11:32 +08:00
Yogesh Sakharam Diwate 20f2aa09f2 docs(readme): update architecture diagram src path (#3348) 2026-02-03 14:58:29 +08:00
RichardTang-Aden fb8c810b3d Merge pull request #3293 from Antiarin/feat/llm-datetime-context
feat: inject runtime datetime into LLM system prompts
2026-02-02 20:07:00 -08:00
levxn b99b6c5cd3 latest upstream 2026-02-03 09:35:38 +05:30
RichardTang-Aden ad21cf4243 Merge pull request #3327 from adenhq/fix-for-quickstart
fix: replace the use of PYTHON_CMD in quickstart.sh
2026-02-02 19:35:19 -08:00
Richard Tang 1e45cfff67 feat: Check for litellm import in both CORE_PYTHON and TOOLS_PYTHON environments. 2026-02-02 19:27:32 -08:00
Richard Tang 0280600a47 fix: fix error when testing import installation 2026-02-02 19:24:13 -08:00
RichardTang-Aden 571ad518dc Merge pull request #3064 from kuldeepgaur02/fix/quickstart-provider-selection
fix: silent exit when selecting non-Anthropic LLM provider
2026-02-02 19:03:16 -08:00
RichardTang-Aden fe37a25cf1 Merge pull request #3322 from RichardTang-Aden/main
fix: Resolve quickstart.sh compatibility issues and migrate from pip to uv
2026-02-02 18:55:00 -08:00
Timothy e06138628c chore: remove local claude settings 2026-02-02 18:48:18 -08:00
Timothy 1ed0edd158 Merge remote-tracking branch 'upstream/main' into event-loop-arch
# Conflicts:
#	.claude/settings.local.json
2026-02-02 18:46:07 -08:00
Richard Tang 49dbc46082 feat: migrate from pip to uv 2026-02-02 18:45:09 -08:00
Richard Tang a16a4adc09 feat: add message when LLM key is not available 2026-02-02 18:41:20 -08:00
Richard Tang b4ab1cbd56 fix: fix quickstart compatibility 2026-02-02 18:34:09 -08:00
Timothy 6faa63f0d0 fix: loop prevention in feedback edges 2026-02-02 18:26:45 -08:00
bryan f4737dcfe7 Merge remote-tracking branch 'origin/event-loop-arch' into feature/tui-dashboard 2026-02-02 18:25:46 -08:00
Richard Tang 2b44af427f fix: quickstart.sh compatibility fix 2026-02-02 18:21:40 -08:00
RichardTang-Aden 11f7401bc2 Merge branch 'adenhq:main' into main 2026-02-02 18:00:40 -08:00
RichardTang-Aden db7b5180dd Merge pull request #3270 from adenhq/bot-detecting-issue-size
feat: Edit bot prompt to be able to decide on the technical size of issue
2026-02-02 17:49:52 -08:00
Timothy @aden 5b4e56252c Merge pull request #2820 from lakshitaa-chellaramani/feature/github-tool
Feature/GitHub tool
2026-02-02 17:49:34 -08:00
Timothy e3c71f77de chore: fix ruff format 2026-02-02 17:37:37 -08:00
Timothy b09824faec chore: fix lint 2026-02-02 17:36:02 -08:00
RichardTang-Aden c69bc24598 Merge pull request #3301 from adenhq/add-example-structure
docs: sample agent folder, remove docker file in Readme
2026-02-02 16:47:04 -08:00
Richard Tang 0cf17e1c63 feat: sample agent folder, remove docker file in Readme 2026-02-02 14:15:58 -08:00
Timothy @aden feac803491 Merge pull request #3256 from adenhq/feat/integration-tests
Feat/integration tests
2026-02-02 13:23:42 -08:00
Timothy 4aacec30d8 fix: text delta granularity, tool limit problem 2026-02-02 13:21:50 -08:00
RichardTang-Aden b459a2f7a9 Merge pull request #918 from Siddharth2624/fix-malformed-json-tool-args
Handle malformed JSON tool arguments in LiteLLMProvider
2026-02-02 13:04:13 -08:00
bryan ca7f6d3514 fixes to linting 2026-02-02 12:52:11 -08:00
Antiarin ca8ede65f0 feat: inject runtime datetime into LLM system prompts 2026-02-03 02:10:14 +05:30
bryan b033c56ae5 Merge remote-tracking branch 'origin/main' into feature/tui-dashboard 2026-02-02 12:29:27 -08:00
Richard Tang 9a177c46e1 feat: edit bot prompt to be able to decide on the technical size of issue 2026-02-02 11:39:44 -08:00
bryan d49e858d32 lint update 2026-02-02 11:12:09 -08:00
Timothy @aden 20bea9cd7f Merge pull request #2273 from krish341360/fix/concurrent-storage-race-condition
fix: race condition in ConcurrentStorage and cache invalidation bug
2026-02-02 10:57:45 -08:00
bryan d7afa5dcf2 wp-12 2026-02-02 10:41:12 -08:00
Timothy 22e816bf86 chore: update gitignore 2026-02-02 10:30:03 -08:00
krish341360 a7709d489c style: apply ruff formatting to test file 2026-02-02 23:53:27 +05:30
Timothy @aden 3240616808 Merge pull request #3250 from adenhq/feat/validation-client-facing
(micro-fix): added graph validation for client-facing nodes [WP-10]
2026-02-02 10:02:38 -08:00
krish341360 18dfc997b8 fix: resolve lint errors in test file 2026-02-02 23:24:21 +05:30
Timothy @aden 92d0b6addf Merge pull request #3050 from Rockysahu704/rocky-first-contribution
docs: clarify who Hive is for and when to use it
2026-02-02 09:51:05 -08:00
Timothy @aden b9f83d4d61 Merge pull request #3244 from TimothyZhang7/feature/aden-sync-by-provider
Feature/aden sync by provider
2026-02-02 09:39:00 -08:00
levxn 694feaffd2 phase 3 tools implemented, now totaling 45+ Slack tools for multipurpose integration including CRM support 2026-02-02 22:44:10 +05:30
Timothy @aden 9c16826ad3 Merge pull request #3137 from adenhq/feat/clientIO-gateway
implemented clientIO gateway [WP-9]
2026-02-02 07:29:03 -08:00
levxn eb68e2143b slack tools add ons and latest upstream commit 2026-02-02 19:07:05 +05:30
Harsh Kishorani f305745295 feat(setup): add native PowerShell setup script for Windows (#746)
* feat(setup): add PowerShell setup script with venv for Windows

* docs: restore PEP 668 troubleshooting section

* docs: restore Alpine Linux setup section

---------

Co-authored-by: hundao <alchemy_wimp@hotmail.com>
2026-02-02 17:02:19 +08:00
Timothy df4d0ad3fd feat: aden provider credential store by provider 2026-02-01 20:34:21 -08:00
bryan 9034d1dc71 lint fix 2026-02-01 20:26:36 -08:00
bryan 537172d8ce implemented clientIO gateway [WP-9] 2026-02-01 20:23:26 -08:00
Timothy 20b2e4b3dd fix: robust compaction logic 2026-02-01 19:59:27 -08:00
RichardTang-Aden fc22586752 Merge pull request #3128 from adenhq/fix/tests
(micro-fix): fixed pytests and warnings
2026-02-01 19:53:07 -08:00
Richard Tang 646440eba3 chore: update developer doc 2026-02-01 19:49:35 -08:00
Richard Tang 53e5579326 fix: remove requirements.txt 2026-02-01 19:45:32 -08:00
Richard Tang 29a1630d0f feat: add tool tests in CI 2026-02-01 19:38:33 -08:00
bryan 171f4ab2ae fixed pytests and warnings 2026-02-01 19:11:44 -08:00
Timothy @aden a86043a2ec Merge pull request #3127 from TimothyZhang7/feature/event-loop-wp8
Feature/event loop wp8
2026-02-01 19:07:33 -08:00
Timothy 3947da2cf1 Merge upstream/event-loop-arch into feature/event-loop-wp8
Brings in upstream changes: email tool, csv/pdf fixes, docs updates,
agent builder export atomicity fix, JSON extraction validation bugfix.
No conflicts.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 19:06:30 -08:00
Timothy 17caab6563 feature: remove hard failure on schema mismatch for context hand off 2026-02-01 18:55:41 -08:00
Timothy @aden a5ae071a03 Merge pull request #723 from trinh31201/bugfix/json-extraction-validation
micro-fix(graph): validate LLM JSON extraction to prevent empty/fabricated data
2026-02-01 18:51:51 -08:00
bryan 9c33da7b8d added graph validation for client-facing nodes [WP-10] 2026-02-01 18:45:35 -08:00
Timothy 94d31743b0 fix: sync with wp7 2026-02-01 18:14:04 -08:00
Timothy 70db618c6e feat: event loop node implementation 2026-02-01 17:16:18 -08:00
levxn 960a4549ef latest upstream v1 2026-02-01 19:03:39 +05:30
Kuldeep Raj Gour 363a650dfa fix: silent exit when selecting non-Anthropic LLM provider
prompt_choice used return codes to pass selections. Combined with set -e,
non-zero returns (options 2-6) caused immediate script exit.

Fix: Use global variable PROMPT_CHOICE instead of return codes.
2026-02-01 15:51:46 +05:45
Rocky Sahu b6e2634537 docs: clarify who Hive is for and when to use it 2026-02-01 14:38:02 +05:30
Anshumaan Saraf 23146c8dae docs: remove duplicate entry in Edge Protocol docstring (#2994)
Fixes #2717

The Edge Types list in edge.py had 'always' listed twice.
Removed the duplicate line.
2026-02-01 15:19:11 +08:00
Siddharth Varshney 9f424f2fc0 Remove unused Fake* classes and unrelated note block
- Remove unused FakeFunction, FakeToolCall, FakeMessage, FakeChoice, FakeResponse classes from test_litellm_provider.py
- Remove unrelated note block from building-production-ai-agents.md
- Fix lint issues (trailing whitespace)
2026-01-31 20:56:52 +00:00
levxn 25989d9f90 slack add ons v3, now ~30 tools 2026-02-01 01:29:32 +05:30
Lakshitaa Chellaramani 0715fc5498 Merge branch 'main' into feature/github-tool 2026-01-31 23:31:22 +05:30
lakshitaa f9fddd6663 fix(github-tool): Address PR feedback - security and integration fixes
Addresses all blockers and suggestions from code review:

**Blockers fixed:**
1. Register tools in tools/__init__.py - Added import, registration call,
   and all 13 tool names to return list
2. Add credential spec - Created GitHub entry in credentials/integrations.py
   with env_var, tools list, help URL, and health check config
3. Move tests to correct location - Relocated from
   tools/src/.../github_tool/tests/ to tools/tests/tools/test_github_tool.py
4. Removed .claude/settings.local.json from PR

**Security improvements:**
1. URL parameter sanitization - Added _sanitize_path_param() to reject
   path traversal attempts (/ or ..) in owner, repo, branch, username params
2. Error message sanitization - Added _sanitize_error_message() to prevent
   token leaks from httpx.RequestError exceptions

All 38 tests passing.
2026-01-31 23:26:33 +05:30
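A hedged sketch of the two sanitizers named above; only the helper names come from the commit, the bodies are assumed:

```python
def _sanitize_path_param(value: str) -> str:
    # Reject path traversal attempts in URL segments
    # (owner, repo, branch, username).
    if "/" in value or ".." in value:
        raise ValueError(f"invalid path parameter: {value!r}")
    return value

def _sanitize_error_message(message: str, token: str) -> str:
    # Keep credential material out of error text surfaced from
    # httpx.RequestError exceptions.
    return message.replace(token, "***") if token else message
```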
levxn 684da96a83 slack add ons v2, now 15 tools in total 2026-01-31 20:39:38 +05:30
levxn abae7979cb excluded a file not needed 2026-01-31 18:37:35 +05:30
levxn 49bce57fcf slack bot integration v1 2026-01-31 18:35:27 +05:30
Muzzaiyyan Hussain 58b60b84fd fix: make agent builder exports atomic (#2605)
* fix: make agent builder exports atomic
2026-01-31 17:59:31 +08:00
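One common way to make an export atomic; whether the PR uses exactly this write-then-rename scheme is an assumption:

```python
import json
import os
import tempfile

def atomic_write_json(path: str, data: dict) -> None:
    # Write to a temp file in the same directory, then os.replace(),
    # which is atomic on both POSIX and Windows; readers never observe
    # a half-written export.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)  # clean up the temp file on any failure
        raise
```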
Hundao 86aef3319f fix(ci): apply ruff format to csv_tool.py (#2910) 2026-01-31 17:50:15 +08:00
Richard Tang 63d017fc21 fix: bash version support 2026-01-30 20:32:56 -08:00
RichardTang-Aden 0015b3d43d Merge pull request #1873 from DhruvPokhriyal/bugfix/csv_read-negative-offset
fix: validate non-negative limit and offset in csv_read function
2026-01-30 20:12:39 -08:00
RichardTang-Aden 9c4d44c057 Merge pull request #2205 from mishrapravin114/fix/1390-pdf-read-max-pages
pdf_read: surface truncation when exceeding max_pages
2026-01-30 20:12:01 -08:00
RichardTang-Aden 800c7fbe11 Merge pull request #2316 from NicklausFW/1843-csv-write-fail-without-parent-dir
fix(csv): handle csv_write with no parent directory
2026-01-30 20:11:47 -08:00
Timothy @aden 291ba24229 Merge pull request #2832 from adenhq/feat/node-conversation-class-WP-6-
nodeConversation Class
2026-01-30 19:01:30 -08:00
Timothy c52ce6bb49 Merge branch 'feature/event-loop-framework' into test/wp1-wp2-wp6-combined 2026-01-30 16:34:12 -08:00
RichardTang-Aden ffa4096390 Merge pull request #2601 from Hundao/feat/email-tool
[Integration] feat(tools): add email service tool with Resend provider
2026-01-30 16:32:16 -08:00
Timothy bcddd4ce77 Merge branch 'feature/credential-manager-aden-provider' into test/wp1-wp2-wp6-combined 2026-01-30 16:30:54 -08:00
Timothy 017872f71b feat: emit bus events 2026-01-30 16:27:39 -08:00
bryan f2b6fc6948 linter updates 2026-01-30 16:18:48 -08:00
bryan acff8a0ece nodeConversation Class 2026-01-30 16:16:34 -08:00
Richard Tang 347c222f78 fix: quickstart compatibility 2026-01-30 16:07:05 -08:00
lakshitaa bfb660275e feat(tools): Add GitHub tool for repository and issue management
Implements comprehensive GitHub REST API v3 integration with 15 MCP tools
for managing repositories, issues, pull requests, code search, and branches.

Features:
- Repository management (list, get, search repos)
- Issue operations (create, update, close, list issues)
- Pull request management (create, list, get PRs)
- Code search across GitHub
- Branch operations (list, get branch info)

Technical details:
- 15 MCP tools organized in 5 categories
- 38 comprehensive tests with mocking (all passing)
- Full credential store support (env var + CredentialStoreAdapter)
- Proper error handling (timeout, network, API errors)
- Follows HubSpot/Slack tool patterns exactly

Files:
- tools/src/aden_tools/tools/github_tool/github_tool.py (757 lines)
- tools/src/aden_tools/tools/github_tool/tests/test_github_tool.py (628 lines)
- tools/src/aden_tools/tools/github_tool/README.md (646 lines)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-31 04:32:46 +05:30
Timothy f58619e378 Merge branch 'main' into feat/email-tool 2026-01-30 14:00:54 -08:00
RichardTang-Aden 472cfe1437 Merge pull request #2815 from RichardTang-Aden/main
Docs: improving Q&A and hive features
2026-01-30 13:57:46 -08:00
Richard Tang 8b7efe27c1 docs: updated hive descriptions 2026-01-30 13:57:16 -08:00
Timothy eb00c10d9b Merge remote-tracking branch 'origin/main' into feat/email-tool 2026-01-30 13:56:15 -08:00
Richard Tang 71249f4f88 docs: updated q&a and why aden 2026-01-30 13:54:37 -08:00
Timothy 0beeda3eec fix: email tool to credential store 2026-01-30 13:54:01 -08:00
lakshitaa d6ae48bc58 Merge upstream/main 2026-01-31 03:19:12 +05:30
Timothy @aden dc4a40468b Merge pull request #2808 from TimothyZhang7/feature/credential-manager-aden-provider
Feature/credential manager aden provider
2026-01-30 13:32:45 -08:00
Timothy 7fa2295d30 fix: ruff format issue 2026-01-30 13:27:29 -08:00
Timothy 756f013ecd fix: mcp test case 2026-01-30 13:24:23 -08:00
Richard Tang a963d49306 docs: remove duplicated run agent command 2026-01-30 13:23:02 -08:00
Timothy 4b00852bdf Merge remote-tracking branch 'origin/main' into feature/credential-manager-aden-provider 2026-01-30 13:18:11 -08:00
RichardTang-Aden b9b1731dc1 Merge pull request #2807 from RichardTang-Aden/main
Docs: Update instruction for tools/integration contribution
2026-01-30 13:06:13 -08:00
Richard Tang 34791e6bbd docs: update issue links 2026-01-30 13:04:54 -08:00
Richard Tang d1ebdfc92f docs: tools contribution guide 2026-01-30 12:59:56 -08:00
austin931114 33040b7978 Merge pull request #1316 from Shivraj12/fix/tool-registry-invalid-json
fix(tool_registry): handle invalid JSON returned by tools
2026-01-30 21:43:32 +01:00
austin931114 3b6b6c48a5 Merge pull request #919 from Siddharth2624/chore/validation-error-message
docs: clarify illustrative output sanitization example
2026-01-30 21:32:39 +01:00
Timothy c3fddd3c8c fix: deprecate credential manager 2026-01-30 12:28:27 -08:00
Richard Tang 41e5558715 docs: update readme 2026-01-30 12:24:16 -08:00
austin931114 58969085bf Merge pull request #1816 from NicklausFW/1277-execution-quality-tracking
fix(executor): add execution quality tracking to expose retry metrics
2026-01-30 21:15:23 +01:00
austin931114 f45ad2d543 Merge pull request #1656 from hrshmakwana/fix/setup-creates-exports
fix(micro-fix): setup script now creates missing exports directory (#1645)
2026-01-30 21:01:08 +01:00
Timothy 7e670ce0a8 feat: event loop WP1-4 2026-01-30 11:43:19 -08:00
Fernando Mano 4310852ee6 chore: Merge branch 'main' into feat/observability-trace-context 2026-01-30 15:09:54 -03:00
mubarakar95 d32308b6d2 Add TUI enhancements: screenshot feature, header polish, and keybinding updates
- Implement SVG screenshot functionality (Ctrl+S)
- Remove header icon and disable expansion
- Hide borders in screenshots for clean output
- Change command palette to Ctrl+O
- Make screenshot shortcut work globally (priority binding)
2026-01-30 16:56:34 +05:30
hundao 0030d6b499 feat(tools): add cc/bcc support to email tool
Add optional cc and bcc parameters to send_email and
send_budget_alert_email. Empty strings and whitespace-only values are
filtered out via _normalize_recipients to prevent invalid payloads.
2026-01-30 18:35:42 +08:00
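A sketch of the filtering described above; the real _normalize_recipients may differ in signature and accepted types:

```python
def _normalize_recipients(raw: str | list[str] | None) -> list[str]:
    # Drop empty and whitespace-only entries so optional cc/bcc values
    # never produce invalid payload fields (sketch under assumed types).
    if raw is None:
        return []
    items = raw if isinstance(raw, list) else raw.split(",")
    return [item.strip() for item in items if item and item.strip()]
```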
hundao 5f019f44ca feat(tools): add email service tool with Resend provider
Integrate a mail service to enable email notifications for budget alerts.
Closes #7.

New tools:
- send_email: general-purpose email sending with multi-provider support
- send_budget_alert_email: formatted budget alert notifications with
  severity levels (INFO/WARNING/CRITICAL/EXCEEDED)

Architecture:
- Multi-provider pattern (matching web_search_tool), Resend as primary
- from_email resolved via explicit param or EMAIL_FROM env var
- Credential integration via CredentialManager with env var fallback

Also fixes: web_scrape_tool test mock missing content-type header
2026-01-30 18:35:42 +08:00
Hundao 0d602f92a3 fix(ci): add missing content-type header in scrape test mock and format mcp_client (#2612) 2026-01-30 18:33:37 +08:00
mubarakar95 604d16e353 Enhance TUI: Fix rendering, polish layout, and clean up header 2026-01-30 15:40:15 +05:30
mubarakar95 db577785d6 feat: Implement fully functional TUI dashboard
- Fix ScreenStackError crash by moving runtime init inside async context
- Implement selectable logging with TextArea widget
- Create interactive ChatREPL for agent execution
- Optimize 3-pane layout: logs/graph on left (60%), chat on right (40%)
- Add thread-safe event handling with call_from_thread
- Add TUI selection guide documentation
All features tested and working.
2026-01-30 11:49:04 +05:30
RichardTang-Aden b10d617166 Merge pull request #638 from Sourabsb/fix/run-async-is-running-check
fix: add is_running() and is_closed() checks to _run_async() to prevent deadlock
2026-01-29 21:17:58 -08:00
RichardTang-Aden 348c646bab Merge pull request #534 from Sourabsb/fix/mcp-client-resource-leak
fix: properly close MCP session and STDIO context managers in disconnect()
2026-01-29 21:06:30 -08:00
RichardTang-Aden a8243e6746 Merge pull request #274 from dithzz/fix/cmd-list-keyerror-steps
fix(cli): fix KeyError 'steps' in cmd_list function
2026-01-29 20:45:20 -08:00
RichardTang-Aden 9368828f94 Merge pull request #189 from RussellLuo/fix-mcp-servers-parsing
fix(skills): load MCP servers correctly
2026-01-29 18:27:24 -08:00
RichardTang-Aden 51e9a3ecdf Merge branch 'main' into fix-mcp-servers-parsing 2026-01-29 18:26:49 -08:00
Timothy 2f03605980 fix: change to production api endpoint 2026-01-29 17:53:47 -08:00
RichardTang-Aden 74e754b4e1 Merge pull request #2496 from RichardTang-Aden/main
Docs: Update Roadmap and Mermaid chart
2026-01-29 17:49:35 -08:00
RichardTang-Aden f332e40000 Merge pull request #2486 from adenhq/chore-docs-update
Docs: updating documentation
2026-01-29 17:49:22 -08:00
Richard Tang d6064147e4 chore: update mermaid chart type 2026-01-29 17:44:24 -08:00
bryan 1fb5005bf5 removing .env.example from tools 2026-01-29 17:43:24 -08:00
bryan 57fbb0479b remove env example 2026-01-29 17:37:32 -08:00
RichardTang-Aden 26154cc648 Merge pull request #2212 from MuzzaiyyanHussain/docs/i18n-hindi
docs(i18n): add Hindi (हिंदी) README translation
2026-01-29 17:23:34 -08:00
Richard Tang e207cee4ff feat: update mermaid chart 2026-01-29 17:07:54 -08:00
Richard Tang e7a2d957f5 chore: update roadmap to reflect recent direction calibration 2026-01-29 16:48:31 -08:00
bryan 7e5f02eebe updating documentation 2026-01-29 16:12:16 -08:00
Timothy 248716c093 feat: credential store auto sync 2026-01-29 14:37:20 -08:00
Muzzaiyyan Hussain 37a3fce27d Translated the last line to Hindi 2026-01-30 01:41:13 +05:30
mubarakar95 c9ae3a0541 feat: finalize TUI with minimal stable mode
The TUI feature is fully functional with a minimal, stable interface:
- Header with title
- Central display area
- Footer with keybindings

This provides a foundation for future enhancements. Custom widgets
(LogPane, GraphOverview, ChatRepl) are available in the codebase
and can be integrated once Textual rendering issues on Windows are
resolved.

The --tui flag successfully launches the TUI dashboard for any agent:
  hive run <agent_path> --tui

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-30 01:31:59 +05:30
mubarakar95 ed95dab9f3 fix: implement lazy widget loading and store references
Defer widget creation from __init__ to compose() and store references
as instance variables to prevent garbage collection and initialization
order issues. This resolves ScreenStackError during TUI startup.

Changes:
- Move LogPane, GraphOverview, ChatRepl creation to compose()
- Store widgets as instance variables (self.log_pane, etc)
- Restore Horizontal/Vertical container layout

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-30 01:27:56 +05:30
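The lazy pattern both fixes describe, sketched against Textual's compose() API; the Static widgets stand in for the repo's LogPane, GraphOverview, and ChatRepl:

```python
from textual.app import App, ComposeResult
from textual.containers import Horizontal, Vertical
from textual.widgets import Static

class DashboardSketch(App):
    def compose(self) -> ComposeResult:
        # Create widgets here rather than in __init__, and keep them on
        # self so later handlers can reach them; deferring creation to
        # compose() avoids the initialization-order ScreenStackError
        # mentioned above.
        self.log_pane = Static("logs")
        self.graph_pane = Static("graph")
        self.chat_pane = Static("chat")
        with Horizontal():
            with Vertical():
                yield self.log_pane
                yield self.graph_pane
            yield self.chat_pane
```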
mubarakar95 a6536cef94 fix: restore Horizontal/Vertical layout containers in TUI compose
The compose() method was missing the proper container structure
for the layout, which caused initialization to fail. Restored the
Horizontal/Vertical container layout with proper pane structure.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-30 01:22:41 +05:30
mubarakar95 3ccc81e81c feat: add interactive TUI dashboard for agent execution
Implement a Textual-based terminal UI for the hive framework that displays:
- Agent execution status and progress
- Log output in real-time
- Graph visualization of agent execution flow
- Interactive REPL for user input/feedback

Changes:
- Add core/framework/tui/ module with AdenTUI app and custom widgets
- Add LogPane widget for streaming log output
- Add GraphOverview widget for execution graph visualization
- Add ChatRepl widget for interactive user input
- Add TUILogHandler for capturing framework logs to TUI
- Update cli.py to support --tui flag for launching dashboard
- Update runner.py to enable AgentRuntime when TUI is active
- Fix missing Textual container imports (Horizontal, Vertical, Container)
- Fix race conditions in async TUI initialization
- Fix threading issues in app event handling

The TUI is launched via: hive run <agent_path> --tui

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-30 01:19:28 +05:30
Muzzaiyyan Hussain 7976c1dac7 linked the translated Hindi version hi.md into the main readme 2026-01-30 00:30:26 +05:30
Aden HQ da2bac1b48 Merge pull request #2414 from RichardTang-Aden/main
fix: litellm missing from tools dependencies; quickstart.sh only vali…
2026-01-29 10:30:23 -08:00
Richard Tang 4096eba564 fix: litellm missing from tools dependencies; quickstart.sh only validates tools venv 2026-01-29 10:20:19 -08:00
RichardTang-Aden 3f3a23e4b2 Merge pull request #2314 from Sourabsb/fix/remove-debug-print-statements-v2
micro-fix: remove debug print statements that leak API key
2026-01-29 07:56:45 -08:00
RichardTang-Aden 934e3145b8 Merge pull request #2348 from JVSCHANDRADITHYA/main
(Micro-Fix) CLI crash when exports/ directory is missing
2026-01-29 07:56:18 -08:00
RichardTang-Aden 6155ccbf4d Merge pull request #2037 from shivamhwp/conductorchicago
Feat(quick-start): Address PR #716 review feedback: MCP config, Python version, venv docs
2026-01-29 07:45:36 -08:00
Chandradithya Janaswami 6cadc81be8 Merge branch 'adenhq:main' into main 2026-01-29 21:12:00 +05:30
RichardTang-Aden 412521edb0 Merge branch 'main' into conductorchicago 2026-01-29 07:40:53 -08:00
Nicklaus Wibowo ec3be40ddd fix(csv): handle csv_write with no parent directory
Guard against empty parent_dir when path has no directory component (e.g., 'data.csv'). Prevents FileNotFoundError from os.makedirs(''). Adds test coverage for root-level file writes.
2026-01-29 20:44:27 +07:00
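The guard described above, in a few lines of Python (the helper name is hypothetical):

```python
import os

def ensure_parent_dir(path: str) -> None:
    # os.makedirs("") raises FileNotFoundError, so only create the parent
    # when the path has a directory component ("data.csv" has none).
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
```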
Sourabsb fd00471189 fix: remove debug print statements that leak API key to stdout 2026-01-29 18:57:33 +05:30
krish341360 94197cbcb9 fix: race condition in ConcurrentStorage and cache invalidation bug
- Fix race condition: cache now updates only after successful write
- Fix cache invalidation: summary cache invalidated on save_run()
- Add 4 tests to verify the fixes
2026-01-29 16:35:38 +05:30
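Sketch of the ordering fix (the class shape is assumed): the cache mutates only after the backend write succeeds, and the derived summary is invalidated on save_run().

```python
import threading

class ConcurrentStorageSketch:
    def __init__(self, backend):
        self._backend = backend
        self._cache: dict[str, object] = {}
        self._summary = None
        self._lock = threading.Lock()

    def save_run(self, key: str, value: object) -> None:
        with self._lock:
            self._backend.write(key, value)  # may raise: cache stays clean
            self._cache[key] = value         # update only after success
            self._summary = None             # invalidate cached summary
```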
Muzzaiyyan Hussain 65c3fcf76d docs(i18n): add Hindi (हिंदी) README translation 2026-01-29 14:23:40 +05:30
mishrapravin114 83f77af2ab pdf_read: surface truncation when exceeding max_pages 2026-01-29 13:38:16 +05:30
suhanijindal 2fe83187d6 fix: add Python 3.13 classifier to tools/pyproject.toml (#1780)
Co-authored-by: United IT Services <uniteditservices@Uniteds-MacBook-Air.local>
2026-01-29 15:57:19 +08:00
Ayush Pandey e65052c237 fix(core): explicitly set utf-8 encoding for storage and testing backends (#641) 2026-01-29 14:07:37 +08:00
Mrunal Nirajkumar Shah 38bc7c12ae fix(setup): Fixes python and pip version detection and mismatch (#1190)
* Fixed Python and pip version mismatch with robust code #476

- Ensured python version is found across all available python interpreters including python3, python, and py -3, and made it robust for easy interpreter add-on.
- Ensured that pip is found for the respective python interpreter.
- Generalized some variables like PYTHON_VERSION for flexibility.
- Added a split to PYTHON_VERSION into Major and Minor version to create a robust code.
- Added clear documentation throughout the code.

Related to issue #476

* fix(setup): Code fixes raised during review by @Hundao

- PYTHON_CMD initialized to no value (blank). Fixes the bug
- PYTHON_VERSION used to generalize is changed to REQUIRED_PYTHON_VERSION due to name collision
- quotes added to "${POSSIBLE_PYTHONS[@]}" so py -3 can work.

Pending:
eval related issues pending.

* fix(setup): Code fixes raised during review by @Hundao

- eval removed altogether.
- py -3 is replaced with py in POSSIBLE_PYTHONS, and will be replaced to py -3 after the interpreter selection.

* fix(setup): Code fixes raised during review by @bryanadenhq

- Implemented Array and refactored entire code. PYTHON_CMD is changed at all places in the entire code.
- Redundant code is removed, design changed a bit for user understanding. (See Screenshots)
- Using 2>&1 as standard. Fixes the mismatch in standard code writing.
2026-01-29 14:06:54 +08:00
Sourabsb 758c5157b8 Merge upstream/main and resolve conflicts 2026-01-29 10:17:44 +05:30
Sourabsb ce6b47c0d4 fix: resolve all lint issues in mcp_client.py 2026-01-29 10:10:11 +05:30
Shivam Sharma 22c95b62ce quickstart: auto-install uv and pick Python >=3.11 2026-01-29 08:49:02 +05:30
Shivam Sharma 9684311176 Merge upstream/main 2026-01-29 08:48:43 +05:30
Timothy aa0fff8ac5 fix: use credential store by default 2026-01-28 18:51:20 -08:00
RichardTang-Aden a1229d8e98 Merge pull request #1945 from Anshu-bhatt/feature/add-python-version-file
micro-fix: add .python-version for automatic Python version detection
2026-01-28 18:08:51 -08:00
RichardTang-Aden ad1b10db63 Merge pull request #1602 from magiawala/fix/workflowbuilder-import-error
[micro-fix] Correct WorkflowBuilder import to GraphBuilder in MCP example
2026-01-28 18:04:09 -08:00
RichardTang-Aden 96308637d6 Merge pull request #1745 from Aman030304/fix/runner-logging
Refactor: Replace print with logging in AgentRunner
2026-01-28 18:00:28 -08:00
Timothy e8a4cc908c Merge branch 'feature/hubspot-integration' into feature/credential-manager-aden-provider 2026-01-28 17:51:37 -08:00
Timothy 3c8ac436bd fix: onboarding experience 2026-01-28 17:49:13 -08:00
Shivam Sharma 4d341611a4 Merge upstream/main and resolve setup-python.sh conflict
Resolved conflict in scripts/setup-python.sh by keeping upstream's
improved formatting with color codes and ${PYTHON_CMD} variable.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 06:35:48 +05:30
Bryan @ Aden ef94bfe1fb Merge pull request #2042 from adenhq/fix/ruff-tests
(micro-fix): fix lint
2026-01-28 16:44:11 -08:00
bryan a58b52f420 fix lint 2026-01-28 16:39:40 -08:00
Bryan @ Aden 7852990073 Merge pull request #519 from vakrahul/perf/heuristic-json-repair
style: fix line length violation in output_cleaner.py
2026-01-28 16:30:01 -08:00
Bryan @ Aden 14c9478080 Merge pull request #2040 from adenhq/fix/ruff-tests
(micro-fix): ruff fix
2026-01-28 16:21:58 -08:00
bryan c5ebd91651 ruff fix 2026-01-28 16:19:49 -08:00
Bryan @ Aden 088f3cc817 Merge pull request #1444 from tjsasakifln/feat/1334-root-cli-entry-point
feat(cli): add root hive CLI entry point to eliminate PYTHONPATH
2026-01-28 16:18:14 -08:00
Bryan @ Aden 50087bb24c Merge pull request #1366 from tjsasakifln/fix/1332-rewrite-configuration-docs
docs(configuration): rewrite configuration.md to reflect actual Python framework architecture
2026-01-28 16:17:14 -08:00
Timothy @aden ca06465305 Merge branch 'main' into feature/hubspot-integration 2026-01-28 16:06:33 -08:00
Timothy ea719d5441 Merge branch 'main' into feature/credential-manager-aden-provider 2026-01-28 16:03:25 -08:00
Timothy 2627b6e69c fix: aden client 2026-01-28 16:03:10 -08:00
RichardTang-Aden c869e1955a Merge pull request #1934 from mishrapravin114/fix/auto-close-circular-duplicate
Fix/auto close circular duplicate
2026-01-28 15:44:45 -08:00
RichardTang-Aden 8293f75152 Merge pull request #295 from Invens/fix/setup-python-detect-311
Fix/setup python detect 311
2026-01-28 15:35:08 -08:00
Shivam Sharma 3ccf4bc383 Address PR review feedback
- Restore MCP server configurations in .mcp.json with updated paths
  for separate virtual environments (core/.venv and tools/.venv)
- Align Python version: change .python-version from 3.13 to 3.11
  to match pyproject.toml target version
- Remove AGENTS.md as suggested (quickstart is sufficient)
- Document cross-package imports and separate venv architecture
  in ENVIRONMENT_SETUP.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 04:26:12 +05:30
Shivam Sharma e71d850b79 Merge remote-tracking branch 'origin/main' into conductorchicago
Resolved conflict in tools/pyproject.toml by keeping the expanded format
with sql dependency from main.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 04:19:16 +05:30
Timothy @aden 774911b46c Merge pull request #2012 from TimothyZhang7/feature/credential-manager-aden-provider
chore: aden provider factory method
2026-01-28 14:22:45 -08:00
Timothy 480ade22ce chore: aden provider factory method 2026-01-28 14:04:08 -08:00
RichardTang-Aden bd31323876 Merge pull request #1999 from Jai-Harrish/docs/remove-stray-orchestration-text
docs: remove stray 'orchestration' text from project structure
2026-01-28 13:54:33 -08:00
RichardTang-Aden 2f3b8b27b8 Merge pull request #1973 from ryanbijoy/fix/docs-link
Docs: Fixed the .claude/ url to open the right file
2026-01-28 13:54:08 -08:00
RichardTang-Aden d39abf4312 Merge pull request #1909 from ayushigithub12/fix/csv-read-total-row1850
Fix/csv read total row1850
2026-01-28 13:52:07 -08:00
RichardTang-Aden ec7058414f Merge pull request #1957 from ryanbijoy/fix/readme-links
Docs: Fixed the link to the correct URL - docs/architecture/README.md
2026-01-28 13:50:28 -08:00
RichardTang-Aden 8dc63771ca Merge pull request #2003 from adenhq/chore-add-micro-fix-requirements
chore: add-micro-fix-requirements
2026-01-28 13:39:02 -08:00
Richard Tang 434f1d7298 chore: add-micro-fix-requirements 2026-01-28 13:33:29 -08:00
ryanbijoy ee0ae20d06 Merge branch 'main' into fix/readme-links 2026-01-29 02:49:22 +05:30
RichardTang-Aden a7e16c84a5 Merge pull request #1839 from SH-Nihil-Mukkesh-25/micro-fix/roadmap-header
micro-fix(docs): add markdown header to ROADMAP
2026-01-28 13:16:39 -08:00
Jai Harrish A eaa54d9d4a docs: remove stray 'orchestration' text from project structure 2026-01-28 21:14:15 +00:00
RichardTang-Aden 2c4d034536 Merge pull request #1896 from aarav-shukla07/chore/remove-honeycomb-references
chore: remove references to archived honeycomb frontend
2026-01-28 12:43:13 -08:00
RichardTang-Aden a43b7c9403 Merge pull request #1981 from adenhq/docs-i18n-readme
chore: re-organize readmes
2026-01-28 12:32:07 -08:00
Richard Tang 752979da01 chore: re-organize readmes 2026-01-28 12:29:27 -08:00
Timothy @aden c4be938b7f Merge pull request #1960 from TimothyZhang7/feature/credential-manager-aden-provider
feature: aden sync provider for credential store
2026-01-28 12:27:51 -08:00
Timothy 3a308ba67e fix: load aden provider api key from default env var 2026-01-28 12:23:35 -08:00
ryanbijoy cadf401f23 micro-fix: Fixed the .claude/ url to open the right file 2026-01-29 01:40:50 +05:30
ryanbijoy 24dd41410a micro-fix: Fixed the link to the correct URL - docs/architecture/README.md 2026-01-29 01:25:37 +05:30
Timothy 2abf43ed21 feature: aden sync provider for credential store 2026-01-28 11:54:14 -08:00
Fernando Mano 853f1e9873 chore: Merge remote-tracking branch 'refs/remotes/origin/feat/observability-trace-context' into feat/observability-trace-context 2026-01-28 16:52:38 -03:00
Anshu-bhatt 2e5ed77909 micro-fix: touch .python-version to reopen PR 2026-01-29 00:47:20 +05:30
Anshu-bhatt 0ae0bfda83 chore(dx): add .python-version for automatic Python version detection 2026-01-29 00:36:09 +05:30
mishrapravin114 22007e7aa9 chore: remove cross-verify doc from PR 2026-01-29 00:26:39 +05:30
mishrapravin114 05dde7414f fix(workflow): prevent circular duplicate closure in auto-close script
- Skip closing an issue as duplicate of another that is already closed
  (avoids circular closure when bot and human close in opposite order).
- Skip when duplicate target is self (same issue number).
- Extract testable helpers: isDupeComment, isDupeCommentOldEnough,
  authorDisagreedWithDupe, getLastDupeComment, decideAutoClose.
- Add 23 unit tests (Bun) and run them in CI before auto-close step.
- Add scripts/AUTO_CLOSE_DUPLICATES_CROSS_VERIFY.md for impact summary.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-01-29 00:22:23 +05:30
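For context, a minimal Python paraphrase of the decision rule described above (the real script is JavaScript with Bun tests; helper names come from the commit message, the exact logic is an assumption):

```python
# Hypothetical paraphrase of decideAutoClose: close an issue as a duplicate
# only when none of the circular/self/disagreement conditions apply.
def decide_auto_close(issue_number: int, dupe_target: int | None,
                      target_is_closed: bool, author_disagreed: bool) -> bool:
    if dupe_target is None:
        return False
    if dupe_target == issue_number:   # duplicate target is self: skip
        return False
    if target_is_closed:              # avoid circular bot/human closure
        return False
    if author_disagreed:              # author pushed back on the dupe call
        return False
    return True
```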
ayushi sharma 721cfb1ac8 Fixed csv_read reporting an incorrect total_rows for CSV files 2026-01-29 00:03:39 +05:30
ayushi sharma 5973168a8c Fixed csv_read reporting an incorrect total_rows for CSV files 2026-01-28 23:58:26 +05:30
Tiago Sasaki 56ed24a092 feat(cli): add root hive CLI entry point to eliminate PYTHONPATH requirement
Fixes #1334
2026-01-28 15:22:25 -03:00
Tiago Sasaki ca031f3ee1 docs(configuration): rewrite to reflect actual Python framework architecture
Fixes #1332
2026-01-28 15:17:59 -03:00
trinh31201 3ee6d98905 fix(graph): validate LLM JSON extraction to prevent empty/fabricated data 2026-01-29 01:04:08 +07:00
Fernando Mano ae5fe84fb2 feat(observability): Structured logging with automatic trace context propagation -- fix ruff formatting errors 2026-01-28 15:04:06 -03:00
Fernando Mano 92b538d5ae Merge branch 'adenhq:main' into feat/observability-trace-context 2026-01-28 14:52:37 -03:00
Fernando Mano 5351703949 feat(observability): Structured logging with automatic trace context propagation -- fix lint error 2026-01-28 14:52:02 -03:00
Bryan @ Aden c9f3de1af6 Merge pull request #1752 from ebrahimzaher/docs/add-alpine-support
docs: add setup instructions for Alpine Linux users
2026-01-28 09:29:12 -08:00
Bryan @ Aden d8d4b9399e Merge pull request #1871 from adenhq/fix/ruff-tests
(micro-fix): fixing linter
2026-01-28 09:28:44 -08:00
bryan 30bf1da424 fixing linter 2026-01-28 09:26:27 -08:00
Bryan @ Aden 6712fa9a8a Merge pull request #1855 from lachhmansingh16/fix/hive-list-crash
fix(cli): micro-fix to prevent hive list crash
2026-01-28 09:21:15 -08:00
Bryan @ Aden 2306b13fdc Merge pull request #1817 from Vikasverma9515/fix/tools-dependency-tests
fix(tools): handle optional duckdb dependency and update credential tests
2026-01-28 09:21:02 -08:00
Timothy @aden 14907a7c6e Merge pull request #1836 from kunnaaalll/chore/harden-triage-bot
chore(ci): harden triage bot against low-quality AI spam
2026-01-28 09:17:20 -08:00
DhruvPokhriyal 967cbf814b fix: validate non-negative limit and offset in csv_read function 2026-01-28 22:46:56 +05:30
Vikas Verma 0dfec38b4b fix: remove duplicate import in test_csv_tool.py 2026-01-28 22:39:18 +05:30
Lachhman Singh 9ad4702c08 style: fix indentation alignment 2026-01-28 22:07:01 +05:00
Kunal Parmar ec89bf3622 chore(ci): harden triage bot against low-quality AI spam 2026-01-28 22:32:40 +05:30
Lachhman Singh ab7c924b9a style: fix indentation and improve status message 2026-01-28 21:56:46 +05:00
Timothy @aden 0c2a2f31f6 Merge pull request #1858 from TimothyZhang7/main
fix: auto closing bot
2026-01-28 08:53:06 -08:00
RichardTang-Aden 2b52ed6397 Merge pull request #1807 from mgaldon17/feature/plan-failed-dependency-resolution
feat(plan): Implemented a resolution for the failed dependency
2026-01-28 08:51:59 -08:00
Timothy 1b2befaae9 fix: auto closing bot 2026-01-28 08:50:57 -08:00
Lachhman Singh bca56f8ff6 fix(cli): prevent crash when exports dir missing 2026-01-28 21:35:40 +05:00
Timothy @aden e9f7f75c34 Merge pull request #1844 from TimothyZhang7/feature/cursor-support
fix: usage guide
2026-01-28 08:18:48 -08:00
Timothy 69cd9ab9f5 fix: usage guide
2026-01-28 08:16:05 -08:00
Timothy @aden fa1bba3320 Merge pull request #1837 from TimothyZhang7/feature/cursor-support
feature: cursor-aligned agent skills
2026-01-28 08:02:06 -08:00
Nihil 9b23668136 micro-fix(docs): add markdown header to ROADMAP 2026-01-28 21:27:35 +05:30
Timothy bf347d5e78 feature: cursor-aligned agent skills 2026-01-28 07:56:11 -08:00
Fernando Mano 7ba8169444 feat(observability): Structured logging with automatic trace context propagation -- remove colored logs for some cases when in prod mode 2026-01-28 12:46:54 -03:00
RichardTang-Aden 3cfc88c4d6 Merge pull request #1720 from ryanbijoy/fix/gitingore-issue
micro-fix/gitignore issue
2026-01-28 07:36:57 -08:00
Bryan @ Aden 031b20574c Merge pull request #1509 from brilliantkid87/test/storage-module-coverage
Test/storage module coverage
2026-01-28 07:36:39 -08:00
RichardTang-Aden f37448e602 Merge pull request #584 from AbdulTaufeeq01/fix/web-scrape-relative-urls
fix(web-scrape): convert relative URLs to absolute URLs using urljoin
2026-01-28 07:36:16 -08:00
Fernando Mano d090c954ae feat(observability): Structured logging with automatic trace context propagation -- adjust all logs to print full uuids when in prod mode and include documentation 2026-01-28 12:31:11 -03:00
Vikas Verma ab5b1a254f Merge branch 'main' into fix/tools-dependency-tests 2026-01-28 20:53:33 +05:30
Fernando Mano 9bee1666f1 chore: Merge branch 'main' into feat/observability-trace-context 2026-01-28 11:35:13 -03:00
Fernando Mano fb94637339 feat(observability): Structured logging with automatic trace context propagation 2026-01-28 11:27:24 -03:00
Nicklaus Wibowo 5d8996fe54 fix(executor): add execution quality tracking to ExecutionResult
Track retries, failed nodes, and execution quality (clean/degraded/failed) to expose retry metrics in ExecutionResult. This allows dashboards and monitoring to distinguish between clean success and degraded success with retries.
2026-01-28 21:26:45 +07:00
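A sketch of how the clean/degraded/failed distinction could be derived (field names are assumptions based on the commit message, not the actual ExecutionResult API):

```python
# Derive execution quality from retry and failure counts.
from dataclasses import dataclass, field

@dataclass
class ExecutionResult:
    retries: int = 0
    failed_nodes: list[str] = field(default_factory=list)

    @property
    def quality(self) -> str:
        if self.failed_nodes:
            return "failed"
        if self.retries > 0:
            return "degraded"   # succeeded, but only after retries
        return "clean"
```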
Manuel Galdon 6b30c2e8e7 feat(plan): Implemented a resolution for the failed dependency 2026-01-28 15:17:15 +01:00
brilliantkid87 1298d4b379 Merge branch 'test/storage-module-coverage' of https://github.com/brilliantkid87/hive into test/storage-module-coverage 2026-01-28 21:13:24 +07:00
Brilliantkid ac3aaa9348 fix: linter error 2026-01-28 21:13:17 +07:00
austin931114 bedc0eadf3 Merge pull request #1773 from dekalouis/fix/llm-decide-edge-condition-loader
fix(runner): add missing llm_decide edge condition mapping
2026-01-28 14:46:39 +01:00
dekalouis fe352ea54e fix(runner): add missing llm_decide edge condition mapping
The condition_map in _load_from_dict was missing the llm_decide
mapping, causing goal-aware routing to break for exported agents.

When agents with LLM_DECIDE edges were exported and re-imported,
the edge conditions were silently defaulted to ON_SUCCESS instead
of preserving the LLM_DECIDE routing logic.

This fix ensures that agents exported with goal-aware routing edges
maintain their correct behavior after re-import.
2026-01-28 20:42:42 +07:00
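A minimal sketch of the missing mapping; the enum and loader names are stand-ins inferred from the commit message, not the framework's exact API:

```python
from enum import Enum

class EdgeCondition(Enum):      # stand-in for the framework's enum
    ON_SUCCESS = "on_success"
    ON_FAILURE = "on_failure"
    LLM_DECIDE = "llm_decide"

# The fix: _load_from_dict's condition_map must include llm_decide, otherwise
# exported agents silently fall back to ON_SUCCESS on re-import.
condition_map = {
    "on_success": EdgeCondition.ON_SUCCESS,
    "on_failure": EdgeCondition.ON_FAILURE,
    "llm_decide": EdgeCondition.LLM_DECIDE,
}
```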
austin931114 7c990dd90a Merge pull request #1679 from MuzzaiyyanHussain/test-graph-executor-coverage
test: add pytest coverage for core graph executor success and failure paths
2026-01-28 14:24:03 +01:00
austin931114 f93111c319 Merge pull request #1766 from sashankthapa/docs/fix-claude-agent-skills-and-example
docs: fix Claude agent skills structure and workflow examples
2026-01-28 13:06:06 +01:00
kozuedoingregression d4b2c82d54 docs: fix Claude agent skills structure and workflow examples 2026-01-28 17:15:56 +05:30
Aman 169827636f Refactor: Replace print with logging in AgentRunner 2026-01-28 16:22:01 +05:30
Siddharth Varshney a96cd546c8 Merge branch 'main' into fix-malformed-json-tool-args 2026-01-28 15:35:33 +05:30
Siddharth Varshney eb33d4f1c2 Remove duplicate malformed JSON tool-call test 2026-01-28 09:57:27 +00:00
ryanbijoy b6ef35fe55 micro-fix: Removed the NULL boxes and Renamed it to 2026-01-28 15:19:52 +05:30
Siddharth Varshney 4253956326 Handle malformed JSON tool arguments safely 2026-01-28 09:49:17 +00:00
ryanbijoy 6fb84b6889 gitignore changes 2026-01-28 15:06:19 +05:30
Muzzaiyyan Hussain 6e94402a8d Merge branch 'main' into test-graph-executor-coverage 2026-01-28 13:45:32 +05:30
Muzzaiyyan Hussain d68b822687 chore: apply automated lint fixes 2026-01-28 13:03:51 +05:30
Harsh Makwana 64299e959a fix: setup script now creates missing exports directory 2026-01-28 12:41:20 +05:30
Aarav Shukla d14d23b010 chore: remove references to archived honeycomb frontend 2026-01-28 12:26:45 +05:30
JVSCHANDRADITHYA 30f1c700ce Changed _select_agent to lazily create the exports directory 2026-01-28 06:27:50 +00:00
Abdul Taufeeq M ccae478347 Merge branch 'main' into fix/web-scrape-relative-urls 2026-01-28 11:06:27 +05:30
Abdul Taufeeq M 3a2639f565 fix: web scrape tool improvements with content-type validation and max_length simplification
- Add Content-Type validation to skip non-HTML content
- Simplify max_length validation using max() and min()
- Improve title extraction with cleaner code
2026-01-28 10:52:10 +05:30
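An illustrative sketch of the two changes (clamp bounds and the scrape signature are assumptions, not the tool's real values):

```python
# Clamp max_length with max()/min() and skip non-HTML responses by
# Content-Type, as the commit describes.
import requests

MIN_LEN, MAX_LEN = 100, 100_000

def scrape(url: str, max_length: int = 5_000) -> str | None:
    max_length = max(MIN_LEN, min(max_length, MAX_LEN))  # one-line clamp
    resp = requests.get(url, timeout=30)
    if "text/html" not in resp.headers.get("Content-Type", ""):
        return None                      # skip binary/JSON/etc. content
    return resp.text[:max_length]
```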
Patrick e241ec3341 Merge pull request #179 from PatrickChen928/main
feat: add nullable_output_keys, fix: #178
2026-01-28 13:16:39 +08:00
Timothy bc6f70933b feat: hubspot integration and advanced scraper 2026-01-27 20:50:17 -08:00
Devanshu Magiawala bc070c3e39 fix: correct WorkflowBuilder import to GraphBuilder in MCP example
The MCP integration example referenced WorkflowBuilder which doesn't exist.
Changed to GraphBuilder which is the correct class name.
Fixes import error when running: python core/examples/mcp_integration_example.py
2026-01-27 20:46:00 -08:00
Bryan @ Aden f30f42a4d3 Merge pull request #1577 from adenhq/fix/ruff-tests1
fixed ruff format --check
2026-01-27 20:12:03 -08:00
bryan e4c95c7a91 fixed ruff format --check 2026-01-27 20:09:31 -08:00
Bryan @ Aden bfb1a81b7a Merge pull request #944 from adionit7/docs/remove-docker-compose-references
docs: remove outdated Docker Compose references
2026-01-27 19:45:52 -08:00
Brilliantkid 257e36615a fix: linter error 2026-01-28 10:39:37 +07:00
Bryan @ Aden 2a049df099 Merge pull request #1540 from jaffarkeikei/fix/list-dir-isdir-check
fix(list_dir): add isdir check before listing
2026-01-27 19:17:25 -08:00
vakrahul 2194301260 Merge branch 'main' into perf/heuristic-json-repair 2026-01-28 08:00:38 +05:30
Bryan @ Aden 095dd05b17 Merge pull request #1173 from JohnnyWalker010/fix/json-validation-error-handling
Fix/json validation error handling
2026-01-27 18:15:04 -08:00
Aden HQ 6d03934452 Merge pull request #1535 from RichardTang-Aden/main
docs: chore for calling claude skills
2026-01-27 17:22:22 -08:00
RichardTang-Aden 5051f44543 Merge branch 'adenhq:main' into main 2026-01-27 17:14:10 -08:00
Richard Tang 9d98f9f678 docs: update the claude code skill instruction 2026-01-27 17:13:38 -08:00
Timothy @aden 9e0c24cd3a Merge pull request #1532 from TimothyZhang7/main
chore: fix lint issues
2026-01-27 17:01:05 -08:00
Timothy b66eec1e66 chore: fix lint issues 2026-01-27 16:58:06 -08:00
Timothy @aden aca66d60ed Merge pull request #1530 from adenhq/staging
Staging
2026-01-27 16:52:10 -08:00
RichardTang-Aden 8316e7c0e9 Merge pull request #1523 from adenhq/chore--ruff-fix
micro-fix: fix ruff and exclude docs
2026-01-27 16:41:44 -08:00
Emmanuel Nwanguma 3bbecad044 config: add .gitattributes for cross-platform line ending consistency (#951)
* config: add .gitattributes for cross-platform line ending consistency

- Add comprehensive .gitattributes to normalize line endings
- Ensure shell scripts always use LF (required for Unix execution)
- Mark binary files explicitly to prevent corruption
- Eliminate CRLF warnings for Windows contributors
- Follow cross-platform best practices

This fixes persistent 'LF will be replaced by CRLF' warnings that
confuse Windows contributors during normal git operations.

Fixes #950

* fix: add trailing newline at end of file

Per review feedback from @Hundao
2026-01-28 08:41:11 +08:00
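Representative .gitattributes entries for the normalization described above (illustrative, not the PR's exact file):

```
# Normalize line endings for all text files
* text=auto
# Shell scripts must keep LF for Unix execution
*.sh text eol=lf
# Mark binaries explicitly to prevent corruption
*.png binary
*.jpg binary
```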
RichardTang-Aden a8eb7127aa Merge branch 'main' into chore--ruff-fix 2026-01-27 16:39:53 -08:00
Richard Tang ba2889faf8 chore: allow excluding doc PRs 2026-01-27 16:26:21 -08:00
Richard Tang 1e6c5b8e11 fix: CI issues 2026-01-27 16:26:21 -08:00
Richard Tang 1199c02bfd feat: allow micro fixes to be passed as a PR 2026-01-27 16:26:21 -08:00
Bryan @ Aden 688451b2a9 Merge pull request #1521 from adenhq/feat--allow-Micro-fixes-to-excluded
feat: allow micro fixes to be passed as a PR
2026-01-27 16:13:33 -08:00
Richard Tang 9ef3628209 feat: allow micro fixes to be passed as a PR 2026-01-27 16:08:42 -08:00
Richard Tang 8695f3fea0 chore: fix ruff 2026-01-27 16:01:52 -08:00
brilliantkid87 88b094b5de Merge branch 'test/storage-module-coverage' of https://github.com/brilliantkid87/hive into test/storage-module-coverage 2026-01-28 04:34:54 +07:00
brilliantkid87 8b3b0c51f5 test(core): add test coverage for storage module Fixes #902 2026-01-28 04:34:33 +07:00
brilliantkid87 322ff7c470 test(core): add test coverage for storage module
Fixes #902
2026-01-28 04:30:44 +07:00
Timothy @aden ad968a0b54 Merge pull request #1458 from TimothyZhang7/release/v_0_3_0
DX Improvements: Linting, Formatting & Pre-Commit Hooks
2026-01-27 11:04:48 -08:00
Timothy 5d79a7078c fix: precommit hooks for different pyproject 2026-01-27 10:50:11 -08:00
Timothy e4f451e3f5 fix: lint issues with new enforcement 2026-01-27 10:45:49 -08:00
Timothy d8496c47f0 fix: linter 2026-01-27 10:19:23 -08:00
Timothy @aden 9c28284331 Merge pull request #1428 from TimothyZhang7/feature/parallel-fanout
feat: parallel execution framework
2026-01-27 10:17:07 -08:00
Timothy 075e9179c1 fix: retry logic broken by merge conflict 2026-01-27 10:11:54 -08:00
Timothy e61bdfc417 test(arch): fanout/fanin 2026-01-27 10:07:58 -08:00
Timothy @aden f6c5c5cadb Merge branch 'main' into feature/parallel-fanout 2026-01-27 10:04:54 -08:00
jaffar 8923011304 fix(list_dir): add isdir check before listing 2026-01-27 12:00:56 -05:00
Aman e6900647f8 ci: add windows runner to test workflow 2026-01-27 22:06:59 +05:30
Timothy @aden c441494c2f Merge pull request #1368 from adenhq/main
sync main to staging
2026-01-27 08:34:48 -08:00
Timothy @aden e1bea18357 Merge pull request #1113 from TanujaNair03/refactor/llm-judge-agnostic
refactor: provider-agnostic LLMJudge with auto-detection for OpenAI (#1103)
2026-01-27 08:31:50 -08:00
Timothy @aden 197f4f984a Merge pull request #1353 from Tahir-yamin/fix/concurrent-storage-file-locks-leak
fix(memory): patch ConcurrentStorage leak (WeakValueDictionary)
2026-01-27 08:23:05 -08:00
Tahir yamin 0381a5c87b Merge branch 'adenhq:main' into fix/concurrent-storage-file-locks-leak 2026-01-27 20:36:19 +05:00
Tahir Yamin 112b1baf2e fix(memory): patch ConcurrentStorage leak with WeakValueDictionary (Isolated Logic) 2026-01-27 20:28:22 +05:00
Shivraj12 c61c958964 fix(tool_registry): handle invalid JSON returned by tools 2026-01-27 20:23:36 +05:30
vrijmetse a59d6ac6db refactor(tools): add multi-provider support to web_search tool (#795)
* feat(tools): add Google Custom Search as alternative to Brave Search

Adds google_search tool using Google Custom Search API as an alternative
to the existing web_search tool (Brave Search).

Changes:
- Add google_search_tool with full implementation
- Register Google credentials (GOOGLE_API_KEY, GOOGLE_CSE_ID)
- Register tool in tools/__init__.py
- Add README with setup instructions

Closes #793

* test(tools): add unit tests for google_search tool

Adds 7 tests mirroring web_search_tool test patterns:
- Missing API key error handling
- Missing CSE ID error handling
- Empty query validation
- Long query validation
- num_results clamping
- Default parameters
- Custom language/country parameters

All tests pass.

* refactor(tools): add multi-provider support to web_search tool

BREAKING CHANGE: None - backward compatible. Brave remains default.

- Add Google Custom Search as alternative provider in web_search
- Add 'provider' parameter: 'auto' (default), 'google', 'brave'
- Auto mode tries Brave first for backward compatibility
- Remove separate google_search_tool (consolidated into web_search)
- Update tests to cover multi-provider functionality (13 tests)
- Update README documentation

Users with BRAVE_SEARCH_API_KEY: No changes needed
Users with GOOGLE_API_KEY + GOOGLE_CSE_ID: Can use provider='google'
Users with both: Brave preferred by default, use provider='google' to force

Closes #793

* feat(tools): fixed readme

---------

Co-authored-by: Mustafa Abdat <abdamus@hilti.com>
2026-01-27 22:46:41 +08:00
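A hedged sketch of the provider selection described above: 'auto' prefers Brave for backward compatibility and falls back to Google. The env-var names follow the commit text; the function itself is illustrative:

```python
import os

def pick_provider(provider: str = "auto") -> str:
    has_brave = bool(os.getenv("BRAVE_SEARCH_API_KEY"))
    has_google = bool(os.getenv("GOOGLE_API_KEY") and os.getenv("GOOGLE_CSE_ID"))
    if provider in ("brave", "google"):
        return provider                  # explicit override wins
    if has_brave:                        # auto: Brave first (back-compat)
        return "brave"
    if has_google:
        return "google"
    raise RuntimeError("no search credentials configured")
```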
Vikas Verma 37b9be3ff6 fix(tools): handle optional duckdb dependency and update credential tests 2026-01-27 20:00:45 +05:30
Hundao 9d39c09e27 Merge pull request #973 from AryanyAI/refactor/logging-mcp-scripts
refactor(mcp): replace print() with logging in setup scripts
2026-01-27 20:56:40 +08:00
root ff38962ff2 fix: remove duplicate content 2026-01-27 12:30:38 +00:00
root 121f33687a docs: add setup instructions for Alpine Linux users 2026-01-27 12:11:59 +00:00
Tanuu 598cc8b078 refactor: provider-agnostic LLMJudge with ruff styling fixes (#1103) 2026-01-27 14:24:57 +05:30
Tanuu 3605f3705b refactor: make LLMJudge provider-agnostic with OpenAI support (#1103) 2026-01-27 14:16:34 +05:30
AryanyAI 407816ddbf style: fix ruff quote style violations (Q000)
- Change single quotes to double quotes in logging formatters
- Fixes: setup_mcp.py, verify_mcp.py formatter strings
- Addresses Q000 linter errors from PR review
2026-01-27 13:54:20 +05:30
Hundao 6acdb65c1c Merge pull request #948 from TanujaNair03/refactor/provider-agnostic-prompts
Refactor/provider agnostic prompts
2026-01-27 14:18:59 +08:00
Hundao a4b0c66564 Merge pull request #558 from Hundao/feature/csv-tools
feat(tools): add CSV tools with DuckDB SQL support
2026-01-27 14:02:06 +08:00
Timothy @aden d1e6101a0f Merge pull request #1007 from TimothyZhang7/feature/credential-manager-stor
Feature/credential manager store
2026-01-26 21:30:58 -08:00
Timothy 330fbb19ac feature(credentials): credential store arch 2026-01-26 20:16:43 -08:00
Abdul Taufeeq M 8cc431ee52 fix: correct link validation to use absolute_href instead of href 2026-01-27 09:29:58 +05:30
Timothy 39831cf4b1 feat: parallel execution framework 2026-01-26 19:25:25 -08:00
Bryan @ Aden bc8cdfd6da Merge pull request #941 from vakrahul/fix/graph-retry-backoff
Fix/graph retry backoff
2026-01-26 19:20:35 -08:00
Tanuu 500876d65e style: add required trailing newline to prompts.py 2026-01-27 07:35:54 +05:30
Tanuu e59bb2d83f style: fix linting issues (whitespace and newline) 2026-01-27 07:29:48 +05:30
vakrahul 03910d531f Merge branch 'main' into fix/graph-retry-backoff 2026-01-27 07:28:22 +05:30
vakrahul a122345f9c fix(graph): restore node.max_retries and fix type check per review 2026-01-27 07:26:40 +05:30
Bryan @ Aden 6d025c808a Merge pull request #946 from not-anas-ali/fix/callable-type-annotations
fix(types): correct type annotation from lowercase 'callable' to 'Callable'
2026-01-26 17:52:00 -08:00
Bryan @ Aden 8525aec49c Merge pull request #934 from adionit7/fix/validate-exports-skip-when-empty
ci: make Validate Agent Exports skip clearly when exports/ is missing or empty
2026-01-26 17:48:44 -08:00
Tanuja Nair b0435a188f Merge branch 'adenhq:main' into refactor/provider-agnostic-prompts 2026-01-27 07:07:01 +05:30
Bryan @ Aden 3eb964eff2 Merge pull request #933 from adionit7/docs/fix-execute-command-tool-name-readme
docs(tools): fix tool name in README table (execute_command → execute_command_tool)
2026-01-26 17:36:24 -08:00
Bryan @ Aden ed88129b00 Merge pull request #927 from saboor2632/fix/worker-node-json-logging
fix(graph): add logging for JSON parsing failures in worker_node
2026-01-26 17:36:13 -08:00
vakrahul e1d8624483 Merge branch 'main' into perf/heuristic-json-repair 2026-01-27 07:03:07 +05:30
vakrahul 68264b54d9 style: fix linting issues in output_cleaner.py 2026-01-27 07:02:43 +05:30
adionit7 fc36a5e607 docs: remove outdated Docker Compose references
The repository does not include docker-compose files, but multiple docs
claimed "Docker Compose deployment out of the box." This was left over
from a previous release.

Changes:
- README.md: Update FAQ to describe Python package deployment
- README.ko.md: Same update for Korean translation
- docs/configuration.md: Remove "Docker Compose Integration" section
  and docker compose commands
- docs/quizzes: Update tasks that referenced docker-compose.yml
- .github/CODEOWNERS: Remove docker-compose*.yml entry
- scripts/setup.sh: Remove docker-compose.override.yml copy step

Fixes #923
2026-01-27 06:58:18 +05:30
vakrahul 1631d01dd2 merge: resolve conflicts in executor.pyx 2026-01-27 06:52:07 +05:30
Tanuu e846ad6ea7 refactor: implement provider-agnostic logic for test templates
Centralized _get_api_key in prompts.py to support OpenAI, Cerebras, and Groq via environment variables while maintaining Anthropic support through CredentialManager.
2026-01-27 06:39:55 +05:30
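A sketch of a centralized _get_api_key along the lines described: env vars for OpenAI, Cerebras, and Groq, with Anthropic staying on the CredentialManager. The env-var names and accessor are conventional assumptions, not verified against prompts.py:

```python
import os

ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "cerebras": "CEREBRAS_API_KEY",
    "groq": "GROQ_API_KEY",
}

def _get_api_key(provider: str, credential_manager=None) -> str | None:
    if provider == "anthropic" and credential_manager is not None:
        return credential_manager.get("anthropic")   # assumed accessor
    return os.getenv(ENV_KEYS.get(provider, ""))
```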
adionit7 e57cad7159 ci: make Validate Agent Exports skip clearly when exports/ is missing or empty
Previously, when exports/ was missing or empty, the bash glob
`exports/*/` would not match anything and the loop would silently
do nothing. The job would pass without actually validating anything,
which was misleading.

Now the job:
- Explicitly checks if exports/ directory exists
- Uses nullglob to handle empty directories properly
- Logs clear messages when skipping validation
- Reports the number of agents validated when successful

Fixes #887
2026-01-27 05:59:43 +05:30
adionit7 0cf9e39f6f docs(tools): fix tool name in README table (execute_command → execute_command_tool)
The "Available Tools" table listed `execute_command` but the actual
registered name is `execute_command_tool`. This aligns the docs with
the runtime name in __init__.py and the tool's own README.

Fixes #901
2026-01-27 05:58:59 +05:30
saboor2632 852332483a fix(graph): add logging for JSON parsing failures in worker_node 2026-01-27 05:10:34 +05:00
not-anas-ali 2b8604610c fix(types): correct type annotation from lowercase 'callable' to 'Callable'
Fixes #922
2026-01-27 05:06:27 +05:00
Siddharth Varshney d6b05bf337 Handle malformed JSON tool arguments in LiteLLMProvider 2026-01-26 23:27:32 +00:00
Yevhen Omelianenko b07aff1be3 Merge branch 'adenhq:main' into fix/json-validation-error-handling 2026-01-27 01:25:12 +02:00
yevhen_omelianenko f3df70e8fe fix: add consistent JSON validation error handling in agent_builder_server.py
Wrap json.loads() calls in try-catch blocks for add_node() and update_node()
  functions to match the error handling pattern used elsewhere in the file.

  Fixes #907
2026-01-27 01:13:42 +02:00
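The pattern being made consistent, in miniature (function name is illustrative):

```python
# Wrap json.loads() and surface a structured error instead of a traceback.
import json

def parse_node_payload(raw: str) -> tuple[dict | None, str | None]:
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON payload: {exc}"
```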
Bryan @ Aden 9230ac6c20 Merge pull request #871 from pradyten/feat/llm-judge-configurable-provider
feat(testing): add configurable LLM provider to LLMJudge
2026-01-26 14:53:12 -08:00
Bryan @ Aden 5cf25c6f10 Merge pull request #906 from adenhq/fix/ruff-tests
fixed linter
2026-01-26 14:49:45 -08:00
bryan d064c98998 fixed linter 2026-01-26 14:47:56 -08:00
Bryan @ Aden 25fabd8068 Merge pull request #576 from savankansagara1/fix/mock-mode-llm-provider
Fix: Add MockLLMProvider to enable mock mode execution
2026-01-26 14:41:13 -08:00
Bryan @ Aden 396e5c35a6 Merge pull request #528 from gaurav-code098/fix/web-scrape-content-type
fix(tools): validate Content-Type in web_scrape tool (Closes #487)
2026-01-26 14:34:37 -08:00
RichardTang-Aden 0a8c30c3da Merge pull request #788 from SoulSniper-V2/feat/add-deepseek-docs
docs(llm): add DeepSeek models support documentation and examples
2026-01-26 14:33:51 -08:00
Aden HQ 798f3cfd36 Merge pull request #349 from Himanshu-ABES/feat/pydantic-llm-validation
feat(validation): add Pydantic model validation for LLM outputs
2026-01-26 14:14:12 -08:00
pradyumn tendulkar 69ad0be5ff Merge branch 'main' into feat/llm-judge-configurable-provider 2026-01-26 17:06:30 -05:00
Himanshu Chauhan 60f2e674ec feat(validation): add Pydantic model validation for LLM outputs
- Add output_model field to NodeSpec for specifying Pydantic model
- Add max_validation_retries field (default: 2) for retry configuration
- Add validation_errors field to NodeResult for error tracking
- Implement validate_with_pydantic() in OutputValidator
- Implement format_validation_feedback() for LLM retry prompts
- Auto-generate JSON schema from Pydantic model for response_format
- Add retry loop that feeds validation errors back to LLM
- Add 28 comprehensive tests covering all new functionality
2026-01-26 14:06:29 -08:00
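A condensed sketch of the retry loop described above, using Pydantic v2's model_validate_json and an assumed call_llm() helper; field names follow the commit message rather than the actual NodeSpec:

```python
from pydantic import BaseModel, ValidationError

def validate_with_retries(call_llm, output_model: type[BaseModel],
                          prompt: str, max_validation_retries: int = 2):
    errors: list[str] = []
    for _ in range(max_validation_retries + 1):
        raw = call_llm(prompt)
        try:
            return output_model.model_validate_json(raw), errors
        except ValidationError as exc:
            errors.append(str(exc))
            # feed the validation errors back to the LLM for the retry
            prompt = f"{prompt}\n\nFix these validation errors:\n{exc}"
    return None, errors
```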
Siddharth Varshney 6bb256e277 docs: clarify illustrative examples in validation section 2026-01-26 21:47:00 +00:00
Bryan @ Aden 81ad85db5e Merge pull request #876 from adenhq/fix/ruff-tests
Fix/ruff tests
2026-01-26 13:41:15 -08:00
Timothy @aden ed25ef7562 Merge pull request #762 from vishalharkal15/fix/concurrent-storage-race-condition
Fix race condition in ConcurrentStorage.stop() causing data loss
2026-01-26 13:38:17 -08:00
bryan d9c696aa22 fixed all linter errors 2026-01-26 13:37:25 -08:00
bryan 22358a2d83 Merge branch 'main' into fix/ruff-tests 2026-01-26 13:37:12 -08:00
Timothy @aden 39a2a34380 Merge pull request #874 from TimothyZhang7/main
fix: git actions
2026-01-26 13:36:50 -08:00
Timothy @aden 07077dbb52 Merge branch 'adenhq:main' into main 2026-01-26 13:35:33 -08:00
Timothy e1346ae557 fix: include actual status check in pr requirements 2026-01-26 13:34:54 -08:00
Timothy 4f3d34d01e fix: consolidate dedupe and triage 2026-01-26 13:29:33 -08:00
pradyten 8516eba7c5 feat(testing): add configurable LLM provider to LLMJudge
Allow LLMJudge to accept any LLMProvider instance instead of being
hardcoded to use Anthropic. This aligns with the framework's pluggable
LLM design and enables users to:

- Use the same LLM provider across their agent and tests
- Run tests with cheaper or local models
- Avoid requiring an Anthropic API key for testing

Backward compatible: existing code using LLMJudge() without arguments
continues to work by falling back to Anthropic.

Closes #477

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 16:27:08 -05:00
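The backward-compatible constructor shape this describes, as a sketch (class names are stand-ins for the framework's types):

```python
class AnthropicProvider:        # stand-in for the real provider
    def complete(self, prompt: str) -> str:
        raise NotImplementedError

class LLMJudge:
    def __init__(self, provider=None):
        # Backward compatible: no-arg construction still uses Anthropic
        self.provider = provider or AnthropicProvider()
```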
Timothy @aden 63010d45b2 Merge pull request #861 from TimothyZhang7/main
fix: explain the PR requirements
2026-01-26 12:43:30 -08:00
Timothy @aden 59db8f99d7 Merge branch 'adenhq:main' into main 2026-01-26 12:42:22 -08:00
Timothy 236e8e8638 fix: explain the pr requirement 2026-01-26 12:41:16 -08:00
Timothy @aden 3279686342 Merge pull request #858 from TimothyZhang7/main
fix: PR requirements enforcement
2026-01-26 12:32:11 -08:00
Timothy @aden b6a77ffd7e Merge branch 'adenhq:main' into main 2026-01-26 12:31:25 -08:00
Timothy e0544a57f9 fix: pr requirements 2026-01-26 12:30:12 -08:00
AryanRevolutionizingWorld 82c32e8d9f refactor(mcp): replace print() with logging in setup scripts
Replace direct print() statements with Python's logging module in MCP
setup and verification scripts for better configurability and
production readiness.

Changes:
- setup_mcp.py: Convert 30+ print() calls to structured logging
- verify_mcp.py: Convert 40+ print() calls to structured logging
- mcp_server.py: Convert 4 print() calls to structured logging
- Preserve colored CLI output using logging formatters
- Maintain all functional behavior (refactor only)

Benefits:
- Configurable log levels (debug/info/warning/error)
- Better observability in production environments
- Cleaner programmatic usage (no stdout pollution)
- Professional logging practices

Fixes #833
2026-01-27 01:57:16 +05:30
Bryan @ Aden a180d78d0c Merge pull request #782 from ayush123-bit/docs/windows-environment-clarification
docs: clarify Windows environment expectations in setup guides
2026-01-26 12:22:43 -08:00
Bryan @ Aden 9be036aa37 Merge pull request #602 from Kira714/fix/callable-type-annotation-599
fix(llm): correct type annotation from lowercase `callable` to `Callable`
2026-01-26 12:22:34 -08:00
Bryan @ Aden 8c39dad22d Merge pull request #605 from Kira714/fix/session-state-dict-validation-590
fix(executor): add type validation for session state memory
2026-01-26 12:22:21 -08:00
Bryan @ Aden 0a7aa62c45 Merge pull request #608 from Kira714/fix/agent-runtime-keyerror
fix(runtime): use safe dictionary access in trigger_and_wait()
2026-01-26 12:22:12 -08:00
Bryan @ Aden cbd34db278 Merge pull request #665 from Ranxin2023/main
Make MCP tool registration idempotent to avoid conflicts with agent-generated tools
2026-01-26 12:22:02 -08:00
Bryan @ Aden 414d86f2f0 Merge pull request #672 from subham-panja/docs/fix-architecture-link
docs(readme): fix broken architecture documentation link
2026-01-26 12:21:34 -08:00
Timothy @aden 4852d7f63b Merge pull request #842 from TimothyZhang7/main
fix: PR requirements backfill
2026-01-26 11:55:55 -08:00
Timothy @aden 1165858a58 Merge branch 'adenhq:main' into main 2026-01-26 11:54:51 -08:00
Timothy 4575540d69 fix: pr requirements 2026-01-26 11:54:10 -08:00
Timothy @aden 051aa4f065 Merge pull request #840 from TimothyZhang7/main
fix: backfill pr requirements
2026-01-26 11:49:06 -08:00
Timothy 6834dcfcb7 fix: backfill pr requirements 2026-01-26 11:47:04 -08:00
Timothy @aden 95c481ae52 Merge pull request #824 from TimothyZhang7/main
Fix: PR requirements
2026-01-26 11:17:09 -08:00
Timothy @aden 5c266d6920 Merge branch 'adenhq:main' into main 2026-01-26 11:16:04 -08:00
Timothy 7fe21d91f2 fix: pr requirements 2026-01-26 11:15:29 -08:00
Timothy @aden 751715bffe Merge pull request #822 from TimothyZhang7/main
PR Requirements Workflow
2026-01-26 11:12:31 -08:00
Timothy @aden a6bda9628c Merge branch 'adenhq:main' into main 2026-01-26 10:57:47 -08:00
Timothy ac646603c9 chore: enforce pr requirement 2026-01-26 10:54:50 -08:00
Timothy @aden 551e648be7 Merge pull request #810 from TimothyZhang7/main
GitHub Actions to auto dedupe issues
2026-01-26 10:42:13 -08:00
Timothy @aden 2f852a7eba Merge branch 'adenhq:main' into main 2026-01-26 10:41:16 -08:00
Timothy 7d462ff976 feat(actions): auto dedupe workflow 2026-01-26 10:38:01 -08:00
Timothy d1cfef5d8a fix: issue dedupe action 2026-01-26 10:06:37 -08:00
Bryan @ Aden f3c9c591bf Merge pull request #610 from Kira714/fix/semaphore-private-access
fix(stream): avoid private Semaphore._value attribute access
2026-01-26 10:05:32 -08:00
Bryan @ Aden 0bbe2d5889 Merge pull request #444 from fermano/fix/executionstream-oom
fix(runtime): execution stream memory leak
2026-01-26 10:02:23 -08:00
Timothy @aden aa341317f5 Merge pull request #791 from TimothyZhang7/main
chore(actions): automated bot
2026-01-26 09:45:45 -08:00
Timothy 6ae38b66ba chore(actions): automated bot 2026-01-26 09:43:25 -08:00
Arush Wadhawan 40e39d29f8 docs(llm): add DeepSeek models support documentation and examples
Signed-off-by: Arush Wadhawan <warush23+github@gmail.com>
2026-01-26 12:24:51 -05:00
ayush123-bit 6d7d472792 docs: clarify Windows environment expectations and WSL recommendation 2026-01-26 22:31:20 +05:30
Vishal dae63214d5 Fix race condition in ConcurrentStorage.stop() causing data loss
Fixes #755

Problem:
The stop() method had a critical race condition where _flush_pending() and
_batch_task competed for queue items, causing:
- Data loss during shutdown
- Queue items processed twice or lost
- Batch writer cancelled mid-write

Root Cause:
The method called _flush_pending() while _batch_task was still running.
Both operations drained the same queue simultaneously, leading to conflicts.

Solution:
Reordered shutdown sequence to:
1. Cancel batch task first
2. Wait for task completion (handles CancelledError with final flush)
3. Then flush any remaining items

This eliminates queue competition because:
- _batch_writer() flushes its current batch when cancelled
- After cancellation completes, _flush_pending() safely processes remaining items
- No race condition, no data loss

Changes:
- Moved batch task cancellation before _flush_pending()
- Ensures clean shutdown sequence
- Prevents queue drain conflicts

Testing:
- All 209 tests pass
- No duplicate flushes
- Clean shutdown guaranteed

Impact:
- Prevents data loss during graceful shutdown
- Eliminates race condition between flush operations
- Ensures all writes complete before stop returns
2026-01-26 21:38:59 +05:30
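A minimal asyncio sketch of the reordered shutdown, under the structure the commit describes (class internals are assumptions):

```python
import asyncio

class ConcurrentStorage:
    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self._batch_task: asyncio.Task | None = None

    async def stop(self) -> None:
        if self._batch_task is not None:
            self._batch_task.cancel()        # 1. stop the competing drainer
            try:
                await self._batch_task       # 2. let it flush its current batch
            except asyncio.CancelledError:
                pass
        await self._flush_pending()          # 3. now the sole queue consumer

    async def _flush_pending(self) -> None:
        while not self._queue.empty():
            _ = self._queue.get_nowait()     # persist remaining items (elided)
```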
bryan 46bdedcabb ruff check fix 2026-01-26 07:32:03 -08:00
Patrick 5fbaae5d8d Merge branch 'adenhq:main' into main 2026-01-26 20:04:51 +08:00
Abdul Taufeeq M c9bc2b287e security: prevent path traversal attacks in FileStorage
Add comprehensive input validation to _validate_key() method that blocks:
- Empty keys
- Path separators (/ and \)
- Parent directory references (..)
- Absolute paths
- Null bytes
- Dangerous shell characters

Apply validation to all index operations: _get_index(), _add_to_index(), _remove_from_index()
Add 21 comprehensive test cases covering valid keys and all attack scenarios

Fixes: CWE-22 Path Traversal vulnerability (CVSS 7.5-9.1 Critical)

Tests: 21/21 passing
2026-01-26 17:31:43 +05:30
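A condensed sketch of the key validation described above (checks only; the real method also guards shell metacharacters):

```python
import os

def _validate_key(key: str) -> None:
    if not key:
        raise ValueError("empty key")
    if "/" in key or "\\" in key:
        raise ValueError("path separators not allowed")
    if ".." in key:
        raise ValueError("parent directory references not allowed")
    if os.path.isabs(key):
        raise ValueError("absolute paths not allowed")
    if "\x00" in key:
        raise ValueError("null bytes not allowed")
```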
subhampanja28 5b46132c81 docs(readme): fix broken architecture documentation link 2026-01-26 17:24:22 +05:30
RanxinLi 7e65ab0b36 Revert local Claude settings 2026-01-26 03:28:03 -08:00
RanxinLi 8a86787b64 Merge branch 'main' of https://github.com/Ranxin2023/hive 2026-01-26 03:24:01 -08:00
RanxinLi b2acfb5447 fix the file tool registration bug 2026-01-26 03:23:49 -08:00
Sourabsb 10ea23be34 fix: improve cleanup race handling, thread join warning, and CancelledError strategy
- Treat run_coroutine_threadsafe race (RuntimeError) as expected: mark cleanup_attempted and log debug
- Mark cleanup_attempted on timeout/errors to avoid misleading fallback
- Add warning when loop thread fails to terminate within join timeout
- Make CancelledError best-effort (log, no re-raise) for session and stdio cleanup
2026-01-26 16:38:58 +05:30
Sourabsb 37a0324c05 fix: increase thread join timeout and clarify redundant None assignments
Changes based on Copilot AI review (2 issues):

1. Thread join timeout was shorter than cleanup timeout (Issue #1):
   - Changed _THREAD_JOIN_TIMEOUT from 5 to 12 seconds
   - Must be >= cleanup timeout (10s) plus buffer for loop.stop()
   - Prevents thread abandonment during active cleanup

2. Added detailed comment for redundant None assignments (Issue #2):
   - Explained why we set _session/_stdio_context to None even if
     _cleanup_stdio_async() already did it
   - Documents the safety cases: timeout, failure, skip, cancellation
   - Makes code intent clear for future maintainers
2026-01-26 16:31:22 +05:30
Sourabsb 837ef2da59 fix: address Copilot AI review - timeouts and CancelledError handling
Changes based on Copilot AI review (3 issues):

1. Increased thread join timeout (Issue #1):
   - Changed from 2 to 5 seconds
   - Made proportional to cleanup timeout
   - Defined as class constant _THREAD_JOIN_TIMEOUT

2. Handle asyncio.CancelledError explicitly (Issue #2):
   - Added separate except clause for CancelledError
   - Logs specific warning for cancelled cleanup
   - Re-raises CancelledError as per asyncio best practices
   - Added for both session and stdio_context cleanup

3. Increased cleanup timeout to match connection timeout (Issue #3):
   - Changed from 5 to 10 seconds (matches _connect_stdio timeout)
   - Defined as class constant _CLEANUP_TIMEOUT
   - Prevents incomplete cleanup with slow MCP servers
2026-01-26 16:23:44 +05:30
Muzzaiyyan Hussain e0bc265bb2 test: add pytest coverage for core graph executor success and failure 2026-01-26 16:18:32 +05:30
Sourabsb a39afbea23 fix: separate TimeoutError handling for better error reporting
Per Copilot AI review: distinguish timeout scenarios from actual
cleanup failures by catching TimeoutError separately. This helps
with debugging by providing clearer error messages.
2026-01-26 16:15:57 +05:30
Sourabsb 7375b26925 fix: address all Copilot AI review comments
Changes based on Copilot AI review (5 issues):

1. Simplified _cleanup_stdio_async():
   - Used try/finally pattern for cleaner reference clearing
   - References cleared in finally block (always executed)

2. Removed deprecated asyncio.get_event_loop():
   - Removed complex temp loop pattern entirely
   - Simplified fallback to just log warning and clear refs

3. Simplified fallback path (Issue #4):
   - When loop exists but not running, resources are in undefined state
   - Complex event loop manipulation removed
   - Just log warning and proceed with reference clearing
   - OS will reclaim resources on process exit

4. Handled race condition (Issue #5):
   - Added comment documenting the inherent race condition
   - Added try/except around loop.call_soon_threadsafe()
   - Track cleanup_attempted flag for proper fallback handling

5. Added explanatory comments:
   - Documented why redundant None assignments exist (safety)
   - Explained race condition handling approach

Note: Test coverage suggestion (#3) acknowledged but deferred
to separate PR to keep this fix focused.
2026-01-26 16:09:36 +05:30
Sourabsb 3626051b1a fix: address Copilot AI review suggestions for disconnect cleanup
Changes based on Copilot AI review:

1. Fixed fallback path using temp event loop pattern:
   - asyncio.run() may fail if there's already an event loop in current thread
   - Now uses new_event_loop() + set_event_loop() + run_until_complete() pattern
   - Preserves and restores original loop if one existed

2. Set references to None immediately after __aexit__:
   - self._session = None after closing session
   - self._stdio_context = None after closing context
   - Prevents window where closed objects are still referenced
   - Also clears on error to prevent reuse of broken objects

3. Added documentation for critical cleanup order:
   - Session must close BEFORE stdio_context
   - Session depends on streams provided by stdio_context
   - Mirrors initialization order in _connect_stdio()
   - Added warning comment to prevent future breakage
2026-01-26 15:59:59 +05:30
Sourabsb fbcdaf7c6d fix: add is_running() and is_closed() checks to _run_async() to prevent deadlock
When self._loop exists but is not running or is closed (e.g., crashed,
stopped externally, or closed), the code now falls through to the
standard approach that properly handles both sync and async contexts.

Key changes:
- Added is_running() AND is_closed() checks before using run_coroutine_threadsafe()
- Removed separate else branch with asyncio.run() that didn't handle async context
- Now falls through to standard approach which:
  - Detects if already in async context (get_running_loop)
  - Uses separate thread with new event loop if in async context
  - Uses asyncio.run() only when no event loop is running

Edge cases covered:
1. self._loop is None (sync context) -> uses asyncio.run()
2. self._loop is None (async context) -> uses thread with new loop
3. self._loop running normally -> uses run_coroutine_threadsafe()
4. self._loop stopped (sync context) -> falls through, uses asyncio.run()
5. self._loop stopped (async context) -> falls through, uses thread
6. self._loop closed (sync context) -> falls through, uses asyncio.run()
7. self._loop closed (async context) -> falls through, uses thread

Fixes #625
2026-01-26 15:35:53 +05:30
Kira714 6934b331d4 fix(stream): avoid private Semaphore._value attribute access
Calculate available_slots from running execution count instead of
accessing the private _value attribute of asyncio.Semaphore.

Private attributes may change between Python versions and are not
part of the public API.

Fixes #609

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 16:37:49 +08:00
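The public-API alternative in one line (names assumed): derive free slots from the running-execution count instead of reading the private attribute.

```python
# max_concurrent is the semaphore's capacity; running tracks live executions.
def available_slots(max_concurrent: int, running: set) -> int:
    return max(0, max_concurrent - len(running))
```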
Kira714 734fe1e4d7 fix(runtime): use safe dictionary access in trigger_and_wait()
Replace direct dictionary access with .get() and explicit ValueError
to prevent KeyError when entry_point_id is not found in _streams dict.

Fixes #589

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 16:36:55 +08:00
Kira714 d900f38f64 fix(executor): add type validation for session state memory
Fixes #590

Previously, the code assumed `session_state["memory"]` was always a dict
when the key existed. If it was `None` or another non-dict type, this
would raise a TypeError during iteration.

Now we validate the type before iterating and log a warning if the
memory data is not a dict, preventing runtime crashes when resuming
from malformed session states.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 16:28:56 +08:00
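A sketch of the defensive check described above (logger name is illustrative):

```python
import logging

logger = logging.getLogger("executor")

def load_memory(session_state: dict) -> dict:
    memory = session_state.get("memory")
    if not isinstance(memory, dict):
        logger.warning("session memory is %r, expected dict; ignoring", memory)
        return {}
    return memory
```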
Kira714 1c78174aaf fix(llm): correct type annotation from lowercase callable to Callable
Fixes #599

The `callable` keyword in Python is a builtin function to check if something
is callable, NOT a type annotation. For type hints, we need `Callable` from
the typing module.

Changed:
- `tool_executor: callable` → `tool_executor: Callable[[ToolUse], ToolResult]`

Files updated:
- core/framework/llm/provider.py
- core/framework/llm/anthropic.py
- core/framework/llm/litellm.py

This fixes mypy/pyright type checking errors like:
"Variable annotation syntax is for types; callable is a function"

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 16:27:54 +08:00
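The before/after in miniature, with stand-in ToolUse/ToolResult types:

```python
from typing import Callable

class ToolUse: ...
class ToolResult: ...

# Before (wrong): `callable` is the builtin predicate, not a type
# def run(tool_executor: callable) -> None: ...

# After (correct): a real type annotation from typing
def run(tool_executor: Callable[[ToolUse], ToolResult]) -> None:
    ...
```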
Abdul Taufeeq M b897e5bdf2 test(web-scrape): add comprehensive tests for URL link conversion
Add 7 test methods to TestWebScrapeToolLinkConversion class to validate
the new URL conversion feature:

- test_relative_links_converted_to_absolute: ../page and page.html -> absolute
- test_root_relative_links_converted: /about -> absolute
- test_absolute_links_unchanged: https://external.com remains unchanged
- test_links_after_redirects: Uses final URL, not requested URL
- test_fragment_links_preserved: #section1 anchors work correctly
- test_query_parameters_preserved: ?id=123&sort=date retained
- test_empty_href_skipped: Empty text links filtered out

All tests use unittest.mock for HTTP response mocking to avoid live network calls.
Tests comprehensively validate the urljoin() implementation that converts all
relative URLs to absolute URLs based on the final response URL.
2026-01-26 13:28:12 +05:30
Abdul Taufeeq M 09dd990273 fix(web-scrape): convert relative URLs to absolute URLs using urljoin
- Add urljoin import from urllib.parse
- Convert all extracted links to absolute URLs based on page base_url
- Use response.url as base_url to handle redirects correctly
- Fixes issue where relative links like '../page' were unusable
2026-01-26 13:09:34 +05:30
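A minimal sketch of the conversion: resolve every href against the final response URL so redirects are handled, as the commit describes (the helper itself is illustrative):

```python
from urllib.parse import urljoin

def absolutize(links: list[str], final_url: str) -> list[str]:
    # '../page' -> 'https://example.com/page'; absolute URLs pass through
    return [urljoin(final_url, href) for href in links if href]
```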
savan patel af3b8b1b80 Fix: Add MockLLMProvider to enable mock mode execution
- Created MockLLMProvider class that generates placeholder JSON responses
- Updated AgentRunner._setup() to use MockLLMProvider when mock_mode=True
- Added MockLLMProvider to llm module exports
- Fixes issue where agents failed with 'LLM not available' in mock mode

The MockLLMProvider extracts expected output keys from system prompts
and generates mock JSON responses for structural validation without
making real LLM API calls. This enables:
- Testing agent structure without API keys
- Fast iteration on agent graphs
- CI/CD testing without credentials
- Zero-cost structural validation

Tested with simple agent - all nodes execute successfully in mock mode.
2026-01-26 12:34:14 +05:30
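A hedged sketch of the idea: derive expected output keys from the system prompt and return placeholder JSON, so graphs validate structurally without API calls. The key-extraction convention here is an assumption:

```python
import json
import re

class MockLLMProvider:
    def complete(self, system_prompt: str, user_prompt: str = "") -> str:
        # Assumed convention: prompts mention output keys as "key_name":
        keys = re.findall(r'"(\w+)"\s*:', system_prompt) or ["result"]
        return json.dumps({k: f"<mock {k}>" for k in keys})
```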
Sourabsb fc539a5d7b fix: add fallback cleanup when event loop is not running
Added else branch to handle edge case where loop exists but is not running. Uses asyncio.run() as fallback to ensure cleanup happens even if the loop was stopped externally or due to an error.
2026-01-26 12:04:43 +05:30
hundao d558bf4f60 feat(tools): add CSV tools with DuckDB SQL support
Add comprehensive CSV manipulation tools:
- csv_read: Read CSV with pagination (limit/offset)
- csv_write: Create new CSV files
- csv_append: Append rows to existing CSV
- csv_info: Get CSV metadata (columns, row count, file size)
- csv_sql: Query CSV using SQL (powered by DuckDB)

Features:
- Session sandbox security (workspace_id, agent_id, session_id)
- DuckDB as optional dependency for SQL queries
- Security: Only SELECT queries allowed, dangerous keywords blocked
- Full Unicode support
- 45 tests covering all tools

Install SQL support: pip install tools[sql]
2026-01-26 14:23:18 +08:00
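A sketch of the SELECT-only guard described above; the blocked-keyword list is an assumption based on the commit message:

```python
BLOCKED = {"insert", "update", "delete", "drop", "alter", "attach", "copy"}

def check_sql(query: str) -> None:
    q = query.strip().lower()
    if not q.startswith("select"):
        raise ValueError("only SELECT queries are allowed")
    if any(kw in q.split() for kw in BLOCKED):
        raise ValueError("query contains a blocked keyword")
```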
Sourabsb 99efbe03bb fix: properly close MCP session and STDIO context managers in disconnect()
Added _cleanup_stdio_async() method to properly call __aexit__() on session and stdio_context before stopping the event loop.

This prevents resource leaks, zombie processes, and unclosed file handles.
2026-01-26 11:35:28 +05:30
gaurav 5168ed3cd4 fix(tools): validate Content-Type in web_scrape tool (Closes #487) 2026-01-26 11:15:52 +05:30
vakrahul f614ee7f15 style: fix line length violation in output_cleaner.py 2026-01-26 10:50:42 +05:30
Timothy @aden 02330653ee Merge pull request #489 from TimothyZhang7/main
docs: architecture readme
2026-01-25 19:37:36 -08:00
Timothy ae37d9816e docs: architecture readme 2026-01-25 19:36:03 -08:00
RichardTang-Aden 7351675795 Merge pull request #222 from Chrishabh2002/feat/manual-agent-codefirst
Add minimal code-first agent example and isolate core dependencies
2026-01-25 19:21:22 -08:00
RichardTang-Aden fa5d5057f4 Merge pull request #447 from pradyten/fix/hallucination-detection-full-string-check
fix(graph): check entire string for code indicators in hallucination detection
2026-01-25 19:08:19 -08:00
Bryan @ Aden 854a867597 Merge pull request #293 from yumosx/graph
feat(file_system_toolkits): add encoding and max_size params to view_file
2026-01-25 18:37:42 -08:00
RichardTang-Aden 35ef467dbe Merge pull request #361 from Koushith/fix/docs-hardcoded-path-and-venv
fix(docs): remove hardcoded path and add venv troubleshooting
2026-01-25 18:19:02 -08:00
yumosx 89dbc638e1 test(file_system): add tests for file viewing edge cases
Add tests for file viewing functionality with max_size truncation, negative max_size, custom encoding, and invalid encoding scenarios to ensure proper error handling and behavior.
2026-01-26 10:14:18 +08:00
Timothy @aden 4eac1b9e97 Merge pull request #475 from aiSynergy37/fix/validate-api-key-warning
Fix validate() fallback to warn on model-specific API keys
2026-01-25 18:09:17 -08:00
mithileshk 80f938a7af Fix validate warning for model-specific API keys 2026-01-26 07:34:13 +05:30
Richard T 2f7cf3bc57 chore: remove the outdated architecture documentation 2026-01-25 17:56:18 -08:00
vakrahul 1a7ed9c962 style: fix F821 undefined name and E501 line length errors 2026-01-26 07:25:23 +05:30
Bryan @ Aden 7004fffc08 Merge pull request #402 from Shamanth-8/fix/rce-safe-eval
Unsanitized expression evaluation in EdgeSpec (RCE Vulnerability)
2026-01-25 17:53:15 -08:00
vakrahul 06535192e6 verifying 2026-01-26 07:22:04 +05:30
vakrahul 5923147a71 chore(graph): fix lint issues in retry backoff loggings 2026-01-26 07:19:59 +05:30
Timothy @aden acaa89f584 Merge pull request #434 from nihalmorshed/documentation/fix-tool-name-references
docs(README): update tool names and descriptions in README inside "tools"
2026-01-25 17:47:36 -08:00
Timothy @aden e6af1f64ac Merge pull request #427 from guillermop2002/fix/remove-hardcoded-anthropic-provider
fix(llm): use LiteLLMProvider instead of hardcoded AnthropicProvider
2026-01-25 17:47:05 -08:00
Richard T 53aebd5cea docs: add issue assignment for contributors 2026-01-25 17:38:23 -08:00
Shivam Sharma d64020e024 removing the custom rule. 2026-01-26 07:05:43 +05:30
Shivam Sharma 975a002796 1. Fixed quickstart.sh to use uv.
2. Gave core and tools separate venvs.
2026-01-26 06:53:52 +05:30
Shivam Sharma 6e6b83848f uv sync 2026-01-26 06:53:52 +05:30
Shivam Sharma 3fb255c906 added agents.md 2026-01-26 06:53:52 +05:30
Shivam Sharma cd51d663fb Made it fast using uv (package manager), ruff (linter), and ty (type checker).

1. Added an agents.md file for better AI assistance.
2. Replaced pip with uv and added the ty type checker.
2026-01-26 06:53:52 +05:30
Bryan @ Aden 28b0b6206b Merge pull request #470 from adenhq/chore-fix-python-tests
chore: fixed python tests
2026-01-25 17:19:57 -08:00
bryan 9859dc65e0 chore: fixed python tests 2026-01-25 17:19:21 -08:00
Bryan @ Aden 5c2288fbf5 Merge pull request #397 from Tahir-yamin/fix/respect-node-max-retries
fix(graph): Respect node_spec.max_retries configuration
2026-01-25 17:07:51 -08:00
yumosx 1b47d1cad4 Merge remote-tracking branch 'upstream/main' into graph 2026-01-26 09:06:20 +08:00
Bryan @ Aden 126bbf17c3 Merge pull request #228 from himanshu748/fix/remove-duplicate-web-search-registration
fix: remove duplicate web_search tool registration
2026-01-25 17:04:07 -08:00
Timothy 995ab8faaf fix: allow triage for all issues 2026-01-25 16:58:30 -08:00
RichardTang-Aden 9d1b1ab9d4 Merge pull request #187 from RussellLuo/improve-runtime-config
feat(skills): add support for setting `api_key` and `api_base` in RuntimeConfig
2026-01-25 16:45:21 -08:00
Bryan @ Aden 7e630b9416 Merge pull request #259 from charan2456/fix/docs-exports-clarification
docs: clarify that exports/ is user-generated, not included in repo
2026-01-25 16:27:15 -08:00
Timothy 14faca3933 fix: remove oidc token permission check 2026-01-25 16:22:19 -08:00
Timothy e8c9cc65dc chore: use GITHUB_TOKEN in action 2026-01-25 16:19:16 -08:00
RichardTang-Aden f0deedb1f8 Merge pull request #174 from AysunItai/fix/anthropic-provider-response-format
fix: align AnthropicProvider.complete with LLMProvider (response_format)
2026-01-25 16:01:16 -08:00
Bryan @ Aden 70693f4824 Merge pull request #231 from himanshu748/docs/update-skills-directory-structure
docs: update skills directory structure to match actual output
2026-01-25 15:57:27 -08:00
RichardTang-Aden 3ee380d98f Merge pull request #166 from LunaStev/translate/korean
Translate Korean
2026-01-25 15:54:14 -08:00
Timothy @aden b9b0c2c844 Merge pull request #451 from adenhq/add-claude-github-actions-1769383167894
Add claude GitHub actions 1769383167894
2026-01-25 15:41:07 -08:00
bryan c53acfdf77 set model 2026-01-25 15:39:31 -08:00
bryan 08beffea33 added claude issue triage workflow 2026-01-25 15:31:10 -08:00
Bryan @ Aden 7ed5006a70 "Claude Code Review workflow" 2026-01-25 15:19:29 -08:00
Bryan @ Aden e009de1c9a "Claude PR Assistant workflow" 2026-01-25 15:19:28 -08:00
Pradyumn Tendulkar df7b950e6f fix(graph): check entire string for code indicators in hallucination detection
Previously, the hallucination detection in SharedMemory.write() and
OutputValidator.validate_no_hallucination() only checked the first 500
characters for code indicators. This allowed hallucinated code to bypass
detection by prefixing with innocuous text.

Changes:
- Add _contains_code_indicators() method to SharedMemory and OutputValidator
- Check entire string for strings under 10KB
- Use strategic sampling (start, 25%, 50%, 75%, end) for longer strings
- Expand code indicators to include JavaScript, SQL, and HTML/script patterns
- Add comprehensive test suite with 19 test cases

Fixes #443

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 18:06:09 -05:00
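A sketch of the sampling strategy described: scan short strings fully, sample five windows (start, 25%, 50%, 75%, end) for strings over 10KB. The indicator list and window size are illustrative:

```python
INDICATORS = ("def ", "function ", "SELECT ", "<script", "import ")

def _contains_code_indicators(text: str, window: int = 2048) -> bool:
    if len(text) <= 10_000:
        samples = [text]
    else:
        points = (0, len(text) // 4, len(text) // 2,
                  3 * len(text) // 4, max(0, len(text) - window))
        samples = [text[p:p + window] for p in points]
    return any(ind in chunk for ind in INDICATORS for chunk in samples)
```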
Fernando Mano 7f3bc811b0 fix(runtime): execution stream memory leak -- adjust gitignore 2026-01-25 19:42:47 -03:00
guillermop2002 f0c9d4e87f fix(llm): use LiteLLMProvider instead of hardcoded AnthropicProvider
Fixes #213
2026-01-25 22:19:29 +01:00
Nihal Morshed 57781c520e docs(README): update tool names and descriptions in README inside "tools" 2026-01-26 03:17:28 +06:00
Nihal Morshed 05b18fb312 fix(tools): remove duplicate registration of web search tool 2026-01-26 03:06:50 +06:00
Fernando Mano 829783749c fix(runtime): execution stream memory leak 2026-01-25 17:21:05 -03:00
Shamanth-8 48b38e5d95 Fix: route unsanitized expression evaluation in EdgeSpec through the safe evaluator 2026-01-25 23:56:01 +05:30
Tahir Yamin 1527a05336 fix(graph): Respect node_spec.max_retries configuration
- Remove hardcoded max_retries_per_node = 3
- Use node_spec.max_retries for all retry logic
- Add comprehensive test suite (6 test cases)
- Allows per-node retry configuration as intended

Fixes #363
2026-01-25 23:06:26 +05:00
vakrahul 491e6585a4 fix(graph): implement exponential backoff for node retries 2026-01-25 23:09:09 +05:30
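The typical shape of such a backoff, as a sketch (base, cap, and jitter are assumptions, not the PR's actual values):

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    # Exponential growth, capped, with jitter to avoid thundering herds
    return min(cap, base * (2 ** attempt)) * (0.5 + random.random() / 2)

# usage: time.sleep(backoff_delay(attempt)) between node retries
```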
koushith 8333ba6ec2 fix(docs): remove hardcoded path and add venv troubleshooting
- Replace hardcoded /home/timothy/oss/hive/ with generic instruction
- Add troubleshooting section for PEP 668 externally-managed-environment error
- Document virtual environment setup for Python 3.12+ on macOS/WSL/Linux

Fixes #322
Fixes #355
2026-01-25 22:22:45 +05:30
yumosx a5fcb89991 feat(file_system_toolkits): add encoding and max_size params to view_file
Add support for custom file encoding and size limits when viewing files. The max_size parameter prevents loading excessively large files by truncating content and adding a warning message when the limit is exceeded. Also includes validation for negative max_size values and checks if path is a file.
2026-01-25 21:53:51 +08:00
kali 3fd8f9f97a fix: enhance Python detection and error handling in setup script 2026-01-25 19:11:52 +05:30
kali 2180a60c21 Fix setup-python.sh to prefer python3.12/3.11 and support PYTHON override 2026-01-25 18:56:27 +05:30
Adith L S f64820a13e fix(cli): fix KeyError 'steps' in cmd_list function
The cmd_list function stored node count as 'nodes' but tried to
access it as 'steps', causing a KeyError when listing agents.

Changed agent['steps'] to agent['nodes'] to match the dict key.
2026-01-25 18:25:03 +05:30
Kotapati Venkata Sai Charan 073be1f870 docs: clarify that exports/ is user-generated, not included in repo
Fixes #202

- Update docs/getting-started.md to explain exports/ is created by users

- Remove references to non-existent support_ticket_agent example

- Update DEVELOPER.md with correct agent creation instructions
2026-01-25 18:10:06 +05:30
himanshu748 86686fc8f9 docs: update skills directory structure to match actual output
- Update .claude/skills/ structure in getting-started.md
- Reflect actual skills generated by quickstart.sh:
  - agent-workflow/
  - building-agents-construction/
  - building-agents-core/
  - building-agents-patterns/
  - testing-agent/

Fixes #177
2026-01-25 07:10:46 -05:00
himanshu748 8fe51a8aa9 fix: remove duplicate web_search tool registration
- Remove redundant register_web_search(mcp) call on line 54
- Keep single registration with credentials parameter
- Tool implementation handles both credential sources internally
- Added clarifying comment explaining the credential handling

Fixes #172
2026-01-25 07:05:13 -05:00
Chrishabh2002 715df547bb chore: remove generated agent logs and ignore them 2026-01-25 17:23:50 +05:30
Chrishabh2002 c454870ac8 add code-first agent example and isolate core dependencies 2026-01-25 17:21:58 +05:30
RussellLuo 68766fd131 fix(skills): load MCP servers correctly
Closes #188.
2026-01-25 17:34:34 +08:00
RussellLuo ce39cb7dde feat(skills): add support for setting api_key and api_base
Closes #186.
2026-01-25 16:05:13 +08:00
patrick e1663793c7 feat: add nullable_output_keys 2026-01-24 22:43:42 +08:00
Aysun Itai e2f387965e fix: align AnthropicProvider.complete with LLMProvider (response_format)
Update AnthropicProvider.complete to accept response_format and forward it to LiteLLMProvider.
Added unit test in test_litellm_provider.py to verify parameter forwarding.
2026-01-24 11:59:53 +02:00
LunaStev e75253f16a add missed 2026-01-24 15:05:26 +09:00
LunaStev 7d416f5421 translate korean 2026-01-24 15:00:38 +09:00
Timothy @aden cdbcac68b8 Merge pull request #165 from adenhq/staging
staging to main
2026-01-23 18:40:12 -08:00
Timothy @aden d52b6e8e56 Merge pull request #164 from TimothyZhang7/feature/multi-entrypoint-arch
Feature/multi entrypoint arch
2026-01-23 18:39:37 -08:00
Timothy 510975619d fix: register mcp tools properly, load parent env 2026-01-23 18:32:04 -08:00
Timothy 49724b6da0 Merge branch 'staging' into feature/multi-entrypoint-arch 2026-01-23 17:05:33 -08:00
Richard T c84e9c96f5 feat: clean up tool testing 2026-01-23 17:00:53 -08:00
Timothy @aden 31b252c018 Merge pull request #159 from bryanadenhq/fix-json-output
Chore: Small bug fixes with JSON output
2026-01-23 16:59:41 -08:00
Timothy dd2254989f fix: adjust tool credential check 2026-01-23 16:56:44 -08:00
Timothy 7aa56b905c feat: framework guardrails 2026-01-23 16:31:46 -08:00
Timothy 9f4948edbe fix: agent building skills 2026-01-23 15:28:51 -08:00
RichardTang-Aden cfba965c52 Merge pull request #126 from ashrotd/add-mcp-server-tests
Add test_mcp_server with smoke tests
2026-01-23 15:10:03 -08:00
Timothy 2765c9fe93 feat: concurrent framework entrypoints 2026-01-23 15:02:55 -08:00
RichardTang-Aden bffaab6ac0 Merge pull request #107 from uttam-salamander/test/add-unit-tests-security-plan-example
test: add unit tests for security, plan, and example_tool modules
2026-01-23 14:50:27 -08:00
bryan 8f223ee564 Merge branch 'staging' into fix-json-output 2026-01-23 14:48:42 -08:00
Richard T 482a4933d5 feat: Add Ruff configuration and update .gitignore
- Add Ruff linter configuration to core/pyproject.toml
- Add uv.lock to .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 14:43:03 -08:00
bryan b0e870d1db updated output to clean JSON, updated set goal, changed llm to llm_generate 2026-01-23 14:27:45 -08:00
RichardTang-Aden 93f0181ff5 Merge pull request #155 from iconicompany/main
feat: Add .venv to .gitignore and improve script error handling
2026-01-23 13:51:52 -08:00
Viacheslav Borisov 4b33f2a237 feat: Add .venv to .gitignore and improve script error handling
Adds the `.venv` directory to the `.gitignore` file to prevent accidental commits.

Also, enhances the `scripts/setup-python.sh` script to include error handling for the `pip install` command, providing a more informative message if the upgrade fails.
2026-01-24 01:14:08 +04:00
Timothy @aden 17bfbf9732 Merge pull request #152 from TimothyZhang7/feature/spec
Feature/spec
2026-01-23 12:09:10 -08:00
Timothy da0c0acdcf Merge branch 'staging' into feature/spec 2026-01-23 12:04:09 -08:00
Timothy @aden ea4c56108b Merge pull request #149 from bryanadenhq/fix-remove-llm-from-mcp
Fix remove llm from mcp
2026-01-23 12:00:45 -08:00
Timothy @aden 73ba72ee52 Merge pull request #151 from adenhq/main
Align staging branch
2026-01-23 11:51:32 -08:00
Timothy @aden b21c29b56a Merge pull request #150 from bryanadenhq/chore-fix-warnings
Chore fix warnings
2026-01-23 11:49:58 -08:00
bryan f83bfdf50c fixed pytest warnings 2026-01-23 11:45:02 -08:00
bryan f67e0cc4ae cli and documentation updates 2026-01-23 11:31:10 -08:00
bryan 8d4f107f63 removed all llm dependencies from mcp server 2026-01-23 11:15:24 -08:00
Timothy @aden e434579258 Merge pull request #147 from TimothyZhang7/fix/python-version
chore: requires python3.11
2026-01-23 11:13:16 -08:00
Timothy f494c80051 chore: requires python3.11 2026-01-23 11:12:03 -08:00
Timothy @aden 6cc11590cd Merge pull request #70 from HarshaKilaru/fix/grep-search-error-handling
fix(grep_search): improve grep_search error granularity and regex validation
2026-01-23 09:55:09 -08:00
Timothy @aden 9619cf903b Merge pull request #138 from adenhq/staging
Staging to main
2026-01-23 09:42:24 -08:00
Timothy @aden 8504ad7c8c Merge pull request #141 from TimothyZhang7/staging
chore: lint issues
2026-01-23 09:39:39 -08:00
Timothy 447d25d7cc chore: lint issues 2026-01-23 09:35:55 -08:00
Sriharsha Kilaru 10b9db2771 Merge branch 'main' into fix/grep-search-error-handling 2026-01-23 12:20:08 -05:00
Sriharsha Kilaru 5176b6a459 refactor: move grep_search to tools path to align with main 2026-01-23 11:59:35 -05:00
Sriharsha Kilaru b23e1edea8 chore: force GitHub merge conflict re-evaluation in grep_search 2026-01-23 11:39:54 -05:00
Sriharsha Kilaru 460ffa0260 chore: trigger merge conflict re-evaluation 2026-01-23 11:34:13 -05:00
Sriharsha Kilaru 7cab63f28d chore: manual cleanup of grep_search 2026-01-23 11:27:37 -05:00
Sriharsha Kilaru db4b79a32b fix: finalize grep_search logic and resolve merge conflict 2026-01-23 11:13:01 -05:00
Sriharsha Kilaru d669fe132e Merge branch 'main' into fix/grep-search-error-handling 2026-01-23 11:01:52 -05:00
Timothy @aden ea0b47ce05 Merge pull request #131 from bryanadenhq/chore--update-quickstart
update to quickstart
2026-01-23 07:54:22 -08:00
RichardTang-Aden c94a94cbe0 Merge pull request #65 from Samkit02/feature/robots-txt-compliance 2026-01-22 20:03:20 -08:00
Timothy 7c6c3a8cc2 feat: node I/O cleaner 2026-01-22 19:59:29 -08:00
Samkit Shah 5e4d2331d5 feature(web-scrape): add robots.txt compliance
- Add respect_robots_txt parameter (default: True)
- Implement _get_robots_parser() with caching
- Implement _is_allowed_by_robots() check
- Return clear error when blocked by robots.txt
Fixes #23
2026-01-22 21:58:32 -06:00
Timothy @aden ffff7d0758 Merge pull request #68 from yumosx/main
test: add test cases for run.py
2026-01-22 19:23:18 -08:00
bryan 8051505800 update to quickstart 2026-01-22 18:59:25 -08:00
yumosx 6f4c3b117d Merge remote-tracking branch 'upstream/main' 2026-01-23 10:35:13 +08:00
yumosx 012bf5d987 fix(test_run): cast duration to int in assertion 2026-01-23 10:34:24 +08:00
Timothy 5930a3c95d chore: llm provider note 2026-01-22 16:15:52 -08:00
Timothy @aden a79f9f82b0 Merge pull request #128 from bryanadenhq/fix/testing
testing updates
2026-01-22 16:12:38 -08:00
bryan d439fc06c7 testing updates 2026-01-22 16:08:22 -08:00
RichardTang-Aden 111c38c943 Merge pull request #63 from bryanadenhq/fix/testing
update testing phase for new mcp tools
2026-01-22 14:29:11 -08:00
bryan 4cab6ec387 Merge branch 'staging' into fix/testing 2026-01-22 14:27:37 -08:00
Timothy @aden e0019fe59d Merge pull request #127 from adenhq/main
chore: sync main back to staging
2026-01-22 13:56:02 -08:00
bryan 75b37a4fbd fixes to merge 2026-01-22 13:49:50 -08:00
bryan 1f39b50dc0 Merge branch 'staging' into fix/testing 2026-01-22 13:49:39 -08:00
dhakalrabin cb80d89b72 Test_mcp_server added 2026-01-22 16:35:11 -05:00
bryan d05d4aabd7 updated testing tools to use full code 2026-01-22 13:12:53 -08:00
bryan 8bcec7da14 Merge branch 'staging' into fix/testing 2026-01-22 08:16:19 -08:00
Uttam Kumar fc2bfc67cd test(example-tool): add unit tests for example_tool
Add 17 tests covering:
- Valid input: basic message, uppercase, repeat options
- Input validation: empty message, max length, repeat range
- Edge cases: unicode, special characters, whitespace

Closes #59
2026-01-22 08:52:13 -07:00
Uttam Kumar c02eba403a test(plan): add unit tests for Plan enums and dataclasses
Add 41 tests covering:
- Enum values: ActionType, StepStatus, ApprovalDecision, JudgmentAction, ExecutionStatus
- PlanStep.is_ready() with various dependency scenarios
- Plan.from_json() parsing and error handling
- Plan methods: get_step, get_ready_steps, is_complete, to_feedback_context
- Serialization round-trip tests

Closes #58
2026-01-22 08:52:07 -07:00
Uttam Kumar cb1cac00bf test(security): add unit tests for get_secure_path()
Add 19 tests covering:
- Happy path: session directory creation, path resolution, nested paths
- Security: path traversal attacks, symlink detection patterns
- Error handling: missing IDs, None values, empty paths

Closes #57
2026-01-22 08:51:58 -07:00
Sriharsha Kilaru 4cb0ca673d fix(tools): improve grep_search error handling and regex validation
Aligned implementation with README documentation by adding specific exception handling for FileNotFoundError and PermissionError.
2026-01-22 02:36:01 -05:00
yumosx 946cf91038 test: remove unused imports and docstrings in test_run.py 2026-01-22 13:30:59 +08:00
yumosx 4bffe17402 Merge remote-tracking branch 'origin1/main' 2026-01-22 13:26:59 +08:00
yumosx d9a58dcfe6 test: add test cases for run module 2026-01-22 13:25:00 +08:00
bryan 937cbfffb6 update to gitignore 2026-01-21 19:02:29 -08:00
402 changed files with 97447 additions and 10230 deletions
+15
@@ -0,0 +1,15 @@
{
"hooks": {
"PostToolUse": [
{
"matcher": "Edit|Write|NotebookEdit",
"hooks": [
{
"type": "command",
"command": "ruff check --fix \"$CLAUDE_FILE_PATH\" 2>/dev/null; ruff format \"$CLAUDE_FILE_PATH\" 2>/dev/null; true"
}
]
}
]
}
}
-19
@@ -1,19 +0,0 @@
{
"permissions": {
"allow": [
"Bash(npm install:*)",
"Bash(npm test:*)",
"Skill(building-agents-construction)",
"Skill(building-agents-construction:*)",
"Bash(PYTHONPATH=core:exports pytest:*)",
"mcp__agent-builder__create_session",
"mcp__agent-builder__get_session_status",
"mcp__agent-builder__set_goal",
"mcp__agent-builder__list_mcp_servers",
"mcp__agent-builder__test_node",
"mcp__agent-builder__add_node",
"mcp__agent-builder__add_edge",
"mcp__agent-builder__validate_graph"
]
}
}
@@ -1,953 +0,0 @@
---
name: building-agents-construction
description: Step-by-step guide for building goal-driven agents. Creates package structure, defines goals, adds nodes, connects edges, and finalizes agent class. Use when actively building an agent.
license: Apache-2.0
metadata:
author: hive
version: "1.0"
type: procedural
part_of: building-agents
requires: building-agents-core
---
# Building Agents - Construction Process
Step-by-step guide for building goal-driven agent packages.
**Prerequisites:** Read `building-agents-core` for fundamental concepts.
## CRITICAL: entry_points Format Reference
**⚠️ Common Mistake Prevention:**
The `entry_points` parameter in GraphSpec has a specific format that is easy to get wrong. This section exists because this mistake has caused production bugs.
### Correct Format
```python
entry_points = {"start": "first-node-id"}
```
**Examples from working agents:**
```python
# From exports/outbound_sales_agent/agent.py
entry_node = "lead-qualification"
entry_points = {"start": "lead-qualification"}
# From exports/support_ticket_agent/agent.py (FIXED)
entry_node = "parse-ticket"
entry_points = {"start": "parse-ticket"}
```
### WRONG Formats (DO NOT USE)
```python
# ❌ WRONG: Using node ID as key with input keys as value
entry_points = {
"parse-ticket": ["ticket_content", "customer_id", "ticket_id"]
}
# Error: ValidationError: Input should be a valid string, got list
# ❌ WRONG: Using set instead of dict
entry_points = {"parse-ticket"}
# Error: ValidationError: Input should be a valid dictionary, got set
# ❌ WRONG: Missing "start" key
entry_points = {"entry": "parse-ticket"}
# Error: Graph execution fails, cannot find entry point
```
### Validation Check
After writing graph configuration, ALWAYS validate:
```python
# Check 1: Must be a dict
assert isinstance(entry_points, dict), f"entry_points must be dict, got {type(entry_points)}"
# Check 2: Must have "start" key
assert "start" in entry_points, f"entry_points must have 'start' key, got keys: {entry_points.keys()}"
# Check 3: "start" value must match entry_node
assert entry_points["start"] == entry_node, f"entry_points['start']={entry_points['start']} must match entry_node={entry_node}"
# Check 4: Value must be a string (node ID)
assert isinstance(entry_points["start"], str), f"entry_points['start'] must be string, got {type(entry_points['start'])}"
```
**Why this matters:** GraphSpec uses Pydantic validation. The wrong format causes ValidationError at runtime, which blocks all agent execution and tests. This bug is not caught until you try to run the agent.
## Building Session Management with MCP
**MANDATORY**: Use the agent-builder MCP server's BuildSession system for automatic bookkeeping and persistence.
### Available MCP Session Tools
```python
# Create new session (call FIRST before building)
mcp__agent-builder__create_session(name="Support Ticket Agent")
# Returns: session_id, automatically sets as active session
# Get current session status (use for progress tracking)
status = mcp__agent-builder__get_session_status()
# Returns: {
# "session_id": "build_20250122_...",
# "name": "Support Ticket Agent",
# "has_goal": true,
# "node_count": 5,
# "edge_count": 7,
# "nodes": ["parse-ticket", "categorize", ...],
# "edges": [("parse-ticket", "categorize"), ...]
# }
# List all saved sessions
mcp__agent-builder__list_sessions()
# Load previous session
mcp__agent-builder__load_session_by_id(session_id="build_...")
# Delete session
mcp__agent-builder__delete_session(session_id="build_...")
```
### How MCP Session Works
The BuildSession class (in `core/framework/mcp/agent_builder_server.py`) automatically:
- **Persists to disk** after every operation (`_save_session()` called automatically)
- **Tracks all components**: goal, nodes, edges, mcp_servers
- **Maintains timestamps**: created_at, last_modified
- **Stores to**: `~/.claude-code-agent-builder/sessions/`
When you call MCP tools like:
- `mcp__agent-builder__set_goal(...)` - Automatically added to session.goal and saved
- `mcp__agent-builder__add_node(...)` - Automatically added to session.nodes and saved
- `mcp__agent-builder__add_edge(...)` - Automatically added to session.edges and saved
**No manual bookkeeping needed** - the MCP server handles it all!
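For intuition, here is a minimal sketch of this auto-persistence pattern (class and field names are illustrative, not the actual BuildSession implementation):
```python
# Illustrative sketch only - the real BuildSession may differ.
import json
import time
from pathlib import Path
SESSIONS_DIR = Path.home() / ".claude-code-agent-builder" / "sessions"
class SketchBuildSession:
    def __init__(self, name: str):
        self.name = name
        self.session_id = f"build_{time.strftime('%Y%m%d_%H%M%S')}"
        self.goal = None
        self.nodes: list[dict] = []
        self.edges: list[list[str]] = []
        self.created_at = time.time()
        self.last_modified = self.created_at
        self._save_session()  # persisted from the moment it exists
    def add_node(self, node: dict) -> None:
        self.nodes.append(node)
        self._save_session()  # every mutation triggers a save
    def _save_session(self) -> None:
        self.last_modified = time.time()
        SESSIONS_DIR.mkdir(parents=True, exist_ok=True)
        out = SESSIONS_DIR / f"{self.session_id}.json"
        out.write_text(json.dumps(self.__dict__, default=str, indent=2))
```
Because `_save_session()` runs after every mutation, a crash loses at most the single in-flight operation.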
### Show Progress to User
```python
# Get session status to show progress
status = json.loads(mcp__agent-builder__get_session_status())
print(f"\n📊 Building Progress:")
print(f" Session: {status['name']}")
print(f" Goal defined: {status['has_goal']}")
print(f" Nodes: {status['node_count']}")
print(f" Edges: {status['edge_count']}")
print(f" Nodes added: {', '.join(status['nodes'])}")
```
**Benefits:**
- Automatic persistence - sessions survive crashes/restarts
- Clear audit trail - all operations logged
- Session resume - continue from where you left off
- Progress tracking built-in
- No manual state management needed
## Step-by-Step Guide
### Step 1: Create Building Session & Package Structure
When user requests an agent, **immediately create MCP session and package**:
```python
# 0. FIRST: Create MCP building session
agent_name = "technical_research_agent" # snake_case
session_result = mcp__agent-builder__create_session(name=agent_name.replace('_', ' ').title())
session_id = json.loads(session_result)["session_id"]
print(f"✅ Created building session: {session_id}")
# 1. Create directory
package_path = f"exports/{agent_name}"
Bash(f"mkdir -p {package_path}/nodes")
# 2. Write skeleton files
Write(
file_path=f"{package_path}/__init__.py",
content='''"""
Agent package - will be populated as build progresses.
"""
'''
)
Write(
file_path=f"{package_path}/nodes/__init__.py",
content='''"""Node definitions."""
from framework.graph import NodeSpec
# Nodes will be added here as they are approved
__all__ = []
'''
)
Write(
file_path=f"{package_path}/agent.py",
content='''"""Agent graph construction."""
from framework.graph import EdgeSpec, EdgeCondition, Goal, SuccessCriterion, Constraint
from framework.graph.edge import GraphSpec
from framework.graph.executor import GraphExecutor
from framework.runtime import Runtime
from framework.llm.anthropic import AnthropicProvider
from framework.runner.tool_registry import ToolRegistry
from aden_tools.credentials import CredentialManager
# Goal will be added when defined
# Nodes will be imported from .nodes
# Edges will be added when approved
# Agent class will be created when graph is complete
'''
)
Write(
file_path=f"{package_path}/config.py",
content='''"""Runtime configuration."""
from dataclasses import dataclass
@dataclass
class RuntimeConfig:
model: str = "claude-sonnet-4-5-20250929"
temperature: float = 0.7
max_tokens: int = 4096
default_config = RuntimeConfig()
# Metadata will be added when goal is set
'''
)
Write(
file_path=f"{package_path}/__main__.py",
content=CLI_TEMPLATE # Full CLI template (see below)
)
```
**Show user:**
```
✅ Package created: exports/technical_research_agent/
📁 Files created:
- __init__.py (skeleton)
- __main__.py (CLI ready)
- agent.py (skeleton)
- nodes/__init__.py (empty)
- config.py (skeleton)
You can open these files now and watch them grow as we build!
```
### Step 2: Define Goal
Propose goal, get approval, **write immediately**:
```python
# After user approves goal...
goal_code = f'''
goal = Goal(
id="{goal_id}",
name="{name}",
description="{description}",
success_criteria=[
SuccessCriterion(
id="{sc.id}",
description="{sc.description}",
metric="{sc.metric}",
target="{sc.target}",
weight={sc.weight},
),
# ... more criteria
],
constraints=[
Constraint(
id="{c.id}",
description="{c.description}",
constraint_type="{c.constraint_type}",
category="{c.category}",
),
# ... more constraints
],
)
'''
# Append to agent.py
Read(f"{package_path}/agent.py") # Must read first
Edit(
file_path=f"{package_path}/agent.py",
old_string="# Goal will be added when defined",
new_string=f"# Goal definition\n{goal_code}"
)
# Write metadata to config.py
metadata_code = f'''
@dataclass
class AgentMetadata:
name: str = "{name}"
version: str = "1.0.0"
description: str = "{description}"
metadata = AgentMetadata()
'''
Read(f"{package_path}/config.py")
Edit(
file_path=f"{package_path}/config.py",
old_string="# Metadata will be added when goal is set",
new_string=f"# Agent metadata\n{metadata_code}"
)
```
**Show user:**
```
✅ Goal written to agent.py
✅ Metadata written to config.py
Open exports/technical_research_agent/agent.py to see the goal!
```
**Note:** Goal is automatically tracked in MCP session. Use `mcp__agent-builder__get_session_status()` to check progress.
### Step 3: Add Nodes (Incremental)
**⚠️ CRITICAL VALIDATION REQUIREMENTS:**
Before adding any node with tools:
1. Call `mcp__agent-builder__list_mcp_tools()` to discover available tools
2. Verify each tool exists in the response
3. If a tool doesn't exist, inform the user and ask how to proceed
After writing each node:
4. **MANDATORY**: Validate with `mcp__agent-builder__test_node()` before proceeding
5. **MANDATORY**: Check MCP session status to track progress
6. Only proceed to next node after validation passes
For each node, **write immediately after approval**:
```python
# After user approves node...
node_code = f'''
{node_id.replace('-', '_')}_node = NodeSpec(
id="{node_id}",
name="{name}",
description="{description}",
node_type="{node_type}",
input_keys={input_keys},
output_keys={output_keys},
system_prompt="""\\
{system_prompt}
""",
tools={tools},
max_retries={max_retries},
)
'''
# Append to nodes/__init__.py
Read(f"{package_path}/nodes/__init__.py")
Edit(
file_path=f"{package_path}/nodes/__init__.py",
old_string="__all__ = []",
new_string=f"{node_code}\n__all__ = []"
)
# Update __all__ exports
all_node_names = [n.replace('-', '_') + '_node' for n in approved_nodes]
all_exports = f"__all__ = {all_node_names}"
Edit(
file_path=f"{package_path}/nodes/__init__.py",
old_string="__all__ = []",
new_string=all_exports
)
```
**Show user after each node:**
```
✅ Added analyze_request_node to nodes/__init__.py
📊 Progress: 1/6 nodes added
Open exports/technical_research_agent/nodes/__init__.py to see it!
```
**Repeat for each node.** User watches the file grow.
#### MANDATORY: Validate Each Node with MCP Tools
After writing EVERY node, you MUST validate before proceeding:
```python
# Node is already written to file. Now VALIDATE IT (REQUIRED):
validation_result = json.loads(mcp__agent-builder__test_node(
node_id="analyze-request",
test_input='{"query": "test query"}',
mock_llm_response='{"analysis": "mock output"}'
))
# Check validation result
if validation_result["valid"]:
# Show user validation passed
print(f"✅ Node validation passed: analyze-request")
# Show session progress
status = json.loads(mcp__agent-builder__get_session_status())
print(f"📊 Session progress: {status['node_count']} nodes added")
else:
# STOP - Do not proceed until fixed
print(f"❌ Node validation FAILED:")
for error in validation_result["errors"]:
print(f" - {error}")
print("⚠️ Must fix node before proceeding to next component")
# Ask user how to proceed
```
**CRITICAL:** Do NOT proceed to the next node until validation passes. Bugs caught here prevent wasted work later.
### Step 4: Connect Edges
After all nodes approved, add edges:
```python
# Generate edges code
edges_code = "edges = [\n"
for edge in approved_edges:
edges_code += f''' EdgeSpec(
id="{edge.id}",
source="{edge.source}",
target="{edge.target}",
condition=EdgeCondition.{edge.condition.upper()},
'''
if edge.condition_expr:
edges_code += f' condition_expr="{edge.condition_expr}",\n'
edges_code += f' priority={edge.priority},\n'
edges_code += ' ),\n'
edges_code += "]\n"
# Write to agent.py
Read(f"{package_path}/agent.py")
Edit(
file_path=f"{package_path}/agent.py",
old_string="# Edges will be added when approved",
new_string=f"# Edge definitions\n{edges_code}"
)
# Write entry points and terminal nodes
# ⚠️ CRITICAL: entry_points format must be {"start": "node_id"}
# Common mistake: {"node_id": ["input_keys"]} is WRONG
# Correct format: {"start": "first-node-id"}
# Reference: See exports/outbound_sales_agent/agent.py for example
graph_config = f'''
# Graph configuration
entry_node = "{entry_node_id}"
entry_points = {{"start": "{entry_node_id}"}} # CRITICAL: Must be {{"start": "node-id"}}
pause_nodes = {pause_nodes}
terminal_nodes = {terminal_nodes}
# Collect all nodes
nodes = [
{', '.join(node_names)},
]
# Agent class will be created when graph is complete
'''
Edit(
file_path=f"{package_path}/agent.py",
old_string="# Agent class will be created when graph is complete",
new_string=graph_config
)
```
**Show user:**
```
✅ Edges written to agent.py
✅ Graph configuration added
5 edges connecting 6 nodes
```
#### MANDATORY: Validate Graph Structure
After writing edges, you MUST validate before proceeding to finalization:
```python
# Edges already written to agent.py. Now VALIDATE STRUCTURE (REQUIRED):
graph_validation = json.loads(mcp__agent-builder__validate_graph())
# Check for structural issues
if graph_validation["valid"]:
print("✅ Graph structure validated successfully")
# Show session summary
status = json.loads(mcp__agent-builder__get_session_status())
print(f" - Nodes: {status['node_count']}")
print(f" - Edges: {status['edge_count']}")
print(f" - Entry point: {entry_node_id}")
else:
print("❌ Graph validation FAILED:")
for error in graph_validation["errors"]:
print(f" ERROR: {error}")
print("\n⚠️ Must fix graph structure before finalizing agent")
# Ask user how to proceed
# Additional validation: Check entry_points format
if not isinstance(entry_points, dict):
print("❌ CRITICAL ERROR: entry_points must be a dict")
print(f" Current value: {entry_points} (type: {type(entry_points)})")
print(" Correct format: {'start': 'node-id'}")
# STOP - This is the mistake that caused the support_ticket_agent bug
if entry_points.get("start") != entry_node_id:
print("❌ CRITICAL ERROR: entry_points['start'] must match entry_node")
print(f" entry_points: {entry_points}")
print(f" entry_node: {entry_node_id}")
print(" They must be consistent!")
```
**CRITICAL:** Do NOT proceed to Step 5 (finalization) until graph validation passes. This checkpoint prevents structural bugs from reaching production.
### Step 5: Finalize Agent Class
**Pre-flight checks before finalization:**
```python
# MANDATORY: Verify all validations passed before finalizing
print("\n🔍 Pre-finalization Checklist:")
# Get current session status
status = json.loads(mcp__agent-builder__get_session_status())
checks_passed = True
# Check 1: Goal defined
if not status["has_goal"]:
print("❌ No goal defined")
checks_passed = False
else:
print(f"✅ Goal defined: {status['goal_name']}")
# Check 2: Nodes added
if status["node_count"] == 0:
print("❌ No nodes added")
checks_passed = False
else:
print(f"{status['node_count']} nodes added: {', '.join(status['nodes'])}")
# Check 3: Edges added
if status["edge_count"] == 0:
print("❌ No edges added")
checks_passed = False
else:
print(f"{status['edge_count']} edges added")
# Check 4: Entry points format correct
if not isinstance(entry_points, dict) or "start" not in entry_points:
print("❌ CRITICAL: entry_points format incorrect")
print(f" Current: {entry_points}")
print(" Required: {'start': 'node-id'}")
checks_passed = False
else:
print(f"✅ Entry points valid: {entry_points}")
if not checks_passed:
print("\n⚠️ CANNOT PROCEED to finalization until all checks pass")
print(" Fix the issues above first")
# Ask user how to proceed or stop here
return
print("\n✅ All pre-flight checks passed - proceeding to finalization\n")
```
Write the agent class:
````python
agent_class_code = f'''
class {agent_class_name}:
"""
{agent_description}
"""
def __init__(self, config=None):
self.config = config or default_config
self.goal = goal
self.nodes = nodes
self.edges = edges
self.entry_node = entry_node
self.entry_points = entry_points
self.pause_nodes = pause_nodes
self.terminal_nodes = terminal_nodes
self.executor = None
def _create_executor(self, mock_mode=False):
"""Create executor instance."""
import tempfile
from pathlib import Path
storage_path = Path(tempfile.gettempdir()) / "{agent_name}"
storage_path.mkdir(parents=True, exist_ok=True)
runtime = Runtime(storage_path=storage_path)
tool_registry = ToolRegistry()
llm = None
if not mock_mode:
creds = CredentialManager()
if creds.is_available("anthropic"):
api_key = creds.get("anthropic")
llm = AnthropicProvider(api_key=api_key, model=self.config.model)
graph = GraphSpec(
id="{agent_name}-graph",
goal_id=self.goal.id,
version="1.0.0",
entry_node=self.entry_node,
entry_points=self.entry_points,
terminal_nodes=self.terminal_nodes,
pause_nodes=self.pause_nodes,
nodes=self.nodes,
edges=self.edges,
default_model=self.config.model,
max_tokens=self.config.max_tokens,
)
self.executor = GraphExecutor(
runtime=runtime,
llm=llm,
tools=list(tool_registry.get_tools().values()),
tool_executor=tool_registry.get_executor(),
)
self.graph = graph
return self.executor
async def run(self, context: dict, mock_mode=False, session_state=None):
"""Run the agent."""
executor = self._create_executor(mock_mode=mock_mode)
result = await executor.execute(
graph=self.graph,
goal=self.goal,
input_data=context,
session_state=session_state,
)
return result
def info(self):
"""Get agent information."""
return {{
"name": metadata.name,
"version": metadata.version,
"description": metadata.description,
"goal": {{
"name": self.goal.name,
"description": self.goal.description,
}},
"nodes": [n.id for n in self.nodes],
"edges": [e.id for e in self.edges],
"entry_node": self.entry_node,
"pause_nodes": self.pause_nodes,
"terminal_nodes": self.terminal_nodes,
}}
def validate(self):
"""Validate agent structure."""
errors = []
warnings = []
node_ids = {{node.id for node in self.nodes}}
for edge in self.edges:
if edge.source not in node_ids:
errors.append(f"Edge {{edge.id}}: source '{{edge.source}}' not found")
if edge.target not in node_ids:
errors.append(f"Edge {{edge.id}}: target '{{edge.target}}' not found")
if self.entry_node not in node_ids:
errors.append(f"Entry node '{{self.entry_node}}' not found")
return {{
"valid": len(errors) == 0,
"errors": errors,
"warnings": warnings,
}}
# Create default instance
default_agent = {agent_class_name}()
'''
# Append agent class (replace the marker kept at the end of the graph config;
# the class must come after the nodes list because default_agent instantiates it)
Read(f"{package_path}/agent.py")
Edit(
file_path=f"{package_path}/agent.py",
old_string="# Agent class will be created when graph is complete",
new_string=agent_class_code
)
# Finalize __init__.py exports
init_content = f'''"""
{agent_description}
"""
from .agent import {agent_class_name}, default_agent, goal, nodes, edges
from .config import RuntimeConfig, AgentMetadata, default_config, metadata
__version__ = "1.0.0"
__all__ = [
"{agent_class_name}",
"default_agent",
"goal",
"nodes",
"edges",
"RuntimeConfig",
"AgentMetadata",
"default_config",
"metadata",
]
'''
Read(f"{package_path}/__init__.py")
Edit(
file_path=f"{package_path}/__init__.py",
old_string='"""',
new_string=init_content,
replace_all=True
)
# Write README
readme_content = f'''# {agent_name.replace('_', ' ').title()}
{agent_description}
## Usage
```bash
# Show agent info
python -m {agent_name} info
# Validate structure
python -m {agent_name} validate
# Run agent
python -m {agent_name} run --input '{{"key": "value"}}'
# Interactive shell
python -m {agent_name} shell
```
## As Python Module
```python
from {agent_name} import default_agent
result = await default_agent.run({{"key": "value"}})
```
## Structure
- `agent.py` - Goal, edges, graph construction
- `nodes/__init__.py` - Node definitions
- `config.py` - Runtime configuration
- `__main__.py` - CLI interface
'''
Write(
file_path=f"{package_path}/README.md",
content=readme_content
)
````
**Show user:**
```
✅ Agent class written to agent.py
✅ Package exports finalized in __init__.py
✅ README.md generated
🎉 Agent complete: exports/technical_research_agent/
Commands:
python -m technical_research_agent info
python -m technical_research_agent validate
python -m technical_research_agent run --input '{"topic": "..."}'
```
**Final session summary:**
```python
# Show final MCP session status
status = json.loads(mcp__agent-builder__get_session_status())
print("\n📊 Build Session Summary:")
print(f" Session ID: {status['session_id']}")
print(f" Agent: {status['name']}")
print(f" Goal: {status['goal_name']}")
print(f" Nodes: {status['node_count']}")
print(f" Edges: {status['edge_count']}")
print(f" MCP Servers: {status['mcp_servers_count']}")
print("\n✅ Agent construction complete with full validation")
print(f"\nSession saved to: ~/.claude-code-agent-builder/sessions/{status['session_id']}.json")
```
## CLI Template
```python
CLI_TEMPLATE = '''"""
CLI entry point for agent.
"""
import asyncio
import json
import sys
import click
from .agent import default_agent
@click.group()
@click.version_option(version="1.0.0")
def cli():
"""Agent CLI."""
pass
@cli.command()
@click.option("--input", "-i", "input_json", type=str, required=True)
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--quiet", "-q", is_flag=True, help="Only output result JSON")
def run(input_json, mock, quiet):
"""Execute the agent."""
try:
context = json.loads(input_json)
except json.JSONDecodeError as e:
click.echo(f"Error parsing input JSON: {e}", err=True)
sys.exit(1)
if not quiet:
click.echo(f"Running agent with input: {json.dumps(context)}")
result = asyncio.run(default_agent.run(context, mock_mode=mock))
output_data = {
"success": result.success,
"steps_executed": result.steps_executed,
"output": result.output,
}
if result.error:
output_data["error"] = result.error
if result.paused_at:
output_data["paused_at"] = result.paused_at
click.echo(json.dumps(output_data, indent=2, default=str))
sys.exit(0 if result.success else 1)
@cli.command()
@click.option("--json", "output_json", is_flag=True)
def info(output_json):
"""Show agent information."""
info_data = default_agent.info()
if output_json:
click.echo(json.dumps(info_data, indent=2))
else:
click.echo(f"Agent: {info_data['name']}")
click.echo(f"Description: {info_data['description']}")
click.echo(f"Nodes: {len(info_data['nodes'])}")
click.echo(f"Edges: {len(info_data['edges'])}")
@cli.command()
def validate():
"""Validate agent structure."""
validation = default_agent.validate()
if validation["valid"]:
click.echo("✓ Agent is valid")
else:
click.echo("✗ Agent has errors:")
for error in validation["errors"]:
click.echo(f" ERROR: {error}")
sys.exit(0 if validation["valid"] else 1)
@cli.command()
def shell():
"""Interactive agent session."""
click.echo("Interactive mode - enter JSON input:")
# ... implementation
if __name__ == "__main__":
cli()
'''
```
## Testing During Build
After nodes are added:
```bash
# Test individual node
python -c "
from exports.my_agent.nodes import analyze_request_node
print(analyze_request_node.id)
print(analyze_request_node.input_keys)
"
# Validate current state
PYTHONPATH=core:exports python -m my_agent validate
# Show info
PYTHONPATH=core:exports python -m my_agent info
```
## Approval Pattern
Use AskUserQuestion for all approvals:
```python
response = AskUserQuestion(
questions=[{
"question": "Do you approve this [component]?",
"header": "Approve",
"options": [
{
"label": "✓ Approve (Recommended)",
"description": "Component looks good, proceed"
},
{
"label": "✗ Reject & Modify",
"description": "Need to make changes"
},
{
"label": "⏸ Pause & Review",
"description": "Need more time to review"
}
],
"multiSelect": false
}]
)
```
## Next Steps
After completing construction:
**If agent structure complete:**
- Validate: `python -m agent_name validate`
- Test basic execution: `python -m agent_name info`
- Proceed to testing-agent skill for comprehensive tests
**If implementation needed:**
- Check for STATUS.md or IMPLEMENTATION_GUIDE.md in agent directory
- May need Python functions or MCP tool integration
## Related Skills
- **building-agents-core** - Fundamental concepts
- **building-agents-patterns** - Best practices and examples
- **testing-agent** - Test and validate completed agents
- **agent-workflow** - Complete workflow orchestrator
@@ -1,303 +0,0 @@
---
name: building-agents-core
description: Core concepts for goal-driven agents - architecture, node types, tool discovery, and workflow overview. Use when starting agent development or need to understand agent fundamentals.
license: Apache-2.0
metadata:
author: hive
version: "1.0"
type: foundational
part_of: building-agents
---
# Building Agents - Core Concepts
Foundational knowledge for building goal-driven agents as Python packages.
## Architecture: Python Services (Not JSON Configs)
Agents are built as Python packages:
```
exports/my_agent/
├── __init__.py # Package exports
├── __main__.py # CLI (run, info, validate, shell)
├── agent.py # Graph construction (goal, edges, agent class)
├── nodes/__init__.py # Node definitions (NodeSpec)
├── config.py # Runtime config
└── README.md # Documentation
```
**Key Principle: Agent is visible and editable during build**
- ✅ Files created immediately as components are approved
- ✅ User can watch files grow in their editor
- ✅ No session state - just direct file writes
- ✅ No "export" step - agent is ready when build completes
## Core Concepts
### Goal
Success criteria and constraints (written to agent.py)
```python
goal = Goal(
id="research-goal",
name="Technical Research Agent",
description="Research technical topics thoroughly",
success_criteria=[
SuccessCriterion(
id="completeness",
description="Cover all aspects of topic",
metric="coverage_score",
target=">=0.9",
weight=0.4,
),
# ... more criteria
],
constraints=[
Constraint(
id="accuracy",
description="All information must be verified",
constraint_type="hard",
category="quality",
),
# ... more constraints
],
)
```
### Node
Unit of work (written to nodes/__init__.py)
**Node Types:**
- `llm_generate` - Text generation, parsing
- `llm_tool_use` - Actions requiring tools
- `router` - Conditional branching
- `function` - Deterministic operations
```python
search_node = NodeSpec(
id="search-web",
name="Search Web",
description="Search for information online",
node_type="llm_tool_use",
input_keys=["query"],
output_keys=["search_results"],
system_prompt="Search the web for: {query}",
tools=["web_search"],
max_retries=3,
)
```
### Edge
Connection between nodes (written to agent.py)
**Edge Conditions:**
- `on_success` - Proceed if node succeeds
- `on_failure` - Handle errors
- `always` - Always proceed
- `conditional` - Based on expression
```python
EdgeSpec(
id="search-to-analyze",
source="search-web",
target="analyze-results",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
)
```
### Pause/Resume
Multi-turn conversations
- **Pause nodes** - Stop execution, wait for user input
- **Resume entry points** - Continue from pause with user's response
```python
# Example pause/resume configuration
pause_nodes = ["request-clarification"]
entry_points = {
"start": "analyze-request",
"request-clarification_resume": "process-clarification"
}
```
## Tool Discovery & Validation
**CRITICAL:** Before adding a node with tools, you MUST verify the tools exist.
Tools are provided by MCP servers. Never assume a tool exists - always discover dynamically.
### Step 1: Register MCP Server (if not already done)
```python
mcp__agent-builder__add_mcp_server(
name="aden-tools",
transport="stdio",
command="python",
args='["mcp_server.py", "--stdio"]',
cwd="../aden-tools"
)
```
### Step 2: Discover Available Tools
```python
# List all tools from all registered servers
mcp__agent-builder__list_mcp_tools()
# Or list tools from a specific server
mcp__agent-builder__list_mcp_tools(server_name="aden-tools")
```
This returns available tools with their descriptions and parameters:
```json
{
"success": true,
"tools_by_server": {
"aden-tools": [
{
"name": "web_search",
"description": "Search the web...",
"parameters": ["query"]
},
{
"name": "web_scrape",
"description": "Scrape a URL...",
"parameters": ["url"]
}
]
},
"total_tools": 14
}
```
### Step 3: Validate Before Adding Nodes
Before writing a node with `tools=[...]`:
1. Call `list_mcp_tools()` to get available tools
2. Check each tool in your node exists in the response
3. If a tool doesn't exist:
- **DO NOT proceed** with the node
- Inform the user: "The tool 'X' is not available. Available tools are: ..."
- Ask if they want to use an alternative or proceed without the tool
### Tool Validation Anti-Patterns
- **Never assume a tool exists** - always call `list_mcp_tools()` first
- **Never write a node with unverified tools** - validate before writing
- **Never silently drop tools** - if a tool doesn't exist, inform the user
- **Never guess tool names** - use exact names from discovery response
### Example Validation Flow
```python
# 1. User requests: "Add a node that searches the web"
# 2. Discover available tools
tools_response = json.loads(mcp__agent-builder__list_mcp_tools())
# 3. Check if web_search exists
available = [t["name"] for tools in tools_response["tools_by_server"].values() for t in tools]
if "web_search" not in available:
# Inform user and ask how to proceed
print("'web_search' not available. Available tools:", available)
else:
# Proceed with node creation
# ...
```
## Workflow Overview: Incremental File Construction
```
1. CREATE PACKAGE → mkdir + write skeletons
2. DEFINE GOAL → Write to agent.py + config.py
3. FOR EACH NODE:
- Propose design
- User approves
- Write to nodes/__init__.py IMMEDIATELY ← FILE WRITTEN
- (Optional) Validate with test_node ← MCP VALIDATION
- User can open file and see it
4. CONNECT EDGES → Update agent.py ← FILE WRITTEN
- (Optional) Validate with validate_graph ← MCP VALIDATION
5. FINALIZE → Write agent class to agent.py ← FILE WRITTEN
6. DONE - Agent ready at exports/my_agent/
```
**Files written immediately. MCP tools optional for validation/testing bookkeeping.**
### The Key Difference
**OLD (Bad):**
```
MCP add_node → Session State → MCP add_node → Session State → ...
MCP export_graph
Files appear
```
**NEW (Good):**
```
Write node to file → (Optional: MCP test_node) → Write node to file → ...
        ↓                                                ↓
  File visible                                     File visible
  immediately                                      immediately
```
**Bottom line:** Use Write/Edit for construction, MCP for validation if needed.
## When to Use This Skill
Use building-agents-core when:
- Starting a new agent project and need to understand fundamentals
- Need to understand agent architecture before building
- Want to validate tool availability before proceeding
- Learning about node types, edges, and graph execution
**Next Steps:**
- Ready to build? → Use `building-agents-construction` skill
- Need patterns and examples? → Use `building-agents-patterns` skill
## MCP Tools for Validation
After writing files, optionally use MCP tools for validation:
**test_node** - Validate node configuration with mock inputs
```python
mcp__agent-builder__test_node(
node_id="search-web",
test_input='{"query": "test query"}',
mock_llm_response='{"results": "mock output"}'
)
```
**validate_graph** - Check graph structure
```python
mcp__agent-builder__validate_graph()
# Returns: unreachable nodes, missing connections, etc.
```
**create_session** - Track session state for bookkeeping
```python
mcp__agent-builder__create_session(name="my-build")
```
**Key Point:** Files are written FIRST. MCP tools are for validation only.
## Related Skills
- **building-agents-construction** - Step-by-step building process
- **building-agents-patterns** - Best practices and examples
- **agent-workflow** - Complete workflow orchestrator
- **testing-agent** - Test and validate completed agents
@@ -1,497 +0,0 @@
---
name: building-agents-patterns
description: Best practices, patterns, and examples for building goal-driven agents. Includes pause/resume architecture, hybrid workflows, anti-patterns, and handoff to testing. Use when optimizing agent design.
license: Apache-2.0
metadata:
author: hive
version: "1.0"
type: reference
part_of: building-agents
---
# Building Agents - Patterns & Best Practices
Design patterns, examples, and best practices for building robust goal-driven agents.
**Prerequisites:** Complete agent structure using `building-agents-construction`.
## Practical Example: Hybrid Workflow
How to build a node using both direct file writes and optional MCP validation:
```python
# 1. WRITE TO FILE FIRST (Primary - makes it visible)
node_code = '''
search_node = NodeSpec(
id="search-web",
node_type="llm_tool_use",
input_keys=["query"],
output_keys=["search_results"],
system_prompt="Search the web for: {query}",
tools=["web_search"],
)
'''
Edit(
file_path="exports/research_agent/nodes/__init__.py",
old_string="# Nodes will be added here",
new_string=node_code
)
print("✅ Added search_node to nodes/__init__.py")
print("📁 Open exports/research_agent/nodes/__init__.py to see it!")
# 2. OPTIONALLY VALIDATE WITH MCP (Secondary - bookkeeping)
validation = json.loads(mcp__agent-builder__test_node(
node_id="search-web",
test_input='{"query": "python tutorials"}',
mock_llm_response='{"search_results": [...mock results...]}'
))
print(f"✓ Validation: {validation['success']}")
```
**User experience:**
- Immediately sees node in their editor (from step 1)
- Gets validation feedback (from step 2)
- Can edit the file directly if needed
This combines visibility (files) with validation (MCP tools).
## Pause/Resume Architecture
For agents needing multi-turn conversations with user interaction:
### Basic Pause/Resume Flow
```python
# Define pause nodes - execution stops at these nodes
pause_nodes = ["request-clarification", "await-approval"]
# Define entry points - where to resume from each pause
entry_points = {
"start": "analyze-request", # Initial entry
"request-clarification_resume": "process-clarification", # Resume from clarification
"await-approval_resume": "execute-action", # Resume from approval
}
```
### Example: Multi-Turn Research Agent
```python
# Nodes
nodes = [
NodeSpec(id="analyze-request", ...),
NodeSpec(id="request-clarification", ...), # PAUSE NODE
NodeSpec(id="process-clarification", ...),
NodeSpec(id="generate-results", ...),
NodeSpec(id="await-approval", ...), # PAUSE NODE
NodeSpec(id="execute-action", ...),
]
# Edges with resume flows
edges = [
EdgeSpec(
id="analyze-to-clarify",
source="analyze-request",
target="request-clarification",
condition=EdgeCondition.CONDITIONAL,
condition_expr="needs_clarification == true",
),
# When resumed, goes to process-clarification
EdgeSpec(
id="clarify-to-process",
source="request-clarification",
target="process-clarification",
condition=EdgeCondition.ALWAYS,
),
EdgeSpec(
id="results-to-approval",
source="generate-results",
target="await-approval",
condition=EdgeCondition.ALWAYS,
),
# When resumed, goes to execute-action
EdgeSpec(
id="approval-to-execute",
source="await-approval",
target="execute-action",
condition=EdgeCondition.ALWAYS,
),
]
# Configuration
pause_nodes = ["request-clarification", "await-approval"]
entry_points = {
"start": "analyze-request",
"request-clarification_resume": "process-clarification",
"await-approval_resume": "execute-action",
}
```
### Running Pause/Resume Agents
```python
# Initial run - will pause at first pause node
result1 = await agent.run(
context={"query": "research topic"},
session_state=None
)
# Check if paused
if result1.paused_at:
print(f"Paused at: {result1.paused_at}")
# Resume with user input
result2 = await agent.run(
context={"user_response": "clarification details"},
session_state=result1.session_state # Pass previous state
)
```
## Anti-Patterns
### What NOT to Do
**Don't rely on `export_graph`** - Write files immediately, not at end
```python
# BAD: Building in session state, exporting at end
mcp__agent-builder__add_node(...)
mcp__agent-builder__add_node(...)
mcp__agent-builder__export_graph() # Files appear only now
# GOOD: Writing files immediately
Write(file_path="...", content=node_code) # File visible now
Write(file_path="...", content=node_code) # File visible now
```
**Don't hide code in session** - Write to files as components approved
```python
# BAD: Accumulating changes invisibly
session.add_component(component1)
session.add_component(component2)
# User can't see anything yet
# GOOD: Incremental visibility
Edit(file_path="...", ...) # User sees change 1
Edit(file_path="...", ...) # User sees change 2
```
**Don't wait to write files** - Agent visible from first step
```python
# BAD: Building everything before writing
design_all_nodes()
design_all_edges()
write_everything_at_once()
# GOOD: Write as you go
write_package_structure() # Visible
write_goal() # Visible
write_node_1() # Visible
write_node_2() # Visible
```
**Don't batch everything** - Write incrementally
```python
# BAD: Batching all nodes
nodes = [design_node_1(), design_node_2(), ...]
write_all_nodes(nodes)
# GOOD: One at a time with user feedback
write_node_1() # User approves
write_node_2() # User approves
write_node_3() # User approves
```
### MCP Tools - Correct Usage
**MCP tools OK for:**
✅ `test_node` - Validate node configuration with mock inputs
✅ `validate_graph` - Check graph structure
✅ `create_session` - Track session state for bookkeeping
✅ Other validation tools
**Just don't:** Use MCP as the primary construction method or rely on export_graph
## Best Practices
### 1. Show Progress After Each Write
```python
# After writing a node
print("✅ Added analyze_request_node to nodes/__init__.py")
print("📊 Progress: 1/6 nodes added")
print("📁 Open exports/my_agent/nodes/__init__.py to see it!")
```
### 2. Let User Open Files During Build
```python
# Encourage file inspection
print("✅ Goal written to agent.py")
print("")
print("💡 Tip: Open exports/my_agent/agent.py in your editor to see the goal!")
```
### 3. Write Incrementally - One Component at a Time
```python
# Good flow
write_package_structure()
show_user("Package created")
write_goal()
show_user("Goal written")
for node in nodes:
get_approval(node)
write_node(node)
show_user(f"Node {node.id} written")
```
### 4. Test As You Build
```python
# After adding several nodes
print("💡 You can test current state with:")
print(" PYTHONPATH=core:exports python -m my_agent validate")
print(" PYTHONPATH=core:exports python -m my_agent info")
```
### 5. Keep User Informed
```python
# Clear status updates
print("🔨 Creating package structure...")
print("✅ Package created: exports/my_agent/")
print("")
print("📝 Next: Define agent goal")
```
## Continuous Monitoring Agents
For agents that run continuously without terminal nodes:
```python
# No terminal nodes - loops forever
terminal_nodes = []
# Workflow loops back to start
edges = [
EdgeSpec(id="monitor-to-check", source="monitor", target="check-condition"),
EdgeSpec(id="check-to-wait", source="check-condition", target="wait"),
EdgeSpec(id="wait-to-monitor", source="wait", target="monitor"), # Loop
]
# Entry node only
entry_node = "monitor"
entry_points = {"start": "monitor"}
pause_nodes = []
```
**Example: File Monitor**
```python
nodes = [
NodeSpec(id="list-files", ...),
NodeSpec(id="check-new-files", node_type="router", ...),
NodeSpec(id="process-files", ...),
NodeSpec(id="wait-interval", node_type="function", ...),
]
edges = [
EdgeSpec(id="list-to-check", source="list-files", target="check-new-files"),
EdgeSpec(
id="check-to-process",
source="check-new-files",
target="process-files",
condition=EdgeCondition.CONDITIONAL,
condition_expr="new_files_count > 0",
),
EdgeSpec(
id="check-to-wait",
source="check-new-files",
target="wait-interval",
condition=EdgeCondition.CONDITIONAL,
condition_expr="new_files_count == 0",
),
EdgeSpec(id="process-to-wait", source="process-files", target="wait-interval"),
EdgeSpec(id="wait-to-list", source="wait-interval", target="list-files"), # Loop back
]
terminal_nodes = [] # No terminal - runs forever
```
## Complex Routing Patterns
### Multi-Condition Router
```python
router_node = NodeSpec(
id="decision-router",
node_type="router",
input_keys=["analysis_result"],
output_keys=["decision"],
system_prompt="""
Based on the analysis result, decide the next action:
- If confidence > 0.9: route to "execute"
- If 0.5 <= confidence <= 0.9: route to "review"
- If confidence < 0.5: route to "clarify"
Return: {"decision": "execute|review|clarify"}
""",
)
# Edges for each route
edges = [
EdgeSpec(
id="router-to-execute",
source="decision-router",
target="execute-action",
condition=EdgeCondition.CONDITIONAL,
condition_expr="decision == 'execute'",
priority=1,
),
EdgeSpec(
id="router-to-review",
source="decision-router",
target="human-review",
condition=EdgeCondition.CONDITIONAL,
condition_expr="decision == 'review'",
priority=2,
),
EdgeSpec(
id="router-to-clarify",
source="decision-router",
target="request-clarification",
condition=EdgeCondition.CONDITIONAL,
condition_expr="decision == 'clarify'",
priority=3,
),
]
```
## Error Handling Patterns
### Graceful Failure with Fallback
```python
# Primary node with error handling
nodes = [
NodeSpec(id="api-call", max_retries=3, ...),
NodeSpec(id="fallback-cache", ...),
NodeSpec(id="report-error", ...),
]
edges = [
# Success path
EdgeSpec(
id="api-success",
source="api-call",
target="process-results",
condition=EdgeCondition.ON_SUCCESS,
),
# Fallback on failure
EdgeSpec(
id="api-to-fallback",
source="api-call",
target="fallback-cache",
condition=EdgeCondition.ON_FAILURE,
priority=1,
),
# Report if fallback also fails
EdgeSpec(
id="fallback-to-error",
source="fallback-cache",
target="report-error",
condition=EdgeCondition.ON_FAILURE,
priority=1,
),
]
```
## Performance Optimization
### Parallel Node Execution
```python
# Use multiple edges from same source for parallel execution
edges = [
EdgeSpec(
id="start-to-search1",
source="start",
target="search-source-1",
condition=EdgeCondition.ALWAYS,
),
EdgeSpec(
id="start-to-search2",
source="start",
target="search-source-2",
condition=EdgeCondition.ALWAYS,
),
EdgeSpec(
id="start-to-search3",
source="start",
target="search-source-3",
condition=EdgeCondition.ALWAYS,
),
# Converge results
EdgeSpec(
id="search1-to-merge",
source="search-source-1",
target="merge-results",
),
EdgeSpec(
id="search2-to-merge",
source="search-source-2",
target="merge-results",
),
EdgeSpec(
id="search3-to-merge",
source="search-source-3",
target="merge-results",
),
]
```
## Handoff to Testing
When agent is complete, transition to testing phase:
```python
print("""
✅ Agent complete: exports/my_agent/
Next steps:
1. Switch to testing-agent skill
2. Generate and approve tests
3. Run evaluation
4. Debug any failures
Command: "Test the agent at exports/my_agent/"
""")
```
### Pre-Testing Checklist
Before handing off to testing-agent:
- [ ] Agent structure validates: `python -m agent_name validate`
- [ ] All nodes defined in nodes/__init__.py
- [ ] All edges connect valid nodes
- [ ] Entry node specified
- [ ] Agent can be imported: `from exports.agent_name import default_agent`
- [ ] README.md with usage instructions
- [ ] CLI commands work (info, validate)
## Related Skills
- **building-agents-core** - Fundamental concepts
- **building-agents-construction** - Step-by-step building
- **testing-agent** - Test and validate agents
- **agent-workflow** - Complete workflow orchestrator
---
**Remember: Agent is actively constructed, visible the whole time. No hidden state. No surprise exports. Just transparent, incremental file building.**
+399
@@ -0,0 +1,399 @@
---
name: hive-concepts
description: Core concepts for goal-driven agents - architecture, node types (event_loop, function), tool discovery, and workflow overview. Use when starting agent development or need to understand agent fundamentals.
license: Apache-2.0
metadata:
author: hive
version: "2.0"
type: foundational
part_of: hive
---
# Building Agents - Core Concepts
Foundational knowledge for building goal-driven agents as Python packages.
## Architecture: Python Services (Not JSON Configs)
Agents are built as Python packages:
```
exports/my_agent/
├── __init__.py # Package exports
├── __main__.py # CLI (run, info, validate, shell)
├── agent.py # Graph construction (goal, edges, agent class)
├── nodes/__init__.py # Node definitions (NodeSpec)
├── config.py # Runtime config
└── README.md # Documentation
```
**Key Principle: Agent is visible and editable during build**
- Files created immediately as components are approved
- User can watch files grow in their editor
- No session state - just direct file writes
- No "export" step - agent is ready when build completes
## Core Concepts
### Goal
Success criteria and constraints (written to agent.py)
```python
goal = Goal(
id="research-goal",
name="Technical Research Agent",
description="Research technical topics thoroughly",
success_criteria=[
SuccessCriterion(
id="completeness",
description="Cover all aspects of topic",
metric="coverage_score",
target=">=0.9",
weight=0.4,
),
# 3-5 success criteria total
],
constraints=[
Constraint(
id="accuracy",
description="All information must be verified",
constraint_type="hard",
category="quality",
),
# 1-5 constraints total
],
)
```
### Node
Unit of work (written to nodes/__init__.py)
**Node Types:**
- `event_loop` — Multi-turn streaming loop with tool execution and judge-based evaluation. Works with or without tools.
- `function` — Deterministic Python operations. No LLM involved.
```python
search_node = NodeSpec(
id="search-web",
name="Search Web",
description="Search for information and extract results",
node_type="event_loop",
input_keys=["query"],
output_keys=["search_results"],
system_prompt="Search the web for: {query}. Use the web_search tool to find results, then call set_output to store them.",
tools=["web_search"],
)
```
**NodeSpec Fields for Event Loop Nodes** (a combined example follows the table):
| Field | Default | Description |
|-------|---------|-------------|
| `client_facing` | `False` | If True, streams output to user and blocks for input between turns |
| `nullable_output_keys` | `[]` | Output keys that may remain unset (for mutually exclusive outputs) |
| `max_node_visits` | `1` | Max times this node executes per run. Set >1 for feedback loop targets |
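For instance, a review node that streams to the user, produces mutually exclusive outputs, and is the target of a feedback loop might combine all three fields (values illustrative):
```python
review_node = NodeSpec(
    id="review",
    name="Review",
    description="Present findings; collect approval or redo feedback",
    node_type="event_loop",
    client_facing=True,  # streams to the user, blocks for their input
    input_keys=["draft"],
    output_keys=["approved_contacts", "redo_extraction"],
    nullable_output_keys=["approved_contacts", "redo_extraction"],  # exactly one gets set
    max_node_visits=3,   # feedback edges may route back here
    system_prompt="Present the draft to the user, then call set_output with their decision.",
)
```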
### Edge
Connection between nodes (written to agent.py)
**Edge Conditions:**
- `on_success` — Proceed if node succeeds (most common)
- `on_failure` — Handle errors
- `always` — Always proceed
- `conditional` — Based on expression evaluating node output
**Edge Priority:**
Priority controls evaluation order when multiple edges leave the same node. Higher priority edges are evaluated first. Use negative priority for feedback edges (edges that loop back to earlier nodes).
```python
# Forward edge (evaluated first)
EdgeSpec(
id="review-to-campaign",
source="review",
target="campaign-builder",
condition=EdgeCondition.CONDITIONAL,
condition_expr="output.get('approved_contacts') is not None",
priority=1,
)
# Feedback edge (evaluated after forward edges)
EdgeSpec(
id="review-feedback",
source="review",
target="extractor",
condition=EdgeCondition.CONDITIONAL,
condition_expr="output.get('redo_extraction') is not None",
priority=-1,
)
```
### Client-Facing Nodes
For multi-turn conversations with the user, set `client_facing=True` on a node. The node will:
- Stream its LLM output directly to the end user
- Block for user input between conversational turns
- Resume when new input is injected via `inject_event()`
```python
intake_node = NodeSpec(
id="intake",
name="Intake",
description="Gather requirements from the user",
node_type="event_loop",
client_facing=True,
input_keys=[],
output_keys=["repo_url", "project_url"],
system_prompt="You are the intake agent. Ask the user for the repo URL and project URL.",
)
```
> **Legacy Note:** The old `pause_nodes` / `entry_points` pattern still works but `client_facing=True` is preferred for new agents.
**STEP 1 / STEP 2 Prompt Pattern:** For client-facing nodes, structure the system prompt with two explicit phases:
```python
system_prompt="""\
**STEP 1 — Respond to the user (text only, NO tool calls):**
[Present information, ask questions, etc.]
**STEP 2 — After the user responds, call set_output:**
[Call set_output with the structured outputs]
"""
```
This prevents the LLM from calling `set_output` prematurely before the user has had a chance to respond.
### Node Design: Fewer, Richer Nodes
Prefer fewer nodes that do more work over many thin single-purpose nodes:
- **Bad**: 8 thin nodes (parse query → search → fetch → evaluate → synthesize → write → check → save)
- **Good**: 4 rich nodes (intake → research → review → report)
Why: Each node boundary requires serializing outputs and passing context. Fewer nodes means the LLM retains full context of its work within the node. A research node that searches, fetches, and analyzes keeps all the source material in its conversation history.
### nullable_output_keys for Cross-Edge Inputs
When a node receives inputs that only arrive on certain edges (e.g., `feedback` only comes from a review → research feedback loop, not from intake → research), mark those keys as `nullable_output_keys`:
```python
research_node = NodeSpec(
id="research",
input_keys=["research_brief", "feedback"],
nullable_output_keys=["feedback"], # Not present on first visit
max_node_visits=3,
...
)
```
## Event Loop Architecture Concepts
### How EventLoopNode Works
An event loop node runs a multi-turn loop (a runnable sketch follows this list):
1. LLM receives system prompt + conversation history
2. LLM responds (text and/or tool calls)
3. Tool calls are executed, results added to conversation
4. Judge evaluates: ACCEPT (exit loop), RETRY (loop again), or ESCALATE
5. Repeat until judge ACCEPTs or max_iterations reached
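A runnable sketch of this control flow, with the LLM, tools, and judge stubbed out (all names here are illustrative, not the framework's API):
```python
# Illustrative sketch of the event-loop control flow - not framework code.
import asyncio
async def fake_llm(history):
    # Stand-in for the provider: finishes immediately with no tool calls.
    return {"text": "done", "tool_calls": []}
def implicit_judge(turn, outputs, required_keys):
    # Mirrors the default judge (see JudgeProtocol below): ACCEPT when the
    # LLM finishes with no tool calls and all required output keys are set.
    if not turn["tool_calls"] and required_keys <= outputs.keys():
        return "ACCEPT"
    return "RETRY"
async def run_event_loop(system_prompt, required_keys, max_iterations=50):
    history = [{"role": "system", "content": system_prompt}]
    outputs = dict.fromkeys(required_keys, "stub")  # normally filled via set_output
    for _ in range(max_iterations):
        turn = await fake_llm(history)                           # steps 1-2
        history.append({"role": "assistant", **turn})
        for call in turn["tool_calls"]:                          # step 3
            history.append({"role": "tool", "content": f"result of {call}"})
        verdict = implicit_judge(turn, outputs, required_keys)   # step 4
        if verdict == "ACCEPT":
            return outputs
    raise RuntimeError("max_iterations reached")                 # step 5 exhausted
print(asyncio.run(run_event_loop("Search the web.", {"search_results"})))
```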
### EventLoopNode Runtime
EventLoopNodes are **auto-created** by `GraphExecutor` at runtime. You do NOT need to manually register them. Both `GraphExecutor` (direct) and `AgentRuntime` / `create_agent_runtime()` handle event_loop nodes automatically.
```python
# Direct execution — executor auto-creates EventLoopNodes
from framework.graph.executor import GraphExecutor
from framework.runtime.core import Runtime
runtime = Runtime(storage_path)
executor = GraphExecutor(
runtime=runtime,
llm=llm,
tools=tools,
tool_executor=tool_executor,
storage_path=storage_path,
)
result = await executor.execute(graph=graph, goal=goal, input_data=input_data)
# TUI execution — AgentRuntime also works
from framework.runtime.agent_runtime import create_agent_runtime
runtime = create_agent_runtime(
graph=graph, goal=goal, storage_path=storage_path,
entry_points=[...], llm=llm, tools=tools, tool_executor=tool_executor,
)
```
### set_output
Nodes produce structured outputs by calling `set_output(key, value)` — a synthetic tool injected by the framework. When the LLM calls `set_output`, the value is stored in the output accumulator and made available to downstream nodes via shared memory.
`set_output` is NOT a real tool — it is excluded from `real_tool_results`. For client-facing nodes, this means a turn where the LLM only calls `set_output` (no other tools) is treated as a conversational boundary and will block for user input.
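A minimal sketch of the accumulator idea (illustrative; the framework's real storage and judge wiring differ):
```python
# Illustrative sketch: set_output fills a per-node accumulator, and the
# implicit judge only ACCEPTs once every required output key is present.
class OutputAccumulator:
    def __init__(self, required_keys: list[str]):
        self.required_keys = required_keys
        self.values: dict[str, object] = {}

    def set_output(self, key: str, value: object) -> str:
        self.values[key] = value  # stored for downstream nodes via shared memory
        return f"stored '{key}'"

    def complete(self) -> bool:
        return all(k in self.values for k in self.required_keys)

acc = OutputAccumulator(required_keys=["repo_url", "project_url"])
acc.set_output("repo_url", "https://github.com/example/repo")
assert not acc.complete()  # project_url still unset, so the loop continues
```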
### JudgeProtocol
**The judge is the SOLE mechanism for acceptance decisions.** Do not add ad-hoc framework gating, output rollback, or premature rejection logic. If the LLM calls `set_output` too early, fix it with better prompts or a custom judge — not framework-level guards.
The judge controls when a node's loop exits:
- **Implicit judge** (default, no judge configured): ACCEPTs when the LLM finishes with no tool calls and all required output keys are set
- **SchemaJudge**: Validates outputs against a Pydantic model
- **Custom judges**: Implement `evaluate(context) -> JudgeVerdict`
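A minimal custom judge sketch, assuming the verdict type carries a decision string and the context exposes the node's outputs (both are assumptions; check the framework source for the real `JudgeVerdict` shape):
```python
# Hypothetical sketch of a custom judge; JudgeVerdict and the context's
# `outputs` attribute are stand-ins for the framework's real types.
from dataclasses import dataclass

@dataclass
class JudgeVerdict:
    decision: str  # "ACCEPT" | "RETRY" | "ESCALATE"
    reason: str = ""

class MinimumSourcesJudge:
    """ACCEPT only once the node has produced enough sources."""

    def __init__(self, minimum: int = 5):
        self.minimum = minimum

    def evaluate(self, context) -> JudgeVerdict:
        sources = getattr(context, "outputs", {}).get("sources") or []
        if len(sources) >= self.minimum:
            return JudgeVerdict("ACCEPT")
        return JudgeVerdict("RETRY", f"only {len(sources)} sources, need {self.minimum}")
```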
### LoopConfig
Controls loop behavior:
- `max_iterations` (default 50) — prevents infinite loops
- `max_tool_calls_per_turn` (default 10) — limits tool calls per LLM response
- `tool_call_overflow_margin` (default 0.5) — wiggle room before discarding extra tool calls (50% means hard cutoff at 150% of limit)
- `stall_detection_threshold` (default 3) — detects repeated identical responses
- `max_history_tokens` (default 32000) — triggers conversation compaction
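These settings can be passed as a plain dict via `GraphSpec(loop_config=...)`, as the deep-research example later in this document does; the values below are illustrative:
```python
# Illustrative loop settings for a research-heavy graph.
loop_config = {
    "max_iterations": 100,           # long multi-phase runs need headroom
    "max_tool_calls_per_turn": 20,   # batches of searches and fetches
    "stall_detection_threshold": 3,  # default: three identical responses
    "max_history_tokens": 32000,     # compaction trigger unchanged
}
```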
### Data Tools (Spillover Management)
When tool results exceed the context window, the framework automatically saves them to a spillover directory and truncates with a hint. Nodes that produce or consume large data should include the data tools:
- `save_data(filename, data)` — Write data to a file in the data directory
- `load_data(filename, offset=0, limit=50)` — Read data with line-based pagination
- `list_data_files()` — List available data files
- `serve_file_to_user(filename, label="")` — Get a clickable file:// URI for the user
Note: `data_dir` is a framework-injected context parameter — the LLM never sees or passes it. `GraphExecutor.execute()` sets it per-execution via `contextvars`, so data tools and spillover always share the same session-scoped directory.
These are real MCP tools (not synthetic). Add them to nodes that handle large tool results:
```python
research_node = NodeSpec(
...
tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
)
```
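The `data_dir` note above can be pictured with a small sketch of the contextvars pattern (names here are assumptions, not the framework's actual internals):
```python
# Illustrative sketch of the data_dir contextvars pattern.
import contextvars
from pathlib import Path

data_dir_var: contextvars.ContextVar[Path] = contextvars.ContextVar("data_dir")

def save_data(filename: str, data: str) -> str:
    # The tool resolves the directory from context; the LLM never passes it.
    path = data_dir_var.get() / filename
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(data)
    return f"saved {filename}"

# The executor would set this once per execution, so data tools and
# spillover share the same session-scoped directory:
data_dir_var.set(Path("/tmp/hive-session-demo/data"))
print(save_data("findings.json", '{"status": "ok"}'))
```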
### Fan-Out / Fan-In
Multiple ON_SUCCESS edges from the same source create parallel execution. All branches run concurrently via `asyncio.gather()`. Parallel event_loop nodes must have disjoint `output_keys`.
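For example, two ON_SUCCESS edges from the same `research` node fan out into parallel branches (target node ids here are hypothetical):
```python
from framework.graph import EdgeSpec, EdgeCondition

# Both branches run concurrently after research succeeds; the two target
# nodes must write disjoint output_keys.
fan_out = [
    EdgeSpec(id="research-to-summarize", source="research",
             target="summarize", condition=EdgeCondition.ON_SUCCESS, priority=1),
    EdgeSpec(id="research-to-entities", source="research",
             target="extract-entities", condition=EdgeCondition.ON_SUCCESS, priority=1),
]
```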
### max_node_visits
Controls how many times a node can execute in one graph run. Default is 1. Set higher for nodes that are targets of feedback edges (review-reject loops). Set 0 for unlimited (guarded by max_steps).
## Tool Discovery & Validation
**CRITICAL:** Before adding a node with tools, you MUST verify the tools exist.
Tools are provided by MCP servers. Never assume a tool exists - always discover dynamically.
### Step 1: Register MCP Server (if not already done)
```python
mcp__agent-builder__add_mcp_server(
name="tools",
transport="stdio",
command="python",
args='["mcp_server.py", "--stdio"]',
cwd="../tools"
)
```
### Step 2: Discover Available Tools
```python
# List all tools from all registered servers
mcp__agent-builder__list_mcp_tools()
# Or list tools from a specific server
mcp__agent-builder__list_mcp_tools(server_name="tools")
```
### Step 3: Validate Before Adding Nodes
Before writing a node with `tools=[...]`:
1. Call `list_mcp_tools()` to get available tools
2. Check each tool in your node exists in the response
3. If a tool doesn't exist:
- **DO NOT proceed** with the node
- Inform the user: "The tool 'X' is not available. Available tools are: ..."
- Ask if they want to use an alternative or proceed without the tool
### Tool Validation Anti-Patterns
- **Never assume a tool exists** - always call `list_mcp_tools()` first
- **Never write a node with unverified tools** - validate before writing
- **Never silently drop tools** - if a tool doesn't exist, inform the user
- **Never guess tool names** - use exact names from discovery response
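The check itself is trivial; a sketch, assuming the discovery response reduces to a list of tool-name strings:
```python
# Sketch: compare a node's requested tools against discovered tool names.
def missing_tools(requested: list[str], available: list[str]) -> list[str]:
    """Return requested tools that were not discovered; empty means proceed."""
    return [tool for tool in requested if tool not in available]

available = ["web_search", "web_scrape", "save_data"]  # from list_mcp_tools()
gaps = missing_tools(["web_search", "send_email"], available)
if gaps:
    # Do NOT proceed with the node; surface the gap to the user instead.
    print(f"The tool(s) {gaps} are not available. Available tools: {available}")
```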
## Workflow Overview: Incremental File Construction
```
1. CREATE PACKAGE → mkdir + write skeletons
2. DEFINE GOAL → Write to agent.py + config.py
3. FOR EACH NODE:
- Propose design (event_loop for LLM work, function for deterministic)
- User approves
- Write to nodes/__init__.py IMMEDIATELY
- (Optional) Validate with test_node
4. CONNECT EDGES → Update agent.py
- Use priority for feedback edges (negative priority)
- (Optional) Validate with validate_graph
5. FINALIZE → Write agent class to agent.py
6. DONE - Agent ready at exports/my_agent/
```
**Files are written immediately. MCP tools are optional, for validation and testing bookkeeping.**
## When to Use This Skill
Use hive-concepts when:
- Starting a new agent project and need to understand fundamentals
- Need to understand agent architecture before building
- Want to validate tool availability before proceeding
- Learning about node types, edges, and graph execution
**Next Steps:**
- Ready to build? → Use `hive-create` skill
- Need patterns and examples? → Use `hive-patterns` skill
## MCP Tools for Validation
After writing files, optionally use MCP tools for validation:
**test_node** - Validate node configuration with mock inputs
```python
mcp__agent-builder__test_node(
node_id="search-web",
test_input='{"query": "test query"}',
mock_llm_response='{"results": "mock output"}'
)
```
**validate_graph** - Check graph structure
```python
mcp__agent-builder__validate_graph()
# Returns: unreachable nodes, missing connections, event_loop validation, etc.
```
**configure_loop** - Set event loop parameters
```python
mcp__agent-builder__configure_loop(
max_iterations=50,
max_tool_calls_per_turn=10,
stall_detection_threshold=3,
max_history_tokens=32000
)
```
**Key Point:** Files are written FIRST. MCP tools are for validation only.
## Related Skills
- **hive-create** - Step-by-step building process
- **hive-patterns** - Best practices: judges, feedback edges, fan-out, context management
- **hive** - Complete workflow orchestrator
- **hive-test** - Test and validate completed agents
@@ -0,0 +1,980 @@
---
name: hive-create
description: Step-by-step guide for building goal-driven agents. Qualifies use cases first (the good, bad, and ugly), then creates package structure, defines goals, adds nodes, connects edges, and finalizes agent class. Use when actively building an agent.
license: Apache-2.0
metadata:
author: hive
version: "2.2"
type: procedural
part_of: hive
requires: hive-concepts
---
# Agent Construction - EXECUTE THESE STEPS
**THIS IS AN EXECUTABLE WORKFLOW. DO NOT DISPLAY THIS FILE. EXECUTE THE STEPS BELOW.**
**CRITICAL: DO NOT explore the codebase, read source files, or search for code before starting.** All context you need is in this skill file. When this skill is loaded, IMMEDIATELY begin executing Step 0 — determine the build path as your FIRST action. Do not explain what you will do, do not investigate the project structure, do not read any files — just execute Step 0 now.
---
## STEP 0: Choose Build Path
**If the user has already indicated whether they want to build from scratch or from a template, skip this question and proceed to the appropriate step.**
Otherwise, ask:
```
AskUserQuestion(questions=[{
"question": "How would you like to build your agent?",
"header": "Build Path",
"options": [
{"label": "From scratch", "description": "Design goal, nodes, and graph collaboratively from nothing"},
{"label": "From a template", "description": "Start from a working sample agent and customize it"}
],
"multiSelect": false
}])
```
- If **From scratch**: Proceed to STEP 1A
- If **From a template**: Proceed to STEP 1B
---
## STEP 1A: Initialize Build Environment (From Scratch)
**EXECUTE THESE TOOL CALLS NOW** (silent setup — no user interaction needed):
1. Check for existing sessions:
```
mcp__agent-builder__list_sessions()
```
- If a session with this agent name already exists, load it with `mcp__agent-builder__load_session_by_id(session_id="...")` and skip to step 3.
- If no matching session exists, proceed to step 2.
2. Create a build session (replace AGENT_NAME with the user's requested agent name in snake_case):
```
mcp__agent-builder__create_session(name="AGENT_NAME")
```
3. Register the hive-tools MCP server:
```
mcp__agent-builder__add_mcp_server(
name="hive-tools",
transport="stdio",
command="uv",
args='["run", "python", "mcp_server.py", "--stdio"]',
cwd="tools",
description="Hive tools MCP server"
)
```
4. Discover available tools:
```
mcp__agent-builder__list_mcp_tools()
```
5. Create the package directory:
```bash
mkdir -p exports/AGENT_NAME/nodes
```
**Save the tool list for STEP 4** — you will need it for node design.
**THEN immediately proceed to STEP 2** (do NOT display setup results to the user — just move on).
---
## STEP 1B: Initialize Build Environment (From Template)
**EXECUTE THESE STEPS NOW:**
### 1B.1: Discover available templates
List the template directories and read each template's `agent.json` to get its name and description:
```bash
ls examples/templates/
```
For each directory found, read `examples/templates/TEMPLATE_DIR/agent.json` with the Read tool and extract:
- `agent.name` — the template's display name
- `agent.description` — what the template does
### 1B.2: Present templates to user
Show the user a table of available templates:
> **Available Templates:**
>
> | # | Template | Description |
> |---|----------|-------------|
> | 1 | [name from agent.json] | [description from agent.json] |
> | 2 | ... | ... |
Then ask the user to pick a template and provide a name for their new agent:
```
AskUserQuestion(questions=[{
"question": "Which template would you like to start from?",
"header": "Template",
"options": [
{"label": "[template 1 name]", "description": "[template 1 description]"},
{"label": "[template 2 name]", "description": "[template 2 description]"},
...
],
"multiSelect": false
}, {
"question": "What should the new agent be named? (snake_case)",
"header": "Agent Name",
"options": [
{"label": "Use template name", "description": "Keep the original template name as-is"},
{"label": "Custom name", "description": "I'll provide a new snake_case name"}
],
"multiSelect": false
}])
```
### 1B.3: Copy template to exports
```bash
cp -r examples/templates/TEMPLATE_DIR exports/NEW_AGENT_NAME
```
### 1B.4: Create session and register MCP (same logic as STEP 1A)
First, check for existing sessions:
```
mcp__agent-builder__list_sessions()
```
- If a session with this agent name already exists, load it with `mcp__agent-builder__load_session_by_id(session_id="...")` and skip to `list_mcp_tools`.
- If no matching session exists, create one:
```
mcp__agent-builder__create_session(name="NEW_AGENT_NAME")
```
Then register MCP and discover tools:
```
mcp__agent-builder__add_mcp_server(
name="hive-tools",
transport="stdio",
command="uv",
args='["run", "python", "mcp_server.py", "--stdio"]',
cwd="tools",
description="Hive tools MCP server"
)
```
```
mcp__agent-builder__list_mcp_tools()
```
### 1B.5: Load template into builder session
Import the entire agent definition in one call:
```
mcp__agent-builder__import_from_export(agent_json_path="exports/NEW_AGENT_NAME/agent.json")
```
This reads the agent.json and populates the builder session with the goal, all nodes, and all edges.
**THEN immediately proceed to STEP 2.**
---
## STEP 2: Qualify the Use Case with the User
**A responsible engineer doesn't jump into building. First, understand the problem and be transparent about what the framework can and cannot do.**
**If starting from a template**, the goal is already loaded in the builder session. Present the existing goal to the user using the format below and ask for approval. Skip the collaborative drafting questions — go straight to presenting and asking "Do you approve this goal, or would you like to modify it?"
**If the user has NOT already described what they want to build**, start by asking what kind of agent they have in mind:
```
AskUserQuestion(questions=[{
"question": "What kind of agent do you want to build? Select an option below, or choose 'Other' to describe your own.",
"header": "Agent type",
"options": [
{"label": "Data collection", "description": "Gathers information from the web, analyzes it, and produces a report or sends outreach (e.g. market research, news digest, email campaigns, competitive analysis)"},
{"label": "Workflow automation", "description": "Automates a multi-step business process end-to-end (e.g. lead qualification, content publishing pipeline, data entry)"},
{"label": "Personal assistant", "description": "Handles recurring tasks or monitors for events and acts on them (e.g. daily briefings, meeting prep, file organization)"}
],
"multiSelect": false
}])
```
Use the user's selection (or their custom description if they chose "Other") as context when shaping the goal below. If the user already described what they want before this step, skip the question and proceed directly.
**DO NOT propose a complete goal on your own.** Instead, collaborate with the user to define it.
### 2a: Fast Discovery (3-8 Turns)
**The core principle**: Discovery should feel like progress, not paperwork. The stakeholder should walk away feeling like you understood them faster than anyone else would have.
**Communication style**: Be concise. Say less. Mean more. Impatient stakeholders don't want a wall of text — they want to know you get it. Every sentence you say should either move the conversation forward or prove you understood something. If it does neither, cut it.
**Question Rules: Respect Their Time.** Every question must earn its place by:
1. **Preventing a costly wrong turn** — you're about to build the wrong thing
2. **Unlocking a shortcut** — their answer lets you simplify the design
3. **Surfacing a dealbreaker** — there's a constraint that changes everything
4. **Providing options** — give options with your questions where possible, but always allow the user to type something beyond them
If a question doesn't do one of these, don't ask it. Make an assumption, state it, and move on.
---
#### 2a.1: Let Them Talk, But Listen Like an Architect
When the stakeholder describes what they want, don't just hear the words — listen for the architecture underneath. While they talk, mentally construct:
- **The actors**: Who are the people/systems involved?
- **The trigger**: What kicks off the workflow?
- **The core loop**: What's the main thing that happens repeatedly?
- **The output**: What's the valuable thing produced at the end?
- **The pain**: What about today's situation is broken, slow, or missing?
You are extracting a **domain model** from natural language in real time. Most stakeholders won't give you this structure explicitly — they'll give you a story. Your job is to hear the structure inside the story.
| They say... | You're hearing... |
|-------------|-------------------|
| Nouns they repeat | Your entities |
| Verbs they emphasize | Your core operations |
| Frustrations they mention | Your design constraints |
| Workarounds they describe | What the system must replace |
| People they name | Your user types |
---
#### 2a.2: Use Domain Knowledge to Fill In the Blanks
You have broad knowledge of how systems work. Use it aggressively.
If they say "I need a research agent," you already know it probably involves: search, summarization, source tracking, and iteration. Don't ask about each — use them as your starting mental model and let their specifics override your defaults.
If they say "I need to monitor files and alert me," you know this probably involves: watch patterns, triggers, notifications, and state tracking.
**The key move**: Take your general knowledge of the domain and merge it with the specifics they've given you. The result is a draft understanding that's 60-80% right before you've asked a single question. Your questions close the remaining 20-40%.
---
#### 2a.3: Play Back a Proposed Model (Not a List of Questions)
After listening, present a **concrete picture** of what you think they need. Make it specific enough that they can spot what's wrong.
**Pattern: "Here's what I heard — tell me where I'm off"**
> "OK here's how I'm picturing this: [User type] needs to [core action]. Right now they're [current painful workflow]. What you want is [proposed solution that replaces the pain].
>
> The way I'd structure this: [key entities] connected by [key relationships], with the main flow being [trigger → steps → outcome].
>
> For the MVP, I'd focus on [the one thing that delivers the most value] and hold off on [things that can wait].
>
> Before I start — [1-2 specific questions you genuinely can't infer]."
Why this works:
- **Proves you were listening** — they don't feel like they have to repeat themselves
- **Shows competence** — you're already thinking in systems
- **Fast to correct** — "no, it's more like X" takes 10 seconds vs. answering 15 questions
- **Creates momentum** — heading toward building, not more talking
---
#### 2a.4: Ask Only What You Cannot Infer
Your questions should be **narrow, specific, and consequential**. Never ask what you could answer yourself.
**Good questions** (high-stakes, can't infer):
- "Who's the primary user — you or your end customers?"
- "Is this replacing a spreadsheet, or is there literally nothing today?"
- "Does this need to integrate with anything, or standalone?"
- "Is there existing data to migrate, or starting fresh?"
**Bad questions** (low-stakes, inferable):
- "What should happen if there's an error?" *(handle gracefully, obviously)*
- "Should it have search?" *(if there's a list, yes)*
- "How should we handle permissions?" *(follow standard patterns)*
- "What tools should I use?" *(your call, not theirs)*
---
#### Conversation Flow (3-5 Turns)
| Turn | Who | What |
|------|-----|------|
| 1 | User | Describes what they need |
| 2 | Agent | Plays back understanding as a proposed model. Asks 1-2 critical questions max. |
| 3 | User | Corrects, confirms, or adds detail |
| 4 | Agent | Adjusts model, confirms MVP scope, states assumptions, declares starting point |
| *(5)* | Agent | *(Only if Turn 3 revealed something that fundamentally changes the approach)* |
**AFTER the conversation, IMMEDIATELY proceed to 2b. DO NOT skip to building.**
---
#### Anti-Patterns
| Don't | Do Instead |
|-------|------------|
| Open with a list of questions | Open with what you understood from their request |
| "What are your requirements?" | "Here's what I think you need — am I right?" |
| Ask about every edge case | Handle with smart defaults, flag in summary |
| 10+ turn discovery conversation | 3-8 turns. Start building, iterate with real software. |
| Being lazy and not understanding what the user wants to achieve | Understand the "what" and the "why" |
| Ask for permission to start | State your plan and start |
| Wait for certainty | Start at 80% confidence, iterate the rest |
| Ask what tech/tools to use | That's your job. Decide, disclose, move on. |
---
### 2b: Capability Assessment
**After the user responds, analyze the fit.** Present this assessment honestly:
> **Framework Fit Assessment**
>
> Based on what you've described, here's my honest assessment of how well this framework fits your use case:
>
> **What Works Well (The Good):**
> - [List 2-4 things the framework handles well for this use case]
> - Examples: multi-turn conversations, human-in-the-loop review, tool orchestration, structured outputs
>
> **Limitations to Be Aware Of (The Bad):**
> - [List 2-3 limitations that apply but are workable]
> - Examples: LLM latency means not suitable for sub-second responses, context window limits for very large documents, cost per run for heavy tool usage
>
> **Potential Deal-Breakers (The Ugly):**
> - [List any significant challenges or missing capabilities — be honest]
> - Examples: no tool available for X, would require custom MCP server, framework not designed for Y
**Be specific.** Reference the actual tools discovered in Step 1. If the user needs `send_email` but it's not available, say so. If they need real-time streaming from a database, explain that's not how the framework works.
### 2c: Gap Analysis
**Identify specific gaps** between what the user wants and what you can deliver:
| Requirement | Framework Support | Gap/Workaround |
|-------------|-------------------|----------------|
| [User need] | [✅ Supported / ⚠️ Partial / ❌ Not supported] | [How to handle or why it's a problem] |
**Examples of gaps to identify:**
- Missing tools (user needs X, but only Y and Z are available)
- Scope issues (user wants to process 10,000 items, but LLM rate limits apply)
- Interaction mismatches (user wants CLI-only, but agent is designed for TUI)
- Data flow issues (user needs to persist state across runs, but sessions are isolated)
- Latency requirements (user needs instant responses, but LLM calls take seconds)
### 2d: Recommendation
**Give a clear recommendation:**
> **My Recommendation:**
>
> [One of these three:]
>
> **✅ PROCEED** — This is a good fit. The framework handles your core needs well. [List any minor caveats.]
>
> **⚠️ PROCEED WITH SCOPE ADJUSTMENT** — This can work, but we should adjust: [specific changes]. Without these adjustments, you'll hit [specific problems].
>
> **🛑 RECONSIDER** — This framework may not be the right tool for this job because [specific reasons]. Consider instead: [alternatives — simpler script, different framework, custom solution].
### 2e: Get Explicit Acknowledgment
**CALL AskUserQuestion:**
```
AskUserQuestion(questions=[{
"question": "Based on this assessment, how would you like to proceed?",
"header": "Proceed",
"options": [
{"label": "Proceed as described", "description": "I understand the limitations, let's build it"},
{"label": "Adjust scope", "description": "Let's modify the requirements to fit better"},
{"label": "More questions", "description": "I have questions about the assessment"},
{"label": "Reconsider", "description": "Maybe this isn't the right approach"}
],
"multiSelect": false
}])
```
**WAIT for user response.**
- If **Proceed**: Move to STEP 3
- If **Adjust scope**: Discuss what to change, update your notes, re-assess if needed
- If **More questions**: Answer them honestly, then ask again
- If **Reconsider**: Discuss alternatives. If they decide to proceed anyway, that's their informed choice
---
## STEP 3: Define Goal Together with User
**Now that the use case is qualified, collaborate on the goal definition.**
**START by synthesizing what you learned:**
> Based on our discussion, here's my understanding of the goal:
>
> **Core purpose:** [what you understood from 2a]
> **Success looks like:** [what you inferred]
> **Key constraints:** [what you inferred]
>
> Let me refine this with you:
>
> 1. **What should this agent accomplish?** (confirm or correct my understanding)
> 2. **How will we know it succeeded?** (what specific outcomes matter)
> 3. **Are there any hard constraints?** (things it must never do, quality bars)
**WAIT for the user to respond.** Use their input (and the agent type they selected) to draft:
- Goal ID (kebab-case)
- Goal name
- Goal description
- 3-5 success criteria (each with: id, description, metric, target, weight)
- 2-4 constraints (each with: id, description, constraint_type, category)
**PRESENT the draft goal for approval:**
> **Proposed Goal: [Name]**
>
> [Description]
>
> **Success Criteria:**
>
> 1. [criterion 1]
> 2. [criterion 2]
> ...
>
> **Constraints:**
>
> 1. [constraint 1]
> 2. [constraint 2]
> ...
**THEN call AskUserQuestion:**
```
AskUserQuestion(questions=[{
"question": "Do you approve this goal definition?",
"header": "Goal",
"options": [
{"label": "Approve", "description": "Goal looks good, proceed to workflow design"},
{"label": "Modify", "description": "I want to change something"}
],
"multiSelect": false
}])
```
**WAIT for user response.**
- If **Approve**: Call `mcp__agent-builder__set_goal(...)` with the goal details, then proceed to STEP 4
- If **Modify**: Ask what they want to change, update the draft, ask again
---
## STEP 4: Design Conceptual Nodes
**If starting from a template**, the nodes are already loaded in the builder session. Present the existing nodes using the table format below and ask for approval. Skip the design phase.
**BEFORE designing nodes**, review the available tools from Step 1. Nodes can ONLY use tools that exist.
**DESIGN the workflow** as a series of nodes. For each node, determine:
- node_id (kebab-case)
- name
- description
- node_type: `"event_loop"` (recommended for all LLM work) or `"function"` (deterministic, no LLM)
- input_keys (what data this node receives)
- output_keys (what data this node produces)
- tools (ONLY tools that exist from Step 1 — empty list if no tools needed)
- client_facing: True if this node interacts with the user
- nullable_output_keys (for mutually exclusive outputs or feedback-only inputs)
- max_node_visits (>1 if this node is a feedback loop target)
**Prefer fewer, richer nodes** (4 nodes > 8 thin nodes). Each node boundary requires serializing outputs. A research node that searches, fetches, and analyzes keeps all source material in its conversation history.
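Putting the checklist together, a feedback-loop target from the running example might be sketched as follows (field values illustrative; `system_prompt` omitted for brevity):
```python
from framework.graph import NodeSpec

research_node = NodeSpec(
    id="research",
    name="Research",
    description="Search and analyze sources",
    node_type="event_loop",
    input_keys=["research_brief", "feedback"],
    output_keys=["findings", "sources"],
    nullable_output_keys=["feedback"],  # only arrives on the loop-back edge
    max_node_visits=3,                  # feedback-loop target, allow revisits
    client_facing=False,
    tools=["web_search", "web_scrape"],
)
```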
**PRESENT the nodes to the user for review:**
> **Proposed Nodes ([N] total):**
>
> | # | Node ID | Type | Description | Tools | Client-Facing |
> | --- | ---------- | ---------- | ----------------------------- | ---------------------- | :-----------: |
> | 1 | `intake` | event_loop | Gather requirements from user | — | Yes |
> | 2 | `research` | event_loop | Search and analyze sources | web_search, web_scrape | No |
> | 3 | `review` | event_loop | Present findings for approval | — | Yes |
> | 4 | `report` | event_loop | Generate final report | save_data | No |
>
> **Data Flow:**
>
> - `intake` produces: `research_brief`
> - `research` receives: `research_brief` → produces: `findings`, `sources`
> - `review` receives: `findings`, `sources` → produces: `approved_findings` or `feedback`
> - `report` receives: `approved_findings` → produces: `final_report`
**THEN call AskUserQuestion:**
```
AskUserQuestion(questions=[{
"question": "Do you approve these nodes?",
"header": "Nodes",
"options": [
{"label": "Approve", "description": "Nodes look good, proceed to graph design"},
{"label": "Modify", "description": "I want to change the nodes"}
],
"multiSelect": false
}])
```
**WAIT for user response.**
- If **Approve**: Proceed to STEP 5
- If **Modify**: Ask what they want to change, update design, ask again
---
## STEP 5: Design Full Graph and Review
**If starting from a template**, the edges are already loaded in the builder session. Render the existing graph as ASCII art and present it to the user for approval. Skip the edge design phase.
**DETERMINE the edges** connecting the approved nodes. For each edge:
- edge_id (kebab-case)
- source → target
- condition: `on_success`, `on_failure`, `always`, or `conditional`
- condition_expr (Python expression, only if conditional)
- priority (positive = forward, negative = feedback/loop-back)
**RENDER the complete graph as ASCII art.** Make it large and clear — the user needs to see and understand the full workflow at a glance.
**IMPORTANT: Make the ASCII art BIG and READABLE.** Use a box-and-arrow style with generous spacing. Do NOT make it tiny or compressed. Example format:
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ AGENT: Research Agent │
│ │
│ Goal: Thoroughly research technical topics and produce verified reports │
└─────────────────────────────────────────────────────────────────────────────┘
┌───────────────────────┐
│ INTAKE │
│ (client-facing) │
│ │
│ in: topic │
│ out: research_brief │
└───────────┬───────────┘
│ on_success
┌───────────────────────┐
│ RESEARCH │
│ │
│ tools: web_search, │
│ web_scrape │
│ │
│ in: research_brief │
│ [feedback] │
│ out: findings, │
│ sources │
└───────────┬───────────┘
│ on_success
┌───────────────────────┐
│ REVIEW │
│ (client-facing) │
│ │
│ in: findings, │
│ sources │
│ out: approved_findings│
│ OR feedback │
└───────┬───────┬───────┘
│ │
approved │ │ feedback (priority: -1)
│ │
▼ └──────────────────┐
┌───────────────────────┐ │
│ REPORT │ │
│ │ │
│ tools: save_data │ │
│ │ │
│ in: approved_ │ │
│ findings │ │
│ out: final_report │ │
└───────────────────────┘ │
┌──────────────────────────┘
│ loops back to RESEARCH
▼ (max_node_visits: 3)
EDGES:
──────
1. intake → research [on_success, priority: 1]
2. research → review [on_success, priority: 1]
3. review → report [conditional: approved_findings is not None, priority: 1]
4. review → research [conditional: feedback is not None, priority: -1]
```
**PRESENT the graph and edges to the user:**
> Here is the complete workflow graph:
>
> [ASCII art above]
>
> **Edge Summary:**
>
> | # | Edge | Condition | Priority |
> | --- | ----------------- | -------------------------------------------- | -------- |
> | 1 | intake → research | on_success | 1 |
> | 2 | research → review | on_success | 1 |
> | 3 | review → report | conditional: `approved_findings is not None` | 1 |
> | 4 | review → research | conditional: `feedback is not None` | -1 |
**THEN call AskUserQuestion:**
```
AskUserQuestion(questions=[{
"question": "Do you approve this workflow graph?",
"header": "Graph",
"options": [
{"label": "Approve", "description": "Graph looks good, proceed to build the agent"},
{"label": "Modify", "description": "I want to change the graph"}
],
"multiSelect": false
}])
```
**WAIT for user response.**
- If **Approve**: Proceed to STEP 6
- If **Modify**: Ask what they want to change, update the graph, re-render, ask again
---
## STEP 6: Build the Agent
**NOW — and only now — write the actual code.** The user has approved the goal, nodes, and graph.
### 6a: Register nodes and edges with MCP
**If starting from a template**, the copied files will be overwritten with the approved design. You MUST replace every occurrence of the old template name with the new agent name. Here is the complete checklist — miss NONE of these:
| File | What to rename |
|------|---------------|
| `config.py` | `AgentMetadata.name` — the display name shown in TUI agent selection |
| `config.py` | `AgentMetadata.description` — agent description |
| `agent.py` | Module docstring (line 1) |
| `agent.py` | `class OldNameAgent:` → `class NewNameAgent:` |
| `agent.py` | `GraphSpec(id="old-name-graph")` → `GraphSpec(id="new-name-graph")` — shown in TUI status bar |
| `agent.py` | Storage path: `Path.home() / ".hive" / "agents" / "old_name"` → `"new_name"` |
| `__main__.py` | Module docstring (line 1) |
| `__main__.py` | `from .agent import ... OldNameAgent` → `NewNameAgent` |
| `__main__.py` | CLI help string in `def cli()` docstring |
| `__main__.py` | All `OldNameAgent()` instantiations |
| `__main__.py` | Storage path (duplicated from agent.py) |
| `__main__.py` | Shell banner string (e.g. `"=== Old Name Agent ==="`) |
| `__init__.py` | Package docstring |
| `__init__.py` | `from .agent import OldNameAgent` import |
| `__init__.py` | `__all__` list entry |
**If starting from a template and no modifications were made in Steps 2-5**, the nodes and edges are already registered. Skip to validation (`mcp__agent-builder__validate_graph()`). If modifications were made, re-register the changed nodes/edges (the MCP tools handle duplicates by overwriting).
**FOR EACH approved node**, call:
```
mcp__agent-builder__add_node(
node_id="...",
name="...",
description="...",
node_type="event_loop",
input_keys='["key1", "key2"]',
output_keys='["key1"]',
tools='["tool1"]',
system_prompt="...",
client_facing=True/False,
nullable_output_keys='["key"]',
max_node_visits=1
)
```
**FOR EACH approved edge**, call:
```
mcp__agent-builder__add_edge(
edge_id="source-to-target",
source="source-node-id",
target="target-node-id",
condition="on_success",
condition_expr="",
priority=1
)
```
**VALIDATE the graph:**
```
mcp__agent-builder__validate_graph()
```
- If invalid: Fix the issues and re-validate
- If valid: Continue to 6b
### 6b: Write Python package files
**EXPORT the graph data:**
```
mcp__agent-builder__export_graph()
```
**THEN write the Python package files** using the exported data. Create these files in `exports/AGENT_NAME/`:
1. `config.py` - Runtime configuration with model settings
2. `nodes/__init__.py` - All NodeSpec definitions
3. `agent.py` - Goal, edges, graph config, and agent class
4. `__init__.py` - Package exports
5. `__main__.py` - CLI interface
6. `mcp_servers.json` - MCP server configurations
7. `README.md` - Usage documentation
**IMPORTANT entry_points format:**
- MUST be: `{"start": "first-node-id"}`
- NOT: `{"first-node-id": ["input_keys"]}` (WRONG)
- NOT: `{"first-node-id"}` (WRONG - this is a set)
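In the generated `agent.py` this is just a module-level dict (mirroring the deep-research example later in this document):
```python
entry_node = "intake"
entry_points = {"start": "intake"}  # entry-point id -> first node id
```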
**IMPORTANT mcp_servers.json format:**
```json
{
"hive-tools": {
"transport": "stdio",
"command": "uv",
"args": ["run", "python", "mcp_server.py", "--stdio"],
"cwd": "../../tools",
"description": "Hive tools MCP server"
}
}
```
- NO `"mcpServers"` wrapper (that's Claude Desktop format, NOT hive format)
- `cwd` MUST be `"../../tools"` (relative from `exports/AGENT_NAME/` to `tools/`)
- `command` MUST be `"uv"` with `"args": ["run", "python", ...]` (NOT bare `"python"` which fails on Mac)
**Use the example agent** at `.claude/skills/hive-create/examples/deep_research_agent/` as a template for file structure and patterns. It demonstrates: STEP 1/STEP 2 prompts, client-facing nodes, feedback loops, nullable_output_keys, and data tools.
**AFTER writing all files, tell the user:**
> Agent package created: `exports/AGENT_NAME/`
>
> **Files generated:**
>
> - `__init__.py` - Package exports
> - `agent.py` - Goal, nodes, edges, agent class
> - `config.py` - Runtime configuration
> - `__main__.py` - CLI interface
> - `nodes/__init__.py` - Node definitions
> - `mcp_servers.json` - MCP server config
> - `README.md` - Usage documentation
---
## STEP 7: Verify and Test
**RUN validation:**
```bash
cd /path/to/hive && PYTHONPATH=exports uv run python -m AGENT_NAME validate
```
- If valid: Agent is complete!
- If errors: Fix the issues and re-run
**TELL the user the agent is ready** and display the next steps box:
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ ✅ AGENT BUILD COMPLETE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ NEXT STEPS: │
│ │
│ 1. SET UP CREDENTIALS (if agent uses tools like web_search, send_email): │
│ │
│ /hive-credentials --agent AGENT_NAME │
│ │
│ 2. RUN YOUR AGENT: │
│ │
│ hive tui │
│ │
│ Then select your agent from the list and press Enter. │
│ │
│ 3. DEBUG ANY ISSUES: │
│ │
│ /hive-debugger │
│ │
│ The debugger monitors runtime logs, identifies retry loops, │
│ tool failures, and missing outputs, and provides fix recommendations. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## REFERENCE: Node Types
| Type | tools param | Use when |
| ------------ | ----------------------- | --------------------------------------- |
| `event_loop` | `'["tool1"]'` or `'[]'` | LLM-powered work with or without tools |
| `function` | N/A | Deterministic Python operations, no LLM |
---
## REFERENCE: NodeSpec Fields
| Field | Default | Description |
| ---------------------- | ------- | --------------------------------------------------------------------- |
| `client_facing` | `False` | Streams output to user, blocks for input between turns |
| `nullable_output_keys` | `[]` | Output keys that may remain unset (mutually exclusive outputs) |
| `max_node_visits` | `1` | Max executions per run. Set >1 for feedback loop targets. 0=unlimited |
---
## REFERENCE: Edge Conditions & Priority
| Condition | When edge is followed |
| ------------- | ------------------------------------- |
| `on_success` | Source node completed successfully |
| `on_failure` | Source node failed |
| `always` | Always, regardless of success/failure |
| `conditional` | When condition_expr evaluates to True |
**Priority:** Positive = forward edge (evaluated first). Negative = feedback edge (loops back to earlier node). Multiple ON_SUCCESS edges from same source = parallel execution (fan-out).
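Expressed with the `EdgeSpec` API, the review node's two outgoing edges from the STEP 5 example look like this:
```python
from framework.graph import EdgeSpec, EdgeCondition

review_edges = [
    EdgeSpec(id="review-to-report", source="review", target="report",
             condition=EdgeCondition.CONDITIONAL,
             condition_expr="approved_findings is not None",
             priority=1),   # forward edge, evaluated first
    EdgeSpec(id="review-to-research", source="review", target="research",
             condition=EdgeCondition.CONDITIONAL,
             condition_expr="feedback is not None",
             priority=-1),  # feedback edge, loops back to research
]
```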
---
## REFERENCE: System Prompt Best Practice
For **internal** event_loop nodes (not client-facing), instruct the LLM to use `set_output`:
```
Use set_output(key, value) to store your results. For example:
- set_output("search_results", <your results as a JSON string>)
Do NOT return raw JSON. Use the set_output tool to produce outputs.
```
For **client-facing** event_loop nodes, use the STEP 1/STEP 2 pattern:
```
**STEP 1 — Respond to the user (text only, NO tool calls):**
[Present information, ask questions, etc.]
**STEP 2 — After the user responds, call set_output:**
- set_output("key", "value based on user's response")
```
This prevents the LLM from calling `set_output` before the user has had a chance to respond. The "NO tool calls" instruction in STEP 1 ensures the node blocks for user input before proceeding.
---
## EventLoopNode Runtime
EventLoopNodes are **auto-created** by `GraphExecutor` at runtime. Both direct `GraphExecutor` and `AgentRuntime` / `create_agent_runtime()` handle event_loop nodes automatically. No manual `node_registry` setup is needed.
```python
# Direct execution
from framework.graph.executor import GraphExecutor
from framework.runtime.core import Runtime
storage_path = Path.home() / ".hive" / "agents" / "my_agent"
storage_path.mkdir(parents=True, exist_ok=True)
runtime = Runtime(storage_path)
executor = GraphExecutor(
runtime=runtime,
llm=llm,
tools=tools,
tool_executor=tool_executor,
storage_path=storage_path,
)
result = await executor.execute(graph=graph, goal=goal, input_data=input_data)
```
**DO NOT pass `runtime=None` to `GraphExecutor`** — it will crash with `'NoneType' object has no attribute 'start_run'`.
---
## REFERENCE: Framework Capabilities for Qualification
Use this reference during STEP 2 to give accurate, honest assessments.
### What the Framework Does Well (The Good)
| Capability | Description |
|------------|-------------|
| Multi-turn conversations | Client-facing nodes stream to users and block for input |
| Human-in-the-loop review | Approval checkpoints with feedback loops back to earlier nodes |
| Tool orchestration | LLM can call multiple tools, framework handles execution |
| Structured outputs | `set_output` produces validated, typed outputs |
| Parallel execution | Fan-out/fan-in for concurrent node execution |
| Context management | Automatic compaction and spillover for large data |
| Error recovery | Retry logic, judges, and feedback edges for self-correction |
| Session persistence | State saved to disk, resumable sessions |
### Framework Limitations (The Bad)
| Limitation | Impact | Workaround |
|------------|--------|------------|
| LLM latency | 2-10+ seconds per turn | Not suitable for real-time/low-latency needs |
| Context window limits | ~128K tokens max | Use data tools for spillover, design for chunking |
| Cost per run | LLM API calls cost money | Budget planning, caching where possible |
| Rate limits | API throttling on heavy usage | Backoff, queue management |
| Node boundaries lose context | Outputs must be serialized | Prefer fewer, richer nodes |
| Single-threaded within node | One LLM call at a time per node | Use fan-out for parallelism |
### Not Designed For (The Ugly)
| Use Case | Why It's Problematic | Alternative |
|----------|---------------------|-------------|
| Long-running daemons | Framework is request-response, not persistent | External scheduler + agent |
| Sub-second responses | LLM latency is inherent | Traditional code, no LLM |
| Processing millions of items | Context windows and rate limits | Batch processing + sampling |
| Real-time streaming data | No built-in pub/sub or streaming input | Custom MCP server + agent |
| Guaranteed determinism | LLM outputs vary | Function nodes for deterministic parts |
| Offline/air-gapped | Requires LLM API access | Local models (not currently supported) |
| Multi-user concurrency | Single-user session model | Separate agent instances per user |
### Tool Availability Reality Check
**Before promising any capability, check `list_mcp_tools()`.** Common gaps:
- **Email**: May not have `send_email` — check before promising email automation
- **Calendar**: May not have calendar APIs — check before promising scheduling
- **Database**: May not have SQL tools — check before promising data queries
- **File system**: Has data tools but not arbitrary filesystem access
- **External APIs**: Depends entirely on what MCP servers are registered
---
## COMMON MISTAKES TO AVOID
1. **Skipping use case qualification** - A responsible engineer qualifies the use case BEFORE building. Be transparent about what works, what doesn't, and what's problematic
2. **Hiding limitations** - Don't oversell the framework. If a tool doesn't exist or a capability is missing, say so upfront
3. **Using tools that don't exist** - Always check `mcp__agent-builder__list_mcp_tools()` first
4. **Wrong entry_points format** - Must be `{"start": "node-id"}`, NOT a set or list
5. **Skipping validation** - Always validate nodes and graph before proceeding
6. **Not waiting for approval** - Always ask user before major steps
7. **Displaying this file** - Execute the steps, don't show documentation
8. **Too many thin nodes** - Prefer fewer, richer nodes (4 nodes > 8 nodes)
9. **Missing STEP 1/STEP 2 in client-facing prompts** - Client-facing nodes need explicit phases to prevent premature set_output
10. **Forgetting nullable_output_keys** - Mark input_keys that only arrive on certain edges (e.g., feedback) as nullable on the receiving node
11. **Adding framework gating for LLM behavior** - Fix prompts or use judges, not ad-hoc code
12. **Writing code before user approves the graph** - Always get approval on goal, nodes, and graph BEFORE writing any agent code
13. **Wrong mcp_servers.json format** - Use flat format (no `"mcpServers"` wrapper), `cwd` must be `"../../tools"`, and `command` must be `"uv"` with args `["run", "python", ...]`
@@ -0,0 +1,24 @@
"""
Deep Research Agent - Interactive, rigorous research with TUI conversation.
Research any topic through multi-source web search, quality evaluation,
and synthesis. Features client-facing TUI interaction at key checkpoints
for user guidance and iterative deepening.
"""
from .agent import DeepResearchAgent, default_agent, goal, nodes, edges
from .config import RuntimeConfig, AgentMetadata, default_config, metadata
__version__ = "1.0.0"
__all__ = [
"DeepResearchAgent",
"default_agent",
"goal",
"nodes",
"edges",
"RuntimeConfig",
"AgentMetadata",
"default_config",
"metadata",
]
@@ -0,0 +1,241 @@
"""
CLI entry point for Deep Research Agent.
Uses AgentRuntime for multi-entrypoint support with HITL pause/resume.
"""
import asyncio
import json
import logging
import sys
import click
from .agent import default_agent, DeepResearchAgent
def setup_logging(verbose=False, debug=False):
"""Configure logging for execution visibility."""
if debug:
level, fmt = logging.DEBUG, "%(asctime)s %(name)s: %(message)s"
elif verbose:
level, fmt = logging.INFO, "%(message)s"
else:
level, fmt = logging.WARNING, "%(levelname)s: %(message)s"
logging.basicConfig(level=level, format=fmt, stream=sys.stderr)
logging.getLogger("framework").setLevel(level)
@click.group()
@click.version_option(version="1.0.0")
def cli():
"""Deep Research Agent - Interactive, rigorous research with TUI conversation."""
pass
@cli.command()
@click.option("--topic", "-t", type=str, required=True, help="Research topic")
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--quiet", "-q", is_flag=True, help="Only output result JSON")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def run(topic, mock, quiet, verbose, debug):
"""Execute research on a topic."""
if not quiet:
setup_logging(verbose=verbose, debug=debug)
context = {"topic": topic}
result = asyncio.run(default_agent.run(context, mock_mode=mock))
output_data = {
"success": result.success,
"steps_executed": result.steps_executed,
"output": result.output,
}
if result.error:
output_data["error"] = result.error
click.echo(json.dumps(output_data, indent=2, default=str))
sys.exit(0 if result.success else 1)
@cli.command()
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def tui(mock, verbose, debug):
"""Launch the TUI dashboard for interactive research."""
setup_logging(verbose=verbose, debug=debug)
try:
from framework.tui.app import AdenTUI
except ImportError:
click.echo(
"TUI requires the 'textual' package. Install with: pip install textual"
)
sys.exit(1)
from pathlib import Path
from framework.llm import LiteLLMProvider
from framework.runner.tool_registry import ToolRegistry
from framework.runtime.agent_runtime import create_agent_runtime
from framework.runtime.event_bus import EventBus
from framework.runtime.execution_stream import EntryPointSpec
async def run_with_tui():
agent = DeepResearchAgent()
# Build graph and tools
agent._event_bus = EventBus()
agent._tool_registry = ToolRegistry()
storage_path = Path.home() / ".hive" / "agents" / "deep_research_agent"
storage_path.mkdir(parents=True, exist_ok=True)
mcp_config_path = Path(__file__).parent / "mcp_servers.json"
if mcp_config_path.exists():
agent._tool_registry.load_mcp_config(mcp_config_path)
llm = None
if not mock:
llm = LiteLLMProvider(
model=agent.config.model,
api_key=agent.config.api_key,
api_base=agent.config.api_base,
)
tools = list(agent._tool_registry.get_tools().values())
tool_executor = agent._tool_registry.get_executor()
graph = agent._build_graph()
runtime = create_agent_runtime(
graph=graph,
goal=agent.goal,
storage_path=storage_path,
entry_points=[
EntryPointSpec(
id="start",
name="Start Research",
entry_node="intake",
trigger_type="manual",
isolation_level="isolated",
),
],
llm=llm,
tools=tools,
tool_executor=tool_executor,
)
await runtime.start()
try:
app = AdenTUI(runtime)
await app.run_async()
finally:
await runtime.stop()
asyncio.run(run_with_tui())
@cli.command()
@click.option("--json", "output_json", is_flag=True)
def info(output_json):
"""Show agent information."""
info_data = default_agent.info()
if output_json:
click.echo(json.dumps(info_data, indent=2))
else:
click.echo(f"Agent: {info_data['name']}")
click.echo(f"Version: {info_data['version']}")
click.echo(f"Description: {info_data['description']}")
click.echo(f"\nNodes: {', '.join(info_data['nodes'])}")
click.echo(f"Client-facing: {', '.join(info_data['client_facing_nodes'])}")
click.echo(f"Entry: {info_data['entry_node']}")
click.echo(f"Terminal: {', '.join(info_data['terminal_nodes'])}")
@cli.command()
def validate():
"""Validate agent structure."""
validation = default_agent.validate()
if validation["valid"]:
click.echo("Agent is valid")
if validation["warnings"]:
for warning in validation["warnings"]:
click.echo(f" WARNING: {warning}")
else:
click.echo("Agent has errors:")
for error in validation["errors"]:
click.echo(f" ERROR: {error}")
sys.exit(0 if validation["valid"] else 1)
@cli.command()
@click.option("--verbose", "-v", is_flag=True)
def shell(verbose):
"""Interactive research session (CLI, no TUI)."""
asyncio.run(_interactive_shell(verbose))
async def _interactive_shell(verbose=False):
"""Async interactive shell."""
setup_logging(verbose=verbose)
click.echo("=== Deep Research Agent ===")
click.echo("Enter a topic to research (or 'quit' to exit):\n")
agent = DeepResearchAgent()
await agent.start()
try:
while True:
try:
topic = await asyncio.get_event_loop().run_in_executor(
None, input, "Topic> "
)
if topic.lower() in ["quit", "exit", "q"]:
click.echo("Goodbye!")
break
if not topic.strip():
continue
click.echo("\nResearching...\n")
result = await agent.trigger_and_wait("start", {"topic": topic})
if result is None:
click.echo("\n[Execution timed out]\n")
continue
if result.success:
output = result.output
if "report_content" in output:
click.echo("\n--- Report ---\n")
click.echo(output["report_content"])
click.echo("\n")
if "references" in output:
click.echo("--- References ---\n")
for ref in output.get("references", []):
click.echo(
f" [{ref.get('number', '?')}] {ref.get('title', '')} - {ref.get('url', '')}"
)
click.echo("\n")
else:
click.echo(f"\nResearch failed: {result.error}\n")
except KeyboardInterrupt:
click.echo("\nGoodbye!")
break
except Exception as e:
click.echo(f"Error: {e}", err=True)
import traceback
traceback.print_exc()
finally:
await agent.stop()
if __name__ == "__main__":
cli()
@@ -0,0 +1,311 @@
"""Agent graph construction for Deep Research Agent."""
from framework.graph import EdgeSpec, EdgeCondition, Goal, SuccessCriterion, Constraint
from framework.graph.edge import GraphSpec
from framework.graph.executor import ExecutionResult, GraphExecutor
from framework.runtime.event_bus import EventBus
from framework.runtime.core import Runtime
from framework.llm import LiteLLMProvider
from framework.runner.tool_registry import ToolRegistry
from .config import default_config, metadata
from .nodes import (
intake_node,
research_node,
review_node,
report_node,
)
# Goal definition
goal = Goal(
id="rigorous-interactive-research",
name="Rigorous Interactive Research",
description=(
"Research any topic by searching diverse sources, analyzing findings, "
"and producing a cited report — with user checkpoints to guide direction."
),
success_criteria=[
SuccessCriterion(
id="source-diversity",
description="Use multiple diverse, authoritative sources",
metric="source_count",
target=">=5",
weight=0.25,
),
SuccessCriterion(
id="citation-coverage",
description="Every factual claim in the report cites its source",
metric="citation_coverage",
target="100%",
weight=0.25,
),
SuccessCriterion(
id="user-satisfaction",
description="User reviews findings before report generation",
metric="user_approval",
target="true",
weight=0.25,
),
SuccessCriterion(
id="report-completeness",
description="Final report answers the original research questions",
metric="question_coverage",
target="90%",
weight=0.25,
),
],
constraints=[
Constraint(
id="no-hallucination",
description="Only include information found in fetched sources",
constraint_type="quality",
category="accuracy",
),
Constraint(
id="source-attribution",
description="Every claim must cite its source with a numbered reference",
constraint_type="quality",
category="accuracy",
),
Constraint(
id="user-checkpoint",
description="Present findings to the user before writing the final report",
constraint_type="functional",
category="interaction",
),
],
)
# Node list
nodes = [
intake_node,
research_node,
review_node,
report_node,
]
# Edge definitions
edges = [
# intake -> research
EdgeSpec(
id="intake-to-research",
source="intake",
target="research",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
# research -> review
EdgeSpec(
id="research-to-review",
source="research",
target="review",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
# review -> research (feedback loop)
EdgeSpec(
id="review-to-research-feedback",
source="review",
target="research",
condition=EdgeCondition.CONDITIONAL,
condition_expr="needs_more_research == True",
priority=1,
),
# review -> report (user satisfied)
EdgeSpec(
id="review-to-report",
source="review",
target="report",
condition=EdgeCondition.CONDITIONAL,
condition_expr="needs_more_research == False",
priority=2,
),
]
# Graph configuration
entry_node = "intake"
entry_points = {"start": "intake"}
pause_nodes = []
terminal_nodes = ["report"]
class DeepResearchAgent:
"""
Deep Research Agent: 4-node pipeline with user checkpoints.
Flow: intake -> research -> review -> report
                   ^           |
                   +-----------+  feedback loop (if user wants more)
"""
def __init__(self, config=None):
self.config = config or default_config
self.goal = goal
self.nodes = nodes
self.edges = edges
self.entry_node = entry_node
self.entry_points = entry_points
self.pause_nodes = pause_nodes
self.terminal_nodes = terminal_nodes
self._executor: GraphExecutor | None = None
self._graph: GraphSpec | None = None
self._event_bus: EventBus | None = None
self._tool_registry: ToolRegistry | None = None
def _build_graph(self) -> GraphSpec:
"""Build the GraphSpec."""
return GraphSpec(
id="deep-research-agent-graph",
goal_id=self.goal.id,
version="1.0.0",
entry_node=self.entry_node,
entry_points=self.entry_points,
terminal_nodes=self.terminal_nodes,
pause_nodes=self.pause_nodes,
nodes=self.nodes,
edges=self.edges,
default_model=self.config.model,
max_tokens=self.config.max_tokens,
loop_config={
"max_iterations": 100,
"max_tool_calls_per_turn": 20,
"max_history_tokens": 32000,
},
)
def _setup(self, mock_mode=False) -> GraphExecutor:
"""Set up the executor with all components."""
from pathlib import Path
storage_path = Path.home() / ".hive" / "agents" / "deep_research_agent"
storage_path.mkdir(parents=True, exist_ok=True)
self._event_bus = EventBus()
self._tool_registry = ToolRegistry()
mcp_config_path = Path(__file__).parent / "mcp_servers.json"
if mcp_config_path.exists():
self._tool_registry.load_mcp_config(mcp_config_path)
llm = None
if not mock_mode:
llm = LiteLLMProvider(
model=self.config.model,
api_key=self.config.api_key,
api_base=self.config.api_base,
)
tool_executor = self._tool_registry.get_executor()
tools = list(self._tool_registry.get_tools().values())
self._graph = self._build_graph()
runtime = Runtime(storage_path)
self._executor = GraphExecutor(
runtime=runtime,
llm=llm,
tools=tools,
tool_executor=tool_executor,
event_bus=self._event_bus,
storage_path=storage_path,
loop_config=self._graph.loop_config,
)
return self._executor
async def start(self, mock_mode=False) -> None:
"""Set up the agent (initialize executor and tools)."""
if self._executor is None:
self._setup(mock_mode=mock_mode)
async def stop(self) -> None:
"""Clean up resources."""
self._executor = None
self._event_bus = None
async def trigger_and_wait(
self,
entry_point: str,
input_data: dict,
timeout: float | None = None,
session_state: dict | None = None,
) -> ExecutionResult | None:
"""Execute the graph and wait for completion."""
if self._executor is None:
raise RuntimeError("Agent not started. Call start() first.")
if self._graph is None:
raise RuntimeError("Graph not built. Call start() first.")
return await self._executor.execute(
graph=self._graph,
goal=self.goal,
input_data=input_data,
session_state=session_state,
)
async def run(
self, context: dict, mock_mode=False, session_state=None
) -> ExecutionResult:
"""Run the agent (convenience method for single execution)."""
await self.start(mock_mode=mock_mode)
try:
result = await self.trigger_and_wait(
"start", context, session_state=session_state
)
return result or ExecutionResult(success=False, error="Execution timeout")
finally:
await self.stop()
def info(self):
"""Get agent information."""
return {
"name": metadata.name,
"version": metadata.version,
"description": metadata.description,
"goal": {
"name": self.goal.name,
"description": self.goal.description,
},
"nodes": [n.id for n in self.nodes],
"edges": [e.id for e in self.edges],
"entry_node": self.entry_node,
"entry_points": self.entry_points,
"pause_nodes": self.pause_nodes,
"terminal_nodes": self.terminal_nodes,
"client_facing_nodes": [n.id for n in self.nodes if n.client_facing],
}
def validate(self):
"""Validate agent structure."""
errors = []
warnings = []
node_ids = {node.id for node in self.nodes}
for edge in self.edges:
if edge.source not in node_ids:
errors.append(f"Edge {edge.id}: source '{edge.source}' not found")
if edge.target not in node_ids:
errors.append(f"Edge {edge.id}: target '{edge.target}' not found")
if self.entry_node not in node_ids:
errors.append(f"Entry node '{self.entry_node}' not found")
for terminal in self.terminal_nodes:
if terminal not in node_ids:
errors.append(f"Terminal node '{terminal}' not found")
for ep_id, node_id in self.entry_points.items():
if node_id not in node_ids:
errors.append(
f"Entry point '{ep_id}' references unknown node '{node_id}'"
)
return {
"valid": len(errors) == 0,
"errors": errors,
"warnings": warnings,
}
# Create default instance
default_agent = DeepResearchAgent()
@@ -0,0 +1,21 @@
"""Runtime configuration."""
from dataclasses import dataclass
from framework.config import RuntimeConfig
default_config = RuntimeConfig()
@dataclass
class AgentMetadata:
name: str = "Deep Research Agent"
version: str = "1.0.0"
description: str = (
"Interactive research agent that rigorously investigates topics through "
"multi-source search, quality evaluation, and synthesis - with TUI conversation "
"at key checkpoints for user guidance and feedback."
)
metadata = AgentMetadata()
@@ -0,0 +1,9 @@
{
"hive-tools": {
"transport": "stdio",
"command": "uv",
"args": ["run", "python", "mcp_server.py", "--stdio"],
"cwd": "../../tools",
"description": "Hive tools MCP server providing web_search, web_scrape, and write_to_file"
}
}
@@ -0,0 +1,162 @@
"""Node definitions for Deep Research Agent."""
from framework.graph import NodeSpec
# Node 1: Intake (client-facing)
# Brief conversation to clarify what the user wants researched.
intake_node = NodeSpec(
id="intake",
name="Research Intake",
description="Discuss the research topic with the user, clarify scope, and confirm direction",
node_type="event_loop",
client_facing=True,
input_keys=["topic"],
output_keys=["research_brief"],
system_prompt="""\
You are a research intake specialist. The user wants to research a topic.
Have a brief conversation to clarify what they need.
**STEP 1 — Read and respond (text only, NO tool calls):**
1. Read the topic provided
2. If it's vague, ask 1-2 clarifying questions (scope, angle, depth)
3. If it's already clear, confirm your understanding and ask the user to confirm
Keep it short. Don't over-ask.
**STEP 2 — After the user confirms, call set_output:**
- set_output("research_brief", "A clear paragraph describing exactly what to research, \
what questions to answer, what scope to cover, and how deep to go.")
""",
tools=[],
)
# Node 2: Research
# The workhorse — searches the web, fetches content, analyzes sources.
# One node with both tools avoids the context-passing overhead of 5 separate nodes.
research_node = NodeSpec(
id="research",
name="Research",
description="Search the web, fetch source content, and compile findings",
node_type="event_loop",
max_node_visits=3,
input_keys=["research_brief", "feedback"],
output_keys=["findings", "sources", "gaps"],
nullable_output_keys=["feedback"],
system_prompt="""\
You are a research agent. Given a research brief, find and analyze sources.
If feedback is provided, this is a follow-up round; focus on the gaps identified.
Work in phases:
1. **Search**: Use web_search with 3-5 diverse queries covering different angles.
Prioritize authoritative sources (.edu, .gov, established publications).
2. **Fetch**: Use web_scrape on the most promising URLs (aim for 5-8 sources).
Skip URLs that fail. Extract the substantive content.
3. **Analyze**: Review what you've collected. Identify key findings, themes,
and any contradictions between sources.
Important:
- Work in batches of 3-4 tool calls at a time to manage context
- After each batch, assess whether you have enough material
- Prefer quality over quantity: 5 good sources beat 15 thin ones
- Track which URL each finding comes from (you'll need citations later)
When done, use set_output:
- set_output("findings", "Structured summary: key findings with source URLs for each claim. \
Include themes, contradictions, and confidence levels.")
- set_output("sources", [{"url": "...", "title": "...", "summary": "..."}])
- set_output("gaps", "What aspects of the research brief are NOT well-covered yet, if any.")
""",
tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
)
# Node 3: Review (client-facing)
# Shows the user what was found and asks whether to dig deeper or proceed.
review_node = NodeSpec(
id="review",
name="Review Findings",
description="Present findings to user and decide whether to research more or write the report",
node_type="event_loop",
client_facing=True,
max_node_visits=3,
input_keys=["findings", "sources", "gaps", "research_brief"],
output_keys=["needs_more_research", "feedback"],
system_prompt="""\
Present the research findings to the user clearly and concisely.
**STEP 1 — Present (your first message, text only, NO tool calls):**
1. **Summary** (2-3 sentences of what was found)
2. **Key Findings** (bulleted, with confidence levels)
3. **Sources Used** (count and quality assessment)
4. **Gaps** (what's still unclear or under-covered)
End by asking: Are they satisfied, or do they want deeper research? \
Should we proceed to writing the final report?
**STEP 2 — After the user responds, call set_output:**
- set_output("needs_more_research", "true") if they want more
- set_output("needs_more_research", "false") if they're satisfied
- set_output("feedback", "What the user wants explored further, or empty string")
""",
tools=[],
)
# Node 4: Report (client-facing)
# Writes an HTML report, serves the link to the user, and answers follow-ups.
report_node = NodeSpec(
id="report",
name="Write & Deliver Report",
description="Write a cited HTML report from the findings and present it to the user",
node_type="event_loop",
client_facing=True,
input_keys=["findings", "sources", "research_brief"],
output_keys=["delivery_status"],
system_prompt="""\
Write a comprehensive research report as an HTML file and present it to the user.
**STEP 1 — Write the HTML report (tool calls, NO text to user yet):**
1. Compose a complete, self-contained HTML document with embedded CSS styling.
Use a clean, readable design: max-width container, pleasant typography,
numbered citation links, a table of contents, and a references section.
Report structure inside the HTML:
- Title & date
- Executive Summary (2-3 paragraphs)
- Table of Contents
- Findings (organized by theme, with [n] citation links)
- Analysis (synthesis, implications, areas of debate)
- Conclusion (key takeaways, confidence assessment)
- References (numbered list with clickable URLs)
Requirements:
- Every factual claim must cite its source with [n] notation
- Be objective: present multiple viewpoints where sources disagree
- Distinguish well-supported conclusions from speculation
- Answer the original research questions from the brief
2. Save the HTML file:
save_data(filename="report.html", data=<your_html>)
3. Get the clickable link:
serve_file_to_user(filename="report.html", label="Research Report")
**STEP 2 — Present the link to the user (text only, NO tool calls):**
Tell the user the report is ready and include the file:// URI from
serve_file_to_user so they can click it to open. Give a brief summary
of what the report covers. Ask if they have questions.
**STEP 3 — After the user responds:**
- Answer follow-up questions from the research material
- When the user is satisfied: set_output("delivery_status", "completed")
""",
tools=["save_data", "serve_file_to_user", "load_data", "list_data_files"],
)
__all__ = [
"intake_node",
"research_node",
"review_node",
"report_node",
]
@@ -0,0 +1,618 @@
---
name: hive-credentials
description: Set up and install credentials for an agent. Detects missing credentials from agent config, collects them from the user, and stores them securely in the local encrypted store at ~/.hive/credentials.
license: Apache-2.0
metadata:
author: hive
version: "2.3"
type: utility
---
# Setup Credentials
Interactive credential setup for agents with multiple authentication options. Detects what's missing, offers auth method choices, validates with health checks, and stores credentials securely.
## When to Use
- Before running or testing an agent for the first time
- When `AgentRunner.run()` fails with "missing required credentials"
- When a user asks to configure credentials for an agent
- After building a new agent that uses tools requiring API keys
## Workflow
### Step 1: Identify the Agent
Determine which agent needs credentials. The user will either:
- Name the agent directly (e.g., "set up credentials for hubspot-agent")
- Have an agent directory open (check `exports/` for agent dirs)
- Be working on an agent in the current session
Locate the agent's directory under `exports/{agent_name}/`.
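For example, a quick way to list the candidates, assuming the standard layout:
```bash
# Each exported agent lives in its own directory under exports/
ls -d exports/*/
```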
### Step 2: Detect Missing Credentials
Use the `check_missing_credentials` MCP tool to detect what the agent needs and what's already configured. This tool loads the agent, inspects its required tools and node types, maps them to credentials via `CREDENTIAL_SPECS`, and checks both the encrypted store and environment variables.
```
check_missing_credentials(agent_path="exports/{agent_name}")
```
The tool returns a JSON response:
```json
{
"agent": "exports/{agent_name}",
"missing": [
{
"credential_name": "brave_search",
"env_var": "BRAVE_SEARCH_API_KEY",
"description": "Brave Search API key for web search",
"help_url": "https://brave.com/search/api/",
"tools": ["web_search"]
}
],
"available": [
{
"credential_name": "anthropic",
"env_var": "ANTHROPIC_API_KEY",
"source": "encrypted_store"
}
],
"total_missing": 1,
"ready": false
}
```
**If `ready` is true (nothing missing):** Report all credentials as configured and skip Steps 3-5. Example:
```
All required credentials are already configured:
✓ anthropic (ANTHROPIC_API_KEY)
✓ brave_search (BRAVE_SEARCH_API_KEY)
Your agent is ready to run!
```
**If credentials are missing:** Continue to Step 3 with the `missing` list.
### Step 3: Present Auth Options for Each Missing Credential
For each missing credential, check what authentication methods are available:
```python
from aden_tools.credentials import CREDENTIAL_SPECS
spec = CREDENTIAL_SPECS.get("hubspot")
if spec:
# Determine available auth options
auth_options = []
if spec.aden_supported:
auth_options.append("aden")
if spec.direct_api_key_supported:
auth_options.append("direct")
auth_options.append("custom") # Always available
# Get setup info
setup_info = {
"env_var": spec.env_var,
"description": spec.description,
"help_url": spec.help_url,
"api_key_instructions": spec.api_key_instructions,
}
```
Present the available options using AskUserQuestion:
```
Choose how to configure HUBSPOT_ACCESS_TOKEN:
1) Aden Platform (OAuth) (Recommended)
Secure OAuth2 flow via hive.adenhq.com
- Quick setup with automatic token refresh
- No need to manage API keys manually
2) Direct API Key
Enter your own API key manually
- Requires creating a HubSpot Private App
- Full control over scopes and permissions
3) Local Credential Setup (Advanced)
Programmatic configuration for CI/CD
- For automated deployments
- Requires manual API calls
```
### Step 4: Execute Auth Flow Based on User Choice
#### Prerequisite: Ensure HIVE_CREDENTIAL_KEY Is Available
Before storing any credentials, verify `HIVE_CREDENTIAL_KEY` is set (needed to encrypt/decrypt the local store). Check both the current session and shell config:
```bash
# Check current session
printenv HIVE_CREDENTIAL_KEY > /dev/null 2>&1 && echo "session: set" || echo "session: not set"
# Check shell config files
for f in ~/.zshrc ~/.bashrc ~/.profile; do [ -f "$f" ] && grep -q 'HIVE_CREDENTIAL_KEY' "$f" && echo "$f"; done
```
- **In current session** — proceed to store credentials
- **In shell config but NOT in current session** — run `source ~/.zshrc` (or `~/.bashrc`) first, then proceed
- **Not set anywhere** — `EncryptedFileStorage` will auto-generate one. After storing, tell the user to persist it: `export HIVE_CREDENTIAL_KEY="{generated_key}"` in their shell profile
#### Option 1: Aden Platform (OAuth)
This is the recommended flow for supported integrations (HubSpot, etc.).
**How Aden OAuth Works:**
The ADEN_API_KEY represents a user who has already completed OAuth authorization on Aden's platform. When users sign up and connect integrations on Aden, those OAuth tokens are stored server-side. Having an ADEN_API_KEY means:
1. User has an Aden account
2. User has already authorized integrations (HubSpot, etc.) via OAuth on Aden
3. We just need to sync those credentials down to the local credential store
**4.1a. Check for ADEN_API_KEY**
```python
import os
aden_key = os.environ.get("ADEN_API_KEY")
```
If not set, guide the user to get one from Aden (this is where they complete OAuth):
```python
from aden_tools.credentials import open_browser, get_aden_setup_url
# Open browser to Aden - user will sign up and connect integrations there
url = get_aden_setup_url() # https://hive.adenhq.com
success, msg = open_browser(url)
print("Please sign in to Aden and connect your integrations (HubSpot, etc.).")
print("Once done, copy your API key and return here.")
```
Ask the user to provide the ADEN_API_KEY they received.
**4.1b. Save ADEN_API_KEY to Shell Config**
With user approval, persist ADEN_API_KEY to their shell config:
```python
from aden_tools.credentials import (
detect_shell,
add_env_var_to_shell_config,
get_shell_source_command,
)
shell_type = detect_shell() # 'bash', 'zsh', or 'unknown'
# Ask user for approval before modifying shell config
# If approved:
success, config_path = add_env_var_to_shell_config(
"ADEN_API_KEY",
user_provided_key,
comment="Aden Platform (OAuth) API key"
)
if success:
source_cmd = get_shell_source_command()
print(f"Saved to {config_path}")
print(f"Run: {source_cmd}")
```
Also save to `~/.hive/configuration.json` for the framework:
```python
import json
from pathlib import Path
config_path = Path.home() / ".hive" / "configuration.json"
config = json.loads(config_path.read_text()) if config_path.exists() else {}
config["aden"] = {
"api_key_configured": True,
"api_url": "https://api.adenhq.com"
}
config_path.parent.mkdir(parents=True, exist_ok=True)
config_path.write_text(json.dumps(config, indent=2))
```
**4.1c. Sync Credentials from Aden Server**
Since the user has already authorized integrations on Aden, use the one-liner factory method:
```python
from core.framework.credentials import CredentialStore
# This single call handles everything:
# - Creates encrypted local storage at ~/.hive/credentials
# - Configures Aden client from ADEN_API_KEY env var
# - Syncs all credentials from Aden server automatically
store = CredentialStore.with_aden_sync(
base_url="https://api.adenhq.com",
auto_sync=True, # Syncs on creation
)
# Check what was synced
synced = store.list_credentials()
print(f"Synced credentials: {synced}")
# If the required credential wasn't synced, the user hasn't authorized it on Aden yet
if "hubspot" not in synced:
print("HubSpot not found in your Aden account.")
print("Please visit https://hive.adenhq.com to connect HubSpot, then try again.")
```
For more control over the sync process:
```python
from core.framework.credentials import CredentialStore
from core.framework.credentials.aden import (
AdenCredentialClient,
AdenClientConfig,
AdenSyncProvider,
)
# Create client (API key loaded from ADEN_API_KEY env var)
client = AdenCredentialClient(AdenClientConfig(
base_url="https://api.adenhq.com",
))
# Create provider and store
provider = AdenSyncProvider(client=client)
store = CredentialStore.with_encrypted_storage()
# Manual sync
synced_count = provider.sync_all(store)
print(f"Synced {synced_count} credentials from Aden")
```
**4.1d. Run Health Check**
```python
from aden_tools.credentials import check_credential_health
# Get the token from the store
cred = store.get_credential("hubspot")
token = cred.keys["access_token"].value.get_secret_value()
result = check_credential_health("hubspot", token)
if result.valid:
print("HubSpot credentials validated successfully!")
else:
print(f"Validation failed: {result.message}")
# Offer to retry the OAuth flow
```
#### Option 2: Direct API Key
For users who prefer manual API key management.
**4.2a. Show Setup Instructions**
```python
from aden_tools.credentials import CREDENTIAL_SPECS
spec = CREDENTIAL_SPECS.get("hubspot")
if spec and spec.api_key_instructions:
print(spec.api_key_instructions)
# Output:
# To get a HubSpot Private App token:
# 1. Go to HubSpot Settings > Integrations > Private Apps
# 2. Click "Create a private app"
# 3. Name your app (e.g., "Hive Agent")
# ...
if spec and spec.help_url:
print(f"More info: {spec.help_url}")
```
**4.2b. Collect API Key from User**
Use AskUserQuestion to securely collect the API key:
```
Please provide your HubSpot access token:
(This will be stored securely in ~/.hive/credentials)
```
**4.2c. Run Health Check Before Storing**
```python
from aden_tools.credentials import check_credential_health
result = check_credential_health("hubspot", user_provided_token)
if not result.valid:
print(f"Warning: {result.message}")
# Ask user if they want to:
# 1. Try a different token
# 2. Continue anyway (not recommended)
```
**4.2d. Store in Local Encrypted Store**
```python
from core.framework.credentials import CredentialStore, CredentialObject, CredentialKey
from pydantic import SecretStr
store = CredentialStore.with_encrypted_storage()
cred = CredentialObject(
id="hubspot",
name="HubSpot Access Token",
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr(user_provided_token),
)
},
)
store.save_credential(cred)
```
**4.2e. Export to Current Session**
```bash
export HUBSPOT_ACCESS_TOKEN="{token_value}"
```
#### Option 3: Local Credential Setup (Advanced)
For programmatic/CI/CD setups.
**4.3a. Show Documentation**
```
For advanced credential management, you can use the CredentialStore API directly:
from core.framework.credentials import CredentialStore, CredentialObject, CredentialKey
from pydantic import SecretStr
store = CredentialStore.with_encrypted_storage()
cred = CredentialObject(
id="hubspot",
name="HubSpot Access Token",
keys={"access_token": CredentialKey(name="access_token", value=SecretStr("..."))}
)
store.save_credential(cred)
For CI/CD environments:
- Set HIVE_CREDENTIAL_KEY for encryption
- Pre-populate ~/.hive/credentials programmatically
- Or use environment variables directly (HUBSPOT_ACCESS_TOKEN)
Documentation: See core/framework/credentials/README.md
```
### Step 5: Record Configuration Method
Track which auth method was used for each credential in `~/.hive/configuration.json`:
```python
import json
from pathlib import Path
from datetime import datetime
config_path = Path.home() / ".hive" / "configuration.json"
config = json.loads(config_path.read_text()) if config_path.exists() else {}
if "credential_methods" not in config:
config["credential_methods"] = {}
config["credential_methods"]["hubspot"] = {
"method": "aden", # or "direct" or "custom"
"configured_at": datetime.now().isoformat(),
}
config_path.write_text(json.dumps(config, indent=2))
```
### Step 6: Verify All Credentials
Use the `verify_credentials` MCP tool to confirm everything is properly configured:
```
verify_credentials(agent_path="exports/{agent_name}")
```
The tool returns:
```json
{
"agent": "exports/{agent_name}",
"ready": true,
"missing_credentials": [],
"warnings": [],
"errors": []
}
```
If `ready` is true, report success. If `missing_credentials` is non-empty, identify what failed and loop back to Step 3 for the remaining credentials.
## Health Check Reference
Health checks validate credentials by making lightweight API calls:
| Credential | Endpoint | What It Checks |
| --------------- | --------------------------------------- | --------------------------------- |
| `anthropic` | `POST /v1/messages` | API key validity |
| `brave_search` | `GET /res/v1/web/search?q=test&count=1` | API key validity |
| `google_search` | `GET /customsearch/v1?q=test&num=1` | API key + CSE ID validity |
| `github` | `GET /user` | Token validity, user identity |
| `hubspot` | `GET /crm/v3/objects/contacts?limit=1` | Bearer token validity, CRM scopes |
| `resend` | `GET /domains` | API key validity |
```python
from aden_tools.credentials import check_credential_health, HealthCheckResult
result: HealthCheckResult = check_credential_health("hubspot", token_value)
# result.valid: bool
# result.message: str
# result.details: dict (status_code, rate_limited, etc.)
```
## Encryption Key (HIVE_CREDENTIAL_KEY)
The local encrypted store requires `HIVE_CREDENTIAL_KEY` to encrypt/decrypt credentials.
- If the user doesn't have one, `EncryptedFileStorage` will auto-generate one and log it
- The user MUST persist this key (e.g., in `~/.bashrc` or a secrets manager)
- Without this key, stored credentials cannot be decrypted
- This is the ONLY secret that should live in `~/.bashrc` or environment config
If `HIVE_CREDENTIAL_KEY` is not set:
1. Let the store generate one
2. Tell the user to save it: `export HIVE_CREDENTIAL_KEY="{generated_key}"`
3. Recommend adding it to `~/.bashrc` or their shell profile
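A minimal sketch of persisting the key, assuming zsh (substitute `~/.bashrc` for bash):
```bash
# Replace {generated_key} with the key the store logged at creation time
echo 'export HIVE_CREDENTIAL_KEY="{generated_key}"' >> ~/.zshrc
source ~/.zshrc
```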
## Security Rules
- **NEVER** log, print, or echo credential values in tool output
- **NEVER** store credentials in plaintext files, git-tracked files, or agent configs
- **NEVER** hardcode credentials in source code
- **ALWAYS** use `SecretStr` from Pydantic when handling credential values in Python
- **ALWAYS** use the local encrypted store (`~/.hive/credentials`) for persistence
- **ALWAYS** run health checks before storing credentials (when possible)
- **ALWAYS** verify credentials were stored by re-running validation, not by reading them back
- When modifying `~/.bashrc` or `~/.zshrc`, confirm with the user first
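For reference, `SecretStr` masks the value everywhere except an explicit accessor, which is why the rules above require it:
```python
from pydantic import SecretStr

token = SecretStr("not-a-real-key")
print(token)                     # prints ********** (safe to log)
raw = token.get_secret_value()   # the only way to read the raw value
```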
## Credential Sources Reference
All credential specs are defined in `tools/src/aden_tools/credentials/`:
| File | Category | Credentials | Aden Supported |
| ----------------- | ------------- | --------------------------------------------- | -------------- |
| `llm.py` | LLM Providers | `anthropic` | No |
| `search.py` | Search Tools | `brave_search`, `google_search`, `google_cse` | No |
| `email.py` | Email | `resend` | No |
| `integrations.py` | Integrations | `github`, `hubspot` | No / Yes |
**Note:** Additional LLM providers (Cerebras, Groq, OpenAI) are handled by LiteLLM via environment
variables (`CEREBRAS_API_KEY`, `GROQ_API_KEY`, `OPENAI_API_KEY`) but are not yet in CREDENTIAL_SPECS.
Add them to `llm.py` as needed.
To check what's registered:
```python
from aden_tools.credentials import CREDENTIAL_SPECS
for name, spec in CREDENTIAL_SPECS.items():
print(f"{name}: aden={spec.aden_supported}, direct={spec.direct_api_key_supported}")
```
## Migration: CredentialManager → CredentialStore
**CredentialManager is deprecated.** Use CredentialStore instead.
| Old (Deprecated) | New (Recommended) |
| ----------------------------------------- | -------------------------------------------------------------------- |
| `CredentialManager()` | `CredentialStore.with_encrypted_storage()` |
| `creds.get("hubspot")` | `store.get("hubspot")` or `store.get_key("hubspot", "access_token")` |
| `creds.validate_for_tools(tools)` | Use `store.is_available(cred_id)` per credential |
| `creds.get_auth_options("hubspot")` | Check `CREDENTIAL_SPECS["hubspot"].aden_supported` |
| `creds.get_setup_instructions("hubspot")` | Access `CREDENTIAL_SPECS["hubspot"]` directly |
**Why migrate?**
- **CredentialStore** supports encrypted storage, multi-key credentials, template resolution, and automatic token refresh
- **CredentialManager** only reads from environment variables and .env files (no encryption, no refresh)
- **CredentialStoreAdapter** exists for backward compatibility during migration
```python
# Old way (deprecated)
from aden_tools.credentials import CredentialManager
creds = CredentialManager()
token = creds.get("hubspot")
# New way (recommended)
from core.framework.credentials import CredentialStore
store = CredentialStore.with_encrypted_storage()
token = store.get("hubspot")
# With Aden sync (recommended for OAuth integrations)
store = CredentialStore.with_aden_sync()
token = store.get_key("hubspot", "access_token")
```
## Example Session
```
User: /hive-credentials for my research-agent
Agent: Let me check what credentials your research-agent needs.
[Calls check_missing_credentials(agent_path="exports/research-agent")]
→ Returns:
available: anthropic (encrypted_store), brave_search (encrypted_store)
missing: google_search (GOOGLE_API_KEY), google_cse (GOOGLE_CSE_ID)
ready: false
Agent: 2 of 4 required credentials are already configured. Only Google Custom
Search needs setup (2 values).
--- Setting up Google Custom Search (google_search + google_cse) ---
This requires two values that work together.
[Checks HIVE_CREDENTIAL_KEY before storing]
$ printenv HIVE_CREDENTIAL_KEY > /dev/null 2>&1 && echo "set" || echo "not set"
set
First, the Google API Key:
1. Go to https://console.cloud.google.com/apis/credentials
2. Create a new project (or select an existing one)
3. Enable the "Custom Search API" from the API Library
4. Go to Credentials > Create Credentials > API Key
5. Copy the generated API key
[AskUserQuestion: "Please provide your Google API key:"]
[User provides key]
Now, the Custom Search Engine ID:
1. Go to https://programmablesearchengine.google.com/controlpanel/all
2. Click "Add" to create a new search engine
3. Under "What to search", select "Search the entire web"
4. Give your search engine a name
5. Click "Create"
6. Copy the Search Engine ID (cx value)
[AskUserQuestion: "Please provide your Google CSE ID:"]
[User provides ID]
[Runs health check with both values - GET /customsearch/v1?q=test&num=1 → 200 OK]
[Stores both in local encrypted store, exports to env]
✓ Google Custom Search credentials valid
[Calls verify_credentials(agent_path="exports/research-agent")]
→ Returns: ready: true, missing_credentials: []
All credentials are now configured:
✓ anthropic (ANTHROPIC_API_KEY) — already in encrypted store
✓ brave_search (BRAVE_SEARCH_API_KEY) — already in encrypted store
✓ google_search (GOOGLE_API_KEY) — stored in encrypted store
✓ google_cse (GOOGLE_CSE_ID) — stored in encrypted store
┌─────────────────────────────────────────────────────────────────────────────┐
│ ✅ CREDENTIALS CONFIGURED │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ NEXT STEPS: │
│ │
│ 1. RUN YOUR AGENT: │
│ │
│ PYTHONPATH=core:exports python -m research-agent tui │
│ │
│ 2. IF YOU ENCOUNTER ISSUES, USE THE DEBUGGER: │
│ │
│ /hive-debugger │
│ │
│ The debugger analyzes runtime logs, identifies retry loops, tool │
│ failures, stalled execution, and provides actionable fix suggestions. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
@@ -0,0 +1,933 @@
---
name: hive-debugger
type: utility
description: Interactive debugging companion for Hive agents - identifies runtime issues and proposes solutions
version: 1.0.0
requires:
- hive-concepts
tags:
- debugging
- runtime-logs
- agent-development
---
# Hive Debugger
An interactive debugging companion that helps developers identify and fix runtime issues in Hive agents. The debugger analyzes runtime logs at three levels (L1/L2/L3), categorizes issues, and provides actionable fix recommendations.
## When to Use This Skill
Use `/hive-debugger` when:
- Your agent is failing or producing unexpected results
- You need to understand why a specific node is retrying repeatedly
- Tool calls are failing and you need to identify the root cause
- Agent execution is stalled or taking too long
- You want to monitor agent behavior in real-time during development
This skill works alongside agents running in TUI mode and provides supervisor-level insights into execution behavior.
---
## Prerequisites
Before using this skill, ensure:
1. You have an exported agent in `exports/{agent_name}/`
2. The agent has been run at least once (logs exist)
3. Runtime logging is enabled (default in Hive framework)
4. You have access to the agent's working directory at `~/.hive/agents/{agent_name}/`
---
## Workflow
### Stage 1: Setup & Context Gathering
**Objective:** Understand the agent being debugged
**What to do:**
1. **Ask the developer which agent needs debugging:**
- Get agent name (e.g., "twitter_outreach", "deep_research_agent")
- Confirm the agent exists in `exports/{agent_name}/`
2. **Determine agent working directory:**
- Calculate: `~/.hive/agents/{agent_name}/`
- Verify this directory exists and contains session logs
3. **Read agent configuration:**
- Read file: `exports/{agent_name}/agent.json`
- Extract goal information from the JSON:
- `goal.id` - The goal identifier
- `goal.success_criteria` - What success looks like
- `goal.constraints` - Rules the agent must follow
- Extract graph information:
- List of node IDs from `graph.nodes`
- List of edges from `graph.edges`
4. **Store context for the debugging session:**
- agent_name
- agent_work_dir (e.g., `~/.hive/agents/twitter_outreach`)
- goal_id
- success_criteria
- constraints
- node_ids
**Example:**
```
Developer: "My twitter_outreach agent keeps failing"
You: "I'll help debug the twitter_outreach agent. Let me gather context..."
[Read exports/twitter_outreach/agent.json]
Context gathered:
- Agent: twitter_outreach
- Goal: twitter-outreach-multi-loop
- Working Directory: ~/.hive/agents/twitter_outreach
- Success Criteria: ["Successfully send 5 personalized outreach messages"]
- Constraints: ["Must verify handle exists", "Must personalize message"]
- Nodes: ["intake-collector", "profile-analyzer", "message-composer", "outreach-sender"]
```
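A minimal sketch of this context gathering, assuming the `agent.json` layout described above (agent name hypothetical):
```python
import json
from pathlib import Path

agent_name = "twitter_outreach"
spec = json.loads(Path(f"exports/{agent_name}/agent.json").read_text())

context = {
    "agent_name": agent_name,
    "agent_work_dir": str(Path.home() / ".hive" / "agents" / agent_name),
    "goal_id": spec["goal"]["id"],
    "success_criteria": spec["goal"]["success_criteria"],
    "constraints": spec["goal"]["constraints"],
    "node_ids": [node["id"] for node in spec["graph"]["nodes"]],
}
```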
---
### Stage 2: Mode Selection
**Objective:** Choose the debugging approach that best fits the situation
**What to do:**
Ask the developer which debugging mode they want to use. Use AskUserQuestion with these options:
1. **Real-time Monitoring Mode**
- Description: Monitor active TUI session continuously, poll logs every 5-10 seconds, alert on new issues immediately
- Best for: Live debugging sessions where you want to catch issues as they happen
- Note: Requires agent to be currently running
2. **Post-Mortem Analysis Mode**
- Description: Analyze completed or failed runs in detail, deep dive into specific session
- Best for: Understanding why a past execution failed
- Note: Most common mode for debugging
3. **Historical Trends Mode**
- Description: Analyze patterns across multiple runs, identify recurring issues
- Best for: Finding systemic problems that happen repeatedly
- Note: Useful for agents that have run many times
**Implementation:**
```
Use AskUserQuestion to present these options and let the developer choose.
Store the selected mode for the session.
```
---
### Stage 3: Triage (L1 Analysis)
**Objective:** Identify which sessions need attention
**What to do:**
1. **Query high-level run summaries** using the MCP tool:
```
query_runtime_logs(
agent_work_dir="{agent_work_dir}",
status="needs_attention",
limit=20
)
```
2. **Analyze the results:**
- Look for runs with `needs_attention: true`
- Check `attention_summary.categories` for issue types
- Note the `run_id` of problematic sessions
- Check `status` field: "degraded", "failure", "in_progress"
3. **Attention flag triggers to understand:**
From runtime_logger.py, runs are flagged when any of the following thresholds trip (see the predicate sketch after the example output below):
- retry_count > 3
- escalate_count > 2
- latency_ms > 60000
- tokens_used > 100000
- total_steps > 20
4. **Present findings to developer:**
- Summarize how many runs need attention
- List the most recent problematic runs
- Show attention categories for each
- Ask which run they want to investigate (if multiple)
**Example Output:**
```
Found 2 runs needing attention:
1. session_20260206_115718_e22339c5 (30 minutes ago)
Status: degraded
Categories: missing_outputs, retry_loops
2. session_20260206_103422_9f8d1b2a (2 hours ago)
Status: failure
Categories: tool_failures, high_latency
Which run would you like to investigate?
```
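For reference, the attention triggers from step 3 can be expressed as a single predicate (field names assumed from the L1 summaries):
```python
def needs_attention(run: dict) -> bool:
    """Mirror the runtime_logger.py attention thresholds listed above."""
    return (
        run.get("retry_count", 0) > 3
        or run.get("escalate_count", 0) > 2
        or run.get("latency_ms", 0) > 60000
        or run.get("tokens_used", 0) > 100000
        or run.get("total_steps", 0) > 20
    )
```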
---
### Stage 4: Diagnosis (L2 Analysis)
**Objective:** Identify which nodes failed and what patterns exist
**What to do:**
1. **Query per-node details** using the MCP tool:
```
query_runtime_log_details(
agent_work_dir="{agent_work_dir}",
run_id="{selected_run_id}",
needs_attention_only=True
)
```
2. **Categorize issues** using the Issue Taxonomy:
**10 Issue Categories:**
| Category | Detection Pattern | Meaning |
|----------|------------------|---------|
| **Missing Outputs** | `exit_status != "success"`, `attention_reasons` contains "missing_outputs" | Node didn't call set_output with required keys |
| **Tool Errors** | `tool_error_count > 0`, `attention_reasons` contains "tool_failures" | Tool calls failed (API errors, timeouts, auth issues) |
| **Retry Loops** | `retry_count > 3`, `verdict_counts.RETRY > 5` | Judge repeatedly rejecting outputs |
| **Guard Failures** | `guard_reject_count > 0` | Output validation failed (wrong types, missing keys) |
| **Stalled Execution** | `total_steps > 20`, `verdict_counts.CONTINUE > 10` | EventLoopNode not making progress |
| **High Latency** | `latency_ms > 60000`, `avg_step_latency > 5000` | Slow tool calls or LLM responses |
| **Client-Facing Issues** | `client_input_requested` but no `user_input_received` | Premature set_output before user input |
| **Edge Routing Errors** | `exit_status == "no_valid_edge"`, `attention_reasons` contains "routing_issue" | No edges match current state |
| **Memory/Context Issues** | `tokens_used > 100000`, `context_overflow_count > 0` | Conversation history too long |
| **Constraint Violations** | Compare output against goal constraints | Agent violated goal-level rules |
3. **Analyze each flagged node:**
- Node ID and name
- Exit status
- Retry count
- Verdict distribution (ACCEPT/RETRY/ESCALATE/CONTINUE)
- Attention reasons
- Total steps executed
4. **Present diagnosis to developer:**
- List problematic nodes
- Categorize each issue
- Highlight the most severe problems
- Show evidence (retry counts, error types)
**Example Output:**
```
Diagnosis for session_20260206_115718_e22339c5:
Problem Node: intake-collector
├─ Exit Status: escalate
├─ Retry Count: 5 (HIGH)
├─ Verdict Counts: {RETRY: 5, ESCALATE: 1}
├─ Attention Reasons: ["high_retry_count", "missing_outputs"]
├─ Total Steps: 8
└─ Categories: Missing Outputs + Retry Loops
Root Issue: The intake-collector node is stuck in a retry loop because it's not setting required outputs.
```
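The detection patterns in the taxonomy can be sketched as a rough categorizer (field names assumed from the L2 response; illustrative, not exhaustive):
```python
def categorize(node: dict) -> list[str]:
    """Map L2 node details onto the issue taxonomy above (partial)."""
    reasons = set(node.get("attention_reasons", []))
    verdicts = node.get("verdict_counts", {})
    categories = []
    if "missing_outputs" in reasons:
        categories.append("Missing Outputs")
    if node.get("tool_error_count", 0) > 0 or "tool_failures" in reasons:
        categories.append("Tool Errors")
    if node.get("retry_count", 0) > 3 or verdicts.get("RETRY", 0) > 5:
        categories.append("Retry Loops")
    if node.get("guard_reject_count", 0) > 0:
        categories.append("Guard Failures")
    if node.get("total_steps", 0) > 20 or verdicts.get("CONTINUE", 0) > 10:
        categories.append("Stalled Execution")
    if node.get("latency_ms", 0) > 60000:
        categories.append("High Latency")
    return categories
```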
---
### Stage 5: Root Cause Analysis (L3 Analysis)
**Objective:** Understand exactly what went wrong by examining detailed logs
**What to do:**
1. **Query detailed tool/LLM logs** using the MCP tool:
```
query_runtime_log_raw(
agent_work_dir="{agent_work_dir}",
run_id="{run_id}",
node_id="{problem_node_id}"
)
```
2. **Analyze based on issue category:**
**For Missing Outputs:**
- Check `step.tool_calls` for set_output usage
- Look for conditional logic that skipped set_output
- Check if LLM is calling other tools instead
**For Tool Errors:**
- Check `step.tool_results` for error messages
- Identify error types: rate limits, auth failures, timeouts, network errors
- Note which specific tool is failing
**For Retry Loops:**
- Check `step.verdict_feedback` from judge
- Look for repeated failure reasons
- Identify if it's the same issue every time
**For Guard Failures:**
- Check `step.guard_results` for validation errors
- Identify missing keys or type mismatches
- Compare actual output to expected schema
**For Stalled Execution:**
- Check `step.llm_response_text` for repetition
- Look for LLM stuck in same action loop
- Check if tool calls are succeeding but not progressing
3. **Extract evidence:**
- Specific error messages
- Tool call arguments and results
- LLM response text
- Judge feedback
- Step-by-step progression
4. **Formulate root cause explanation:**
- Clearly state what is happening
- Explain why it's happening
- Show evidence from logs
**Example Output:**
```
Root Cause Analysis for intake-collector:
Step-by-step breakdown:
Step 3:
- Tool Call: web_search(query="@RomuloNevesOf")
- Result: Found Twitter profile information
- Verdict: RETRY
- Feedback: "Missing required output 'twitter_handles'. You found the handle but didn't call set_output."
Step 4:
- Tool Call: web_search(query="@RomuloNevesOf twitter")
- Result: Found additional Twitter information
- Verdict: RETRY
- Feedback: "Still missing 'twitter_handles'. Use set_output to save your findings."
Steps 5-7: Similar pattern continues...
ROOT CAUSE: The node is successfully finding Twitter handles via web_search, but the LLM is not calling set_output to save the results. It keeps searching for more information instead of completing the task.
```
---
### Stage 6: Fix Recommendations
**Objective:** Provide actionable solutions the developer can implement
**What to do:**
Based on the issue category identified, provide specific fix recommendations using these templates:
#### Template 1: Missing Outputs (Client-Facing Nodes)
```markdown
## Issue: Premature set_output in Client-Facing Node
**Root Cause:** Node called set_output before receiving user input
**Fix:** Use STEP 1/STEP 2 prompt pattern
**File to edit:** `exports/{agent_name}/nodes/{node_name}.py`
**Changes:**
1. Update the system_prompt to include explicit step guidance:
```python
system_prompt = """
STEP 1: Analyze the user input and decide what action to take.
DO NOT call set_output in this step.
STEP 2: After receiving feedback or completing analysis,
ONLY THEN call set_output with your results.
"""
```
2. If some inputs are optional (like feedback on retry edges), add nullable_output_keys:
```python
nullable_output_keys=["feedback"]
```
**Verification:**
- Run the agent with test input
- Verify the client-facing node waits for user input before calling set_output
```
#### Template 2: Retry Loops
```markdown
## Issue: Judge Repeatedly Rejecting Outputs
**Root Cause:** {Insert specific reason from verdict_feedback}
**Fix Options:**
**Option A - If outputs are actually correct:** Adjust judge evaluation rules
- File: `exports/{agent_name}/agent.json`
- Update `evaluation_rules` section to accept the current output format
- Example: If judge expects list but gets string, update rule to accept both
**Option B - If prompt is ambiguous:** Clarify node instructions
- File: `exports/{agent_name}/nodes/{node_name}.py`
- Make system_prompt more explicit about output format and requirements
- Add examples of correct outputs
**Option C - If tool is unreliable:** Add retry logic with fallback
- Consider using alternative tools
- Add manual fallback option
- Update prompt to handle tool failures gracefully
**Verification:**
- Run the node with test input
- Confirm judge accepts output on first try
- Check that retry_count stays at 0
```
#### Template 3: Tool Errors
```markdown
## Issue: {tool_name} Failing with {error_type}
**Root Cause:** {Insert specific error message from logs}
**Fix Strategy:**
**If API rate limit:**
1. Add exponential backoff in tool retry logic
2. Reduce API call frequency
3. Consider caching results
**If auth failure:**
1. Check credentials using:
```bash
/hive-credentials --agent {agent_name}
```
2. Verify API key environment variables
3. Update `mcp_servers.json` if needed
**If timeout:**
1. Increase timeout in `mcp_servers.json`:
```json
{
"timeout_ms": 60000
}
```
2. Consider using faster alternative tools
3. Break large requests into smaller chunks
**Verification:**
- Test tool call manually
- Confirm successful response
- Monitor for recurring errors
```
#### Template 4: Edge Routing Errors
```markdown
## Issue: No Valid Edge from Node {node_id}
**Root Cause:** No edge condition matched the current state
**File to edit:** `exports/{agent_name}/agent.json`
**Analysis:**
- Current node output: {show actual output keys}
- Existing edge conditions: {list edge conditions}
- Why no match: {explain the mismatch}
**Fix:**
Add the missing edge to the graph:
```json
{
"edge_id": "{node_id}_to_{target_node}",
"source": "{node_id}",
"target": "{target_node}",
"condition": "on_success"
}
```
**Alternative:** Update existing edge condition to cover this case
**Verification:**
- Run agent with same input
- Verify edge is traversed successfully
- Check that execution continues to next node
```
#### Template 5: Stalled Execution
```markdown
## Issue: EventLoopNode Not Making Progress
**Root Cause:** {Insert analysis - e.g., "LLM repeating same failed action"}
**File to edit:** `exports/{agent_name}/nodes/{node_name}.py`
**Fix:** Update system_prompt to guide LLM out of loops
**Add this guidance:**
```python
system_prompt = """
{existing prompt}
IMPORTANT: If a tool call fails multiple times:
1. Try an alternative approach or different tool
2. If no alternatives work, call set_output with partial results
3. DO NOT retry the same failed action more than 3 times
Progress is more important than perfection. Move forward even with incomplete data.
"""
```
**Additional fix:** Cap `max_node_visits` to keep the node from looping indefinitely
```python
# In node configuration
max_node_visits=3 # Prevent getting stuck
```
**Verification:**
- Run node with same input that caused stall
- Verify it exits after reasonable attempts (< 10 steps)
- Confirm it calls set_output eventually
```
#### Template 6: Checkpoint Recovery (Post-Fix Resume)
```markdown
## Recovery Strategy: Resume from Last Clean Checkpoint
**Situation:** You've fixed the issue, but the failed session is stuck mid-execution
**Solution:** Resume execution from a checkpoint before the failure
### Option A: Auto-Resume from Latest Checkpoint (Recommended)
Use CLI arguments to auto-resume when launching TUI:
```bash
PYTHONPATH=core:exports python -m {agent_name} --tui \
--resume-session {session_id}
```
This will:
- Load session state from `state.json`
- Continue from where it paused/failed
- Apply your fixes immediately
### Option B: Resume from Specific Checkpoint (Time-Travel)
If you need to go back to an earlier point:
```bash
PYTHONPATH=core:exports python -m {agent_name} --tui \
--resume-session {session_id} \
--checkpoint {checkpoint_id}
```
Example:
```bash
PYTHONPATH=core:exports python -m deep_research_agent --tui \
--resume-session session_20260208_143022_abc12345 \
--checkpoint cp_node_complete_intake_143030
```
### Option C: Use TUI Commands
Alternatively, launch TUI normally and use commands:
```bash
# Launch TUI
PYTHONPATH=core:exports python -m {agent_name} --tui
# In TUI, use commands:
/resume {session_id} # Resume from session state
/recover {session_id} {checkpoint_id} # Recover from specific checkpoint
```
### When to Use Each Option:
**Use `/resume` (or --resume-session) when:**
- You fixed credentials and want to retry
- Agent paused and you want to continue
- Agent failed and you want to retry from last state
**Use `/recover` (or --resume-session + --checkpoint) when:**
- You need to go back to an earlier checkpoint
- You want to try a different path from a specific point
- Debugging requires time-travel to earlier state
### Find Available Checkpoints:
```bash
# In TUI:
/sessions {session_id}
# This shows all checkpoints with timestamps:
Available Checkpoints: (3)
1. cp_node_complete_intake_143030
2. cp_node_complete_research_143115
3. cp_pause_research_143130
```
**Verification:**
- Use `--resume-session` to test your fix immediately
- No need to re-run from the beginning
- Session continues with your code changes applied
```
**Selecting the right template:**
- Match the issue category from Stage 4
- Customize with specific details from Stage 5
- Include actual error messages and code snippets
- Provide file paths and line numbers when possible
- **Always include recovery commands** (Template 6) after providing fix recommendations
---
### Stage 7: Verification Support
**Objective:** Help the developer confirm their fixes work
**What to do:**
1. **Suggest appropriate tests based on fix type:**
**For node-level fixes:**
```bash
# Use hive-test to run goal-based tests
/hive-test --agent {agent_name} --goal {goal_id}
# Or run specific test scenarios
/hive-test --agent {agent_name} --scenario {specific_input}
```
**For quick manual tests:**
```bash
# Launch the interactive TUI dashboard
hive tui
```
Then use arrow keys to select the agent from the list and press Enter to run it.
2. **Provide MCP tool queries to validate the fix:**
**Check if issue is resolved:**
```
query_runtime_logs(
agent_work_dir="~/.hive/agents/{agent_name}",
status="needs_attention",
limit=5
)
# Should show 0 results if fully fixed
```
**Verify specific node behavior:**
```
query_runtime_log_details(
agent_work_dir="~/.hive/agents/{agent_name}",
run_id="{new_run_id}",
node_id="{fixed_node_id}"
)
# Should show exit_status="success", retry_count=0
```
3. **Monitor for regression:**
- Run the agent multiple times
- Check for similar issues reappearing
- Verify fix works across different inputs
4. **Provide verification checklist:**
```
Verification Checklist:
□ Applied recommended fix to code
□ Ran agent with test input
□ Checked runtime logs show no attention flags
□ Verified specific node completes successfully
□ Tested with multiple inputs
□ No regression of original issue
□ Agent meets success criteria
```
**Example interaction:**
```
Developer: "I applied the fix to intake-collector. How do I verify it works?"
You: "Great! Let's verify the fix with these steps:
1. Launch the TUI dashboard:
hive tui
Then select your agent from the list and press Enter to run it.
2. After it completes, check the logs:
[Use query_runtime_logs to check for attention flags]
3. Verify the specific node:
[Use query_runtime_log_details for intake-collector]
Expected results:
- No 'needs_attention' flags
- intake-collector shows exit_status='success'
- retry_count should be 0
Let me know when you've run it and I'll help check the logs!"
```
---
## MCP Tool Usage Guide
### Three Levels of Observability
**L1: query_runtime_logs** - Session-level summaries
- **When to use:** Initial triage, identifying problematic runs, monitoring trends
- **Returns:** List of runs with status, attention flags, timestamps
- **Example:**
```
query_runtime_logs(
agent_work_dir="/home/user/.hive/twitter_outreach",
status="needs_attention",
limit=20
)
```
**L2: query_runtime_log_details** - Node-level details
- **When to use:** Diagnosing which nodes failed, understanding retry patterns
- **Returns:** Per-node completion details, retry counts, verdicts
- **Example:**
```
query_runtime_log_details(
agent_work_dir="/home/user/.hive/twitter_outreach",
run_id="session_20260206_115718_e22339c5",
needs_attention_only=True
)
```
**L3: query_runtime_log_raw** - Step-level details
- **When to use:** Root cause analysis, understanding exact failures
- **Returns:** Full tool calls, LLM responses, judge feedback
- **Example:**
```
query_runtime_log_raw(
agent_work_dir="/home/user/.hive/twitter_outreach",
run_id="session_20260206_115718_e22339c5",
node_id="intake-collector"
)
```
### Query Patterns
**Pattern 1: Top-Down Investigation** (Most common)
```
1. L1: Find problematic runs
2. L2: Identify failing nodes
3. L3: Analyze specific failures
```
**Pattern 2: Node-Specific Debugging**
```
1. L2: Get details for specific node across all runs
2. L3: Deep dive into worst failures
```
**Pattern 3: Real-time Monitoring**
```
Loop every 10 seconds:
1. L1: Check for new needs_attention runs
2. If found: Alert and drill into L2
```
---
## Complete Example Walkthrough
### Scenario: twitter_outreach Agent Failing
**Developer says:** "My twitter_outreach agent keeps failing. It's not finding Twitter handles correctly."
#### Stage 1: Setup
```
You: "I'll help debug the twitter_outreach agent. Let me gather context..."
[Read exports/twitter_outreach/agent.json]
Context:
- Agent: twitter_outreach
- Goal: twitter-outreach-multi-loop
- Working Dir: ~/.hive/agents/twitter_outreach
- Success Criteria: ["Successfully send 5 personalized outreach messages"]
- Constraints: ["Must verify handle exists", "Must personalize message"]
- Nodes: intake-collector, profile-analyzer, message-composer, outreach-sender
```
#### Stage 2: Mode Selection
```
You: "Which debugging approach would you prefer?"
[Present 3 mode options using AskUserQuestion]
Developer selects: "Post-Mortem Analysis Mode"
```
#### Stage 3: Triage (L1)
```
You: "Querying runtime logs for problematic runs..."
[Call query_runtime_logs with status="needs_attention"]
Results:
- Found 1 run needing attention
- session_20260206_115718_e22339c5 (30 min ago)
- Status: degraded
- Categories: missing_outputs, retry_loops
```
#### Stage 4: Diagnosis (L2)
```
You: "Analyzing node-level details..."
[Call query_runtime_log_details for the session]
Results:
- Problem Node: intake-collector
- Exit Status: escalate
- Retry Count: 5
- Verdict Counts: {RETRY: 5, ESCALATE: 1}
- Attention Reasons: ["high_retry_count", "missing_outputs"]
- Total Steps: 8
Diagnosis: intake-collector has retry loop + missing outputs issue
Category: Missing Outputs + Retry Loops
```
#### Stage 5: Root Cause (L3)
```
You: "Drilling into detailed logs for intake-collector..."
[Call query_runtime_log_raw for node intake-collector]
Evidence:
Step 3: web_search("@RomuloNevesOf") → Found profile
Verdict: RETRY
Feedback: "Missing required output 'twitter_handles'. You found the handle but didn't call set_output."
Step 4: web_search("@RomuloNevesOf twitter") → Found more info
Verdict: RETRY
Feedback: "Still missing 'twitter_handles'. Use set_output to save your findings."
Steps 5-7: Same pattern repeats...
ROOT CAUSE: Node successfully finds handles but LLM doesn't call set_output. It keeps searching instead of completing.
```
#### Stage 6: Fix Recommendation
```
You: "Here's the fix for this issue:
## Issue: intake-collector Not Setting Outputs
**Root Cause:** Node successfully finds Twitter handles but doesn't call set_output
**Category:** Missing Outputs
**File to edit:** exports/twitter_outreach/nodes/intake_collector.py
**Changes:**
Update the system_prompt to emphasize the output requirement:
```python
system_prompt = """
You are the Intake Collector for the Twitter outreach agent.
Your task:
1. Extract Twitter handle from user input (URL or @mention)
2. Verify it's a valid Twitter handle format
3. **CRITICAL: Call set_output(twitter_handles=[handle]) immediately after finding valid handle**
IMPORTANT: You MUST call set_output even if you found the handle on the first try.
Do not continue searching once you have a valid handle - set output and finish.
Output format:
{
"twitter_handles": ["@handle1", "@handle2", ...]
}
"""
```
**Why this works:** The explicit instruction to call set_output immediately prevents the LLM from continuing to search after finding the handle.
```
#### Stage 7: Verification
```
Developer: "I updated the prompt. How do I test it?"
You: "Let's verify the fix:
1. Launch the TUI dashboard:
hive tui
Then select your agent and press Enter to run it.
2. I'll check the logs after it runs..."
[Developer runs agent]
Developer: "It completed! Can you check if it worked?"
You: "Checking now..."
[Call query_runtime_log_details for the new run]
Results:
- Node: intake-collector
- Exit Status: success ✓
- Retry Count: 0 ✓
- Total Steps: 2 ✓
- No attention flags ✓
"Perfect! The fix worked. The intake-collector now:
- Completes successfully on first try
- No retries needed
- Calls set_output properly
Your agent should now work correctly!"
```
---
## Tips for Effective Debugging
1. **Always start with L1 logs** - Don't jump straight to detailed logs
2. **Focus on attention flags** - They highlight the real issues
3. **Compare verdict_feedback across steps** - Patterns reveal root causes
4. **Check tool error messages carefully** - They often contain the exact problem
5. **Consider the agent's goal** - Fixes should align with success criteria
6. **Test fixes immediately** - Quick verification prevents wasted effort
7. **Look for patterns across multiple runs** - One-time failures might be transient
## Common Pitfalls to Avoid
1. **Don't recommend code you haven't verified exists** - Always read files first
2. **Don't assume tool capabilities** - Check MCP server configs
3. **Don't ignore edge conditions** - Missing edges cause routing failures
4. **Don't overlook judge configuration** - Mismatched expectations cause retry loops
5. **Don't forget nullable_output_keys** - Optional inputs need explicit marking
---
## Storage Locations Reference
**New unified storage (default):**
- Logs: `~/.hive/agents/{agent_name}/sessions/session_YYYYMMDD_HHMMSS_{uuid}/logs/`
- State: `~/.hive/agents/{agent_name}/sessions/{session_id}/state.json`
- Conversations: `~/.hive/agents/{agent_name}/sessions/{session_id}/conversations/`
**Old storage (deprecated, still supported):**
- Logs: `~/.hive/agents/{agent_name}/runtime_logs/runs/{run_id}/`
The MCP tools automatically check both locations.
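For example, to eyeball the most recent sessions in the unified layout (agent name hypothetical):
```bash
ls -t ~/.hive/agents/twitter_outreach/sessions/ | head -5
```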
---
**Remember:** Your role is to be a debugging companion and thought partner. Guide the developer through the investigation, explain what you find, and provide actionable fixes. Don't just report errors - help the developer understand and solve them.
@@ -0,0 +1,385 @@
---
name: hive-patterns
description: Best practices, patterns, and examples for building goal-driven agents. Includes client-facing interaction, feedback edges, judge patterns, fan-out/fan-in, context management, and anti-patterns.
license: Apache-2.0
metadata:
author: hive
version: "2.0"
type: reference
part_of: hive
---
# Building Agents - Patterns & Best Practices
Design patterns, examples, and best practices for building robust goal-driven agents.
**Prerequisites:** A complete agent structure created with `hive-create`.
## Practical Example: Hybrid Workflow
How to build a node using both direct file writes and optional MCP validation:
```python
# 1. WRITE TO FILE FIRST (Primary - makes it visible)
node_code = '''
search_node = NodeSpec(
id="search-web",
node_type="event_loop",
input_keys=["query"],
output_keys=["search_results"],
system_prompt="Search the web for: {query}. Use web_search, then call set_output to store results.",
tools=["web_search"],
)
'''
Edit(
file_path="exports/research_agent/nodes/__init__.py",
old_string="# Nodes will be added here",
new_string=node_code
)
# 2. OPTIONALLY VALIDATE WITH MCP (Secondary - bookkeeping)
validation = mcp__agent-builder__test_node(
node_id="search-web",
test_input='{"query": "python tutorials"}',
mock_llm_response='{"search_results": [...mock results...]}'
)
```
**User experience:**
- Immediately sees node in their editor (from step 1)
- Gets validation feedback (from step 2)
- Can edit the file directly if needed
## Multi-Turn Interaction Patterns
For agents needing multi-turn conversations with users, use `client_facing=True` on event_loop nodes.
### Client-Facing Nodes
A client-facing node streams LLM output to the user and blocks for user input between conversational turns. This replaces the old pause/resume pattern.
```python
# Client-facing node with STEP 1/STEP 2 prompt pattern
intake_node = NodeSpec(
id="intake",
name="Intake",
description="Gather requirements from the user",
node_type="event_loop",
client_facing=True,
input_keys=["topic"],
output_keys=["research_brief"],
system_prompt="""\
You are an intake specialist.
**STEP 1 — Read and respond (text only, NO tool calls):**
1. Read the topic provided
2. If it's vague, ask 1-2 clarifying questions
3. If it's clear, confirm your understanding
**STEP 2 — After the user confirms, call set_output:**
- set_output("research_brief", "Clear description of what to research")
""",
)
# Internal node runs without user interaction
research_node = NodeSpec(
id="research",
name="Research",
description="Search and analyze sources",
node_type="event_loop",
input_keys=["research_brief"],
output_keys=["findings", "sources"],
system_prompt="Research the topic using web_search and web_scrape...",
tools=["web_search", "web_scrape", "load_data", "save_data"],
)
```
**How it works:**
- Client-facing nodes stream LLM text to the user and block for input after each response
- User input is injected via `node.inject_event(text)`
- When the LLM calls `set_output` to produce structured outputs, the judge evaluates and ACCEPTs
- Internal nodes (non-client-facing) run their entire loop without blocking
- `set_output` is a synthetic tool — a turn with only `set_output` calls (no real tools) triggers user input blocking
**STEP 1/STEP 2 pattern:** Always structure client-facing prompts with explicit phases. STEP 1 is text-only conversation. STEP 2 calls `set_output` after user confirmation. This prevents the LLM from calling `set_output` prematurely before the user responds.
### When to Use client_facing
| Scenario | client_facing | Why |
| ----------------------------------- | :-----------: | ---------------------- |
| Gathering user requirements | Yes | Need user input |
| Human review/approval checkpoint | Yes | Need human decision |
| Data processing (scanning, scoring) | No | Runs autonomously |
| Report generation | No | No user input needed |
| Final confirmation before action | Yes | Need explicit approval |
> **Legacy Note:** The `pause_nodes` / `entry_points` pattern still works for backward compatibility but `client_facing=True` is preferred for new agents.
## Edge-Based Routing and Feedback Loops
### Conditional Edge Routing
Multiple conditional edges from the same source replace the old `router` node type. Each edge checks a condition on the node's output.
```python
# Node with mutually exclusive outputs
review_node = NodeSpec(
    id="review",
    name="Review",
    node_type="event_loop",
    client_facing=True,
    output_keys=["approved_contacts", "redo_extraction"],
    nullable_output_keys=["approved_contacts", "redo_extraction"],
    max_node_visits=3,
    system_prompt="Present the contact list to the operator. If they approve, call set_output('approved_contacts', ...). If they want changes, call set_output('redo_extraction', 'true').",
)

# Forward edge (positive priority, evaluated first)
EdgeSpec(
    id="review-to-campaign",
    source="review",
    target="campaign-builder",
    condition=EdgeCondition.CONDITIONAL,
    condition_expr="output.get('approved_contacts') is not None",
    priority=1,
)

# Feedback edge (negative priority, evaluated after forward edges)
EdgeSpec(
    id="review-feedback",
    source="review",
    target="extractor",
    condition=EdgeCondition.CONDITIONAL,
    condition_expr="output.get('redo_extraction') is not None",
    priority=-1,
)
```
**Key concepts:**
- `nullable_output_keys`: Lists output keys that may remain unset. The node sets exactly one of the mutually exclusive keys per execution.
- `max_node_visits`: Must be >1 on the feedback target (extractor) so it can re-execute. Default is 1.
- `priority`: Positive = forward edge (evaluated first). Negative = feedback edge. The executor tries forward edges first; if none match, it falls back to feedback edges (see the sketch below).
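As a mental model, edge selection can be sketched as follows; this is a simplified stand-in for the executor, assuming `condition_expr` is evaluated against the node output bound to the name `output`, as in the expressions above:

```python
# Simplified stand-in for the executor's edge selection, not the real implementation.
def pick_next_edges(edges, output: dict) -> list:
    # NOTE: the framework presumably evaluates condition_expr more safely than eval()
    matching = [e for e in edges if eval(e.condition_expr, {"output": output})]
    forward = sorted((e for e in matching if e.priority > 0), key=lambda e: -e.priority)
    feedback = sorted((e for e in matching if e.priority < 0), key=lambda e: -e.priority)
    # Forward edges win; feedback edges only fire when no forward edge matched
    return forward or feedback
```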
### Routing Decision Table
| Pattern | Old Approach | New Approach |
| ---------------------- | ----------------------- | --------------------------------------------- |
| Conditional branching | `router` node | Conditional edges with `condition_expr` |
| Binary approve/reject | `pause_nodes` + resume | `client_facing=True` + `nullable_output_keys` |
| Loop-back on rejection | Manual entry_points | Feedback edge with `priority=-1` |
| Multi-way routing | Router with routes dict | Multiple conditional edges with priorities |
## Judge Patterns
**Core Principle: The judge is the SOLE mechanism for acceptance decisions.** Never add ad-hoc framework gating to compensate for LLM behavior. If the LLM calls `set_output` prematurely, fix the system prompt or use a custom judge. Anti-patterns to avoid:
- Output rollback logic
- `_user_has_responded` flags
- Premature set_output rejection
- Interaction protocol injection into system prompts
Judges control when an event_loop node's loop exits. Choose based on validation needs.
### Implicit Judge (Default)
When no judge is configured, the implicit judge ACCEPTs when:
- The LLM finishes its response with no tool calls
- All required output keys have been set via `set_output`
Best for simple nodes where "all outputs set" is sufficient validation.
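A sketch of that decision rule (illustrative, not the framework source):

```python
# Sketch of the implicit judge's decision rule -- illustrative only.
def implicit_verdict(finished_without_tool_calls: bool, missing_keys: list[str]) -> str:
    if finished_without_tool_calls and not missing_keys:
        return "ACCEPT"  # response done and every required output key is set
    return "RETRY"       # keep looping until both conditions hold
```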
### SchemaJudge
Validates outputs against a Pydantic model. Use when you need structural validation.
```python
from pydantic import BaseModel, ValidationError

class ScannerOutput(BaseModel):
    github_users: list[dict]  # Must be a list of user objects

class SchemaJudge:
    def __init__(self, output_model: type[BaseModel]):
        self._model = output_model

    # JudgeVerdict is the framework's verdict type (fields assumed: action, feedback)
    async def evaluate(self, context: dict) -> JudgeVerdict:
        missing = context.get("missing_keys", [])
        if missing:
            return JudgeVerdict(
                action="RETRY",
                feedback=f"Missing output keys: {missing}. Use set_output to provide them.",
            )
        try:
            self._model.model_validate(context["output_accumulator"])
            return JudgeVerdict(action="ACCEPT")
        except ValidationError as e:
            return JudgeVerdict(action="RETRY", feedback=str(e))
```
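Attaching the judge is then just instantiation; as a quick illustrative check against the context shape used above:

```python
import asyncio

judge = SchemaJudge(ScannerOutput)

# Example context as assumed above: no missing keys, accumulator holds the outputs
context = {"missing_keys": [], "output_accumulator": {"github_users": [{"login": "octocat"}]}}
print(asyncio.run(judge.evaluate(context)).action)  # -> "ACCEPT"
```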
### When to Use Which Judge
| Judge | Use When | Example |
| --------------- | ------------------------------------- | ---------------------- |
| Implicit (None) | Output keys are sufficient validation | Simple data extraction |
| SchemaJudge | Need structural validation of outputs | API response parsing |
| Custom | Domain-specific validation logic | Score must be 0.0-1.0 |
## Fan-Out / Fan-In (Parallel Execution)
Multiple ON_SUCCESS edges from the same source trigger parallel execution. All branches run concurrently via `asyncio.gather()`.
```python
# Scanner fans out to Profiler and Scorer in parallel
EdgeSpec(id="scanner-to-profiler", source="scanner", target="profiler",
         condition=EdgeCondition.ON_SUCCESS)
EdgeSpec(id="scanner-to-scorer", source="scanner", target="scorer",
         condition=EdgeCondition.ON_SUCCESS)

# Both fan in to Extractor
EdgeSpec(id="profiler-to-extractor", source="profiler", target="extractor",
         condition=EdgeCondition.ON_SUCCESS)
EdgeSpec(id="scorer-to-extractor", source="scorer", target="extractor",
         condition=EdgeCondition.ON_SUCCESS)
```
**Requirements:**
- Parallel event_loop nodes must have **disjoint output_keys** (no key written by both)
- Only one parallel branch may contain a `client_facing` node
- Fan-in node receives outputs from all completed branches in shared memory (see the sketch below)
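Conceptually the fan-out is just an `asyncio.gather` over the branches with a merge into shared memory. A hedged sketch, where `run_branch` stands in for executing one branch:

```python
import asyncio

# Illustrative fan-out/fan-in: run_branch is a placeholder for branch execution.
async def fan_out(run_branch, shared_memory: dict) -> dict:
    profiler_out, scorer_out = await asyncio.gather(
        run_branch("profiler"),
        run_branch("scorer"),
    )
    # Disjoint output_keys guarantee these merges cannot clobber each other
    shared_memory.update(profiler_out)
    shared_memory.update(scorer_out)
    return shared_memory  # the fan-in node reads all branch outputs from here
```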
## Context Management Patterns
### Tiered Compaction
EventLoopNode automatically manages context window usage with tiered compaction:
1. **Pruning** — Old tool results replaced with compact placeholders (zero-cost, no LLM call)
2. **Normal compaction** — LLM summarizes older messages
3. **Aggressive compaction** — Keeps only recent messages + summary
4. **Emergency** — Hard reset with tool history preservation (tier selection sketched below)
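A sketch of how the tier might be chosen from context-window usage. The thresholds below are invented for illustration; the framework's actual cutoffs may differ:

```python
# Thresholds are made up for illustration: the framework's real cutoffs may differ.
def choose_compaction_tier(used_tokens: int, window: int) -> str:
    usage = used_tokens / window
    if usage < 0.70:
        return "prune"       # swap old tool results for placeholders (no LLM call)
    if usage < 0.85:
        return "normal"      # LLM-summarize older messages
    if usage < 0.95:
        return "aggressive"  # keep only recent messages plus a summary
    return "emergency"       # hard reset, preserving tool history
```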
### Spillover Pattern
The framework automatically truncates large tool results and saves full content to a spillover directory. The LLM receives a truncation message with instructions to use `load_data` to read the full result.
For explicit data management, use the data tools (real MCP tools, not synthetic):
```python
# save_data, load_data, list_data_files, serve_file_to_user are real MCP tools
# data_dir is auto-injected by the framework — the LLM never sees it
# Saving large results
save_data(filename="sources.json", data=large_json_string)
# Reading with pagination (line-based offset/limit)
load_data(filename="sources.json", offset=0, limit=50)
# Listing available files
list_data_files()
# Serving a file to the user as a clickable link
serve_file_to_user(filename="report.html", label="Research Report")
```
Add data tools to nodes that handle large tool results:
```python
research_node = NodeSpec(
    ...
    tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
)
```
`data_dir` is a framework context parameter — auto-injected at call time. `GraphExecutor.execute()` sets it per-execution via `ToolRegistry.set_execution_context(data_dir=...)` (using `contextvars` for concurrency safety), ensuring it matches the session-scoped spillover directory.
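A minimal sketch of that `contextvars` mechanism (the tool body is illustrative, not the framework's implementation):

```python
import contextvars

# Illustrative: a per-execution data_dir that stays isolated across concurrent runs.
_data_dir: contextvars.ContextVar[str] = contextvars.ContextVar("data_dir")

def set_execution_context(data_dir: str) -> None:
    _data_dir.set(data_dir)  # called once per execution by the executor

def save_data_impl(filename: str, data: str) -> None:
    directory = _data_dir.get()  # resolves to the current execution's spillover dir
    with open(f"{directory}/{filename}", "w") as f:
        f.write(data)
```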
## Anti-Patterns
### What NOT to Do
- **Don't rely on `export_graph`** — Write files immediately, not at end
- **Don't hide code in session** — Write to files as components are approved
- **Don't wait to write files** — Agent visible from first step
- **Don't batch everything** — Write incrementally, one component at a time
- **Don't create too many thin nodes** — Prefer fewer, richer nodes (see below)
- **Don't add framework gating for LLM behavior** — Fix prompts or use judges instead
### Fewer, Richer Nodes
A common mistake is splitting work into too many small single-purpose nodes. Each node boundary requires serializing outputs, losing in-context information, and adding edge complexity.
| Bad (8 thin nodes) | Good (4 rich nodes) |
| ------------------- | ----------------------------------- |
| parse-query | intake (client-facing) |
| search-sources | research (search + fetch + analyze) |
| fetch-content | review (client-facing) |
| evaluate-sources | report (write + deliver) |
| synthesize-findings | |
| write-report | |
| quality-check | |
| save-report | |
**Why fewer nodes are better:**
- The LLM retains full context of its work within a single node
- A research node that searches, fetches, and analyzes keeps all source material in its conversation history
- Fewer edges means simpler graph and fewer failure points
- Data tools (`save_data`/`load_data`) handle context window limits within a single node
### MCP Tools - Correct Usage
**MCP tools OK for:**
- `test_node` — Validate node configuration with mock inputs
- `validate_graph` — Check graph structure
- `configure_loop` — Set event loop parameters
- `create_session` — Track session state for bookkeeping
**Just don't:** Use MCP as the primary construction method or rely on export_graph
## Error Handling Patterns
### Graceful Failure with Fallback
```python
edges = [
    # Success path
    EdgeSpec(id="api-success", source="api-call", target="process-results",
             condition=EdgeCondition.ON_SUCCESS),
    # Fallback on failure
    EdgeSpec(id="api-to-fallback", source="api-call", target="fallback-cache",
             condition=EdgeCondition.ON_FAILURE, priority=1),
    # Report if fallback also fails
    EdgeSpec(id="fallback-to-error", source="fallback-cache", target="report-error",
             condition=EdgeCondition.ON_FAILURE, priority=1),
]
```
## Handoff to Testing
When agent is complete, transition to testing phase:
### Pre-Testing Checklist
- [ ] Agent structure validates: `uv run python -m agent_name validate`
- [ ] All nodes defined in `nodes/__init__.py`
- [ ] All edges connect valid nodes with correct priorities
- [ ] Feedback edge targets have `max_node_visits > 1`
- [ ] Client-facing nodes have meaningful system prompts
- [ ] Agent can be imported: `from exports.agent_name import default_agent`
## Related Skills
- **hive-concepts** — Fundamental concepts (node types, edges, event loop architecture)
- **hive-create** — Step-by-step building process
- **hive-test** — Test and validate agents
- **hive** — Complete workflow orchestrator
---
**Remember: Agent is actively constructed, visible the whole time. No hidden state. No surprise exports. Just transparent, incremental file building.**
File diff suppressed because it is too large
@@ -0,0 +1,351 @@
# Example: Testing a YouTube Research Agent
This example walks through testing a YouTube research agent that finds relevant videos based on a topic.
## Prerequisites
- Agent built with hive-create skill at `exports/youtube-research/`
- Goal defined with success criteria and constraints
## Step 1: Load the Goal
First, load the goal that was defined during the Goal stage:
```json
{
  "id": "youtube-research",
  "name": "YouTube Research Agent",
  "description": "Find relevant YouTube videos on a given topic",
  "success_criteria": [
    {
      "id": "find_videos",
      "description": "Find 3-5 relevant videos",
      "metric": "video_count",
      "target": "3-5",
      "weight": 1.0
    },
    {
      "id": "relevance",
      "description": "Videos must be relevant to the topic",
      "metric": "relevance_score",
      "target": ">0.8",
      "weight": 0.8
    }
  ],
  "constraints": [
    {
      "id": "api_limits",
      "description": "Must not exceed YouTube API rate limits",
      "constraint_type": "hard",
      "category": "technical"
    },
    {
      "id": "content_safety",
      "description": "Must filter out inappropriate content",
      "constraint_type": "hard",
      "category": "safety"
    }
  ]
}
```
## Step 2: Get Constraint Test Guidelines
During the Goal stage (or early Eval), get test guidelines for constraints:
```python
result = generate_constraint_tests(
    goal_id="youtube-research",
    goal_json='<goal JSON above>',
    agent_path="exports/youtube-research"
)
```
**The result contains guidelines (not generated tests):**
- `output_file`: Where to write tests
- `file_header`: Imports and fixtures to use
- `test_template`: Format for test functions
- `constraints_formatted`: The constraints to test
- `test_guidelines`: Rules for writing tests
## Step 3: Write Constraint Tests
Using the guidelines, write tests directly with the Write tool:
```python
# Write constraint tests using the provided file_header and guidelines
Write(
    file_path="exports/youtube-research/tests/test_constraints.py",
    content='''
"""Constraint tests for youtube-research agent."""
import os

import pytest

from exports.youtube_research import default_agent

pytestmark = pytest.mark.skipif(
    not os.environ.get("ANTHROPIC_API_KEY") and not os.environ.get("MOCK_MODE"),
    reason="API key required for real testing.",
)

@pytest.mark.asyncio
async def test_constraint_api_limits_respected():
    """Verify API rate limits are not exceeded."""
    import time

    mock_mode = bool(os.environ.get("MOCK_MODE"))
    for i in range(10):
        result = await default_agent.run({"topic": f"test_{i}"}, mock_mode=mock_mode)
        time.sleep(0.1)
        # Should complete without rate limit errors
        assert "rate limit" not in str(result).lower()

@pytest.mark.asyncio
async def test_constraint_content_safety_filter():
    """Verify inappropriate content is filtered."""
    mock_mode = bool(os.environ.get("MOCK_MODE"))
    result = await default_agent.run({"topic": "general topic"}, mock_mode=mock_mode)
    for video in result.videos:
        assert video.safe_for_work is True
        assert video.age_restricted is False
''',
)
```
## Step 4: Get Success Criteria Test Guidelines
After the agent is built, get success criteria test guidelines:
```python
result = generate_success_tests(
    goal_id="youtube-research",
    goal_json='<goal JSON>',
    node_names="search_node,filter_node,rank_node,format_node",
    tool_names="youtube_search,video_details,channel_info",
    agent_path="exports/youtube-research"
)
```
## Step 5: Write Success Criteria Tests
Using the guidelines, write success criteria tests:
```python
Write(
    file_path="exports/youtube-research/tests/test_success_criteria.py",
    content='''
"""Success criteria tests for youtube-research agent."""
import os

import pytest

from exports.youtube_research import default_agent

pytestmark = pytest.mark.skipif(
    not os.environ.get("ANTHROPIC_API_KEY") and not os.environ.get("MOCK_MODE"),
    reason="API key required for real testing.",
)

@pytest.mark.asyncio
async def test_find_videos_happy_path():
    """Test finding videos for a common topic."""
    mock_mode = bool(os.environ.get("MOCK_MODE"))
    result = await default_agent.run({"topic": "machine learning"}, mock_mode=mock_mode)
    assert result.success
    assert 3 <= len(result.videos) <= 5
    assert all(v.title for v in result.videos)
    assert all(v.video_id for v in result.videos)

@pytest.mark.asyncio
async def test_find_videos_minimum_boundary():
    """Test at minimum threshold (3 videos)."""
    mock_mode = bool(os.environ.get("MOCK_MODE"))
    result = await default_agent.run({"topic": "niche topic xyz"}, mock_mode=mock_mode)
    assert len(result.videos) >= 3

@pytest.mark.asyncio
async def test_relevance_score_threshold():
    """Test relevance scoring meets threshold."""
    mock_mode = bool(os.environ.get("MOCK_MODE"))
    result = await default_agent.run({"topic": "python programming"}, mock_mode=mock_mode)
    for video in result.videos:
        assert video.relevance_score > 0.8

@pytest.mark.asyncio
async def test_find_videos_no_results_graceful():
    """Test graceful handling of no results."""
    mock_mode = bool(os.environ.get("MOCK_MODE"))
    result = await default_agent.run({"topic": "xyznonexistent123"}, mock_mode=mock_mode)
    # Should not crash, return empty or message
    assert result.videos == [] or result.message
''',
)
```
## Step 6: Run All Tests
Execute all tests:
```python
result = run_tests(
    goal_id="youtube-research",
    agent_path="exports/youtube-research",
    test_types='["all"]',
    parallel=4
)
```
**Results:**
```json
{
  "goal_id": "youtube-research",
  "overall_passed": false,
  "summary": {
    "total": 6,
    "passed": 5,
    "failed": 1,
    "pass_rate": "83.3%"
  },
  "duration_ms": 4521,
  "results": [
    {"test_id": "test_constraint_api_001", "passed": true, "duration_ms": 1234},
    {"test_id": "test_constraint_content_001", "passed": true, "duration_ms": 456},
    {"test_id": "test_success_001", "passed": true, "duration_ms": 789},
    {"test_id": "test_success_002", "passed": true, "duration_ms": 654},
    {"test_id": "test_success_003", "passed": true, "duration_ms": 543},
    {"test_id": "test_success_004", "passed": false, "duration_ms": 845,
     "error_category": "IMPLEMENTATION_ERROR",
     "error_message": "TypeError: 'NoneType' object has no attribute 'videos'"}
  ]
}
```
## Step 7: Debug the Failed Test
```python
result = debug_test(
    goal_id="youtube-research",
    test_name="test_find_videos_no_results_graceful",
    agent_path="exports/youtube-research"
)
```
**Debug Output:**
```json
{
  "test_id": "test_success_004",
  "test_name": "test_find_videos_no_results_graceful",
  "input": {"topic": "xyznonexistent123"},
  "expected": "Empty list or message",
  "actual": {"error": "TypeError: 'NoneType' object has no attribute 'videos'"},
  "passed": false,
  "error_message": "TypeError: 'NoneType' object has no attribute 'videos'",
  "error_category": "IMPLEMENTATION_ERROR",
  "stack_trace": "Traceback (most recent call last):\n  File \"filter_node.py\", line 42\n    for video in result.videos:\nTypeError: 'NoneType' object has no attribute 'videos'",
  "logs": [
    {"timestamp": "2026-01-20T10:00:01", "node": "search_node", "level": "INFO", "msg": "Searching for: xyznonexistent123"},
    {"timestamp": "2026-01-20T10:00:02", "node": "search_node", "level": "WARNING", "msg": "No results found"},
    {"timestamp": "2026-01-20T10:00:02", "node": "filter_node", "level": "ERROR", "msg": "NoneType error"}
  ],
  "runtime_data": {
    "execution_path": ["start", "search_node", "filter_node"],
    "node_outputs": {
      "search_node": null
    }
  },
  "suggested_fix": "Add null check in filter_node before accessing .videos attribute",
  "iteration_guidance": {
    "stage": "Agent",
    "action": "Fix the code in nodes/edges",
    "restart_required": false,
    "description": "The goal is correct, but filter_node doesn't handle null results from search_node."
  }
}
```
## Step 8: Iterate Based on Category
Since this is an **IMPLEMENTATION_ERROR**, we:
1. **Don't restart** the Goal → Agent → Eval flow
2. **Fix the agent** using hive-create skill:
- Modify `filter_node` to handle null results
3. **Re-run Eval** (tests only)
### Fix in hive-create:
```python
# Update the filter_node to handle null
add_node(
    node_id="filter_node",
    name="Filter Node",
    description="Filter and rank videos",
    node_type="function",
    input_keys=["search_results"],
    output_keys=["filtered_videos"],
    system_prompt="""
    Filter videos by relevance.
    IMPORTANT: Handle case where search_results is None or empty.
    Return empty list if no results.
    """
)
```
### Re-export and re-test:
```python
# Re-export the fixed agent
export_graph(path="exports/youtube-research")

# Re-run tests
result = run_tests(
    goal_id="youtube-research",
    agent_path="exports/youtube-research",
    test_types='["all"]'
)
```
**Updated Results:**
```json
{
  "goal_id": "youtube-research",
  "overall_passed": true,
  "summary": {
    "total": 6,
    "passed": 6,
    "failed": 0,
    "pass_rate": "100.0%"
  }
}
```
## Summary
1. **Got guidelines** for constraint tests during Goal stage
2. **Wrote** constraint tests using Write tool
3. **Got guidelines** for success criteria tests during Eval stage
4. **Wrote** success criteria tests using Write tool
5. **Ran** tests in parallel
6. **Debugged** the one failure
7. **Categorized** as IMPLEMENTATION_ERROR
8. **Fixed** the agent (not the goal)
9. **Re-ran** Eval only (didn't restart full flow)
10. **Passed** all tests
The agent is now validated and ready for production use.
@@ -1,30 +1,53 @@
---
name: agent-workflow
description: Complete workflow for building, implementing, and testing goal-driven agents. Orchestrates building-agents-* and testing-agent skills. Use when starting a new agent project, unsure which skill to use, or need end-to-end guidance.
name: hive
description: Complete workflow for building, implementing, and testing goal-driven agents. Orchestrates hive-* skills. Use when starting a new agent project, unsure which skill to use, or need end-to-end guidance.
license: Apache-2.0
metadata:
author: hive
version: "2.0"
type: workflow-orchestrator
orchestrates:
- building-agents-core
- building-agents-construction
- building-agents-patterns
- testing-agent
- hive-concepts
- hive-create
- hive-patterns
- hive-test
- hive-credentials
- hive-debugger
---
# Agent Development Workflow
**THIS IS AN EXECUTABLE WORKFLOW. DO NOT explore the codebase or read source files. ROUTE to the correct skill IMMEDIATELY.**
When this skill is loaded, **ALWAYS use the AskUserQuestion tool** to present options:
```
Use AskUserQuestion with these options:
- "Build a new agent" → Then invoke /hive-create
- "Test an existing agent" → Then invoke /hive-test
- "Learn agent concepts" → Then invoke /hive-concepts
- "Optimize agent design" → Then invoke /hive-patterns
- "Set up credentials" → Then invoke /hive-credentials
- "Debug a failing agent" → Then invoke /hive-debugger
- "Other" (please describe what you want to achieve)
```
**DO NOT:** Read source files, explore the codebase, search for code, or do any investigation before routing. The sub-skills handle all of that.
---
Complete Standard Operating Procedure (SOP) for building production-ready goal-driven agents.
## Overview
This workflow orchestrates specialized skills to take you from initial concept to production-ready agent:
1. **Understand Concepts** (5-10 min) → `/building-agents-core` (optional)
2. **Build Structure** (15-30 min) → `/building-agents-construction`
3. **Optimize Design** (10-15 min) → `/building-agents-patterns` (optional)
4. **Test & Validate** (20-40 min) → `/testing-agent`
1. **Understand Concepts** → `/hive-concepts` (optional)
2. **Build Structure** → `/hive-create`
3. **Optimize Design** → `/hive-patterns` (optional)
4. **Setup Credentials** → `/hive-credentials` (if agent uses tools requiring API keys)
5. **Test & Validate** → `/hive-test`
6. **Debug Issues** → `/hive-debugger` (if agent fails at runtime)
## When to Use This Workflow
@@ -35,24 +58,26 @@ Use this meta-skill when:
- Want consistent, repeatable agent builds
**Skip this workflow** if:
- You only need to test an existing agent → use `/testing-agent` directly
- You only need to test an existing agent → use `/hive-test` directly
- You know exactly which phase you're in → use specific skill directly
## Quick Decision Tree
```
"Need to understand agent concepts" → building-agents-core
"Build a new agent" → building-agents-construction
"Optimize my agent design" → building-agents-patterns
"Test my agent" → testing-agent
"Need to understand agent concepts" → hive-concepts
"Build a new agent" → hive-create
"Optimize my agent design" → hive-patterns
"Need client-facing nodes or feedback loops" → hive-patterns
"Set up API keys for my agent" → hive-credentials
"Test my agent" → hive-test
"My agent is failing/stuck/has errors" → hive-debugger
"Not sure what I need" → Read phases below, then decide
"Agent has structure but needs implementation" → See agent directory STATUS.md
```
## Phase 0: Understand Concepts (Optional)
**Duration**: 5-10 minutes
**Skill**: `/building-agents-core`
**Skill**: `/hive-concepts`
**Input**: Questions about agent architecture
### When to Use
@@ -60,12 +85,12 @@ Use this meta-skill when:
- First time building an agent
- Need to understand node types, edges, goals
- Want to validate tool availability
- Learning about pause/resume architecture
- Learning about event loop architecture and client-facing nodes
### What This Phase Provides
- Architecture overview (Python packages, not JSON)
- Core concepts (Goal, Node, Edge, Pause/Resume)
- Core concepts (Goal, Node, Edge, Event Loop, Judges)
- Tool discovery and validation procedures
- Workflow overview
@@ -73,9 +98,8 @@ Use this meta-skill when:
## Phase 1: Build Agent Structure
**Duration**: 15-30 minutes
**Skill**: `/building-agents-construction`
**Input**: User requirements ("Build an agent that...")
**Skill**: `/hive-create`
**Input**: User requirements ("Build an agent that...") or a template to start from
### What This Phase Does
@@ -99,9 +123,11 @@ Creates the complete agent architecture:
- ✅ `exports/agent_name/` package created
- ✅ Goal defined in agent.py
- ✅ 3-5 success criteria defined
- ✅ 1-5 constraints defined
- ✅ 5-10 nodes specified in nodes/__init__.py
- ✅ 8-15 edges connecting workflow
- ✅ Validated structure (passes `python -m agent_name validate`)
- ✅ Validated structure (passes `uv run python -m agent_name validate`)
- ✅ README.md with usage instructions
- ✅ CLI commands (info, validate, run, shell)
@@ -115,7 +141,7 @@ You're ready for Phase 2 when:
### Common Outputs
The building-agents-construction skill produces:
The hive-create skill produces:
```
exports/agent_name/
├── __init__.py (package exports)
@@ -135,53 +161,52 @@ exports/agent_name/
→ You may need to add Python functions or MCP tools (not covered by current skills)
**If want to optimize design:**
→ Proceed to Phase 1.5 (building-agents-patterns)
→ Proceed to Phase 1.5 (hive-patterns)
**If ready to test:**
→ Proceed to Phase 2
## Phase 1.5: Optimize Design (Optional)
**Duration**: 10-15 minutes
**Skill**: `/building-agents-patterns`
**Skill**: `/hive-patterns`
**Input**: Completed agent structure
### When to Use
- Want to add pause/resume functionality
- Want to add client-facing blocking or feedback edges
- Need judge patterns for output validation
- Want fan-out/fan-in (parallel execution)
- Need error handling patterns
- Want to optimize performance
- Need examples of complex routing
- Want best practices guidance
### What This Phase Provides
- Practical examples and patterns
- Pause/resume architecture
- Error handling strategies
- Client-facing interaction patterns
- Feedback edge routing with nullable output keys
- Judge patterns (implicit, SchemaJudge)
- Fan-out/fan-in parallel execution
- Context management and spillover patterns
- Anti-patterns to avoid
- Performance optimization techniques
**Skip this phase** if your agent design is straightforward.
## Phase 2: Test & Validate
**Duration**: 20-40 minutes
**Skill**: `/testing-agent`
**Skill**: `/hive-test`
**Input**: Working agent from Phase 1
### What This Phase Does
Creates comprehensive test suite:
- Constraint tests (verify hard requirements)
- Success criteria tests (measure goal achievement)
- Edge case tests (handle failures gracefully)
- Integration tests (end-to-end workflows)
Guides the creation and execution of a comprehensive test suite:
- Constraint tests
- Success criteria tests
- Edge case tests
- Integration tests
### Process
1. **Analyze agent** - Read goal, constraints, success criteria
2. **Generate tests** - Create pytest files in `exports/agent_name/tests/`
2. **Generate tests** - The calling agent writes pytest files in `exports/agent_name/tests/` using hive-test guidelines and templates
3. **User approval** - Review and approve each test
4. **Run evaluation** - Execute tests and collect results
5. **Debug failures** - Identify and fix issues
@@ -244,9 +269,9 @@ You're done when:
```
User: "Build an agent that monitors files"
→ Use /building-agents-construction
→ Use /hive-create
→ Agent structure created
→ Use /testing-agent
→ Use /hive-test
→ Tests created and passing
→ Done: Production-ready agent
```
@@ -255,19 +280,32 @@ User: "Build an agent that monitors files"
```
User: "Build an agent (first time)"
→ Use /building-agents-core (understand concepts)
→ Use /building-agents-construction (build structure)
→ Use /building-agents-patterns (optimize design)
→ Use /testing-agent (validate)
→ Use /hive-concepts (understand concepts)
→ Use /hive-create (build structure)
→ Use /hive-patterns (optimize design)
→ Use /hive-test (validate)
→ Done: Production-ready agent
```
### Pattern 1c: Build from Template
```
User: "Build an agent based on the deep research template"
→ Use /hive-create
→ Select "From a template" path
→ Pick template, name new agent
→ Review/modify goal, nodes, graph
→ Agent exported with customizations
→ Use /hive-test
→ Done: Customized agent
```
### Pattern 2: Test Existing Agent
```
User: "Test my agent at exports/my_agent"
→ Skip Phase 1
→ Use /testing-agent directly
→ Use /hive-test directly
→ Tests created
→ Done: Validated agent
```
@@ -276,58 +314,71 @@ User: "Test my agent at exports/my_agent"
```
User: "Build an agent"
→ Use /building-agents-construction (Phase 1)
→ Use /hive-create (Phase 1)
→ Implementation needed (see STATUS.md)
→ [User implements functions]
→ Use /testing-agent (Phase 2)
→ Use /hive-test (Phase 2)
→ Tests reveal bugs
→ [Fix bugs manually]
→ Re-run tests
→ Done: Working agent
```
### Pattern 4: Complex Agent with Patterns
### Pattern 4: Agent with Review Loops and HITL Checkpoints
```
User: "Build an agent with multi-turn conversations"
→ Use /building-agents-core (learn pause/resume)
→ Use /building-agents-construction (build structure)
→ Use /building-agents-patterns (implement pause/resume pattern)
→ Use /testing-agent (validate conversation flows)
→ Done: Complex conversational agent
User: "Build an agent with human review and feedback loops"
→ Use /hive-concepts (learn event loop, client-facing nodes)
→ Use /hive-create (build structure with feedback edges)
→ Use /hive-patterns (implement client-facing + feedback patterns)
→ Use /hive-test (validate review flows and edge routing)
→ Done: Agent with HITL checkpoints and review loops
```
## Skill Dependencies
```
agent-workflow (meta-skill)
hive (meta-skill)
├── building-agents-core (foundational)
│ ├── Architecture concepts
│ ├── Node/Edge/Goal definitions
├── hive-concepts (foundational)
│ ├── Architecture concepts (event loop, judges)
│ ├── Node types (event_loop, function)
│ ├── Edge routing and priority
│ ├── Tool discovery procedures
│ └── Workflow overview
├── building-agents-construction (procedural)
├── hive-create (procedural)
│ ├── Creates package structure
│ ├── Defines goal
│ ├── Adds nodes incrementally
│ ├── Connects edges
│ ├── Adds nodes (event_loop, function)
│ ├── Connects edges with priority routing
│ ├── Finalizes agent class
│ └── Requires: building-agents-core
│ └── Requires: hive-concepts
├── building-agents-patterns (reference)
│ ├── Best practices
│ ├── Pause/resume patterns
│ ├── Error handling
│ ├── Anti-patterns
│ └── Performance optimization
├── hive-patterns (reference)
│ ├── Client-facing interaction patterns
│ ├── Feedback edges and review loops
│ ├── Judge patterns (implicit, SchemaJudge)
│ ├── Fan-out/fan-in parallel execution
│ └── Context management and anti-patterns
└── testing-agent
├── Reads agent goal
├── Generates tests
├── Runs evaluation
└── Reports results
├── hive-credentials (utility)
├── Detects missing credentials
├── Offers auth method choices (Aden OAuth, direct API key)
├── Stores securely in ~/.hive/credentials
└── Validates with health checks
├── hive-test (validation)
│ ├── Reads agent goal
│ ├── Generates tests
│ ├── Runs evaluation
│ └── Reports results
└── hive-debugger (troubleshooting)
├── Monitors runtime logs (L1/L2/L3)
├── Identifies retry loops, tool failures
├── Categorizes issues (10 categories)
└── Provides fix recommendations
```
## Troubleshooting
@@ -337,13 +388,13 @@ agent-workflow (meta-skill)
- Check node IDs match between nodes/__init__.py and agent.py
- Verify all edges reference valid node IDs
- Ensure entry_node exists in nodes list
- Run: `PYTHONPATH=core:exports python -m agent_name validate`
- Run: `PYTHONPATH=exports uv run python -m agent_name validate`
### "Agent has structure but won't run"
- Check for STATUS.md or IMPLEMENTATION_GUIDE.md in agent directory
- Implementation may be needed (Python functions or MCP tools)
- This is expected - building-agents-construction creates structure, not implementation
- This is expected - hive-create creates structure, not implementation
- See implementation guide for completion options
### "Tests are failing"
@@ -351,9 +402,16 @@ agent-workflow (meta-skill)
- Review test output for specific failures
- Check agent goal and success criteria
- Verify constraints are met
- Use `/testing-agent` to debug and iterate
- Use `/hive-test` to debug and iterate
- Fix agent code and re-run tests
### "Agent is failing at runtime"
- Use `/hive-debugger` to analyze runtime logs
- The debugger identifies retry loops, tool failures, and stalled execution
- Get actionable fix recommendations with code changes
- Monitor the agent in real-time during TUI sessions
### "Not sure which phase I'm in"
Run these checks:
@@ -363,7 +421,7 @@ Run these checks:
ls exports/my_agent/agent.py
# Check if it validates
PYTHONPATH=core:exports python -m my_agent validate
PYTHONPATH=exports uv run python -m my_agent validate
# Check if tests exist
ls exports/my_agent/tests/
@@ -412,10 +470,10 @@ You're done with the workflow when:
## Additional Resources
- **building-agents-core**: See `.claude/skills/building-agents-core/SKILL.md`
- **building-agents-construction**: See `.claude/skills/building-agents-construction/SKILL.md`
- **building-agents-patterns**: See `.claude/skills/building-agents-patterns/SKILL.md`
- **testing-agent**: See `.claude/skills/testing-agent/SKILL.md`
- **hive-concepts**: See `.claude/skills/hive-concepts/SKILL.md`
- **hive-create**: See `.claude/skills/hive-create/SKILL.md`
- **hive-patterns**: See `.claude/skills/hive-patterns/SKILL.md`
- **hive-test**: See `.claude/skills/hive-test/SKILL.md`
- **Agent framework docs**: See `core/README.md`
- **Example agents**: See `exports/` directory
@@ -423,36 +481,46 @@ You're done with the workflow when:
This workflow provides a proven path from concept to production-ready agent:
1. **Learn** with `/building-agents-core` → Understand fundamentals (optional)
2. **Build** with `/building-agents-construction` → Get validated structure
3. **Optimize** with `/building-agents-patterns` → Apply best practices (optional)
4. **Test** with `/testing-agent` → Get verified functionality
1. **Learn** with `/hive-concepts` → Understand fundamentals (optional)
2. **Build** with `/hive-create` → Get validated structure
3. **Optimize** with `/hive-patterns` → Apply best practices (optional)
4. **Configure** with `/hive-credentials` → Set up API keys (if needed)
5. **Test** with `/hive-test` → Get verified functionality
6. **Debug** with `/hive-debugger` → Fix runtime issues (if needed)
The workflow is **flexible** - skip phases as needed, iterate freely, and adapt to your specific requirements. The goal is **production-ready agents** built with **consistent, repeatable processes**.
## Skill Selection Guide
**Choose building-agents-core when:**
**Choose hive-concepts when:**
- First time building agents
- Need to understand architecture
- Need to understand event loop architecture
- Validating tool availability
- Learning about node types and edges
- Learning about node types, edges, and judges
**Choose building-agents-construction when:**
**Choose hive-create when:**
- Actually building an agent
- Have clear requirements
- Ready to write code
- Want step-by-step guidance
- Want to start from an existing template and customize it
**Choose building-agents-patterns when:**
**Choose hive-patterns when:**
- Agent structure complete
- Need advanced patterns
- Implementing pause/resume
- Optimizing performance
- Need client-facing nodes or feedback edges
- Implementing review loops or fan-out/fan-in
- Want judge patterns or context management
- Want best practices
**Choose testing-agent when:**
**Choose hive-test when:**
- Agent structure complete
- Ready to validate functionality
- Need comprehensive test coverage
- Debugging agent behavior
- Testing feedback loops, output keys, or fan-out
**Choose hive-debugger when:**
- Agent is failing or stuck at runtime
- Seeing retry loops or escalations
- Tool calls are failing
- Need to understand why a node isn't completing
- Want real-time monitoring of agent execution
@@ -1,6 +1,6 @@
# Example: File Monitor Agent
This example shows the complete agent-workflow in action for building a file monitoring agent.
This example shows the complete /hive workflow in action for building a file monitoring agent.
## Initial Request
@@ -12,7 +12,7 @@ User: "Build an agent that monitors ~/Downloads and copies new files to ~/Docume
### Step 1: Create Structure
Agent invokes `/building-agents` skill and:
Agent invokes `/hive-create` skill and:
1. Creates `exports/file_monitor_agent/` package
2. Writes skeleton files (__init__.py, __main__.py, agent.py, etc.)
@@ -75,10 +75,10 @@ initialize → list → identify → check
### Step 5: Finalize
```bash
$ PYTHONPATH=core:exports python -m file_monitor_agent validate
$ PYTHONPATH=exports uv run python -m file_monitor_agent validate
✓ Agent is valid
$ PYTHONPATH=core:exports python -m file_monitor_agent info
$ PYTHONPATH=exports uv run python -m file_monitor_agent info
Agent: File Monitor & Copy Agent
Nodes: 7
Edges: 8
@@ -107,7 +107,7 @@ exports/file_monitor_agent/
### Step 1: Analyze Agent
Agent invokes `/testing-agent` skill and:
Agent invokes `/hive-test` skill and:
1. Reads goal from `exports/file_monitor_agent/agent.py`
2. Identifies 4 success criteria to test
@@ -131,7 +131,7 @@ Tests approved incrementally by user.
### Step 3: Run Tests
```bash
$ PYTHONPATH=core:exports pytest exports/file_monitor_agent/tests/
$ PYTHONPATH=exports uv run pytest exports/file_monitor_agent/tests/
test_constraints.py::test_preserves_originals PASSED
test_constraints.py::test_handles_errors PASSED
@@ -162,7 +162,7 @@ test_edge_cases.py::test_large_files PASSED
./RUN_AGENT.sh
# Or manually
PYTHONPATH=core:exports:aden-tools/src python -m file_monitor_agent run
PYTHONPATH=exports uv run python -m file_monitor_agent run
```
**Capabilities:**
File diff suppressed because it is too large
@@ -1,348 +0,0 @@
# Example: Testing a YouTube Research Agent
This example walks through testing a YouTube research agent that finds relevant videos based on a topic.
## Prerequisites
- Agent built with building-agents skill at `exports/youtube-research/`
- Goal defined with success criteria and constraints
## Step 1: Load the Goal
First, load the goal that was defined during the Goal stage:
```json
{
  "id": "youtube-research",
  "name": "YouTube Research Agent",
  "description": "Find relevant YouTube videos on a given topic",
  "success_criteria": [
    {
      "id": "find_videos",
      "description": "Find 3-5 relevant videos",
      "metric": "video_count",
      "target": "3-5",
      "weight": 1.0
    },
    {
      "id": "relevance",
      "description": "Videos must be relevant to the topic",
      "metric": "relevance_score",
      "target": ">0.8",
      "weight": 0.8
    }
  ],
  "constraints": [
    {
      "id": "api_limits",
      "description": "Must not exceed YouTube API rate limits",
      "constraint_type": "hard",
      "category": "technical"
    },
    {
      "id": "content_safety",
      "description": "Must filter out inappropriate content",
      "constraint_type": "hard",
      "category": "safety"
    }
  ]
}
```
## Step 2: Generate Constraint Tests
During the Goal stage (or early Eval), generate tests for constraints:
```python
result = generate_constraint_tests(
    goal_id="youtube-research",
    goal_json='<goal JSON above>'
)
```
**Generated tests (awaiting approval):**
```
┌─────────────────────────────────────────────────────────────────┐
│ Generated Constraint Tests (2 tests) │
├─────────────────────────────────────────────────────────────────┤
│ [1/2] test_constraint_api_limits_respected │
│ Constraint: api_limits │
│ Confidence: 88% │
│ │
│ def test_constraint_api_limits_respected(agent): │
│ """Verify API rate limits are not exceeded.""" │
│ import time │
│ for i in range(10): │
│ result = agent.run({"topic": f"test_{i}"}) │
│ time.sleep(0.1) │
│ # Should complete without rate limit errors │
│ assert "rate limit" not in str(result).lower() │
│ │
│ [a]pprove [r]eject [e]dit [s]kip │
├─────────────────────────────────────────────────────────────────┤
│ [2/2] test_constraint_content_safety_filter │
│ Constraint: content_safety │
│ Confidence: 91% │
│ │
│ def test_constraint_content_safety_filter(agent): │
│ """Verify inappropriate content is filtered.""" │
│ result = agent.run({"topic": "general topic"}) │
│ for video in result.videos: │
│ assert video.safe_for_work is True │
│ assert video.age_restricted is False │
│ │
│ [a]pprove [r]eject [e]dit [s]kip │
└─────────────────────────────────────────────────────────────────┘
```
## Step 3: Approve Constraint Tests
Review and approve each test:
```python
result = approve_tests(
    goal_id="youtube-research",
    approvals='''[
        {"test_id": "test_constraint_api_001", "action": "approve"},
        {"test_id": "test_constraint_content_001", "action": "approve"}
    ]'''  # triple-quoted so the JSON payload can span multiple lines
)
```
## Step 4: Generate Success Criteria Tests
After the agent is built, generate success criteria tests:
```python
result = generate_success_tests(
    goal_id="youtube-research",
    goal_json='<goal JSON>',
    node_names="search_node,filter_node,rank_node,format_node",
    tool_names="youtube_search,video_details,channel_info"
)
```
**Generated tests (awaiting approval):**
```
┌─────────────────────────────────────────────────────────────────┐
│ Generated Success Criteria Tests (4 tests) │
├─────────────────────────────────────────────────────────────────┤
│ [1/4] test_find_videos_happy_path │
│ Criteria: find_videos │
│ Confidence: 95% │
│ │
│ def test_find_videos_happy_path(agent): │
│ """Test finding videos for a common topic.""" │
│ result = agent.run({"topic": "machine learning"}) │
│ assert result.success │
│ assert 3 <= len(result.videos) <= 5 │
│ assert all(v.title for v in result.videos) │
│ assert all(v.video_id for v in result.videos) │
│ │
│ [a]pprove [r]eject [e]dit [s]kip │
├─────────────────────────────────────────────────────────────────┤
│ [2/4] test_find_videos_minimum_boundary │
│ Criteria: find_videos │
│ Confidence: 87% │
│ │
│ def test_find_videos_minimum_boundary(agent): │
│ """Test at minimum threshold (3 videos).""" │
│ result = agent.run({"topic": "niche topic xyz"}) │
│ assert len(result.videos) >= 3 │
│ │
│ [a]pprove [r]eject [e]dit [s]kip │
├─────────────────────────────────────────────────────────────────┤
│ [3/4] test_relevance_score_threshold │
│ Criteria: relevance │
│ Confidence: 92% │
│ │
│ def test_relevance_score_threshold(agent): │
│ """Test relevance scoring meets threshold.""" │
│ result = agent.run({"topic": "python programming"}) │
│ for video in result.videos: │
│ assert video.relevance_score > 0.8 │
│ │
│ [a]pprove [r]eject [e]dit [s]kip │
├─────────────────────────────────────────────────────────────────┤
│ [4/4] test_find_videos_no_results_graceful │
│ Criteria: find_videos │
│ Confidence: 84% │
│ │
│ def test_find_videos_no_results_graceful(agent): │
│ """Test graceful handling of no results.""" │
│ result = agent.run({"topic": "xyznonexistent123"}) │
│ # Should not crash, return empty or message │
│ assert result.videos == [] or result.message │
│ │
│ [a]pprove [r]eject [e]dit [s]kip │
└─────────────────────────────────────────────────────────────────┘
```
## Step 5: Approve Success Criteria Tests
```python
result = approve_tests(
    goal_id="youtube-research",
    approvals='''[
        {"test_id": "test_success_001", "action": "approve"},
        {"test_id": "test_success_002", "action": "approve"},
        {"test_id": "test_success_003", "action": "approve"},
        {"test_id": "test_success_004", "action": "approve"}
    ]'''  # triple-quoted so the JSON payload can span multiple lines
)
```
## Step 6: Run All Tests
Execute all approved tests:
```python
result = run_tests(
    goal_id="youtube-research",
    agent_path="exports/youtube-research",
    test_types='["all"]',
    parallel=4
)
```
**Results:**
```json
{
  "goal_id": "youtube-research",
  "overall_passed": false,
  "summary": {
    "total": 6,
    "passed": 5,
    "failed": 1,
    "pass_rate": "83.3%"
  },
  "duration_ms": 4521,
  "results": [
    {"test_id": "test_constraint_api_001", "passed": true, "duration_ms": 1234},
    {"test_id": "test_constraint_content_001", "passed": true, "duration_ms": 456},
    {"test_id": "test_success_001", "passed": true, "duration_ms": 789},
    {"test_id": "test_success_002", "passed": true, "duration_ms": 654},
    {"test_id": "test_success_003", "passed": true, "duration_ms": 543},
    {"test_id": "test_success_004", "passed": false, "duration_ms": 845,
     "error_category": "IMPLEMENTATION_ERROR",
     "error_message": "TypeError: 'NoneType' object has no attribute 'videos'"}
  ]
}
```
## Step 7: Debug the Failed Test
```python
result = debug_test(
    goal_id="youtube-research",
    test_id="test_success_004"
)
```
**Debug Output:**
```json
{
  "test_id": "test_success_004",
  "test_name": "test_find_videos_no_results_graceful",
  "input": {"topic": "xyznonexistent123"},
  "expected": "Empty list or message",
  "actual": {"error": "TypeError: 'NoneType' object has no attribute 'videos'"},
  "passed": false,
  "error_message": "TypeError: 'NoneType' object has no attribute 'videos'",
  "error_category": "IMPLEMENTATION_ERROR",
  "stack_trace": "Traceback (most recent call last):\n  File \"filter_node.py\", line 42\n    for video in result.videos:\nTypeError: 'NoneType' object has no attribute 'videos'",
  "logs": [
    {"timestamp": "2026-01-20T10:00:01", "node": "search_node", "level": "INFO", "msg": "Searching for: xyznonexistent123"},
    {"timestamp": "2026-01-20T10:00:02", "node": "search_node", "level": "WARNING", "msg": "No results found"},
    {"timestamp": "2026-01-20T10:00:02", "node": "filter_node", "level": "ERROR", "msg": "NoneType error"}
  ],
  "runtime_data": {
    "execution_path": ["start", "search_node", "filter_node"],
    "node_outputs": {
      "search_node": null
    }
  },
  "suggested_fix": "Add null check in filter_node before accessing .videos attribute",
  "iteration_guidance": {
    "stage": "Agent",
    "action": "Fix the code in nodes/edges",
    "restart_required": false,
    "description": "The goal is correct, but filter_node doesn't handle null results from search_node."
  }
}
```
## Step 8: Iterate Based on Category
Since this is an **IMPLEMENTATION_ERROR**, we:
1. **Don't restart** the Goal → Agent → Eval flow
2. **Fix the agent** using building-agents skill:
- Modify `filter_node` to handle null results
3. **Re-run Eval** (tests only)
### Fix in building-agents:
```python
# Update the filter_node to handle null
add_node(
    node_id="filter_node",
    name="Filter Node",
    description="Filter and rank videos",
    node_type="function",
    input_keys=["search_results"],
    output_keys=["filtered_videos"],
    system_prompt="""
    Filter videos by relevance.
    IMPORTANT: Handle case where search_results is None or empty.
    Return empty list if no results.
    """
)
```
### Re-export and re-test:
```python
# Re-export the fixed agent
export_graph(path="exports/youtube-research")

# Re-run tests
result = run_tests(
    goal_id="youtube-research",
    agent_path="exports/youtube-research",
    test_types='["all"]'
)
```
**Updated Results:**
```json
{
  "goal_id": "youtube-research",
  "overall_passed": true,
  "summary": {
    "total": 6,
    "passed": 6,
    "failed": 0,
    "pass_rate": "100.0%"
  }
}
```
## Summary
1. **Generated** constraint tests during Goal stage
2. **Generated** success criteria tests during Eval stage
3. **Approved** all tests with user review
4. **Ran** tests in parallel
5. **Debugged** the one failure
6. **Categorized** as IMPLEMENTATION_ERROR
7. **Fixed** the agent (not the goal)
8. **Re-ran** Eval only (didn't restart full flow)
9. **Passed** all tests
The agent is now validated and ready for production use.
@@ -0,0 +1,145 @@
# Triage Issue Skill
Analyze a GitHub issue, verify claims against the codebase, and close invalid issues with a technical response.
## Trigger
User provides a GitHub issue URL or number, e.g.:
- `/triage-issue 1970`
- `/triage-issue https://github.com/adenhq/hive/issues/1970`
## Workflow
### Step 1: Fetch Issue Details
```bash
gh issue view <number> --repo adenhq/hive --json title,body,state,labels,author
```
Extract:
- Title
- Body (the claim/bug report)
- Current state
- Labels
- Author
If issue is already closed, inform user and stop.
### Step 2: Analyze the Claim
Read the issue body and identify:
1. **The core claim** - What is the user asserting?
2. **Technical specifics** - File paths, function names, code snippets mentioned
3. **Expected behavior** - What do they think should happen?
4. **Severity claimed** - Security issue? Bug? Feature request?
### Step 3: Investigate the Codebase
For each technical claim:
1. Find the referenced code using Grep/Glob/Read
2. Understand the actual implementation
3. Check if the claim accurately describes the behavior
4. Look for related tests, documentation, or design decisions
### Step 4: Evaluate Validity
Categorize the issue as one of:
| Category | Action |
|----------|--------|
| **Valid Bug** | Do NOT close. Inform user this is a real issue. |
| **Valid Feature Request** | Do NOT close. Suggest labeling appropriately. |
| **Misunderstanding** | Prepare technical explanation for why behavior is correct. |
| **Fundamentally Flawed** | Prepare critique explaining the technical impossibility or design rationale. |
| **Duplicate** | Find the original issue and prepare duplicate notice. |
| **Incomplete** | Prepare request for more information. |
### Step 5: Draft Response
For issues to be closed, draft a response that:
1. **Acknowledges the concern** - Don't be dismissive
2. **Explains the actual behavior** - With code references
3. **Provides technical rationale** - Why it works this way
4. **References industry standards** - If applicable
5. **Offers alternatives** - If there's a better approach for the user
Use this template:
```markdown
## Analysis
[Brief summary of what was investigated]
## Technical Details
[Explanation with code references]
## Why This Is Working As Designed
[Rationale]
## Recommendation
[What the user should do instead, if applicable]
---
*This issue was reviewed and closed by the maintainers.*
```
### Step 6: User Review
Present the draft to the user with:
```
## Issue #<number>: <title>
**Claim:** <summary of claim>
**Finding:** <valid/invalid/misunderstanding/etc>
**Draft Response:**
<the markdown response>
---
Do you want me to post this comment and close the issue?
```
Use AskUserQuestion with options:
- "Post and close" - Post comment, close issue
- "Edit response" - Let user modify the response
- "Skip" - Don't take action
### Step 7: Execute Action
If user approves:
```bash
# Post comment
gh issue comment <number> --repo adenhq/hive --body "<response>"
# Close issue
gh issue close <number> --repo adenhq/hive --reason "not planned"
```
Report success with link to the issue.
## Important Guidelines
1. **Never close valid issues** - If there's any merit to the claim, don't close it
2. **Be respectful** - The reporter took time to file the issue
3. **Be technical** - Provide code references and evidence
4. **Be educational** - Help them understand, don't just dismiss
5. **Check twice** - Make sure you understand the code before declaring something invalid
6. **Consider edge cases** - Maybe their environment reveals a real issue
## Example Critiques
### Security Misunderstanding
> "The claim that secrets are exposed in plaintext misunderstands the encryption architecture. While `SecretStr` is used for logging protection, actual encryption is provided by Fernet (AES-128-CBC) at the storage layer. The code path is: serialize → encrypt → write. Only encrypted bytes touch disk."
### Impossible Request
> "The requested feature would require [X] which violates [fundamental constraint]. This is not a limitation of our implementation but a fundamental property of [technology/protocol]."
### Already Handled
> "This scenario is already handled by [code reference]. The reporter may be using an older version or misconfigured environment."
+20
View File
@@ -0,0 +1,20 @@
{
  "mcpServers": {
    "agent-builder": {
      "command": "python",
      "args": ["-m", "framework.mcp.agent_builder_server"],
      "cwd": "core",
      "env": {
        "PYTHONPATH": "../tools/src"
      }
    },
    "tools": {
      "command": "python",
      "args": ["mcp_server.py", "--stdio"],
      "cwd": "tools",
      "env": {
        "PYTHONPATH": "src"
      }
    }
  }
}
@@ -0,0 +1 @@
../../.claude/skills/hive
@@ -0,0 +1 @@
../../.claude/skills/hive-concepts
@@ -0,0 +1 @@
../../.claude/skills/hive-create
@@ -0,0 +1 @@
../../.claude/skills/hive-credentials
@@ -0,0 +1 @@
../../.claude/skills/hive-patterns
@@ -0,0 +1 @@
../../.claude/skills/hive-test
@@ -0,0 +1,18 @@
This project uses ruff for Python linting and formatting.
Rules:
- Line length: 100 characters
- Python target: 3.11+
- Use double quotes for strings
- Sort imports with isort (ruff I rules): stdlib, third-party, first-party (framework), local
- Combine as-imports
- Use type hints on all function signatures
- Use `from __future__ import annotations` for modern type syntax
- Raise exceptions with `from` in except blocks (B904)
- No unused imports (F401), no unused variables (F841)
- Prefer list/dict/set comprehensions over map/filter (C4)
Run `make lint` to auto-fix, `make check` to verify without modifying files.
Run `make format` to apply ruff formatting.
The ruff config lives in core/pyproject.toml under [tool.ruff].
@@ -11,6 +11,9 @@ indent_size = 2
insert_final_newline = true
trim_trailing_whitespace = true
[*.py]
indent_size = 4
[*.md]
trim_trailing_whitespace = false
@@ -0,0 +1,124 @@
# Normalize line endings for all text files
* text=auto
# Source code
*.py text diff=python
*.js text
*.ts text
*.jsx text
*.tsx text
*.json text
*.yaml text
*.yml text
*.toml text
*.ini text
*.cfg text
# Shell scripts (must use LF)
*.sh text eol=lf
quickstart.sh text eol=lf
# PowerShell scripts (Windows-friendly)
*.ps1 text eol=lf
*.psm1 text eol=lf
# Windows batch files (must use CRLF)
*.bat text eol=crlf
*.cmd text eol=crlf
# Documentation
*.md text
*.txt text
*.rst text
*.tex text
# Configuration files
.gitignore text
.gitattributes text
.editorconfig text
Dockerfile text
docker-compose.yml text
requirements*.txt text
pyproject.toml text
setup.py text
setup.cfg text
MANIFEST.in text
LICENSE text
README* text
CHANGELOG* text
CONTRIBUTING* text
CODE_OF_CONDUCT* text
# Web files
*.html text
*.css text
*.scss text
*.sass text
# Data files
*.xml text
*.csv text
*.sql text
# Graphics (binary)
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.ico binary
*.svg binary
*.eps binary
*.bmp binary
*.tif binary
*.tiff binary
# Archives (binary)
*.zip binary
*.tar binary
*.gz binary
*.bz2 binary
*.7z binary
*.rar binary
# Python compiled (binary)
*.pyc binary
*.pyo binary
*.pyd binary
*.whl binary
*.egg binary
# System libraries (binary)
*.so binary
*.dll binary
*.dylib binary
*.lib binary
*.a binary
# Documents (binary)
*.pdf binary
*.doc binary
*.docx binary
*.ppt binary
*.pptx binary
*.xls binary
*.xlsx binary
# Fonts (binary)
*.ttf binary
*.otf binary
*.woff binary
*.woff2 binary
*.eot binary
# Audio/Video (binary)
*.mp3 binary
*.mp4 binary
*.wav binary
*.avi binary
*.mov binary
*.flv binary
# Database files (binary)
*.db binary
*.sqlite binary
*.sqlite3 binary
@@ -8,7 +8,6 @@
/hive/ @adenhq/maintainers
# Infrastructure
/docker-compose*.yml @adenhq/maintainers
/.github/ @adenhq/maintainers
# Documentation
@@ -1,9 +1,10 @@
---
name: Bug Report
about: Report a bug to help us improve
title: '[Bug]: '
labels: bug
title: "[Bug]: "
labels: bug, enhancement
assignees: ''
---
## Describe the Bug
@@ -1,9 +1,10 @@
---
name: Feature Request
about: Suggest a new feature or enhancement
title: '[Feature]: '
title: "[Feature]: "
labels: enhancement
assignees: ''
---
## Problem Statement
@@ -0,0 +1,71 @@
---
name: Integration Request
about: Suggest a new integration
title: "[Integration]:"
labels: ''
assignees: ''
---
## Service
Name and brief description of the service and what it enables agents to do.
**Description:** [e.g., "API key for Slack Bot" — short one-liner for the credential spec]
## Credential Identity
- **credential_id:** [e.g., `slack`]
- **env_var:** [e.g., `SLACK_BOT_TOKEN`]
- **credential_key:** [e.g., `access_token`, `api_key`, `bot_token`]
## Tools
Tool function names that require this credential:
- [e.g., `slack_send_message`]
- [e.g., `slack_list_channels`]
## Auth Methods
- **Direct API key supported:** Yes / No
- **Aden OAuth supported:** Yes / No
If Aden OAuth is supported, describe the OAuth scopes/permissions required.
## How to Get the Credential
Link where users obtain the key/token:
[e.g., https://api.slack.com/apps]
Step-by-step instructions:
1. Go to ...
2. Create a ...
3. Select scopes/permissions: ...
4. Copy the key/token
## Health Check
A lightweight API call to validate the credential (no writes, no charges).
- **Endpoint:** [e.g., `https://slack.com/api/auth.test`]
- **Method:** [e.g., `GET` or `POST`]
- **Auth header:** [e.g., `Authorization: Bearer {token}` or `X-Api-Key: {key}`]
- **Parameters (if any):** [e.g., `?limit=1`]
- **200 means:** [e.g., key is valid]
- **401 means:** [e.g., invalid or expired]
- **429 means:** [e.g., rate limited but key is valid]
## Credential Group
Does this require multiple credentials configured together? (e.g., Google Custom Search needs
both an API key and a CSE ID)
- [ ] No, single credential
- [ ] Yes — list the other credential IDs in the group:
## Additional Context
Links to API docs, rate limits, free tier availability, or anything else relevant.
@@ -0,0 +1,34 @@
name: Auto-close duplicate issues
description: Auto-closes issues that are duplicates of existing issues

on:
  schedule:
    - cron: "0 */6 * * *"
  workflow_dispatch:

jobs:
  auto-close-duplicates:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    permissions:
      contents: read
      issues: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Setup Bun
        uses: oven-sh/setup-bun@v2
        with:
          bun-version: latest

      - name: Run auto-close-duplicates tests
        run: bun test scripts/auto-close-duplicates

      - name: Auto-close duplicate issues
        run: bun run scripts/auto-close-duplicates.ts
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITHUB_REPOSITORY_OWNER: ${{ github.repository_owner }}
          GITHUB_REPOSITORY_NAME: ${{ github.event.repository.name }}
          STATSIG_API_KEY: ${{ secrets.STATSIG_API_KEY }}
@@ -21,21 +21,48 @@ jobs:
uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: 'pip'
- name: Install uv
uses: astral-sh/setup-uv@v4
- name: Install dependencies
run: |
cd core
pip install -e .
pip install -r requirements-dev.txt
run: uv sync --project core --group dev
- name: Run ruff
- name: Ruff lint
run: |
cd core
ruff check .
uv run --project core ruff check core/
uv run --project core ruff check tools/
- name: Ruff format
run: |
uv run --project core ruff format --check core/
uv run --project core ruff format --check tools/
test:
name: Test Python Framework
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-latest, windows-latest]
steps:
- uses: actions/checkout@v4
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Install uv
uses: astral-sh/setup-uv@v4
- name: Install dependencies and run tests
run: |
cd core
uv sync
uv run pytest tests/ -v
test-tools:
name: Test Tools
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
@@ -44,23 +71,20 @@ jobs:
uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: 'pip'
- name: Install dependencies
run: |
cd core
pip install -e .
pip install -r requirements-dev.txt
- name: Install uv
uses: astral-sh/setup-uv@v4
- name: Run tests
- name: Install dependencies and run tests
run: |
cd core
pytest tests/ -v
cd tools
uv sync --extra dev
uv run pytest tests/ -v
validate:
name: Validate Agent Exports
runs-on: ubuntu-latest
needs: [lint, test]
needs: [lint, test, test-tools]
steps:
- uses: actions/checkout@v4
@@ -68,20 +92,43 @@ jobs:
uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: 'pip'
- name: Install uv
uses: astral-sh/setup-uv@v4
- name: Install dependencies
run: |
cd core
pip install -e .
pip install -r requirements-dev.txt
uv sync
- name: Validate exported agents
run: |
# Check that agent exports have valid structure
for agent_dir in exports/*/; do
if [ ! -d "exports" ]; then
echo "No exports/ directory found, skipping validation"
exit 0
fi
shopt -s nullglob
agent_dirs=(exports/*/)
shopt -u nullglob
if [ ${#agent_dirs[@]} -eq 0 ]; then
echo "No agent directories in exports/, skipping validation"
exit 0
fi
validated=0
for agent_dir in "${agent_dirs[@]}"; do
if [ -f "$agent_dir/agent.json" ]; then
echo "Validating $agent_dir"
python -c "import json; json.load(open('$agent_dir/agent.json'))"
uv run python -c "import json; json.load(open('$agent_dir/agent.json'))"
validated=$((validated + 1))
fi
done
if [ "$validated" -eq 0 ]; then
echo "No agent.json files found in exports/, skipping validation"
else
echo "Validated $validated agent(s)"
fi
+103
@@ -0,0 +1,103 @@
name: Issue Triage
on:
issues:
types: [opened]
jobs:
triage:
runs-on: ubuntu-latest
timeout-minutes: 10
permissions:
contents: read
issues: write
id-token: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 1
- name: Triage and check for duplicates
uses: anthropics/claude-code-action@v1
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
github_token: ${{ secrets.GITHUB_TOKEN }}
allowed_non_write_users: "*"
prompt: |
Analyze this new issue and perform triage tasks.
Issue: #${{ github.event.issue.number }}
Repository: ${{ github.repository }}
## Your Tasks:
### 1. Get issue details
Use mcp__github__get_issue to get the full details of issue #${{ github.event.issue.number }}
### 2. Check for duplicates
Search for similar existing issues using mcp__github__search_issues with relevant keywords from the issue title and body.
Criteria for duplicates:
- Same bug or error being reported
- Same feature request (even if worded differently)
- Same question being asked
- Issues describing the same root problem
If you find a duplicate:
- Add a comment using EXACTLY this format (required for auto-close to work):
"Found a possible duplicate of #<issue_number>: <brief explanation of why it's a duplicate>"
- Do NOT apply the "duplicate" label yet (the auto-close script will add it after 12 hours if no objections)
- Suggest the user react with a thumbs-down if they disagree
### 3. Check for Low-Quality / AI Spam
Analyze the issue quality. We are receiving many low-effort, AI-generated spam issues.
Flag the issue as INVALID if it matches these criteria:
- **Vague/Generic**: Title is "Fix bug" or "Error" without specific context.
- **Hallucinated**: Refers to files or features that do not exist in this repo.
- **Template Filler**: Body contains "Insert description here" or unrelated gibberish.
- **Low Effort**: No reproduction steps, no logs, only 1-2 sentences.
If identified as spam/low-quality:
- Add the "invalid" label.
- Add a comment:
"This issue has been automatically flagged as low-quality or potentially AI-generated spam. It lacks specific details (logs, reproduction steps, file references) required for us to help. Please open a new issue following the template exactly if this is a legitimate request."
- Do NOT proceed to other steps.
### 4. Check for invalid issues (General)
If the issue is not spam but still lacks information:
- Add the "invalid" label
- Comment asking for clarification
### 5. Categorize with labels (if NOT a duplicate or spam)
Apply appropriate labels based on the issue content. Use ONLY these labels:
- bug: Something isn't working
- enhancement: New feature or request
- question: Further information is requested
- documentation: Improvements or additions to documentation
- good first issue: Good for newcomers (if issue is well-defined and small scope)
- help wanted: Extra attention is needed (if issue needs community input)
- backlog: Tracked for the future, but not currently planned or prioritized
### 6. Estimate size (if NOT a duplicate, spam, or invalid)
Apply exactly ONE size label to help contributors match their capacity to the task:
- "size: small": Docs, typos, single-file fixes, config changes
- "size: medium": Bug fixes with tests, adding a single tool, changes within one package
- "size: large": Cross-package changes (core + tools), new modules, complex logic, architectural refactors
You may apply multiple labels if appropriate (e.g., "bug", "size: small", and "good first issue").
## Tools Available:
- mcp__github__get_issue: Get issue details
- mcp__github__search_issues: Search for similar issues
- mcp__github__list_issues: List recent issues if needed
- mcp__github__add_issue_comment: Add a comment
- mcp__github__update_issue: Add labels
- mcp__github__get_issue_comments: Get existing comments
Be thorough but efficient. Focus on accurate categorization and finding true duplicates.
claude_args: |
--model claude-haiku-4-5-20251001
--allowedTools "mcp__github__get_issue,mcp__github__search_issues,mcp__github__list_issues,mcp__github__add_issue_comment,mcp__github__update_issue,mcp__github__get_issue_comments"
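The duplicate-comment format in the prompt above is a hard contract: the auto-close workflow's script (`scripts/auto-close-duplicates.ts`) looks for it before labeling and closing. That script is TypeScript; the Python sketch below only illustrates the expected comment shape, not the actual parser.

```python
# Illustration of the duplicate-comment contract the triage prompt mandates.
# The real parser is scripts/auto-close-duplicates.ts; this is not its code.
import re

DUPLICATE_RE = re.compile(r"Found a possible duplicate of #(\d+)")

comment = "Found a possible duplicate of #1234: same root problem as the OAuth refresh bug"
match = DUPLICATE_RE.search(comment)
if match:
    print(f"candidate duplicate of #{match.group(1)}")  # candidate duplicate of #1234
```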
+204
@@ -0,0 +1,204 @@
name: PR Check Command
on:
issue_comment:
types: [created]
jobs:
check-pr:
# Only run on PR comments that start with /check
if: github.event.issue.pull_request && startsWith(github.event.comment.body, '/check')
runs-on: ubuntu-latest
permissions:
pull-requests: write
issues: write
checks: write
statuses: write
steps:
- name: Check PR requirements
uses: actions/github-script@v7
with:
script: |
const prNumber = context.payload.issue.number;
console.log(`Triggered by /check comment on PR #${prNumber}`);
// Fetch PR data
const { data: pr } = await github.rest.pulls.get({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
});
const prBody = pr.body || '';
const prTitle = pr.title || '';
const prAuthor = pr.user.login;
const headSha = pr.head.sha;
// Create a check run in progress
const { data: checkRun } = await github.rest.checks.create({
owner: context.repo.owner,
repo: context.repo.repo,
name: 'check-requirements',
head_sha: headSha,
status: 'in_progress',
started_at: new Date().toISOString(),
});
// Extract issue numbers
const issuePattern = /(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)?\s*#(\d+)/gi;
const allText = `${prTitle} ${prBody}`;
const matches = [...allText.matchAll(issuePattern)];
const issueNumbers = [...new Set(matches.map(m => parseInt(m[1], 10)))];
console.log(`PR #${prNumber}:`);
console.log(` Author: ${prAuthor}`);
console.log(` Found issue references: ${issueNumbers.length > 0 ? issueNumbers.join(', ') : 'none'}`);
if (issueNumbers.length === 0) {
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**Missing:** No linked issue found.
**To fix:**
1. Create or find an existing issue for this work
2. Assign yourself to the issue
3. Re-open this PR and add \`Fixes #123\` in the description
**Why is this required?** See #472 for details.`;
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
// Update check run to failure
await github.rest.checks.update({
owner: context.repo.owner,
repo: context.repo.repo,
check_run_id: checkRun.id,
status: 'completed',
conclusion: 'failure',
completed_at: new Date().toISOString(),
output: {
title: 'Missing linked issue',
summary: 'PR must reference an issue (e.g., `Fixes #123`)',
},
});
core.setFailed('PR must reference an issue');
return;
}
// Check if PR author is assigned to any linked issue
let issueWithAuthorAssigned = null;
let issuesWithoutAuthor = [];
for (const issueNum of issueNumbers) {
try {
const { data: issue } = await github.rest.issues.get({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: issueNum,
});
const assigneeLogins = (issue.assignees || []).map(a => a.login);
if (assigneeLogins.includes(prAuthor)) {
issueWithAuthorAssigned = issueNum;
console.log(` Issue #${issueNum} has PR author ${prAuthor} as assignee`);
break;
} else {
issuesWithoutAuthor.push({
number: issueNum,
assignees: assigneeLogins
});
console.log(` Issue #${issueNum} assignees: ${assigneeLogins.length > 0 ? assigneeLogins.join(', ') : 'none'}`);
}
} catch (error) {
console.log(` Issue #${issueNum} not found`);
}
}
if (!issueWithAuthorAssigned) {
const issueList = issuesWithoutAuthor.map(i =>
`#${i.number} (assignees: ${i.assignees.length > 0 ? i.assignees.join(', ') : 'none'})`
).join(', ');
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**PR Author:** @${prAuthor}
**Found issues:** ${issueList}
**Problem:** The PR author must be assigned to the linked issue.
**To fix:**
1. Assign yourself (@${prAuthor}) to one of the linked issues
2. Re-open this PR
**Why is this required?** See #472 for details.`;
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
// Update check run to failure
await github.rest.checks.update({
owner: context.repo.owner,
repo: context.repo.repo,
check_run_id: checkRun.id,
status: 'completed',
conclusion: 'failure',
completed_at: new Date().toISOString(),
output: {
title: 'PR author not assigned to issue',
summary: `PR author @${prAuthor} must be assigned to one of the linked issues: ${issueList}`,
},
});
core.setFailed('PR author must be assigned to the linked issue');
} else {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: `✅ PR requirements met! Issue #${issueWithAuthorAssigned} has @${prAuthor} as assignee.`,
});
// Update check run to success
await github.rest.checks.update({
owner: context.repo.owner,
repo: context.repo.repo,
check_run_id: checkRun.id,
status: 'completed',
conclusion: 'success',
completed_at: new Date().toISOString(),
output: {
title: 'Requirements met',
summary: `Issue #${issueWithAuthorAssigned} has @${prAuthor} as assignee.`,
},
});
console.log(`PR requirements met!`);
}
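The same issue-reference pattern appears in all three PR-requirements workflows. For anyone testing PR titles and bodies locally, here is a rough Python transcription of the JavaScript regex; it should behave the same on typical inputs, but the JS version in the workflows remains authoritative.

```python
# Approximate Python port of the issue-reference regex used by the PR
# requirement workflows; the JavaScript version in the workflows is canonical.
import re

ISSUE_RE = re.compile(
    r"(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)?\s*#(\d+)", re.IGNORECASE
)


def linked_issues(title: str, body: str) -> list[int]:
    """Return de-duplicated issue numbers referenced in a PR title/body."""
    seen: list[int] = []
    for m in ISSUE_RE.finditer(f"{title} {body}"):
        n = int(m.group(1))
        if n not in seen:
            seen.append(n)
    return seen


print(linked_issues("fix(auth): refresh token", "Fixes #123, also touches #456"))
# -> [123, 456]
```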
@@ -0,0 +1,138 @@
name: PR Requirements Backfill
on:
workflow_dispatch:
jobs:
check-all-open-prs:
runs-on: ubuntu-latest
permissions:
pull-requests: write
issues: write
steps:
- name: Check all open PRs
uses: actions/github-script@v7
with:
script: |
const { data: pullRequests } = await github.rest.pulls.list({
owner: context.repo.owner,
repo: context.repo.repo,
state: 'open',
per_page: 100,
});
console.log(`Found ${pullRequests.length} open PRs`);
for (const pr of pullRequests) {
const prNumber = pr.number;
const prBody = pr.body || '';
const prTitle = pr.title || '';
const prAuthor = pr.user.login;
console.log(`\nChecking PR #${prNumber}: ${prTitle}`);
// Extract issue numbers from body and title
const issuePattern = /(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)?\s*#(\d+)/gi;
const allText = `${prTitle} ${prBody}`;
const matches = [...allText.matchAll(issuePattern)];
const issueNumbers = [...new Set(matches.map(m => parseInt(m[1], 10)))];
console.log(` Found issue references: ${issueNumbers.length > 0 ? issueNumbers.join(', ') : 'none'}`);
if (issueNumbers.length === 0) {
console.log(` ❌ No linked issue - closing PR`);
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**Missing:** No linked issue found.
**To fix:**
1. Create or find an existing issue for this work
2. Assign yourself to the issue
3. Re-open this PR and add \`Fixes #123\` in the description`;
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
continue;
}
// Check if any linked issue has the PR author as assignee
let issueWithAuthorAssigned = null;
let issuesWithoutAuthor = [];
for (const issueNum of issueNumbers) {
try {
const { data: issue } = await github.rest.issues.get({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: issueNum,
});
const assigneeLogins = (issue.assignees || []).map(a => a.login);
if (assigneeLogins.includes(prAuthor)) {
issueWithAuthorAssigned = issueNum;
break;
} else {
issuesWithoutAuthor.push({
number: issueNum,
assignees: assigneeLogins
});
}
} catch (error) {
console.log(` Issue #${issueNum} not found or inaccessible`);
}
}
if (!issueWithAuthorAssigned) {
const issueList = issuesWithoutAuthor.map(i =>
`#${i.number} (assignees: ${i.assignees.length > 0 ? i.assignees.join(', ') : 'none'})`
).join(', ');
console.log(` ❌ PR author not assigned to any linked issue - closing PR`);
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**PR Author:** @${prAuthor}
**Found issues:** ${issueList}
**Problem:** The PR author must be assigned to the linked issue.
**To fix:**
1. Assign yourself (@${prAuthor}) to one of the linked issues
2. Re-open this PR`;
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
} else {
console.log(` ✅ PR requirements met! Issue #${issueWithAuthorAssigned} has ${prAuthor} as assignee.`);
}
}
console.log('\nBackfill complete!');
+189
View File
@@ -0,0 +1,189 @@
name: PR Requirements Check
on:
pull_request_target:
types: [opened, reopened, edited, synchronize]
jobs:
check-requirements:
runs-on: ubuntu-latest
permissions:
pull-requests: write
issues: write
steps:
- name: Check PR has linked issue with assignee
uses: actions/github-script@v7
with:
script: |
const pr = context.payload.pull_request;
const prNumber = pr.number;
const prBody = pr.body || '';
const prTitle = pr.title || '';
const prLabels = (pr.labels || []).map(l => l.name);
// Allow micro-fix and documentation PRs without a linked issue
const isMicroFix = prLabels.includes('micro-fix') || /micro-fix/i.test(prTitle);
const isDocumentation = prLabels.includes('documentation') || /\bdocs?\b/i.test(prTitle);
if (isMicroFix || isDocumentation) {
const reason = isMicroFix ? 'micro-fix' : 'documentation';
console.log(`PR #${prNumber} is a ${reason}, skipping issue requirement.`);
return;
}
// Extract issue numbers from body and title
// Matches: fixes #123, closes #123, resolves #123, or plain #123
const issuePattern = /(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)?\s*#(\d+)/gi;
const allText = `${prTitle} ${prBody}`;
const matches = [...allText.matchAll(issuePattern)];
const issueNumbers = [...new Set(matches.map(m => parseInt(m[1], 10)))];
console.log(`PR #${prNumber}:`);
console.log(` Found issue references: ${issueNumbers.length > 0 ? issueNumbers.join(', ') : 'none'}`);
if (issueNumbers.length === 0) {
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**Missing:** No linked issue found.
**To fix:**
1. Create or find an existing issue for this work
2. Assign yourself to the issue
3. Re-open this PR and add \`Fixes #123\` in the description
**Exception:** To bypass this requirement, you can:
- Add the \`micro-fix\` label or include \`micro-fix\` in your PR title for trivial fixes
- Add the \`documentation\` label or include \`doc\`/\`docs\` in your PR title for documentation changes
**Micro-fix requirements** (must meet ALL):
| Qualifies | Disqualifies |
|-----------|--------------|
| < 20 lines changed | Any functional bug fix |
| Typos & Documentation & Linting | Refactoring for "clean code" |
| No logic/API/DB changes | New features (even tiny ones) |
**Why is this required?** See #472 for details.`;
const comments = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
});
const botComment = comments.data.find(
(c) => c.user.type === 'Bot' && c.body.includes('PR Closed - Requirements Not Met')
);
if (!botComment) {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
}
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
core.setFailed('PR must reference an issue');
return;
}
// Check if any linked issue has the PR author as assignee
const prAuthor = pr.user.login;
let issueWithAuthorAssigned = null;
let issuesWithoutAuthor = [];
for (const issueNum of issueNumbers) {
try {
const { data: issue } = await github.rest.issues.get({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: issueNum,
});
const assigneeLogins = (issue.assignees || []).map(a => a.login);
if (assigneeLogins.includes(prAuthor)) {
issueWithAuthorAssigned = issueNum;
console.log(` Issue #${issueNum} has PR author ${prAuthor} as assignee`);
break;
} else {
issuesWithoutAuthor.push({
number: issueNum,
assignees: assigneeLogins
});
console.log(` Issue #${issueNum} assignees: ${assigneeLogins.length > 0 ? assigneeLogins.join(', ') : 'none'} (PR author: ${prAuthor})`);
}
} catch (error) {
console.log(` Issue #${issueNum} not found or inaccessible`);
}
}
if (!issueWithAuthorAssigned) {
const issueList = issuesWithoutAuthor.map(i =>
`#${i.number} (assignees: ${i.assignees.length > 0 ? i.assignees.join(', ') : 'none'})`
).join(', ');
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**PR Author:** @${prAuthor}
**Found issues:** ${issueList}
**Problem:** The PR author must be assigned to the linked issue.
**To fix:**
1. Assign yourself (@${prAuthor}) to one of the linked issues
2. Re-open this PR
**Exception:** To bypass this requirement, you can:
- Add the \`micro-fix\` label or include \`micro-fix\` in your PR title for trivial fixes
- Add the \`documentation\` label or include \`doc\`/\`docs\` in your PR title for documentation changes
**Micro-fix requirements** (must meet ALL):
| Qualifies | Disqualifies |
|-----------|--------------|
| < 20 lines changed | Any functional bug fix |
| Typos & Documentation & Linting | Refactoring for "clean code" |
| No logic/API/DB changes | New features (even tiny ones) |
**Why is this required?** See #472 for details.`;
const comments = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
});
const botComment = comments.data.find(
(c) => c.user.type === 'Bot' && c.body.includes('PR Closed - Requirements Not Met')
);
if (!botComment) {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
}
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
core.setFailed('PR author must be assigned to the linked issue');
} else {
console.log(`PR requirements met! Issue #${issueWithAuthorAssigned} has ${prAuthor} as assignee.`);
}
+5 -4
@@ -21,18 +21,19 @@ jobs:
uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: 'pip'
- name: Install uv
uses: astral-sh/setup-uv@v4
- name: Install dependencies
run: |
cd core
pip install -e .
pip install -r requirements-dev.txt
uv sync
- name: Run tests
run: |
cd core
pytest tests/ -v
uv run pytest tests/ -v
- name: Generate changelog
id: changelog
+9 -2
@@ -66,5 +66,12 @@ temp/
exports/*
core/.agent-builder-sessions/*
.agent-builder-sessions/
.agent-builder-sessions/*
.claude/settings.local.json
.venv
docs/github-issues/*
core/tests/*dumps/*
screenshots/*
+6 -12
@@ -1,20 +1,14 @@
{
"mcpServers": {
"agent-builder": {
"command": "python",
"args": ["-m", "framework.mcp.agent_builder_server"],
"cwd": "core",
"env": {
"PYTHONPATH": "../tools/src"
}
"command": "uv",
"args": ["run", "-m", "framework.mcp.agent_builder_server"],
"cwd": "core"
},
"tools": {
"command": "python",
"args": ["mcp_server.py", "--stdio"],
"cwd": "tools",
"env": {
"PYTHONPATH": "src"
}
"command": "uv",
"args": ["run", "mcp_server.py", "--stdio"],
"cwd": "tools"
}
}
}
+18
@@ -0,0 +1,18 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.15.0
hooks:
- id: ruff
name: ruff lint (core)
args: [--fix]
files: ^core/
- id: ruff
name: ruff lint (tools)
args: [--fix]
files: ^tools/
- id: ruff-format
name: ruff format (core)
files: ^core/
- id: ruff-format
name: ruff format (tools)
files: ^tools/
+1
@@ -0,0 +1 @@
3.11
+7
@@ -0,0 +1,7 @@
{
"recommendations": [
"charliermarsh.ruff",
"editorconfig.editorconfig",
"ms-python.python"
]
}
-40
@@ -1,40 +0,0 @@
# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]
### Added
- Initial project structure
- React frontend (honeycomb) with Vite and TypeScript
- Node.js backend (hive) with Express and TypeScript
- Docker Compose configuration for local development
- Configuration system via `config.yaml`
- GitHub Actions CI/CD workflows
- Comprehensive documentation
### Changed
- N/A
### Deprecated
- N/A
### Removed
- N/A
### Fixed
- N/A
### Security
- N/A
## [0.1.0] - 2025-01-13
### Added
- Initial release
[Unreleased]: https://github.com/adenhq/hive/compare/v0.1.0...HEAD
[0.1.0]: https://github.com/adenhq/hive/releases/tag/v0.1.0
+76 -24
@@ -1,34 +1,70 @@
# Contributing to Aden Agent Framework
Thank you for your interest in contributing to the Aden Agent Framework! This document provides guidelines and information for contributors.
Thank you for your interest in contributing to the Aden Agent Framework! This document provides guidelines and information for contributors. We're especially looking for help building tools, integrations ([check #2805](https://github.com/adenhq/hive/issues/2805)), and example agents for the framework. If you're interested in extending its functionality, this is the perfect place to start.
## Code of Conduct
By participating in this project, you agree to abide by our [Code of Conduct](CODE_OF_CONDUCT.md).
By participating in this project, you agree to abide by our [Code of Conduct](docs/CODE_OF_CONDUCT.md).
## Issue Assignment Policy
To prevent duplicate work and respect contributors' time, we require issue assignment before submitting PRs.
### How to Claim an Issue
1. **Find an Issue:** Browse existing issues or create a new one
2. **Claim It:** Leave a comment (e.g., *"I'd like to work on this!"*)
3. **Wait for Assignment:** A maintainer will assign you within 24 hours. Issues with reproducible steps or proposals are prioritized.
4. **Submit Your PR:** Once assigned, you're ready to contribute
> **Note:** PRs for unassigned issues may be delayed or closed if someone else was already assigned.
### Exceptions (No Assignment Needed)
You may submit PRs without prior assignment for:
- **Documentation:** Fixing typos or clarifying instructions — add the `documentation` label or include `doc`/`docs` in your PR title to bypass the linked issue requirement
- **Micro-fixes:** Add the `micro-fix` label or include `micro-fix` in your PR title to bypass the linked issue requirement. Micro-fixes must meet **all** qualification criteria:
| Qualifies | Disqualifies |
|-----------|--------------|
| < 20 lines changed | Any functional bug fix |
| Typos & Documentation & Linting | Refactoring for "clean code" |
| No logic/API/DB changes | New features (even tiny ones) |
## Getting Started
1. Fork the repository
2. Clone your fork: `git clone https://github.com/YOUR_USERNAME/hive.git`
3. Create a feature branch: `git checkout -b feature/your-feature-name`
4. Make your changes
5. Run tests: `PYTHONPATH=core:exports python -m pytest`
3. Add the upstream repository: `git remote add upstream https://github.com/adenhq/hive.git`
4. Sync with upstream to ensure you're starting from the latest code:
```bash
git fetch upstream
git checkout main
git merge upstream/main
```
5. Create a feature branch: `git checkout -b feature/your-feature-name`
6. Make your changes
7. Run checks and tests:
```bash
make check # Lint and format checks (ruff check + ruff format --check on core/ and tools/)
make test # Core tests (cd core && pytest tests/ -v)
```
8. Commit your changes following our commit conventions
9. Push to your fork and submit a Pull Request
## Development Setup
```bash
# Install Python packages
./scripts/setup-python.sh
# Verify installation
python -c "import framework; import aden_tools; print('✓ Setup complete')"
# Install Claude Code skills (optional)
# Install Python packages and verify setup
./quickstart.sh
```
> **Windows Users:**
> If you are on native Windows, it is recommended to use **WSL (Windows Subsystem for Linux)**.
> Alternatively, make sure to run PowerShell or Git Bash with Python 3.11+ installed, and disable "App Execution Aliases" in Windows settings.
> **Tip:** Installing Claude Code skills is optional for running existing agents, but required if you plan to **build new agents**.
## Commit Convention
We follow [Conventional Commits](https://www.conventionalcommits.org/):
@@ -59,11 +95,12 @@ docs(readme): update installation instructions
## Pull Request Process
1. Update documentation if needed
2. Add tests for new functionality
3. Ensure all tests pass
4. Update the CHANGELOG.md if applicable
5. Request review from maintainers
1. **Get assigned to the issue first** (see [Issue Assignment Policy](#issue-assignment-policy))
2. Update documentation if needed
3. Add tests for new functionality
4. Ensure `make check` and `make test` pass
5. Update the CHANGELOG.md if applicable
6. Request review from maintainers
### PR Title Format
@@ -75,7 +112,7 @@ feat(component): add new feature description
## Project Structure
- `core/` - Core framework (agent runtime, graph executor, protocols)
- `tools/` - MCP Tools Package (19 tools for agent capabilities)
- `tools/` - MCP Tools Package (tools for agent capabilities)
- `exports/` - Agent packages and examples
- `docs/` - Documentation
- `scripts/` - Build and utility scripts
@@ -92,17 +129,32 @@ feat(component): add new feature description
## Testing
```bash
# Run all tests for the framework
cd core && python -m pytest
> **Note:** When testing agents in `exports/`, always set PYTHONPATH:
>
> ```bash
> PYTHONPATH=exports uv run python -m agent_name test
> ```
# Run all tests for tools
cd tools && python -m pytest
```bash
# Run lint and format checks (mirrors CI lint job)
make check
# Run core framework tests (mirrors CI test job)
make test
# Or run tests directly
cd core && pytest tests/ -v
# Run tests for a specific agent
PYTHONPATH=core:exports python -m agent_name test
PYTHONPATH=exports uv run python -m agent_name test
```
> **CI also validates** that all exported agent JSON files (`exports/*/agent.json`) are well-formed JSON. Ensure your agent exports are valid before submitting.
## Contributor License Agreement
By submitting a Pull Request, you agree that your contributions will be licensed under the Aden Agent Framework license.
## Questions?
Feel free to open an issue for questions or join our [Discord community](https://discord.com/invite/MXE49hrKDk).
-347
@@ -1,347 +0,0 @@
# Agent Development Environment Setup
Complete setup guide for building and running goal-driven agents with the Aden Agent Framework.
## Quick Setup
```bash
# Run the automated setup script
./scripts/setup-python.sh
```
This will:
- Check Python version (requires 3.10+, recommends 3.11+)
- Install the core framework package (`framework`)
- Install the tools package (`aden_tools`)
- Fix package compatibility issues (openai + litellm)
- Verify all installations
## Manual Setup (Alternative)
If you prefer to set up manually or the script fails:
### 1. Install Core Framework
```bash
cd core
pip install -e .
```
### 2. Install Tools Package
```bash
cd tools
pip install -e .
```
### 3. Upgrade OpenAI Package
```bash
# litellm requires openai >= 1.0.0
pip install --upgrade "openai>=1.0.0"
```
### 4. Verify Installation
```bash
python -c "import framework; print('✓ framework OK')"
python -c "import aden_tools; print('✓ aden_tools OK')"
python -c "import litellm; print('✓ litellm OK')"
```
## Requirements
### Python Version
- **Minimum:** Python 3.10
- **Recommended:** Python 3.11 or 3.12
- **Tested on:** Python 3.11, 3.12, 3.13
### System Requirements
- pip (latest version)
- 2GB+ RAM
- Internet connection (for LLM API calls)
### API Keys (Optional)
For running agents with real LLMs:
```bash
export ANTHROPIC_API_KEY="your-key-here"
```
## Running Agents
All agent commands must be run from the project root with `PYTHONPATH` set:
```bash
# From /home/timothy/oss/hive/ directory
PYTHONPATH=core:exports python -m agent_name COMMAND
```
### Example: Support Ticket Agent
```bash
# Validate agent structure
PYTHONPATH=core:exports python -m support_ticket_agent validate
# Show agent information
PYTHONPATH=core:exports python -m support_ticket_agent info
# Run agent with input
PYTHONPATH=core:exports python -m support_ticket_agent run --input '{
"ticket_content": "My login is broken. Error 401.",
"customer_id": "CUST-123",
"ticket_id": "TKT-456"
}'
# Run in mock mode (no LLM calls)
PYTHONPATH=core:exports python -m support_ticket_agent run --mock --input '{...}'
```
### Example: Other Agents
```bash
# Market Research Agent
PYTHONPATH=core:exports python -m market_research_agent info
# Outbound Sales Agent
PYTHONPATH=core:exports python -m outbound_sales_agent validate
# Personal Assistant Agent
PYTHONPATH=core:exports python -m personal_assistant_agent run --input '{...}'
```
## Building New Agents
Use Claude Code CLI with the agent building skills:
### 1. Install Skills (One-time)
```bash
./quickstart.sh
```
This installs:
- `/building-agents` - Build new agents
- `/testing-agent` - Test agents
### 2. Build an Agent
```
claude> /building-agents
```
Follow the prompts to:
1. Define your agent's goal
2. Design the workflow nodes
3. Connect edges
4. Generate the agent package
### 3. Test Your Agent
```
claude> /testing-agent
```
Creates comprehensive test suites for your agent.
## Troubleshooting
### "ModuleNotFoundError: No module named 'framework'"
**Solution:** Install the core package:
```bash
cd core && pip install -e .
```
### "ModuleNotFoundError: No module named 'aden_tools'"
**Solution:** Install the tools package:
```bash
cd tools && pip install -e .
```
Or run the setup script:
```bash
./scripts/setup-python.sh
```
### "ModuleNotFoundError: No module named 'openai.\_models'"
**Cause:** Outdated `openai` package (0.27.x) incompatible with `litellm`
**Solution:** Upgrade openai:
```bash
pip install --upgrade "openai>=1.0.0"
```
### "No module named 'support_ticket_agent'"
**Cause:** Not running from project root or missing PYTHONPATH
**Solution:** Ensure you're in `/home/timothy/oss/hive/` and use:
```bash
PYTHONPATH=core:exports python -m support_ticket_agent validate
```
### Agent imports fail with "broken installation"
**Symptom:** `pip list` shows packages pointing to non-existent directories
**Solution:** Reinstall packages properly:
```bash
# Remove broken installations
pip uninstall -y framework tools aden-tools
# Reinstall correctly
cd /home/timothy/oss/hive
./scripts/setup-python.sh
```
## Package Structure
The Hive framework consists of three Python packages:
```
hive/
├── core/ # Core framework (runtime, graph executor, LLM providers)
│ ├── framework/
│ ├── pyproject.toml
│ └── requirements.txt
├── tools/ # Tools and MCP servers
│ ├── src/
│ │ └── aden_tools/ # Actual package location
│ ├── pyproject.toml
│ └── README.md
└── exports/ # Agent packages (your agents go here)
├── support_ticket_agent/
├── market_research_agent/
├── outbound_sales_agent/
└── personal_assistant_agent/
```
### Why PYTHONPATH is Required
The packages are installed in **editable mode** (`pip install -e`), which means:
- `framework` and `aden_tools` are globally importable (no PYTHONPATH needed)
- `exports` is NOT installed as a package (PYTHONPATH required; see the sketch below)
This design allows agents in `exports/` to be:
- Developed independently
- Version controlled separately
- Deployed as standalone packages
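As a concrete illustration of the mechanism (standard Python path handling, not framework API), setting `PYTHONPATH=core:exports` is equivalent to prepending those directories to `sys.path` before importing an agent package:

```python
# What PYTHONPATH=core:exports does, expressed in-process. Illustrative only.
import sys

sys.path[:0] = ["core", "exports"]  # same effect as the PYTHONPATH prefix

import support_ticket_agent  # resolvable only because exports/ is now on sys.path
```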
## Development Workflow
### 1. Setup (Once)
```bash
./scripts/setup-python.sh
```
### 2. Build Agent (Claude Code)
```
claude> /building-agents
Enter goal: "Build an agent that processes customer support tickets"
```
### 3. Validate Agent
```bash
PYTHONPATH=core:exports python -m support_ticket_agent validate
```
### 4. Test Agent
```
claude> /testing-agent
```
### 5. Run Agent
```bash
PYTHONPATH=core:exports python -m support_ticket_agent run --input '{...}'
```
## IDE Setup
### VSCode
Add to `.vscode/settings.json`:
```json
{
"python.analysis.extraPaths": [
"${workspaceFolder}/core",
"${workspaceFolder}/exports"
],
"python.autoComplete.extraPaths": [
"${workspaceFolder}/core",
"${workspaceFolder}/exports"
]
}
```
### PyCharm
1. Open Project Settings → Project Structure
2. Mark `core` as Sources Root
3. Mark `exports` as Sources Root
## Environment Variables
### Required for LLM Operations
```bash
export ANTHROPIC_API_KEY="sk-ant-..."
```
### Optional Configuration
```bash
# Credentials storage location (default: ~/.aden/credentials)
export ADEN_CREDENTIALS_PATH="/custom/path"
# Agent storage location (default: /tmp)
export AGENT_STORAGE_PATH="/custom/storage"
```
## Additional Resources
- **Framework Documentation:** [core/README.md](core/README.md)
- **Tools Documentation:** [tools/README.md](tools/README.md)
- **Example Agents:** [exports/](exports/)
- **Agent Building Guide:** [.claude/skills/building-agents-construction/SKILL.md](.claude/skills/building-agents-construction/SKILL.md)
- **Testing Guide:** [.claude/skills/testing-agent/SKILL.md](.claude/skills/testing-agent/SKILL.md)
## Contributing
When contributing agent packages:
1. Place agents in `exports/agent_name/`
2. Follow the standard agent structure (see existing agents)
3. Include README.md with usage instructions
4. Add tests if using `/testing-agent`
5. Document required environment variables
## Support
- **Issues:** https://github.com/adenhq/hive/issues
- **Discord:** https://discord.com/invite/MXE49hrKDk
- **Documentation:** https://docs.adenhq.com/
+28
@@ -0,0 +1,28 @@
.PHONY: lint format check test install-hooks help
help: ## Show this help
@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | \
awk 'BEGIN {FS = ":.*?## "}; {printf " \033[36m%-15s\033[0m %s\n", $$1, $$2}'
lint: ## Run ruff linter and formatter (with auto-fix)
cd core && ruff check --fix .
cd tools && ruff check --fix .
cd core && ruff format .
cd tools && ruff format .
format: ## Run ruff formatter
cd core && ruff format .
cd tools && ruff format .
check: ## Run all checks without modifying files (CI-safe)
cd core && ruff check .
cd tools && ruff check .
cd core && ruff format --check .
cd tools && ruff format --check .
test: ## Run all tests
cd core && uv run python -m pytest tests/ -v
install-hooks: ## Install pre-commit hooks
uv pip install pre-commit
pre-commit install
+255 -226
@@ -1,27 +1,32 @@
<p align="center">
<img width="100%" alt="Hive Banner" src="https://storage.googleapis.com/aden-prod-assets/website/aden-title-card.png" />
<img width="100%" alt="Hive Banner" src="https://github.com/user-attachments/assets/a027429b-5d3c-4d34-88e4-0feaeaabbab3" />
</p>
<p align="center">
<a href="README.md">English</a> |
<a href="README.zh-CN.md">简体中文</a> |
<a href="README.es.md">Español</a> |
<a href="README.pt.md">Português</a> |
<a href="README.ja.md">日本語</a> |
<a href="README.ru.md">Русский</a>
<a href="docs/i18n/zh-CN.md">简体中文</a> |
<a href="docs/i18n/es.md">Español</a> |
<a href="docs/i18n/hi.md">हिन्दी</a> |
<a href="docs/i18n/pt.md">Português</a> |
<a href="docs/i18n/ja.md">日本語</a> |
<a href="docs/i18n/ru.md">Русский</a> |
<a href="docs/i18n/ko.md">한국어</a>
</p>
<p align="center">
<a href="https://github.com/adenhq/hive/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="Apache 2.0 License" /></a>
<a href="https://www.ycombinator.com/companies/aden"><img src="https://img.shields.io/badge/Y%20Combinator-Aden-orange" alt="Y Combinator" /></a>
<a href="https://discord.com/invite/MXE49hrKDk"><img src="https://img.shields.io/discord/1172610340073242735?logo=discord&labelColor=%235462eb&logoColor=%23f5f5f5&color=%235462eb" alt="Discord" /></a>
<a href="https://x.com/aden_hq"><img src="https://img.shields.io/twitter/follow/teamaden?logo=X&color=%23f5f5f5" alt="Twitter Follow" /></a>
<a href="https://www.linkedin.com/company/teamaden/"><img src="https://custom-icon-badges.demolab.com/badge/LinkedIn-0A66C2?logo=linkedin-white&logoColor=fff" alt="LinkedIn" /></a>
<img src="https://img.shields.io/badge/MCP-102_Tools-00ADD8?style=flat-square" alt="MCP" />
</p>
[![Apache 2.0 License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/adenhq/hive/blob/main/LICENSE)
[![Y Combinator](https://img.shields.io/badge/Y%20Combinator-Aden-orange)](https://www.ycombinator.com/companies/aden)
[![Docker Pulls](https://img.shields.io/docker/pulls/adenhq/hive?logo=Docker&labelColor=%23528bff)](https://hub.docker.com/u/adenhq)
[![Discord](https://img.shields.io/discord/1172610340073242735?logo=discord&labelColor=%235462eb&logoColor=%23f5f5f5&color=%235462eb)](https://discord.com/invite/MXE49hrKDk)
[![Twitter Follow](https://img.shields.io/twitter/follow/teamaden?logo=X&color=%23f5f5f5)](https://x.com/aden_hq)
[![LinkedIn](https://custom-icon-badges.demolab.com/badge/LinkedIn-0A66C2?logo=linkedin-white&logoColor=fff)](https://www.linkedin.com/company/teamaden/)
<p align="center">
<img src="https://img.shields.io/badge/AI_Agents-Self--Improving-brightgreen?style=flat-square" alt="AI Agents" />
<img src="https://img.shields.io/badge/Multi--Agent-Systems-blue?style=flat-square" alt="Multi-Agent" />
<img src="https://img.shields.io/badge/Goal--Driven-Development-purple?style=flat-square" alt="Goal-Driven" />
<img src="https://img.shields.io/badge/Headless-Development-purple?style=flat-square" alt="Headless" />
<img src="https://img.shields.io/badge/Human--in--the--Loop-orange?style=flat-square" alt="HITL" />
<img src="https://img.shields.io/badge/Production--Ready-red?style=flat-square" alt="Production" />
</p>
@@ -29,43 +34,57 @@
<img src="https://img.shields.io/badge/OpenAI-supported-412991?style=flat-square&logo=openai" alt="OpenAI" />
<img src="https://img.shields.io/badge/Anthropic-supported-d4a574?style=flat-square" alt="Anthropic" />
<img src="https://img.shields.io/badge/Google_Gemini-supported-4285F4?style=flat-square&logo=google" alt="Gemini" />
<img src="https://img.shields.io/badge/MCP-19_Tools-00ADD8?style=flat-square" alt="MCP" />
</p>
## Overview
Build reliable, self-improving AI agents without hardcoding workflows. Define your goal through conversation with a coding agent, and the framework generates a node graph with dynamically created connection code. When things break, the framework captures failure data, evolves the agent through the coding agent, and redeploys. Built-in human-in-the-loop nodes, credential management, and real-time monitoring give you control without sacrificing adaptability.
Build autonomous, reliable, self-improving AI agents without hardcoding workflows. Define your goal through conversation with a coding agent, and the framework generates a node graph with dynamically created connection code. When things break, the framework captures failure data, evolves the agent through the coding agent, and redeploys. Built-in human-in-the-loop nodes, credential management, and real-time monitoring give you control without sacrificing adaptability.
Visit [adenhq.com](https://adenhq.com) for complete documentation, examples, and guides.
## What is Aden
https://github.com/user-attachments/assets/846c0cc7-ffd6-47fa-b4b7-495494857a55
<p align="center">
<img width="100%" alt="Aden Architecture" src="docs/assets/aden-architecture-diagram.jpg" />
</p>
## Who Is Hive For?
Aden is a platform for building, deploying, operating, and adapting AI agents:
Hive is designed for developers and teams who want to build **production-grade AI agents** without manually wiring complex workflows.
- **Build** - A Coding Agent generates specialized Worker Agents (Sales, Marketing, Ops) from natural language goals
- **Deploy** - Headless deployment with CI/CD integration and full API lifecycle management
- **Operate** - Real-time monitoring, observability, and runtime guardrails keep agents reliable
- **Adapt** - Continuous evaluation, supervision, and adaptation ensure agents improve over time
- **Infra** - Shared memory, LLM integrations, tools, and skills power every agent
Hive is a good fit if you:
- Want AI agents that **execute real business processes**, not demos
- Prefer **goal-driven development** over hardcoded workflows
- Need **self-healing and adaptive agents** that improve over time
- Require **human-in-the-loop control**, observability, and cost limits
- Plan to run agents in **production environments**
Hive may not be the best fit if you're only experimenting with simple agent chains or one-off scripts.
## When Should You Use Hive?
Use Hive when you need:
- Long-running, autonomous agents
- Strong guardrails, process, and controls
- Continuous improvement based on failures
- Multi-agent coordination
- A framework that evolves with your goals
## Quick Links
- **[Documentation](https://docs.adenhq.com/)** - Complete guides and API reference
- **[Self-Hosting Guide](https://docs.adenhq.com/getting-started/quickstart)** - Deploy Hive on your infrastructure
- **[Changelog](https://github.com/adenhq/hive/releases)** - Latest updates and releases
<!-- - **[Roadmap](https://adenhq.com/roadmap)** - Upcoming features and plans -->
- **[Roadmap](docs/roadmap.md)** - Upcoming features and plans
- **[Report Issues](https://github.com/adenhq/hive/issues)** - Bug reports and feature requests
- **[Contributing](CONTRIBUTING.md)** - How to contribute and submit PRs
## Quick Start
### Prerequisites
- [Python 3.11+](https://www.python.org/downloads/) for agent development
- [Docker](https://docs.docker.com/get-docker/) (v20.10+) - Optional, for containerized tools
- Python 3.11+ for agent development
- Claude Code or Cursor for utilizing agent skills
> **Note for Windows Users:** It is strongly recommended to use **WSL (Windows Subsystem for Linux)** or **Git Bash** to run this framework. Some core automation scripts may not execute correctly in standard Command Prompt or PowerShell.
### Installation
@@ -74,219 +93,251 @@ Aden is a platform for building, deploying, operating, and adapting AI agents:
git clone https://github.com/adenhq/hive.git
cd hive
# Run Python environment setup
./scripts/setup-python.sh
# Run quickstart setup
./quickstart.sh
```
This installs:
- **framework** - Core agent runtime and graph executor
- **aden_tools** - 19 MCP tools for agent capabilities
- All required dependencies
This sets up:
- **framework** - Core agent runtime and graph executor (in `core/.venv`)
- **aden_tools** - MCP tools for agent capabilities (in `tools/.venv`)
- **credential store** - Encrypted API key storage (`~/.hive/credentials`)
- **LLM provider** - Interactive default model configuration
- All required Python dependencies with `uv`
### Build Your First Agent
```bash
# Install Claude Code skills (one-time)
./quickstart.sh
# Build an agent using Claude Code
claude> /building-agents
claude> /hive
# Test your agent
claude> /testing-agent
claude> /hive-debugger
# Run your agent
PYTHONPATH=core:exports python -m your_agent_name run --input '{...}'
# (at separate terminal) Launch the interactive dashboard
hive tui
# Or run directly
hive run exports/your_agent_name --input '{"key": "value"}'
```
**[📖 Complete Setup Guide](ENVIRONMENT_SETUP.md)** - Detailed instructions for agent development
**[📖 Complete Setup Guide](docs/environment-setup.md)** - Detailed instructions for agent development
## Features
- **Goal-Driven Development** - Define objectives in natural language; the coding agent generates the agent graph and connection code to achieve them
- **Self-Adapting Agents** - Framework captures failures, updates objectives and updates the agent graph
- **Dynamic Node Connections** - No predefined edges; connection code is generated by any capable LLM based on your goals
- **[Goal-Driven Development](docs/key_concepts/goals_outcome.md)** - Define objectives in natural language; the coding agent generates the agent graph and connection code to achieve them
- **[Adaptiveness](docs/key_concepts/evolution.md)** - Framework captures failures, calibrates according to the objectives, and evolves the agent graph
- **[Dynamic Node Connections](docs/key_concepts/graph.md)** - No predefined edges; connection code is generated by any capable LLM based on your goals
- **SDK-Wrapped Nodes** - Every node gets shared memory, local RLM memory, monitoring, tools, and LLM access out of the box
- **Human-in-the-Loop** - Intervention nodes that pause execution for human input with configurable timeouts and escalation
- **[Human-in-the-Loop](docs/key_concepts/graph.md#human-in-the-loop)** - Intervention nodes that pause execution for human input with configurable timeouts and escalation
- **Real-time Observability** - WebSocket streaming for live monitoring of agent execution, decisions, and node-to-node communication
- **Interactive TUI Dashboard** - Terminal-based dashboard with live graph view, event log, and chat interface for agent interaction
- **Cost & Budget Control** - Set spending limits, throttles, and automatic model degradation policies
- **Production-Ready** - Self-hostable, built for scale and reliability
## Integration
<a href="https://github.com/adenhq/hive/tree/main/tools/src/aden_tools/tools"><img width="100%" alt="Integration" src="https://github.com/user-attachments/assets/a1573f93-cf02-4bb8-b3d5-b305b05b1e51" /></a>
Hive is built to be model-agnostic and system-agnostic.
- **LLM flexibility** - Hive Framework is designed to support various types of LLMs, including hosted and local models through LiteLLM-compatible providers.
- **Business system connectivity** - Hive Framework is designed to connect to all kinds of business systems as tools, such as CRM, support, messaging, data, file, and internal APIs via MCP.
## Why Aden
Traditional agent frameworks require you to manually design workflows, define agent interactions, and handle failures reactively. Aden flips this paradigm: **you describe outcomes, and the system builds itself**.
Hive focuses on generating agents that run real business processes rather than generic agents. Instead of requiring you to manually design workflows, define agent interactions, and handle failures reactively, Hive flips the paradigm: **you describe outcomes, and the system builds itself**—delivering an outcome-driven, adaptive experience with an easy-to-use set of tools and integrations.
```mermaid
flowchart LR
subgraph BUILD["🏗️ BUILD"]
GOAL["Define Goal<br/>+ Success Criteria"] --> NODES["Add Nodes<br/>LLM/Router/Function"]
NODES --> EDGES["Connect Edges<br/>on_success/failure/conditional"]
EDGES --> TEST["Test & Validate"] --> APPROVE["Approve & Export"]
end
GOAL["Define Goal"] --> GEN["Auto-Generate Graph"]
GEN --> EXEC["Execute Agents"]
EXEC --> MON["Monitor & Observe"]
MON --> CHECK{{"Pass?"}}
CHECK -- "Yes" --> DONE["Deliver Result"]
CHECK -- "No" --> EVOLVE["Evolve Graph"]
EVOLVE --> EXEC
subgraph EXPORT["📦 EXPORT"]
direction TB
JSON["agent.json<br/>(GraphSpec)"]
TOOLS["tools.py<br/>(Functions)"]
MCP["mcp_servers.json<br/>(Integrations)"]
end
GOAL -.- V1["Natural Language"]
GEN -.- V2["Instant Architecture"]
EXEC -.- V3["Easy Integrations"]
MON -.- V4["Full visibility"]
EVOLVE -.- V5["Adaptability"]
DONE -.- V6["Reliable outcomes"]
subgraph RUN["🚀 RUNTIME"]
LOAD["AgentRunner<br/>Load + Parse"] --> SETUP["Setup Runtime<br/>+ ToolRegistry"]
SETUP --> EXEC["GraphExecutor<br/>Execute Nodes"]
subgraph DECISION["Decision Recording"]
DEC1["runtime.decide()<br/>intent → options → choice"]
DEC2["runtime.record_outcome()<br/>success, result, metrics"]
end
end
subgraph INFRA["⚙️ INFRASTRUCTURE"]
CTX["NodeContext<br/>memory • llm • tools"]
STORE[("FileStorage<br/>Runs & Decisions")]
end
APPROVE --> EXPORT
EXPORT --> LOAD
EXEC --> DECISION
EXEC --> CTX
DECISION --> STORE
STORE -.->|"Analyze & Improve"| NODES
style BUILD fill:#ffbe42,stroke:#cc5d00,stroke-width:3px,color:#333
style EXPORT fill:#fff59d,stroke:#ed8c00,stroke-width:2px,color:#333
style RUN fill:#ffb100,stroke:#cc5d00,stroke-width:3px,color:#333
style DECISION fill:#ffcc80,stroke:#ed8c00,stroke-width:2px,color:#333
style INFRA fill:#e8763d,stroke:#cc5d00,stroke-width:3px,color:#fff
style STORE fill:#ed8c00,stroke:#cc5d00,stroke-width:2px,color:#fff
style GOAL fill:#ffbe42,stroke:#cc5d00,stroke-width:2px,color:#333
style GEN fill:#ffb100,stroke:#cc5d00,stroke-width:2px,color:#333
style EXEC fill:#ff9800,stroke:#cc5d00,stroke-width:2px,color:#fff
style MON fill:#ff9800,stroke:#cc5d00,stroke-width:2px,color:#fff
style CHECK fill:#fff59d,stroke:#ed8c00,stroke-width:2px,color:#333
style DONE fill:#4caf50,stroke:#2e7d32,stroke-width:2px,color:#fff
style EVOLVE fill:#e8763d,stroke:#cc5d00,stroke-width:2px,color:#fff
style V1 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
style V2 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
style V3 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
style V4 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
style V5 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
style V6 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
```
### The Aden Advantage
### The Hive Advantage
| Traditional Frameworks | Aden |
| Traditional Frameworks | Hive |
| -------------------------- | -------------------------------------- |
| Hardcode agent workflows | Describe goals in natural language |
| Manual graph definition | Auto-generated agent graphs |
| Reactive error handling | Proactive self-evolution |
| Reactive error handling | Outcome-evaluation and adaptiveness |
| Static tool configurations | Dynamic SDK-wrapped nodes |
| Separate monitoring setup | Built-in real-time observability |
| DIY budget management | Integrated cost controls & degradation |
### How It Works
1. **Define Your Goal** → Describe what you want to achieve in plain English
2. **Coding Agent Generates** → Creates the agent graph, connection code, and test cases
3. **Workers Execute** → SDK-wrapped nodes run with full observability and tool access
1. **[Define Your Goal](docs/key_concepts/goals_outcome.md)** → Describe what you want to achieve in plain English
2. **Coding Agent Generates** → Creates the [agent graph](docs/key_concepts/graph.md), connection code, and test cases
3. **[Workers Execute](docs/key_concepts/worker_agent.md)** → SDK-wrapped nodes run with full observability and tool access
4. **Control Plane Monitors** → Real-time metrics, budget enforcement, policy management
5. **Self-Improve** → On failure, the system evolves the graph and redeploys automatically
5. **[Adaptiveness](docs/key_concepts/evolution.md)** → On failure, the system evolves the graph and redeploys automatically
## How Aden Compares
## Run Agents
Aden takes a fundamentally different approach to agent development. While most frameworks require you to hardcode workflows or manually define agent graphs, Aden uses a **coding agent to generate your entire agent system** from natural language goals. When agents fail, the framework doesn't just log errors—it **automatically evolves the agent graph** and redeploys.
### Comparison Table
| Framework | Category | Approach | Aden Difference |
| ----------------------------------- | ------------------------- | --------------------------------------------------------------- | --------------------------------------------------------- |
| **LangChain, LlamaIndex, Haystack** | Component Libraries | Predefined components for RAG/LLM apps; manual connection logic | Generates entire graph and connection code upfront |
| **CrewAI, AutoGen, Swarm** | Multi-Agent Orchestration | Role-based agents with predefined collaboration patterns | Dynamically creates agents/connections; adapts on failure |
| **PydanticAI, Mastra, Agno** | Type-Safe Frameworks | Structured outputs and validation for known workflows | Evolving workflows; structure emerges through iteration |
| **Agent Zero, Letta** | Personal AI Assistants | Memory and learning; OS-as-tool or stateful memory focus | Production multi-agent systems with self-healing |
| **CAMEL** | Research Framework | Emergent behavior in large-scale simulations (up to 1M agents) | Production-oriented with reliable execution and recovery |
| **TEN Framework, Genkit** | Infrastructure Frameworks | Real-time multimodal (TEN) or full-stack AI (Genkit) | Higher abstraction—generates and evolves agent logic |
| **GPT Engineer, Motia** | Code Generation | Code from specs (GPT Engineer) or "Step" primitive (Motia) | Self-adapting graphs with automatic failure recovery |
| **Trading Agents** | Domain-Specific | Hardcoded trading firm roles on LangGraph | Domain-agnostic; generates structures for any use case |
### When to Choose Aden
Choose Aden when you need:
- Agents that **self-improve from failures** without manual intervention
- **Goal-driven development** where you describe outcomes, not workflows
- **Production reliability** with automatic recovery and redeployment
- **Rapid iteration** on agent architectures without rewriting code
- **Full observability** with real-time monitoring and human oversight
Choose other frameworks when you need:
- **Type-safe, predictable workflows** (PydanticAI, Mastra)
- **RAG and document processing** (LlamaIndex, Haystack)
- **Research on agent emergence** (CAMEL)
- **Real-time voice/multimodal** (TEN Framework)
- **Simple component chaining** (LangChain, Swarm)
## Project Structure
```
hive/
├── core/ # Core framework - Agent runtime, graph executor, protocols
├── tools/ # MCP Tools Package - 19 tools for agent capabilities
├── exports/ # Agent packages - Pre-built agents and examples
├── docs/ # Documentation and guides
├── scripts/ # Build and utility scripts
├── .claude/ # Claude Code skills for building agents
├── ENVIRONMENT_SETUP.md # Python setup guide for agent development
├── DEVELOPER.md # Developer guide
├── CONTRIBUTING.md # Contribution guidelines
└── ROADMAP.md # Product roadmap
```
## Development
### Python Agent Development
For building and running goal-driven agents with the framework:
The `hive` CLI is the primary interface for running agents.
```bash
# One-time setup. This installs:
# - framework package (core runtime)
# - aden_tools package (19 MCP tools)
# - All dependencies
./scripts/setup-python.sh

# Browse and run agents interactively (recommended)
hive tui

# Run a specific agent directly
hive run exports/my_agent --input '{"task": "Your input here"}'

# Run a specific agent with the TUI dashboard
hive run exports/my_agent --tui

# Interactive REPL
hive shell

# Build new agents using Claude Code skills
claude> /building-agents

# Test agents
claude> /testing-agent
```
The TUI scans both `exports/` and `examples/templates/` for available agents.
> **Using Python directly (alternative):** You can also run agents with `PYTHONPATH=exports uv run python -m agent_name run --input '{...}'`
See [environment-setup.md](docs/environment-setup.md) for complete setup instructions.
## Documentation
- **[Developer Guide](docs/developer-guide.md)** - Comprehensive guide for developers
- [Getting Started](docs/getting-started.md) - Quick setup instructions
- [TUI Guide](docs/tui-selection-guide.md) - Interactive dashboard usage
- [Configuration Guide](docs/configuration.md) - All configuration options
- [Architecture Overview](docs/architecture/README.md) - System design and structure
## Roadmap
Aden Hive Agent Framework aims to help developers build outcome-oriented, self-adaptive agents. See [roadmap.md](docs/roadmap.md) for details.
```mermaid
flowchart TD
subgraph Foundation
direction LR
subgraph arch["Architecture"]
a1["Node-Based Architecture"]:::done
a2["Python SDK"]:::done
a3["LLM Integration"]:::done
a4["Communication Protocol"]:::done
end
subgraph ca["Coding Agent"]
b1["Goal Creation Session"]:::done
b2["Worker Agent Creation"]
b3["MCP Tools"]:::done
end
subgraph wa["Worker Agent"]
c1["Human-in-the-Loop"]:::done
c2["Callback Handlers"]:::done
c3["Intervention Points"]:::done
c4["Streaming Interface"]
end
subgraph cred["Credentials"]
d1["Setup Process"]:::done
d2["Pluggable Sources"]:::done
d3["Enterprise Secrets"]
d4["Integration Tools"]:::done
end
subgraph tools["Tools"]
e1["File Use"]:::done
e2["Memory STM/LTM"]:::done
e3["Web Search/Scraper"]:::done
e4["CSV/PDF"]:::done
e5["Excel/Email"]
end
subgraph core["Core"]
f1["Eval System"]
f2["Pydantic Validation"]:::done
f3["Documentation"]:::done
f4["Adaptiveness"]
f5["Sample Agents"]
end
end
subgraph Expansion
direction LR
subgraph intel["Intelligence"]
g1["Guardrails"]
g2["Streaming Mode"]
g3["Image Generation"]
g4["Semantic Search"]
end
subgraph mem["Memory Iteration"]
h1["Message Model & Sessions"]
h2["Storage Migration"]
h3["Context Building"]
h4["Proactive Compaction"]
h5["Token Tracking"]
end
subgraph evt["Event System"]
i1["Event Bus for Nodes"]
end
subgraph cas["Coding Agent Support"]
j1["Claude Code"]
j2["Cursor"]
j3["Opencode"]
j4["Antigravity"]
end
subgraph plat["Platform"]
k1["JavaScript/TypeScript SDK"]
k2["Custom Tool Integrator"]
k3["Windows Support"]
end
subgraph dep["Deployment"]
l1["Self-Hosted"]
l2["Cloud Services"]
l3["CI/CD Pipeline"]
end
subgraph tmpl["Templates"]
m1["Sales Agent"]
m2["Marketing Agent"]
m3["Analytics Agent"]
m4["Training Agent"]
m5["Smart Form Agent"]
end
end
classDef done fill:#9e9e9e,color:#fff,stroke:#757575
```
## Contributing
We welcome contributions from the community! We're especially looking for help building tools, integrations, and example agents for the framework ([check #2805](https://github.com/adenhq/hive/issues/2805)). If you're interested in extending its functionality, this is the perfect place to start. Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
**Important:** Please get assigned to an issue before submitting a PR. Comment on an issue to claim it, and a maintainer will assign you. Issues with reproducible steps and proposals are prioritized. This helps prevent duplicate work.
1. Find or create an issue and get assigned
2. Fork the repository
3. Create your feature branch (`git checkout -b feature/amazing-feature`)
4. Commit your changes (`git commit -m 'Add amazing feature'`)
5. Push to the branch (`git push origin feature/amazing-feature`)
6. Open a Pull Request
## Community & Support
We use [Discord](https://discord.com/invite/MXE49hrKDk) for support, feature requests, and community discussions.
- Twitter/X - [@adenhq](https://x.com/aden_hq)
- LinkedIn - [Company Page](https://www.linkedin.com/company/teamaden/)
## Join Our Team
**We're hiring!** Join us in engineering, research, and go-to-market roles.
## License
This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.
## Frequently Asked Questions (FAQ)
**Q: Does Aden depend on LangChain or other agent frameworks?**
No. Aden is built from the ground up with no dependencies on LangChain, CrewAI, or other agent frameworks. The framework is designed to be lean and flexible, generating agent graphs dynamically rather than relying on predefined components.
**Q: What LLM providers does Hive support?**
Hive supports 100+ LLM providers through LiteLLM integration, including OpenAI (GPT-4, GPT-4o), Anthropic (Claude models), Google Gemini, DeepSeek, Mistral, Groq, and many more. Simply set the appropriate API key environment variable and specify the model name.
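As a minimal sketch, mirroring the `LiteLLMProvider` usage in the demo scripts later in this diff (the model names are illustrative, not an exhaustive list):

```python
# A minimal sketch mirroring the LiteLLMProvider usage in the demos below;
# the model strings are illustrative, and each needs its provider's API key
# (e.g. ANTHROPIC_API_KEY or OPENAI_API_KEY) exported in the environment.
from framework.llm.litellm import LiteLLMProvider

llm = LiteLLMProvider(model="claude-sonnet-4-5-20250929")  # Anthropic
# llm = LiteLLMProvider(model="gpt-4o")                    # OpenAI
# llm = LiteLLMProvider(model="ollama/llama3")             # local model via Ollama
```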
**Q: Can I use Hive with local AI models like Ollama?**
Yes! Hive supports local models through LiteLLM. Simply use the model name format `ollama/model-name` (e.g., `ollama/llama3`, `ollama/mistral`) and ensure Ollama is running locally.
**Q: What makes Hive different from other agent frameworks?**
Hive generates your entire agent system from natural language goals using a coding agent; you don't hardcode workflows or manually define graphs. When agents fail, the framework automatically captures failure data, [evolves the agent graph](docs/key_concepts/evolution.md), and redeploys. This self-improving loop is unique to Aden.
**Q: Is Hive open-source?**
Yes, Hive is fully open-source under the Apache License 2.0. We actively encourage community contributions and collaboration.
**Q: Can Hive handle complex, production-scale use cases?**
Yes. Hive is explicitly designed for production environments with features like automatic failure recovery, real-time observability, cost controls, and horizontal scaling support. The framework handles both simple automations and complex multi-agent workflows.
**Q: Does Aden collect data from users?**
Aden collects telemetry data for monitoring and observability purposes, including token usage, latency metrics, and cost tracking. Content capture (prompts and responses) is configurable and stored with team-scoped data isolation. All data stays within your infrastructure when self-hosted.
**Q: Does Hive support human-in-the-loop workflows?**
Yes, Hive fully supports [human-in-the-loop](docs/key_concepts/graph.md#human-in-the-loop) workflows through intervention nodes that pause execution for human input. These include configurable timeouts and escalation policies, allowing seamless collaboration between human experts and AI agents.
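As a rough sketch of the host-side wiring, using the `EventBus` API that appears in the demo scripts later in this diff (the handler body is illustrative):

```python
# A rough sketch, assuming the EventBus API shown in the demos below:
# client-facing nodes emit CLIENT_INPUT_REQUESTED when they pause for a human.
from framework.runtime.event_bus import EventBus, EventType

bus = EventBus()

async def on_input_requested(event):
    # Surface the request to a human operator (UI, chat, etc.); illustrative only.
    print("Agent paused for human input:", event.data)

bus.subscribe(event_types=[EventType.CLIENT_INPUT_REQUESTED], handler=on_input_requested)
```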
**Q: What deployment options does Aden support?**
Aden supports Docker Compose deployment out of the box, with both production and development configurations. Self-hosted deployments work on any infrastructure supporting Docker. Cloud deployment options and Kubernetes-ready configurations are on the roadmap.
**Q: What programming languages does Hive support?**
The Hive framework is built in Python. A JavaScript/TypeScript SDK is on the roadmap.
**Q: What monitoring and debugging tools does Aden provide?**
Aden includes comprehensive observability features: real-time WebSocket streaming for live agent execution monitoring, TimescaleDB-powered analytics for cost and performance metrics, health check endpoints for Kubernetes integration, and 19 MCP tools for budget management, agent status, and policy control.
**Q: Can Hive agents interact with external tools and APIs?**
Yes. Aden's SDK-wrapped nodes provide built-in tool access, and the framework supports flexible tool ecosystems. Agents can integrate with external APIs, databases, and services through the node architecture.
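For a concrete picture, here is a minimal sketch of registering a custom tool, mirroring the `ToolRegistry.register` pattern from the demo scripts later in this diff (the `echo` tool itself is hypothetical):

```python
# A minimal sketch mirroring ToolRegistry.register as used in the demos below;
# the "echo" tool here is a made-up example.
from framework.llm.provider import Tool
from framework.runner.tool_registry import ToolRegistry

registry = ToolRegistry()
registry.register(
    name="echo",
    tool=Tool(
        name="echo",
        description="Echo back the provided text.",
        parameters={
            "type": "object",
            "properties": {"text": {"type": "string", "description": "Text to echo"}},
            "required": ["text"],
        },
    ),
    executor=lambda inputs: {"echo": inputs.get("text", "")},
)
```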
**Q: How does cost control work in Hive?**
Hive provides granular budget controls including spending limits, throttles, and automatic model degradation policies. You can set budgets at the team, agent, or workflow level, with real-time cost tracking and alerts.
**Q: Where can I find examples and documentation?**
Visit [docs.adenhq.com](https://docs.adenhq.com/) for complete guides, API reference, and getting started tutorials. The repository also includes documentation in the `docs/` folder and a comprehensive [developer guide](docs/developer-guide.md).
**Q: How can I contribute to Aden?**
Contributions are welcome! Fork the repository, create your feature branch, implement your changes, and submit a pull request. See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
**Q: Does Aden offer enterprise support?**
For enterprise inquiries, contact the Aden team through [adenhq.com](https://adenhq.com) or join our [Discord community](https://discord.com/invite/MXE49hrKDk) for support and discussions.
**Q: When will my team start seeing results from Aden's adaptive agents?**
Aden's adaptation loop begins working from the first execution. When an agent fails, the framework captures the failure data, helping developers evolve the agent graph through the coding agent. How quickly this translates to measurable results depends on the complexity of your use case, the quality of your goal definitions, and the volume of executions generating feedback.
**Q: How does Hive compare to other agent frameworks?**
Hive focuses on generating agents that run real business processes, rather than generic agents. This vision emphasizes outcome-driven design, adaptability, and an easy-to-use set of tools and integrations.
---
@@ -1,150 +0,0 @@
# Product Roadmap
Aden Agent Framework aims to help developers build outcome-oriented, self-adaptive agents. Our roadmap is summarized below.
```mermaid
timeline
title Aden Agent Framework Roadmap
section Foundation
Architecture : Node-Based Architecture : Python SDK : LLM Integration (OpenAI, Anthropic, Google) : Communication Protocol
Coding Agent : Goal Creation Session : Worker Agent Creation : MCP Tools Integration
Worker Agent : Human-in-the-Loop : Callback Handlers : Intervention Points : Streaming Interface
Tools : File Use : Memory (STM/LTM) : Web Search : Web Scraper : Audit Trail
Core : Eval System : Pydantic Validation : Docker Deployment : Documentation : Sample Agents
section Expansion
Intelligence : Guardrails : Streaming Mode : Semantic Search
Platform : JavaScript SDK : Custom Tool Integrator : Credential Store
Deployment : Self-Hosted : Cloud Services : CI/CD Pipeline
Templates : Sales Agent : Marketing Agent : Analytics Agent : Training Agent : Smart Form Agent
```
---
## Phase 1: Foundation
### Backbone Architecture
- [ ] **Node-Based Architecture (Agent as a node)**
- [x] Object schema definition
- [x] Node wrapper SDK
- [ ] Shared memory access
- [ ] Default monitoring hooks
- [ ] Tool access layer
- [x] LLM integration layer (Natively supports all mainstream LLMs through LiteLLM)
- [x] Anthropic
- [x] OpenAI
- [x] Google
- [ ] **Communication protocol between nodes**
- [ ] **[Coding Agent] Goal Creation Session** (separate from coding session)
- [ ] Instruction back and forth
- [x] Goal Object schema definition
- [ ] Being able to generate the test cases
- [ ] Test case validation for worker agent (Outcome driven)
- [ ] **[Coding Agent] Worker Agent Creation**
- [x] Coding Agent tools
- [ ] Use Template Agent as a start
- [x] Use our MCP tools
- [ ] **[Worker Agent] Human-in-the-Loop**
- [x] Worker Agents request with questions and options
- [x] Callback Handler System to receive events throughout execution
- [ ] Tool-Based Intervention Points (tool to pause execution and request human input)
- [x] Multiple entrypoints for different event sources (e.g. human input, webhook)
- [ ] Streaming Interface for Real-time Monitoring
- [ ] Request State Management
### Essential Tools
- [x] **File Use Tool Kit**
- [ ] **Memory Tools**
- [x] STM Layer Tool (state-based short-term memory)
- [x] LTM Layer Tool (RLM - long-term memory)
- [ ] **Infrastructure Tools**
- [x] Runtime Log Tool (logs for coding agent)
- [ ] Audit Trail Tool (decision timeline generation)
- [ ] Web Search
- [ ] Web Scraper
- [ ] Recipe for "Add your own tools"
### Memory & File System
- [x] DB for long-term persistent memory (Filesystem as durable scratchpad pattern)
- [x] Session Local memory isolation
### Eval System (Basic)
- [x] Test Driven - Run test case for all agent iteration
- [ ] Failure recording mechanism
- [ ] SDK for defining failure conditions
- [ ] Basic observability hooks
- [ ] User-driven log analysis (OSS approach)
### Data Validation
- [ ] Natively support data validation of LLM output with Pydantic (see the sketch below)
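A generic sketch of what this item proposes, using plain Pydantic v2 (the `Invoice` schema and raw string are illustrative, not a Hive API):

```python
# A generic sketch of the proposed validation using plain Pydantic v2;
# the Invoice schema and the raw string are illustrative only.
from pydantic import BaseModel, ValidationError

class Invoice(BaseModel):
    customer: str
    total: float

raw = '{"customer": "Acme", "total": 119.99}'  # text returned by an LLM
try:
    invoice = Invoice.model_validate_json(raw)
except ValidationError as err:
    print("LLM output failed validation:", err)
```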
### Developer Experience
- [ ] **Debugging mode**
- [ ] **Documentation**
- [ ] Quick start guide
- [ ] Goal creation guide
- [ ] Agent creation guide
- [ ] GitHub Page setup
- [ ] README with examples
- [ ] Contributing guidelines
- [ ] **Distribution**
- [ ] PyPI package
- [ ] Docker image on Docker Hub
### Sample Agents
- [ ] Knowledge Agent
- [ ] Blog Writer Agent
- [ ] SDR Agent
---
## Phase 2: Expansion
### Basic Guardrails
- [ ] Support Basic Monitoring from Agent node SDK
- [ ] SDK guardrail implementation (in node)
- [ ] Guardrail type support (Determined Condition as Guardrails)
### Agent Capability
- [ ] Streaming mode support
### Cross-Platform
- [ ] JavaScript / TypeScript Version SDK
### File System Enhancement
- [ ] Semantic Search integration
- [ ] Interactive File System in product (frontend integration)
### More Worker Tools
- [ ] Custom Tool Integrator
- [ ] Integration as a tool (Credential Store & Support)
- [ ] **Core Agent Tools**
- [ ] Node Discovery Tool (find other agents in the graph)
- [ ] HITL Tool (pause execution for human approval)
- [ ] Wake-up Tool (resume agent tasks)
### Deployment (Self-Hosted)
- [ ] Docker container standardization
- [ ] Headless backend execution
- [ ] Exposed API for frontend attachment
- [ ] Local monitoring & observability
- [ ] Basic lifecycle APIs (Start, Stop, Pause, Resume)
### Deployment (Cloud)
- [ ] Cloud Service Options
- [ ] Support deployment to 3rd-party platforms
- [ ] Self-deploy + orchestrator connection
- [ ] **CI/CD Pipeline**
- [ ] Automated test execution
- [ ] Agent version control
- [ ] All tests must pass for deployment
### Developer Experience Enhancement
- [ ] Tool usage documentation
- [ ] Discord Support Channel
### More Agent Templates
- [ ] GTM Sales Agent (workflow)
- [ ] GTM Marketing Agent (workflow)
- [ ] Analytics Agent
- [ ] Training Agent
- [ ] Smart Entry / Form Agent (self-evolution emphasis)
@@ -1,4 +1,5 @@
exports/
docs/
.agent-builder-sessions/
.pytest_cache/
**/__pycache__/
@@ -3,12 +3,12 @@
"agent-builder": {
"command": "python",
"args": ["-m", "framework.mcp.agent_builder_server"],
"cwd": "/home/timothy/oss/hive/core"
"cwd": "core"
},
"tools": {
"command": "python",
"args": ["-m", "aden_tools.mcp_server", "--stdio"],
"cwd": "/home/timothy/oss/hive/tools"
"cwd": "tools"
}
}
}
@@ -14,7 +14,7 @@
Framework provides a runtime framework that captures **decisions**, not just actions.
## Installation
```bash
uv pip install -e .
```
## MCP Server Setup
@@ -45,13 +45,13 @@
If you prefer manual setup:
```bash
# Install framework
uv pip install -e .
# Install MCP dependencies
uv pip install mcp fastmcp
# Test the server
uv run python -m framework.mcp.agent_builder_server
```
### Using with MCP Clients
@@ -86,13 +86,13 @@
Run an LLM-powered calculator:
```bash
# Single calculation
uv run python -m framework calculate "2 + 3 * 4"
# Interactive mode
uv run python -m framework interactive
# Analyze runs with Builder
uv run python -m framework analyze calculator
```
### Using the Runtime
@@ -132,24 +132,20 @@
runtime.end_run(success=True, narrative="Successfully processed all data")
The framework includes a goal-based testing framework for validating agent behavior.
Tests are generated using MCP tools (`generate_constraint_tests`, `generate_success_tests`) which return guidelines. Claude writes tests directly using the Write tool based on these guidelines.
```bash
# Run tests against an agent
uv run python -m framework test-run <agent_path> --goal <goal_id> --parallel 4
# Debug failed tests
uv run python -m framework test-debug <agent_path> <test_name>
# List tests for a goal
uv run python -m framework test-list <goal_id>
```
For detailed testing workflows, see the [hive-test skill](../.claude/skills/hive-test/SKILL.md).
### Analyzing Agent Behavior with Builder
@@ -0,0 +1,740 @@
#!/usr/bin/env python3
"""
EventLoopNode WebSocket Demo
Real LLM, real FileConversationStore, real EventBus.
Streams EventLoopNode execution to a browser via WebSocket.
Usage:
cd /home/timothy/oss/hive/core
python demos/event_loop_wss_demo.py
Then open http://localhost:8765 in your browser.
"""
import asyncio
import json
import logging
import sys
import tempfile
from http import HTTPStatus
from pathlib import Path
import httpx
import websockets
from bs4 import BeautifulSoup
from websockets.http11 import Request, Response
# Add core, tools, and hive root to path
_CORE_DIR = Path(__file__).resolve().parent.parent
_HIVE_DIR = _CORE_DIR.parent
sys.path.insert(0, str(_CORE_DIR)) # framework.*
sys.path.insert(0, str(_HIVE_DIR / "tools" / "src")) # aden_tools.*
sys.path.insert(0, str(_HIVE_DIR)) # core.framework.* (for aden_tools imports)
import os # noqa: E402
from aden_tools.credentials import CREDENTIAL_SPECS, CredentialStoreAdapter # noqa: E402
from core.framework.credentials import CredentialStore # noqa: E402
from framework.credentials.storage import ( # noqa: E402
CompositeStorage,
EncryptedFileStorage,
EnvVarStorage,
)
from framework.graph.event_loop_node import EventLoopNode, LoopConfig # noqa: E402
from framework.graph.node import NodeContext, NodeSpec, SharedMemory # noqa: E402
from framework.llm.litellm import LiteLLMProvider # noqa: E402
from framework.llm.provider import Tool # noqa: E402
from framework.runner.tool_registry import ToolRegistry # noqa: E402
from framework.runtime.core import Runtime # noqa: E402
from framework.runtime.event_bus import EventBus, EventType # noqa: E402
from framework.storage.conversation_store import FileConversationStore # noqa: E402
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(message)s")
logger = logging.getLogger("demo")
# -------------------------------------------------------------------------
# Persistent state (shared across WebSocket connections)
# -------------------------------------------------------------------------
STORE_DIR = Path(tempfile.mkdtemp(prefix="hive_demo_"))
STORE = FileConversationStore(STORE_DIR / "conversation")
RUNTIME = Runtime(STORE_DIR / "runtime")
LLM = LiteLLMProvider(model="claude-sonnet-4-5-20250929")
# -------------------------------------------------------------------------
# Tool Registry — real tools via ToolRegistry (same pattern as GraphExecutor)
# -------------------------------------------------------------------------
TOOL_REGISTRY = ToolRegistry()
# Credential store: Aden sync (OAuth2 tokens) + encrypted files + env var fallback
_env_mapping = {name: spec.env_var for name, spec in CREDENTIAL_SPECS.items()}
_local_storage = CompositeStorage(
primary=EncryptedFileStorage(),
fallbacks=[EnvVarStorage(env_mapping=_env_mapping)],
)
if os.environ.get("ADEN_API_KEY"):
try:
from framework.credentials.aden import ( # noqa: E402
AdenCachedStorage,
AdenClientConfig,
AdenCredentialClient,
AdenSyncProvider,
)
_client = AdenCredentialClient(AdenClientConfig(base_url="https://api.adenhq.com"))
_provider = AdenSyncProvider(client=_client)
_storage = AdenCachedStorage(
local_storage=_local_storage,
aden_provider=_provider,
)
_cred_store = CredentialStore(storage=_storage, providers=[_provider], auto_refresh=True)
_synced = _provider.sync_all(_cred_store)
logger.info("Synced %d credentials from Aden", _synced)
except Exception as e:
logger.warning("Aden sync unavailable: %s", e)
_cred_store = CredentialStore(storage=_local_storage)
else:
logger.info("ADEN_API_KEY not set, using local credential storage")
_cred_store = CredentialStore(storage=_local_storage)
CREDENTIALS = CredentialStoreAdapter(_cred_store)
# Debug: log which credentials resolved
for _name in ["brave_search", "hubspot", "anthropic"]:
_val = CREDENTIALS.get(_name)
if _val:
logger.debug("credential %s: OK (len=%d)", _name, len(_val))
else:
logger.debug("credential %s: not found", _name)
# --- web_search (Brave Search API) ---
TOOL_REGISTRY.register(
name="web_search",
tool=Tool(
name="web_search",
description=(
"Search the web for current information. "
"Returns titles, URLs, and snippets from search results."
),
parameters={
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query (1-500 characters)",
},
"num_results": {
"type": "integer",
"description": "Number of results to return (1-20, default 10)",
},
},
"required": ["query"],
},
),
executor=lambda inputs: _exec_web_search(inputs),
)
def _exec_web_search(inputs: dict) -> dict:
api_key = CREDENTIALS.get("brave_search")
if not api_key:
return {"error": "brave_search credential not configured"}
query = inputs.get("query", "")
num_results = min(inputs.get("num_results", 10), 20)
resp = httpx.get(
"https://api.search.brave.com/res/v1/web/search",
params={"q": query, "count": num_results},
headers={"X-Subscription-Token": api_key, "Accept": "application/json"},
timeout=30.0,
)
if resp.status_code != 200:
return {"error": f"Brave API HTTP {resp.status_code}"}
data = resp.json()
results = [
{
"title": item.get("title", ""),
"url": item.get("url", ""),
"snippet": item.get("description", ""),
}
for item in data.get("web", {}).get("results", [])[:num_results]
]
return {"query": query, "results": results, "total": len(results)}
# --- web_scrape (httpx + BeautifulSoup, no playwright for sync compat) ---
TOOL_REGISTRY.register(
name="web_scrape",
tool=Tool(
name="web_scrape",
description=(
"Scrape and extract text content from a webpage URL. "
"Returns the page title and main text content."
),
parameters={
"type": "object",
"properties": {
"url": {
"type": "string",
"description": "URL of the webpage to scrape",
},
"max_length": {
"type": "integer",
"description": "Maximum text length (default 50000)",
},
},
"required": ["url"],
},
),
executor=lambda inputs: _exec_web_scrape(inputs),
)
_SCRAPE_HEADERS = {
"User-Agent": (
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
"AppleWebKit/537.36 (KHTML, like Gecko) "
"Chrome/131.0.0.0 Safari/537.36"
),
"Accept": "text/html,application/xhtml+xml",
}
def _exec_web_scrape(inputs: dict) -> dict:
url = inputs.get("url", "")
max_length = max(1000, min(inputs.get("max_length", 50000), 500000))
if not url.startswith(("http://", "https://")):
url = "https://" + url
try:
resp = httpx.get(url, timeout=30.0, follow_redirects=True, headers=_SCRAPE_HEADERS)
if resp.status_code != 200:
return {"error": f"HTTP {resp.status_code}"}
soup = BeautifulSoup(resp.text, "html.parser")
for tag in soup(["script", "style", "nav", "footer", "header", "aside", "noscript"]):
tag.decompose()
title = soup.title.get_text(strip=True) if soup.title else ""
main = (
soup.find("article")
or soup.find("main")
or soup.find(attrs={"role": "main"})
or soup.find("body")
)
text = main.get_text(separator=" ", strip=True) if main else ""
text = " ".join(text.split())
if len(text) > max_length:
text = text[:max_length] + "..."
return {"url": url, "title": title, "content": text, "length": len(text)}
except httpx.TimeoutException:
return {"error": "Request timed out"}
except Exception as e:
return {"error": f"Scrape failed: {e}"}
# --- HubSpot CRM tools (optional, requires HUBSPOT_ACCESS_TOKEN) ---
_HUBSPOT_API = "https://api.hubapi.com"
def _hubspot_headers() -> dict | None:
token = CREDENTIALS.get("hubspot")
if token:
logger.debug("HubSpot token: %s...%s (len=%d)", token[:8], token[-4:], len(token))
else:
logger.debug("HubSpot token: not found")
if not token:
return None
return {
"Authorization": f"Bearer {token}",
"Content-Type": "application/json",
"Accept": "application/json",
}
def _exec_hubspot_search(inputs: dict) -> dict:
headers = _hubspot_headers()
if not headers:
return {"error": "HUBSPOT_ACCESS_TOKEN not set"}
object_type = inputs.get("object_type", "contacts")
query = inputs.get("query", "")
limit = min(inputs.get("limit", 10), 100)
body: dict = {"limit": limit}
if query:
body["query"] = query
try:
resp = httpx.post(
f"{_HUBSPOT_API}/crm/v3/objects/{object_type}/search",
headers=headers,
json=body,
timeout=30.0,
)
if resp.status_code != 200:
return {"error": f"HubSpot API HTTP {resp.status_code}: {resp.text[:200]}"}
return resp.json()
except httpx.TimeoutException:
return {"error": "Request timed out"}
except Exception as e:
return {"error": f"HubSpot error: {e}"}
TOOL_REGISTRY.register(
name="hubspot_search",
tool=Tool(
name="hubspot_search",
description=(
"Search HubSpot CRM objects (contacts, companies, or deals). "
"Returns matching records with their properties."
),
parameters={
"type": "object",
"properties": {
"object_type": {
"type": "string",
"description": "CRM object type: 'contacts', 'companies', or 'deals'",
},
"query": {
"type": "string",
"description": "Search query (name, email, domain, etc.)",
},
"limit": {
"type": "integer",
"description": "Max results (1-100, default 10)",
},
},
"required": ["object_type"],
},
),
executor=lambda inputs: _exec_hubspot_search(inputs),
)
logger.info(
"ToolRegistry loaded: %s",
", ".join(TOOL_REGISTRY.get_registered_names()),
)
# -------------------------------------------------------------------------
# HTML page (embedded)
# -------------------------------------------------------------------------
HTML_PAGE = ( # noqa: E501
"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>EventLoopNode Live Demo</title>
<style>
* { box-sizing: border-box; margin: 0; padding: 0; }
body {
font-family: 'SF Mono', 'Fira Code', monospace;
background: #0d1117; color: #c9d1d9;
height: 100vh; display: flex; flex-direction: column;
}
header {
background: #161b22; padding: 12px 20px;
border-bottom: 1px solid #30363d;
display: flex; align-items: center; gap: 16px;
}
header h1 { font-size: 16px; color: #58a6ff; font-weight: 600; }
.status {
font-size: 12px; padding: 3px 10px; border-radius: 12px;
background: #21262d; color: #8b949e;
}
.status.running { background: #1a4b2e; color: #3fb950; }
.status.done { background: #1a3a5c; color: #58a6ff; }
.status.error { background: #4b1a1a; color: #f85149; }
.chat { flex: 1; overflow-y: auto; padding: 16px; }
.msg {
margin: 8px 0; padding: 10px 14px; border-radius: 8px;
line-height: 1.6; white-space: pre-wrap; word-wrap: break-word;
}
.msg.user { background: #1a3a5c; color: #58a6ff; }
.msg.assistant { background: #161b22; color: #c9d1d9; }
.msg.event {
background: transparent; color: #8b949e; font-size: 11px;
padding: 4px 14px; border-left: 3px solid #30363d;
}
.msg.event.loop { border-left-color: #58a6ff; }
.msg.event.tool { border-left-color: #d29922; }
.msg.event.stall { border-left-color: #f85149; }
.input-bar {
padding: 12px 16px; background: #161b22;
border-top: 1px solid #30363d; display: flex; gap: 8px;
}
.input-bar input {
flex: 1; background: #0d1117; border: 1px solid #30363d;
color: #c9d1d9; padding: 8px 12px; border-radius: 6px;
font-family: inherit; font-size: 14px; outline: none;
}
.input-bar input:focus { border-color: #58a6ff; }
.input-bar button {
background: #238636; color: #fff; border: none;
padding: 8px 20px; border-radius: 6px; cursor: pointer;
font-family: inherit; font-weight: 600;
}
.input-bar button:hover { background: #2ea043; }
.input-bar button:disabled {
background: #21262d; color: #484f58; cursor: not-allowed;
}
.input-bar button.clear { background: #da3633; }
.input-bar button.clear:hover { background: #f85149; }
</style>
</head>
<body>
<header>
<h1>EventLoopNode Live</h1>
<span id="status" class="status">Idle</span>
<span id="iter" class="status" style="display:none">Step 0</span>
</header>
<div id="chat" class="chat"></div>
<div class="input-bar">
<input id="input" type="text"
placeholder="Ask anything..." autofocus />
<button id="go" onclick="run()">Send</button>
<button class="clear"
onclick="clearConversation()">Clear</button>
</div>
<script>
let ws = null;
let currentAssistantEl = null;
let iterCount = 0;
const chat = document.getElementById('chat');
const status = document.getElementById('status');
const iterEl = document.getElementById('iter');
const goBtn = document.getElementById('go');
const inputEl = document.getElementById('input');
inputEl.addEventListener('keydown', e => {
if (e.key === 'Enter') run();
});
function setStatus(text, cls) {
status.textContent = text;
status.className = 'status ' + cls;
}
function addMsg(text, cls) {
const el = document.createElement('div');
el.className = 'msg ' + cls;
el.textContent = text;
chat.appendChild(el);
chat.scrollTop = chat.scrollHeight;
return el;
}
function connect() {
ws = new WebSocket('ws://' + location.host + '/ws');
ws.onopen = () => {
setStatus('Ready', 'done');
goBtn.disabled = false;
};
ws.onmessage = handleEvent;
ws.onerror = () => { setStatus('Error', 'error'); };
ws.onclose = () => {
setStatus('Reconnecting...', '');
goBtn.disabled = true;
setTimeout(connect, 2000);
};
}
function handleEvent(msg) {
const evt = JSON.parse(msg.data);
if (evt.type === 'llm_text_delta') {
if (currentAssistantEl) {
currentAssistantEl.textContent += evt.content;
chat.scrollTop = chat.scrollHeight;
}
}
else if (evt.type === 'ready') {
setStatus('Ready', 'done');
if (currentAssistantEl && !currentAssistantEl.textContent)
currentAssistantEl.remove();
goBtn.disabled = false;
}
else if (evt.type === 'node_loop_iteration') {
iterCount = evt.iteration || (iterCount + 1);
iterEl.textContent = 'Step ' + iterCount;
iterEl.style.display = '';
}
else if (evt.type === 'tool_call_started') {
var info = evt.tool_name + '('
+ JSON.stringify(evt.tool_input).slice(0, 120) + ')';
addMsg('TOOL ' + info, 'event tool');
}
else if (evt.type === 'tool_call_completed') {
var preview = (evt.result || '').slice(0, 200);
var cls = evt.is_error ? 'stall' : 'tool';
addMsg('RESULT ' + evt.tool_name + ': ' + preview,
'event ' + cls);
currentAssistantEl = addMsg('', 'assistant');
}
else if (evt.type === 'result') {
setStatus('Session ended', evt.success ? 'done' : 'error');
if (evt.error) addMsg('ERROR ' + evt.error, 'event stall');
if (currentAssistantEl && !currentAssistantEl.textContent)
currentAssistantEl.remove();
goBtn.disabled = false;
}
else if (evt.type === 'node_stalled') {
addMsg('STALLED ' + evt.reason, 'event stall');
}
else if (evt.type === 'cleared') {
chat.innerHTML = '';
iterCount = 0;
iterEl.textContent = 'Step 0';
iterEl.style.display = 'none';
setStatus('Ready', 'done');
goBtn.disabled = false;
}
}
function run() {
const text = inputEl.value.trim();
if (!text || !ws || ws.readyState !== 1) return;
addMsg(text, 'user');
currentAssistantEl = addMsg('', 'assistant');
inputEl.value = '';
setStatus('Running', 'running');
goBtn.disabled = true;
ws.send(JSON.stringify({ topic: text }));
}
function clearConversation() {
if (ws && ws.readyState === 1) {
ws.send(JSON.stringify({ command: 'clear' }));
}
}
connect();
</script>
</body>
</html>"""
)
# -------------------------------------------------------------------------
# WebSocket handler
# -------------------------------------------------------------------------
async def handle_ws(websocket):
"""Persistent WebSocket: long-lived EventLoopNode with client_facing blocking."""
global STORE
# -- Event forwarding (WebSocket ← EventBus) ----------------------------
bus = EventBus()
async def forward_event(event):
try:
payload = {"type": event.type.value, **event.data}
if event.node_id:
payload["node_id"] = event.node_id
await websocket.send(json.dumps(payload))
except Exception:
pass
bus.subscribe(
event_types=[
EventType.NODE_LOOP_STARTED,
EventType.NODE_LOOP_ITERATION,
EventType.NODE_LOOP_COMPLETED,
EventType.LLM_TEXT_DELTA,
EventType.TOOL_CALL_STARTED,
EventType.TOOL_CALL_COMPLETED,
EventType.NODE_STALLED,
],
handler=forward_event,
)
# -- Per-connection state -----------------------------------------------
node = None
loop_task = None
tools = list(TOOL_REGISTRY.get_tools().values())
tool_executor = TOOL_REGISTRY.get_executor()
node_spec = NodeSpec(
id="assistant",
name="Chat Assistant",
description="A conversational assistant that remembers context across messages",
node_type="event_loop",
client_facing=True,
system_prompt=(
"You are a helpful assistant with access to tools. "
"You can search the web, scrape webpages, and query HubSpot CRM. "
"Use tools when the user asks for current information or external data. "
"You have full conversation history, so you can reference previous messages."
),
)
# -- Ready callback: subscribe to CLIENT_INPUT_REQUESTED on the bus ---
async def on_input_requested(event):
try:
await websocket.send(json.dumps({"type": "ready"}))
except Exception:
pass
bus.subscribe(
event_types=[EventType.CLIENT_INPUT_REQUESTED],
handler=on_input_requested,
)
async def start_loop(first_message: str):
"""Create an EventLoopNode and run it as a background task."""
nonlocal node, loop_task
memory = SharedMemory()
ctx = NodeContext(
runtime=RUNTIME,
node_id="assistant",
node_spec=node_spec,
memory=memory,
input_data={},
llm=LLM,
available_tools=tools,
)
node = EventLoopNode(
event_bus=bus,
config=LoopConfig(max_iterations=10_000, max_history_tokens=32_000),
conversation_store=STORE,
tool_executor=tool_executor,
)
await node.inject_event(first_message)
async def _run():
try:
result = await node.execute(ctx)
try:
await websocket.send(
json.dumps(
{
"type": "result",
"success": result.success,
"output": result.output,
"error": result.error,
"tokens": result.tokens_used,
}
)
)
except Exception:
pass
logger.info(f"Loop ended: success={result.success}, tokens={result.tokens_used}")
except websockets.exceptions.ConnectionClosed:
logger.info("Loop stopped: WebSocket closed")
except Exception as e:
logger.exception("Loop error")
try:
await websocket.send(
json.dumps(
{
"type": "result",
"success": False,
"error": str(e),
"output": {},
}
)
)
except Exception:
pass
loop_task = asyncio.create_task(_run())
async def stop_loop():
"""Signal the node and wait for the loop task to finish."""
nonlocal node, loop_task
if loop_task and not loop_task.done():
if node:
node.signal_shutdown()
try:
await asyncio.wait_for(loop_task, timeout=5.0)
except (TimeoutError, asyncio.CancelledError):
loop_task.cancel()
node = None
loop_task = None
# -- Message loop (runs for the lifetime of this WebSocket) -------------
try:
async for raw in websocket:
try:
msg = json.loads(raw)
except Exception:
continue
# Clear command
if msg.get("command") == "clear":
import shutil
await stop_loop()
await STORE.close()
conv_dir = STORE_DIR / "conversation"
if conv_dir.exists():
shutil.rmtree(conv_dir)
STORE = FileConversationStore(conv_dir)
await websocket.send(json.dumps({"type": "cleared"}))
logger.info("Conversation cleared")
continue
topic = msg.get("topic", "")
if not topic:
continue
if node is None:
# First message — spin up the loop
logger.info(f"Starting persistent loop: {topic}")
await start_loop(topic)
else:
# Subsequent message — inject into the running loop
logger.info(f"Injecting message: {topic}")
await node.inject_event(topic)
except websockets.exceptions.ConnectionClosed:
pass
finally:
await stop_loop()
logger.info("WebSocket closed, loop stopped")
# -------------------------------------------------------------------------
# HTTP handler for serving the HTML page
# -------------------------------------------------------------------------
async def process_request(connection, request: Request):
"""Serve HTML on GET /, upgrade to WebSocket on /ws."""
if request.path == "/ws":
return None # let websockets handle the upgrade
# Serve the HTML page for any other path
return Response(
HTTPStatus.OK,
"OK",
websockets.Headers({"Content-Type": "text/html; charset=utf-8"}),
HTML_PAGE.encode(),
)
# -------------------------------------------------------------------------
# Main
# -------------------------------------------------------------------------
async def main():
port = 8765
async with websockets.serve(
handle_ws,
"0.0.0.0",
port,
process_request=process_request,
):
logger.info(f"Demo running at http://localhost:{port}")
logger.info("Open in your browser and enter a topic to research.")
await asyncio.Future() # run forever
if __name__ == "__main__":
asyncio.run(main())
File diff suppressed because it is too large Load Diff
@@ -0,0 +1,930 @@
#!/usr/bin/env python3
"""
Two-Node ContextHandoff Demo
Demonstrates ContextHandoff between two EventLoopNode instances:
Node A (Researcher) ContextHandoff Node B (Analyst)
Real LLM, real FileConversationStore, real EventBus.
Streams both nodes to a browser via WebSocket.
Usage:
cd /home/timothy/oss/hive/core
python demos/handoff_demo.py
Then open http://localhost:8766 in your browser.
"""
import asyncio
import json
import logging
import sys
import tempfile
from http import HTTPStatus
from pathlib import Path
import httpx
import websockets
from bs4 import BeautifulSoup
from websockets.http11 import Request, Response
# Add core, tools, and hive root to path
_CORE_DIR = Path(__file__).resolve().parent.parent
_HIVE_DIR = _CORE_DIR.parent
sys.path.insert(0, str(_CORE_DIR)) # framework.*
sys.path.insert(0, str(_HIVE_DIR / "tools" / "src")) # aden_tools.*
sys.path.insert(0, str(_HIVE_DIR)) # core.framework.* (for aden_tools imports)
from aden_tools.credentials import CREDENTIAL_SPECS, CredentialStoreAdapter # noqa: E402
from core.framework.credentials import CredentialStore # noqa: E402
from framework.credentials.storage import ( # noqa: E402
CompositeStorage,
EncryptedFileStorage,
EnvVarStorage,
)
from framework.graph.context_handoff import ContextHandoff # noqa: E402
from framework.graph.conversation import NodeConversation # noqa: E402
from framework.graph.event_loop_node import EventLoopNode, LoopConfig # noqa: E402
from framework.graph.node import NodeContext, NodeSpec, SharedMemory # noqa: E402
from framework.llm.litellm import LiteLLMProvider # noqa: E402
from framework.llm.provider import Tool # noqa: E402
from framework.runner.tool_registry import ToolRegistry # noqa: E402
from framework.runtime.core import Runtime # noqa: E402
from framework.runtime.event_bus import EventBus, EventType # noqa: E402
from framework.storage.conversation_store import FileConversationStore # noqa: E402
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(message)s")
logger = logging.getLogger("handoff_demo")
# -------------------------------------------------------------------------
# Persistent state
# -------------------------------------------------------------------------
STORE_DIR = Path(tempfile.mkdtemp(prefix="hive_handoff_"))
RUNTIME = Runtime(STORE_DIR / "runtime")
LLM = LiteLLMProvider(model="claude-sonnet-4-5-20250929")
# -------------------------------------------------------------------------
# Credentials
# -------------------------------------------------------------------------
# Composite credential store: encrypted files (primary) + env vars (fallback)
_env_mapping = {name: spec.env_var for name, spec in CREDENTIAL_SPECS.items()}
_composite = CompositeStorage(
primary=EncryptedFileStorage(),
fallbacks=[EnvVarStorage(env_mapping=_env_mapping)],
)
CREDENTIALS = CredentialStoreAdapter(CredentialStore(storage=_composite))
for _name in ["brave_search", "hubspot"]:
_val = CREDENTIALS.get(_name)
if _val:
logger.debug("credential %s: OK (len=%d)", _name, len(_val))
else:
logger.debug("credential %s: not found", _name)
# -------------------------------------------------------------------------
# Tool Registry — web_search + web_scrape for Node A (Researcher)
# -------------------------------------------------------------------------
TOOL_REGISTRY = ToolRegistry()
def _exec_web_search(inputs: dict) -> dict:
api_key = CREDENTIALS.get("brave_search")
if not api_key:
return {"error": "brave_search credential not configured"}
query = inputs.get("query", "")
num_results = min(inputs.get("num_results", 10), 20)
resp = httpx.get(
"https://api.search.brave.com/res/v1/web/search",
params={"q": query, "count": num_results},
headers={
"X-Subscription-Token": api_key,
"Accept": "application/json",
},
timeout=30.0,
)
if resp.status_code != 200:
return {"error": f"Brave API HTTP {resp.status_code}"}
data = resp.json()
results = [
{
"title": item.get("title", ""),
"url": item.get("url", ""),
"snippet": item.get("description", ""),
}
for item in data.get("web", {}).get("results", [])[:num_results]
]
return {"query": query, "results": results, "total": len(results)}
TOOL_REGISTRY.register(
name="web_search",
tool=Tool(
name="web_search",
description=(
"Search the web for current information. "
"Returns titles, URLs, and snippets from search results."
),
parameters={
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query (1-500 characters)",
},
"num_results": {
"type": "integer",
"description": "Number of results (1-20, default 10)",
},
},
"required": ["query"],
},
),
executor=lambda inputs: _exec_web_search(inputs),
)
_SCRAPE_HEADERS = {
"User-Agent": (
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
"AppleWebKit/537.36 (KHTML, like Gecko) "
"Chrome/131.0.0.0 Safari/537.36"
),
"Accept": "text/html,application/xhtml+xml",
}
def _exec_web_scrape(inputs: dict) -> dict:
url = inputs.get("url", "")
max_length = max(1000, min(inputs.get("max_length", 50000), 500000))
if not url.startswith(("http://", "https://")):
url = "https://" + url
try:
resp = httpx.get(
url,
timeout=30.0,
follow_redirects=True,
headers=_SCRAPE_HEADERS,
)
if resp.status_code != 200:
return {"error": f"HTTP {resp.status_code}"}
soup = BeautifulSoup(resp.text, "html.parser")
for tag in soup(["script", "style", "nav", "footer", "header", "aside", "noscript"]):
tag.decompose()
title = soup.title.get_text(strip=True) if soup.title else ""
main = (
soup.find("article")
or soup.find("main")
or soup.find(attrs={"role": "main"})
or soup.find("body")
)
text = main.get_text(separator=" ", strip=True) if main else ""
text = " ".join(text.split())
if len(text) > max_length:
text = text[:max_length] + "..."
return {
"url": url,
"title": title,
"content": text,
"length": len(text),
}
except httpx.TimeoutException:
return {"error": "Request timed out"}
except Exception as e:
return {"error": f"Scrape failed: {e}"}
TOOL_REGISTRY.register(
name="web_scrape",
tool=Tool(
name="web_scrape",
description=(
"Scrape and extract text content from a webpage URL. "
"Returns the page title and main text content."
),
parameters={
"type": "object",
"properties": {
"url": {
"type": "string",
"description": "URL of the webpage to scrape",
},
"max_length": {
"type": "integer",
"description": "Maximum text length (default 50000)",
},
},
"required": ["url"],
},
),
executor=lambda inputs: _exec_web_scrape(inputs),
)
logger.info(
"ToolRegistry loaded: %s",
", ".join(TOOL_REGISTRY.get_registered_names()),
)
# -------------------------------------------------------------------------
# Node Specs
# -------------------------------------------------------------------------
RESEARCHER_SPEC = NodeSpec(
id="researcher",
name="Researcher",
description="Researches a topic using web search and scraping tools",
node_type="event_loop",
input_keys=["topic"],
output_keys=["research_summary"],
system_prompt=(
"You are a thorough research assistant. Your job is to research "
"the given topic using the web_search and web_scrape tools.\n\n"
"1. Search for relevant information on the topic\n"
"2. Scrape 1-2 of the most promising URLs for details\n"
"3. Synthesize your findings into a comprehensive summary\n"
"4. Use set_output with key='research_summary' to save your "
"findings\n\n"
"Be thorough but efficient. Aim for 2-4 search/scrape calls, "
"then summarize and set_output."
),
)
ANALYST_SPEC = NodeSpec(
id="analyst",
name="Analyst",
description="Analyzes research findings and provides insights",
node_type="event_loop",
input_keys=["context"],
output_keys=["analysis"],
system_prompt=(
"You are a strategic analyst. You receive research findings from "
"a previous researcher and must:\n\n"
"1. Identify key themes and patterns\n"
"2. Assess the reliability and significance of the findings\n"
"3. Provide actionable insights and recommendations\n"
"4. Use set_output with key='analysis' to save your analysis\n\n"
"Be concise but insightful. Focus on what matters most."
),
)
# -------------------------------------------------------------------------
# HTML page
# -------------------------------------------------------------------------
HTML_PAGE = ( # noqa: E501
"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>ContextHandoff Demo</title>
<style>
* {
box-sizing: border-box;
margin: 0;
padding: 0;
}
body {
font-family: 'SF Mono', 'Fira Code', monospace;
background: #0d1117;
color: #c9d1d9;
height: 100vh;
display: flex;
flex-direction: column;
}
header {
background: #161b22;
padding: 12px 20px;
border-bottom: 1px solid #30363d;
display: flex;
align-items: center;
gap: 16px;
}
header h1 {
font-size: 16px;
color: #58a6ff;
font-weight: 600;
}
.badge {
font-size: 12px;
padding: 3px 10px;
border-radius: 12px;
background: #21262d;
color: #8b949e;
}
.badge.researcher {
background: #1a3a5c;
color: #58a6ff;
}
.badge.analyst {
background: #1a4b2e;
color: #3fb950;
}
.badge.handoff {
background: #3d1f00;
color: #d29922;
}
.badge.done {
background: #21262d;
color: #8b949e;
}
.badge.error {
background: #4b1a1a;
color: #f85149;
}
.chat {
flex: 1;
overflow-y: auto;
padding: 16px;
}
.msg {
margin: 8px 0;
padding: 10px 14px;
border-radius: 8px;
line-height: 1.6;
white-space: pre-wrap;
word-wrap: break-word;
}
.msg.user {
background: #1a3a5c;
color: #58a6ff;
}
.msg.assistant {
background: #161b22;
color: #c9d1d9;
}
.msg.assistant.analyst-msg {
border-left: 3px solid #3fb950;
}
.msg.event {
background: transparent;
color: #8b949e;
font-size: 11px;
padding: 4px 14px;
border-left: 3px solid #30363d;
}
.msg.event.loop {
border-left-color: #58a6ff;
}
.msg.event.tool {
border-left-color: #d29922;
}
.msg.event.stall {
border-left-color: #f85149;
}
.handoff-banner {
margin: 16px 0;
padding: 16px;
background: #1c1200;
border: 1px solid #d29922;
border-radius: 8px;
text-align: center;
}
.handoff-banner h3 {
color: #d29922;
font-size: 14px;
margin-bottom: 8px;
}
.handoff-banner p, .result-banner p {
color: #8b949e;
font-size: 12px;
line-height: 1.5;
max-height: 200px;
overflow-y: auto;
white-space: pre-wrap;
text-align: left;
}
.result-banner {
margin: 16px 0;
padding: 16px;
background: #0a2614;
border: 1px solid #3fb950;
border-radius: 8px;
}
.result-banner h3 {
color: #3fb950;
font-size: 14px;
margin-bottom: 8px;
text-align: center;
}
.result-banner .label {
color: #58a6ff;
font-size: 11px;
font-weight: 600;
margin-top: 10px;
margin-bottom: 2px;
}
.result-banner .tokens {
color: #484f58;
font-size: 11px;
text-align: center;
margin-top: 10px;
}
.input-bar {
padding: 12px 16px;
background: #161b22;
border-top: 1px solid #30363d;
display: flex;
gap: 8px;
}
.input-bar input {
flex: 1;
background: #0d1117;
border: 1px solid #30363d;
color: #c9d1d9;
padding: 8px 12px;
border-radius: 6px;
font-family: inherit;
font-size: 14px;
outline: none;
}
.input-bar input:focus {
border-color: #58a6ff;
}
.input-bar button {
background: #238636;
color: #fff;
border: none;
padding: 8px 20px;
border-radius: 6px;
cursor: pointer;
font-family: inherit;
font-weight: 600;
}
.input-bar button:hover {
background: #2ea043;
}
.input-bar button:disabled {
background: #21262d;
color: #484f58;
cursor: not-allowed;
}
</style>
</head>
<body>
<header>
<h1>ContextHandoff Demo</h1>
<span id="phase" class="badge">Idle</span>
<span id="iter" class="badge" style="display:none">Step 0</span>
</header>
<div id="chat" class="chat"></div>
<div class="input-bar">
<input id="input" type="text"
placeholder="Enter a research topic..." autofocus />
<button id="go" onclick="run()">Research</button>
</div>
<script>
let ws = null;
let currentAssistantEl = null;
let iterCount = 0;
let currentPhase = 'idle';
const chat = document.getElementById('chat');
const phase = document.getElementById('phase');
const iterEl = document.getElementById('iter');
const goBtn = document.getElementById('go');
const inputEl = document.getElementById('input');
inputEl.addEventListener('keydown', e => {
if (e.key === 'Enter') run();
});
function setPhase(text, cls) {
phase.textContent = text;
phase.className = 'badge ' + cls;
currentPhase = cls;
}
function addMsg(text, cls) {
const el = document.createElement('div');
el.className = 'msg ' + cls;
el.textContent = text;
chat.appendChild(el);
chat.scrollTop = chat.scrollHeight;
return el;
}
function addHandoffBanner(summary) {
const banner = document.createElement('div');
banner.className = 'handoff-banner';
const h3 = document.createElement('h3');
h3.textContent = 'Context Handoff: Researcher -> Analyst';
const p = document.createElement('p');
p.textContent = summary || 'Passing research context...';
banner.appendChild(h3);
banner.appendChild(p);
chat.appendChild(banner);
chat.scrollTop = chat.scrollHeight;
}
function addResultBanner(researcher, analyst, tokens) {
const banner = document.createElement('div');
banner.className = 'result-banner';
const h3 = document.createElement('h3');
h3.textContent = 'Pipeline Complete';
banner.appendChild(h3);
if (researcher && researcher.research_summary) {
const lbl = document.createElement('div');
lbl.className = 'label';
lbl.textContent = 'RESEARCH SUMMARY';
banner.appendChild(lbl);
const p = document.createElement('p');
p.textContent = researcher.research_summary;
banner.appendChild(p);
}
if (analyst && analyst.analysis) {
const lbl = document.createElement('div');
lbl.className = 'label';
lbl.textContent = 'ANALYSIS';
lbl.style.color = '#3fb950';
banner.appendChild(lbl);
const p = document.createElement('p');
p.textContent = analyst.analysis;
banner.appendChild(p);
}
if (tokens) {
const t = document.createElement('div');
t.className = 'tokens';
t.textContent = 'Total tokens: ' + tokens.toLocaleString();
banner.appendChild(t);
}
chat.appendChild(banner);
chat.scrollTop = chat.scrollHeight;
}
function connect() {
ws = new WebSocket('ws://' + location.host + '/ws');
ws.onopen = () => {
setPhase('Ready', 'done');
goBtn.disabled = false;
};
ws.onmessage = handleEvent;
ws.onerror = () => { setPhase('Error', 'error'); };
ws.onclose = () => {
setPhase('Reconnecting...', '');
goBtn.disabled = true;
setTimeout(connect, 2000);
};
}
function handleEvent(msg) {
const evt = JSON.parse(msg.data);
if (evt.type === 'phase') {
if (evt.phase === 'researcher') {
setPhase('Researcher', 'researcher');
} else if (evt.phase === 'handoff') {
setPhase('Handoff', 'handoff');
} else if (evt.phase === 'analyst') {
setPhase('Analyst', 'analyst');
}
iterCount = 0;
iterEl.style.display = 'none';
}
else if (evt.type === 'llm_text_delta') {
if (currentAssistantEl) {
currentAssistantEl.textContent += evt.content;
chat.scrollTop = chat.scrollHeight;
}
}
else if (evt.type === 'node_loop_iteration') {
iterCount = evt.iteration || (iterCount + 1);
iterEl.textContent = 'Step ' + iterCount;
iterEl.style.display = '';
}
else if (evt.type === 'tool_call_started') {
var info = evt.tool_name + '('
+ JSON.stringify(evt.tool_input).slice(0, 120) + ')';
addMsg('TOOL ' + info, 'event tool');
}
else if (evt.type === 'tool_call_completed') {
var preview = (evt.result || '').slice(0, 200);
var cls = evt.is_error ? 'stall' : 'tool';
addMsg(
'RESULT ' + evt.tool_name + ': ' + preview,
'event ' + cls
);
var assistCls = currentPhase === 'analyst'
? 'assistant analyst-msg' : 'assistant';
currentAssistantEl = addMsg('', assistCls);
}
else if (evt.type === 'handoff_context') {
addHandoffBanner(evt.summary);
var assistCls = 'assistant analyst-msg';
currentAssistantEl = addMsg('', assistCls);
}
else if (evt.type === 'node_result') {
if (evt.node_id === 'researcher') {
if (currentAssistantEl
&& !currentAssistantEl.textContent) {
currentAssistantEl.remove();
}
}
}
else if (evt.type === 'done') {
setPhase('Done', 'done');
iterEl.style.display = 'none';
if (currentAssistantEl
&& !currentAssistantEl.textContent) {
currentAssistantEl.remove();
}
currentAssistantEl = null;
addResultBanner(
evt.researcher, evt.analyst, evt.total_tokens
);
goBtn.disabled = false;
inputEl.placeholder = 'Enter another topic...';
}
else if (evt.type === 'error') {
setPhase('Error', 'error');
addMsg('ERROR ' + evt.message, 'event stall');
goBtn.disabled = false;
}
else if (evt.type === 'node_stalled') {
addMsg('STALLED ' + evt.reason, 'event stall');
}
}
function run() {
const text = inputEl.value.trim();
if (!text || !ws || ws.readyState !== 1) return;
chat.innerHTML = '';
addMsg(text, 'user');
currentAssistantEl = addMsg('', 'assistant');
inputEl.value = '';
goBtn.disabled = true;
ws.send(JSON.stringify({ topic: text }));
}
connect();
</script>
</body>
</html>"""
)
# -------------------------------------------------------------------------
# WebSocket handler — sequential Node A → Handoff → Node B
# -------------------------------------------------------------------------
async def handle_ws(websocket):
"""Run the two-node handoff pipeline per user message."""
try:
async for raw in websocket:
try:
msg = json.loads(raw)
except Exception:
continue
topic = msg.get("topic", "")
if not topic:
continue
logger.info(f"Starting handoff pipeline for: {topic}")
try:
await _run_pipeline(websocket, topic)
except websockets.exceptions.ConnectionClosed:
logger.info("WebSocket closed during pipeline")
return
except Exception as e:
logger.exception("Pipeline error")
try:
await websocket.send(json.dumps({"type": "error", "message": str(e)}))
except Exception:
pass
except websockets.exceptions.ConnectionClosed:
pass
async def _run_pipeline(websocket, topic: str):
"""Execute: Node A (research) → ContextHandoff → Node B (analysis)."""
import shutil
# Fresh stores for each run
run_dir = Path(tempfile.mkdtemp(prefix="hive_run_", dir=STORE_DIR))
store_a = FileConversationStore(run_dir / "node_a")
store_b = FileConversationStore(run_dir / "node_b")
# Shared event bus
bus = EventBus()
async def forward_event(event):
try:
payload = {"type": event.type.value, **event.data}
if event.node_id:
payload["node_id"] = event.node_id
await websocket.send(json.dumps(payload))
except Exception:
pass
bus.subscribe(
event_types=[
EventType.NODE_LOOP_STARTED,
EventType.NODE_LOOP_ITERATION,
EventType.NODE_LOOP_COMPLETED,
EventType.LLM_TEXT_DELTA,
EventType.TOOL_CALL_STARTED,
EventType.TOOL_CALL_COMPLETED,
EventType.NODE_STALLED,
],
handler=forward_event,
)
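# For illustration, a forwarded message serialises to something like
#   {"type": "tool_call_started", "tool_name": "web_search",
#    "tool_input": {...}, "node_id": "researcher"}
# (exact field names are assumed from the browser-side handleEvent switch).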
tools = list(TOOL_REGISTRY.get_tools().values())
tool_executor = TOOL_REGISTRY.get_executor()
# ---- Phase 1: Researcher ------------------------------------------------
await websocket.send(json.dumps({"type": "phase", "phase": "researcher"}))
node_a = EventLoopNode(
event_bus=bus,
judge=None, # implicit judge: accept when output_keys filled
config=LoopConfig(
max_iterations=20,
max_tool_calls_per_turn=10,
max_history_tokens=32_000,
),
conversation_store=store_a,
tool_executor=tool_executor,
)
ctx_a = NodeContext(
runtime=RUNTIME,
node_id="researcher",
node_spec=RESEARCHER_SPEC,
memory=SharedMemory(),
input_data={"topic": topic},
llm=LLM,
available_tools=tools,
)
result_a = await node_a.execute(ctx_a)
logger.info(
"Researcher done: success=%s, tokens=%s",
result_a.success,
result_a.tokens_used,
)
await websocket.send(
json.dumps(
{
"type": "node_result",
"node_id": "researcher",
"success": result_a.success,
"output": result_a.output,
}
)
)
if not result_a.success:
await websocket.send(
json.dumps(
{
"type": "error",
"message": f"Researcher failed: {result_a.error}",
}
)
)
return
# ---- Phase 2: Context Handoff -------------------------------------------
await websocket.send(json.dumps({"type": "phase", "phase": "handoff"}))
# Restore the researcher's conversation from store
conversation_a = await NodeConversation.restore(store_a)
if conversation_a is None:
await websocket.send(
json.dumps(
{
"type": "error",
"message": "Failed to restore researcher conversation",
}
)
)
return
handoff_engine = ContextHandoff(llm=LLM)
handoff_context = handoff_engine.summarize_conversation(
conversation=conversation_a,
node_id="researcher",
output_keys=["research_summary"],
)
formatted_handoff = ContextHandoff.format_as_input(handoff_context)
logger.info(
"Handoff: %d turns, ~%d tokens, keys=%s",
handoff_context.turn_count,
handoff_context.total_tokens_used,
list(handoff_context.key_outputs.keys()),
)
# Send handoff context to browser
await websocket.send(
json.dumps(
{
"type": "handoff_context",
"summary": handoff_context.summary[:500],
"turn_count": handoff_context.turn_count,
"tokens": handoff_context.total_tokens_used,
"key_outputs": handoff_context.key_outputs,
}
)
)
# ---- Phase 3: Analyst ---------------------------------------------------
await websocket.send(json.dumps({"type": "phase", "phase": "analyst"}))
node_b = EventLoopNode(
event_bus=bus,
judge=None, # implicit judge
config=LoopConfig(
max_iterations=10,
max_tool_calls_per_turn=5,
max_history_tokens=32_000,
),
conversation_store=store_b,
)
ctx_b = NodeContext(
runtime=RUNTIME,
node_id="analyst",
node_spec=ANALYST_SPEC,
memory=SharedMemory(),
input_data={"context": formatted_handoff},
llm=LLM,
available_tools=[],
)
result_b = await node_b.execute(ctx_b)
logger.info(
"Analyst done: success=%s, tokens=%s",
result_b.success,
result_b.tokens_used,
)
# ---- Done ---------------------------------------------------------------
await websocket.send(
json.dumps(
{
"type": "done",
"researcher": result_a.output,
"analyst": result_b.output,
"total_tokens": ((result_a.tokens_used or 0) + (result_b.tokens_used or 0)),
}
)
)
# Clean up temp stores
try:
shutil.rmtree(run_dir)
except Exception:
pass
# -------------------------------------------------------------------------
# HTTP handler
# -------------------------------------------------------------------------
async def process_request(connection, request: Request):
"""Serve HTML on GET /, upgrade to WebSocket on /ws."""
if request.path == "/ws":
return None
return Response(
HTTPStatus.OK,
"OK",
websockets.Headers({"Content-Type": "text/html; charset=utf-8"}),
HTML_PAGE.encode(),
)
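# Any other path (e.g. "/") gets the HTML page above, so a single port
# serves both the UI and the WebSocket endpoint.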
# -------------------------------------------------------------------------
# Main
# -------------------------------------------------------------------------
async def main():
port = 8766
async with websockets.serve(
handle_ws,
"0.0.0.0",
port,
process_request=process_request,
):
logger.info(f"Handoff demo at http://localhost:{port}")
logger.info("Enter a research topic to start the pipeline.")
await asyncio.Future()
if __name__ == "__main__":
asyncio.run(main())
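# To try it (file path assumed; adjust to wherever this demo lives):
#   uv run python core/examples/handoff_demo.py
# then open http://localhost:8766 and enter a research topic.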
File diff suppressed because it is too large
+123
@@ -0,0 +1,123 @@
"""
Minimal Manual Agent Example
----------------------------
This example demonstrates how to build and run an agent programmatically
without using the Claude Code CLI or external LLM APIs.
It uses 'function' nodes to define logic in pure Python, making it perfect
for understanding the core runtime loop:
Setup -> Graph definition -> Execution -> Result
Run with:
uv run python core/examples/manual_agent.py
"""
import asyncio
from pathlib import Path
from framework.graph import EdgeCondition, EdgeSpec, Goal, GraphSpec, NodeSpec
from framework.graph.executor import GraphExecutor
from framework.runtime.core import Runtime
# 1. Define Node Logic (Pure Python Functions)
def greet(name: str) -> str:
"""Generate a simple greeting."""
return f"Hello, {name}!"
def uppercase(greeting: str) -> str:
"""Convert text to uppercase."""
return greeting.upper()
async def main():
print("🚀 Setting up Manual Agent...")
# 2. Define the Goal
# Every agent needs a goal with success criteria
goal = Goal(
id="greet-user",
name="Greet User",
description="Generate a friendly uppercase greeting",
success_criteria=[
{
"id": "greeting_generated",
"description": "Greeting produced",
"metric": "custom",
"target": "any",
}
],
)
# 3. Define Nodes
# Nodes describe steps in the process
node1 = NodeSpec(
id="greeter",
name="Greeter",
description="Generates a simple greeting",
node_type="function",
function="greet", # Matches the registered function name
input_keys=["name"],
output_keys=["greeting"],
)
node2 = NodeSpec(
id="uppercaser",
name="Uppercaser",
description="Converts greeting to uppercase",
node_type="function",
function="uppercase",
input_keys=["greeting"],
output_keys=["final_greeting"],
)
# 4. Define Edges
# Edges define the flow between nodes
edge1 = EdgeSpec(
id="greet-to-upper",
source="greeter",
target="uppercaser",
condition=EdgeCondition.ON_SUCCESS,
)
# 5. Create Graph
# The graph works like a blueprint connecting nodes and edges
graph = GraphSpec(
id="greeting-agent",
goal_id="greet-user",
entry_node="greeter",
terminal_nodes=["uppercaser"],
nodes=[node1, node2],
edges=[edge1],
)
# 6. Initialize Runtime & Executor
# Runtime handles state/memory; Executor runs the graph
runtime = Runtime(storage_path=Path("./agent_logs"))
executor = GraphExecutor(runtime=runtime)
# 7. Register Function Implementations
# Connect string names in NodeSpecs to actual Python functions
executor.register_function("greeter", greet)
executor.register_function("uppercaser", uppercase)
# 8. Execute Agent
print("▶ Executing agent with input: name='Alice'...")
result = await executor.execute(graph=graph, goal=goal, input_data={"name": "Alice"})
# 9. Verify Results
if result.success:
print("\n✅ Success!")
print(f"Path taken: {' -> '.join(result.path)}")
print(f"Final output: {result.output.get('final_greeting')}")
else:
print(f"\n❌ Failed: {result.error}")
if __name__ == "__main__":
# Optional: Enable logging to see internal decision flow
# logging.basicConfig(level=logging.INFO)
asyncio.run(main())
+11 -16
@@ -37,9 +37,9 @@ async def example_1_programmatic_registration():
print(f"\nAvailable tools: {list(tools.keys())}")
# Run the agent with MCP tools available
result = await runner.run({
"objective": "Search for 'Claude AI' and summarize the top 3 results"
})
result = await runner.run(
{"objective": "Search for 'Claude AI' and summarize the top 3 results"}
)
print(f"\nAgent result: {result}")
@@ -78,10 +78,8 @@ async def example_3_config_file():
# Copy example config (in practice, you'd place this in your agent folder)
import shutil
shutil.copy(
"examples/mcp_servers.json",
test_agent_path / "mcp_servers.json"
)
shutil.copy("examples/mcp_servers.json", test_agent_path / "mcp_servers.json")
# Load agent - MCP servers will be auto-discovered
runner = AgentRunner.load(test_agent_path)
@@ -101,27 +99,23 @@ async def example_4_custom_agent_with_mcp_tools():
"""Example 4: Build custom agent that uses MCP tools"""
print("\n=== Example 4: Custom Agent with MCP Tools ===\n")
from framework.builder.workflow import WorkflowBuilder
from framework.builder.workflow import GraphBuilder
# Create a workflow builder
builder = WorkflowBuilder()
builder = GraphBuilder()
# Define goal
builder.set_goal(
goal_id="web-researcher",
name="Web Research Agent",
description="Search the web and summarize findings"
description="Search the web and summarize findings",
)
# Add success criteria
builder.add_success_criterion(
"search-results",
"Successfully retrieve at least 3 web search results"
)
builder.add_success_criterion(
"summary",
"Provide a clear, concise summary of the findings"
"search-results", "Successfully retrieve at least 3 web search results"
)
builder.add_success_criterion("summary", "Provide a clear, concise summary of the findings")
# Add nodes that will use MCP tools
builder.add_node(
@@ -192,6 +186,7 @@ async def main():
except Exception as e:
print(f"\nError running example: {e}")
import traceback
traceback.print_exc()
+2 -2
@@ -4,8 +4,8 @@
"name": "tools",
"description": "Aden tools including web search, file operations, and PDF reading",
"transport": "stdio",
"command": "python",
"args": ["mcp_server.py", "--stdio"],
"command": "uv",
"args": ["run", "python", "mcp_server.py", "--stdio"],
"cwd": "../tools",
"env": {
"BRAVE_SEARCH_API_KEY": "${BRAVE_SEARCH_API_KEY}"
+9 -17
@@ -22,26 +22,22 @@ The framework includes a Goal-Based Testing system (Goal → Agent → Eval):
See `framework.testing` for details.
"""
from framework.schemas.decision import Decision, Option, Outcome, DecisionEvaluation
from framework.schemas.run import Run, RunSummary, Problem
from framework.runtime.core import Runtime
from framework.builder.query import BuilderQuery
from framework.llm import LLMProvider, AnthropicProvider
from framework.runner import AgentRunner, AgentOrchestrator
from framework.llm import AnthropicProvider, LLMProvider
from framework.runner import AgentOrchestrator, AgentRunner
from framework.runtime.core import Runtime
from framework.schemas.decision import Decision, DecisionEvaluation, Option, Outcome
from framework.schemas.run import Problem, Run, RunSummary
# Testing framework
from framework.testing import (
ApprovalStatus,
DebugTool,
ErrorCategory,
Test,
TestResult,
TestSuiteResult,
TestStorage,
ApprovalStatus,
ErrorCategory,
ConstraintTestGenerator,
SuccessCriteriaTestGenerator,
ParallelTestRunner,
ParallelConfig,
DebugTool,
TestSuiteResult,
)
__all__ = [
@@ -70,9 +66,5 @@ __all__ = [
"TestStorage",
"ApprovalStatus",
"ErrorCategory",
"ConstraintTestGenerator",
"SuccessCriteriaTestGenerator",
"ParallelTestRunner",
"ParallelConfig",
"DebugTool",
]
+1 -1
@@ -1,4 +1,4 @@
"""Allow running as python -m framework"""
"""Allow running as ``python -m framework``, which powers the ``hive`` console entry point."""
from framework.cli import main
+3 -3
@@ -2,12 +2,12 @@
from framework.builder.query import BuilderQuery
from framework.builder.workflow import (
GraphBuilder,
BuildSession,
BuildPhase,
ValidationResult,
BuildSession,
GraphBuilder,
TestCase,
TestResult,
ValidationResult,
)
__all__ = [
+46 -42
@@ -8,12 +8,12 @@ This is designed around the questions I need to answer:
4. What should we change? (suggestions)
"""
from typing import Any
from collections import defaultdict
from pathlib import Path
from typing import Any
from framework.schemas.decision import Decision
from framework.schemas.run import Run, RunSummary, RunStatus
from framework.schemas.run import Run, RunStatus, RunSummary
from framework.storage.backend import FileStorage
@@ -196,10 +196,7 @@ class BuilderQuery:
break
# Extract problems
problems = [
f"[{p.severity}] {p.description}"
for p in run.problems
]
problems = [f"[{p.severity}] {p.description}" for p in run.problems]
# Generate suggestions based on the failure
suggestions = self._generate_suggestions(run, failed_decisions)
@@ -253,11 +250,7 @@ class BuilderQuery:
error = decision.outcome.error or "Unknown error"
failure_counts[error] += 1
common_failures = sorted(
failure_counts.items(),
key=lambda x: x[1],
reverse=True
)[:5]
common_failures = sorted(failure_counts.items(), key=lambda x: x[1], reverse=True)[:5]
# Find problematic nodes
node_stats: dict[str, dict[str, int]] = defaultdict(lambda: {"total": 0, "failed": 0})
@@ -328,34 +321,45 @@ class BuilderQuery:
# Suggestion: Fix problematic nodes
for node_id, failure_rate in patterns.problematic_nodes:
suggestions.append({
"type": "node_improvement",
"target": node_id,
"reason": f"Node has {failure_rate:.1%} failure rate",
"recommendation": f"Review and improve node '{node_id}' - high failure rate suggests prompt or tool issues",
"priority": "high" if failure_rate > 0.3 else "medium",
})
suggestions.append(
{
"type": "node_improvement",
"target": node_id,
"reason": f"Node has {failure_rate:.1%} failure rate",
"recommendation": (
f"Review and improve node '{node_id}' - "
"high failure rate suggests prompt or tool issues"
),
"priority": "high" if failure_rate > 0.3 else "medium",
}
)
# Suggestion: Address common failures
for failure, count in patterns.common_failures:
if count >= 2:
suggestions.append({
"type": "error_handling",
"target": failure,
"reason": f"Error occurred {count} times",
"recommendation": f"Add handling for: {failure}",
"priority": "high" if count >= 5 else "medium",
})
suggestions.append(
{
"type": "error_handling",
"target": failure,
"reason": f"Error occurred {count} times",
"recommendation": f"Add handling for: {failure}",
"priority": "high" if count >= 5 else "medium",
}
)
# Suggestion: Overall success rate
if patterns.success_rate < 0.8:
suggestions.append({
"type": "architecture",
"target": goal_id,
"reason": f"Goal success rate is only {patterns.success_rate:.1%}",
"recommendation": "Consider restructuring the agent graph or improving goal definition",
"priority": "high",
})
suggestions.append(
{
"type": "architecture",
"target": goal_id,
"reason": f"Goal success rate is only {patterns.success_rate:.1%}",
"recommendation": (
"Consider restructuring the agent graph or improving goal definition"
),
"priority": "high",
}
)
return suggestions
@@ -408,21 +412,22 @@ class BuilderQuery:
alternatives = [o for o in decision.options if o.id != decision.chosen_option_id]
if alternatives:
alt_desc = alternatives[0].description
chosen_desc = chosen.description if chosen else "unknown"
suggestions.append(
f"Consider alternative: '{alt_desc}' instead of '{chosen.description if chosen else 'unknown'}'"
f"Consider alternative: '{alt_desc}' instead of '{chosen_desc}'"
)
# Check for missing context
if not decision.input_context:
suggestions.append(
f"Decision '{decision.intent}' had no input context - ensure relevant data is passed"
f"Decision '{decision.intent}' had no input context - "
"ensure relevant data is passed"
)
# Check for constraint issues
if decision.active_constraints:
suggestions.append(
f"Review constraints: {', '.join(decision.active_constraints)} - may be too restrictive"
)
constraints = ", ".join(decision.active_constraints)
suggestions.append(f"Review constraints: {constraints} - may be too restrictive")
# Check for reported problems with suggestions
for problem in run.problems:
@@ -471,15 +476,14 @@ class BuilderQuery:
# Decision count difference
if len(run1.decisions) != len(run2.decisions):
differences.append(
f"Decision count: {len(run1.decisions)} vs {len(run2.decisions)}"
)
differences.append(f"Decision count: {len(run1.decisions)} vs {len(run2.decisions)}")
# Find first divergence point
for i, (d1, d2) in enumerate(zip(run1.decisions, run2.decisions)):
for i, (d1, d2) in enumerate(zip(run1.decisions, run2.decisions, strict=False)):
if d1.chosen_option_id != d2.chosen_option_id:
differences.append(
f"Diverged at decision {i}: chose '{d1.chosen_option_id}' vs '{d2.chosen_option_id}'"
f"Diverged at decision {i}: "
f"chose '{d1.chosen_option_id}' vs '{d2.chosen_option_id}'"
)
break
+94 -71
@@ -13,32 +13,35 @@ Each step requires validation and human approval before proceeding.
You cannot skip steps or bypass validation.
"""
from enum import Enum
from pathlib import Path
from collections.abc import Callable
from datetime import datetime
from typing import Any, Callable
from enum import StrEnum
from pathlib import Path
from typing import Any
from pydantic import BaseModel, Field
from framework.graph.edge import EdgeCondition, EdgeSpec, GraphSpec
from framework.graph.goal import Goal
from framework.graph.node import NodeSpec
from framework.graph.edge import EdgeSpec, EdgeCondition, GraphSpec
class BuildPhase(str, Enum):
class BuildPhase(StrEnum):
"""Current phase of the build process."""
INIT = "init" # Just started
GOAL_DRAFT = "goal_draft" # Drafting goal
INIT = "init" # Just started
GOAL_DRAFT = "goal_draft" # Drafting goal
GOAL_APPROVED = "goal_approved" # Goal approved
ADDING_NODES = "adding_nodes" # Adding nodes
ADDING_EDGES = "adding_edges" # Adding edges
TESTING = "testing" # Running tests
APPROVED = "approved" # Fully approved
EXPORTED = "exported" # Exported to file
ADDING_NODES = "adding_nodes" # Adding nodes
ADDING_EDGES = "adding_edges" # Adding edges
TESTING = "testing" # Running tests
APPROVED = "approved" # Fully approved
EXPORTED = "exported" # Exported to file
class ValidationResult(BaseModel):
"""Result of a validation check."""
valid: bool
errors: list[str] = Field(default_factory=list)
warnings: list[str] = Field(default_factory=list)
@@ -47,6 +50,7 @@ class ValidationResult(BaseModel):
class TestCase(BaseModel):
"""A test case for validating agent behavior."""
id: str
description: str
input: dict[str, Any]
@@ -56,6 +60,7 @@ class TestCase(BaseModel):
class TestResult(BaseModel):
"""Result of running a test case."""
test_id: str
passed: bool
actual_output: Any = None
@@ -69,6 +74,7 @@ class BuildSession(BaseModel):
Saved after each approved step so you can resume later.
"""
id: str
name: str
phase: BuildPhase = BuildPhase.INIT
@@ -457,11 +463,14 @@ class GraphBuilder:
# Run the test
import asyncio
result = asyncio.run(executor.execute(
graph=graph,
goal=self.session.goal,
input_data=test.input,
))
result = asyncio.run(
executor.execute(
graph=graph,
goal=self.session.goal,
input_data=test.input,
)
)
# Check result
passed = result.success
@@ -515,12 +524,14 @@ class GraphBuilder:
if not self._pending_validation.valid:
return False
self.session.approvals.append({
"phase": self.session.phase.value,
"comment": comment,
"timestamp": datetime.now().isoformat(),
"validation": self._pending_validation.model_dump(),
})
self.session.approvals.append(
{
"phase": self.session.phase.value,
"comment": comment,
"timestamp": datetime.now().isoformat(),
"validation": self._pending_validation.model_dump(),
}
)
# Advance phase if appropriate
if self.session.phase == BuildPhase.GOAL_DRAFT:
@@ -554,11 +565,13 @@ class GraphBuilder:
return False
self.session.phase = BuildPhase.APPROVED
self.session.approvals.append({
"phase": "final",
"comment": comment,
"timestamp": datetime.now().isoformat(),
})
self.session.approvals.append(
{
"phase": "final",
"comment": comment,
"timestamp": datetime.now().isoformat(),
}
)
self._save_session()
return True
@@ -630,69 +643,75 @@ class GraphBuilder:
"""Generate Python code for the graph."""
lines = [
'"""',
f'Generated agent: {self.session.name}',
f'Generated at: {datetime.now().isoformat()}',
f"Generated agent: {self.session.name}",
f"Generated at: {datetime.now().isoformat()}",
'"""',
'',
'from framework.graph import (',
' Goal, SuccessCriterion, Constraint,',
' NodeSpec, EdgeSpec, EdgeCondition,',
')',
'from framework.graph.edge import GraphSpec',
'from framework.graph.goal import GoalStatus',
'',
'',
'# Goal',
"",
"from framework.graph import (",
" Goal, SuccessCriterion, Constraint,",
" NodeSpec, EdgeSpec, EdgeCondition,",
")",
"from framework.graph.edge import GraphSpec",
"from framework.graph.goal import GoalStatus",
"",
"",
"# Goal",
]
if self.session.goal:
goal_json = self.session.goal.model_dump_json(indent=4)
lines.append('GOAL = Goal.model_validate_json(\'\'\'')
lines.append("GOAL = Goal.model_validate_json('''")
lines.append(goal_json)
lines.append("''')")
else:
lines.append('GOAL = None')
lines.append("GOAL = None")
lines.extend([
'',
'',
'# Nodes',
'NODES = [',
])
lines.extend(
[
"",
"",
"# Nodes",
"NODES = [",
]
)
for node in self.session.nodes:
node_json = node.model_dump_json(indent=4)
lines.append(' NodeSpec.model_validate_json(\'\'\'')
lines.append(" NodeSpec.model_validate_json('''")
lines.append(node_json)
lines.append(" '''),")
lines.extend([
']',
'',
'',
'# Edges',
'EDGES = [',
])
lines.extend(
[
"]",
"",
"",
"# Edges",
"EDGES = [",
]
)
for edge in self.session.edges:
edge_json = edge.model_dump_json(indent=4)
lines.append(' EdgeSpec.model_validate_json(\'\'\'')
lines.append(" EdgeSpec.model_validate_json('''")
lines.append(edge_json)
lines.append(" '''),")
lines.extend([
']',
'',
'',
'# Graph',
])
lines.extend(
[
"]",
"",
"",
"# Graph",
]
)
graph_json = graph.model_dump_json(indent=4)
lines.append('GRAPH = GraphSpec.model_validate_json(\'\'\'')
lines.append("GRAPH = GraphSpec.model_validate_json('''")
lines.append(graph_json)
lines.append("''')")
return '\n'.join(lines)
return "\n".join(lines)
# =========================================================================
# SESSION MANAGEMENT
@@ -743,7 +762,9 @@ class GraphBuilder:
"tests": len(self.session.test_cases),
"tests_passed": sum(1 for t in self.session.test_results if t.passed),
"approvals": len(self.session.approvals),
"pending_validation": self._pending_validation.model_dump() if self._pending_validation else None,
"pending_validation": self._pending_validation.model_dump()
if self._pending_validation
else None,
}
def show(self) -> str:
@@ -755,11 +776,13 @@ class GraphBuilder:
]
if self.session.goal:
lines.extend([
f"Goal: {self.session.goal.name}",
f" {self.session.goal.description}",
"",
])
lines.extend(
[
f"Goal: {self.session.goal.name}",
f" {self.session.goal.description}",
"",
]
)
if self.session.nodes:
lines.append("Nodes:")
+55 -15
@@ -1,30 +1,68 @@
"""
Command-line interface for Goal Agent.
Command-line interface for Aden Hive.
Usage:
python -m core run exports/my-agent --input '{"key": "value"}'
python -m core info exports/my-agent
python -m core validate exports/my-agent
python -m core list exports/
python -m core dispatch exports/ --input '{"key": "value"}'
python -m core shell exports/my-agent
hive run exports/my-agent --input '{"key": "value"}'
hive info exports/my-agent
hive validate exports/my-agent
hive list exports/
hive dispatch exports/ --input '{"key": "value"}'
hive shell exports/my-agent
Testing commands:
python -m core test-generate goal.json
python -m core test-approve <goal_id>
python -m core test-run <agent_path> --goal <goal_id>
python -m core test-debug <goal_id> <test_id>
python -m core test-list <goal_id>
python -m core test-stats <goal_id>
hive test-run <agent_path> --goal <goal_id>
hive test-debug <goal_id> <test_id>
hive test-list <goal_id>
hive test-stats <goal_id>
"""
import argparse
import sys
from pathlib import Path
def _configure_paths():
"""Auto-configure sys.path so agents in exports/ are discoverable.
Resolves the project root by walking up from this file (framework/cli.py lives
inside core/framework/) or from CWD, then adds the exports/ directory to sys.path
if it exists. This eliminates the need for manual PYTHONPATH configuration.
"""
# Strategy 1: resolve relative to this file (works when installed via pip install -e core/)
framework_dir = Path(__file__).resolve().parent # core/framework/
core_dir = framework_dir.parent # core/
project_root = core_dir.parent # project root
# Strategy 2: if project_root doesn't look right, fall back to CWD
if not (project_root / "exports").is_dir() and not (project_root / "core").is_dir():
project_root = Path.cwd()
# Add exports/ to sys.path so agents are importable as top-level packages
exports_dir = project_root / "exports"
if exports_dir.is_dir():
exports_str = str(exports_dir)
if exports_str not in sys.path:
sys.path.insert(0, exports_str)
# Add examples/templates/ to sys.path so template agents are importable
templates_dir = project_root / "examples" / "templates"
if templates_dir.is_dir():
templates_str = str(templates_dir)
if templates_str not in sys.path:
sys.path.insert(0, templates_str)
# Ensure core/ is also in sys.path (for non-editable-install scenarios)
core_str = str(project_root / "core")
if (project_root / "core").is_dir() and core_str not in sys.path:
sys.path.insert(0, core_str)
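# After this runs, an agent exported to exports/my_agent/ is importable as
# `import my_agent` with no PYTHONPATH tweaks (package name assumed to be a
# valid Python identifier).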
def main():
_configure_paths()
parser = argparse.ArgumentParser(
description="Goal Agent - Build and run goal-driven agents"
prog="hive",
description="Aden Hive - Build and run goal-driven agents",
)
parser.add_argument(
"--model",
@@ -36,10 +74,12 @@ def main():
# Register runner commands (run, info, validate, list, dispatch, shell)
from framework.runner.cli import register_commands
register_commands(subparsers)
# Register testing commands (test-generate, test-approve, test-run, test-debug, etc.)
# Register testing commands (test-run, test-debug, test-list, test-stats)
from framework.testing.cli import register_testing_commands
register_testing_commands(subparsers)
args = parser.parse_args()
+64
@@ -0,0 +1,64 @@
"""Shared Hive configuration utilities.
Centralises reading of ~/.hive/configuration.json so that the runner
and every agent template share one implementation instead of copy-pasting
helper functions.
"""
import json
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any
from framework.graph.edge import DEFAULT_MAX_TOKENS
# ---------------------------------------------------------------------------
# Low-level config file access
# ---------------------------------------------------------------------------
HIVE_CONFIG_FILE = Path.home() / ".hive" / "configuration.json"
def get_hive_config() -> dict[str, Any]:
"""Load hive configuration from ~/.hive/configuration.json."""
if not HIVE_CONFIG_FILE.exists():
return {}
try:
with open(HIVE_CONFIG_FILE) as f:
return json.load(f)
except (json.JSONDecodeError, OSError):
return {}
# ---------------------------------------------------------------------------
# Derived helpers
# ---------------------------------------------------------------------------
def get_preferred_model() -> str:
"""Return the user's preferred LLM model string (e.g. 'anthropic/claude-sonnet-4-20250514')."""
llm = get_hive_config().get("llm", {})
if llm.get("provider") and llm.get("model"):
return f"{llm['provider']}/{llm['model']}"
return "anthropic/claude-sonnet-4-20250514"
def get_max_tokens() -> int:
"""Return the configured max_tokens, falling back to DEFAULT_MAX_TOKENS."""
return get_hive_config().get("llm", {}).get("max_tokens", DEFAULT_MAX_TOKENS)
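# Based on the lookups above, a minimal ~/.hive/configuration.json would
# look like this (values illustrative):
#   {
#     "llm": {
#       "provider": "anthropic",
#       "model": "claude-sonnet-4-20250514",
#       "max_tokens": 8192
#     }
#   }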
# ---------------------------------------------------------------------------
# RuntimeConfig shared across agent templates
# ---------------------------------------------------------------------------
@dataclass
class RuntimeConfig:
"""Agent runtime configuration loaded from ~/.hive/configuration.json."""
model: str = field(default_factory=get_preferred_model)
temperature: float = 0.7
max_tokens: int = field(default_factory=get_max_tokens)
api_key: str | None = None
api_base: str | None = None
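A minimal usage sketch (the import path and call site are assumptions, not part of this diff):
from framework.config import RuntimeConfig  # module path assumed

cfg = RuntimeConfig()  # field defaults resolve from ~/.hive/configuration.json
llm_kwargs = {
    "model": cfg.model,
    "temperature": cfg.temperature,
    "max_tokens": cfg.max_tokens,
}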
+122
@@ -0,0 +1,122 @@
"""
Credential Store - Production-ready credential management for Hive.
This module provides secure credential storage with:
- Key-vault structure: Credentials as objects with multiple keys
- Template-based usage: {{cred.key}} patterns for injection
- Split responsibilities: the store holds values, tools define how they are used
- Provider system: Extensible lifecycle management (refresh, validate)
- Multiple backends: Encrypted files, env vars, HashiCorp Vault
Quick Start:
from pydantic import SecretStr
from core.framework.credentials import CredentialKey, CredentialObject, CredentialStore
# Create store with encrypted storage
store = CredentialStore.with_encrypted_storage() # defaults to ~/.hive/credentials
# Get a credential
api_key = store.get("brave_search")
# Resolve templates in headers
headers = store.resolve_headers({
"Authorization": "Bearer {{github_oauth.access_token}}"
})
# Save a new credential
store.save_credential(CredentialObject(
id="my_api",
keys={"api_key": CredentialKey(name="api_key", value=SecretStr("xxx"))}
))
For OAuth2 support:
from core.framework.credentials.oauth2 import BaseOAuth2Provider, OAuth2Config
For Aden server sync:
from core.framework.credentials.aden import (
AdenCredentialClient,
AdenClientConfig,
AdenSyncProvider,
)
For Vault integration:
from core.framework.credentials.vault import HashiCorpVaultStorage
"""
from .models import (
CredentialDecryptionError,
CredentialError,
CredentialKey,
CredentialKeyNotFoundError,
CredentialNotFoundError,
CredentialObject,
CredentialRefreshError,
CredentialType,
CredentialUsageSpec,
CredentialValidationError,
)
from .provider import (
BearerTokenProvider,
CredentialProvider,
StaticProvider,
)
from .storage import (
CompositeStorage,
CredentialStorage,
EncryptedFileStorage,
EnvVarStorage,
InMemoryStorage,
)
from .store import CredentialStore
from .template import TemplateResolver
# Aden sync components (lazy import to avoid httpx dependency when not needed)
# Usage: from core.framework.credentials.aden import AdenSyncProvider
# Or: from core.framework.credentials import AdenSyncProvider
try:
from .aden import (
AdenCachedStorage,
AdenClientConfig,
AdenCredentialClient,
AdenSyncProvider,
)
_ADEN_AVAILABLE = True
except ImportError:
_ADEN_AVAILABLE = False
__all__ = [
# Main store
"CredentialStore",
# Models
"CredentialObject",
"CredentialKey",
"CredentialType",
"CredentialUsageSpec",
# Providers
"CredentialProvider",
"StaticProvider",
"BearerTokenProvider",
# Storage backends
"CredentialStorage",
"EncryptedFileStorage",
"EnvVarStorage",
"InMemoryStorage",
"CompositeStorage",
# Template resolution
"TemplateResolver",
# Exceptions
"CredentialError",
"CredentialNotFoundError",
"CredentialKeyNotFoundError",
"CredentialRefreshError",
"CredentialValidationError",
"CredentialDecryptionError",
# Aden sync (optional - requires httpx)
"AdenSyncProvider",
"AdenCredentialClient",
"AdenClientConfig",
"AdenCachedStorage",
]
# Track Aden availability for runtime checks
ADEN_AVAILABLE = _ADEN_AVAILABLE
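A guarded-import sketch for callers of this package (caller code is hypothetical):
from core.framework.credentials import ADEN_AVAILABLE

if ADEN_AVAILABLE:
    from core.framework.credentials import AdenSyncProvider  # requires httpx
else:
    AdenSyncProvider = None  # local-only credential storage, no Aden sync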
@@ -0,0 +1,76 @@
"""
Aden Credential Sync.
Components for synchronizing credentials with the Aden authentication server.
The Aden server handles OAuth2 authorization flows and maintains refresh tokens.
These components fetch and cache access tokens locally while delegating
lifecycle management to Aden.
Components:
- AdenCredentialClient: HTTP client for Aden API
- AdenSyncProvider: CredentialProvider that syncs with Aden
- AdenCachedStorage: Storage with local cache + Aden fallback
Quick Start:
import os
from core.framework.credentials import CredentialStore
from core.framework.credentials.storage import EncryptedFileStorage
from core.framework.credentials.aden import (
AdenCredentialClient,
AdenClientConfig,
AdenSyncProvider,
)
# Configure (API key loaded from ADEN_API_KEY env var)
client = AdenCredentialClient(AdenClientConfig(
base_url=os.environ["ADEN_API_URL"],
))
provider = AdenSyncProvider(client=client)
store = CredentialStore(
storage=EncryptedFileStorage(),
providers=[provider],
auto_refresh=True,
)
# Initial sync
provider.sync_all(store)
# Use normally
token = store.get_key("hubspot", "access_token")
See docs/aden-credential-sync.md for detailed documentation.
"""
from .client import (
AdenAuthenticationError,
AdenClientConfig,
AdenClientError,
AdenCredentialClient,
AdenCredentialResponse,
AdenIntegrationInfo,
AdenNotFoundError,
AdenRateLimitError,
AdenRefreshError,
)
from .provider import AdenSyncProvider
from .storage import AdenCachedStorage
__all__ = [
# Client
"AdenCredentialClient",
"AdenClientConfig",
"AdenCredentialResponse",
"AdenIntegrationInfo",
# Client errors
"AdenClientError",
"AdenAuthenticationError",
"AdenNotFoundError",
"AdenRateLimitError",
"AdenRefreshError",
# Provider
"AdenSyncProvider",
# Storage
"AdenCachedStorage",
]
+481
@@ -0,0 +1,481 @@
"""
Aden Credential Client.
HTTP client for communicating with the Aden authentication server.
The Aden server handles OAuth2 authorization flows and token management.
This client fetches tokens and delegates refresh operations to Aden.
Usage:
# API key loaded from ADEN_API_KEY environment variable by default
client = AdenCredentialClient(AdenClientConfig(
base_url="https://api.adenhq.com",
))
# Or explicitly provide the API key
client = AdenCredentialClient(AdenClientConfig(
base_url="https://api.adenhq.com",
api_key="your-api-key",
))
# Fetch a credential
response = client.get_credential("hubspot")
if response:
print(f"Token expires at: {response.expires_at}")
# Request a refresh
refreshed = client.request_refresh("hubspot")
"""
from __future__ import annotations
import logging
import os
import time
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any
import httpx
logger = logging.getLogger(__name__)
class AdenClientError(Exception):
"""Base exception for Aden client errors."""
pass
class AdenAuthenticationError(AdenClientError):
"""Raised when API key is invalid or revoked."""
pass
class AdenNotFoundError(AdenClientError):
"""Raised when integration is not found."""
pass
class AdenRefreshError(AdenClientError):
"""Raised when token refresh fails."""
def __init__(
self,
message: str,
requires_reauthorization: bool = False,
reauthorization_url: str | None = None,
):
super().__init__(message)
self.requires_reauthorization = requires_reauthorization
self.reauthorization_url = reauthorization_url
class AdenRateLimitError(AdenClientError):
"""Raised when rate limited."""
def __init__(self, message: str, retry_after: int = 60):
super().__init__(message)
self.retry_after = retry_after
@dataclass
class AdenClientConfig:
"""Configuration for Aden API client."""
base_url: str
"""Base URL of the Aden server (e.g., 'https://api.adenhq.com')."""
api_key: str | None = None
"""Agent's API key for authenticating with Aden.
If not provided, loaded from ADEN_API_KEY environment variable."""
tenant_id: str | None = None
"""Optional tenant ID for multi-tenant deployments."""
timeout: float = 30.0
"""Request timeout in seconds."""
retry_attempts: int = 3
"""Number of retry attempts for transient failures."""
retry_delay: float = 1.0
"""Base delay between retries in seconds (exponential backoff)."""
def __post_init__(self) -> None:
"""Load API key from environment if not provided."""
if self.api_key is None:
self.api_key = os.environ.get("ADEN_API_KEY")
if not self.api_key:
raise ValueError(
"Aden API key not provided. Either pass api_key to AdenClientConfig "
"or set the ADEN_API_KEY environment variable."
)
@dataclass
class AdenCredentialResponse:
"""Response from Aden server containing credential data."""
integration_id: str
"""Unique identifier for the integration (e.g., 'hubspot')."""
integration_type: str
"""Type of integration (e.g., 'hubspot', 'github', 'slack')."""
access_token: str
"""The access token for API calls."""
token_type: str = "Bearer"
"""Token type (usually 'Bearer')."""
expires_at: datetime | None = None
"""When the access token expires (UTC)."""
scopes: list[str] = field(default_factory=list)
"""OAuth2 scopes granted to this token."""
metadata: dict[str, Any] = field(default_factory=dict)
"""Additional integration-specific metadata."""
@classmethod
def from_dict(
cls, data: dict[str, Any], integration_id: str | None = None
) -> AdenCredentialResponse:
"""Create from API response dictionary or normalized credential dict."""
expires_at = None
if data.get("expires_at"):
expires_at = datetime.fromisoformat(data["expires_at"].replace("Z", "+00:00"))
resolved_integration_id = (
integration_id
or data.get("integration_id")
or data.get("alias")
or data.get("provider", "")
)
resolved_integration_type = data.get("integration_type") or data.get("provider", "")
metadata = data.get("metadata")
if metadata is None and data.get("email"):
metadata = {"email": data.get("email")}
if metadata is None:
metadata = {}
return cls(
integration_id=resolved_integration_id,
integration_type=resolved_integration_type,
access_token=data["access_token"],
token_type=data.get("token_type", "Bearer"),
expires_at=expires_at,
scopes=data.get("scopes", []),
metadata=metadata,
)
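# Illustrative round-trip (data values are made up):
#   AdenCredentialResponse.from_dict(
#       {"provider": "hubspot", "access_token": "tok",
#        "expires_at": "2026-03-01T00:00:00Z", "email": "a@b.co"},
#       integration_id="hubspot",
#   )
# resolves integration_id="hubspot", integration_type="hubspot",
# metadata={"email": "a@b.co"}, and a timezone-aware expires_at.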
@dataclass
class AdenIntegrationInfo:
"""Information about an available integration."""
integration_id: str
integration_type: str
status: str # "active", "requires_reauth", "expired"
expires_at: datetime | None = None
@classmethod
def from_dict(cls, data: dict[str, Any]) -> AdenIntegrationInfo:
"""Create from API response dictionary."""
expires_at = None
if data.get("expires_at"):
expires_at = datetime.fromisoformat(data["expires_at"].replace("Z", "+00:00"))
return cls(
integration_id=data["integration_id"],
integration_type=data.get("provider", data["integration_id"]),
status=data.get("status", "unknown"),
expires_at=expires_at,
)
class AdenCredentialClient:
"""
HTTP client for Aden credential server.
Handles communication with the Aden authentication server,
including fetching credentials, requesting refreshes, and
reporting usage statistics.
The client automatically handles:
- Retries with exponential backoff for transient failures
- Proper error classification (auth, not found, rate limit, etc.)
- Request headers for authentication and tenant isolation
Usage:
# API key loaded from ADEN_API_KEY environment variable
config = AdenClientConfig(
base_url="https://api.adenhq.com",
)
client = AdenCredentialClient(config)
# Fetch a credential
cred = client.get_credential("hubspot")
if cred:
headers = {"Authorization": f"Bearer {cred.access_token}"}
# List all integrations
integrations = client.list_integrations()
for info in integrations:
print(f"{info.integration_id}: {info.status}")
# Clean up
client.close()
"""
def __init__(self, config: AdenClientConfig):
"""
Initialize the Aden client.
Args:
config: Client configuration including base URL and API key.
"""
self.config = config
self._client: httpx.Client | None = None
def _get_client(self) -> httpx.Client:
"""Get or create the HTTP client."""
if self._client is None:
headers = {
"Authorization": f"Bearer {self.config.api_key}",
"Content-Type": "application/json",
"User-Agent": "hive-credential-store/1.0",
}
if self.config.tenant_id:
headers["X-Tenant-ID"] = self.config.tenant_id
self._client = httpx.Client(
base_url=self.config.base_url,
timeout=self.config.timeout,
headers=headers,
)
return self._client
def _request_with_retry(
self,
method: str,
path: str,
**kwargs: Any,
) -> httpx.Response:
"""Make a request with retry logic."""
client = self._get_client()
last_error: Exception | None = None
for attempt in range(self.config.retry_attempts):
try:
response = client.request(method, path, **kwargs)
# Handle specific error codes
if response.status_code == 401:
raise AdenAuthenticationError("Agent API key is invalid or revoked")
if response.status_code == 404:
raise AdenNotFoundError(f"Integration not found: {path}")
if response.status_code == 429:
retry_after = int(response.headers.get("Retry-After", 60))
raise AdenRateLimitError(
"Rate limited by Aden server",
retry_after=retry_after,
)
if response.status_code == 400:
data = response.json()
if data.get("error") == "refresh_failed":
raise AdenRefreshError(
data.get("message", "Token refresh failed"),
requires_reauthorization=data.get("requires_reauthorization", False),
reauthorization_url=data.get("reauthorization_url"),
)
# Success or other error
response.raise_for_status()
return response
except (httpx.ConnectError, httpx.TimeoutException) as e:
last_error = e
if attempt < self.config.retry_attempts - 1:
delay = self.config.retry_delay * (2**attempt)
logger.warning(
f"Aden request failed (attempt {attempt + 1}), retrying in {delay}s: {e}"
)
time.sleep(delay)
else:
raise AdenClientError(f"Failed to connect to Aden server: {e}") from e
except (
AdenAuthenticationError,
AdenNotFoundError,
AdenRefreshError,
AdenRateLimitError,
):
# Don't retry these errors
raise
# Should not reach here, but just in case
raise AdenClientError(
f"Request failed after {self.config.retry_attempts} attempts"
) from last_error
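# With the defaults (retry_attempts=3, retry_delay=1.0), a request that keeps
# timing out is retried after 1s and then 2s before AdenClientError is raised;
# auth, not-found, refresh, and rate-limit errors are never retried.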
def get_credential(self, integration_id: str) -> AdenCredentialResponse | None:
"""
Fetch the current credential for an integration.
The Aden server may refresh the token internally if it's expired
before returning it.
Args:
integration_id: The integration identifier (e.g., 'hubspot').
Returns:
Credential response with access token, or None if not found.
Raises:
AdenAuthenticationError: If API key is invalid.
AdenClientError: For connection failures.
"""
try:
response = self._request_with_retry("GET", f"/v1/credentials/{integration_id}")
data = response.json()
return AdenCredentialResponse.from_dict(data, integration_id=integration_id)
except AdenNotFoundError:
return None
def request_refresh(self, integration_id: str) -> AdenCredentialResponse:
"""
Request the Aden server to refresh the token.
Use this when the local store detects an expired or near-expiry token.
The Aden server handles the actual OAuth2 refresh token flow.
Args:
integration_id: The integration identifier.
Returns:
Credential response with new access token.
Raises:
AdenRefreshError: If refresh fails (may require re-authorization).
AdenNotFoundError: If integration not found.
AdenAuthenticationError: If API key is invalid.
AdenRateLimitError: If rate limited.
"""
response = self._request_with_retry("POST", f"/v1/credentials/{integration_id}/refresh")
data = response.json()
return AdenCredentialResponse.from_dict(data, integration_id=integration_id)
def list_integrations(self) -> list[AdenIntegrationInfo]:
"""
List all integrations available for this agent/tenant.
Returns:
List of integration info objects.
Raises:
AdenAuthenticationError: If API key is invalid.
AdenClientError: For connection failures.
"""
response = self._request_with_retry("GET", "/v1/credentials")
data = response.json()
return [AdenIntegrationInfo.from_dict(item) for item in data.get("integrations", [])]
def validate_token(self, integration_id: str) -> dict[str, Any]:
"""
Check if a token is still valid without fetching it.
Args:
integration_id: The integration identifier.
Returns:
Dict with 'valid' bool and optional 'expires_at', 'reason',
'requires_reauthorization', 'reauthorization_url'.
Raises:
AdenNotFoundError: If integration not found.
AdenAuthenticationError: If API key is invalid.
"""
response = self._request_with_retry("GET", f"/v1/credentials/{integration_id}/validate")
return response.json()
def report_usage(
self,
integration_id: str,
operation: str,
status: str = "success",
metadata: dict[str, Any] | None = None,
) -> None:
"""
Report credential usage statistics to Aden.
This is optional and used for analytics/billing.
Args:
integration_id: The integration identifier.
operation: Operation name (e.g., 'api_call').
status: Operation status ('success', 'error').
metadata: Additional operation metadata.
"""
try:
self._request_with_retry(
"POST",
f"/v1/credentials/{integration_id}/usage",
json={
"operation": operation,
"status": status,
"timestamp": datetime.utcnow().isoformat() + "Z",
"metadata": metadata or {},
},
)
except Exception as e:
# Usage reporting is best-effort, don't fail on errors
logger.warning(f"Failed to report usage for '{integration_id}': {e}")
def health_check(self) -> dict[str, Any]:
"""
Check Aden server health and connectivity.
Returns:
Dict with 'status', 'version', 'timestamp', and optionally 'error'.
"""
try:
client = self._get_client()
response = client.get("/health")
if response.status_code == 200:
data = response.json()
data["latency_ms"] = response.elapsed.total_seconds() * 1000
return data
return {
"status": "degraded",
"error": f"Unexpected status code: {response.status_code}",
}
except Exception as e:
return {
"status": "unhealthy",
"error": str(e),
}
def close(self) -> None:
"""Close the HTTP client and release resources."""
if self._client:
self._client.close()
self._client = None
def __enter__(self) -> AdenCredentialClient:
"""Context manager entry."""
return self
def __exit__(self, *args: Any) -> None:
"""Context manager exit."""
self.close()
+415
@@ -0,0 +1,415 @@
"""
Aden Sync Provider.
Provider that synchronizes credentials with the Aden authentication server.
The Aden server is the authoritative source for OAuth2 tokens - this provider
fetches and caches tokens locally while delegating refresh operations to Aden.
Usage:
import os
from core.framework.credentials import CredentialStore
from core.framework.credentials.storage import EncryptedFileStorage
from core.framework.credentials.aden import (
AdenCredentialClient,
AdenClientConfig,
AdenSyncProvider,
)
# Configure client (API key loaded from ADEN_API_KEY env var)
client = AdenCredentialClient(AdenClientConfig(
base_url=os.environ["ADEN_API_URL"],
))
# Create provider
provider = AdenSyncProvider(client=client)
# Create store
store = CredentialStore(
storage=EncryptedFileStorage(),
providers=[provider],
auto_refresh=True,
)
# Initial sync from Aden
provider.sync_all(store)
# Use normally - auto-refreshes via Aden when needed
token = store.get_key("hubspot", "access_token")
"""
from __future__ import annotations
import logging
from datetime import UTC, datetime, timedelta
from typing import TYPE_CHECKING
from pydantic import SecretStr
from ..models import CredentialKey, CredentialObject, CredentialRefreshError, CredentialType
from ..provider import CredentialProvider
from .client import (
AdenClientError,
AdenCredentialClient,
AdenCredentialResponse,
AdenRefreshError,
)
if TYPE_CHECKING:
from ..store import CredentialStore
logger = logging.getLogger(__name__)
class AdenSyncProvider(CredentialProvider):
"""
Provider that synchronizes credentials with the Aden server.
The Aden server handles OAuth2 authorization flows and maintains
refresh tokens. This provider:
- Fetches access tokens from the Aden server
- Delegates token refresh to the Aden server
- Caches tokens locally in the credential store
- Optionally reports usage statistics back to Aden
Key benefits:
- Client secrets never leave the Aden server
- Refresh token security (stored only on Aden)
- Centralized audit logging
- Multi-tenant support
Usage:
client = AdenCredentialClient(AdenClientConfig(
base_url="https://api.adenhq.com",
api_key=os.environ["ADEN_API_KEY"],
))
provider = AdenSyncProvider(client=client)
store = CredentialStore(
storage=EncryptedFileStorage(),
providers=[provider],
auto_refresh=True,
)
"""
def __init__(
self,
client: AdenCredentialClient,
provider_id: str = "aden_sync",
refresh_buffer_minutes: int = 5,
report_usage: bool = False,
):
"""
Initialize the Aden sync provider.
Args:
client: Configured Aden API client.
provider_id: Unique identifier for this provider instance.
Useful for multi-tenant scenarios (e.g., 'aden_tenant_123').
refresh_buffer_minutes: Minutes before expiry to trigger refresh.
Default is 5 minutes.
report_usage: Whether to report usage statistics to Aden server.
"""
self._client = client
self._provider_id = provider_id
self._refresh_buffer = timedelta(minutes=refresh_buffer_minutes)
self._report_usage = report_usage
@property
def provider_id(self) -> str:
"""Unique identifier for this provider."""
return self._provider_id
@property
def supported_types(self) -> list[CredentialType]:
"""Credential types this provider can manage."""
return [CredentialType.OAUTH2, CredentialType.BEARER_TOKEN]
def can_handle(self, credential: CredentialObject) -> bool:
"""
Check if this provider can handle a credential.
Returns True if:
- Credential type is supported (OAUTH2 or BEARER_TOKEN)
- Credential's provider_id matches this provider, OR
- Credential has '_aden_managed' metadata flag
"""
if credential.credential_type not in self.supported_types:
return False
# Check if credential is explicitly linked to this provider
if credential.provider_id == self.provider_id:
return True
# Check for Aden-managed flag in metadata
aden_flag = credential.keys.get("_aden_managed")
if aden_flag and aden_flag.value.get_secret_value() == "true":
return True
return False
def refresh(self, credential: CredentialObject) -> CredentialObject:
"""
Refresh credential by requesting new token from Aden server.
The Aden server handles the actual OAuth2 refresh token flow.
This method simply fetches the result.
Args:
credential: The credential to refresh.
Returns:
Updated credential with new access token.
Raises:
CredentialRefreshError: If refresh fails.
"""
try:
# Request Aden to refresh the token
aden_response = self._client.request_refresh(credential.id)
# Update credential with new values
credential = self._update_credential_from_aden(credential, aden_response)
logger.info(f"Refreshed credential '{credential.id}' via Aden server")
# Report usage if enabled
if self._report_usage:
self._client.report_usage(
integration_id=credential.id,
operation="token_refresh",
status="success",
)
return credential
except AdenRefreshError as e:
logger.error(f"Aden refresh failed for '{credential.id}': {e}")
if e.requires_reauthorization:
raise CredentialRefreshError(
f"Integration '{credential.id}' requires re-authorization. "
f"Visit: {e.reauthorization_url or 'your Aden dashboard'}"
) from e
raise CredentialRefreshError(
f"Failed to refresh credential '{credential.id}': {e}"
) from e
except AdenClientError as e:
logger.error(f"Aden client error for '{credential.id}': {e}")
# Check if local token is still valid
access_key = credential.keys.get("access_token")
if access_key and access_key.expires_at:
if datetime.now(UTC) < access_key.expires_at:
logger.warning(f"Aden unavailable, using cached token for '{credential.id}'")
return credential
raise CredentialRefreshError(
f"Aden server unavailable and token expired for '{credential.id}'"
) from e
def validate(self, credential: CredentialObject) -> bool:
"""
Validate credential via Aden server introspection.
Args:
credential: The credential to validate.
Returns:
True if credential is valid.
"""
try:
result = self._client.validate_token(credential.id)
return result.get("valid", False)
except AdenClientError:
# Fall back to local validation
access_key = credential.keys.get("access_token")
if access_key is None:
return False
if access_key.expires_at is None:
# No expiration - assume valid
return True
return datetime.now(UTC) < access_key.expires_at
def should_refresh(self, credential: CredentialObject) -> bool:
"""
Check if credential should be refreshed.
Returns True if access_token is expired or within the refresh buffer.
Args:
credential: The credential to check.
Returns:
True if credential should be refreshed.
"""
access_key = credential.keys.get("access_token")
if access_key is None:
return False
if access_key.expires_at is None:
return False
# Refresh if within buffer of expiration
return datetime.now(UTC) >= (access_key.expires_at - self._refresh_buffer)
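# Example: with the default 5-minute buffer, an access_token expiring at
# 12:00 UTC is reported refreshable from 11:55 UTC onwards; keys without an
# expiry are never proactively refreshed.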
def fetch_from_aden(self, integration_id: str) -> CredentialObject | None:
"""
Fetch credential directly from Aden server.
Use this for initial population or when local cache is missing.
Args:
integration_id: The integration identifier (e.g., 'hubspot').
Returns:
CredentialObject if found, None otherwise.
Raises:
AdenClientError: For connection failures.
"""
aden_response = self._client.get_credential(integration_id)
if aden_response is None:
return None
return self._aden_response_to_credential(aden_response)
def sync_all(self, store: CredentialStore) -> int:
"""
Sync all credentials from Aden server to local store.
Fetches the list of available integrations from Aden and
populates the local credential store with current tokens.
Args:
store: The credential store to populate.
Returns:
Number of credentials synced.
"""
synced = 0
try:
integrations = self._client.list_integrations()
for info in integrations:
if info.status != "active":
logger.warning(
f"Skipping integration '{info.integration_id}': status={info.status}"
)
continue
try:
cred = self.fetch_from_aden(info.integration_id)
if cred:
store.save_credential(cred)
synced += 1
logger.info(f"Synced credential '{info.integration_id}' from Aden")
except Exception as e:
logger.warning(f"Failed to sync '{info.integration_id}': {e}")
except AdenClientError as e:
logger.error(f"Failed to list integrations from Aden: {e}")
return synced
def report_credential_usage(
self,
credential: CredentialObject,
operation: str,
status: str = "success",
metadata: dict | None = None,
) -> None:
"""
Report credential usage to Aden server.
Args:
credential: The credential that was used.
operation: Operation name (e.g., 'api_call').
status: Operation status ('success', 'error').
metadata: Additional metadata.
"""
if self._report_usage:
self._client.report_usage(
integration_id=credential.id,
operation=operation,
status=status,
metadata=metadata or {},
)
def _update_credential_from_aden(
self,
credential: CredentialObject,
aden_response: AdenCredentialResponse,
) -> CredentialObject:
"""Update credential object from Aden response."""
# Update access token
credential.keys["access_token"] = CredentialKey(
name="access_token",
value=SecretStr(aden_response.access_token),
expires_at=aden_response.expires_at,
)
# Update scopes if present
if aden_response.scopes:
credential.keys["scope"] = CredentialKey(
name="scope",
value=SecretStr(" ".join(aden_response.scopes)),
)
# Mark as Aden-managed
credential.keys["_aden_managed"] = CredentialKey(
name="_aden_managed",
value=SecretStr("true"),
)
# Store integration type
credential.keys["_integration_type"] = CredentialKey(
name="_integration_type",
value=SecretStr(aden_response.integration_type),
)
# Update timestamps
credential.last_refreshed = datetime.now(UTC)
credential.provider_id = self.provider_id
return credential
def _aden_response_to_credential(
self,
aden_response: AdenCredentialResponse,
) -> CredentialObject:
"""Convert Aden response to CredentialObject."""
keys: dict[str, CredentialKey] = {
"access_token": CredentialKey(
name="access_token",
value=SecretStr(aden_response.access_token),
expires_at=aden_response.expires_at,
),
"_aden_managed": CredentialKey(
name="_aden_managed",
value=SecretStr("true"),
),
"_integration_type": CredentialKey(
name="_integration_type",
value=SecretStr(aden_response.integration_type),
),
}
if aden_response.scopes:
keys["scope"] = CredentialKey(
name="scope",
value=SecretStr(" ".join(aden_response.scopes)),
)
return CredentialObject(
id=aden_response.integration_id,
credential_type=CredentialType.OAUTH2,
keys=keys,
provider_id=self.provider_id,
auto_refresh=True,
)
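Pulled together, a minimal sync loop over the pieces above might look like the following sketch. The wiring is one possible configuration, not fixed by this module; the environment variable names mirror the cached-storage docstring below.

import os

from core.framework.credentials import CredentialStore
from core.framework.credentials.storage import EncryptedFileStorage
from core.framework.credentials.aden import (
    AdenClientConfig,
    AdenCredentialClient,
    AdenSyncProvider,
)

client = AdenCredentialClient(AdenClientConfig(
    base_url=os.environ["ADEN_API_URL"],
    api_key=os.environ["ADEN_API_KEY"],
))
provider = AdenSyncProvider(client=client, refresh_buffer_minutes=5)
store = CredentialStore(
    storage=EncryptedFileStorage(),  # defaults to ~/.hive/credentials
    providers=[provider],
    auto_refresh=True,
)

synced = provider.sync_all(store)  # populate the local store from Aden
print(f"synced {synced} credential(s)")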
@@ -0,0 +1,389 @@
"""
Aden Cached Storage.
Storage backend that combines local cache with Aden server fallback.
Provides offline resilience by caching credentials locally while
keeping them synchronized with the Aden server.
Usage:
from core.framework.credentials import CredentialStore
from core.framework.credentials.storage import EncryptedFileStorage
from core.framework.credentials.aden import (
AdenCredentialClient,
AdenClientConfig,
AdenSyncProvider,
AdenCachedStorage,
)
# Configure
client = AdenCredentialClient(AdenClientConfig(
base_url=os.environ["ADEN_API_URL"],
api_key=os.environ["ADEN_API_KEY"],
))
provider = AdenSyncProvider(client=client)
# Create cached storage
storage = AdenCachedStorage(
local_storage=EncryptedFileStorage(),
aden_provider=provider,
cache_ttl_seconds=300, # Re-check Aden every 5 minutes
)
# Create store
store = CredentialStore(
storage=storage,
providers=[provider],
auto_refresh=True,
)
# Credentials automatically fetched from Aden on first access
# Cached locally for 5 minutes
# Falls back to cache if Aden is unreachable
"""
from __future__ import annotations
import logging
from datetime import UTC, datetime, timedelta
from typing import TYPE_CHECKING
from ..storage import CredentialStorage
if TYPE_CHECKING:
from ..models import CredentialObject
from .provider import AdenSyncProvider
logger = logging.getLogger(__name__)
class AdenCachedStorage(CredentialStorage):
"""
Storage with local cache and Aden server fallback.
This storage provides:
- **Reads**: Try local cache first, fallback to Aden if stale/missing
- **Writes**: Always write to local cache
- **Offline resilience**: Uses cached credentials when Aden is unreachable
- **Provider-based lookup**: Match credentials by provider name (e.g., "hubspot")
when direct ID lookup fails, since Aden uses hash-based IDs internally.
The cache TTL determines how long to trust local credentials before
checking with the Aden server for updates. This balances:
- Performance (fewer network calls)
- Freshness (tokens stay current)
- Resilience (works during brief outages)
Usage:
storage = AdenCachedStorage(
local_storage=EncryptedFileStorage(),
aden_provider=provider,
cache_ttl_seconds=300, # 5 minutes
)
store = CredentialStore(
storage=storage,
providers=[provider],
)
# First access fetches from Aden
# Subsequent accesses use cache until TTL expires
# Can look up by provider name OR credential ID
token = store.get_key("hubspot", "access_token")
"""
def __init__(
self,
local_storage: CredentialStorage,
aden_provider: AdenSyncProvider,
cache_ttl_seconds: int = 300,
prefer_local: bool = True,
):
"""
Initialize Aden-cached storage.
Args:
local_storage: Local storage backend for caching (e.g., EncryptedFileStorage).
aden_provider: Provider for fetching from Aden server.
cache_ttl_seconds: How long to trust local cache before checking Aden.
Default is 300 seconds (5 minutes).
prefer_local: If True, use local cache when available and fresh.
If False, always check Aden first.
"""
self._local = local_storage
self._aden_provider = aden_provider
self._cache_ttl = timedelta(seconds=cache_ttl_seconds)
self._prefer_local = prefer_local
self._cache_timestamps: dict[str, datetime] = {}
# Index: provider name (e.g., "hubspot") -> credential hash ID
self._provider_index: dict[str, str] = {}
def save(self, credential: CredentialObject) -> None:
"""
Save credential to local cache and update provider index.
Args:
credential: The credential to save.
"""
self._local.save(credential)
self._cache_timestamps[credential.id] = datetime.now(UTC)
self._index_provider(credential)
logger.debug(f"Cached credential '{credential.id}'")
def load(self, credential_id: str) -> CredentialObject | None:
"""
Load credential from cache, with Aden fallback and provider-based lookup.
The loading strategy depends on the `prefer_local` setting:
If prefer_local=True (default):
1. Check if local cache exists and is fresh (within TTL)
2. If fresh, return cached credential
3. If stale or missing, fetch from Aden
4. Update local cache with Aden response
5. If Aden fails, fall back to stale cache
If prefer_local=False:
1. Always try to fetch from Aden first
2. Update local cache with response
3. Fall back to local cache only if Aden fails
Provider-based lookup:
When a provider index mapping exists for the credential_id (e.g.,
"hubspot" hash ID), the Aden-synced credential is loaded first.
This ensures fresh OAuth tokens from Aden take priority over stale
local credentials (env vars, old encrypted files).
Args:
credential_id: The credential identifier or provider name.
Returns:
CredentialObject if found, None otherwise.
"""
# Check provider index first — Aden-synced credentials take priority
resolved_id = self._provider_index.get(credential_id)
if resolved_id and resolved_id != credential_id:
result = self._load_by_id(resolved_id)
if result is not None:
logger.info(
f"Loaded credential '{credential_id}' via provider index (id='{resolved_id}')"
)
return result
# Direct lookup (exact credential_id match)
return self._load_by_id(credential_id)
def _load_by_id(self, credential_id: str) -> CredentialObject | None:
"""
Load credential by exact ID from cache, with Aden fallback.
Args:
credential_id: The exact credential identifier.
Returns:
CredentialObject if found, None otherwise.
"""
local_cred = self._local.load(credential_id)
# If we prefer local and have a fresh cache, use it
if self._prefer_local and local_cred and self._is_cache_fresh(credential_id):
logger.debug(f"Using cached credential '{credential_id}'")
return local_cred
# Try to fetch from Aden
try:
aden_cred = self._aden_provider.fetch_from_aden(credential_id)
if aden_cred:
# Update local cache
self.save(aden_cred)
logger.debug(f"Fetched credential '{credential_id}' from Aden")
return aden_cred
except Exception as e:
logger.warning(f"Failed to fetch '{credential_id}' from Aden: {e}")
# Fall back to local cache if Aden failed or returned nothing
if local_cred:
    logger.info(f"Using stale cached credential '{credential_id}'")
    return local_cred
# Not found locally or on the Aden server
return None
def delete(self, credential_id: str) -> bool:
"""
Delete credential from local cache.
Note: This does NOT delete the credential from the Aden server.
It only removes the local cache entry.
Args:
credential_id: The credential identifier.
Returns:
True if credential existed and was deleted.
"""
self._cache_timestamps.pop(credential_id, None)
return self._local.delete(credential_id)
def list_all(self) -> list[str]:
"""
List credentials from local cache.
Returns:
List of credential IDs in local cache.
"""
return self._local.list_all()
def exists(self, credential_id: str) -> bool:
"""
Check if credential exists in local cache (by ID or provider name).
Args:
credential_id: The credential identifier or provider name.
Returns:
True if credential exists locally.
"""
if self._local.exists(credential_id):
return True
# Check provider index
resolved_id = self._provider_index.get(credential_id)
if resolved_id and resolved_id != credential_id:
return self._local.exists(resolved_id)
return False
def _is_cache_fresh(self, credential_id: str) -> bool:
"""
Check if local cache is still fresh (within TTL).
Args:
credential_id: The credential identifier.
Returns:
True if cache is fresh, False if stale or not cached.
"""
cached_at = self._cache_timestamps.get(credential_id)
if cached_at is None:
return False
return datetime.now(UTC) - cached_at < self._cache_ttl
def invalidate_cache(self, credential_id: str) -> None:
"""
Invalidate cache for a specific credential.
The next load() call will fetch from Aden regardless of TTL.
Args:
credential_id: The credential identifier.
"""
self._cache_timestamps.pop(credential_id, None)
logger.debug(f"Invalidated cache for '{credential_id}'")
def invalidate_all(self) -> None:
"""Invalidate all cache entries."""
self._cache_timestamps.clear()
logger.debug("Invalidated all cache entries")
def _index_provider(self, credential: CredentialObject) -> None:
"""
Index a credential by its provider/integration type.
Aden credentials carry an ``_integration_type`` key whose value is
the provider name (e.g., ``hubspot``). This method maps that
provider name to the credential's hash ID so that subsequent
``load("hubspot")`` calls resolve to the correct credential.
Args:
credential: The credential to index.
"""
integration_type_key = credential.keys.get("_integration_type")
if integration_type_key is None:
return
provider_name = integration_type_key.value.get_secret_value()
if provider_name:
self._provider_index[provider_name] = credential.id
logger.debug(f"Indexed provider '{provider_name}' -> '{credential.id}'")
def rebuild_provider_index(self) -> int:
"""
Rebuild the provider index from all locally cached credentials.
Useful after loading from disk when the in-memory index is empty.
Returns:
Number of provider mappings indexed.
"""
self._provider_index.clear()
indexed = 0
for cred_id in self._local.list_all():
cred = self._local.load(cred_id)
if cred:
before = len(self._provider_index)
self._index_provider(cred)
if len(self._provider_index) > before:
indexed += 1
logger.debug(f"Rebuilt provider index with {indexed} mappings")
return indexed
def sync_all_from_aden(self) -> int:
"""
Sync all credentials from Aden server to local cache.
Fetches the list of available integrations from Aden and
updates the local cache with current tokens.
Returns:
Number of credentials synced.
"""
synced = 0
try:
integrations = self._aden_provider._client.list_integrations()
for info in integrations:
if info.status != "active":
logger.warning(
f"Skipping integration '{info.integration_id}': status={info.status}"
)
continue
try:
cred = self._aden_provider.fetch_from_aden(info.integration_id)
if cred:
self.save(cred)
synced += 1
logger.info(f"Synced credential '{info.integration_id}' from Aden")
except Exception as e:
logger.warning(f"Failed to sync '{info.integration_id}': {e}")
except Exception as e:
logger.error(f"Failed to list integrations from Aden: {e}")
return synced
def get_cache_info(self) -> dict[str, dict]:
"""
Get cache status information for all credentials.
Returns:
Dict mapping credential_id to cache info (cached_at, is_fresh, ttl_remaining).
"""
now = datetime.now(UTC)
info = {}
for cred_id in self.list_all():
cached_at = self._cache_timestamps.get(cred_id)
if cached_at:
ttl_remaining = (cached_at + self._cache_ttl - now).total_seconds()
info[cred_id] = {
"cached_at": cached_at.isoformat(),
"is_fresh": ttl_remaining > 0,
"ttl_remaining_seconds": max(0, ttl_remaining),
}
else:
info[cred_id] = {
"cached_at": None,
"is_fresh": False,
"ttl_remaining_seconds": 0,
}
return info
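A hedged sketch of the cache-maintenance surface, assuming `storage` is the AdenCachedStorage instance configured in the module docstring above:

# Force the next load() to bypass the TTL and hit Aden.
storage.invalidate_cache("hubspot")
cred = storage.load("hubspot")

# After a restart the in-memory provider index is empty; rebuild it from
# the local cache before resolving provider names like "hubspot".
storage.rebuild_provider_index()

# Inspect freshness per credential.
for cred_id, info in storage.get_cache_info().items():
    print(cred_id, info["is_fresh"], info["ttl_remaining_seconds"])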
@@ -0,0 +1 @@
"""Tests for Aden credential sync components."""
@@ -0,0 +1,813 @@
"""
Tests for Aden credential sync components.
Tests cover:
- AdenCredentialClient: HTTP client for Aden API
- AdenSyncProvider: Provider that syncs with Aden
- AdenCachedStorage: Storage with local cache + Aden fallback
"""
from datetime import UTC, datetime, timedelta
from unittest.mock import Mock
import pytest
from pydantic import SecretStr
from framework.credentials import (
CredentialKey,
CredentialObject,
CredentialStore,
CredentialType,
InMemoryStorage,
)
from framework.credentials.aden import (
AdenCachedStorage,
AdenClientConfig,
AdenClientError,
AdenCredentialClient,
AdenCredentialResponse,
AdenIntegrationInfo,
AdenRefreshError,
AdenSyncProvider,
)
# =============================================================================
# Fixtures
# =============================================================================
@pytest.fixture
def aden_config():
"""Create a test Aden client config."""
return AdenClientConfig(
base_url="https://api.test-aden.com",
api_key="test-api-key",
tenant_id="test-tenant",
timeout=5.0,
retry_attempts=2,
retry_delay=0.1,
)
@pytest.fixture
def mock_client(aden_config):
"""Create a mock Aden client."""
client = Mock(spec=AdenCredentialClient)
client.config = aden_config
return client
@pytest.fixture
def aden_response():
"""Create a sample Aden credential response."""
return AdenCredentialResponse(
integration_id="hubspot",
integration_type="hubspot",
access_token="test-access-token",
token_type="Bearer",
expires_at=datetime.now(UTC) + timedelta(hours=1),
scopes=["crm.objects.contacts.read", "crm.objects.contacts.write"],
metadata={"portal_id": "12345"},
)
@pytest.fixture
def provider(mock_client):
"""Create an AdenSyncProvider with mock client."""
return AdenSyncProvider(
client=mock_client,
provider_id="test_aden",
refresh_buffer_minutes=5,
report_usage=False,
)
@pytest.fixture
def local_storage():
"""Create an in-memory storage for testing."""
return InMemoryStorage()
@pytest.fixture
def cached_storage(local_storage, provider):
"""Create an AdenCachedStorage for testing."""
return AdenCachedStorage(
local_storage=local_storage,
aden_provider=provider,
cache_ttl_seconds=60,
prefer_local=True,
)
# =============================================================================
# AdenCredentialResponse Tests
# =============================================================================
class TestAdenCredentialResponse:
"""Tests for AdenCredentialResponse dataclass."""
def test_from_dict_basic(self):
"""Test creating response from dict."""
data = {
"integration_id": "github",
"integration_type": "github",
"access_token": "ghp_xxxxx",
}
response = AdenCredentialResponse.from_dict(data)
assert response.integration_id == "github"
assert response.integration_type == "github"
assert response.access_token == "ghp_xxxxx"
assert response.token_type == "Bearer"
assert response.expires_at is None
assert response.scopes == []
def test_from_dict_full(self):
"""Test creating response with all fields."""
data = {
"integration_id": "hubspot",
"integration_type": "hubspot",
"access_token": "token123",
"token_type": "Bearer",
"expires_at": "2026-01-28T15:30:00Z",
"scopes": ["read", "write"],
"metadata": {"key": "value"},
}
response = AdenCredentialResponse.from_dict(data)
assert response.integration_id == "hubspot"
assert response.access_token == "token123"
assert response.expires_at is not None
assert response.scopes == ["read", "write"]
assert response.metadata == {"key": "value"}
class TestAdenIntegrationInfo:
"""Tests for AdenIntegrationInfo dataclass."""
def test_from_dict(self):
"""Test creating integration info from dict."""
data = {
"integration_id": "slack",
"integration_type": "slack",
"status": "active",
"expires_at": "2026-02-01T00:00:00Z",
}
info = AdenIntegrationInfo.from_dict(data)
assert info.integration_id == "slack"
assert info.integration_type == "slack"
assert info.status == "active"
assert info.expires_at is not None
# =============================================================================
# AdenSyncProvider Tests
# =============================================================================
class TestAdenSyncProvider:
"""Tests for AdenSyncProvider."""
def test_provider_id(self, provider):
"""Test provider ID."""
assert provider.provider_id == "test_aden"
def test_supported_types(self, provider):
"""Test supported credential types."""
assert CredentialType.OAUTH2 in provider.supported_types
assert CredentialType.BEARER_TOKEN in provider.supported_types
def test_can_handle_oauth2(self, provider):
"""Test can_handle returns True for OAUTH2 credentials with matching provider_id."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={},
provider_id="test_aden",
)
assert provider.can_handle(cred) is True
def test_can_handle_aden_managed(self, provider):
"""Test can_handle returns True for Aden-managed credentials."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"_aden_managed": CredentialKey(
name="_aden_managed",
value=SecretStr("true"),
)
},
)
assert provider.can_handle(cred) is True
def test_can_handle_wrong_type(self, provider):
"""Test can_handle returns False for unsupported types."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.API_KEY,
keys={},
)
assert provider.can_handle(cred) is False
def test_refresh_success(self, provider, mock_client, aden_response):
"""Test successful credential refresh."""
mock_client.request_refresh.return_value = aden_response
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("old-token"),
)
},
provider_id="test_aden",
)
refreshed = provider.refresh(cred)
assert refreshed.keys["access_token"].value.get_secret_value() == "test-access-token"
assert refreshed.keys["_aden_managed"].value.get_secret_value() == "true"
assert refreshed.last_refreshed is not None
mock_client.request_refresh.assert_called_once_with("hubspot")
def test_refresh_requires_reauth(self, provider, mock_client):
"""Test refresh that requires re-authorization."""
mock_client.request_refresh.side_effect = AdenRefreshError(
"Token revoked",
requires_reauthorization=True,
reauthorization_url="https://aden.com/reauth",
)
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={},
)
from framework.credentials import CredentialRefreshError
with pytest.raises(CredentialRefreshError) as exc_info:
provider.refresh(cred)
assert "re-authorization" in str(exc_info.value).lower()
def test_refresh_aden_unavailable_cached_valid(self, provider, mock_client):
"""Test refresh falls back to cache when Aden is unavailable and token is valid."""
mock_client.request_refresh.side_effect = AdenClientError("Connection failed")
# Token expires in 1 hour - still valid
future = datetime.now(UTC) + timedelta(hours=1)
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("cached-token"),
expires_at=future,
)
},
)
# Should return the cached credential instead of failing
result = provider.refresh(cred)
assert result.keys["access_token"].value.get_secret_value() == "cached-token"
def test_should_refresh_expired(self, provider):
"""Test should_refresh returns True for expired token."""
past = datetime.now(UTC) - timedelta(hours=1)
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token"),
expires_at=past,
)
},
)
assert provider.should_refresh(cred) is True
def test_should_refresh_within_buffer(self, provider):
"""Test should_refresh returns True when within buffer."""
# Expires in 3 minutes (buffer is 5 minutes)
soon = datetime.now(UTC) + timedelta(minutes=3)
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token"),
expires_at=soon,
)
},
)
assert provider.should_refresh(cred) is True
def test_should_refresh_still_valid(self, provider):
"""Test should_refresh returns False for valid token."""
future = datetime.now(UTC) + timedelta(hours=1)
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token"),
expires_at=future,
)
},
)
assert provider.should_refresh(cred) is False
def test_fetch_from_aden(self, provider, mock_client, aden_response):
"""Test fetching credential from Aden."""
mock_client.get_credential.return_value = aden_response
cred = provider.fetch_from_aden("hubspot")
assert cred is not None
assert cred.id == "hubspot"
assert cred.keys["access_token"].value.get_secret_value() == "test-access-token"
assert cred.auto_refresh is True
def test_fetch_from_aden_not_found(self, provider, mock_client):
"""Test fetch returns None when not found."""
mock_client.get_credential.return_value = None
cred = provider.fetch_from_aden("nonexistent")
assert cred is None
def test_sync_all(self, provider, mock_client, aden_response):
"""Test syncing all credentials."""
mock_client.list_integrations.return_value = [
AdenIntegrationInfo(
integration_id="hubspot",
integration_type="hubspot",
status="active",
),
AdenIntegrationInfo(
integration_id="github",
integration_type="github",
status="requires_reauth", # Should be skipped
),
]
mock_client.get_credential.return_value = aden_response
store = CredentialStore(storage=InMemoryStorage())
synced = provider.sync_all(store)
assert synced == 1 # Only active one was synced
assert store.get_credential("hubspot") is not None
def test_validate_via_aden(self, provider, mock_client):
"""Test validation via Aden introspection."""
mock_client.validate_token.return_value = {"valid": True}
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={},
)
assert provider.validate(cred) is True
def test_validate_fallback_to_local(self, provider, mock_client):
"""Test validation falls back to local check when Aden fails."""
mock_client.validate_token.side_effect = AdenClientError("Failed")
future = datetime.now(UTC) + timedelta(hours=1)
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token"),
expires_at=future,
)
},
)
assert provider.validate(cred) is True
# =============================================================================
# AdenCachedStorage Tests
# =============================================================================
class TestAdenCachedStorage:
"""Tests for AdenCachedStorage."""
def test_save_updates_cache_timestamp(self, cached_storage):
"""Test save updates cache timestamp."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token"),
)
},
)
cached_storage.save(cred)
assert "test" in cached_storage._cache_timestamps
assert cached_storage.exists("test")
def test_load_from_fresh_cache(self, cached_storage, local_storage):
"""Test load returns cached credential when fresh."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("cached-token"),
)
},
)
# Save to both local storage and update timestamp
local_storage.save(cred)
cached_storage._cache_timestamps["test"] = datetime.now(UTC)
loaded = cached_storage.load("test")
assert loaded is not None
assert loaded.keys["access_token"].value.get_secret_value() == "cached-token"
def test_load_from_aden_when_stale(
self, cached_storage, local_storage, provider, mock_client, aden_response
):
"""Test load fetches from Aden when cache is stale."""
# Create stale cached credential
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("stale-token"),
)
},
)
local_storage.save(cred)
# Set cache timestamp to be stale (2 minutes ago, TTL is 60 seconds)
cached_storage._cache_timestamps["hubspot"] = datetime.now(UTC) - timedelta(minutes=2)
# Mock Aden response
mock_client.get_credential.return_value = aden_response
loaded = cached_storage.load("hubspot")
assert loaded is not None
assert loaded.keys["access_token"].value.get_secret_value() == "test-access-token"
def test_load_falls_back_to_stale_when_aden_fails(
self, cached_storage, local_storage, provider, mock_client
):
"""Test load falls back to stale cache when Aden fails."""
# Create stale cached credential
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("stale-token"),
)
},
)
local_storage.save(cred)
cached_storage._cache_timestamps["hubspot"] = datetime.now(UTC) - timedelta(minutes=2)
# Aden fails
mock_client.get_credential.side_effect = AdenClientError("Connection failed")
loaded = cached_storage.load("hubspot")
assert loaded is not None
assert loaded.keys["access_token"].value.get_secret_value() == "stale-token"
def test_delete_removes_cache_timestamp(self, cached_storage, local_storage):
"""Test delete removes cache timestamp."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={},
)
cached_storage.save(cred)
assert "test" in cached_storage._cache_timestamps
cached_storage.delete("test")
assert "test" not in cached_storage._cache_timestamps
assert not cached_storage.exists("test")
def test_invalidate_cache(self, cached_storage, local_storage):
"""Test invalidate_cache removes timestamp."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={},
)
cached_storage.save(cred)
cached_storage.invalidate_cache("test")
assert "test" not in cached_storage._cache_timestamps
# Credential still exists in local storage
assert local_storage.exists("test")
def test_invalidate_all(self, cached_storage):
"""Test invalidate_all clears all timestamps."""
for i in range(3):
cached_storage._cache_timestamps[f"test_{i}"] = datetime.now(UTC)
cached_storage.invalidate_all()
assert len(cached_storage._cache_timestamps) == 0
def test_is_cache_fresh(self, cached_storage):
"""Test _is_cache_fresh logic."""
# Fresh cache
cached_storage._cache_timestamps["fresh"] = datetime.now(UTC)
assert cached_storage._is_cache_fresh("fresh") is True
# Stale cache
cached_storage._cache_timestamps["stale"] = datetime.now(UTC) - timedelta(minutes=5)
assert cached_storage._is_cache_fresh("stale") is False
# No cache
assert cached_storage._is_cache_fresh("nonexistent") is False
def test_get_cache_info(self, cached_storage, local_storage):
"""Test get_cache_info returns status for all credentials."""
# Add some credentials
for name in ["fresh", "stale"]:
cred = CredentialObject(
id=name,
credential_type=CredentialType.OAUTH2,
keys={},
)
local_storage.save(cred)
cached_storage._cache_timestamps["fresh"] = datetime.now(UTC)
cached_storage._cache_timestamps["stale"] = datetime.now(UTC) - timedelta(minutes=5)
info = cached_storage.get_cache_info()
assert "fresh" in info
assert info["fresh"]["is_fresh"] is True
assert info["fresh"]["ttl_remaining_seconds"] > 0
assert "stale" in info
assert info["stale"]["is_fresh"] is False
assert info["stale"]["ttl_remaining_seconds"] == 0
def test_save_indexes_provider(self, cached_storage):
"""Test save builds the provider index from _integration_type key."""
cred = CredentialObject(
id="aHVic3BvdDp0ZXN0OjEzNjExOjExNTI1",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token-value"),
),
"_integration_type": CredentialKey(
name="_integration_type",
value=SecretStr("hubspot"),
),
},
)
cached_storage.save(cred)
assert cached_storage._provider_index["hubspot"] == "aHVic3BvdDp0ZXN0OjEzNjExOjExNTI1"
def test_load_by_provider_name(self, cached_storage):
"""Test load resolves provider name to hash-based credential ID."""
hash_id = "aHVic3BvdDp0ZXN0OjEzNjExOjExNTI1"
cred = CredentialObject(
id=hash_id,
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("hubspot-token"),
),
"_integration_type": CredentialKey(
name="_integration_type",
value=SecretStr("hubspot"),
),
},
)
# Save builds the index
cached_storage.save(cred)
# Load by provider name should resolve to the hash ID
loaded = cached_storage.load("hubspot")
assert loaded is not None
assert loaded.id == hash_id
assert loaded.keys["access_token"].value.get_secret_value() == "hubspot-token"
def test_load_by_direct_id_still_works(self, cached_storage):
"""Test load by direct hash ID still works as before."""
hash_id = "aHVic3BvdDp0ZXN0OjEzNjExOjExNTI1"
cred = CredentialObject(
id=hash_id,
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token"),
),
"_integration_type": CredentialKey(
name="_integration_type",
value=SecretStr("hubspot"),
),
},
)
cached_storage.save(cred)
# Direct ID lookup should still work
loaded = cached_storage.load(hash_id)
assert loaded is not None
assert loaded.id == hash_id
def test_exists_by_provider_name(self, cached_storage):
"""Test exists resolves provider name to hash-based credential ID."""
hash_id = "c2xhY2s6dGVzdDo5OTk="
cred = CredentialObject(
id=hash_id,
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("slack-token"),
),
"_integration_type": CredentialKey(
name="_integration_type",
value=SecretStr("slack"),
),
},
)
cached_storage.save(cred)
assert cached_storage.exists("slack") is True
assert cached_storage.exists(hash_id) is True
assert cached_storage.exists("nonexistent") is False
def test_rebuild_provider_index(self, cached_storage, local_storage):
"""Test rebuild_provider_index reconstructs from local storage."""
# Manually save credentials to local storage (bypassing cached_storage.save)
for provider_name, hash_id in [("hubspot", "hash_hub"), ("slack", "hash_slack")]:
cred = CredentialObject(
id=hash_id,
credential_type=CredentialType.OAUTH2,
keys={
"_integration_type": CredentialKey(
name="_integration_type",
value=SecretStr(provider_name),
),
},
)
local_storage.save(cred)
# Index should be empty (we bypassed save)
assert len(cached_storage._provider_index) == 0
# Rebuild
indexed = cached_storage.rebuild_provider_index()
assert indexed == 2
assert cached_storage._provider_index["hubspot"] == "hash_hub"
assert cached_storage._provider_index["slack"] == "hash_slack"
def test_save_without_integration_type_no_index(self, cached_storage):
"""Test save does not index credentials without _integration_type key."""
cred = CredentialObject(
id="plain-cred",
credential_type=CredentialType.API_KEY,
keys={
"api_key": CredentialKey(
name="api_key",
value=SecretStr("key-value"),
),
},
)
cached_storage.save(cred)
assert "plain-cred" not in cached_storage._provider_index
assert len(cached_storage._provider_index) == 0
# =============================================================================
# Integration Tests
# =============================================================================
class TestAdenIntegration:
"""Integration tests for Aden sync components."""
def test_full_workflow(self, mock_client, aden_response):
"""Test full workflow: sync, get, refresh."""
# Setup
mock_client.list_integrations.return_value = [
AdenIntegrationInfo(
integration_id="hubspot",
integration_type="hubspot",
status="active",
),
]
mock_client.get_credential.return_value = aden_response
mock_client.request_refresh.return_value = AdenCredentialResponse(
integration_id="hubspot",
integration_type="hubspot",
access_token="refreshed-token",
expires_at=datetime.now(UTC) + timedelta(hours=2),
scopes=[],
)
provider = AdenSyncProvider(client=mock_client)
storage = InMemoryStorage()
store = CredentialStore(
storage=storage,
providers=[provider],
auto_refresh=True,
)
# Initial sync
synced = provider.sync_all(store)
assert synced == 1
# Get credential
cred = store.get_credential("hubspot")
assert cred is not None
assert cred.keys["access_token"].value.get_secret_value() == "test-access-token"
# Simulate expiration
cred.keys["access_token"] = CredentialKey(
name="access_token",
value=SecretStr("test-access-token"),
expires_at=datetime.now(UTC) - timedelta(hours=1), # Expired
)
storage.save(cred)
# Refresh should be triggered
refreshed = provider.refresh(cred)
assert refreshed.keys["access_token"].value.get_secret_value() == "refreshed-token"
def test_cached_storage_with_store(self, mock_client, aden_response):
"""Test AdenCachedStorage with CredentialStore."""
mock_client.get_credential.return_value = aden_response
provider = AdenSyncProvider(client=mock_client)
local_storage = InMemoryStorage()
cached_storage = AdenCachedStorage(
local_storage=local_storage,
aden_provider=provider,
cache_ttl_seconds=300,
)
# First load fetches from Aden
cred = cached_storage.load("hubspot")
assert cred is not None
mock_client.get_credential.assert_called_once()
# Second load uses cache
mock_client.get_credential.reset_mock()
cred2 = cached_storage.load("hubspot")
assert cred2 is not None
mock_client.get_credential.assert_not_called()
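The suites above rely only on `unittest.mock` fixtures and `InMemoryStorage`, so they run offline; with pytest installed, an invocation along these lines works (the test file path is an assumption, not fixed by this diff):

pytest tests/credentials/test_aden.py -q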
@@ -0,0 +1,293 @@
"""
Core data models for the credential store.
This module defines the key-vault structure where credentials are objects
containing one or more keys (e.g., api_key, access_token, refresh_token).
"""
from __future__ import annotations
from datetime import UTC, datetime
from enum import StrEnum
from typing import Any
from pydantic import BaseModel, Field, SecretStr
def _utc_now() -> datetime:
"""Get current UTC time as timezone-aware datetime."""
return datetime.now(UTC)
class CredentialType(StrEnum):
"""Types of credentials the store can manage."""
API_KEY = "api_key"
"""Simple API key (e.g., Brave Search, OpenAI)"""
OAUTH2 = "oauth2"
"""OAuth2 with refresh token support"""
BASIC_AUTH = "basic_auth"
"""Username/password pair"""
BEARER_TOKEN = "bearer_token"
"""JWT or bearer token without refresh"""
CUSTOM = "custom"
"""User-defined credential type"""
class CredentialKey(BaseModel):
"""
A single key within a credential object.
Example: 'api_key' within a 'brave_search' credential
Attributes:
name: Key name (e.g., 'api_key', 'access_token')
value: Secret value (SecretStr prevents accidental logging)
expires_at: Optional expiration time
metadata: Additional key-specific metadata
"""
name: str
value: SecretStr
expires_at: datetime | None = None
metadata: dict[str, Any] = Field(default_factory=dict)
model_config = {"extra": "allow"}
@property
def is_expired(self) -> bool:
"""Check if this key has expired."""
if self.expires_at is None:
return False
return datetime.now(UTC) >= self.expires_at
def get_secret_value(self) -> str:
"""Get the actual secret value (use sparingly)."""
return self.value.get_secret_value()
class CredentialObject(BaseModel):
"""
A credential object containing one or more keys.
This is the key-vault structure where each credential can have
multiple keys (e.g., access_token, refresh_token, expires_at).
Example:
CredentialObject(
id="github_oauth",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(name="access_token", value=SecretStr("ghp_xxx")),
"refresh_token": CredentialKey(name="refresh_token", value=SecretStr("ghr_xxx")),
},
provider_id="oauth2"
)
Attributes:
id: Unique identifier (e.g., 'brave_search', 'github_oauth')
credential_type: Type of credential (API_KEY, OAUTH2, etc.)
keys: Dictionary of key name to CredentialKey
provider_id: ID of provider responsible for lifecycle management
auto_refresh: Whether to automatically refresh when expired
"""
id: str = Field(description="Unique identifier (e.g., 'brave_search', 'github_oauth')")
credential_type: CredentialType = CredentialType.API_KEY
keys: dict[str, CredentialKey] = Field(default_factory=dict)
# Lifecycle management
provider_id: str | None = Field(
default=None,
description="ID of provider responsible for lifecycle (e.g., 'oauth2', 'static')",
)
last_refreshed: datetime | None = None
auto_refresh: bool = False
# Usage tracking
last_used: datetime | None = None
use_count: int = 0
# Metadata
description: str = ""
tags: list[str] = Field(default_factory=list)
created_at: datetime = Field(default_factory=_utc_now)
updated_at: datetime = Field(default_factory=_utc_now)
model_config = {"extra": "allow"}
def get_key(self, key_name: str) -> str | None:
"""
Get a specific key's value.
Args:
key_name: Name of the key to retrieve
Returns:
The key's secret value, or None if not found
"""
key = self.keys.get(key_name)
if key is None:
return None
return key.get_secret_value()
def set_key(
self,
key_name: str,
value: str,
expires_at: datetime | None = None,
metadata: dict[str, Any] | None = None,
) -> None:
"""
Set or update a key.
Args:
key_name: Name of the key
value: Secret value
expires_at: Optional expiration time
metadata: Optional key-specific metadata
"""
self.keys[key_name] = CredentialKey(
name=key_name,
value=SecretStr(value),
expires_at=expires_at,
metadata=metadata or {},
)
self.updated_at = datetime.now(UTC)
def has_key(self, key_name: str) -> bool:
"""Check if a key exists."""
return key_name in self.keys
@property
def needs_refresh(self) -> bool:
"""Check if any key is expired or near expiration."""
for key in self.keys.values():
if key.is_expired:
return True
return False
@property
def is_valid(self) -> bool:
"""Check if credential has at least one non-expired key."""
if not self.keys:
return False
return not all(key.is_expired for key in self.keys.values())
def record_usage(self) -> None:
"""Record that this credential was used."""
self.last_used = datetime.now(UTC)
self.use_count += 1
def get_default_key(self) -> str | None:
"""
Get the default key value.
Priority: 'value' > 'api_key' > 'access_token' > first key
Returns:
The default key's value, or None if no keys exist
"""
for key_name in ["value", "api_key", "access_token"]:
if key_name in self.keys:
return self.get_key(key_name)
if self.keys:
first_key = next(iter(self.keys))
return self.get_key(first_key)
return None
class CredentialUsageSpec(BaseModel):
"""
Specification for how a tool uses credentials.
This implements the "bipartisan" model where the credential store
just stores values, and tools define how those values are used
in HTTP requests (headers, query params, body).
Example:
CredentialUsageSpec(
credential_id="brave_search",
required_keys=["api_key"],
headers={"X-Subscription-Token": "{{api_key}}"}
)
CredentialUsageSpec(
credential_id="github_oauth",
required_keys=["access_token"],
headers={"Authorization": "Bearer {{access_token}}"}
)
Attributes:
credential_id: ID of credential to use
required_keys: Keys that must be present
headers: Header templates with {{key}} placeholders
query_params: Query parameter templates
body_fields: Request body field templates
"""
credential_id: str = Field(description="ID of credential to use (e.g., 'brave_search')")
required_keys: list[str] = Field(default_factory=list, description="Keys that must be present")
# Injection templates (bipartisan model)
headers: dict[str, str] = Field(
default_factory=dict,
description="Header templates (e.g., {'Authorization': 'Bearer {{access_token}}'})",
)
query_params: dict[str, str] = Field(
default_factory=dict,
description="Query param templates (e.g., {'api_key': '{{api_key}}'})",
)
body_fields: dict[str, str] = Field(
default_factory=dict,
description="Request body field templates",
)
# Metadata
required: bool = True
description: str = ""
help_url: str = ""
model_config = {"extra": "allow"}
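# NOTE: the store never interprets these templates itself; a tool-side
# renderer is expected to substitute the {{key}} placeholders. A minimal
# hypothetical sketch of such a renderer (illustration only, not part of
# this module):
def render_usage_spec(
    spec: CredentialUsageSpec, cred: CredentialObject
) -> dict[str, dict[str, str]]:
    """Hypothetical sketch: fill {{key}} placeholders from a credential's keys."""
    import re

    missing = [k for k in spec.required_keys if not cred.has_key(k)]
    if missing:
        raise CredentialKeyNotFoundError(f"missing keys: {missing}")

    def fill(template: str) -> str:
        return re.sub(
            r"\{\{(\w+)\}\}",
            lambda m: cred.get_key(m.group(1)) or "",
            template,
        )

    return {
        "headers": {k: fill(v) for k, v in spec.headers.items()},
        "params": {k: fill(v) for k, v in spec.query_params.items()},
        "body": {k: fill(v) for k, v in spec.body_fields.items()},
    }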
class CredentialError(Exception):
"""Base exception for credential-related errors."""
pass
class CredentialNotFoundError(CredentialError):
"""Raised when a referenced credential doesn't exist."""
pass
class CredentialKeyNotFoundError(CredentialError):
"""Raised when a referenced key doesn't exist in a credential."""
pass
class CredentialRefreshError(CredentialError):
"""Raised when credential refresh fails."""
pass
class CredentialValidationError(CredentialError):
"""Raised when credential validation fails."""
pass
class CredentialDecryptionError(CredentialError):
"""Raised when credential decryption fails."""
pass
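A compact sketch of the key-vault model in use (identifiers are illustrative):

cred = CredentialObject(id="brave_search", credential_type=CredentialType.API_KEY)
cred.set_key("api_key", "BSA-xxxx")          # wrapped in SecretStr internally
assert cred.get_default_key() == "BSA-xxxx"  # 'api_key' wins the default priority
assert cred.is_valid and not cred.needs_refresh
cred.record_usage()                          # bumps last_used and use_count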
@@ -0,0 +1,92 @@
"""
OAuth2 support for the credential store.
This module provides OAuth2 credential management with:
- Token types and configuration (OAuth2Token, OAuth2Config)
- Generic OAuth2 provider (BaseOAuth2Provider)
- Token lifecycle management (TokenLifecycleManager)
Quick Start:
from core.framework.credentials import CredentialStore
from core.framework.credentials.oauth2 import BaseOAuth2Provider, OAuth2Config
# Configure OAuth2 provider
provider = BaseOAuth2Provider(OAuth2Config(
token_url="https://oauth2.example.com/token",
client_id="your-client-id",
client_secret="your-client-secret",
default_scopes=["read", "write"],
))
# Create store with OAuth2 provider
store = CredentialStore.with_encrypted_storage(
providers=[provider] # defaults to ~/.hive/credentials
)
# Get token using client credentials
token = provider.client_credentials_grant()
# Save to store
from core.framework.credentials import CredentialObject, CredentialKey, CredentialType
from pydantic import SecretStr
keys = {
    "access_token": CredentialKey(
        name="access_token",
        value=SecretStr(token.access_token),
        expires_at=token.expires_at,
    ),
}
if token.refresh_token:
    keys["refresh_token"] = CredentialKey(
        name="refresh_token",
        value=SecretStr(token.refresh_token),
    )
store.save_credential(CredentialObject(
    id="my_api",
    credential_type=CredentialType.OAUTH2,
    keys=keys,
    provider_id="oauth2",
    auto_refresh=True,
))
For advanced lifecycle management:
from core.framework.credentials.oauth2 import TokenLifecycleManager
manager = TokenLifecycleManager(
provider=provider,
credential_id="my_api",
store=store,
)
# Get valid token (auto-refreshes if needed)
token = manager.sync_get_valid_token()
headers = manager.get_request_headers()
"""
from .base_provider import BaseOAuth2Provider
from .hubspot_provider import HubSpotOAuth2Provider
from .lifecycle import TokenLifecycleManager, TokenRefreshResult
from .provider import (
OAuth2Config,
OAuth2Error,
OAuth2Token,
RefreshTokenInvalidError,
TokenExpiredError,
TokenPlacement,
)
__all__ = [
# Types
"OAuth2Token",
"OAuth2Config",
"TokenPlacement",
# Providers
"BaseOAuth2Provider",
"HubSpotOAuth2Provider",
# Lifecycle
"TokenLifecycleManager",
"TokenRefreshResult",
# Errors
"OAuth2Error",
"TokenExpiredError",
"RefreshTokenInvalidError",
]
@@ -0,0 +1,486 @@
"""
Base OAuth2 provider implementation.
This module provides a generic OAuth2 provider that works with standard
OAuth2 servers. OSS users can extend this class for custom providers.
"""
from __future__ import annotations
import logging
from datetime import UTC, datetime, timedelta
from typing import Any
from urllib.parse import urlencode
from ..models import CredentialObject, CredentialRefreshError, CredentialType
from ..provider import CredentialProvider
from .provider import (
OAuth2Config,
OAuth2Error,
OAuth2Token,
TokenPlacement,
)
logger = logging.getLogger(__name__)
class BaseOAuth2Provider(CredentialProvider):
"""
Generic OAuth2 provider implementation.
Works with standard OAuth2 servers (RFC 6749). Override methods for
provider-specific behavior.
Supported grant types:
- Client Credentials: For server-to-server authentication
- Refresh Token: For refreshing expired access tokens
- Authorization Code: For user-authorized access (requires callback handling)
OSS users can extend this class for custom providers:
class GitHubOAuth2Provider(BaseOAuth2Provider):
def __init__(self, client_id: str, client_secret: str):
super().__init__(OAuth2Config(
token_url="https://github.com/login/oauth/access_token",
authorization_url="https://github.com/login/oauth/authorize",
client_id=client_id,
client_secret=client_secret,
default_scopes=["repo", "user"],
))
def exchange_code(self, code: str, redirect_uri: str, **kwargs) -> OAuth2Token:
# GitHub returns data as form-encoded by default
# Override to handle this
...
Example usage:
provider = BaseOAuth2Provider(OAuth2Config(
token_url="https://oauth2.example.com/token",
client_id="my-client-id",
client_secret="my-client-secret",
))
# Get token using client credentials
token = provider.client_credentials_grant()
# Refresh an expired token
new_token = provider.refresh_token(old_token.refresh_token)
"""
def __init__(self, config: OAuth2Config, provider_id: str = "oauth2"):
"""
Initialize the OAuth2 provider.
Args:
config: OAuth2 configuration
provider_id: Unique identifier for this provider instance
"""
self.config = config
self._provider_id = provider_id
self._client: Any | None = None
@property
def provider_id(self) -> str:
return self._provider_id
@property
def supported_types(self) -> list[CredentialType]:
return [CredentialType.OAUTH2, CredentialType.BEARER_TOKEN]
def _get_client(self) -> Any:
"""Get or create HTTP client."""
if self._client is None:
try:
import httpx
self._client = httpx.Client(timeout=self.config.request_timeout)
except ImportError as e:
raise ImportError(
"OAuth2 provider requires 'httpx'. Install with: uv pip install httpx"
) from e
return self._client
def _close_client(self) -> None:
"""Close the HTTP client."""
if self._client is not None:
self._client.close()
self._client = None
def __del__(self) -> None:
"""Cleanup HTTP client on deletion."""
self._close_client()
# --- Grant Types ---
def get_authorization_url(
self,
state: str,
redirect_uri: str,
scopes: list[str] | None = None,
**kwargs: Any,
) -> str:
"""
Generate authorization URL for user consent (Authorization Code flow).
Args:
state: Anti-CSRF state parameter (should be random and verified)
redirect_uri: Callback URL to receive the authorization code
scopes: Requested scopes (defaults to config.default_scopes)
**kwargs: Additional provider-specific parameters
Returns:
URL to redirect user for authorization
Raises:
ValueError: If authorization_url is not configured
"""
if not self.config.authorization_url:
raise ValueError("authorization_url not configured for this provider")
params = {
"client_id": self.config.client_id,
"redirect_uri": redirect_uri,
"response_type": "code",
"state": state,
"scope": " ".join(scopes or self.config.default_scopes),
**kwargs,
}
return f"{self.config.authorization_url}?{urlencode(params)}"
def exchange_code(
self,
code: str,
redirect_uri: str,
**kwargs: Any,
) -> OAuth2Token:
"""
Exchange authorization code for tokens (Authorization Code flow).
Args:
code: Authorization code from callback
redirect_uri: Same redirect_uri used in authorization request
**kwargs: Additional provider-specific parameters
Returns:
OAuth2Token with access_token and optional refresh_token
Raises:
OAuth2Error: If token exchange fails
"""
data = {
"grant_type": "authorization_code",
"client_id": self.config.client_id,
"client_secret": self.config.client_secret,
"code": code,
"redirect_uri": redirect_uri,
**self.config.extra_token_params,
**kwargs,
}
return self._token_request(data)
def client_credentials_grant(
self,
scopes: list[str] | None = None,
**kwargs: Any,
) -> OAuth2Token:
"""
Obtain token using client credentials (Client Credentials flow).
This is for server-to-server authentication where no user is involved.
Args:
scopes: Requested scopes (defaults to config.default_scopes)
**kwargs: Additional provider-specific parameters
Returns:
OAuth2Token (typically without refresh_token)
Raises:
OAuth2Error: If token request fails
"""
data = {
"grant_type": "client_credentials",
"client_id": self.config.client_id,
"client_secret": self.config.client_secret,
**self.config.extra_token_params,
**kwargs,
}
if scopes or self.config.default_scopes:
data["scope"] = " ".join(scopes or self.config.default_scopes)
return self._token_request(data)
def refresh_access_token(
self,
refresh_token: str,
scopes: list[str] | None = None,
**kwargs: Any,
) -> OAuth2Token:
"""
Refresh an expired access token (Refresh Token flow).
Args:
refresh_token: The refresh token
scopes: Scopes to request (defaults to original scopes)
**kwargs: Additional provider-specific parameters
Returns:
New OAuth2Token (may include new refresh_token)
Raises:
OAuth2Error: If refresh fails
RefreshTokenInvalidError: If refresh token is revoked/invalid
"""
data = {
"grant_type": "refresh_token",
"client_id": self.config.client_id,
"client_secret": self.config.client_secret,
"refresh_token": refresh_token,
**self.config.extra_token_params,
**kwargs,
}
if scopes:
data["scope"] = " ".join(scopes)
return self._token_request(data)
def revoke_token(
self,
token: str,
token_type_hint: str = "access_token",
) -> bool:
"""
Revoke a token (RFC 7009).
Args:
token: The token to revoke
token_type_hint: "access_token" or "refresh_token"
Returns:
True if revocation succeeded
"""
if not self.config.revocation_url:
logger.warning("revocation_url not configured, cannot revoke token")
return False
try:
client = self._get_client()
response = client.post(
self.config.revocation_url,
data={
"token": token,
"token_type_hint": token_type_hint,
"client_id": self.config.client_id,
"client_secret": self.config.client_secret,
},
headers={"Accept": "application/json", **self.config.extra_headers},
)
# RFC 7009: 200 indicates success (even if token was already invalid)
return response.status_code == 200
except Exception as e:
logger.error(f"Token revocation failed: {e}")
return False
# --- CredentialProvider Interface ---
def refresh(self, credential: CredentialObject) -> CredentialObject:
"""
Refresh a credential using its refresh token.
Implements CredentialProvider.refresh().
Args:
credential: The credential to refresh
Returns:
Updated credential with new access_token
Raises:
CredentialRefreshError: If refresh fails
"""
refresh_tok = credential.get_key("refresh_token")
if not refresh_tok:
raise CredentialRefreshError(f"Credential '{credential.id}' has no refresh_token")
try:
new_token = self.refresh_access_token(refresh_tok)
except OAuth2Error as e:
if e.error == "invalid_grant":
raise CredentialRefreshError(
f"Refresh token for '{credential.id}' is invalid or revoked. "
"Re-authorization required."
) from e
raise CredentialRefreshError(f"Failed to refresh '{credential.id}': {e}") from e
# Update credential
credential.set_key("access_token", new_token.access_token, expires_at=new_token.expires_at)
# Update refresh token if a new one was issued
if new_token.refresh_token and new_token.refresh_token != refresh_tok:
credential.set_key("refresh_token", new_token.refresh_token)
credential.last_refreshed = datetime.now(UTC)
logger.info(f"Refreshed OAuth2 credential '{credential.id}'")
return credential
def validate(self, credential: CredentialObject) -> bool:
"""
Validate that credential has a valid (non-expired) access_token.
Args:
credential: The credential to validate
Returns:
True if credential has valid access_token
"""
access_key = credential.keys.get("access_token")
if access_key is None:
return False
return not access_key.is_expired
def should_refresh(self, credential: CredentialObject) -> bool:
"""
Check if credential should be refreshed.
Returns True if access_token is expired or within 5 minutes of expiry.
"""
access_key = credential.keys.get("access_token")
if access_key is None:
return False
if access_key.expires_at is None:
return False
buffer = timedelta(minutes=5)
return datetime.now(UTC) >= (access_key.expires_at - buffer)
def revoke(self, credential: CredentialObject) -> bool:
"""
Revoke all tokens in a credential.
Args:
credential: The credential to revoke
Returns:
True if all revocations succeeded
"""
success = True
# Revoke access token
access_token = credential.get_key("access_token")
if access_token:
if not self.revoke_token(access_token, "access_token"):
success = False
# Revoke refresh token
refresh_token = credential.get_key("refresh_token")
if refresh_token:
if not self.revoke_token(refresh_token, "refresh_token"):
success = False
return success
# --- Token Request Helpers ---
def _token_request(self, data: dict[str, Any]) -> OAuth2Token:
"""
Make a token request to the OAuth2 server.
Args:
data: Form data for the token request
Returns:
OAuth2Token from the response
Raises:
OAuth2Error: If request fails or returns an error
"""
client = self._get_client()
headers = {
"Accept": "application/json",
"Content-Type": "application/x-www-form-urlencoded",
**self.config.extra_headers,
}
response = client.post(self.config.token_url, data=data, headers=headers)
# Parse response
content_type = response.headers.get("content-type", "")
if "application/json" in content_type:
response_data = response.json()
else:
# Some providers (like GitHub) may return form-encoded
response_data = self._parse_form_response(response.text)
# Check for error
if response.status_code != 200 or "error" in response_data:
error = response_data.get("error", "unknown_error")
description = response_data.get("error_description", response.text)
raise OAuth2Error(
error=error, description=description, status_code=response.status_code
)
return OAuth2Token.from_token_response(response_data)
def _parse_form_response(self, text: str) -> dict[str, str]:
"""Parse form-encoded response (some providers use this instead of JSON)."""
from urllib.parse import parse_qs
parsed = parse_qs(text)
return {k: v[0] if len(v) == 1 else v for k, v in parsed.items()}
# --- Token Formatting for Requests ---
def format_for_request(self, token: OAuth2Token) -> dict[str, Any]:
"""
Format token for use in HTTP requests (bipartisan model).
Args:
token: The OAuth2 token
Returns:
Dict with 'headers', 'params', or 'data' keys as appropriate
"""
placement = self.config.token_placement
if placement == TokenPlacement.HEADER_BEARER:
return {"headers": {"Authorization": f"{token.token_type} {token.access_token}"}}
elif placement == TokenPlacement.HEADER_CUSTOM:
header_name = self.config.custom_header_name or "X-Access-Token"
return {"headers": {header_name: token.access_token}}
elif placement == TokenPlacement.QUERY_PARAM:
return {"params": {self.config.query_param_name: token.access_token}}
elif placement == TokenPlacement.BODY_PARAM:
return {"data": {"access_token": token.access_token}}
return {}
def format_credential_for_request(self, credential: CredentialObject) -> dict[str, Any]:
"""
Format a credential for use in HTTP requests.
Args:
credential: The credential containing access_token
Returns:
Dict with 'headers', 'params', or 'data' keys as appropriate
"""
access_token = credential.get_key("access_token")
if not access_token:
return {}
token = OAuth2Token(
access_token=access_token,
token_type=credential.get_key("token_type") or "Bearer",
)
return self.format_for_request(token)
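End to end, the interactive grant composes roughly as below; the state handling and callback URLs are sketched for illustration, not prescribed by this class:

import secrets

provider = BaseOAuth2Provider(OAuth2Config(
    token_url="https://oauth2.example.com/token",
    authorization_url="https://oauth2.example.com/authorize",
    client_id="my-client-id",
    client_secret="my-client-secret",
))

state = secrets.token_urlsafe(32)
url = provider.get_authorization_url(
    state=state,
    redirect_uri="https://app.example.com/callback",
)
# Redirect the user to `url`; on callback, verify the returned state matches,
# then exchange the code:
token = provider.exchange_code(
    code="<code-from-callback>",
    redirect_uri="https://app.example.com/callback",
)
request_kwargs = provider.format_for_request(token)
# e.g. {"headers": {"Authorization": "Bearer ..."}} for HEADER_BEARER placement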
@@ -0,0 +1,112 @@
"""
HubSpot-specific OAuth2 provider.
Pre-configured for HubSpot's OAuth2 endpoints and CRM scopes.
Extends BaseOAuth2Provider for HubSpot-specific behavior.
Usage:
provider = HubSpotOAuth2Provider(
client_id="your-client-id",
client_secret="your-client-secret",
)
# Use with credential store
store = CredentialStore(
storage=EncryptedFileStorage(), # defaults to ~/.hive/credentials
providers=[provider],
)
See: https://developers.hubspot.com/docs/api/oauth-quickstart-guide
"""
from __future__ import annotations
import logging
from typing import Any
from ..models import CredentialObject, CredentialType
from .base_provider import BaseOAuth2Provider
from .provider import OAuth2Config
logger = logging.getLogger(__name__)
# HubSpot OAuth2 endpoints
HUBSPOT_TOKEN_URL = "https://api.hubapi.com/oauth/v1/token"
HUBSPOT_AUTHORIZATION_URL = "https://app.hubspot.com/oauth/authorize"
# Default CRM scopes for contacts, companies, and deals
HUBSPOT_DEFAULT_SCOPES = [
"crm.objects.contacts.read",
"crm.objects.contacts.write",
"crm.objects.companies.read",
"crm.objects.companies.write",
"crm.objects.deals.read",
"crm.objects.deals.write",
]
class HubSpotOAuth2Provider(BaseOAuth2Provider):
"""
HubSpot OAuth2 provider with pre-configured endpoints.
Handles HubSpot-specific OAuth2 behavior:
- Pre-configured token and authorization URLs
- Default CRM scopes for contacts, companies, and deals
- Token validation via HubSpot API
Example:
provider = HubSpotOAuth2Provider(
client_id="your-hubspot-client-id",
client_secret="your-hubspot-client-secret",
scopes=["crm.objects.contacts.read"], # Override default scopes
)
"""
def __init__(
self,
client_id: str,
client_secret: str,
scopes: list[str] | None = None,
):
config = OAuth2Config(
token_url=HUBSPOT_TOKEN_URL,
authorization_url=HUBSPOT_AUTHORIZATION_URL,
client_id=client_id,
client_secret=client_secret,
default_scopes=scopes or HUBSPOT_DEFAULT_SCOPES,
)
super().__init__(config, provider_id="hubspot_oauth2")
@property
def supported_types(self) -> list[CredentialType]:
return [CredentialType.OAUTH2]
def validate(self, credential: CredentialObject) -> bool:
"""
Validate HubSpot credential by making a lightweight API call.
Tests the access token against the contacts endpoint with limit=1.
"""
access_token = credential.get_key("access_token")
if not access_token:
return False
try:
client = self._get_client()
response = client.get(
"https://api.hubapi.com/crm/v3/objects/contacts",
headers={
"Authorization": f"Bearer {access_token}",
"Accept": "application/json",
},
params={"limit": "1"},
)
return response.status_code == 200
except Exception:
return False
def _parse_token_response(self, response_data: dict[str, Any]) -> Any:
"""Parse HubSpot token response."""
from .provider import OAuth2Token
return OAuth2Token.from_token_response(response_data)
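A hedged wiring sketch for the HubSpot provider; the client credentials and credential id are placeholders:

provider = HubSpotOAuth2Provider(
    client_id="your-hubspot-client-id",
    client_secret="your-hubspot-client-secret",
)
store = CredentialStore(
    storage=EncryptedFileStorage(),
    providers=[provider],
    auto_refresh=True,
)
cred = store.get_credential("hubspot_oauth")
if cred is not None and provider.validate(cred):  # live API check with limit=1
    print("HubSpot access token accepted")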
@@ -0,0 +1,363 @@
"""
Token lifecycle management for OAuth2 credentials.
This module provides the TokenLifecycleManager which coordinates
automatic token refresh with the credential store.
"""
from __future__ import annotations
import asyncio
import logging
from collections.abc import Callable
from dataclasses import dataclass
from datetime import UTC, datetime, timedelta
from typing import TYPE_CHECKING
from pydantic import SecretStr
from ..models import CredentialKey, CredentialObject, CredentialType
from .base_provider import BaseOAuth2Provider
from .provider import OAuth2Token
if TYPE_CHECKING:
from ..store import CredentialStore
logger = logging.getLogger(__name__)
@dataclass
class TokenRefreshResult:
"""Result of a token refresh operation."""
success: bool
token: OAuth2Token | None = None
error: str | None = None
needs_reauthorization: bool = False
class TokenLifecycleManager:
"""
Manages the complete lifecycle of OAuth2 tokens.
Responsibilities:
- Coordinate with CredentialStore for persistence
- Automatically refresh expired tokens
- Handle refresh failures gracefully
- Provide callbacks for monitoring
This class is useful when you need more control over token management
than the basic auto-refresh in CredentialStore provides.
Usage:
manager = TokenLifecycleManager(
provider=github_provider,
credential_id="github_oauth",
store=credential_store,
)
# Get valid token (auto-refreshes if needed)
token = await manager.get_valid_token()
# Use token
headers = provider.format_for_request(token)
Synchronous usage:
# For synchronous code, use sync_ methods
token = manager.sync_get_valid_token()
"""
def __init__(
self,
provider: BaseOAuth2Provider,
credential_id: str,
store: CredentialStore,
refresh_buffer_minutes: int = 5,
on_token_refreshed: Callable[[OAuth2Token], None] | None = None,
on_refresh_failed: Callable[[str], None] | None = None,
):
"""
Initialize the lifecycle manager.
Args:
provider: OAuth2 provider for token operations
credential_id: ID of the credential in the store
store: Credential store for persistence
refresh_buffer_minutes: Minutes before expiry to trigger refresh
on_token_refreshed: Callback when token is refreshed
on_refresh_failed: Callback when refresh fails
"""
self.provider = provider
self.credential_id = credential_id
self.store = store
self.refresh_buffer = timedelta(minutes=refresh_buffer_minutes)
self.on_token_refreshed = on_token_refreshed
self.on_refresh_failed = on_refresh_failed
# In-memory cache for performance
self._cached_token: OAuth2Token | None = None
self._cache_time: datetime | None = None
# --- Async Token Access ---
async def get_valid_token(self) -> OAuth2Token | None:
"""
Get a valid access token, refreshing if necessary.
This is the main entry point for async code.
Returns:
Valid OAuth2Token or None if unavailable
"""
# Check cache first
if self._cached_token and not self._needs_refresh(self._cached_token):
return self._cached_token
# Load from store
credential = self.store.get_credential(self.credential_id, refresh_if_needed=False)
if credential is None:
return None
# Convert to OAuth2Token
token = self._credential_to_token(credential)
if token is None:
return None
# Refresh if needed
if self._needs_refresh(token):
result = await self._async_refresh_token(credential)
if result.success and result.token:
token = result.token
elif result.needs_reauthorization:
logger.warning(f"Token for {self.credential_id} needs reauthorization")
return None
else:
# Use existing token if still technically valid
if token.is_expired:
return None
logger.warning(f"Refresh failed for {self.credential_id}, using existing token")
self._cached_token = token
self._cache_time = datetime.now(UTC)
return token
async def acquire_token_client_credentials(
self,
scopes: list[str] | None = None,
) -> OAuth2Token:
"""
Acquire a new token using client credentials flow.
For service-to-service authentication.
Args:
scopes: Scopes to request
Returns:
New OAuth2Token
"""
# Run in executor to avoid blocking
loop = asyncio.get_running_loop()
token = await loop.run_in_executor(
None, lambda: self.provider.client_credentials_grant(scopes=scopes)
)
self._save_token_to_store(token)
self._cached_token = token
return token
async def revoke(self) -> bool:
"""
Revoke tokens and clear from store.
Returns:
True if revocation succeeded
"""
credential = self.store.get_credential(self.credential_id, refresh_if_needed=False)
if credential:
self.provider.revoke(credential)
self.store.delete_credential(self.credential_id)
self._cached_token = None
return True
# --- Synchronous Token Access ---
def sync_get_valid_token(self) -> OAuth2Token | None:
"""
Synchronous version of get_valid_token().
For use in synchronous code.
"""
# Check cache
if self._cached_token and not self._needs_refresh(self._cached_token):
return self._cached_token
# Load from store
credential = self.store.get_credential(self.credential_id, refresh_if_needed=False)
if credential is None:
return None
token = self._credential_to_token(credential)
if token is None:
return None
# Refresh if needed
if self._needs_refresh(token):
result = self._sync_refresh_token(credential)
if result.success and result.token:
token = result.token
elif result.needs_reauthorization:
logger.warning(f"Token for {self.credential_id} needs reauthorization")
return None
else:
if token.is_expired:
return None
self._cached_token = token
self._cache_time = datetime.now(UTC)
return token
def sync_acquire_token_client_credentials(
self,
scopes: list[str] | None = None,
) -> OAuth2Token:
"""Synchronous version of acquire_token_client_credentials()."""
token = self.provider.client_credentials_grant(scopes=scopes)
self._save_token_to_store(token)
self._cached_token = token
return token
# --- Helper Methods ---
def _needs_refresh(self, token: OAuth2Token) -> bool:
"""Check if token needs refresh."""
if token.expires_at is None:
return False
return datetime.now(UTC) >= (token.expires_at - self.refresh_buffer)
def _credential_to_token(self, credential: CredentialObject) -> OAuth2Token | None:
"""Convert credential to OAuth2Token."""
access_token = credential.get_key("access_token")
if not access_token:
return None
expires_at = None
access_key = credential.keys.get("access_token")
if access_key:
expires_at = access_key.expires_at
return OAuth2Token(
access_token=access_token,
token_type="Bearer",
expires_at=expires_at,
refresh_token=credential.get_key("refresh_token"),
scope=credential.get_key("scope"),
)
def _save_token_to_store(self, token: OAuth2Token) -> None:
"""Save token to credential store."""
credential = CredentialObject(
id=self.credential_id,
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr(token.access_token),
expires_at=token.expires_at,
),
},
provider_id=self.provider.provider_id,
auto_refresh=True,
)
if token.refresh_token:
credential.keys["refresh_token"] = CredentialKey(
name="refresh_token",
value=SecretStr(token.refresh_token),
)
if token.scope:
credential.keys["scope"] = CredentialKey(
name="scope",
value=SecretStr(token.scope),
)
self.store.save_credential(credential)
async def _async_refresh_token(self, credential: CredentialObject) -> TokenRefreshResult:
"""Async wrapper for token refresh."""
loop = asyncio.get_running_loop()
return await loop.run_in_executor(None, lambda: self._sync_refresh_token(credential))
def _sync_refresh_token(self, credential: CredentialObject) -> TokenRefreshResult:
"""Synchronously refresh token."""
refresh_token = credential.get_key("refresh_token")
if not refresh_token:
return TokenRefreshResult(
success=False,
error="No refresh token available",
needs_reauthorization=True,
)
try:
new_token = self.provider.refresh_access_token(refresh_token)
# Save to store
self._save_token_to_store(new_token)
# Notify callback
if self.on_token_refreshed:
self.on_token_refreshed(new_token)
logger.info(f"Token refreshed for {self.credential_id}")
return TokenRefreshResult(success=True, token=new_token)
except Exception as e:
error_msg = str(e)
# Check for refresh token revocation
if "invalid_grant" in error_msg.lower():
return TokenRefreshResult(
success=False,
error=error_msg,
needs_reauthorization=True,
)
if self.on_refresh_failed:
self.on_refresh_failed(error_msg)
logger.error(f"Token refresh failed for {self.credential_id}: {e}")
return TokenRefreshResult(success=False, error=error_msg)
def invalidate_cache(self) -> None:
"""Clear cached token."""
self._cached_token = None
self._cache_time = None
# --- Convenience Methods ---
def get_request_headers(self) -> dict[str, str]:
"""
Get headers for HTTP request with current token.
Returns empty dict if no valid token.
"""
token = self.sync_get_valid_token()
if token is None:
return {}
result = self.provider.format_for_request(token)
return result.get("headers", {})
def get_request_kwargs(self) -> dict:
"""
Get kwargs for HTTP request (headers, params, etc.).
Returns empty dict if no valid token.
"""
token = self.sync_get_valid_token()
if token is None:
return {}
return self.provider.format_for_request(token)
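A hedged end-to-end sketch of the lifecycle manager, reusing the concrete HubSpot provider defined above. It assumes `store` and `logger` are defined in the surrounding scope and that a "hubspot" credential already exists in the store; all ids are illustrative.
provider = HubSpotOAuth2Provider(
    client_id="client-id",
    client_secret="client-secret",
)
manager = TokenLifecycleManager(
    provider=provider,
    credential_id="hubspot",  # must already exist in the store
    store=store,              # a CredentialStore holding that credential
    refresh_buffer_minutes=5,
    on_refresh_failed=lambda err: logger.error("refresh failed: %s", err),
)
# Returns {} when no valid token can be loaded or refreshed.
headers = manager.get_request_headers()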
@@ -0,0 +1,213 @@
"""
OAuth2 types and configuration.
This module defines the core OAuth2 data structures:
- OAuth2Token: Represents an access token with metadata
- OAuth2Config: Configuration for OAuth2 endpoints
- TokenPlacement: Where to place tokens in requests
"""
from __future__ import annotations
from dataclasses import dataclass, field
from datetime import UTC, datetime, timedelta
from enum import StrEnum
from typing import Any
class TokenPlacement(StrEnum):
"""Where to place the access token in HTTP requests."""
HEADER_BEARER = "header_bearer"
"""Authorization: Bearer <token> (most common)"""
HEADER_CUSTOM = "header_custom"
"""Custom header name (e.g., X-Access-Token)"""
QUERY_PARAM = "query_param"
"""Query parameter (e.g., ?access_token=<token>)"""
BODY_PARAM = "body_param"
"""Form body parameter"""
@dataclass
class OAuth2Token:
"""
Represents an OAuth2 token with metadata.
Attributes:
access_token: The access token string
token_type: Token type (usually "Bearer")
expires_at: When the token expires
refresh_token: Optional refresh token
scope: Granted scopes (space-separated)
raw_response: Original token response from server
"""
access_token: str
token_type: str = "Bearer"
expires_at: datetime | None = None
refresh_token: str | None = None
scope: str | None = None
raw_response: dict[str, Any] = field(default_factory=dict)
@property
def is_expired(self) -> bool:
"""
Check if token is expired.
Uses a 5-minute buffer to account for clock skew and
request latency.
"""
if self.expires_at is None:
return False
buffer = timedelta(minutes=5)
return datetime.now(UTC) >= (self.expires_at - buffer)
@property
def can_refresh(self) -> bool:
"""Check if token can be refreshed (has refresh_token)."""
return self.refresh_token is not None and self.refresh_token.strip() != ""
@property
def expires_in_seconds(self) -> int | None:
"""Get seconds until expiration, or None if no expiration."""
if self.expires_at is None:
return None
delta = self.expires_at - datetime.now(UTC)
return max(0, int(delta.total_seconds()))
@classmethod
def from_token_response(cls, data: dict[str, Any]) -> OAuth2Token:
"""
Create OAuth2Token from an OAuth2 token endpoint response.
Args:
data: Token response JSON (access_token, token_type, expires_in, etc.)
Returns:
OAuth2Token instance
"""
expires_at = None
if "expires_in" in data:
expires_at = datetime.now(UTC) + timedelta(seconds=data["expires_in"])
return cls(
access_token=data["access_token"],
token_type=data.get("token_type", "Bearer"),
expires_at=expires_at,
refresh_token=data.get("refresh_token"),
scope=data.get("scope"),
raw_response=data,
)
@dataclass
class OAuth2Config:
"""
Configuration for an OAuth2 provider.
This contains all the information needed to perform OAuth2 operations
for a specific provider (GitHub, Google, Salesforce, etc.).
Attributes:
token_url: URL for token endpoint (required)
authorization_url: URL for authorization endpoint (optional, for auth code flow)
revocation_url: URL for token revocation (optional)
introspection_url: URL for token introspection (optional)
client_id: OAuth2 client ID
client_secret: OAuth2 client secret
default_scopes: Default scopes to request
token_placement: How to include token in requests
custom_header_name: Header name when using HEADER_CUSTOM placement
query_param_name: Query param name when using QUERY_PARAM placement
extra_token_params: Additional parameters for token requests
request_timeout: Timeout for HTTP requests in seconds
Example:
config = OAuth2Config(
token_url="https://github.com/login/oauth/access_token",
authorization_url="https://github.com/login/oauth/authorize",
client_id="your-client-id",
client_secret="your-client-secret",
default_scopes=["repo", "user"],
)
"""
# Endpoints (only token_url is strictly required)
token_url: str
authorization_url: str | None = None
revocation_url: str | None = None
introspection_url: str | None = None
# Client credentials
client_id: str = ""
client_secret: str = ""
# Scopes
default_scopes: list[str] = field(default_factory=list)
# Token placement for API calls (bipartisan model)
token_placement: TokenPlacement = TokenPlacement.HEADER_BEARER
custom_header_name: str | None = None
query_param_name: str = "access_token"
# Request configuration
extra_token_params: dict[str, str] = field(default_factory=dict)
request_timeout: float = 30.0
# Additional headers for token requests
extra_headers: dict[str, str] = field(default_factory=dict)
def __post_init__(self) -> None:
"""Validate configuration."""
if not self.token_url:
raise ValueError("token_url is required")
if self.token_placement == TokenPlacement.HEADER_CUSTOM and not self.custom_header_name:
raise ValueError("custom_header_name is required when using HEADER_CUSTOM placement")
class OAuth2Error(Exception):
"""
OAuth2 protocol error.
Attributes:
error: OAuth2 error code (e.g., 'invalid_grant', 'invalid_client')
description: Human-readable error description
status_code: HTTP status code from the response
"""
def __init__(
self,
error: str,
description: str = "",
status_code: int = 0,
):
self.error = error
self.description = description
self.status_code = status_code
super().__init__(f"{error}: {description}" if description else error)
class TokenExpiredError(OAuth2Error):
"""Raised when a token has expired and cannot be used."""
def __init__(self, credential_id: str):
super().__init__(
error="token_expired",
description=f"Token for '{credential_id}' has expired",
)
self.credential_id = credential_id
class RefreshTokenInvalidError(OAuth2Error):
"""Raised when the refresh token is invalid or revoked."""
def __init__(self, credential_id: str, reason: str = ""):
description = f"Refresh token for '{credential_id}' is invalid"
if reason:
description += f": {reason}"
super().__init__(error="invalid_grant", description=description)
self.credential_id = credential_id
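A small sketch of token parsing and the expiry buffer defined above; the response payload is illustrative.
response_json = {
    "access_token": "abc123",
    "token_type": "Bearer",
    "expires_in": 3600,
    "refresh_token": "def456",
    "scope": "repo user",
}
token = OAuth2Token.from_token_response(response_json)
assert token.can_refresh
# With expires_in=3600 and the 5-minute skew buffer, is_expired stays
# False until roughly 55 minutes from now.
assert not token.is_expired
assert token.expires_in_seconds is not None and token.expires_in_seconds <= 3600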
@@ -0,0 +1,283 @@
"""
Provider interface for credential lifecycle management.
Providers handle credential lifecycle operations:
- Refresh: Obtain new tokens when expired
- Validate: Check if credentials are still working
- Revoke: Invalidate credentials when no longer needed
OSS users can implement custom providers by subclassing CredentialProvider.
"""
from __future__ import annotations
import logging
from abc import ABC, abstractmethod
from datetime import UTC, datetime, timedelta
from .models import CredentialObject, CredentialRefreshError, CredentialType
logger = logging.getLogger(__name__)
class CredentialProvider(ABC):
"""
Abstract base class for credential providers.
Providers handle credential lifecycle operations:
- refresh(): Obtain new tokens when expired
- validate(): Check if credentials are still working
- should_refresh(): Determine if a credential needs refresh
- revoke(): Invalidate credentials (optional)
Example custom provider:
class MyCustomProvider(CredentialProvider):
@property
def provider_id(self) -> str:
return "my_custom"
@property
def supported_types(self) -> list[CredentialType]:
return [CredentialType.CUSTOM]
def refresh(self, credential: CredentialObject) -> CredentialObject:
# Custom refresh logic
new_token = my_api.refresh(credential.get_key("api_key"))
credential.set_key("access_token", new_token)
return credential
def validate(self, credential: CredentialObject) -> bool:
token = credential.get_key("access_token")
return my_api.validate(token)
"""
@property
@abstractmethod
def provider_id(self) -> str:
"""
Unique identifier for this provider.
Examples: 'static', 'oauth2', 'my_custom_auth'
"""
pass
@property
@abstractmethod
def supported_types(self) -> list[CredentialType]:
"""
Credential types this provider can manage.
Returns:
List of CredentialType enums this provider supports
"""
pass
@abstractmethod
def refresh(self, credential: CredentialObject) -> CredentialObject:
"""
Refresh the credential (e.g., use refresh_token to get new access_token).
This method should:
1. Use existing credential data to obtain new values
2. Update the credential object with new values
3. Set appropriate expiration times
4. Update last_refreshed timestamp
Args:
credential: The credential to refresh
Returns:
Updated credential with new values
Raises:
CredentialRefreshError: If refresh fails
"""
pass
@abstractmethod
def validate(self, credential: CredentialObject) -> bool:
"""
Validate that a credential is still working.
This might involve:
- Checking expiration times
- Making a test API call
- Validating token signatures
Args:
credential: The credential to validate
Returns:
True if credential is valid, False otherwise
"""
pass
def should_refresh(self, credential: CredentialObject) -> bool:
"""
Determine if a credential should be refreshed.
Default implementation: refresh if any key is expired or within
5 minutes of expiry. Override for custom logic.
Args:
credential: The credential to check
Returns:
True if credential should be refreshed
"""
buffer = timedelta(minutes=5)
now = datetime.now(UTC)
for key in credential.keys.values():
if key.expires_at is not None:
if key.expires_at <= now + buffer:
return True
return False
def revoke(self, credential: CredentialObject) -> bool:
"""
Revoke a credential (optional operation).
Not all providers support revocation. The default implementation
logs a warning and returns False.
Args:
credential: The credential to revoke
Returns:
True if revocation succeeded, False otherwise
"""
logger.warning(f"Provider '{self.provider_id}' does not support revocation")
return False
def can_handle(self, credential: CredentialObject) -> bool:
"""
Check if this provider can handle a credential.
Args:
credential: The credential to check
Returns:
True if this provider can manage the credential
"""
return credential.credential_type in self.supported_types
class StaticProvider(CredentialProvider):
"""
Provider for static credentials that never need refresh.
Use for simple API keys that don't expire, such as:
- Brave Search API key
- OpenAI API key
- Basic auth credentials
Static credentials are considered valid as long as they have at least one key with a non-empty value.
"""
@property
def provider_id(self) -> str:
return "static"
@property
def supported_types(self) -> list[CredentialType]:
return [CredentialType.API_KEY, CredentialType.BASIC_AUTH, CredentialType.CUSTOM]
def refresh(self, credential: CredentialObject) -> CredentialObject:
"""
Static credentials don't need refresh.
Returns the credential unchanged.
"""
logger.debug(f"Static credential '{credential.id}' does not need refresh")
return credential
def validate(self, credential: CredentialObject) -> bool:
"""
Validate that credential has at least one key with a value.
For static credentials, we can't verify the key works without
making an API call, so we just check existence.
"""
if not credential.keys:
return False
# Check at least one key has a non-empty value
for key in credential.keys.values():
try:
value = key.get_secret_value()
if value and value.strip():
return True
except Exception:
continue
return False
def should_refresh(self, credential: CredentialObject) -> bool:
"""Static credentials never need refresh."""
return False
class BearerTokenProvider(CredentialProvider):
"""
Provider for bearer tokens without refresh capability.
Use for JWTs or tokens that:
- Have an expiration time
- Cannot be refreshed (no refresh token)
- Must be re-obtained when expired
This provider validates based on expiration time only.
"""
@property
def provider_id(self) -> str:
return "bearer_token"
@property
def supported_types(self) -> list[CredentialType]:
return [CredentialType.BEARER_TOKEN]
def refresh(self, credential: CredentialObject) -> CredentialObject:
"""
Bearer tokens without refresh capability cannot be refreshed.
Raises:
CredentialRefreshError: Always, as refresh is not supported
"""
raise CredentialRefreshError(
f"Bearer token '{credential.id}' cannot be refreshed. "
"Obtain a new token and save it to the credential store."
)
def validate(self, credential: CredentialObject) -> bool:
"""
Validate based on expiration time.
Returns True if token exists and is not expired.
"""
access_key = credential.keys.get("access_token") or credential.keys.get("token")
if access_key is None:
return False
# Check if expired
return not access_key.is_expired
def should_refresh(self, credential: CredentialObject) -> bool:
"""
Check if token is expired or near expiration.
Note: Even though this returns True for expired tokens,
refresh() will fail. This allows the store to know the
credential needs attention.
"""
buffer = timedelta(minutes=5)
now = datetime.now(UTC)
for key_name in ["access_token", "token"]:
key = credential.keys.get(key_name)
if key and key.expires_at:
if key.expires_at <= now + buffer:
return True
return False
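A sketch of the default refresh window, using a key that expires inside the 5-minute buffer. The import path follows the tests, and the credential id and token value are illustrative.
from datetime import UTC, datetime, timedelta

from pydantic import SecretStr

from core.framework.credentials import CredentialKey, CredentialObject, CredentialType

cred = CredentialObject(
    id="demo_bearer",
    credential_type=CredentialType.BEARER_TOKEN,
    keys={
        "access_token": CredentialKey(
            name="access_token",
            value=SecretStr("xxx"),
            # Expires in 2 minutes: inside the 5-minute buffer.
            expires_at=datetime.now(UTC) + timedelta(minutes=2),
        )
    },
)
provider = BearerTokenProvider()
assert provider.should_refresh(cred)  # flagged for attention, though refresh() raises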
@@ -0,0 +1,519 @@
"""
Storage backends for the credential store.
This module provides abstract and concrete storage implementations:
- CredentialStorage: Abstract base class
- EncryptedFileStorage: Fernet-encrypted JSON files (default for production)
- EnvVarStorage: Environment variable reading (backward compatibility)
- InMemoryStorage: For testing
"""
from __future__ import annotations
import json
import logging
import os
from abc import ABC, abstractmethod
from datetime import UTC, datetime
from pathlib import Path
from typing import Any
from pydantic import SecretStr
from .models import CredentialDecryptionError, CredentialKey, CredentialObject, CredentialType
logger = logging.getLogger(__name__)
class CredentialStorage(ABC):
"""
Abstract storage backend for credentials.
Implementations must provide save, load, delete, list_all, and exists methods.
All implementations should handle serialization of SecretStr values securely.
"""
@abstractmethod
def save(self, credential: CredentialObject) -> None:
"""
Save a credential to storage.
Args:
credential: The credential object to save
"""
pass
@abstractmethod
def load(self, credential_id: str) -> CredentialObject | None:
"""
Load a credential from storage.
Args:
credential_id: The ID of the credential to load
Returns:
CredentialObject if found, None otherwise
"""
pass
@abstractmethod
def delete(self, credential_id: str) -> bool:
"""
Delete a credential from storage.
Args:
credential_id: The ID of the credential to delete
Returns:
True if the credential existed and was deleted, False otherwise
"""
pass
@abstractmethod
def list_all(self) -> list[str]:
"""
List all credential IDs in storage.
Returns:
List of credential IDs
"""
pass
@abstractmethod
def exists(self, credential_id: str) -> bool:
"""
Check if a credential exists in storage.
Args:
credential_id: The ID to check
Returns:
True if credential exists, False otherwise
"""
pass
class EncryptedFileStorage(CredentialStorage):
"""
Encrypted file-based credential storage.
Uses Fernet symmetric encryption (AES-128-CBC + HMAC) for at-rest encryption.
Each credential is stored as a separate encrypted JSON file.
Directory structure:
{base_path}/
credentials/
{credential_id}.enc # Encrypted credential JSON
metadata/
index.json # Index of all credentials (unencrypted)
The encryption key is read from the HIVE_CREDENTIAL_KEY environment variable.
If not set, a new key is generated (and must be persisted for data recovery).
Example:
storage = EncryptedFileStorage("~/.hive/credentials")
storage.save(credential)
credential = storage.load("brave_search")
"""
DEFAULT_PATH = "~/.hive/credentials"
def __init__(
self,
base_path: str | Path | None = None,
encryption_key: bytes | None = None,
key_env_var: str = "HIVE_CREDENTIAL_KEY",
):
"""
Initialize encrypted storage.
Args:
base_path: Directory for credential files. Defaults to ~/.hive/credentials.
encryption_key: 32-byte Fernet key. If None, reads from env var.
key_env_var: Environment variable containing encryption key
"""
try:
from cryptography.fernet import Fernet
except ImportError as e:
raise ImportError(
"Encrypted storage requires 'cryptography'. "
"Install with: uv pip install cryptography"
) from e
self.base_path = Path(base_path or self.DEFAULT_PATH).expanduser()
self._ensure_dirs()
self._key_env_var = key_env_var
# Get or generate encryption key
if encryption_key:
self._key = encryption_key
else:
key_str = os.environ.get(key_env_var)
if key_str:
self._key = key_str.encode()
else:
# Generate new key
self._key = Fernet.generate_key()
logger.warning(
f"Generated new encryption key. To persist credentials across restarts, "
f"set {key_env_var}={self._key.decode()}"
)
self._fernet = Fernet(self._key)
def _ensure_dirs(self) -> None:
"""Create directory structure."""
(self.base_path / "credentials").mkdir(parents=True, exist_ok=True)
(self.base_path / "metadata").mkdir(parents=True, exist_ok=True)
def _cred_path(self, credential_id: str) -> Path:
"""Get the file path for a credential."""
# Sanitize credential_id to prevent path traversal
safe_id = credential_id.replace("/", "_").replace("\\", "_").replace("..", "_")
return self.base_path / "credentials" / f"{safe_id}.enc"
def save(self, credential: CredentialObject) -> None:
"""Encrypt and save credential."""
# Serialize credential
data = self._serialize_credential(credential)
json_bytes = json.dumps(data, default=str).encode()
# Encrypt
encrypted = self._fernet.encrypt(json_bytes)
# Write to file
cred_path = self._cred_path(credential.id)
with open(cred_path, "wb") as f:
f.write(encrypted)
# Update index
self._update_index(credential.id, "save", credential.credential_type.value)
logger.debug(f"Saved encrypted credential '{credential.id}'")
def load(self, credential_id: str) -> CredentialObject | None:
"""Load and decrypt credential."""
cred_path = self._cred_path(credential_id)
if not cred_path.exists():
return None
# Read encrypted data
with open(cred_path, "rb") as f:
encrypted = f.read()
# Decrypt
try:
json_bytes = self._fernet.decrypt(encrypted)
data = json.loads(json_bytes.decode())
except Exception as e:
raise CredentialDecryptionError(
f"Failed to decrypt credential '{credential_id}': {e}"
) from e
# Deserialize
return self._deserialize_credential(data)
def delete(self, credential_id: str) -> bool:
"""Delete a credential file."""
cred_path = self._cred_path(credential_id)
if cred_path.exists():
cred_path.unlink()
self._update_index(credential_id, "delete")
logger.debug(f"Deleted credential '{credential_id}'")
return True
return False
def list_all(self) -> list[str]:
"""List all credential IDs."""
index_path = self.base_path / "metadata" / "index.json"
if not index_path.exists():
return []
with open(index_path) as f:
index = json.load(f)
return list(index.get("credentials", {}).keys())
def exists(self, credential_id: str) -> bool:
"""Check if credential exists."""
return self._cred_path(credential_id).exists()
def _serialize_credential(self, credential: CredentialObject) -> dict[str, Any]:
"""Convert credential to JSON-serializable dict, extracting secret values."""
data = credential.model_dump(mode="json")
# Extract actual secret values from SecretStr
for key_name, key_data in data.get("keys", {}).items():
if "value" in key_data:
# SecretStr serializes as "**********", need actual value
actual_key = credential.keys.get(key_name)
if actual_key:
key_data["value"] = actual_key.get_secret_value()
return data
def _deserialize_credential(self, data: dict[str, Any]) -> CredentialObject:
"""Reconstruct credential from dict, wrapping values in SecretStr."""
# Convert plain values back to SecretStr
for key_data in data.get("keys", {}).values():
if "value" in key_data and isinstance(key_data["value"], str):
key_data["value"] = SecretStr(key_data["value"])
return CredentialObject.model_validate(data)
def _update_index(
self,
credential_id: str,
operation: str,
credential_type: str | None = None,
) -> None:
"""Update the metadata index."""
index_path = self.base_path / "metadata" / "index.json"
if index_path.exists():
with open(index_path) as f:
index = json.load(f)
else:
index = {"credentials": {}, "version": "1.0"}
if operation == "save":
index["credentials"][credential_id] = {
"updated_at": datetime.now(UTC).isoformat(),
"type": credential_type,
}
elif operation == "delete":
index["credentials"].pop(credential_id, None)
index["last_modified"] = datetime.now(UTC).isoformat()
with open(index_path, "w") as f:
json.dump(index, f, indent=2)
class EnvVarStorage(CredentialStorage):
"""
Environment variable-based storage for backward compatibility.
Maps credential IDs to environment variable patterns.
Supports hot-reload from .env files using python-dotenv.
This storage is READ-ONLY - credentials cannot be saved at runtime.
Example:
storage = EnvVarStorage(
env_mapping={"brave_search": "BRAVE_SEARCH_API_KEY"},
dotenv_path=Path(".env")
)
credential = storage.load("brave_search")
"""
def __init__(
self,
env_mapping: dict[str, str] | None = None,
dotenv_path: Path | None = None,
):
"""
Initialize env var storage.
Args:
env_mapping: Map of credential_id -> env_var_name
e.g., {"brave_search": "BRAVE_SEARCH_API_KEY"}
If not provided, uses {CREDENTIAL_ID}_API_KEY pattern
dotenv_path: Path to .env file for hot-reload support
"""
self._env_mapping = env_mapping or {}
self._dotenv_path = dotenv_path or Path.cwd() / ".env"
def _get_env_var_name(self, credential_id: str) -> str:
"""Get the environment variable name for a credential."""
if credential_id in self._env_mapping:
return self._env_mapping[credential_id]
# Default pattern: CREDENTIAL_ID_API_KEY
return f"{credential_id.upper().replace('-', '_')}_API_KEY"
def _read_env_value(self, env_var: str) -> str | None:
"""Read value from env var or .env file."""
# Check os.environ first (takes precedence)
value = os.environ.get(env_var)
if value:
return value
# Fallback: read from .env file (hot-reload)
if self._dotenv_path.exists():
try:
from dotenv import dotenv_values
values = dotenv_values(self._dotenv_path)
return values.get(env_var)
except ImportError:
logger.debug("python-dotenv not installed, skipping .env file")
return None
return None
def save(self, credential: CredentialObject) -> None:
"""Cannot save to environment variables at runtime."""
raise NotImplementedError(
"EnvVarStorage is read-only. Set environment variables "
"externally or use EncryptedFileStorage."
)
def load(self, credential_id: str) -> CredentialObject | None:
"""Load credential from environment variable."""
env_var = self._get_env_var_name(credential_id)
value = self._read_env_value(env_var)
if not value:
return None
return CredentialObject(
id=credential_id,
credential_type=CredentialType.API_KEY,
keys={"api_key": CredentialKey(name="api_key", value=SecretStr(value))},
description=f"Loaded from {env_var}",
)
def delete(self, credential_id: str) -> bool:
"""Cannot delete environment variables at runtime."""
raise NotImplementedError(
"EnvVarStorage is read-only. Unset environment variables externally."
)
def list_all(self) -> list[str]:
"""List credentials that are available in environment."""
available = []
# Check mapped credentials
for cred_id in self._env_mapping.keys():
if self.exists(cred_id):
available.append(cred_id)
return available
def exists(self, credential_id: str) -> bool:
"""Check if credential is available in environment."""
env_var = self._get_env_var_name(credential_id)
return self._read_env_value(env_var) is not None
def add_mapping(self, credential_id: str, env_var: str) -> None:
"""
Add a credential ID to environment variable mapping.
Args:
credential_id: The credential identifier
env_var: The environment variable name
"""
self._env_mapping[credential_id] = env_var
class InMemoryStorage(CredentialStorage):
"""
In-memory storage for testing.
Credentials are stored in a dictionary and lost when the process exits.
Example:
storage = InMemoryStorage()
storage.save(credential)
credential = storage.load("test_cred")
"""
def __init__(self, initial_data: dict[str, CredentialObject] | None = None):
"""
Initialize in-memory storage.
Args:
initial_data: Optional dict of credential_id -> CredentialObject
"""
self._data: dict[str, CredentialObject] = initial_data or {}
def save(self, credential: CredentialObject) -> None:
"""Save credential to memory."""
self._data[credential.id] = credential
def load(self, credential_id: str) -> CredentialObject | None:
"""Load credential from memory."""
return self._data.get(credential_id)
def delete(self, credential_id: str) -> bool:
"""Delete credential from memory."""
if credential_id in self._data:
del self._data[credential_id]
return True
return False
def list_all(self) -> list[str]:
"""List all credential IDs."""
return list(self._data.keys())
def exists(self, credential_id: str) -> bool:
"""Check if credential exists."""
return credential_id in self._data
def clear(self) -> None:
"""Clear all credentials."""
self._data.clear()
class CompositeStorage(CredentialStorage):
"""
Composite storage that reads from multiple backends.
Useful for layering storages, e.g., encrypted file with env var fallback:
- Writes go to the primary storage
- Reads check primary first, then fallback storages
Example:
storage = CompositeStorage(
primary=EncryptedFileStorage("~/.hive/credentials"),
fallbacks=[EnvVarStorage({"brave_search": "BRAVE_SEARCH_API_KEY"})]
)
"""
def __init__(
self,
primary: CredentialStorage,
fallbacks: list[CredentialStorage] | None = None,
):
"""
Initialize composite storage.
Args:
primary: Primary storage for writes and first read attempt
fallbacks: List of fallback storages to check if primary doesn't have credential
"""
self._primary = primary
self._fallbacks = fallbacks or []
def save(self, credential: CredentialObject) -> None:
"""Save to primary storage."""
self._primary.save(credential)
def load(self, credential_id: str) -> CredentialObject | None:
"""Load from primary, then fallbacks."""
# Try primary first
credential = self._primary.load(credential_id)
if credential is not None:
return credential
# Try fallbacks
for fallback in self._fallbacks:
credential = fallback.load(credential_id)
if credential is not None:
return credential
return None
def delete(self, credential_id: str) -> bool:
"""Delete from primary storage only."""
return self._primary.delete(credential_id)
def list_all(self) -> list[str]:
"""List credentials from all storages."""
all_ids = set(self._primary.list_all())
for fallback in self._fallbacks:
all_ids.update(fallback.list_all())
return list(all_ids)
def exists(self, credential_id: str) -> bool:
"""Check if credential exists in any storage."""
if self._primary.exists(credential_id):
return True
return any(fallback.exists(credential_id) for fallback in self._fallbacks)
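A layering sketch: an in-memory primary with an env-var fallback, so the example runs without filesystem or encryption-key setup. The env var name and value are illustrative.
import os

os.environ["BRAVE_SEARCH_API_KEY"] = "env-key"  # illustrative
storage = CompositeStorage(
    primary=InMemoryStorage(),
    fallbacks=[EnvVarStorage({"brave_search": "BRAVE_SEARCH_API_KEY"})],
)
# A primary miss falls through to the env-var fallback.
cred = storage.load("brave_search")
assert cred is not None and cred.get_key("api_key") == "env-key"
# Writes always land in the primary backend.
storage.save(cred)
assert "brave_search" in storage.list_all()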
@@ -0,0 +1,708 @@
"""
Main credential store orchestrating storage, providers, and template resolution.
The CredentialStore is the primary interface for credential management, providing:
- Multi-backend storage (file, env, vault)
- Provider-based lifecycle management (refresh, validate)
- Template resolution for {{cred.key}} patterns
- Caching with TTL for performance
- Thread-safe operations
"""
from __future__ import annotations
import logging
import threading
from datetime import UTC, datetime
from typing import Any
from pydantic import SecretStr
from .models import (
CredentialKey,
CredentialObject,
CredentialRefreshError,
CredentialUsageSpec,
)
from .provider import CredentialProvider, StaticProvider
from .storage import CredentialStorage, EnvVarStorage, InMemoryStorage
from .template import TemplateResolver
logger = logging.getLogger(__name__)
class CredentialStore:
"""
Main credential store orchestrating storage, providers, and template resolution.
Features:
- Multi-backend storage (file, env, vault)
- Provider-based lifecycle management (refresh, validate)
- Template resolution for {{cred.key}} patterns
- Caching with TTL for performance
- Thread-safe operations
Usage:
# Basic usage
store = CredentialStore(
storage=EncryptedFileStorage("~/.hive/credentials"),
providers=[OAuth2Provider(), StaticProvider()]
)
# Get a credential
cred = store.get_credential("github_oauth")
# Resolve templates in headers
headers = store.resolve_headers({
"Authorization": "Bearer {{github_oauth.access_token}}"
})
# Register a tool's credential requirements
store.register_usage(CredentialUsageSpec(
credential_id="brave_search",
required_keys=["api_key"],
headers={"X-Subscription-Token": "{{brave_search.api_key}}"}
))
"""
def __init__(
self,
storage: CredentialStorage | None = None,
providers: list[CredentialProvider] | None = None,
cache_ttl_seconds: int = 300,
auto_refresh: bool = True,
):
"""
Initialize the credential store.
Args:
storage: Storage backend. Defaults to EnvVarStorage for compatibility.
providers: List of credential providers. Defaults to [StaticProvider()].
cache_ttl_seconds: How long to cache credentials in memory (default: 5 minutes).
auto_refresh: Whether to auto-refresh expired credentials on access.
"""
self._storage = storage or EnvVarStorage()
self._providers: dict[str, CredentialProvider] = {}
self._usage_specs: dict[str, CredentialUsageSpec] = {}
# Cache: credential_id -> (CredentialObject, cached_at)
self._cache: dict[str, tuple[CredentialObject, datetime]] = {}
self._cache_ttl = cache_ttl_seconds
self._lock = threading.RLock()
self._auto_refresh = auto_refresh
# Register providers
for provider in providers or [StaticProvider()]:
self.register_provider(provider)
# Template resolver
self._resolver = TemplateResolver(self)
# --- Provider Management ---
def register_provider(self, provider: CredentialProvider) -> None:
"""
Register a credential provider.
Args:
provider: The provider to register
"""
self._providers[provider.provider_id] = provider
logger.debug(f"Registered credential provider: {provider.provider_id}")
def get_provider(self, provider_id: str) -> CredentialProvider | None:
"""
Get a provider by ID.
Args:
provider_id: The provider identifier
Returns:
The provider if found, None otherwise
"""
return self._providers.get(provider_id)
def get_provider_for_credential(
self, credential: CredentialObject
) -> CredentialProvider | None:
"""
Get the appropriate provider for a credential.
Args:
credential: The credential to find a provider for
Returns:
The provider if found, None otherwise
"""
# First, check if credential specifies a provider
if credential.provider_id:
provider = self._providers.get(credential.provider_id)
if provider:
return provider
# Fall back to finding a provider that supports this type
for provider in self._providers.values():
if provider.can_handle(credential):
return provider
return None
# --- Usage Spec Management ---
def register_usage(self, spec: CredentialUsageSpec) -> None:
"""
Register how a tool uses credentials.
Args:
spec: The usage specification
"""
self._usage_specs[spec.credential_id] = spec
def get_usage_spec(self, credential_id: str) -> CredentialUsageSpec | None:
"""
Get the usage spec for a credential.
Args:
credential_id: The credential identifier
Returns:
The usage spec if registered, None otherwise
"""
return self._usage_specs.get(credential_id)
# --- Credential Access ---
def get_credential(
self,
credential_id: str,
refresh_if_needed: bool = True,
) -> CredentialObject | None:
"""
Get a credential by ID.
Args:
credential_id: The credential identifier
refresh_if_needed: If True, refresh expired credentials
Returns:
CredentialObject or None if not found
"""
with self._lock:
# Check cache
cached = self._get_from_cache(credential_id)
if cached is not None:
if refresh_if_needed and self._should_refresh(cached):
return self._refresh_credential(cached)
return cached
# Load from storage
credential = self._storage.load(credential_id)
if credential is None:
return None
# Refresh if needed
if refresh_if_needed and self._should_refresh(credential):
credential = self._refresh_credential(credential)
# Cache
self._add_to_cache(credential)
return credential
def get_key(self, credential_id: str, key_name: str) -> str | None:
"""
Convenience method to get a specific key value.
Args:
credential_id: The credential identifier
key_name: The key within the credential
Returns:
The key value or None if not found
"""
credential = self.get_credential(credential_id)
if credential is None:
return None
return credential.get_key(key_name)
def get(self, credential_id: str) -> str | None:
"""
Legacy compatibility: get the primary key value.
For single-key credentials, returns that key's value.
For multi-key credentials, returns the 'value', 'api_key', or 'access_token' key.
Args:
credential_id: The credential identifier
Returns:
The primary key value or None
"""
credential = self.get_credential(credential_id)
if credential is None:
return None
return credential.get_default_key()
# --- Template Resolution ---
def resolve(self, template: str) -> str:
"""
Resolve credential templates in a string.
Args:
template: String containing {{cred.key}} patterns
Returns:
Template with all references resolved
Example:
>>> store.resolve("Bearer {{github.access_token}}")
"Bearer ghp_xxxxxxxxxxxx"
"""
return self._resolver.resolve(template)
def resolve_headers(self, headers: dict[str, str]) -> dict[str, str]:
"""
Resolve credential templates in headers dictionary.
Args:
headers: Dict of header name to template value
Returns:
Dict with all templates resolved
Example:
>>> store.resolve_headers({
... "Authorization": "Bearer {{github.access_token}}"
... })
{"Authorization": "Bearer ghp_xxx"}
"""
return self._resolver.resolve_headers(headers)
def resolve_params(self, params: dict[str, str]) -> dict[str, str]:
"""
Resolve credential templates in query parameters dictionary.
Args:
params: Dict of param name to template value
Returns:
Dict with all templates resolved
"""
return self._resolver.resolve_params(params)
def resolve_for_usage(self, credential_id: str) -> dict[str, Any]:
"""
Get resolved request kwargs for a registered usage spec.
Args:
credential_id: The credential identifier
Returns:
Dict with 'headers', 'params', etc. keys as appropriate
Raises:
ValueError: If no usage spec is registered for the credential
"""
spec = self._usage_specs.get(credential_id)
if spec is None:
raise ValueError(f"No usage spec registered for '{credential_id}'")
result: dict[str, Any] = {}
if spec.headers:
result["headers"] = self.resolve_headers(spec.headers)
if spec.query_params:
result["params"] = self.resolve_params(spec.query_params)
if spec.body_fields:
result["data"] = {key: self.resolve(value) for key, value in spec.body_fields.items()}
return result
# --- Credential Management ---
def save_credential(self, credential: CredentialObject) -> None:
"""
Save a credential to storage.
Args:
credential: The credential to save
"""
with self._lock:
self._storage.save(credential)
self._add_to_cache(credential)
logger.info(f"Saved credential '{credential.id}'")
def delete_credential(self, credential_id: str) -> bool:
"""
Delete a credential from storage.
Args:
credential_id: The credential identifier
Returns:
True if the credential existed and was deleted
"""
with self._lock:
self._remove_from_cache(credential_id)
result = self._storage.delete(credential_id)
if result:
logger.info(f"Deleted credential '{credential_id}'")
return result
def list_credentials(self) -> list[str]:
"""
List all available credential IDs.
Returns:
List of credential IDs
"""
return self._storage.list_all()
def is_available(self, credential_id: str) -> bool:
"""
Check if a credential is available.
Args:
credential_id: The credential identifier
Returns:
True if credential exists and is accessible
"""
return self.get_credential(credential_id, refresh_if_needed=False) is not None
# --- Validation ---
def validate_for_usage(self, credential_id: str) -> list[str]:
"""
Validate that a credential meets its usage spec requirements.
Args:
credential_id: The credential identifier
Returns:
List of missing keys or errors. Empty list if valid.
"""
spec = self._usage_specs.get(credential_id)
if spec is None:
return [] # No requirements registered
credential = self.get_credential(credential_id)
if credential is None:
return [f"Credential '{credential_id}' not found"]
errors = []
for key_name in spec.required_keys:
if not credential.has_key(key_name):
errors.append(f"Missing required key '{key_name}'")
return errors
def validate_all(self) -> dict[str, list[str]]:
"""
Validate all registered usage specs.
Returns:
Dict mapping credential_id to list of errors.
Only includes credentials with errors.
"""
errors = {}
for cred_id in self._usage_specs.keys():
cred_errors = self.validate_for_usage(cred_id)
if cred_errors:
errors[cred_id] = cred_errors
return errors
def validate_credential(self, credential_id: str) -> bool:
"""
Validate a credential using its provider.
Args:
credential_id: The credential identifier
Returns:
True if credential is valid
"""
credential = self.get_credential(credential_id, refresh_if_needed=False)
if credential is None:
return False
provider = self.get_provider_for_credential(credential)
if provider is None:
# No provider; assume valid if it has keys
return bool(credential.keys)
return provider.validate(credential)
# --- Lifecycle Management ---
def _should_refresh(self, credential: CredentialObject) -> bool:
"""Check if credential should be refreshed."""
if not self._auto_refresh:
return False
if not credential.auto_refresh:
return False
provider = self.get_provider_for_credential(credential)
if provider is None:
return False
return provider.should_refresh(credential)
def _refresh_credential(self, credential: CredentialObject) -> CredentialObject:
"""Refresh a credential using its provider."""
provider = self.get_provider_for_credential(credential)
if provider is None:
logger.warning(f"No provider found for credential '{credential.id}'")
return credential
try:
refreshed = provider.refresh(credential)
refreshed.last_refreshed = datetime.now(UTC)
# Persist the refreshed credential
self._storage.save(refreshed)
self._add_to_cache(refreshed)
logger.info(f"Refreshed credential '{credential.id}'")
return refreshed
except CredentialRefreshError as e:
logger.error(f"Failed to refresh credential '{credential.id}': {e}")
return credential
def refresh_credential(self, credential_id: str) -> CredentialObject | None:
"""
Manually refresh a credential.
Args:
credential_id: The credential identifier
Returns:
The refreshed credential, or None if not found
Raises:
CredentialRefreshError: If refresh fails
"""
credential = self.get_credential(credential_id, refresh_if_needed=False)
if credential is None:
return None
return self._refresh_credential(credential)
# --- Caching ---
def _get_from_cache(self, credential_id: str) -> CredentialObject | None:
"""Get credential from cache if not expired."""
if credential_id not in self._cache:
return None
credential, cached_at = self._cache[credential_id]
age = (datetime.now(UTC) - cached_at).total_seconds()
if age > self._cache_ttl:
del self._cache[credential_id]
return None
return credential
def _add_to_cache(self, credential: CredentialObject) -> None:
"""Add credential to cache."""
self._cache[credential.id] = (credential, datetime.now(UTC))
def _remove_from_cache(self, credential_id: str) -> None:
"""Remove credential from cache."""
self._cache.pop(credential_id, None)
def clear_cache(self) -> None:
"""Clear the credential cache."""
with self._lock:
self._cache.clear()
# --- Factory Methods ---
@classmethod
def for_testing(
cls,
credentials: dict[str, dict[str, str]],
) -> CredentialStore:
"""
Create a credential store for testing with mock credentials.
Args:
credentials: Dict mapping credential_id to {key_name: value}
e.g., {"brave_search": {"api_key": "test-key"}}
Returns:
CredentialStore with in-memory credentials
Example:
store = CredentialStore.for_testing({
"brave_search": {"api_key": "test-brave-key"},
"github_oauth": {
"access_token": "test-token",
"refresh_token": "test-refresh"
}
})
"""
# Convert test data to CredentialObjects
cred_objects: dict[str, CredentialObject] = {}
for cred_id, keys in credentials.items():
cred_objects[cred_id] = CredentialObject(
id=cred_id,
keys={k: CredentialKey(name=k, value=SecretStr(v)) for k, v in keys.items()},
)
return cls(
storage=InMemoryStorage(cred_objects),
auto_refresh=False,
)
@classmethod
def with_encrypted_storage(
cls,
base_path: str | None = None,
providers: list[CredentialProvider] | None = None,
**kwargs: Any,
) -> CredentialStore:
"""
Create a credential store with encrypted file storage.
Args:
base_path: Directory for credential files. Defaults to ~/.hive/credentials.
providers: List of credential providers
**kwargs: Additional arguments passed to CredentialStore
Returns:
CredentialStore with EncryptedFileStorage
"""
from .storage import EncryptedFileStorage
return cls(
storage=EncryptedFileStorage(base_path),
providers=providers,
**kwargs,
)
@classmethod
def with_env_storage(
cls,
env_mapping: dict[str, str] | None = None,
providers: list[CredentialProvider] | None = None,
**kwargs: Any,
) -> CredentialStore:
"""
Create a credential store with environment variable storage.
Args:
env_mapping: Map of credential_id -> env_var_name
providers: List of credential providers
**kwargs: Additional arguments passed to CredentialStore
Returns:
CredentialStore with EnvVarStorage
"""
return cls(
storage=EnvVarStorage(env_mapping),
providers=providers,
**kwargs,
)
@classmethod
def with_aden_sync(
cls,
base_url: str = "https://api.adenhq.com",
cache_ttl_seconds: int = 300,
local_path: str | None = None,
auto_sync: bool = True,
**kwargs: Any,
) -> CredentialStore:
"""
Create a credential store with Aden server sync.
Automatically syncs OAuth2 tokens from the Aden authentication server.
Falls back to local-only storage if ADEN_API_KEY is not set or Aden
is unreachable.
Args:
base_url: Aden server URL (default: https://api.adenhq.com)
cache_ttl_seconds: How long to cache credentials locally (default: 5 min)
local_path: Path for local credential storage (default: ~/.hive/credentials)
auto_sync: Whether to sync all credentials on startup (default: True)
**kwargs: Additional arguments passed to CredentialStore
Returns:
CredentialStore configured with Aden sync
Example:
# Simple usage - just set ADEN_API_KEY env var
store = CredentialStore.with_aden_sync()
# Get HubSpot token (auto-refreshed via Aden)
token = store.get_key("hubspot", "access_token")
"""
import os
from pathlib import Path
from .storage import EncryptedFileStorage
# Determine local storage path
if local_path is None:
local_path = str(Path.home() / ".hive" / "credentials")
local_storage = EncryptedFileStorage(base_path=local_path)
# Check if Aden is configured
api_key = os.environ.get("ADEN_API_KEY")
if not api_key:
logger.info("ADEN_API_KEY not set, using local-only credential storage")
return cls(storage=local_storage, **kwargs)
# Try to setup Aden sync
try:
from .aden import (
AdenCachedStorage,
AdenClientConfig,
AdenCredentialClient,
AdenSyncProvider,
)
# Create Aden client
client = AdenCredentialClient(AdenClientConfig(base_url=base_url))
# Create sync provider
provider = AdenSyncProvider(client=client)
# Use cached storage for offline resilience
cached_storage = AdenCachedStorage(
local_storage=local_storage,
aden_provider=provider,
cache_ttl_seconds=cache_ttl_seconds,
)
store = cls(
storage=cached_storage,
providers=[provider],
auto_refresh=True,
**kwargs,
)
# Initial sync
if auto_sync:
synced = provider.sync_all(store)
logger.info(f"Synced {synced} credentials from Aden server")
return store
except ImportError:
logger.warning("Aden components not available, using local storage")
return cls(storage=local_storage, **kwargs)
except Exception as e:
logger.warning(f"Failed to setup Aden sync: {e}. Using local storage.")
return cls(storage=local_storage, **kwargs)
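A sketch of the usage-spec flow on a testing store; the ids, key names, and header names are illustrative.
store = CredentialStore.for_testing({"brave_search": {"api_key": "test-key"}})
store.register_usage(
    CredentialUsageSpec(
        credential_id="brave_search",
        required_keys=["api_key"],
        headers={"X-Subscription-Token": "{{brave_search.api_key}}"},
    )
)
assert store.validate_for_usage("brave_search") == []  # all required keys present
kwargs = store.resolve_for_usage("brave_search")
# -> {"headers": {"X-Subscription-Token": "test-key"}}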
@@ -0,0 +1,219 @@
"""
Template resolution system for credential injection.
This module handles {{cred.key}} patterns, enabling the bipartisan model
where tools specify how credentials are used in HTTP requests.
Template Syntax:
{{credential_id.key_name}} - Access specific key
{{credential_id}} - Access default key (value, api_key, or access_token)
Examples:
"Bearer {{github_oauth.access_token}}" -> "Bearer ghp_xxx"
"X-API-Key: {{brave_search.api_key}}" -> "X-API-Key: BSAKxxx"
"{{brave_search}}" -> "BSAKxxx" (uses default key)
"""
from __future__ import annotations
import re
from typing import TYPE_CHECKING
from .models import CredentialKeyNotFoundError, CredentialNotFoundError
if TYPE_CHECKING:
from .store import CredentialStore
class TemplateResolver:
"""
Resolves credential templates like {{cred.key}} into actual values.
Usage:
resolver = TemplateResolver(credential_store)
# Resolve single template string
auth_header = resolver.resolve("Bearer {{github_oauth.access_token}}")
# Resolve all headers at once
headers = resolver.resolve_headers({
"Authorization": "Bearer {{github_oauth.access_token}}",
"X-API-Key": "{{brave_search.api_key}}"
})
"""
# Matches {{credential_id}} or {{credential_id.key_name}}
TEMPLATE_PATTERN = re.compile(r"\{\{([a-zA-Z0-9_-]+)(?:\.([a-zA-Z0-9_-]+))?\}\}")
def __init__(self, credential_store: CredentialStore):
"""
Initialize the template resolver.
Args:
credential_store: The credential store to resolve references against
"""
self._store = credential_store
def resolve(self, template: str, fail_on_missing: bool = True) -> str:
"""
Resolve all credential references in a template string.
Args:
template: String containing {{cred.key}} patterns
fail_on_missing: If True, raise error on missing credentials
Returns:
Template with all references replaced with actual values
Raises:
CredentialNotFoundError: If credential doesn't exist and fail_on_missing=True
CredentialKeyNotFoundError: If key doesn't exist in credential
Example:
>>> resolver.resolve("Bearer {{github_oauth.access_token}}")
"Bearer ghp_xxxxxxxxxxxx"
"""
def replace_match(match: re.Match) -> str:
cred_id = match.group(1)
key_name = match.group(2) # May be None
credential = self._store.get_credential(cred_id, refresh_if_needed=True)
if credential is None:
if fail_on_missing:
raise CredentialNotFoundError(f"Credential '{cred_id}' not found")
return match.group(0) # Return original template
# Get specific key or default
if key_name:
value = credential.get_key(key_name)
if value is None:
raise CredentialKeyNotFoundError(
f"Key '{key_name}' not found in credential '{cred_id}'"
)
else:
# Use default key
value = credential.get_default_key()
if value is None:
raise CredentialKeyNotFoundError(f"Credential '{cred_id}' has no keys")
# Record usage
credential.record_usage()
return value
return self.TEMPLATE_PATTERN.sub(replace_match, template)
def resolve_headers(
self,
header_templates: dict[str, str],
fail_on_missing: bool = True,
) -> dict[str, str]:
"""
Resolve templates in a headers dictionary.
Args:
header_templates: Dict of header name to template value
fail_on_missing: If True, raise error on missing credentials
Returns:
Dict with all templates resolved to actual values
Example:
>>> resolver.resolve_headers({
... "Authorization": "Bearer {{github_oauth.access_token}}",
... "X-API-Key": "{{brave_search.api_key}}"
... })
{"Authorization": "Bearer ghp_xxx", "X-API-Key": "BSAKxxx"}
"""
return {
key: self.resolve(value, fail_on_missing) for key, value in header_templates.items()
}
def resolve_params(
self,
param_templates: dict[str, str],
fail_on_missing: bool = True,
) -> dict[str, str]:
"""
Resolve templates in a query parameters dictionary.
Args:
param_templates: Dict of param name to template value
fail_on_missing: If True, raise error on missing credentials
Returns:
Dict with all templates resolved to actual values
"""
return {key: self.resolve(value, fail_on_missing) for key, value in param_templates.items()}
def has_templates(self, text: str) -> bool:
"""
Check if text contains any credential templates.
Args:
text: String to check
Returns:
True if text contains {{...}} patterns
"""
return bool(self.TEMPLATE_PATTERN.search(text))
def extract_references(self, text: str) -> list[tuple[str, str | None]]:
"""
Extract all credential references from text.
Args:
text: String to extract references from
Returns:
List of (credential_id, key_name) tuples.
key_name is None if only credential_id was specified.
Example:
>>> resolver.extract_references("{{github.token}} and {{brave_search.api_key}}")
[("github", "token"), ("brave_search", "api_key")]
"""
return [(match.group(1), match.group(2)) for match in self.TEMPLATE_PATTERN.finditer(text)]
def validate_references(self, text: str) -> list[str]:
"""
Validate all credential references in text without resolving.
Args:
text: String containing template references
Returns:
List of error messages for invalid references.
Empty list if all references are valid.
"""
errors = []
references = self.extract_references(text)
for cred_id, key_name in references:
credential = self._store.get_credential(cred_id, refresh_if_needed=False)
if credential is None:
errors.append(f"Credential '{cred_id}' not found")
continue
if key_name:
if not credential.has_key(key_name):
errors.append(f"Key '{key_name}' not found in credential '{cred_id}'")
elif not credential.keys:
errors.append(f"Credential '{cred_id}' has no keys")
return errors
def get_required_credentials(self, text: str) -> list[str]:
"""
Get list of credential IDs required by a template string.
Args:
text: String containing template references
Returns:
List of unique credential IDs referenced in the text
"""
references = self.extract_references(text)
return list(dict.fromkeys(cred_id for cred_id, _ in references))
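As a quick orientation, a usage sketch of the resolver API above; the store contents are invented, and CredentialStore.for_testing is the in-memory factory exercised by the tests later in this changeset:

from core.framework.credentials import CredentialStore, TemplateResolver

# Hypothetical credentials; for_testing() builds an in-memory store.
store = CredentialStore.for_testing({"github_oauth": {"access_token": "ghp_demo"}})
resolver = TemplateResolver(store)

assert resolver.has_templates("Bearer {{github_oauth.access_token}}")
assert resolver.extract_references("{{github_oauth.access_token}}") == [
    ("github_oauth", "access_token")
]
# validate_references returns [] when every reference can be satisfied.
assert resolver.validate_references("{{github_oauth.access_token}}") == []

headers = resolver.resolve_headers(
    {"Authorization": "Bearer {{github_oauth.access_token}}"}
)
assert headers["Authorization"] == "Bearer ghp_demo"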
@@ -0,0 +1 @@
"""Tests for the credential store module."""
@@ -0,0 +1,707 @@
"""
Comprehensive tests for the credential store module.
Tests cover:
- Core models (CredentialObject, CredentialKey, CredentialUsageSpec)
- Template resolution
- Storage backends (InMemoryStorage, EnvVarStorage, EncryptedFileStorage)
- Providers (StaticProvider, BearerTokenProvider)
- Main CredentialStore
- OAuth2 module
"""
import os
import tempfile
from datetime import UTC, datetime, timedelta
from pathlib import Path
from unittest.mock import patch
import pytest
from core.framework.credentials import (
CompositeStorage,
CredentialKey,
CredentialKeyNotFoundError,
CredentialNotFoundError,
CredentialObject,
CredentialStore,
CredentialType,
CredentialUsageSpec,
EncryptedFileStorage,
EnvVarStorage,
InMemoryStorage,
StaticProvider,
TemplateResolver,
)
from pydantic import SecretStr
class TestCredentialKey:
"""Tests for CredentialKey model."""
def test_create_basic_key(self):
"""Test creating a basic credential key."""
key = CredentialKey(name="api_key", value=SecretStr("test-value"))
assert key.name == "api_key"
assert key.get_secret_value() == "test-value"
assert key.expires_at is None
assert not key.is_expired
def test_key_with_expiration(self):
"""Test key with expiration time."""
future = datetime.now(UTC) + timedelta(hours=1)
key = CredentialKey(name="token", value=SecretStr("xxx"), expires_at=future)
assert not key.is_expired
def test_expired_key(self):
"""Test that expired key is detected."""
past = datetime.now(UTC) - timedelta(hours=1)
key = CredentialKey(name="token", value=SecretStr("xxx"), expires_at=past)
assert key.is_expired
def test_key_with_metadata(self):
"""Test key with metadata."""
key = CredentialKey(
name="token",
value=SecretStr("xxx"),
metadata={"client_id": "abc", "scope": "read"},
)
assert key.metadata["client_id"] == "abc"
class TestCredentialObject:
"""Tests for CredentialObject model."""
def test_create_simple_credential(self):
"""Test creating a simple API key credential."""
cred = CredentialObject(
id="brave_search",
credential_type=CredentialType.API_KEY,
keys={"api_key": CredentialKey(name="api_key", value=SecretStr("test-key"))},
)
assert cred.id == "brave_search"
assert cred.credential_type == CredentialType.API_KEY
assert cred.get_key("api_key") == "test-key"
def test_create_multi_key_credential(self):
"""Test creating a credential with multiple keys."""
cred = CredentialObject(
id="github_oauth",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(name="access_token", value=SecretStr("ghp_xxx")),
"refresh_token": CredentialKey(name="refresh_token", value=SecretStr("ghr_xxx")),
},
)
assert cred.get_key("access_token") == "ghp_xxx"
assert cred.get_key("refresh_token") == "ghr_xxx"
assert cred.get_key("nonexistent") is None
def test_set_key(self):
"""Test setting a key on a credential."""
cred = CredentialObject(id="test", keys={})
cred.set_key("new_key", "new_value")
assert cred.get_key("new_key") == "new_value"
def test_set_key_with_expiration(self):
"""Test setting a key with expiration."""
cred = CredentialObject(id="test", keys={})
expires = datetime.now(UTC) + timedelta(hours=1)
cred.set_key("token", "xxx", expires_at=expires)
assert cred.keys["token"].expires_at == expires
def test_needs_refresh(self):
"""Test needs_refresh property."""
past = datetime.now(UTC) - timedelta(hours=1)
cred = CredentialObject(
id="test",
keys={"token": CredentialKey(name="token", value=SecretStr("xxx"), expires_at=past)},
)
assert cred.needs_refresh
def test_get_default_key(self):
"""Test get_default_key returns appropriate default."""
# With api_key
cred = CredentialObject(
id="test",
keys={"api_key": CredentialKey(name="api_key", value=SecretStr("key-value"))},
)
assert cred.get_default_key() == "key-value"
# With access_token
cred2 = CredentialObject(
id="test",
keys={
"access_token": CredentialKey(name="access_token", value=SecretStr("token-value"))
},
)
assert cred2.get_default_key() == "token-value"
def test_record_usage(self):
"""Test recording credential usage."""
cred = CredentialObject(id="test", keys={})
assert cred.use_count == 0
assert cred.last_used is None
cred.record_usage()
assert cred.use_count == 1
assert cred.last_used is not None
class TestCredentialUsageSpec:
"""Tests for CredentialUsageSpec model."""
def test_create_usage_spec(self):
"""Test creating a usage spec."""
spec = CredentialUsageSpec(
credential_id="brave_search",
required_keys=["api_key"],
headers={"X-Subscription-Token": "{{api_key}}"},
)
assert spec.credential_id == "brave_search"
assert "api_key" in spec.required_keys
assert "{{api_key}}" in spec.headers.values()
class TestInMemoryStorage:
"""Tests for InMemoryStorage."""
def test_save_and_load(self):
"""Test saving and loading a credential."""
storage = InMemoryStorage()
cred = CredentialObject(
id="test",
keys={"key": CredentialKey(name="key", value=SecretStr("value"))},
)
storage.save(cred)
loaded = storage.load("test")
assert loaded is not None
assert loaded.id == "test"
assert loaded.get_key("key") == "value"
def test_load_nonexistent(self):
"""Test loading a nonexistent credential."""
storage = InMemoryStorage()
assert storage.load("nonexistent") is None
def test_delete(self):
"""Test deleting a credential."""
storage = InMemoryStorage()
cred = CredentialObject(id="test", keys={})
storage.save(cred)
assert storage.delete("test")
assert storage.load("test") is None
assert not storage.delete("test")
def test_list_all(self):
"""Test listing all credentials."""
storage = InMemoryStorage()
storage.save(CredentialObject(id="a", keys={}))
storage.save(CredentialObject(id="b", keys={}))
ids = storage.list_all()
assert "a" in ids
assert "b" in ids
def test_exists(self):
"""Test checking if credential exists."""
storage = InMemoryStorage()
storage.save(CredentialObject(id="test", keys={}))
assert storage.exists("test")
assert not storage.exists("nonexistent")
def test_clear(self):
"""Test clearing all credentials."""
storage = InMemoryStorage()
storage.save(CredentialObject(id="test", keys={}))
storage.clear()
assert storage.list_all() == []
class TestEnvVarStorage:
"""Tests for EnvVarStorage."""
def test_load_from_env(self):
"""Test loading credential from environment variable."""
with patch.dict(os.environ, {"TEST_API_KEY": "test-value"}):
storage = EnvVarStorage(env_mapping={"test": "TEST_API_KEY"})
cred = storage.load("test")
assert cred is not None
assert cred.get_key("api_key") == "test-value"
def test_load_nonexistent(self):
"""Test loading when env var is not set."""
storage = EnvVarStorage(env_mapping={"test": "NONEXISTENT_VAR"})
assert storage.load("test") is None
def test_default_env_var_pattern(self):
"""Test default env var naming pattern."""
with patch.dict(os.environ, {"MY_SERVICE_API_KEY": "value"}):
storage = EnvVarStorage()
cred = storage.load("my_service")
assert cred is not None
assert cred.get_key("api_key") == "value"
def test_save_raises(self):
"""Test that save raises NotImplementedError."""
storage = EnvVarStorage()
with pytest.raises(NotImplementedError):
storage.save(CredentialObject(id="test", keys={}))
def test_delete_raises(self):
"""Test that delete raises NotImplementedError."""
storage = EnvVarStorage()
with pytest.raises(NotImplementedError):
storage.delete("test")
class TestEncryptedFileStorage:
"""Tests for EncryptedFileStorage."""
@pytest.fixture
def temp_dir(self):
"""Create a temporary directory for tests."""
with tempfile.TemporaryDirectory() as tmpdir:
yield Path(tmpdir)
@pytest.fixture
def storage(self, temp_dir):
"""Create EncryptedFileStorage for tests."""
return EncryptedFileStorage(temp_dir)
def test_save_and_load(self, storage):
"""Test saving and loading encrypted credential."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.API_KEY,
keys={"api_key": CredentialKey(name="api_key", value=SecretStr("secret-value"))},
)
storage.save(cred)
loaded = storage.load("test")
assert loaded is not None
assert loaded.id == "test"
assert loaded.get_key("api_key") == "secret-value"
def test_encryption_key_from_env(self, temp_dir):
"""Test using encryption key from environment variable."""
from cryptography.fernet import Fernet
key = Fernet.generate_key().decode()
with patch.dict(os.environ, {"HIVE_CREDENTIAL_KEY": key}):
storage = EncryptedFileStorage(temp_dir)
cred = CredentialObject(
id="test", keys={"k": CredentialKey(name="k", value=SecretStr("v"))}
)
storage.save(cred)
# Create new storage instance with same key
storage2 = EncryptedFileStorage(temp_dir)
loaded = storage2.load("test")
assert loaded is not None
assert loaded.get_key("k") == "v"
def test_list_all(self, storage):
"""Test listing all credentials."""
storage.save(CredentialObject(id="cred1", keys={}))
storage.save(CredentialObject(id="cred2", keys={}))
ids = storage.list_all()
assert "cred1" in ids
assert "cred2" in ids
def test_delete(self, storage):
"""Test deleting a credential."""
storage.save(CredentialObject(id="test", keys={}))
assert storage.delete("test")
assert storage.load("test") is None
class TestCompositeStorage:
"""Tests for CompositeStorage."""
def test_read_from_primary(self):
"""Test reading from primary storage."""
primary = InMemoryStorage()
primary.save(
CredentialObject(
id="test", keys={"k": CredentialKey(name="k", value=SecretStr("primary"))}
)
)
fallback = InMemoryStorage()
fallback.save(
CredentialObject(
id="test", keys={"k": CredentialKey(name="k", value=SecretStr("fallback"))}
)
)
storage = CompositeStorage(primary, [fallback])
cred = storage.load("test")
# Should get from primary
assert cred.get_key("k") == "primary"
def test_fallback_when_not_in_primary(self):
"""Test fallback when credential not in primary."""
primary = InMemoryStorage()
fallback = InMemoryStorage()
fallback.save(
CredentialObject(
id="test", keys={"k": CredentialKey(name="k", value=SecretStr("fallback"))}
)
)
storage = CompositeStorage(primary, [fallback])
cred = storage.load("test")
assert cred.get_key("k") == "fallback"
def test_write_to_primary_only(self):
"""Test that writes go to primary only."""
primary = InMemoryStorage()
fallback = InMemoryStorage()
storage = CompositeStorage(primary, [fallback])
storage.save(CredentialObject(id="test", keys={}))
assert primary.exists("test")
assert not fallback.exists("test")
class TestStaticProvider:
"""Tests for StaticProvider."""
def test_provider_id(self):
"""Test provider ID."""
provider = StaticProvider()
assert provider.provider_id == "static"
def test_supported_types(self):
"""Test supported credential types."""
provider = StaticProvider()
assert CredentialType.API_KEY in provider.supported_types
assert CredentialType.CUSTOM in provider.supported_types
def test_refresh_returns_unchanged(self):
"""Test that refresh returns credential unchanged."""
provider = StaticProvider()
cred = CredentialObject(
id="test", keys={"k": CredentialKey(name="k", value=SecretStr("v"))}
)
refreshed = provider.refresh(cred)
assert refreshed.get_key("k") == "v"
def test_validate_with_keys(self):
"""Test validation with keys present."""
provider = StaticProvider()
cred = CredentialObject(
id="test", keys={"k": CredentialKey(name="k", value=SecretStr("v"))}
)
assert provider.validate(cred)
def test_validate_without_keys(self):
"""Test validation without keys."""
provider = StaticProvider()
cred = CredentialObject(id="test", keys={})
assert not provider.validate(cred)
def test_should_refresh(self):
"""Test that static provider never needs refresh."""
provider = StaticProvider()
cred = CredentialObject(id="test", keys={})
assert not provider.should_refresh(cred)
class TestTemplateResolver:
"""Tests for TemplateResolver."""
@pytest.fixture
def store(self):
"""Create a test store with credentials."""
return CredentialStore.for_testing(
{
"brave_search": {"api_key": "test-brave-key"},
"github_oauth": {"access_token": "ghp_xxx", "refresh_token": "ghr_xxx"},
}
)
@pytest.fixture
def resolver(self, store):
"""Create a resolver with the test store."""
return TemplateResolver(store)
def test_resolve_simple(self, resolver):
"""Test resolving a simple template."""
result = resolver.resolve("Bearer {{github_oauth.access_token}}")
assert result == "Bearer ghp_xxx"
def test_resolve_multiple(self, resolver):
"""Test resolving multiple templates."""
result = resolver.resolve("{{github_oauth.access_token}} and {{brave_search.api_key}}")
assert "ghp_xxx" in result
assert "test-brave-key" in result
def test_resolve_default_key(self, resolver):
"""Test resolving credential without key specified."""
result = resolver.resolve("Key: {{brave_search}}")
assert "test-brave-key" in result
def test_resolve_headers(self, resolver):
"""Test resolving headers dict."""
headers = resolver.resolve_headers(
{
"Authorization": "Bearer {{github_oauth.access_token}}",
"X-API-Key": "{{brave_search.api_key}}",
}
)
assert headers["Authorization"] == "Bearer ghp_xxx"
assert headers["X-API-Key"] == "test-brave-key"
def test_resolve_missing_credential(self, resolver):
"""Test error on missing credential."""
with pytest.raises(CredentialNotFoundError):
resolver.resolve("{{nonexistent.key}}")
def test_resolve_missing_key(self, resolver):
"""Test error on missing key."""
with pytest.raises(CredentialKeyNotFoundError):
resolver.resolve("{{github_oauth.nonexistent}}")
def test_has_templates(self, resolver):
"""Test detecting templates in text."""
assert resolver.has_templates("{{cred.key}}")
assert resolver.has_templates("Bearer {{token}}")
assert not resolver.has_templates("no templates here")
def test_extract_references(self, resolver):
"""Test extracting credential references."""
refs = resolver.extract_references("{{github.token}} and {{brave.key}}")
assert ("github", "token") in refs
assert ("brave", "key") in refs
class TestCredentialStore:
"""Tests for CredentialStore."""
def test_for_testing_factory(self):
"""Test creating store for testing."""
store = CredentialStore.for_testing({"test": {"api_key": "value"}})
assert store.get("test") == "value"
assert store.get_key("test", "api_key") == "value"
def test_get_credential(self):
"""Test getting a credential."""
store = CredentialStore.for_testing({"test": {"key": "value"}})
cred = store.get_credential("test")
assert cred is not None
assert cred.get_key("key") == "value"
def test_get_nonexistent(self):
"""Test getting nonexistent credential."""
store = CredentialStore.for_testing({})
assert store.get_credential("nonexistent") is None
assert store.get("nonexistent") is None
def test_save_and_load(self):
"""Test saving and loading a credential."""
store = CredentialStore.for_testing({})
cred = CredentialObject(id="new", keys={"k": CredentialKey(name="k", value=SecretStr("v"))})
store.save_credential(cred)
loaded = store.get_credential("new")
assert loaded is not None
assert loaded.get_key("k") == "v"
def test_delete_credential(self):
"""Test deleting a credential."""
store = CredentialStore.for_testing({"test": {"k": "v"}})
assert store.delete_credential("test")
assert store.get_credential("test") is None
def test_list_credentials(self):
"""Test listing all credentials."""
store = CredentialStore.for_testing({"a": {"k": "v"}, "b": {"k": "v"}})
ids = store.list_credentials()
assert "a" in ids
assert "b" in ids
def test_is_available(self):
"""Test checking credential availability."""
store = CredentialStore.for_testing({"test": {"k": "v"}})
assert store.is_available("test")
assert not store.is_available("nonexistent")
def test_resolve_templates(self):
"""Test template resolution through store."""
store = CredentialStore.for_testing({"test": {"api_key": "value"}})
result = store.resolve("Key: {{test.api_key}}")
assert result == "Key: value"
def test_resolve_headers(self):
"""Test resolving headers through store."""
store = CredentialStore.for_testing({"test": {"token": "xxx"}})
headers = store.resolve_headers({"Authorization": "Bearer {{test.token}}"})
assert headers["Authorization"] == "Bearer xxx"
def test_register_provider(self):
"""Test registering a provider."""
store = CredentialStore.for_testing({})
provider = StaticProvider()
store.register_provider(provider)
assert store.get_provider("static") is provider
def test_register_usage_spec(self):
"""Test registering a usage spec."""
store = CredentialStore.for_testing({})
spec = CredentialUsageSpec(
credential_id="test",
required_keys=["api_key"],
headers={"X-Key": "{{api_key}}"},
)
store.register_usage(spec)
assert store.get_usage_spec("test") is spec
def test_validate_for_usage(self):
"""Test validating credential for usage spec."""
store = CredentialStore.for_testing({"test": {"api_key": "value"}})
spec = CredentialUsageSpec(credential_id="test", required_keys=["api_key"])
store.register_usage(spec)
errors = store.validate_for_usage("test")
assert errors == []
def test_validate_for_usage_missing_key(self):
"""Test validation with missing required key."""
store = CredentialStore.for_testing({"test": {"other_key": "value"}})
spec = CredentialUsageSpec(credential_id="test", required_keys=["api_key"])
store.register_usage(spec)
errors = store.validate_for_usage("test")
assert "api_key" in errors[0]
def test_caching(self):
"""Test that credentials are cached."""
storage = InMemoryStorage()
store = CredentialStore(storage=storage, cache_ttl_seconds=60)
storage.save(
CredentialObject(id="test", keys={"k": CredentialKey(name="k", value=SecretStr("v"))})
)
# First load
store.get_credential("test")
# Delete from storage
storage.delete("test")
# Should still get from cache
cred2 = store.get_credential("test")
assert cred2 is not None
def test_clear_cache(self):
"""Test clearing the cache."""
storage = InMemoryStorage()
store = CredentialStore(storage=storage)
storage.save(CredentialObject(id="test", keys={}))
store.get_credential("test") # Cache it
storage.delete("test")
store.clear_cache()
# Should not find in cache now
assert store.get_credential("test") is None
class TestOAuth2Module:
"""Tests for OAuth2 module."""
def test_oauth2_token_from_response(self):
"""Test creating OAuth2Token from token response."""
from core.framework.credentials.oauth2 import OAuth2Token
response = {
"access_token": "xxx",
"token_type": "Bearer",
"expires_in": 3600,
"refresh_token": "yyy",
"scope": "read write",
}
token = OAuth2Token.from_token_response(response)
assert token.access_token == "xxx"
assert token.token_type == "Bearer"
assert token.refresh_token == "yyy"
assert token.scope == "read write"
assert token.expires_at is not None
def test_token_is_expired(self):
"""Test token expiration check."""
from core.framework.credentials.oauth2 import OAuth2Token
# Not expired
future = datetime.now(UTC) + timedelta(hours=1)
token = OAuth2Token(access_token="xxx", expires_at=future)
assert not token.is_expired
# Expired
past = datetime.now(UTC) - timedelta(hours=1)
expired_token = OAuth2Token(access_token="xxx", expires_at=past)
assert expired_token.is_expired
def test_token_can_refresh(self):
"""Test token refresh capability check."""
from core.framework.credentials.oauth2 import OAuth2Token
with_refresh = OAuth2Token(access_token="xxx", refresh_token="yyy")
assert with_refresh.can_refresh
without_refresh = OAuth2Token(access_token="xxx")
assert not without_refresh.can_refresh
def test_oauth2_config_validation(self):
"""Test OAuth2Config validation."""
from core.framework.credentials.oauth2 import OAuth2Config, TokenPlacement
# Valid config
config = OAuth2Config(
token_url="https://example.com/token", client_id="id", client_secret="secret"
)
assert config.token_url == "https://example.com/token"
# Missing token_url
with pytest.raises(ValueError):
OAuth2Config(token_url="")
# HEADER_CUSTOM without custom_header_name
with pytest.raises(ValueError):
OAuth2Config(
token_url="https://example.com/token",
token_placement=TokenPlacement.HEADER_CUSTOM,
)
if __name__ == "__main__":
pytest.main([__file__, "-v"])
@@ -0,0 +1,55 @@
"""
HashiCorp Vault integration for the credential store.
This module provides enterprise-grade secret management through
HashiCorp Vault integration.
Quick Start:
from core.framework.credentials import CredentialStore
from core.framework.credentials.vault import HashiCorpVaultStorage
# Configure Vault storage
storage = HashiCorpVaultStorage(
url="https://vault.example.com:8200",
# token read from VAULT_TOKEN env var
mount_point="secret",
path_prefix="hive/agents/prod"
)
# Create credential store with Vault backend
store = CredentialStore(storage=storage)
# Use normally - credentials are stored in Vault
credential = store.get_credential("my_api")
Requirements:
pip install hvac
Authentication:
Set the VAULT_TOKEN environment variable or pass the token directly:
export VAULT_TOKEN="hvs.xxxxxxxxxxxxx"
For production, consider using Vault auth methods:
- Kubernetes auth
- AppRole auth
- AWS IAM auth
Vault Configuration:
Ensure KV v2 secrets engine is enabled:
vault secrets enable -path=secret kv-v2
Grant appropriate policies:
path "secret/data/hive/credentials/*" {
capabilities = ["create", "read", "update", "delete", "list"]
}
path "secret/metadata/hive/credentials/*" {
capabilities = ["list", "delete"]
}
"""
from .hashicorp import HashiCorpVaultStorage
__all__ = ["HashiCorpVaultStorage"]
@@ -0,0 +1,394 @@
"""
HashiCorp Vault storage adapter.
Provides integration with HashiCorp Vault for enterprise secret management.
Requires the 'hvac' package: uv pip install hvac
"""
from __future__ import annotations
import logging
import os
from datetime import datetime
from typing import Any
from pydantic import SecretStr
from ..models import CredentialKey, CredentialObject, CredentialType
from ..storage import CredentialStorage
logger = logging.getLogger(__name__)
class HashiCorpVaultStorage(CredentialStorage):
"""
HashiCorp Vault storage adapter.
Features:
- KV v2 secrets engine support
- Namespace support (Enterprise)
- Automatic secret versioning
- Audit logging via Vault
The adapter stores credentials in Vault's KV v2 secrets engine with
the following structure:
{mount_point}/data/{path_prefix}/{credential_id}
data:
_type: "oauth2"
access_token: "xxx"
refresh_token: "yyy"
_expires_access_token: "2024-01-26T12:00:00"
_provider_id: "oauth2"
Example:
storage = HashiCorpVaultStorage(
url="https://vault.example.com:8200",
token="hvs.xxx", # Or use VAULT_TOKEN env var
mount_point="secret",
path_prefix="hive/credentials"
)
store = CredentialStore(storage=storage)
# Credentials are now stored in Vault
store.save_credential(credential)
credential = store.get_credential("my_api")
Authentication:
The adapter uses token-based authentication. The token can be provided:
1. Directly via the 'token' parameter
2. Via the VAULT_TOKEN environment variable
For production, consider using:
- Kubernetes auth method
- AppRole auth method
- AWS IAM auth method
Requirements:
uv pip install hvac
"""
def __init__(
self,
url: str,
token: str | None = None,
mount_point: str = "secret",
path_prefix: str = "hive/credentials",
namespace: str | None = None,
verify_ssl: bool = True,
):
"""
Initialize Vault storage.
Args:
url: Vault server URL (e.g., https://vault.example.com:8200)
token: Vault token. If None, reads from VAULT_TOKEN env var
mount_point: KV secrets engine mount point (default: "secret")
path_prefix: Path prefix for all credentials
namespace: Vault namespace (Enterprise feature)
verify_ssl: Whether to verify SSL certificates
Raises:
ImportError: If hvac is not installed
ValueError: If authentication fails
"""
try:
import hvac
except ImportError as e:
raise ImportError(
"HashiCorp Vault support requires 'hvac'. Install with: uv pip install hvac"
) from e
self._url = url
self._token = token or os.environ.get("VAULT_TOKEN")
self._mount = mount_point
self._prefix = path_prefix
self._namespace = namespace
if not self._token:
raise ValueError(
"Vault token required. Set VAULT_TOKEN env var or pass token parameter."
)
self._client = hvac.Client(
url=url,
token=self._token,
namespace=namespace,
verify=verify_ssl,
)
if not self._client.is_authenticated():
raise ValueError("Vault authentication failed. Check token and server URL.")
logger.info(f"Connected to HashiCorp Vault at {url}")
def _path(self, credential_id: str) -> str:
"""Build Vault path for credential."""
# Sanitize credential_id
safe_id = credential_id.replace("/", "_").replace("\\", "_")
return f"{self._prefix}/{safe_id}"
def save(self, credential: CredentialObject) -> None:
"""Save credential to Vault KV v2."""
path = self._path(credential.id)
data = self._serialize_for_vault(credential)
try:
self._client.secrets.kv.v2.create_or_update_secret(
path=path,
secret=data,
mount_point=self._mount,
)
logger.debug(f"Saved credential '{credential.id}' to Vault at {path}")
except Exception as e:
logger.error(f"Failed to save credential '{credential.id}' to Vault: {e}")
raise
def load(self, credential_id: str) -> CredentialObject | None:
"""Load credential from Vault."""
path = self._path(credential_id)
try:
response = self._client.secrets.kv.v2.read_secret_version(
path=path,
mount_point=self._mount,
)
data = response["data"]["data"]
return self._deserialize_from_vault(credential_id, data)
except Exception as e:
# Check if it's a "not found" error
error_str = str(e).lower()
if "not found" in error_str or "404" in error_str:
logger.debug(f"Credential '{credential_id}' not found in Vault")
return None
logger.error(f"Failed to load credential '{credential_id}' from Vault: {e}")
raise
def delete(self, credential_id: str) -> bool:
"""Delete credential from Vault (all versions)."""
path = self._path(credential_id)
try:
self._client.secrets.kv.v2.delete_metadata_and_all_versions(
path=path,
mount_point=self._mount,
)
logger.debug(f"Deleted credential '{credential_id}' from Vault")
return True
except Exception as e:
error_str = str(e).lower()
if "not found" in error_str or "404" in error_str:
return False
logger.error(f"Failed to delete credential '{credential_id}' from Vault: {e}")
raise
def list_all(self) -> list[str]:
"""List all credentials under the prefix."""
try:
response = self._client.secrets.kv.v2.list_secrets(
path=self._prefix,
mount_point=self._mount,
)
keys = response.get("data", {}).get("keys", [])
# Remove trailing slashes from folder names
return [k.rstrip("/") for k in keys]
except Exception as e:
error_str = str(e).lower()
if "not found" in error_str or "404" in error_str:
return []
logger.error(f"Failed to list credentials from Vault: {e}")
raise
def exists(self, credential_id: str) -> bool:
"""Check if credential exists in Vault."""
try:
path = self._path(credential_id)
self._client.secrets.kv.v2.read_secret_version(
path=path,
mount_point=self._mount,
)
return True
except Exception:
return False
def _serialize_for_vault(self, credential: CredentialObject) -> dict[str, Any]:
"""Convert credential to Vault secret format."""
data: dict[str, Any] = {
"_type": credential.credential_type.value,
}
if credential.provider_id:
data["_provider_id"] = credential.provider_id
if credential.description:
data["_description"] = credential.description
if credential.auto_refresh:
data["_auto_refresh"] = "true"
# Store each key
for key_name, key in credential.keys.items():
data[key_name] = key.get_secret_value()
if key.expires_at:
data[f"_expires_{key_name}"] = key.expires_at.isoformat()
if key.metadata:
data[f"_metadata_{key_name}"] = str(key.metadata)
return data
def _deserialize_from_vault(self, credential_id: str, data: dict[str, Any]) -> CredentialObject:
"""Reconstruct credential from Vault secret."""
# Extract metadata fields
cred_type = CredentialType(data.pop("_type", "api_key"))
provider_id = data.pop("_provider_id", None)
description = data.pop("_description", "")
auto_refresh = data.pop("_auto_refresh", "") == "true"
# Build keys dict
keys: dict[str, CredentialKey] = {}
# Find all non-metadata keys
key_names = [k for k in data.keys() if not k.startswith("_")]
for key_name in key_names:
value = data[key_name]
# Check for expiration
expires_at = None
expires_key = f"_expires_{key_name}"
if expires_key in data:
try:
expires_at = datetime.fromisoformat(data[expires_key])
except (ValueError, TypeError):
pass
# Check for metadata
metadata: dict[str, Any] = {}
metadata_key = f"_metadata_{key_name}"
if metadata_key in data:
try:
import ast
metadata = ast.literal_eval(data[metadata_key])
except (ValueError, SyntaxError):
pass
keys[key_name] = CredentialKey(
name=key_name,
value=SecretStr(value),
expires_at=expires_at,
metadata=metadata,
)
return CredentialObject(
id=credential_id,
credential_type=cred_type,
keys=keys,
provider_id=provider_id,
description=description,
auto_refresh=auto_refresh,
)
# --- Vault-Specific Operations ---
def get_secret_metadata(self, credential_id: str) -> dict[str, Any] | None:
"""
Get Vault metadata for a secret (version info, timestamps, etc.).
Args:
credential_id: The credential identifier
Returns:
Metadata dict or None if not found
"""
path = self._path(credential_id)
try:
response = self._client.secrets.kv.v2.read_secret_metadata(
path=path,
mount_point=self._mount,
)
return response.get("data", {})
except Exception:
return None
def soft_delete(self, credential_id: str, versions: list[int] | None = None) -> bool:
"""
Soft delete specific versions (can be recovered).
Args:
credential_id: The credential identifier
versions: Version numbers to delete. If None, deletes latest.
Returns:
True if successful
"""
path = self._path(credential_id)
try:
if versions:
self._client.secrets.kv.v2.delete_secret_versions(
path=path,
versions=versions,
mount_point=self._mount,
)
else:
self._client.secrets.kv.v2.delete_latest_version_of_secret(
path=path,
mount_point=self._mount,
)
return True
except Exception as e:
logger.error(f"Soft delete failed for '{credential_id}': {e}")
return False
def undelete(self, credential_id: str, versions: list[int]) -> bool:
"""
Recover soft-deleted versions.
Args:
credential_id: The credential identifier
versions: Version numbers to recover
Returns:
True if successful
"""
path = self._path(credential_id)
try:
self._client.secrets.kv.v2.undelete_secret_versions(
path=path,
versions=versions,
mount_point=self._mount,
)
return True
except Exception as e:
logger.error(f"Undelete failed for '{credential_id}': {e}")
return False
def load_version(self, credential_id: str, version: int) -> CredentialObject | None:
"""
Load a specific version of a credential.
Args:
credential_id: The credential identifier
version: Version number to load
Returns:
CredentialObject or None
"""
path = self._path(credential_id)
try:
response = self._client.secrets.kv.v2.read_secret_version(
path=path,
version=version,
mount_point=self._mount,
)
data = response["data"]["data"]
return self._deserialize_from_vault(credential_id, data)
except Exception:
return None
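A hedged end-to-end sketch of the version-aware helpers above, assuming a local Vault dev server (vault server -dev) and an installed hvac package; the URL, token, and credential contents are placeholders, not values from this changeset:

from pydantic import SecretStr
from core.framework.credentials import CredentialKey, CredentialObject
from core.framework.credentials.vault import HashiCorpVaultStorage

storage = HashiCorpVaultStorage(
    url="http://127.0.0.1:8200",
    token="root",  # dev-mode root token; use a real auth method in production
)

credential = CredentialObject(
    id="my_api",
    keys={"api_key": CredentialKey(name="api_key", value=SecretStr("demo-key"))},
)
storage.save(credential)  # version 1
storage.save(credential)  # version 2 (KV v2 versions automatically)

meta = storage.get_secret_metadata("my_api")
print(meta and meta.get("current_version"))  # 2 on a fresh path

storage.soft_delete("my_api", versions=[1])  # recoverable delete
storage.undelete("my_api", versions=[1])     # bring version 1 back
v1 = storage.load_version("my_api", version=1)
assert v1 is not None and v1.get_key("api_key") == "demo-key"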
@@ -1,32 +1,47 @@
"""Graph structures: Goals, Nodes, Edges, and Flexible Execution."""
from framework.graph.goal import Goal, SuccessCriterion, Constraint, GoalStatus
from framework.graph.node import NodeSpec, NodeContext, NodeResult, NodeProtocol
from framework.graph.edge import EdgeSpec, EdgeCondition
from framework.graph.client_io import (
ActiveNodeClientIO,
ClientIOGateway,
InertNodeClientIO,
NodeClientIO,
)
from framework.graph.code_sandbox import CodeSandbox, safe_eval, safe_exec
from framework.graph.context_handoff import ContextHandoff, HandoffContext
from framework.graph.conversation import ConversationStore, Message, NodeConversation
from framework.graph.edge import DEFAULT_MAX_TOKENS, EdgeCondition, EdgeSpec, GraphSpec
from framework.graph.event_loop_node import (
EventLoopNode,
JudgeProtocol,
JudgeVerdict,
LoopConfig,
OutputAccumulator,
)
from framework.graph.executor import GraphExecutor
from framework.graph.flexible_executor import ExecutorConfig, FlexibleGraphExecutor
from framework.graph.goal import Constraint, Goal, GoalStatus, SuccessCriterion
from framework.graph.judge import HybridJudge, create_default_judge
from framework.graph.node import NodeContext, NodeProtocol, NodeResult, NodeSpec
# Flexible execution (Worker-Judge pattern)
from framework.graph.plan import (
Plan,
PlanStep,
ActionSpec,
ActionType,
StepStatus,
Judgment,
JudgmentAction,
EvaluationRule,
PlanExecutionResult,
ExecutionStatus,
load_export,
# HITL (Human-in-the-loop)
ApprovalDecision,
ApprovalRequest,
ApprovalResult,
EvaluationRule,
ExecutionStatus,
Judgment,
JudgmentAction,
Plan,
PlanExecutionResult,
PlanStep,
StepStatus,
load_export,
)
from framework.graph.judge import HybridJudge, create_default_judge
from framework.graph.worker_node import WorkerNode, StepExecutionResult
from framework.graph.flexible_executor import FlexibleGraphExecutor, ExecutorConfig
from framework.graph.code_sandbox import CodeSandbox, safe_exec, safe_eval
from framework.graph.worker_node import StepExecutionResult, WorkerNode
__all__ = [
# Goal
@@ -42,6 +57,8 @@ __all__ = [
# Edge
"EdgeSpec",
"EdgeCondition",
"GraphSpec",
"DEFAULT_MAX_TOKENS",
# Executor (fixed graph)
"GraphExecutor",
# Plan (flexible execution)
@@ -71,4 +88,22 @@ __all__ = [
"CodeSandbox",
"safe_exec",
"safe_eval",
# Conversation
"NodeConversation",
"ConversationStore",
"Message",
# Event Loop
"EventLoopNode",
"LoopConfig",
"OutputAccumulator",
"JudgeProtocol",
"JudgeVerdict",
# Context Handoff
"ContextHandoff",
"HandoffContext",
# Client I/O
"NodeClientIO",
"ActiveNodeClientIO",
"InertNodeClientIO",
"ClientIOGateway",
]
@@ -0,0 +1,85 @@
"""
Checkpoint Configuration - Controls checkpoint behavior during execution.
"""
from dataclasses import dataclass
@dataclass
class CheckpointConfig:
"""
Configuration for checkpoint behavior during graph execution.
Controls when checkpoints are created, how they're stored,
and when they're pruned.
"""
# Enable/disable checkpointing
enabled: bool = True
# When to checkpoint
checkpoint_on_node_start: bool = True
checkpoint_on_node_complete: bool = True
# Pruning (time-based)
checkpoint_max_age_days: int = 7 # Prune checkpoints older than 1 week
prune_every_n_nodes: int = 10 # Check for pruning every N nodes
# Performance
async_checkpoint: bool = True # Don't block execution on checkpoint writes
# What to include in checkpoints
include_full_memory: bool = True
include_metrics: bool = True
def should_checkpoint_node_start(self) -> bool:
"""Check whether a checkpoint should be created before node execution."""
return self.enabled and self.checkpoint_on_node_start
def should_checkpoint_node_complete(self) -> bool:
"""Check whether a checkpoint should be created after node execution."""
return self.enabled and self.checkpoint_on_node_complete
def should_prune_checkpoints(self, nodes_executed: int) -> bool:
"""
Check whether checkpoints should be pruned based on execution progress.
Args:
nodes_executed: Number of nodes executed so far
Returns:
True if old checkpoints should be checked for and pruned
"""
return (
self.enabled
and self.prune_every_n_nodes > 0
and nodes_executed % self.prune_every_n_nodes == 0
)
# Default configuration for most agents
DEFAULT_CHECKPOINT_CONFIG = CheckpointConfig(
enabled=True,
checkpoint_on_node_start=True,
checkpoint_on_node_complete=True,
checkpoint_max_age_days=7,
prune_every_n_nodes=10,
async_checkpoint=True,
)
# Minimal configuration (only checkpoint at node completion)
MINIMAL_CHECKPOINT_CONFIG = CheckpointConfig(
enabled=True,
checkpoint_on_node_start=False,
checkpoint_on_node_complete=True,
checkpoint_max_age_days=7,
prune_every_n_nodes=20,
async_checkpoint=True,
)
# Disabled configuration (no checkpointing)
DISABLED_CHECKPOINT_CONFIG = CheckpointConfig(
enabled=False,
)
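An illustrative consumption pattern for these configs; the executor loop here is paraphrased for the example, not taken from this diff:

cfg = MINIMAL_CHECKPOINT_CONFIG

for nodes_executed in range(1, 41):
    if cfg.should_checkpoint_node_start():
        pass  # never reached: checkpoint_on_node_start=False in this config
    # ... execute the node ...
    if cfg.should_checkpoint_node_complete():
        pass  # write a checkpoint after the node finishes
    if cfg.should_prune_checkpoints(nodes_executed):
        pass  # True at nodes 20 and 40 (prune_every_n_nodes=20)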
@@ -0,0 +1,170 @@
"""
Client I/O gateway for graph nodes.
Provides the bridge between node code and external clients:
- ActiveNodeClientIO: for client_facing=True nodes (streams output, accepts input)
- InertNodeClientIO: for client_facing=False nodes (logs internally, redirects input)
- ClientIOGateway: factory that creates the right variant per node
"""
from __future__ import annotations
import asyncio
import logging
from abc import ABC, abstractmethod
from collections.abc import AsyncIterator
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from framework.runtime.event_bus import EventBus
logger = logging.getLogger(__name__)
class NodeClientIO(ABC):
"""Abstract base for node client I/O."""
@abstractmethod
async def emit_output(self, content: str, is_final: bool = False) -> None:
"""Emit output content. If is_final=True, signal end of stream."""
@abstractmethod
async def request_input(self, prompt: str = "", timeout: float | None = None) -> str:
"""Request input. Behavior depends on whether the node is client-facing."""
class ActiveNodeClientIO(NodeClientIO):
"""
Client I/O for client_facing=True nodes.
- emit_output() queues content and publishes CLIENT_OUTPUT_DELTA.
- request_input() publishes CLIENT_INPUT_REQUESTED, then awaits provide_input().
- output_stream() yields queued content until the final sentinel.
"""
def __init__(
self,
node_id: str,
event_bus: EventBus | None = None,
) -> None:
self.node_id = node_id
self._event_bus = event_bus
self._output_queue: asyncio.Queue[str | None] = asyncio.Queue()
self._output_snapshot = ""
self._input_event: asyncio.Event | None = None
self._input_result: str | None = None
async def emit_output(self, content: str, is_final: bool = False) -> None:
self._output_snapshot += content
await self._output_queue.put(content)
if self._event_bus is not None:
await self._event_bus.emit_client_output_delta(
stream_id=self.node_id,
node_id=self.node_id,
content=content,
snapshot=self._output_snapshot,
)
if is_final:
await self._output_queue.put(None)
async def request_input(self, prompt: str = "", timeout: float | None = None) -> str:
if self._input_event is not None:
raise RuntimeError("request_input already pending for this node")
self._input_event = asyncio.Event()
self._input_result = None
if self._event_bus is not None:
await self._event_bus.emit_client_input_requested(
stream_id=self.node_id,
node_id=self.node_id,
prompt=prompt,
)
try:
if timeout is not None:
await asyncio.wait_for(self._input_event.wait(), timeout=timeout)
else:
await self._input_event.wait()
finally:
self._input_event = None
if self._input_result is None:
raise RuntimeError("input event was set but no input was provided")
result = self._input_result
self._input_result = None
return result
async def provide_input(self, content: str) -> None:
"""Called externally to fulfill a pending request_input()."""
if self._input_event is None:
raise RuntimeError("no pending request_input to fulfill")
self._input_result = content
self._input_event.set()
async def output_stream(self) -> AsyncIterator[str]:
"""Async iterator that yields output chunks until the final sentinel."""
while True:
chunk = await self._output_queue.get()
if chunk is None:
break
yield chunk
class InertNodeClientIO(NodeClientIO):
"""
Client I/O for client_facing=False nodes.
- emit_output() publishes NODE_INTERNAL_OUTPUT (content is not discarded).
- request_input() publishes NODE_INPUT_BLOCKED and returns a redirect string.
"""
def __init__(
self,
node_id: str,
event_bus: EventBus | None = None,
) -> None:
self.node_id = node_id
self._event_bus = event_bus
async def emit_output(self, content: str, is_final: bool = False) -> None:
if self._event_bus is not None:
await self._event_bus.emit_node_internal_output(
stream_id=self.node_id,
node_id=self.node_id,
content=content,
)
async def request_input(self, prompt: str = "", timeout: float | None = None) -> str:
if self._event_bus is not None:
await self._event_bus.emit_node_input_blocked(
stream_id=self.node_id,
node_id=self.node_id,
prompt=prompt,
)
return (
"You are an internal processing node. There is no user to interact with."
" Work with the data provided in your inputs to complete your task."
)
class ClientIOGateway:
"""Factory that creates the appropriate NodeClientIO for a node."""
def __init__(self, event_bus: EventBus | None = None) -> None:
self._event_bus = event_bus
def create_io(self, node_id: str, client_facing: bool) -> NodeClientIO:
if client_facing:
return ActiveNodeClientIO(
node_id=node_id,
event_bus=self._event_bus,
)
return InertNodeClientIO(
node_id=node_id,
event_bus=self._event_bus,
)
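A self-contained sketch of the gateway above with no event bus attached (so no events are published); it demonstrates the output stream and the input round-trip. Note that provide_input() and output_stream() exist only on the active variant, which client_facing=True selects:

import asyncio

async def _demo() -> None:
    io = ClientIOGateway(event_bus=None).create_io("node-1", client_facing=True)

    async def producer() -> None:
        await io.emit_output("hello ")
        await io.emit_output("world", is_final=True)  # final chunk closes the stream

    async def consumer() -> None:
        async for chunk in io.output_stream():
            print(chunk, end="")
        print()

    await asyncio.gather(producer(), consumer())

    # request_input() blocks until provide_input() fulfills it.
    pending = asyncio.create_task(io.request_input(prompt="name?"))
    await asyncio.sleep(0)  # let request_input register its event first
    await io.provide_input("Ada")
    print(await pending)  # -> Ada

asyncio.run(_demo())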
@@ -13,11 +13,11 @@ Security measures:
"""
import ast
import sys
import signal
from typing import Any
from dataclasses import dataclass, field
import sys
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Any
# Safe builtins whitelist
SAFE_BUILTINS = {
@@ -25,7 +25,6 @@ SAFE_BUILTINS = {
"True": True,
"False": False,
"None": None,
# Type constructors
"bool": bool,
"int": int,
@@ -36,7 +35,6 @@ SAFE_BUILTINS = {
"set": set,
"tuple": tuple,
"frozenset": frozenset,
# Basic functions
"abs": abs,
"all": all,
@@ -97,22 +95,26 @@ BLOCKED_AST_NODES = {
class CodeSandboxError(Exception):
"""Error during sandboxed code execution."""
pass
class TimeoutError(CodeSandboxError):
"""Code execution timed out."""
pass
class SecurityError(CodeSandboxError):
"""Code contains potentially dangerous operations."""
pass
@dataclass
class SandboxResult:
"""Result of sandboxed code execution."""
success: bool
result: Any = None
error: str | None = None
@@ -134,6 +136,7 @@ class RestrictedImporter:
if name not in self._cache:
import importlib
self._cache[name] = importlib.import_module(name)
return self._cache[name]
@@ -161,9 +164,8 @@ class CodeValidator:
for node in ast.walk(tree):
# Check for blocked node types
if type(node) in self.blocked_nodes:
issues.append(
f"Blocked operation: {type(node).__name__} at line {getattr(node, 'lineno', '?')}"
)
lineno = getattr(node, "lineno", "?")
issues.append(f"Blocked operation: {type(node).__name__} at line {lineno}")
# Check for dangerous attribute access
if isinstance(node, ast.Attribute):
@@ -212,11 +214,12 @@ class CodeSandbox:
@contextmanager
def _timeout_context(self, seconds: int):
"""Context manager for timeout enforcement."""
def handler(signum, frame):
raise TimeoutError(f"Code execution timed out after {seconds} seconds")
# Only works on Unix-like systems
if hasattr(signal, 'SIGALRM'):
if hasattr(signal, "SIGALRM"):
old_handler = signal.signal(signal.SIGALRM, handler)
signal.alarm(seconds)
try:
@@ -275,6 +278,7 @@ class CodeSandbox:
# Capture stdout
import io
old_stdout = sys.stdout
sys.stdout = captured_stdout = io.StringIO()
@@ -296,11 +300,7 @@ class CodeSandbox:
# Also extract any new variables (not in inputs or builtins)
for key, value in namespace.items():
if (
key not in inputs
and key not in self.safe_builtins
and not key.startswith("_")
):
if key not in inputs and key not in self.safe_builtins and not key.startswith("_"):
extracted[key] = value
return SandboxResult(
@@ -0,0 +1,191 @@
"""Context handoff: summarize a completed NodeConversation for the next graph node."""
from __future__ import annotations
import logging
from dataclasses import dataclass
from typing import TYPE_CHECKING, Any
from framework.graph.conversation import _try_extract_key
if TYPE_CHECKING:
from framework.graph.conversation import NodeConversation
from framework.llm.provider import LLMProvider
logger = logging.getLogger(__name__)
_TRUNCATE_CHARS = 500
# ---------------------------------------------------------------------------
# Data
# ---------------------------------------------------------------------------
@dataclass
class HandoffContext:
"""Structured summary of a completed node conversation."""
source_node_id: str
summary: str
key_outputs: dict[str, Any]
turn_count: int
total_tokens_used: int
# ---------------------------------------------------------------------------
# ContextHandoff
# ---------------------------------------------------------------------------
class ContextHandoff:
"""Summarize a completed NodeConversation into a HandoffContext.
Parameters
----------
llm : LLMProvider | None
Optional LLM provider for abstractive summarization.
When *None*, all summarization uses the extractive fallback.
"""
def __init__(self, llm: LLMProvider | None = None) -> None:
self.llm = llm
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def summarize_conversation(
self,
conversation: NodeConversation,
node_id: str,
output_keys: list[str] | None = None,
) -> HandoffContext:
"""Produce a HandoffContext from *conversation*.
1. Extracts turn_count & total_tokens_used (sync properties).
2. Extracts key_outputs by scanning assistant messages most-recent-first.
3. Builds a summary via the LLM (if available) or extractive fallback.
"""
turn_count = conversation.turn_count
total_tokens_used = conversation.estimate_tokens()
messages = conversation.messages # defensive copy
# --- key outputs ---------------------------------------------------
key_outputs: dict[str, Any] = {}
if output_keys:
remaining = set(output_keys)
for msg in reversed(messages):
if msg.role != "assistant" or not remaining:
continue
for key in list(remaining):
value = _try_extract_key(msg.content, key)
if value is not None:
key_outputs[key] = value
remaining.discard(key)
# --- summary -------------------------------------------------------
if self.llm is not None:
try:
summary = self._llm_summary(messages, output_keys or [])
except Exception:
logger.warning(
"LLM summarization failed; falling back to extractive.",
exc_info=True,
)
summary = self._extractive_summary(messages)
else:
summary = self._extractive_summary(messages)
return HandoffContext(
source_node_id=node_id,
summary=summary,
key_outputs=key_outputs,
turn_count=turn_count,
total_tokens_used=total_tokens_used,
)
@staticmethod
def format_as_input(handoff: HandoffContext) -> str:
"""Render *handoff* as structured plain text for the next node's input."""
header = (
f"--- CONTEXT FROM: {handoff.source_node_id} "
f"({handoff.turn_count} turns, ~{handoff.total_tokens_used} tokens) ---"
)
sections: list[str] = [header, ""]
if handoff.key_outputs:
sections.append("KEY OUTPUTS:")
for k, v in handoff.key_outputs.items():
sections.append(f"- {k}: {v}")
sections.append("")
summary_text = handoff.summary or "No summary available."
sections.append("SUMMARY:")
sections.append(summary_text)
sections.append("")
sections.append("--- END CONTEXT ---")
return "\n".join(sections)
# ------------------------------------------------------------------
# Private helpers
# ------------------------------------------------------------------
@staticmethod
def _extractive_summary(messages: list) -> str:
"""Build a summary from key assistant messages without an LLM.
Strategy:
- Include the first assistant message (initial assessment).
- Include the last assistant message (final conclusion).
- Truncate each to ~500 chars.
"""
if not messages:
return "Empty conversation."
assistant_msgs = [m for m in messages if m.role == "assistant"]
if not assistant_msgs:
return "No assistant responses."
parts: list[str] = []
first = assistant_msgs[0].content
parts.append(first[:_TRUNCATE_CHARS])
if len(assistant_msgs) > 1:
last = assistant_msgs[-1].content
parts.append(last[:_TRUNCATE_CHARS])
return "\n\n".join(parts)
def _llm_summary(self, messages: list, output_keys: list[str]) -> str:
"""Produce a summary by calling the LLM provider."""
if self.llm is None:
raise ValueError("_llm_summary called without an LLM provider")
conversation_text = "\n".join(f"[{m.role}]: {m.content}" for m in messages)
key_hint = ""
if output_keys:
key_hint = (
"\nThe following output keys are especially important: "
+ ", ".join(output_keys)
+ ".\n"
)
system_prompt = (
"You are a concise summarizer. Given the conversation below, "
"produce a brief summary (at most ~500 tokens) that captures the "
"key decisions, findings, and outcomes. Focus on what was concluded "
"rather than the back-and-forth process." + key_hint
)
response = self.llm.complete(
messages=[{"role": "user", "content": conversation_text}],
system=system_prompt,
max_tokens=500,
)
return response.content.strip()
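A quick illustration of the rendering helper above with a hand-built HandoffContext; all values are invented for the example:

ctx = HandoffContext(
    source_node_id="research",
    summary="Compared 3 candidate APIs; recommended the REST endpoint.",
    key_outputs={"chosen_api": "rest", "confidence": "0.9"},
    turn_count=4,
    total_tokens_used=2100,
)
print(ContextHandoff.format_as_input(ctx))
# --- CONTEXT FROM: research (4 turns, ~2100 tokens) ---
#
# KEY OUTPUTS:
# - chosen_api: rest
# - confidence: 0.9
#
# SUMMARY:
# Compared 3 candidate APIs; recommended the REST endpoint.
#
# --- END CONTEXT ---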
@@ -0,0 +1,600 @@
"""NodeConversation: Message history management for graph nodes."""
from __future__ import annotations
import json
import re
from dataclasses import dataclass
from typing import Any, Literal, Protocol, runtime_checkable
@dataclass
class Message:
"""A single message in a conversation.
Attributes:
seq: Monotonic sequence number.
role: One of "user", "assistant", or "tool".
content: Message text.
tool_use_id: Internal tool-use identifier (output as ``tool_call_id`` in LLM dicts).
tool_calls: OpenAI-format tool call list for assistant messages.
is_error: When True and role is "tool", ``to_llm_dict`` prepends "ERROR: " to content.
"""
seq: int
role: Literal["user", "assistant", "tool"]
content: str
tool_use_id: str | None = None
tool_calls: list[dict[str, Any]] | None = None
is_error: bool = False
def to_llm_dict(self) -> dict[str, Any]:
"""Convert to OpenAI-format message dict."""
if self.role == "user":
return {"role": "user", "content": self.content}
if self.role == "assistant":
d: dict[str, Any] = {"role": "assistant", "content": self.content}
if self.tool_calls:
d["tool_calls"] = self.tool_calls
return d
# role == "tool"
content = f"ERROR: {self.content}" if self.is_error else self.content
return {
"role": "tool",
"tool_call_id": self.tool_use_id,
"content": content,
}
def to_storage_dict(self) -> dict[str, Any]:
"""Serialize all fields for persistence. Omits None/default-False fields."""
d: dict[str, Any] = {
"seq": self.seq,
"role": self.role,
"content": self.content,
}
if self.tool_use_id is not None:
d["tool_use_id"] = self.tool_use_id
if self.tool_calls is not None:
d["tool_calls"] = self.tool_calls
if self.is_error:
d["is_error"] = self.is_error
return d
@classmethod
def from_storage_dict(cls, data: dict[str, Any]) -> Message:
"""Deserialize from a storage dict."""
return cls(
seq=data["seq"],
role=data["role"],
content=data["content"],
tool_use_id=data.get("tool_use_id"),
tool_calls=data.get("tool_calls"),
is_error=data.get("is_error", False),
)
def _extract_spillover_filename(content: str) -> str | None:
"""Extract spillover filename from a truncated tool result.
Matches the pattern produced by EventLoopNode._truncate_tool_result():
"saved to 'tool_github_list_stargazers_abc123.txt'"
"""
match = re.search(r"saved to '([^']+)'", content)
return match.group(1) if match else None
# ---------------------------------------------------------------------------
# ConversationStore protocol (Phase 2)
# ---------------------------------------------------------------------------
@runtime_checkable
class ConversationStore(Protocol):
"""Protocol for conversation persistence backends."""
async def write_part(self, seq: int, data: dict[str, Any]) -> None: ...
async def read_parts(self) -> list[dict[str, Any]]: ...
async def write_meta(self, data: dict[str, Any]) -> None: ...
async def read_meta(self) -> dict[str, Any] | None: ...
async def write_cursor(self, data: dict[str, Any]) -> None: ...
async def read_cursor(self) -> dict[str, Any] | None: ...
async def delete_parts_before(self, seq: int) -> None: ...
async def close(self) -> None: ...
async def destroy(self) -> None: ...
# ---------------------------------------------------------------------------
# NodeConversation
# ---------------------------------------------------------------------------
def _try_extract_key(content: str, key: str) -> str | None:
"""Try 4 strategies to extract a *key*'s value from message content.
Strategies (in order):
1. Whole message is JSON: parse with ``json.loads`` and check for the key.
2. Embedded JSON via ``find_json_object`` helper.
3. Colon format: ``key: value``.
4. Equals format: ``key = value``.
"""
from framework.graph.node import find_json_object
# 1. Whole message is JSON
try:
parsed = json.loads(content)
if isinstance(parsed, dict) and key in parsed:
val = parsed[key]
return json.dumps(val) if not isinstance(val, str) else val
except (json.JSONDecodeError, TypeError):
pass
# 2. Embedded JSON via find_json_object
json_str = find_json_object(content)
if json_str:
try:
parsed = json.loads(json_str)
if isinstance(parsed, dict) and key in parsed:
val = parsed[key]
return json.dumps(val) if not isinstance(val, str) else val
except (json.JSONDecodeError, TypeError):
pass
# 3. Colon format: key: value
match = re.search(rf"\b{re.escape(key)}\s*:\s*(.+)", content)
if match:
return match.group(1).strip()
# 4. Equals format: key = value
match = re.search(rf"\b{re.escape(key)}\s*=\s*(.+)", content)
if match:
return match.group(1).strip()
return None
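Illustrative inputs for the four strategies (an editorial sketch, not part of the module; strategy 2's behavior depends on the ``find_json_object`` helper, which is assumed to return None for plain text):

print(_try_extract_key('{"status": "done", "count": 2}', "count"))  # "2" (non-str values are re-dumped)
print(_try_extract_key("status: done", "status"))  # "done" (colon format)
print(_try_extract_key("retries = 3", "retries"))  # "3" (equals format)
print(_try_extract_key("nothing relevant here", "status"))  # None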
class NodeConversation:
"""Message history for a graph node with optional write-through persistence.
When *store* is ``None`` the conversation works purely in-memory.
When a :class:`ConversationStore` is supplied every mutation is
persisted via write-through (meta is lazily written on the first
``_persist`` call).
"""
def __init__(
self,
system_prompt: str = "",
max_history_tokens: int = 32000,
compaction_threshold: float = 0.8,
output_keys: list[str] | None = None,
store: ConversationStore | None = None,
) -> None:
self._system_prompt = system_prompt
self._max_history_tokens = max_history_tokens
self._compaction_threshold = compaction_threshold
self._output_keys = output_keys
self._store = store
self._messages: list[Message] = []
self._next_seq: int = 0
self._meta_persisted: bool = False
self._last_api_input_tokens: int | None = None
# --- Properties --------------------------------------------------------
@property
def system_prompt(self) -> str:
return self._system_prompt
@property
def messages(self) -> list[Message]:
"""Return a defensive copy of the message list."""
return list(self._messages)
@property
def turn_count(self) -> int:
"""Number of conversational turns (one turn = one user message)."""
return sum(1 for m in self._messages if m.role == "user")
@property
def message_count(self) -> int:
"""Total number of messages (all roles)."""
return len(self._messages)
@property
def next_seq(self) -> int:
return self._next_seq
# --- Add messages ------------------------------------------------------
async def add_user_message(self, content: str) -> Message:
msg = Message(seq=self._next_seq, role="user", content=content)
self._messages.append(msg)
self._next_seq += 1
await self._persist(msg)
return msg
async def add_assistant_message(
self,
content: str,
tool_calls: list[dict[str, Any]] | None = None,
) -> Message:
msg = Message(
seq=self._next_seq,
role="assistant",
content=content,
tool_calls=tool_calls,
)
self._messages.append(msg)
self._next_seq += 1
await self._persist(msg)
return msg
async def add_tool_result(
self,
tool_use_id: str,
content: str,
is_error: bool = False,
) -> Message:
msg = Message(
seq=self._next_seq,
role="tool",
content=content,
tool_use_id=tool_use_id,
is_error=is_error,
)
self._messages.append(msg)
self._next_seq += 1
await self._persist(msg)
return msg
# --- Query -------------------------------------------------------------
def to_llm_messages(self) -> list[dict[str, Any]]:
"""Return messages as OpenAI-format dicts (system prompt excluded).
Automatically repairs orphaned tool_use blocks (assistant messages
with tool_calls that lack corresponding tool-result messages). This
can happen when a loop is cancelled mid-tool-execution.
"""
msgs = [m.to_llm_dict() for m in self._messages]
return self._repair_orphaned_tool_calls(msgs)
@staticmethod
def _repair_orphaned_tool_calls(
msgs: list[dict[str, Any]],
) -> list[dict[str, Any]]:
"""Ensure every tool_call has a matching tool-result message."""
repaired: list[dict[str, Any]] = []
for i, m in enumerate(msgs):
repaired.append(m)
tool_calls = m.get("tool_calls")
if m.get("role") != "assistant" or not tool_calls:
continue
# Collect IDs of tool results that follow this assistant message
answered: set[str] = set()
for j in range(i + 1, len(msgs)):
if msgs[j].get("role") == "tool":
tid = msgs[j].get("tool_call_id")
if tid:
answered.add(tid)
else:
break # stop at first non-tool message
# Patch any missing results
for tc in tool_calls:
tc_id = tc.get("id")
if tc_id and tc_id not in answered:
repaired.append(
{
"role": "tool",
"tool_call_id": tc_id,
"content": "ERROR: Tool execution was interrupted.",
}
)
return repaired
def estimate_tokens(self) -> int:
"""Best available token estimate.
Uses actual API input token count when available (set via
:meth:`update_token_count`), otherwise falls back to the rough
``total_chars / 4`` heuristic.
"""
if self._last_api_input_tokens is not None:
return self._last_api_input_tokens
total_chars = sum(len(m.content) for m in self._messages)
return total_chars // 4
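    # Example (illustrative): 10,000 characters of history with no recorded
    # API count estimates to 10000 // 4 = 2500 tokens; once
    # update_token_count() has run, the exact API figure takes precedence.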
def update_token_count(self, actual_input_tokens: int) -> None:
"""Store actual API input token count for more accurate compaction.
Called by EventLoopNode after each LLM call with the ``input_tokens``
value from the API response. This value includes system prompt and
tool definitions, so it may be higher than a message-only estimate.
"""
self._last_api_input_tokens = actual_input_tokens
def usage_ratio(self) -> float:
"""Current token usage as a fraction of *max_history_tokens*.
Returns 0.0 when ``max_history_tokens`` is zero (unlimited).
"""
if self._max_history_tokens <= 0:
return 0.0
return self.estimate_tokens() / self._max_history_tokens
def needs_compaction(self) -> bool:
return self.estimate_tokens() >= self._max_history_tokens * self._compaction_threshold
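    # Example (illustrative): with max_history_tokens=32000 and
    # compaction_threshold=0.8, needs_compaction() flips to True once the
    # estimate reaches 32000 * 0.8 = 25,600 tokens.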
# --- Output-key extraction ---------------------------------------------
def _extract_protected_values(self, messages: list[Message]) -> dict[str, str]:
"""Scan assistant messages for output_key values before compaction.
Iterates most-recent-first. Once a key is found, it's skipped for
older messages (latest value wins).
"""
if not self._output_keys:
return {}
found: dict[str, str] = {}
remaining_keys = set(self._output_keys)
for msg in reversed(messages):
if msg.role != "assistant" or not remaining_keys:
continue
for key in list(remaining_keys):
value = self._try_extract_key(msg.content, key)
if value is not None:
found[key] = value
remaining_keys.discard(key)
return found
def _try_extract_key(self, content: str, key: str) -> str | None:
"""Try 4 strategies to extract a key's value from message content."""
return _try_extract_key(content, key)
# --- Lifecycle ---------------------------------------------------------
async def prune_old_tool_results(
self,
protect_tokens: int = 5000,
min_prune_tokens: int = 2000,
) -> int:
"""Replace old tool result content with compact placeholders.
Walks backward through messages. Recent tool results (within
*protect_tokens*) are kept intact. Older tool results have their
content replaced with a ~100-char placeholder that preserves the
spillover filename reference (if any). Message structure (role,
seq, tool_use_id) stays valid for the LLM API.
        Error tool results are never pruned; they prevent re-calling
        failing tools.
Returns the number of messages pruned (0 if nothing was pruned).
"""
if not self._messages:
return 0
# Phase 1: Walk backward, classify tool results as protected vs pruneable
protected_tokens = 0
pruneable: list[int] = [] # indices into self._messages
pruneable_tokens = 0
for i in range(len(self._messages) - 1, -1, -1):
msg = self._messages[i]
if msg.role != "tool":
continue
if msg.is_error:
continue # never prune errors
if msg.content.startswith("[Pruned tool result"):
continue # already pruned
est = len(msg.content) // 4
if protected_tokens < protect_tokens:
protected_tokens += est
else:
pruneable.append(i)
pruneable_tokens += est
# Phase 2: Only prune if enough to be worthwhile
if pruneable_tokens < min_prune_tokens:
return 0
# Phase 3: Replace content with compact placeholder
count = 0
for i in pruneable:
msg = self._messages[i]
orig_len = len(msg.content)
spillover = _extract_spillover_filename(msg.content)
if spillover:
placeholder = (
f"[Pruned tool result: {orig_len} chars. "
f"Full data in '{spillover}'. "
f"Use load_data('{spillover}') to retrieve.]"
)
else:
placeholder = f"[Pruned tool result: {orig_len} chars cleared from context.]"
self._messages[i] = Message(
seq=msg.seq,
role=msg.role,
content=placeholder,
tool_use_id=msg.tool_use_id,
tool_calls=msg.tool_calls,
is_error=msg.is_error,
)
count += 1
if self._store:
await self._store.write_part(msg.seq, self._messages[i].to_storage_dict())
# Reset token estimate — content lengths changed
self._last_api_input_tokens = None
return count
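    # Example (illustrative): with protect_tokens=5000 the newest ~5000
    # tokens of tool output stay intact; older non-error results collapse
    # to placeholders, but only if at least min_prune_tokens (2000) would
    # be reclaimed, so tiny prunes never churn the history.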
async def compact(self, summary: str, keep_recent: int = 2) -> None:
"""Replace old messages with a summary, optionally keeping recent ones.
Args:
summary: Caller-provided summary text.
keep_recent: Number of recent messages to preserve (default 2).
Clamped to [0, len(messages) - 1].
"""
if not self._messages:
return
# Clamp: must discard at least 1 message
keep_recent = max(0, min(keep_recent, len(self._messages) - 1))
total = len(self._messages)
split = total - keep_recent if keep_recent > 0 else total
# Advance split past orphaned tool results at the boundary.
# Tool-role messages reference a tool_use from the preceding
# assistant message; if that assistant message falls into the
# compacted (old) portion the tool_result becomes invalid.
while split < total and self._messages[split].role == "tool":
split += 1
old_messages = list(self._messages[:split])
recent_messages = list(self._messages[split:])
# Extract protected values from messages being discarded
if self._output_keys:
protected = self._extract_protected_values(old_messages)
if protected:
lines = ["PRESERVED VALUES (do not lose these):"]
for k, v in protected.items():
lines.append(f"- {k}: {v}")
lines.append("")
lines.append("CONVERSATION SUMMARY:")
lines.append(summary)
summary = "\n".join(lines)
# Determine summary seq
if recent_messages:
summary_seq = recent_messages[0].seq - 1
else:
summary_seq = self._next_seq
self._next_seq += 1
summary_msg = Message(seq=summary_seq, role="user", content=summary)
# Persist
if self._store:
delete_before = recent_messages[0].seq if recent_messages else self._next_seq
await self._store.delete_parts_before(delete_before)
await self._store.write_part(summary_msg.seq, summary_msg.to_storage_dict())
await self._store.write_cursor({"next_seq": self._next_seq})
self._messages = [summary_msg] + recent_messages
self._last_api_input_tokens = None # reset; next LLM call will recalibrate
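    # Example (illustrative): with 10 messages and keep_recent=2, the first
    # 8 (plus any orphaned tool results at the boundary) collapse into one
    # summary message whose seq is recent_messages[0].seq - 1, so seq
    # ordering survives a later restore() from the store.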
async def clear(self) -> None:
"""Remove all messages, keep system prompt, preserve ``_next_seq``."""
if self._store:
await self._store.delete_parts_before(self._next_seq)
await self._store.write_cursor({"next_seq": self._next_seq})
self._messages.clear()
self._last_api_input_tokens = None
def export_summary(self) -> str:
"""Structured summary with [STATS], [CONFIG], [RECENT_MESSAGES] sections."""
prompt_preview = (
self._system_prompt[:80] + "..."
if len(self._system_prompt) > 80
else self._system_prompt
)
lines = [
"[STATS]",
f"turns: {self.turn_count}",
f"messages: {self.message_count}",
f"estimated_tokens: {self.estimate_tokens()}",
"",
"[CONFIG]",
f"system_prompt: {prompt_preview!r}",
]
if self._output_keys:
lines.append(f"output_keys: {', '.join(self._output_keys)}")
lines.append("")
lines.append("[RECENT_MESSAGES]")
for m in self._messages[-5:]:
preview = m.content[:60] + "..." if len(m.content) > 60 else m.content
lines.append(f" [{m.role}] {preview}")
return "\n".join(lines)
# --- Persistence internals ---------------------------------------------
async def _persist(self, message: Message) -> None:
"""Write-through a single message. No-op when store is None."""
if self._store is None:
return
if not self._meta_persisted:
await self._persist_meta()
await self._store.write_part(message.seq, message.to_storage_dict())
await self._store.write_cursor({"next_seq": self._next_seq})
async def _persist_meta(self) -> None:
"""Lazily write conversation metadata to the store (called once)."""
if self._store is None:
return
await self._store.write_meta(
{
"system_prompt": self._system_prompt,
"max_history_tokens": self._max_history_tokens,
"compaction_threshold": self._compaction_threshold,
"output_keys": self._output_keys,
}
)
self._meta_persisted = True
# --- Restore -----------------------------------------------------------
@classmethod
async def restore(cls, store: ConversationStore) -> NodeConversation | None:
"""Reconstruct a NodeConversation from a store.
Returns ``None`` if the store contains no metadata (i.e. the
conversation was never persisted).
"""
meta = await store.read_meta()
if meta is None:
return None
conv = cls(
system_prompt=meta.get("system_prompt", ""),
max_history_tokens=meta.get("max_history_tokens", 32000),
compaction_threshold=meta.get("compaction_threshold", 0.8),
output_keys=meta.get("output_keys"),
store=store,
)
conv._meta_persisted = True
parts = await store.read_parts()
conv._messages = [Message.from_storage_dict(p) for p in parts]
cursor = await store.read_cursor()
if cursor:
conv._next_seq = cursor["next_seq"]
elif conv._messages:
conv._next_seq = conv._messages[-1].seq + 1
return conv
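A minimal end-to-end sketch of the persistence contract above. It assumes only the store methods this class actually calls (write_meta, write_part, write_cursor, delete_parts_before, read_meta, read_parts, read_cursor); the InMemoryConversationStore and the main() driver are hypothetical illustrations, not code from this repository.

import asyncio
from typing import Any


class InMemoryConversationStore:
    """Hypothetical ConversationStore: plain dicts, no durability."""

    def __init__(self) -> None:
        self.meta: dict[str, Any] | None = None
        self.parts: dict[int, dict[str, Any]] = {}
        self.cursor: dict[str, Any] | None = None

    async def write_meta(self, meta: dict[str, Any]) -> None:
        self.meta = meta

    async def write_part(self, seq: int, part: dict[str, Any]) -> None:
        self.parts[seq] = part

    async def write_cursor(self, cursor: dict[str, Any]) -> None:
        self.cursor = cursor

    async def delete_parts_before(self, seq: int) -> None:
        self.parts = {s: p for s, p in self.parts.items() if s >= seq}

    async def read_meta(self) -> dict[str, Any] | None:
        return self.meta

    async def read_parts(self) -> list[dict[str, Any]]:
        return [self.parts[s] for s in sorted(self.parts)]

    async def read_cursor(self) -> dict[str, Any] | None:
        return self.cursor


async def main() -> None:
    store = InMemoryConversationStore()
    conv = NodeConversation(system_prompt="You are terse.", store=store)

    await conv.add_user_message("What is 2 + 2?")
    await conv.add_assistant_message("4")
    await conv.compact(summary="User asked basic arithmetic; answer was 4.",
                       keep_recent=1)

    # Every mutation was written through, so a fresh object can resume:
    resumed = await NodeConversation.restore(store)
    assert resumed is not None
    assert resumed.next_seq == conv.next_seq


asyncio.run(main())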
@@ -13,7 +13,7 @@ Edge Types:
- always: Always traverse after source completes
- on_success: Traverse only if source succeeds
- on_failure: Traverse only if source fails
-- conditional: Traverse based on expression evaluation
+- conditional: Traverse based on expression evaluation (SAFE SUBSET ONLY)
- llm_decide: Let LLM decide based on goal and context (goal-aware routing)
The llm_decide condition is particularly powerful for goal-driven agents,
@@ -21,19 +21,24 @@ allowing the LLM to evaluate whether proceeding along an edge makes sense
given the current goal, context, and execution state.
"""
-from enum import Enum
+from enum import StrEnum
from typing import Any
-from pydantic import BaseModel, Field
+from pydantic import BaseModel, Field, model_validator
+from framework.graph.safe_eval import safe_eval
DEFAULT_MAX_TOKENS = 8192
-class EdgeCondition(str, Enum):
+class EdgeCondition(StrEnum):
"""When an edge should be traversed."""
    ALWAYS = "always"  # Always after source completes
    ON_SUCCESS = "on_success"  # Only if source succeeds
    ON_FAILURE = "on_failure"  # Only if source fails
    CONDITIONAL = "conditional"  # Based on expression
    LLM_DECIDE = "llm_decide"  # Let LLM decide based on goal and context
class EdgeSpec(BaseModel):
@@ -68,6 +73,7 @@ class EdgeSpec(BaseModel):
description="Only filter if results need refinement to meet goal",
)
"""
id: str
source: str = Field(description="Source node ID")
target: str = Field(description="Target node ID")
@@ -76,20 +82,17 @@ class EdgeSpec(BaseModel):
condition: EdgeCondition = EdgeCondition.ALWAYS
condition_expr: str | None = Field(
default=None,
description="Expression for CONDITIONAL edges, e.g., 'output.confidence > 0.8'"
description="Expression for CONDITIONAL edges, e.g., 'output.confidence > 0.8'",
)
# Data flow
input_mapping: dict[str, str] = Field(
default_factory=dict,
description="Map source outputs to target inputs: {target_key: source_key}"
description="Map source outputs to target inputs: {target_key: source_key}",
)
# Priority for multiple outgoing edges
-    priority: int = Field(
-        default=0,
-        description="Higher priority edges are evaluated first"
-    )
+    priority: int = Field(default=0, description="Higher priority edges are evaluated first")
# Metadata
description: str = ""
@@ -155,6 +158,10 @@ class EdgeSpec(BaseModel):
memory: dict[str, Any],
) -> bool:
"""Evaluate a conditional expression."""
import logging
logger = logging.getLogger(__name__)
if not self.condition_expr:
return True
@@ -164,18 +171,31 @@ class EdgeSpec(BaseModel):
"output": output,
"memory": memory,
"result": output.get("result"),
"true": True, # Allow lowercase true/false in conditions
"true": True, # Allow lowercase true/false in conditions
"false": False,
**memory, # Unpack memory keys directly into context
}
try:
-            # Safe evaluation (in production, use a proper expression evaluator)
-            return bool(eval(self.condition_expr, {"__builtins__": {}}, context))
+            # Safe evaluation using AST-based whitelist
+            result = bool(safe_eval(self.condition_expr, context))
# Log the evaluation for visibility
# Extract the variable names used in the expression for debugging
expr_vars = {
k: repr(context[k])
for k in context
if k not in ("output", "memory", "result", "true", "false")
and k in self.condition_expr
}
logger.info(
" Edge %s: condition '%s'%s (vars: %s)",
self.id,
self.condition_expr,
result,
expr_vars or "none matched",
)
return result
except Exception as e:
# Log the error for debugging
import logging
logger = logging.getLogger(__name__)
logger.warning(f" ⚠ Condition evaluation failed: {self.condition_expr}")
logger.warning(f" Error: {e}")
logger.warning(f" Available context keys: {list(context.keys())}")
@@ -235,7 +255,8 @@ Respond with ONLY a JSON object:
# Parse response
import re
-        json_match = re.search(r'\{[^{}]*\}', response.content, re.DOTALL)
+        json_match = re.search(r"\{[^{}]*\}", response.content, re.DOTALL)
if json_match:
data = json.loads(json_match.group())
proceed = data.get("proceed", False)
@@ -243,6 +264,7 @@ Respond with ONLY a JSON object:
# Log the decision (using basic print for now)
import logging
logger = logging.getLogger(__name__)
logger.info(f" 🤔 LLM routing decision: {'PROCEED' if proceed else 'SKIP'}")
logger.info(f" Reason: {reasoning}")
@@ -252,6 +274,7 @@ Respond with ONLY a JSON object:
except Exception as e:
# Fallback: proceed on success
import logging
logger = logging.getLogger(__name__)
logger.warning(f" ⚠ LLM routing failed, defaulting to on_success: {e}")
return source_success
@@ -288,13 +311,52 @@ Respond with ONLY a JSON object:
return result
class AsyncEntryPointSpec(BaseModel):
"""
Specification for an asynchronous entry point.
Used with AgentRuntime for multi-entry-point agents that handle
concurrent execution streams (e.g., webhook + API handlers).
Example:
AsyncEntryPointSpec(
id="webhook",
name="Zendesk Webhook Handler",
entry_node="process-webhook",
trigger_type="webhook",
isolation_level="shared",
)
"""
id: str = Field(description="Unique identifier for this entry point")
name: str = Field(description="Human-readable name")
entry_node: str = Field(description="Node ID to start execution from")
trigger_type: str = Field(
default="manual",
description="How this entry point is triggered: webhook, api, timer, event, manual",
)
trigger_config: dict[str, Any] = Field(
default_factory=dict,
description="Trigger-specific configuration (e.g., webhook URL, timer interval)",
)
isolation_level: str = Field(
default="shared", description="State isolation: isolated, shared, or synchronized"
)
priority: int = Field(default=0, description="Execution priority (higher = more priority)")
max_concurrent: int = Field(
default=10, description="Maximum concurrent executions for this entry point"
)
model_config = {"extra": "allow"}
class GraphSpec(BaseModel):
"""
Complete specification of an agent graph.
Contains all nodes, edges, and metadata needed to execute.
Example:
For single-entry-point agents (traditional pattern):
GraphSpec(
id="calculator-graph",
goal_id="calc-001",
@@ -303,7 +365,31 @@ class GraphSpec(BaseModel):
nodes=[...],
edges=[...],
)
For multi-entry-point agents (concurrent streams):
GraphSpec(
id="support-agent-graph",
goal_id="support-001",
entry_node="process-webhook", # Default entry
async_entry_points=[
AsyncEntryPointSpec(
id="webhook",
name="Zendesk Webhook",
entry_node="process-webhook",
trigger_type="webhook",
),
AsyncEntryPointSpec(
id="api",
name="API Handler",
entry_node="process-request",
trigger_type="api",
),
],
nodes=[...],
edges=[...],
)
"""
id: str
goal_id: str
version: str = "1.0.0"
@@ -312,50 +398,67 @@ class GraphSpec(BaseModel):
entry_node: str = Field(description="ID of the first node to execute")
entry_points: dict[str, str] = Field(
default_factory=dict,
description="Named entry points for resuming execution. Format: {name: node_id}"
description="Named entry points for resuming execution. Format: {name: node_id}",
)
async_entry_points: list[AsyncEntryPointSpec] = Field(
default_factory=list,
description=(
"Asynchronous entry points for concurrent execution streams (used with AgentRuntime)"
),
)
terminal_nodes: list[str] = Field(
-        default_factory=list,
-        description="IDs of nodes that end execution"
+        default_factory=list, description="IDs of nodes that end execution"
)
pause_nodes: list[str] = Field(
-        default_factory=list,
-        description="IDs of nodes that pause execution for HITL input"
+        default_factory=list, description="IDs of nodes that pause execution for HITL input"
)
# Components
nodes: list[Any] = Field( # NodeSpec, but avoiding circular import
-        default_factory=list,
-        description="All node specifications"
-    )
-    edges: list[EdgeSpec] = Field(
-        default_factory=list,
-        description="All edge specifications"
-    )
+        default_factory=list, description="All node specifications"
+    )
+    edges: list[EdgeSpec] = Field(default_factory=list, description="All edge specifications")
# Shared memory keys
memory_keys: list[str] = Field(
-        default_factory=list,
-        description="Keys available in shared memory"
+        default_factory=list, description="Keys available in shared memory"
)
# Default LLM settings
default_model: str = "claude-haiku-4-5-20251001"
-    max_tokens: int = 1024
+    max_tokens: int = Field(default=None)  # resolved by _resolve_max_tokens validator
# Cleanup LLM for JSON extraction fallback (fast/cheap model preferred)
# If not set, uses CEREBRAS_API_KEY -> cerebras/llama-3.3-70b or
# ANTHROPIC_API_KEY -> claude-3-5-haiku as fallback
cleanup_llm_model: str | None = None
# Execution limits
-    max_steps: int = Field(
-        default=100,
-        description="Maximum node executions before timeout"
-    )
+    max_steps: int = Field(default=100, description="Maximum node executions before timeout")
max_retries_per_node: int = 3
# EventLoopNode configuration (from configure_loop)
loop_config: dict[str, Any] = Field(
default_factory=dict,
description="EventLoopNode configuration (max_iterations, max_tool_calls_per_turn, etc.)",
)
# Metadata
description: str = ""
created_by: str = "" # "human" or "builder_agent"
model_config = {"extra": "allow"}
@model_validator(mode="before")
@classmethod
def _resolve_max_tokens(cls, values: Any) -> Any:
"""Resolve max_tokens from the global config store when not explicitly set."""
if isinstance(values, dict) and values.get("max_tokens") is None:
from framework.config import get_max_tokens
values["max_tokens"] = get_max_tokens()
return values
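    # Example (illustrative): GraphSpec(id="g", goal_id="x", entry_node="n")
    # with no explicit max_tokens pulls the value from
    # framework.config.get_max_tokens() before field validation runs, so the
    # int annotation never sees None; passing max_tokens=2048 skips the lookup.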
def get_node(self, node_id: str) -> Any | None:
"""Get a node by ID."""
for node in self.nodes:
@@ -363,6 +466,17 @@ class GraphSpec(BaseModel):
return node
return None
def has_async_entry_points(self) -> bool:
"""Check if this graph uses async entry points (multi-stream execution)."""
return len(self.async_entry_points) > 0
def get_async_entry_point(self, entry_point_id: str) -> AsyncEntryPointSpec | None:
"""Get an async entry point by ID."""
for ep in self.async_entry_points:
if ep.id == entry_point_id:
return ep
return None
def get_outgoing_edges(self, node_id: str) -> list[EdgeSpec]:
"""Get all edges leaving a node, sorted by priority."""
edges = [e for e in self.edges if e.source == node_id]
@@ -372,6 +486,42 @@ class GraphSpec(BaseModel):
"""Get all edges entering a node."""
return [e for e in self.edges if e.target == node_id]
def detect_fan_out_nodes(self) -> dict[str, list[str]]:
"""
Detect nodes that fan-out to multiple targets.
A fan-out occurs when a node has multiple outgoing edges with the same
condition (typically ON_SUCCESS) that should execute in parallel.
Returns:
Dict mapping source_node_id -> list of parallel target_node_ids
"""
fan_outs: dict[str, list[str]] = {}
for node in self.nodes:
outgoing = self.get_outgoing_edges(node.id)
# Fan-out: multiple edges with ON_SUCCESS condition
success_edges = [e for e in outgoing if e.condition == EdgeCondition.ON_SUCCESS]
if len(success_edges) > 1:
fan_outs[node.id] = [e.target for e in success_edges]
return fan_outs
def detect_fan_in_nodes(self) -> dict[str, list[str]]:
"""
Detect nodes that receive from multiple sources (fan-in / convergence).
A fan-in occurs when a node has multiple incoming edges, meaning
it should wait for all predecessor branches to complete.
Returns:
Dict mapping target_node_id -> list of source_node_ids
"""
fan_ins: dict[str, list[str]] = {}
for node in self.nodes:
incoming = self.get_incoming_edges(node.id)
if len(incoming) > 1:
fan_ins[node.id] = [e.source for e in incoming]
return fan_ins
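    # Example (illustrative): edges A->B and A->C (both ON_SUCCESS) plus
    # B->D and C->D yield detect_fan_out_nodes() == {"A": ["B", "C"]} and
    # detect_fan_in_nodes() == {"D": ["B", "C"]}.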
def get_entry_point(self, session_state: dict | None = None) -> str:
"""
Get the appropriate entry point based on session state.
@@ -412,6 +562,37 @@ class GraphSpec(BaseModel):
if not self.get_node(self.entry_node):
errors.append(f"Entry node '{self.entry_node}' not found")
# Check async entry points
seen_entry_ids = set()
for entry_point in self.async_entry_points:
# Check for duplicate IDs
if entry_point.id in seen_entry_ids:
errors.append(f"Duplicate async entry point ID: '{entry_point.id}'")
seen_entry_ids.add(entry_point.id)
# Check entry node exists
if not self.get_node(entry_point.entry_node):
errors.append(
f"Async entry point '{entry_point.id}' references "
f"missing node '{entry_point.entry_node}'"
)
# Validate isolation level
valid_isolation = {"isolated", "shared", "synchronized"}
if entry_point.isolation_level not in valid_isolation:
errors.append(
f"Async entry point '{entry_point.id}' has invalid isolation_level "
f"'{entry_point.isolation_level}'. Valid: {valid_isolation}"
)
# Validate trigger type
valid_triggers = {"webhook", "api", "timer", "event", "manual"}
if entry_point.trigger_type not in valid_triggers:
errors.append(
f"Async entry point '{entry_point.id}' has invalid trigger_type "
f"'{entry_point.trigger_type}'. Valid: {valid_triggers}"
)
# Check terminal nodes exist
for term in self.terminal_nodes:
if not self.get_node(term):
@@ -433,6 +614,10 @@ class GraphSpec(BaseModel):
for entry_point_node in self.entry_points.values():
to_visit.append(entry_point_node)
# Add all async entry points as valid starting points
for async_entry in self.async_entry_points:
to_visit.append(async_entry.entry_node)
# Traverse from all entry points
while to_visit:
current = to_visit.pop()
@@ -442,12 +627,55 @@ class GraphSpec(BaseModel):
for edge in self.get_outgoing_edges(current):
to_visit.append(edge.target)
# Build set of async entry point nodes for quick lookup
async_entry_nodes = {ep.entry_node for ep in self.async_entry_points}
for node in self.nodes:
if node.id not in reachable:
-                # Skip this error if the node is a pause node or an entry point target
-                # (pause/resume architecture makes these reachable via session state)
-                if node.id in self.pause_nodes or node.id in self.entry_points.values():
+                # Skip if node is a pause node, entry point target, or async entry
+                # (pause/resume architecture and async entry points make reachable)
+                if (
+                    node.id in self.pause_nodes
+                    or node.id in self.entry_points.values()
+                    or node.id in async_entry_nodes
+                ):
continue
errors.append(f"Node '{node.id}' is unreachable from entry")
# Client-facing fan-out validation
fan_outs = self.detect_fan_out_nodes()
for source_id, targets in fan_outs.items():
client_facing_targets = [
t
for t in targets
if self.get_node(t) and getattr(self.get_node(t), "client_facing", False)
]
if len(client_facing_targets) > 1:
errors.append(
f"Fan-out from '{source_id}' has multiple client-facing nodes: "
f"{client_facing_targets}. Only one branch may be client-facing."
)
# Output key overlap on parallel event_loop nodes
for source_id, targets in fan_outs.items():
event_loop_targets = [
t
for t in targets
if self.get_node(t) and getattr(self.get_node(t), "node_type", "") == "event_loop"
]
if len(event_loop_targets) > 1:
seen_keys: dict[str, str] = {}
for node_id in event_loop_targets:
node = self.get_node(node_id)
for key in getattr(node, "output_keys", []):
if key in seen_keys:
errors.append(
f"Fan-out from '{source_id}': event_loop nodes "
f"'{seen_keys[key]}' and '{node_id}' both write to "
f"output_key '{key}'. Parallel event_loop nodes must "
f"have disjoint output_keys to prevent last-wins data loss."
)
else:
seen_keys[key] = node_id
return errors
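A hedged sketch of exercising the fan-out checks above. The SimpleNamespace node stand-ins, the framework.graph.graph_spec import path, and the validate() name for the enclosing method are assumptions (nodes is typed list[Any], so any object exposing the attributes read above will do):

from types import SimpleNamespace

# Assumed import path; adjust to wherever GraphSpec/EdgeSpec actually live.
from framework.graph.graph_spec import EdgeCondition, EdgeSpec, GraphSpec


def node(node_id: str, **attrs: object) -> SimpleNamespace:
    # Minimal NodeSpec stand-in: only the attributes the validators read.
    fields: dict[str, object] = {
        "id": node_id,
        "client_facing": False,
        "node_type": "event_loop",
        "output_keys": [],
    }
    fields.update(attrs)
    return SimpleNamespace(**fields)


spec = GraphSpec(
    id="demo",
    goal_id="demo-goal",
    entry_node="A",
    nodes=[node("A"), node("B", output_keys=["x"]), node("C", output_keys=["x"])],
    edges=[
        EdgeSpec(id="e1", source="A", target="B", condition=EdgeCondition.ON_SUCCESS),
        EdgeSpec(id="e2", source="A", target="C", condition=EdgeCondition.ON_SUCCESS),
    ],
)

print(spec.detect_fan_out_nodes())  # {'A': ['B', 'C']}
# Both parallel event_loop branches write output_key "x", so the method
# containing the checks above (validate(), by assumption) should report
# the last-wins output_key conflict in its returned error list.
print(spec.validate())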
