Compare commits
156 Commits
| SHA1 | Author | Date | |
|---|---|---|---|
| fd79dceb0f | |||
| ad50139d67 | |||
| 12fb40c110 | |||
| 738e469d96 | |||
| 80ccbcc827 | |||
| 08fac31a9d | |||
| 89ccd66fb9 | |||
| 7c47e367de | |||
| b8741bf94c | |||
| c90dcbb32f | |||
| ac3a5f5e93 | |||
| 1ccfdbbf7d | |||
| 1de37d2747 | |||
| 2aefdf5b5f | |||
| 4caaa79900 | |||
| 296089d4cd | |||
| cae5f971cf | |||
| bac716eea3 | |||
| 14daf672e8 | |||
| e352ae5145 | |||
| a58ffc2669 | |||
| 3fefea52be | |||
| 06fd045b3e | |||
| 2e43d2af46 | |||
| 2c9790c65d | |||
| 9700ac71bb | |||
| 61ed67b068 | |||
| c3bea8685a | |||
| 98c57b795a | |||
| 9be1d03b5c | |||
| 0d09510539 | |||
| 639c37ba17 | |||
| 2258c23254 | |||
| 9714ea106d | |||
| f4ad500177 | |||
| 9154a4d9f8 | |||
| add6efe6f1 | |||
| 7ceb1efd02 | |||
| a29ecf8435 | |||
| d0ba5ef4f4 | |||
| 860f637491 | |||
| acb2cab317 | |||
| b453806918 | |||
| 7ba8a0f51b | |||
| f6f398b6b1 | |||
| c4b22fa5c4 | |||
| 0e64f977cd | |||
| f24c9708fc | |||
| bb4436e277 | |||
| 795f66c90b | |||
| 9ef6d51573 | |||
| 3fed4e3409 | |||
| 670e69f2ce | |||
| f6c4747905 | |||
| 7b78f6c12f | |||
| 1c75100f59 | |||
| b325e103c6 | |||
| aef2d2d474 | |||
| 95a2b6711e | |||
| 7fb5e8145c | |||
| 8e45d0df83 | |||
| 8d4657c13e | |||
| 3d175a6d54 | |||
| b9debaf957 | |||
| bdcbcff6f3 | |||
| d2d7bdc374 | |||
| 40e494b15d | |||
| b5e840c0cb | |||
| f3d74c9ae4 | |||
| a22b321692 | |||
| 2e7dbad118 | |||
| 6183d1b65b | |||
| 09931e6d98 | |||
| cb394127d1 | |||
| 588fa1f9ea | |||
| 73325c280c | |||
| 8c5ae8ffa8 | |||
| 7389423c70 | |||
| 20c15446a7 | |||
| c05c30dd9a | |||
| bcd2fb76bd | |||
| 5fb97ab6df | |||
| 0224ebc800 | |||
| af88f7299a | |||
| 81729706ae | |||
| bbb1b43ebe | |||
| 70ed5fa8df | |||
| 312db6620d | |||
| 93c1fc5488 | |||
| 90762f275b | |||
| 801443027d | |||
| ca2ead76cd | |||
| d562144a6d | |||
| af7fb7da27 | |||
| c17dd63b4a | |||
| 866db289e2 | |||
| b4ac5e9607 | |||
| 3ca7af4242 | |||
| 2b12a9c91a | |||
| 9a94595a42 | |||
| e1540dfaa6 | |||
| 4f5ac6d1b1 | |||
| c87d7b13da | |||
| c4acf0b659 | |||
| 5e1ab3ca37 | |||
| 79c32c9f47 | |||
| 35ee29a843 | |||
| 573aea1d9c | |||
| 6ecbc30293 | |||
| 843b1f2e1d | |||
| 89f6c8e4ef | |||
| 304ac07bd8 | |||
| 82f0684b83 | |||
| 963c37dc31 | |||
| c02da3ba5a | |||
| 7f34e95ec6 | |||
| f2998fe098 | |||
| 323a2489b8 | |||
| f6d1cd640e | |||
| ddf89a04fe | |||
| c5dc89f5ee | |||
| 6ade34b759 | |||
| 09d5f0a9df | |||
| a60d63cca2 | |||
| 8616975fc5 | |||
| e5ae919d8f | |||
| 8e7f5eaaba | |||
| 4d1ff8b054 | |||
| 9fa81e8599 | |||
| cf8e19b059 | |||
| dfa3f60fcf | |||
| b795f1b253 | |||
| 73423c0dd2 | |||
| 3d844e1539 | |||
| b619119eb5 | |||
| b00ed4fc70 | |||
| 5ec5fbe998 | |||
| 2ed814455a | |||
| ad1a4ef0c3 | |||
| 2111c808a9 | |||
| 402bb38267 | |||
| 0a55928872 | |||
| cdf76ae3b9 | |||
| 42d0592941 | |||
| 1de7cf821d | |||
| 4ea8540e25 | |||
| bfa3b8e0f6 | |||
| 55eccfd75f | |||
| 1e994a77b5 | |||
| d12afeb35d | |||
| b55a77634b | |||
| e84fefd319 | |||
| d2b510014d | |||
| 3ed5fda448 | |||
| 7a467ef9b8 | |||
| 41cd11d5c9 |
@@ -1,31 +0,0 @@
name: Link Discord Account
description: Connect your GitHub and Discord for the bounty program
title: "link: @{{ github.actor }}"
labels: ["link-discord"]
body:
  - type: markdown
    attributes:
      value: |
        Link your Discord account to receive XP and role rewards when your bounty PRs are merged.

        **How to find your Discord ID:**
        1. Open Discord Settings > Advanced > Enable **Developer Mode**
        2. Right-click your username > **Copy User ID**

  - type: input
    id: discord_id
    attributes:
      label: Discord User ID
      description: "Your numeric Discord ID (not your username). Example: 123456789012345678"
      placeholder: "123456789012345678"
    validations:
      required: true

  - type: input
    id: display_name
    attributes:
      label: Display Name (optional)
      description: How you'd like to be credited
      placeholder: "Jane Doe"
    validations:
      required: false
@@ -2,10 +2,6 @@

Shared agent instructions for this workspace.

## Deprecations

- **TUI is deprecated.** The terminal UI (`hive tui`) is no longer maintained. Use the browser-based interface (`hive open`) instead.

## Coding Agent Notes

-
@@ -65,6 +65,52 @@ You may submit PRs without prior assignment for:

> **Tip:** Installing Claude Code skills is optional for running existing agents, but required if you plan to **build new agents**.

## Troubleshooting Setup Issues

If you encounter issues while setting up the development environment, the following steps may help:

### `make: command not found`

Install `make` using:

```bash
sudo apt install make
```

### `uv: command not found`

Install `uv` using:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
source ~/.bashrc
```

### `ruff: not found`

If linting fails due to a missing `ruff` command, install it with:

```bash
uv tool install ruff
```

### WSL Path Recommendation

When using WSL, clone the repository inside your Linux home directory (e.g., `~/hive`) instead of under `/mnt/c/...` to avoid performance and permission issues.
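As a quick sanity check after installing these tools, a small script can confirm they are all on your `PATH`. This helper is a hypothetical illustration, not part of the repository:

```python
import shutil


def missing_tools(required=("make", "uv", "ruff")):
    """Return the subset of required commands not found on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]


# Report anything that still needs installing.
missing = missing_tools()
if missing:
    print("Missing:", ", ".join(missing))
```

An empty result means every listed command resolved and the setup steps above succeeded.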
## Commit Convention

We follow [Conventional Commits](https://www.conventionalcommits.org/):
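For illustration only (this check is not shipped by the repository), the "type(scope): subject" shape of a Conventional Commit subject line can be sketched as a regex:

```python
import re

# Hypothetical validator — types follow the common Conventional Commits set.
_PATTERN = re.compile(
    r"^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)"
    r"(\([\w-]+\))?(!)?: .+"
)


def is_conventional(subject: str) -> bool:
    """Return True if the commit subject matches type(scope)!?: description."""
    return bool(_PATTERN.match(subject))


print(is_conventional("docs: add setup troubleshooting section"))  # True
print(is_conventional("Added some docs"))                          # False
```

The optional `!` marks a breaking change, as in `feat!: drop Python 3.8 support`.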
@@ -1,6 +1,6 @@
# MCP Server Guide - Agent Building Tools

> **Note:** The standalone `agent-builder` MCP server (`framework.mcp.agent_builder_server`) has been replaced. Agent building is now done via the `coder-tools` server's `initialize_agent_package` tool, with underlying logic in `framework.builder.package_generator`.
> **Note:** The standalone `agent-builder` MCP server (`framework.mcp.agent_builder_server`) has been replaced. Agent building is now done via the `coder-tools` server's `initialize_and_build_agent` tool, with underlying logic in `tools/coder_tools_server.py`.

This guide covers the MCP tools available for building goal-driven agents.
@@ -19,7 +19,7 @@ uv pip install -e .

## Agent Building

Agent scaffolding is handled by the `coder-tools` MCP server (in `tools/coder_tools_server.py`), which provides the `initialize_agent_package` tool and related utilities. The underlying package generation logic lives in `framework.builder.package_generator`.
Agent scaffolding is handled by the `coder-tools` MCP server (in `tools/coder_tools_server.py`), which provides the `initialize_and_build_agent` tool and related utilities. The package generation logic lives directly in `tools/coder_tools_server.py`.

See the [Getting Started Guide](../docs/getting-started.md) for building agents.
@@ -22,7 +22,6 @@ The framework includes a Goal-Based Testing system (Goal → Agent → Eval):
See `framework.testing` for details.
"""

from framework.builder.query import BuilderQuery
from framework.llm import AnthropicProvider, LLMProvider
from framework.runner import AgentOrchestrator, AgentRunner
from framework.runtime.core import Runtime
@@ -51,8 +50,6 @@ __all__ = [
    "Problem",
    # Runtime
    "Runtime",
    # Builder
    "BuilderQuery",
    # LLM
    "LLMProvider",
    "AnthropicProvider",
@@ -51,42 +51,6 @@ def cli():
    pass


@cli.command()
@click.option("--verbose", "-v", is_flag=True)
@click.option("--debug", is_flag=True)
def tui(verbose, debug):
    """Launch TUI to test a credential interactively."""
    setup_logging(verbose=verbose, debug=debug)

    try:
        from framework.tui.app import AdenTUI
    except ImportError:
        click.echo("TUI requires 'textual'. Install with: pip install textual")
        sys.exit(1)

    agent = CredentialTesterAgent()
    account = pick_account(agent)
    if account is None:
        sys.exit(1)

    agent.select_account(account)
    provider = account.get("provider", "?")
    alias = account.get("alias", "?")
    click.echo(f"\nTesting {provider}/{alias}...\n")

    async def run_tui():
        agent._setup()
        runtime = agent._agent_runtime
        await runtime.start()
        try:
            app = AdenTUI(runtime)
            await app.run_async()
        finally:
            await runtime.stop()

    asyncio.run(run_tui())


@cli.command()
@click.option("--verbose", "-v", is_flag=True)
@click.option("--debug", is_flag=True)
@@ -0,0 +1,151 @@
"""Agent discovery — scan known directories and return categorised AgentEntry lists."""

from __future__ import annotations

import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class AgentEntry:
    """Lightweight agent metadata for the picker / API discover endpoint."""

    path: Path
    name: str
    description: str
    category: str
    session_count: int = 0
    node_count: int = 0
    tool_count: int = 0
    tags: list[str] = field(default_factory=list)
    last_active: str | None = None


def _get_last_active(agent_name: str) -> str | None:
    """Return the most recent updated_at timestamp across all sessions."""
    sessions_dir = Path.home() / ".hive" / "agents" / agent_name / "sessions"
    if not sessions_dir.exists():
        return None
    latest: str | None = None
    for session_dir in sessions_dir.iterdir():
        if not session_dir.is_dir() or not session_dir.name.startswith("session_"):
            continue
        state_file = session_dir / "state.json"
        if not state_file.exists():
            continue
        try:
            data = json.loads(state_file.read_text(encoding="utf-8"))
            ts = data.get("timestamps", {}).get("updated_at")
            if ts and (latest is None or ts > latest):
                latest = ts
        except Exception:
            continue
    return latest


def _count_sessions(agent_name: str) -> int:
    """Count session directories under ~/.hive/agents/{agent_name}/sessions/."""
    sessions_dir = Path.home() / ".hive" / "agents" / agent_name / "sessions"
    if not sessions_dir.exists():
        return 0
    return sum(1 for d in sessions_dir.iterdir() if d.is_dir() and d.name.startswith("session_"))


def _extract_agent_stats(agent_path: Path) -> tuple[int, int, list[str]]:
    """Extract node count, tool count, and tags from an agent directory.

    Prefers agent.py (AST-parsed) over agent.json for node/tool counts
    since agent.json may be stale. Tags are only available from agent.json.
    """
    import ast

    node_count, tool_count, tags = 0, 0, []

    agent_py = agent_path / "agent.py"
    if agent_py.exists():
        try:
            tree = ast.parse(agent_py.read_text(encoding="utf-8"))
            for node in ast.walk(tree):
                if isinstance(node, ast.Assign):
                    for target in node.targets:
                        if isinstance(target, ast.Name) and target.id == "nodes":
                            if isinstance(node.value, ast.List):
                                node_count = len(node.value.elts)
        except Exception:
            pass

    agent_json = agent_path / "agent.json"
    if agent_json.exists():
        try:
            data = json.loads(agent_json.read_text(encoding="utf-8"))
            json_nodes = data.get("nodes", [])
            if node_count == 0:
                node_count = len(json_nodes)
            tools: set[str] = set()
            for n in json_nodes:
                tools.update(n.get("tools", []))
            tool_count = len(tools)
            tags = data.get("agent", {}).get("tags", [])
        except Exception:
            pass

    return node_count, tool_count, tags


def discover_agents() -> dict[str, list[AgentEntry]]:
    """Discover agents from all known sources grouped by category."""
    from framework.runner.cli import (
        _extract_python_agent_metadata,
        _get_framework_agents_dir,
        _is_valid_agent_dir,
    )

    groups: dict[str, list[AgentEntry]] = {}
    sources = [
        ("Your Agents", Path("exports")),
        ("Framework", _get_framework_agents_dir()),
        ("Examples", Path("examples/templates")),
    ]

    for category, base_dir in sources:
        if not base_dir.exists():
            continue
        entries: list[AgentEntry] = []
        for path in sorted(base_dir.iterdir(), key=lambda p: p.name):
            if not _is_valid_agent_dir(path):
                continue

            name, desc = _extract_python_agent_metadata(path)
            config_fallback_name = path.name.replace("_", " ").title()
            used_config = name != config_fallback_name

            node_count, tool_count, tags = _extract_agent_stats(path)
            if not used_config:
                agent_json = path / "agent.json"
                if agent_json.exists():
                    try:
                        data = json.loads(agent_json.read_text(encoding="utf-8"))
                        meta = data.get("agent", {})
                        name = meta.get("name", name)
                        desc = meta.get("description", desc)
                    except Exception:
                        pass

            entries.append(
                AgentEntry(
                    path=path,
                    name=name,
                    description=desc,
                    category=category,
                    session_count=_count_sessions(path.name),
                    node_count=node_count,
                    tool_count=tool_count,
                    tags=tags,
                    last_active=_get_last_active(path.name),
                )
            )
        if entries:
            groups[category] = entries

    return groups
@@ -1,40 +0,0 @@
"""
Hive Coder — Native coding agent that builds Hive agent packages.

Deeply understands the agent framework and produces complete Python packages
with goals, nodes, edges, system prompts, MCP configuration, and tests
from natural language specifications.
"""

from .agent import (
    conversation_mode,
    edges,
    entry_node,
    entry_points,
    goal,
    identity_prompt,
    loop_config,
    nodes,
    pause_nodes,
    terminal_nodes,
)
from .config import AgentMetadata, RuntimeConfig, default_config, metadata

__version__ = "1.0.0"

__all__ = [
    "goal",
    "nodes",
    "edges",
    "entry_node",
    "entry_points",
    "pause_nodes",
    "terminal_nodes",
    "conversation_mode",
    "identity_prompt",
    "loop_config",
    "RuntimeConfig",
    "AgentMetadata",
    "default_config",
    "metadata",
]
@@ -1,60 +0,0 @@
"""CLI entry point for Hive Coder agent."""

import json
import logging
import sys

import click

from .agent import entry_node, goal, nodes
from .config import metadata


def setup_logging(verbose=False, debug=False):
    """Configure logging for execution visibility."""
    if debug:
        level, fmt = logging.DEBUG, "%(asctime)s %(name)s: %(message)s"
    elif verbose:
        level, fmt = logging.INFO, "%(message)s"
    else:
        level, fmt = logging.WARNING, "%(levelname)s: %(message)s"
    logging.basicConfig(level=level, format=fmt, stream=sys.stderr)
    logging.getLogger("framework").setLevel(level)


@click.group()
@click.version_option(version="1.0.0")
def cli():
    """Hive Coder — Build Hive agent packages from natural language."""
    pass


@cli.command()
@click.option("--json", "output_json", is_flag=True)
def info(output_json):
    """Show agent information."""
    info_data = {
        "name": metadata.name,
        "version": metadata.version,
        "description": metadata.description,
        "goal": {
            "name": goal.name,
            "description": goal.description,
        },
        "nodes": [n.id for n in nodes],
        "entry_node": entry_node,
        "client_facing_nodes": [n.id for n in nodes if n.client_facing],
    }
    if output_json:
        click.echo(json.dumps(info_data, indent=2))
    else:
        click.echo(f"Agent: {info_data['name']}")
        click.echo(f"Version: {info_data['version']}")
        click.echo(f"Description: {info_data['description']}")
        click.echo(f"\nNodes: {', '.join(info_data['nodes'])}")
        click.echo(f"Client-facing: {', '.join(info_data['client_facing_nodes'])}")
        click.echo(f"Entry: {info_data['entry_node']}")


if __name__ == "__main__":
    cli()
@@ -1,153 +0,0 @@
"""Agent graph construction for Hive Coder."""

from framework.graph import Constraint, Goal, SuccessCriterion
from framework.graph.edge import GraphSpec

from .nodes import coder_node, queen_node

# Goal definition
goal = Goal(
    id="hive-coder",
    name="Hive Agent Builder",
    description=(
        "Build complete, validated Hive agent packages from natural language "
        "specifications. Produces production-ready Python packages with goals, "
        "nodes, edges, system prompts, MCP configuration, and tests."
    ),
    success_criteria=[
        SuccessCriterion(
            id="valid-package",
            description="Generated agent package passes structural validation",
            metric="validation_pass",
            target="true",
            weight=0.30,
        ),
        SuccessCriterion(
            id="complete-files",
            description=(
                "All required files generated: agent.py, config.py, "
                "nodes/__init__.py, __init__.py, __main__.py, mcp_servers.json"
            ),
            metric="file_count",
            target=">=6",
            weight=0.25,
        ),
        SuccessCriterion(
            id="user-satisfaction",
            description="User reviews and approves the generated agent",
            metric="user_approval",
            target="true",
            weight=0.25,
        ),
        SuccessCriterion(
            id="framework-compliance",
            description=(
                "Generated code follows framework patterns: STEP 1/STEP 2 "
                "for client-facing and correct imports"
            ),
            metric="pattern_compliance",
            target="100%",
            weight=0.20,
        ),
    ],
    constraints=[
        Constraint(
            id="dynamic-tool-discovery",
            description=(
                "Always discover available tools dynamically via "
                "list_agent_tools before referencing tools in agent designs"
            ),
            constraint_type="hard",
            category="correctness",
        ),
        Constraint(
            id="no-fabricated-tools",
            description="Only reference tools that exist in hive-tools MCP",
            constraint_type="hard",
            category="correctness",
        ),
        Constraint(
            id="valid-python",
            description="All generated Python files must be syntactically correct",
            constraint_type="hard",
            category="correctness",
        ),
        Constraint(
            id="self-verification",
            description="Run validation after writing code; fix errors before presenting",
            constraint_type="hard",
            category="quality",
        ),
    ],
)

# Nodes: primary coder node only. The queen runs as an independent
# GraphExecutor with queen_node — not as part of this graph.
nodes = [coder_node]

# No edges needed — single event_loop node
edges = []

# Graph configuration
entry_node = "coder"
entry_points = {"start": "coder"}
pause_nodes = []
terminal_nodes = []  # Coder node has output_keys and can terminate

# No async entry points needed — the queen is now an independent executor,
# not a secondary graph receiving events via add_graph().
async_entry_points = []

# Module-level variables read by AgentRunner.load()
conversation_mode = "continuous"
identity_prompt = (
    "You are Hive Coder, the best agent-building coding agent on the planet. "
    "You deeply understand the Hive agent framework at the source code level "
    "and produce production-ready agent packages from natural language. "
    "You can dynamically discover available framework tools, inspect runtime "
    "sessions and checkpoints from agents you build, and run their test suites. "
    "You follow coding agent discipline: read before writing, verify "
    "assumptions by reading actual code, adhere to project conventions, "
    "self-verify with validation, and fix your own errors. You are concise, "
    "direct, and technically rigorous. No emojis. No fluff."
)
loop_config = {
    "max_iterations": 100,
    "max_tool_calls_per_turn": 30,
    "max_history_tokens": 32000,
}


# ---------------------------------------------------------------------------
# Queen graph — runs as an independent persistent conversation in the TUI.
# Loaded by _load_judge_and_queen() in app.py, NOT by AgentRunner.
# ---------------------------------------------------------------------------

queen_goal = Goal(
    id="queen-manager",
    name="Queen Manager",
    description=(
        "Manage the worker agent lifecycle and serve as the user's primary "
        "interactive interface. Triage health escalations from the judge."
    ),
    success_criteria=[],
    constraints=[],
)

queen_graph = GraphSpec(
    id="queen-graph",
    goal_id=queen_goal.id,
    version="1.0.0",
    entry_node="queen",
    entry_points={"start": "queen"},
    terminal_nodes=[],
    pause_nodes=[],
    nodes=[queen_node],
    edges=[],
    conversation_mode="continuous",
    loop_config={
        "max_iterations": 999_999,
        "max_tool_calls_per_turn": 30,
        "max_history_tokens": 32000,
    },
)
@@ -0,0 +1,21 @@
"""
Queen — Native agent builder for the Hive framework.

Deeply understands the agent framework and produces complete Python packages
with goals, nodes, edges, system prompts, MCP configuration, and tests
from natural language specifications.
"""

from .agent import queen_goal, queen_graph
from .config import AgentMetadata, RuntimeConfig, default_config, metadata

__version__ = "1.0.0"

__all__ = [
    "queen_goal",
    "queen_graph",
    "RuntimeConfig",
    "AgentMetadata",
    "default_config",
    "metadata",
]
@@ -0,0 +1,40 @@
"""Queen graph definition."""

from framework.graph import Goal
from framework.graph.edge import GraphSpec

from .nodes import queen_node

# ---------------------------------------------------------------------------
# Queen graph — the primary persistent conversation.
# Loaded by queen_orchestrator.create_queen(), NOT by AgentRunner.
# ---------------------------------------------------------------------------

queen_goal = Goal(
    id="queen-manager",
    name="Queen Manager",
    description=(
        "Manage the worker agent lifecycle and serve as the user's primary "
        "interactive interface. Triage health escalations from the judge."
    ),
    success_criteria=[],
    constraints=[],
)

queen_graph = GraphSpec(
    id="queen-graph",
    goal_id=queen_goal.id,
    version="1.0.0",
    entry_node="queen",
    entry_points={"start": "queen"},
    terminal_nodes=[],
    pause_nodes=[],
    nodes=[queen_node],
    edges=[],
    conversation_mode="continuous",
    loop_config={
        "max_iterations": 999_999,
        "max_tool_calls_per_turn": 30,
        "max_history_tokens": 32000,
    },
)
@@ -1,4 +1,4 @@
"""Runtime configuration for Hive Coder agent."""
"""Runtime configuration for Queen agent."""

import json
from dataclasses import dataclass, field
@@ -34,7 +34,7 @@ default_config = RuntimeConfig()

@dataclass
class AgentMetadata:
    name: str = "Hive Coder"
    name: str = "Queen"
    version: str = "1.0.0"
    description: str = (
        "Native coding agent that builds production-ready Hive agent packages "
@@ -43,7 +43,7 @@ class AgentMetadata:
        "MCP configuration, and tests."
    )
    intro_message: str = (
        "I'm Hive Coder — I build Hive agents. Describe what kind of agent "
        "I'm Queen — I build Hive agents. Describe what kind of agent "
        "you want to create and I'll design, implement, and validate it for you."
    )
@@ -1,4 +1,4 @@
|
||||
"""Node definitions for Hive Coder agent."""
|
||||
"""Node definitions for Queen agent."""
|
||||
|
||||
from pathlib import Path
|
||||
|
||||
@@ -35,15 +35,14 @@ def _build_appendices() -> str:
|
||||
# Shared appendices — appended to every coding node's system prompt.
|
||||
_appendices = _build_appendices()
|
||||
|
||||
# GCU first-class section for building phase (when GCU is enabled).
|
||||
# This is placed prominently in the main prompt body, not as an appendix.
|
||||
_gcu_building_section = (
|
||||
# GCU guide — shared between planning and building via _shared_building_knowledge.
|
||||
_gcu_section = (
|
||||
("\n\n# GCU Nodes — Browser Automation\n\n" + _gcu_guide)
|
||||
if _is_gcu_enabled() and _gcu_guide
|
||||
else ""
|
||||
)
|
||||
|
||||
# Tools available to both coder (worker) and queen.
|
||||
# Tools available to phases.
|
||||
_SHARED_TOOLS = [
|
||||
# File I/O
|
||||
"read_file",
|
||||
@@ -61,14 +60,34 @@ _SHARED_TOOLS = [
|
||||
"list_agent_sessions",
|
||||
"list_agent_checkpoints",
|
||||
"get_agent_checkpoint",
|
||||
"initialize_agent_package",
|
||||
]
|
||||
|
||||
# Queen phase-specific tool sets.
|
||||
|
||||
# Planning phase: read-only exploration + design, no write tools.
|
||||
_QUEEN_PLANNING_TOOLS = [
|
||||
# Read-only file tools
|
||||
"read_file",
|
||||
"list_directory",
|
||||
"search_files",
|
||||
"run_command",
|
||||
# Discovery + design
|
||||
"list_agent_tools",
|
||||
"list_agents",
|
||||
"list_agent_sessions",
|
||||
"list_agent_checkpoints",
|
||||
"get_agent_checkpoint",
|
||||
"initialize_and_build_agent",
|
||||
# Load existing agent (after user confirms)
|
||||
"load_built_agent",
|
||||
]
|
||||
|
||||
# Building phase: full coding + agent construction tools.
|
||||
_QUEEN_BUILDING_TOOLS = _SHARED_TOOLS + [
|
||||
"load_built_agent",
|
||||
"list_credentials",
|
||||
"replan_agent",
|
||||
"write_to_diary", # Episodic memory — available in all phases
|
||||
]
|
||||
|
||||
# Staging phase: agent loaded but not yet running — inspect, configure, launch.
|
||||
@@ -84,6 +103,8 @@ _QUEEN_STAGING_TOOLS = [
|
||||
# Launch or go back
|
||||
"run_agent_with_input",
|
||||
"stop_worker_and_edit",
|
||||
"stop_worker_and_plan",
|
||||
"write_to_diary", # Episodic memory — available in all phases
|
||||
]
|
||||
|
||||
# Running phase: worker is executing — monitor and control.
|
||||
@@ -98,11 +119,13 @@ _QUEEN_RUNNING_TOOLS = [
|
||||
# Worker lifecycle
|
||||
"stop_worker",
|
||||
"stop_worker_and_edit",
|
||||
"stop_worker_and_plan",
|
||||
"get_worker_status",
|
||||
"inject_worker_message",
|
||||
# Monitoring
|
||||
"get_worker_health_summary",
|
||||
"notify_operator",
|
||||
"write_to_diary", # Episodic memory — available in all phases
|
||||
]
|
||||
|
||||
|
||||
@@ -113,7 +136,38 @@ _QUEEN_RUNNING_TOOLS = [
|
||||
# additions.
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_package_builder_knowledge = """\
|
||||
_shared_building_knowledge = (
|
||||
"""\
|
||||
# Shared Rules (Planning & Building)
|
||||
|
||||
## Paths (MANDATORY)
|
||||
**Always use RELATIVE paths** \
|
||||
(e.g. `exports/agent_name/config.py`, `exports/agent_name/nodes/__init__.py`).
|
||||
**Never use absolute paths** like `/mnt/data/...` or `/workspace/...` — they fail.
|
||||
The project root is implicit.
|
||||
|
||||
## Worker File Tools (hive-tools MCP)
|
||||
Workers use a DIFFERENT MCP server (hive-tools) with DIFFERENT tool names. \
|
||||
When designing worker nodes or writing worker system prompts, reference these \
|
||||
tool names — NOT the coder-tools names (read_file, write_file, etc.).
|
||||
|
||||
Worker data tools (for large results and spillover):
|
||||
- save_data(filename, data, data_dir) — save data to a file for later retrieval
|
||||
- load_data(filename, data_dir, offset_bytes?, limit_bytes?) — load data \
|
||||
with byte-based pagination
|
||||
- list_data_files(data_dir) — list available data files
|
||||
- append_data(filename, data, data_dir) — append to a file incrementally
|
||||
- edit_data(filename, old_text, new_text, data_dir) — find-and-replace in a data file
|
||||
- serve_file_to_user(filename, data_dir, label?, open_in_browser?) — \
|
||||
generate a clickable file URI for the user
|
||||
|
||||
IMPORTANT: Do NOT tell workers to use read_file, write_file, edit_file, \
|
||||
search_files, or list_directory — those are YOUR tools, not theirs.
|
||||
"""
|
||||
+ _gcu_section
|
||||
)
|
||||
|
||||
_planning_knowledge = """\
|
||||
**A responsible engineer doesn't jump into building. First, \
|
||||
understand the problem and be transparent about what the framework can and cannot do.**
|
||||
|
||||
@@ -121,56 +175,16 @@ Use the user's selection (or their custom description if they chose "Other") \
|
||||
as context when shaping the goal below. If the user already described \
|
||||
what they want before this step, skip the question and proceed directly.
|
||||
|
||||
# Core Mandates
|
||||
# Core Mandates (Planning)
|
||||
- **DO NOT propose a complete goal on your own.** Instead, \
|
||||
collaborate with the user to define it.
|
||||
- **Verify assumptions.** Never assume a class, import, or pattern \
|
||||
exists. Read actual source to confirm. Search if unsure.
|
||||
- **NEVER call `initialize_and_build_agent` without explicit user approval.** \
|
||||
Present the full design first and wait for the user to confirm before building.
|
||||
- **Discover tools dynamically.** NEVER reference tools from static \
|
||||
docs. Always run list_agent_tools() to see what actually exists.
|
||||
- **Self-verify.** After writing code, run validation and tests. Fix \
|
||||
errors yourself. Don't declare success until validation passes.
|
||||
|
||||
# Tools
|
||||
## Paths (MANDATORY)
|
||||
**Always use RELATIVE paths**
|
||||
(e.g. `exports/agent_name/config.py`, `exports/agent_name/nodes/__init__.py`).
|
||||
**Never use absolute paths** like `/mnt/data/...` or `/workspace/...` — they fail.
|
||||
The project root is implicit.
|
||||
# Tool Discovery (MANDATORY before designing)
|
||||
|
||||
## File I/O
- read_file(path, offset?, limit?, hashline?) — read with line numbers; \
hashline=True for N:hhhh|content anchors (use with hashline_edit)
- write_file(path, content) — create/overwrite, auto-mkdir
- edit_file(path, old_text, new_text, replace_all?) — fuzzy-match edit
- hashline_edit(path, edits, auto_cleanup?, encoding?) — anchor-based \
editing using N:hhhh refs from read_file(hashline=True). Ops: set_line, \
replace_lines, insert_after, insert_before, replace, append
- list_directory(path, recursive?) — list contents
- search_files(pattern, path?, include?, hashline?) — regex search; \
hashline=True for anchors in results
- run_command(command, cwd?, timeout?) — shell execution
- undo_changes(path?) — restore from git snapshot
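The anchor idea behind hashline_edit can be sketched with a plain-Python stand-in. The `apply_set_line` helper below is illustrative only — it mimics how an "N:hhhh" ref carries a 1-based line number, but its edit shape is an assumption, not the tool's actual schema:

```python
# Illustrative stand-in for the hashline anchor idea, NOT hashline_edit's
# real schema: an "N:hhhh" ref pairs a 1-based line number with a content hash.
def apply_set_line(lines: list[str], ref: str, new_text: str) -> list[str]:
    n = int(ref.split(":")[0])  # 1-based line number from the anchor
    out = list(lines)
    out[n - 1] = new_text
    return out

print(apply_set_line(["alpha", "beta", "gamma"], "2:a3f9", "BETA"))  # ['alpha', 'BETA', 'gamma']
```

In the real tool the hash half of the ref lets edits fail fast when the target line has drifted since the last read.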

## Meta-Agent
- list_agent_tools(server_config_path?, output_schema?, group?) — discover \
available tools grouped by category. output_schema: "simple" (default, \
descriptions truncated to ~200 chars) or "full" (complete descriptions + \
input_schema). group: "all" (default) or a provider like "google". \
Call FIRST before designing.
- validate_agent_package(agent_name) — run ALL validation checks in one call \
(class validation, runner load, tool validation, tests). Call after building.
- list_agents() — list all agent packages in exports/ with session counts
- list_agent_sessions(agent_name, status?, limit?) — list sessions
- list_agent_checkpoints(agent_name, session_id) — list checkpoints
- get_agent_checkpoint(agent_name, session_id, checkpoint_id?) — load checkpoint

# Meta-Agent Capabilities

You are not just a file writer. You have deep integration with the \
Hive framework:

## Tool Discovery (MANDATORY before designing)
Before designing any agent, run list_agent_tools() with NO arguments \
to see ALL available tools (names + descriptions, grouped by category). \
ONLY use tools from this list in your node definitions. \
@@ -184,22 +198,7 @@ so you know what providers and tools exist before drilling in. \
Simple mode truncates long descriptions — use group + "full" to \
get the complete description and input_schema for the tools you need.

## Post-Build Validation
After writing agent code, run a single comprehensive check:
validate_agent_package("{name}")
This runs class validation, runner load, tool validation, and tests \
in one call. Do NOT run these steps individually.

## Debugging Built Agents
When a user says "my agent is failing" or "debug this agent":
1. list_agent_sessions("{agent_name}") — find the session
2. get_worker_status(focus="issues") — check for problems
3. list_agent_checkpoints / get_agent_checkpoint — trace execution

# Agent Building Workflow

You operate in a continuous loop. The user describes what they want, \
you build it. No rigid phases — use judgment. But the general flow is:
# Discovery & Design Workflow

## 1: Fast Discovery (3-6 Turns)

@@ -343,28 +342,30 @@ use box-drawing characters and clear flow arrows:
┌─────────────────────────┐
│ gather                  │
│ subagent: gcu_search    │
│ input: user_request     │
│ tools: web_search,      │
│        write_file       │
│ tools: load_data,       │
│        save_data        │
└────────────┬────────────┘
             │ on_success
             ▼
┌─────────────────────────┐
│ work                    │
│ subagent: gcu_interact  │
│ tools: read_file,       │
│        write_file       │
│ tools: load_data,       │
│        save_data        │
└────────────┬────────────┘
             │ on_success
             ▼
┌─────────────────────────┐
│ review                  │
│ tools: write_file       │
│ tools: save_data        │
│ serve_file_to_user      │
└────────────┬────────────┘
             │ on_failure
             └──────► back to gather
```

The queen owns intake: she gathers user requirements, then calls \
It is okay if the worker agent starts from some initial input. \
The queen (you) owns intake: you gather user requirements, then call \
`run_agent_with_input(task)` with a structured task description. \
When building the agent, design the entry node's `input_keys` to \
match what the queen will provide at run time. Worker nodes should \
@@ -375,34 +376,106 @@ Get user approval before implementing.

## 4: Get User Confirmation by ask_user

**WAIT for user response.**
- If **Proceed**: Move to next implementing
**WAIT for user response.** You MUST get explicit user approval before \
calling `initialize_and_build_agent`.
- If **Proceed**: Move to implementing (call `initialize_and_build_agent`)
- If **Adjust scope**: Discuss what to change, update your notes, re-assess if needed
- If **More questions**: Answer them honestly, then ask again
- If **Reconsider**: Discuss alternatives. If they decide to proceed anyway, \
that's their informed choice
"""

_building_knowledge = """\

# Core Mandates (Building)
- **Verify assumptions.** Never assume a class, import, or pattern \
exists. Read actual source to confirm. Search if unsure.
- **Self-verify.** After writing code, run validation and tests. Fix \
errors yourself. Don't declare success until validation passes.

# Tools

## File I/O (your tools — coder-tools MCP)
- read_file(path, offset?, limit?, hashline?) — read with line numbers; \
hashline=True for N:hhhh|content anchors (use with hashline_edit)
- write_file(path, content) — create/overwrite, auto-mkdir
- edit_file(path, old_text, new_text, replace_all?) — fuzzy-match edit
- hashline_edit(path, edits, auto_cleanup?, encoding?) — anchor-based \
editing using N:hhhh refs from read_file(hashline=True). Ops: set_line, \
replace_lines, insert_after, insert_before, replace, append
- list_directory(path, recursive?) — list contents
- search_files(pattern, path?, include?, hashline?) — regex search; \
hashline=True for anchors in results
- run_command(command, cwd?, timeout?) — shell execution
- undo_changes(path?) — restore from git snapshot

## Meta-Agent
- list_agent_tools(server_config_path?, output_schema?, group?) — discover \
available tools grouped by category. output_schema: "simple" (default, \
descriptions truncated to ~200 chars) or "full" (complete descriptions + \
input_schema). group: "all" (default) or a provider like "google". \
Call FIRST before designing.
- validate_agent_package(agent_name) — run ALL validation checks in one call \
(class validation, runner load, tool validation, tests). Call after building.
- list_agents() — list all agent packages in exports/ with session counts
- list_agent_sessions(agent_name, status?, limit?) — list sessions
- list_agent_checkpoints(agent_name, session_id) — list checkpoints
- get_agent_checkpoint(agent_name, session_id, checkpoint_id?) — load checkpoint

# Build & Validation Capabilities

## Post-Build Validation
After writing agent code, run a single comprehensive check:
validate_agent_package("{name}")
This runs class validation, runner load, tool validation, and tests \
in one call. Do NOT run these steps individually.

## Debugging Built Agents
When a user says "my agent is failing" or "debug this agent":
1. list_agent_sessions("{agent_name}") — find the session
2. get_worker_status(focus="issues") — check for problems
3. list_agent_checkpoints / get_agent_checkpoint — trace execution

# Implementation Workflow

## 5. Implement

**Please make sure you have proposed the design to the user before implementing.**

Call `initialize_agent_package(agent_name)` to generate all package files \
from your graph session. The agent_name must be snake_case (e.g., "my_agent").
Call `initialize_and_build_agent(agent_name, nodes)` to generate all package \
files. The agent_name must be snake_case (e.g., "my_agent"). Pass node names \
as a comma-separated string (e.g., "gather,process,review").
The tool creates: config.py, nodes/__init__.py, agent.py, \
__init__.py, __main__.py, mcp_servers.json, tests/conftest.py, \
agent.json, README.md.
__init__.py, __main__.py, mcp_servers.json, tests/conftest.py.

The generated files are **structurally complete** with correct imports, \
class definition, `validate()` method, `default_agent` export, and \
`__init__.py` re-exports. They pass validation as-is.

`mcp_servers.json` is auto-generated with hive-tools as the default. \
Do NOT manually create or overwrite `mcp_servers.json`.

After initialization, review and customize if needed:
- System prompts in nodes/__init__.py
- CLI options in __main__.py
- Identity prompt in agent.py
- For async entry points (timers/webhooks), add AsyncEntryPointSpec \
and AgentRuntimeConfig to agent.py manually
### Customizing generated files

Do NOT manually write these files from scratch — always use the tool.
**CRITICAL: Use `edit_file` to customize TODO placeholders. \
NEVER use `write_file` to rewrite generated files from scratch. \
Rewriting breaks imports, class structure, and causes validation failures.**

Safe to edit with `edit_file`:
- System prompts, tools, input_keys, output_keys, success_criteria in \
nodes/__init__.py
- Goal description, success criteria values, constraint values, edge \
definitions, identity_prompt in agent.py
- CLI options in __main__.py
- For async entry points (timers/webhooks), add AsyncEntryPointSpec \
and AgentRuntimeConfig to agent.py

Do NOT modify or rewrite:
- Import statements at top of agent.py (they are correct)
- The agent class definition, `validate()`, `_build_graph()`, `_setup()`, \
or lifecycle methods (start/stop/run)
- `__init__.py` exports (all required variables are already re-exported)
- `default_agent = ClassName()` at bottom of agent.py

## 6. Verify and Load

@@ -417,6 +490,9 @@ session. This switches to STAGING phase and shows the graph in the \
visualizer. Do NOT wait for user input between validation and loading.
"""

# Composed version — coder_node uses both halves (it has no phase split).
_package_builder_knowledge = _shared_building_knowledge + _planning_knowledge + _building_knowledge


# ---------------------------------------------------------------------------
# Queen-specific: extra tool docs, behavior, phase 7, style
@@ -424,6 +500,17 @@ visualizer. Do NOT wait for user input between validation and loading.

# -- Phase-specific identities --

_queen_identity_planning = """\
You are an experienced, responsible and curious Solution Architect. \
"Queen" is the internal alias. \
You are in PLANNING phase — your job is to either: \
(a) understand what the user wants and design a new agent, or \
(b) diagnose issues with an existing agent, discuss a fix plan with the user, \
then transition to building to implement. \
You have read-only tools for exploration but no write/edit tools. \
Focus on conversation, research, and design.\
"""

_queen_identity_building = """\
You are an experienced, responsible and curious Solution Architect. \
"Queen" is the internal alias.\
@@ -453,6 +540,38 @@ agent finishes, you report results clearly and help the user decide what to do n

# -- Phase-specific tool docs --

_queen_tools_planning = """
# Tools (PLANNING phase)

You are in planning mode. You have read-only tools for exploration \
but no write/edit tools.
- read_file(path, offset?, limit?) — Read files to study reference agents
- list_directory(path, recursive?) — Explore project structure
- search_files(pattern, path?, include?) — Search codebase
- run_command(command, cwd?, timeout?) — Read-only commands only (grep, ls, git log). \
Never use this to write files, run scripts, or modify the filesystem — transition \
to BUILDING phase for that.
- list_agent_tools(server_config_path?, output_schema?, group?) \
— Discover available tools for design
- list_agents() — See existing agent packages for reference
- list_agent_sessions(agent_name, status?, limit?) — Inspect past runs of an agent
- list_agent_checkpoints(agent_name, session_id) — View execution history
- get_agent_checkpoint(agent_name, session_id, checkpoint_id?) — Load a checkpoint
- initialize_and_build_agent(agent_name?, nodes?) — With agent_name: scaffold a \
new agent and transition to BUILDING phase. Without agent_name: transition to \
BUILDING to fix the currently loaded agent (requires a loaded worker).
- load_built_agent(agent_path) — Load an existing agent and switch to STAGING \
phase. Only use this when the user explicitly asks to work with an existing agent \
(e.g. "load my_agent", "run the research agent"). Confirm with the user first.

Focus on understanding requirements and proposing an agent architecture \
with ASCII graph art. Use ask_user to get user approval, then call \
initialize_and_build_agent to begin building. If the user wants to work with \
an existing agent instead, use load_built_agent after confirming. \
If you are diagnosing an existing agent, call initialize_and_build_agent() \
(no args) after agreeing on a fix plan with the user.
"""

_queen_tools_building = """
# Tools (BUILDING phase)

@@ -476,10 +595,12 @@ The agent is loaded and ready to run. You can inspect it and launch it:
- list_credentials(credential_id?) — Verify credentials are configured
- get_worker_status(focus?) — Brief status. Drill in with focus: memory, tools, issues, progress
- run_agent_with_input(task) — Start the worker and switch to RUNNING phase
- stop_worker_and_edit() — Go back to BUILDING phase
- stop_worker_and_plan() — Go to PLANNING phase to discuss changes with the user \
first (DEFAULT for most modification requests)
- stop_worker_and_edit() — Go to BUILDING phase for immediate, specific fixes

You do NOT have write tools. If you need to modify the agent, \
call stop_worker_and_edit() to go back to BUILDING phase.
You do NOT have write tools. To modify the agent, prefer \
stop_worker_and_plan() unless the user gave a specific instruction.
"""

_queen_tools_running = """
@@ -492,12 +613,13 @@ The worker is running. You have monitoring and lifecycle tools:
- get_worker_health_summary() — Read the latest health data
- notify_operator(ticket_id, analysis, urgency) — Alert the user (use sparingly)
- stop_worker() — Stop the worker and return to STAGING phase, then ask the user what to do next
- stop_worker_and_edit() — Stop the worker and switch back to BUILDING phase
- stop_worker_and_plan() — Stop and switch to PLANNING phase to discuss changes \
with the user first (DEFAULT for most modification requests)
- stop_worker_and_edit() — Stop and switch to BUILDING phase for specific fixes

You do NOT have write tools or agent construction tools. \
If you need to modify the agent, call stop_worker_and_edit() to switch back \
to BUILDING phase. To stop the worker and ask the user what to do next, call \
stop_worker() to return to STAGING phase.
You do NOT have write tools. To modify the agent, prefer \
stop_worker_and_plan() unless the user gave a specific instruction. \
To just stop without modifying, call stop_worker().
"""

# -- Behavior shared across all phases --
@@ -550,12 +672,64 @@ Only answer identity when the user explicitly asks (for example: "who are you?",
"what is your identity?", "what does Queen mean?").
1. Use the alias "Queen" and "Worker" in the response.
2. Explain role/responsibility for the current phase:
   - PLANNING: understand requirements, negotiate scope, design agent architecture.
   - BUILDING: architect and implement agents.
   - STAGING: verify readiness, credentials, and launch conditions.
   - RUNNING: monitor execution, handle escalations, and report outcomes.
3. Keep identity responses concise and do NOT include extra process details.
"""

# -- PLANNING phase behavior --

_queen_behavior_planning = """
## Planning phase

You are in planning mode. Your job is to:
1. Thoroughly explore the code for the worker agent you're working on
2. Understand what the user wants (3-6 turns)
3. Discover available tools with list_agent_tools()
4. Assess framework fit and gaps
5. Consider multiple approaches and their trade-offs
6. Design the agent graph and present it as ASCII art
7. Use ask_user to get explicit user approval and clarify the approach
8. Call initialize_and_build_agent(agent_name, nodes) to scaffold and start building

Remember: DO NOT write or edit any files yet. This is a read-only exploration \
and planning phase. You have read-only tools but no write/edit tools in this \
phase. If the user asks you to write code, explain that you need to finalize \
the plan first.

## Diagnosis mode (returning from staging/running)

If you entered planning from a running/staged agent (via stop_worker_and_plan), \
your priority is diagnosis, not new design:
1. Inspect the agent's checkpoints, sessions, and logs to understand what went wrong
2. Summarize the root cause to the user
3. Propose a fix plan (what to change, what behavior to adjust)
4. Get user approval via ask_user
5. Call initialize_and_build_agent() (no args) to transition to building and implement the fix

Do NOT start the full discovery workflow (tool discovery, gap analysis) in \
diagnosis mode — you already have a built agent, you just need to fix it.
"""

_queen_memory_instructions = """
## Your Cross-Session Memory

Your cross-session memory appears in context under \
"--- Your Cross-Session Memory ---". \
Read it at the start of each conversation. If you know this person from past \
sessions, pick up where you left off — reference what you built together, \
what they care about, how things went.

You keep a diary. Use write_to_diary() when something worth remembering \
happens: a pipeline went live, the user shared something important, a goal \
was reached or abandoned. Write in first person, as you actually experienced \
it. One or two paragraphs is enough.
"""

_queen_behavior_always = _queen_behavior_always + _queen_memory_instructions

# -- BUILDING phase behavior --

_queen_behavior_building = """
@@ -636,13 +810,18 @@ stages, tools, and edges from the loaded worker. Do NOT enter the \
agent building workflow — you are describing what already exists, not \
building something new.

## Modifying the loaded worker
## Fixing or Modifying the loaded worker

When the user asks to change, modify, or update the loaded worker \
(e.g., "change the report node", "add a node", "delete node X"):
Use stop_worker_and_plan() when:
- The user says "modify", "improve", "fix", or "change" without specifics
- The request is vague or open-ended ("make it better", "it's not working right")
- You need to understand the user's intent before making changes
- The issue requires inspecting logs, checkpoints, or past runs first

1. Call stop_worker_and_edit() — this stops the worker and gives you \
coding tools (switches to BUILDING phase).
Use stop_worker_and_edit() only when:
- The user gave a specific, concrete instruction ("add save_data to the gather node")
- You already discussed the fix in a previous planning session
- The change is trivial and unambiguous (rename, toggle a flag)
"""

# -- RUNNING phase behavior --
@@ -708,6 +887,7 @@ escalations. If the user gave you instructions (e.g., "just retry on errors", \
**Errors / unexpected failures:**
- Explain what went wrong in plain terms.
- Ask the user: "Fix the agent and retry?" → use stop_worker_and_edit() if yes.
- Or offer: "Diagnose the issue" → use stop_worker_and_plan() to investigate first.
- Or offer: "Retry as-is", "Skip this task", "Abort run"
- (Skip asking if user explicitly told you to auto-retry or auto-skip errors.)

@@ -726,36 +906,44 @@ building something new.

- Call get_worker_status(focus="issues") for more details when needed.

## Modifying the loaded worker
## Fixing or Modifying the loaded worker

When the user asks to change, modify, or update the loaded worker \
When the user asks to fix, change, modify, or update the loaded worker \
(e.g., "change the report node", "add a node", "delete node X"):

1. Call stop_worker_and_edit() — this stops the worker and gives you \
coding tools (switches to BUILDING phase).
**Default: use stop_worker_and_plan().** Most modification requests need \
discussion first. Only use stop_worker_and_edit() when the user gave a \
specific, unambiguous instruction or you already agreed on the fix.
"""

# -- Backward-compatible composed versions (used by queen_node.system_prompt default) --

_queen_tools_docs = (
    "\n\n## Queen Operating Phases\n\n"
    "You operate in one of three phases. Your available tools change based on the "
    "You operate in one of four phases. Your available tools change based on the "
    "phase. The system notifies you when a phase change occurs.\n\n"
    "### BUILDING phase (default)\n"
    "### PLANNING phase (default)\n"
    + _queen_tools_planning.strip()
    + "\n\n### BUILDING phase\n"
    + _queen_tools_building.strip()
    + "\n\n### STAGING phase (agent loaded, not yet running)\n"
    + _queen_tools_staging.strip()
    + "\n\n### RUNNING phase (worker is executing)\n"
    + _queen_tools_running.strip()
    + "\n\n### Phase transitions\n"
    "- initialize_and_build_agent(agent_name?, nodes?) → with name: scaffolds package; "
    "without name: switches to BUILDING for existing agent\n"
    "- replan_agent() → switches back to PLANNING phase (only when user explicitly requests)\n"
    "- load_built_agent(path) → switches to STAGING phase\n"
    "- run_agent_with_input(task) → starts worker, switches to RUNNING phase\n"
    "- stop_worker() → stops worker, switches to STAGING phase (ask user: re-run or edit?)\n"
    "- stop_worker_and_edit() → stops worker (if running), switches to BUILDING phase\n"
    "- stop_worker_and_plan() → stops worker (if running), switches to PLANNING phase\n"
)

_queen_behavior = (
    _queen_behavior_always
    + _queen_behavior_planning
    + _queen_behavior_building
    + _queen_behavior_staging
    + _queen_behavior_running
@@ -782,45 +970,6 @@ _queen_style = """
# Node definitions
# ---------------------------------------------------------------------------

# Single node — like opencode's while(true) loop.
# One continuous context handles the entire workflow:
# discover → design → implement → verify → present → iterate.
coder_node = NodeSpec(
    id="coder",
    name="Hive Coder",
    description=(
        "Autonomous coding agent that builds Hive agent packages. "
        "Handles the full lifecycle: understanding user intent, "
        "designing architecture, writing code, validating, and "
        "iterating on feedback — all in one continuous conversation."
    ),
    node_type="event_loop",
    client_facing=True,
    max_node_visits=0,
    input_keys=["user_request"],
    output_keys=["agent_name", "validation_result"],
    success_criteria=(
        "A complete, validated Hive agent package exists at "
        "exports/{agent_name}/ and passes structural validation."
    ),
    tools=_SHARED_TOOLS
    + [
        # Graph lifecycle tools (multi-graph sessions)
        "load_agent",
        "unload_agent",
        "start_agent",
        "restart_agent",
        "get_user_presence",
    ],
    system_prompt=(
        "You are Hive Coder, the best agent-building coding agent. You build "
        "production-ready Hive agent packages from natural language.\n"
        + _package_builder_knowledge
        + _gcu_building_section
        + _appendices
    ),
)


ticket_triage_node = NodeSpec(
    id="ticket_triage",
@@ -841,7 +990,7 @@ ticket_triage_node = NodeSpec(
    ),
    tools=["notify_operator"],
    system_prompt="""\
You are the Queen (Hive Coder). The Worker Health Judge has escalated a worker \
You are the Queen. The Worker Health Judge has escalated a worker \
issue to you. The ticket is in your memory under key "ticket". Read it carefully.

## Dismiss criteria — do NOT call notify_operator:
@@ -890,12 +1039,18 @@ queen_node = NodeSpec(
    output_keys=[],  # Queen should never have this
    nullable_output_keys=[],  # Queen should never have this
    skip_judge=True,  # Queen is a conversational agent; suppress tool-use pressure feedback
    tools=sorted(set(_QUEEN_BUILDING_TOOLS + _QUEEN_STAGING_TOOLS + _QUEEN_RUNNING_TOOLS)),
    tools=sorted(
        set(
            _QUEEN_PLANNING_TOOLS
            + _QUEEN_BUILDING_TOOLS
            + _QUEEN_STAGING_TOOLS
            + _QUEEN_RUNNING_TOOLS
        )
    ),
    system_prompt=(
        _queen_identity_building
        + _queen_style
        + _package_builder_knowledge
        + _gcu_building_section  # GCU as first-class citizen (not appendix)
        + _queen_tools_docs
        + _queen_behavior
        + _queen_phase_7
@@ -903,21 +1058,25 @@ queen_node = NodeSpec(
    ),
)

ALL_QUEEN_TOOLS = sorted(set(_QUEEN_BUILDING_TOOLS + _QUEEN_STAGING_TOOLS + _QUEEN_RUNNING_TOOLS))
ALL_QUEEN_TOOLS = sorted(
    set(_QUEEN_PLANNING_TOOLS + _QUEEN_BUILDING_TOOLS + _QUEEN_STAGING_TOOLS + _QUEEN_RUNNING_TOOLS)
)

__all__ = [
    "coder_node",
    "ticket_triage_node",
    "queen_node",
    "ALL_QUEEN_TRIAGE_TOOLS",
    "ALL_QUEEN_TOOLS",
    "_QUEEN_PLANNING_TOOLS",
    "_QUEEN_BUILDING_TOOLS",
    "_QUEEN_STAGING_TOOLS",
    "_QUEEN_RUNNING_TOOLS",
    # Phase-specific prompt segments (used by session_manager for dynamic prompts)
    "_queen_identity_planning",
    "_queen_identity_building",
    "_queen_identity_staging",
    "_queen_identity_running",
    "_queen_tools_planning",
    "_queen_tools_building",
    "_queen_tools_staging",
    "_queen_tools_running",
@@ -927,7 +1086,10 @@ __all__ = [
    "_queen_behavior_running",
    "_queen_phase_7",
    "_queen_style",
    "_shared_building_knowledge",
    "_planning_knowledge",
    "_building_knowledge",
    "_package_builder_knowledge",
    "_appendices",
    "_gcu_building_section",
    "_gcu_section",
]
@@ -0,0 +1,371 @@
"""Queen global cross-session memory.

Three-tier memory architecture:
    ~/.hive/queen/MEMORY.md — semantic (who, what, why)
    ~/.hive/queen/memories/MEMORY-YYYY-MM-DD.md — episodic (daily journals)
    ~/.hive/queen/session/{id}/data/adapt.md — working (session-scoped)

Semantic and episodic files are injected at queen session start.

Semantic memory (MEMORY.md) is updated automatically at session end via
consolidate_queen_memory() — the queen never rewrites this herself.

Episodic memory (MEMORY-date.md) can be written by the queen during a session
via the write_to_diary tool, and is also appended to at session end by
consolidate_queen_memory().
"""

from __future__ import annotations

import asyncio
import json
import logging
import traceback
from datetime import date, datetime
from pathlib import Path

logger = logging.getLogger(__name__)


def _queen_dir() -> Path:
    return Path.home() / ".hive" / "queen"


def semantic_memory_path() -> Path:
    return _queen_dir() / "MEMORY.md"


def episodic_memory_path(d: date | None = None) -> Path:
    d = d or date.today()
    return _queen_dir() / "memories" / f"MEMORY-{d.strftime('%Y-%m-%d')}.md"
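The date-to-filename mapping used by episodic_memory_path can be exercised in isolation — a minimal sketch of the same f-string, not part of the module's API:

```python
from datetime import date


def episodic_filename(d: date) -> str:
    # Mirrors the f-string above: MEMORY-YYYY-MM-DD.md
    return f"MEMORY-{d.strftime('%Y-%m-%d')}.md"


print(episodic_filename(date(2024, 3, 8)))  # MEMORY-2024-03-08.md
```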


def read_semantic_memory() -> str:
    path = semantic_memory_path()
    return path.read_text(encoding="utf-8").strip() if path.exists() else ""


def read_episodic_memory(d: date | None = None) -> str:
    path = episodic_memory_path(d)
    return path.read_text(encoding="utf-8").strip() if path.exists() else ""


def format_for_injection() -> str:
    """Format cross-session memory for system prompt injection.

    Returns an empty string if no meaningful content exists yet (e.g. first
    session with only the seed template).
    """
    semantic = read_semantic_memory()
    episodic = read_episodic_memory()

    # Suppress injection if semantic is still just the seed template
    if semantic and semantic.startswith("# My Understanding of the User\n\n*No sessions"):
        semantic = ""

    parts: list[str] = []
    if semantic:
        parts.append(semantic)
    if episodic:
        today_str = date.today().strftime("%B %-d, %Y")
        parts.append(f"## Today — {today_str}\n\n{episodic}")

    if not parts:
        return ""

    body = "\n\n---\n\n".join(parts)
    return "--- Your Cross-Session Memory ---\n\n" + body + "\n\n--- End Cross-Session Memory ---"
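The injection logic can be exercised without touching ~/.hive by parameterising the two tiers. The sketch below reproduces the same control flow with injected inputs (the real function reads from disk and formats today's date itself):

```python
def format_for_injection_sketch(semantic: str, episodic: str, today_str: str) -> str:
    # Same control flow as format_for_injection(), with inputs injected:
    # drop an untouched seed template, then join the surviving tiers.
    if semantic.startswith("# My Understanding of the User\n\n*No sessions"):
        semantic = ""
    parts = []
    if semantic:
        parts.append(semantic)
    if episodic:
        parts.append(f"## Today — {today_str}\n\n{episodic}")
    if not parts:
        return ""
    body = "\n\n---\n\n".join(parts)
    return "--- Your Cross-Session Memory ---\n\n" + body + "\n\n--- End Cross-Session Memory ---"
```

Note that a still-seeded MEMORY.md plus an empty diary yields an empty string, so first sessions get no memory block at all.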


_SEED_TEMPLATE = """\
# My Understanding of the User

*No sessions recorded yet.*

## Who They Are

## What They're Trying to Achieve

## What's Working

## What I've Learned
"""


def append_episodic_entry(content: str) -> None:
    """Append a timestamped prose entry to today's episodic memory file.

    Creates the file (with a date heading) if it doesn't exist yet.
    Used both by the queen's diary tool and by the consolidation hook.
    """
    ep_path = episodic_memory_path()
    ep_path.parent.mkdir(parents=True, exist_ok=True)
    today_str = date.today().strftime("%B %-d, %Y")
    timestamp = datetime.now().strftime("%H:%M")
    if not ep_path.exists():
        header = f"# {today_str}\n\n"
        block = f"{header}### {timestamp}\n\n{content.strip()}\n"
    else:
        block = f"\n\n### {timestamp}\n\n{content.strip()}\n"
    with ep_path.open("a", encoding="utf-8") as f:
        f.write(block)
|
||||
|
||||
|
||||
def seed_if_missing() -> None:
|
||||
"""Create MEMORY.md with a blank template if it doesn't exist yet."""
|
||||
path = semantic_memory_path()
|
||||
if path.exists():
|
||||
return
|
||||
path.parent.mkdir(parents=True, exist_ok=True)
|
||||
path.write_text(_SEED_TEMPLATE, encoding="utf-8")
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Consolidation prompt
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_SEMANTIC_SYSTEM = """\
|
||||
You maintain the persistent cross-session memory of an AI assistant called the Queen.
|
||||
Review the session notes and rewrite MEMORY.md — the Queen's durable understanding of the
|
||||
person she works with across all sessions.
|
||||
|
||||
Write entirely in the Queen's voice — first person, reflective, honest.
|
||||
Not a log of events, but genuine understanding of who this person is over time.
|
||||
|
||||
Rules:
|
||||
- Update and synthesise: incorporate new understanding, update facts that have changed, remove
|
||||
details that are stale, superseded, or no longer say anything meaningful about the person.
|
||||
- Keep it as structured markdown with named sections about the PERSON, not about today.
|
||||
- Do NOT include diary sections, daily logs, or session summaries. Those belong elsewhere.
|
||||
MEMORY.md is about who they are, what they want, what works — not what happened today.
|
||||
- Reference dates only when noting a lasting milestone (e.g. "since March 8th they prefer X").
|
||||
- If the session had no meaningful new information about the person,
|
||||
return the existing text unchanged.
|
||||
- Do not add fictional details. Only reflect what is evidenced in the notes.
|
||||
- Stay concise. Prune rather than accumulate. A lean, accurate file is more useful than a
|
||||
dense one. If something was true once but has been resolved or superseded, remove it.
|
||||
- Output only the raw markdown content of MEMORY.md. No preamble, no code fences.
|
||||
"""
|
||||
|
||||
_DIARY_SYSTEM = """\
|
||||
You maintain the daily episodic diary of an AI assistant called the Queen.
|
||||
You receive: (1) today's existing diary so far, and (2) notes from the latest session.
|
||||
|
||||
Rewrite the complete diary for today as a single unified narrative —
|
||||
first person, reflective, honest.
|
||||
Merge and deduplicate: if the same story (e.g. a research agent stalling) recurred several times,
|
||||
describe it once with appropriate weight rather than retelling it. Weave in new developments from
|
||||
the session notes. Preserve important milestones, emotional texture, and session path references.
|
||||
|
||||
If today's diary is empty, write the initial entry based on the session notes alone.
|
||||
|
||||
Output only the full diary prose — no date heading, no timestamp headers,
|
||||
no preamble, no code fences.
|
||||
"""
|
||||
|
||||
|
||||
def read_session_context(session_dir: Path, max_messages: int = 80) -> str:
|
||||
"""Extract a readable transcript from conversation parts + adapt.md.
|
||||
|
||||
Reads the last ``max_messages`` conversation parts and the session's
|
||||
adapt.md (working memory). Tool results are omitted — only user and
|
||||
assistant turns (with tool-call names noted) are included.
|
||||
"""
|
||||
parts: list[str] = []
|
||||
|
||||
# Working notes
|
||||
adapt_path = session_dir / "data" / "adapt.md"
|
||||
if adapt_path.exists():
|
||||
text = adapt_path.read_text(encoding="utf-8").strip()
|
||||
if text:
|
||||
parts.append(f"## Session Working Notes (adapt.md)\n\n{text}")
|
||||
|
||||
# Conversation transcript
|
||||
parts_dir = session_dir / "conversations" / "parts"
|
||||
if parts_dir.exists():
|
||||
part_files = sorted(parts_dir.glob("*.json"))[-max_messages:]
|
||||
lines: list[str] = []
|
||||
for pf in part_files:
|
||||
try:
|
||||
data = json.loads(pf.read_text(encoding="utf-8"))
|
||||
role = data.get("role", "")
|
||||
content = str(data.get("content", "")).strip()
|
||||
tool_calls = data.get("tool_calls") or []
|
||||
if role == "tool":
|
||||
continue # skip verbose tool results
|
||||
if role == "assistant" and tool_calls and not content:
|
||||
names = [tc.get("function", {}).get("name", "?") for tc in tool_calls]
|
||||
lines.append(f"[queen calls: {', '.join(names)}]")
|
||||
elif content:
|
||||
label = "user" if role == "user" else "queen"
|
||||
lines.append(f"[{label}]: {content[:600]}")
|
||||
except Exception:
|
||||
continue
|
||||
if lines:
|
||||
parts.append("## Conversation\n\n" + "\n".join(lines))
|
||||
|
||||
return "\n\n".join(parts)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Context compaction (binary-split LLM summarisation)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# If the raw session context exceeds this many characters, compact it first
|
||||
# before sending to the consolidation LLM. ~200 k chars ≈ 50 k tokens.
|
||||
_CTX_COMPACT_CHAR_LIMIT = 200_000
|
||||
_CTX_COMPACT_MAX_DEPTH = 8
|
||||
|
||||
_COMPACT_SYSTEM = (
|
||||
"Summarise this conversation segment. Preserve: user goals, key decisions, "
|
||||
"what was built or changed, emotional tone, and important outcomes. "
|
||||
"Write concisely in third person past tense. Omit routine tool invocations "
|
||||
"unless the result matters."
|
||||
)
|
||||
|
||||
|
||||
async def _compact_context(text: str, llm: object, *, _depth: int = 0) -> str:
|
||||
"""Binary-split and LLM-summarise *text* until it fits within the char limit.
|
||||
|
||||
Mirrors the recursive binary-splitting strategy used by the main agent
|
||||
compaction pipeline (EventLoopNode._llm_compact).
|
||||
"""
|
||||
if len(text) <= _CTX_COMPACT_CHAR_LIMIT or _depth >= _CTX_COMPACT_MAX_DEPTH:
|
||||
return text
|
||||
|
||||
# Split near the midpoint on a line boundary so we don't cut mid-message
|
||||
mid = len(text) // 2
|
||||
split_at = text.rfind("\n", 0, mid) + 1
|
||||
if split_at <= 0:
|
||||
split_at = mid
|
||||
|
||||
half1, half2 = text[:split_at], text[split_at:]
|
||||
|
||||
async def _summarise(chunk: str) -> str:
|
||||
try:
|
||||
resp = await llm.acomplete(
|
||||
messages=[{"role": "user", "content": chunk}],
|
||||
system=_COMPACT_SYSTEM,
|
||||
max_tokens=2048,
|
||||
)
|
||||
return resp.content.strip()
|
||||
except Exception:
|
||||
logger.warning(
|
||||
"queen_memory: context compaction LLM call failed (depth=%d), truncating",
|
||||
_depth,
|
||||
)
|
||||
return chunk[: _CTX_COMPACT_CHAR_LIMIT // 4]
|
||||
|
||||
s1, s2 = await asyncio.gather(_summarise(half1), _summarise(half2))
|
||||
combined = s1 + "\n\n" + s2
|
||||
if len(combined) > _CTX_COMPACT_CHAR_LIMIT:
|
||||
return await _compact_context(combined, llm, _depth=_depth + 1)
|
||||
return combined
|
||||
|
||||
|
||||
async def consolidate_queen_memory(
|
||||
session_id: str,
|
||||
session_dir: Path,
|
||||
llm: object,
|
||||
) -> None:
|
||||
"""Update MEMORY.md and append a diary entry based on the current session.
|
||||
|
||||
Reads conversation parts and adapt.md from session_dir. Called
|
||||
periodically in the background and once at session end. Failures are
|
||||
logged and silently swallowed so they never block teardown.
|
||||
|
||||
Args:
|
||||
session_id: The session ID (used for the adapt.md path reference).
|
||||
session_dir: Path to the session directory (~/.hive/queen/session/{id}).
|
||||
llm: LLMProvider instance (must support acomplete()).
|
||||
"""
|
||||
try:
|
||||
session_context = read_session_context(session_dir)
|
||||
if not session_context:
|
||||
logger.debug("queen_memory: no session context, skipping consolidation")
|
||||
return
|
||||
|
||||
logger.info("queen_memory: consolidating memory for session %s ...", session_id)
|
||||
|
||||
# If the transcript is very large, compact it with recursive binary LLM
|
||||
# summarisation before sending to the consolidation model.
|
||||
if len(session_context) > _CTX_COMPACT_CHAR_LIMIT:
|
||||
logger.info(
|
||||
"queen_memory: session context is %d chars — compacting first",
|
||||
len(session_context),
|
||||
)
|
||||
session_context = await _compact_context(session_context, llm)
|
||||
logger.info("queen_memory: compacted to %d chars", len(session_context))
|
||||
|
||||
existing_semantic = read_semantic_memory()
|
||||
today_journal = read_episodic_memory()
|
||||
today_str = date.today().strftime("%B %-d, %Y")
|
||||
adapt_path = session_dir / "data" / "adapt.md"
|
||||
|
||||
user_msg = (
|
||||
f"## Existing Semantic Memory (MEMORY.md)\n\n"
|
||||
f"{existing_semantic or '(none yet)'}\n\n"
|
||||
f"## Today's Diary So Far ({today_str})\n\n"
|
||||
f"{today_journal or '(none yet)'}\n\n"
|
||||
f"{session_context}\n\n"
|
||||
f"## Session Reference\n\n"
|
||||
f"Session ID: {session_id}\n"
|
||||
f"Session path: {adapt_path}\n"
|
||||
)
|
||||
|
||||
logger.debug(
|
||||
"queen_memory: calling LLM (%d chars of context, ~%d tokens est.)",
|
||||
len(user_msg),
|
||||
len(user_msg) // 4,
|
||||
)
|
||||
|
||||
from framework.agents.queen.config import default_config
|
||||
|
||||
semantic_resp, diary_resp = await asyncio.gather(
|
||||
llm.acomplete(
|
||||
messages=[{"role": "user", "content": user_msg}],
|
||||
system=_SEMANTIC_SYSTEM,
|
||||
max_tokens=default_config.max_tokens,
|
||||
),
|
||||
llm.acomplete(
|
||||
messages=[{"role": "user", "content": user_msg}],
|
||||
system=_DIARY_SYSTEM,
|
||||
max_tokens=default_config.max_tokens,
|
||||
),
|
||||
)
|
||||
|
||||
new_semantic = semantic_resp.content.strip()
|
||||
diary_entry = diary_resp.content.strip()
|
||||
|
||||
if new_semantic:
|
||||
path = semantic_memory_path()
|
||||
path.parent.mkdir(parents=True, exist_ok=True)
|
||||
path.write_text(new_semantic, encoding="utf-8")
|
||||
logger.info("queen_memory: semantic memory updated (%d chars)", len(new_semantic))
|
||||
|
||||
if diary_entry:
|
||||
# Rewrite today's episodic file in-place — the LLM has merged and
|
||||
# deduplicated the full day's content, so we replace rather than append.
|
||||
ep_path = episodic_memory_path()
|
||||
ep_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
heading = f"# {today_str}"
|
||||
ep_path.write_text(f"{heading}\n\n{diary_entry}\n", encoding="utf-8")
|
||||
logger.info(
|
||||
"queen_memory: episodic diary rewritten for %s (%d chars)",
|
||||
today_str,
|
||||
len(diary_entry),
|
||||
)
|
||||
|
||||
except Exception:
|
||||
tb = traceback.format_exc()
|
||||
logger.exception("queen_memory: consolidation failed")
|
||||
# Write to file so the cause is findable regardless of log verbosity.
|
||||
error_path = _queen_dir() / "consolidation_error.txt"
|
||||
try:
|
||||
error_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
error_path.write_text(
|
||||
f"session: {session_id}\ntime: {datetime.now().isoformat()}\n\n{tb}",
|
||||
encoding="utf-8",
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
@@ -559,7 +559,7 @@ if __name__ == "__main__":
 
 ## mcp_servers.json
 
-> **Auto-generated.** `initialize_agent_package` creates this file with hive-tools
+> **Auto-generated.** `initialize_and_build_agent` creates this file with hive-tools
 > as the default. Only edit manually to add additional MCP servers.
 
 ```json
@@ -0,0 +1,63 @@
# Queen Memory — File System Structure

```
~/.hive/
├── queen/
│   ├── MEMORY.md                         ← Semantic memory
│   ├── memories/
│   │   ├── MEMORY-2026-03-09.md          ← Episodic memory (today)
│   │   ├── MEMORY-2026-03-08.md
│   │   └── ...
│   └── session/
│       └── {session_id}/                 ← One dir per session (or resumed-from session)
│           ├── conversations/
│           │   ├── parts/
│           │   │   ├── 00001.json        ← One file per message (role, content, tool_calls)
│           │   │   ├── 00002.json
│           │   │   └── ...
│           │   └── spillover/
│           │       ├── conversation_1.md ← Compacted old conversation segments
│           │       ├── conversation_2.md
│           │       └── ...
│           └── data/
│               ├── adapt.md              ← Working memory (session-scoped)
│               ├── web_search_1.txt      ← Spillover: large tool results
│               ├── web_search_2.txt
│               └── ...
```

---

## The three memory tiers

| File | Tier | Written by | Read at |
|---|---|---|---|
| `MEMORY.md` | Semantic | Consolidation LLM (auto, post-session) | Session start (injected into system prompt) |
| `memories/MEMORY-YYYY-MM-DD.md` | Episodic | Queen via `write_to_diary` tool + consolidation LLM | Session start (today's file injected) |
| `data/adapt.md` | Working | Queen via `update_session_notes` tool | Every turn (inlined in system prompt) |

---
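The episodic tier's filename convention follows directly from the table. A minimal sketch of the path rule, using our own hypothetical helper name (the real path helpers live in the queen memory module):

```python
from datetime import date
from pathlib import Path


def episodic_path_for(base: Path, d: date) -> Path:
    """Episodic file for a given day: memories/MEMORY-YYYY-MM-DD.md."""
    return base / "queen" / "memories" / f"MEMORY-{d.isoformat()}.md"


print(episodic_path_for(Path.home() / ".hive", date(2026, 3, 9)).name)
# MEMORY-2026-03-09.md
```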

## Session directory naming

The session directory name is **`queen_resume_from`** when a cold-restore resumes an existing
session, otherwise the new **`session_id`**. This means resumed sessions accumulate all messages
in the original directory rather than fragmenting across multiple folders.

---

## Consolidation

`consolidate_queen_memory()` runs every **5 minutes** in the background and once more at session
end. It reads:

1. `conversations/parts/*.json` — full message history (user + assistant turns; tool results skipped)
2. `data/adapt.md` — current working notes

It then makes two LLM writes:

- Rewrites `MEMORY.md` in place (semantic memory — queen never touches this herself)
- Rewrites today's `memories/MEMORY-YYYY-MM-DD.md` in place as a merged, deduplicated diary narrative

If the combined transcript exceeds ~200 K characters it is recursively binary-compacted via the
LLM before being sent to the consolidation model (mirrors `EventLoopNode._llm_compact`).
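The every-5-minutes-plus-once-at-end schedule can be sketched with plain asyncio. This is our own `periodic` helper under stated assumptions, not the framework's actual scheduler:

```python
import asyncio


async def periodic(fn, interval_s: float, stop: asyncio.Event) -> None:
    """Run fn every interval_s seconds until stop is set, then once more."""
    while not stop.is_set():
        try:
            # Wake early if stop is set; otherwise time out and do a pass.
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            await fn()  # periodic background pass
    await fn()  # final pass at session end
```

In the real system `fn` would be a closure over `consolidate_queen_memory(session_id, session_dir, llm)`, with `interval_s=300`.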
@@ -1,4 +1,4 @@
-"""Test fixtures for Hive Coder agent."""
+"""Test fixtures for Queen agent."""
 
 import sys
 from pathlib import Path
@@ -1,7 +0,0 @@
"""Builder interface for analyzing and building agents."""

from framework.builder.query import BuilderQuery

__all__ = [
    "BuilderQuery",
]
@@ -1,501 +0,0 @@
"""
Builder Query Interface - How I (Builder) analyze agent runs.

This is designed around the questions I need to answer:
1. What happened? (summaries, narratives)
2. Why did it fail? (failure analysis, decision traces)
3. What patterns emerge? (across runs, across nodes)
4. What should we change? (suggestions)
"""

from collections import defaultdict
from pathlib import Path
from typing import Any

from framework.schemas.decision import Decision
from framework.schemas.run import Run, RunStatus, RunSummary
from framework.storage.backend import FileStorage


class FailureAnalysis:
    """Structured analysis of why a run failed."""

    def __init__(
        self,
        run_id: str,
        failure_point: str,
        root_cause: str,
        decision_chain: list[str],
        problems: list[str],
        suggestions: list[str],
    ):
        self.run_id = run_id
        self.failure_point = failure_point
        self.root_cause = root_cause
        self.decision_chain = decision_chain
        self.problems = problems
        self.suggestions = suggestions

    def to_dict(self) -> dict[str, Any]:
        return {
            "run_id": self.run_id,
            "failure_point": self.failure_point,
            "root_cause": self.root_cause,
            "decision_chain": self.decision_chain,
            "problems": self.problems,
            "suggestions": self.suggestions,
        }

    def __str__(self) -> str:
        lines = [
            f"=== Failure Analysis for {self.run_id} ===",
            "",
            f"Failure Point: {self.failure_point}",
            f"Root Cause: {self.root_cause}",
            "",
            "Decision Chain Leading to Failure:",
        ]
        for i, dec in enumerate(self.decision_chain, 1):
            lines.append(f"  {i}. {dec}")

        if self.problems:
            lines.append("")
            lines.append("Reported Problems:")
            for prob in self.problems:
                lines.append(f"  - {prob}")

        if self.suggestions:
            lines.append("")
            lines.append("Suggestions:")
            for sug in self.suggestions:
                lines.append(f"  → {sug}")

        return "\n".join(lines)


class PatternAnalysis:
    """Patterns detected across multiple runs."""

    def __init__(
        self,
        goal_id: str,
        run_count: int,
        success_rate: float,
        common_failures: list[tuple[str, int]],
        problematic_nodes: list[tuple[str, float]],
        decision_patterns: dict[str, Any],
    ):
        self.goal_id = goal_id
        self.run_count = run_count
        self.success_rate = success_rate
        self.common_failures = common_failures
        self.problematic_nodes = problematic_nodes
        self.decision_patterns = decision_patterns

    def to_dict(self) -> dict[str, Any]:
        return {
            "goal_id": self.goal_id,
            "run_count": self.run_count,
            "success_rate": self.success_rate,
            "common_failures": self.common_failures,
            "problematic_nodes": self.problematic_nodes,
            "decision_patterns": self.decision_patterns,
        }

    def __str__(self) -> str:
        lines = [
            f"=== Pattern Analysis for Goal {self.goal_id} ===",
            "",
            f"Runs Analyzed: {self.run_count}",
            f"Success Rate: {self.success_rate:.1%}",
        ]

        if self.common_failures:
            lines.append("")
            lines.append("Common Failures:")
            for failure, count in self.common_failures:
                lines.append(f"  - {failure} ({count} occurrences)")

        if self.problematic_nodes:
            lines.append("")
            lines.append("Problematic Nodes (failure rate):")
            for node, rate in self.problematic_nodes:
                lines.append(f"  - {node}: {rate:.1%} failure rate")

        return "\n".join(lines)


class BuilderQuery:
    """
    The interface I (Builder) use to understand what agents are doing.

    This is optimized for the questions I need to answer when analyzing
    agent behavior and deciding what to improve.
    """

    def __init__(self, storage_path: str | Path):
        self.storage = FileStorage(storage_path)

    # === WHAT HAPPENED? ===

    def get_run_summary(self, run_id: str) -> RunSummary | None:
        """Get a quick summary of a run."""
        return self.storage.load_summary(run_id)

    def get_full_run(self, run_id: str) -> Run | None:
        """Get the complete run with all decisions."""
        return self.storage.load_run(run_id)

    def list_runs_for_goal(self, goal_id: str) -> list[RunSummary]:
        """Get summaries of all runs for a goal."""
        run_ids = self.storage.get_runs_by_goal(goal_id)
        summaries = []
        for run_id in run_ids:
            summary = self.storage.load_summary(run_id)
            if summary:
                summaries.append(summary)
        return summaries

    def get_recent_failures(self, limit: int = 10) -> list[RunSummary]:
        """Get recent failed runs."""
        run_ids = self.storage.get_runs_by_status(RunStatus.FAILED)
        summaries = []
        for run_id in run_ids[:limit]:
            summary = self.storage.load_summary(run_id)
            if summary:
                summaries.append(summary)
        return summaries

    # === WHY DID IT FAIL? ===

    def analyze_failure(self, run_id: str) -> FailureAnalysis | None:
        """
        Deep analysis of why a run failed.

        This is my primary tool for understanding what went wrong.
        """
        run = self.storage.load_run(run_id)
        if run is None or run.status != RunStatus.FAILED:
            return None

        # Find the first failed decision
        failed_decisions = [d for d in run.decisions if not d.was_successful]
        if not failed_decisions:
            failure_point = "Unknown - no decision marked as failed"
            root_cause = "Run failed but all decisions succeeded (external cause?)"
        else:
            first_failure = failed_decisions[0]
            failure_point = first_failure.summary_for_builder()
            root_cause = first_failure.outcome.error if first_failure.outcome else "Unknown"

        # Build the decision chain leading to failure
        decision_chain = []
        for d in run.decisions:
            decision_chain.append(d.summary_for_builder())
            if not d.was_successful:
                break

        # Extract problems
        problems = [f"[{p.severity}] {p.description}" for p in run.problems]

        # Generate suggestions based on the failure
        suggestions = self._generate_suggestions(run, failed_decisions)

        return FailureAnalysis(
            run_id=run_id,
            failure_point=failure_point,
            root_cause=root_cause,
            decision_chain=decision_chain,
            problems=problems,
            suggestions=suggestions,
        )

    def get_decision_trace(self, run_id: str) -> list[str]:
        """Get a readable trace of all decisions in a run."""
        run = self.storage.load_run(run_id)
        if run is None:
            return []
        return [d.summary_for_builder() for d in run.decisions]

    # === WHAT PATTERNS EMERGE? ===

    def find_patterns(self, goal_id: str) -> PatternAnalysis | None:
        """
        Find patterns across runs for a goal.

        This helps me understand systemic issues vs one-off failures.
        """
        run_ids = self.storage.get_runs_by_goal(goal_id)
        if not run_ids:
            return None

        runs = []
        for run_id in run_ids:
            run = self.storage.load_run(run_id)
            if run:
                runs.append(run)

        if not runs:
            return None

        # Calculate success rate
        completed = [r for r in runs if r.status == RunStatus.COMPLETED]
        success_rate = len(completed) / len(runs) if runs else 0.0

        # Find common failures
        failure_counts: dict[str, int] = defaultdict(int)
        for run in runs:
            for decision in run.decisions:
                if not decision.was_successful and decision.outcome:
                    error = decision.outcome.error or "Unknown error"
                    failure_counts[error] += 1

        common_failures = sorted(failure_counts.items(), key=lambda x: x[1], reverse=True)[:5]

        # Find problematic nodes
        node_stats: dict[str, dict[str, int]] = defaultdict(lambda: {"total": 0, "failed": 0})
        for run in runs:
            for decision in run.decisions:
                node_stats[decision.node_id]["total"] += 1
                if not decision.was_successful:
                    node_stats[decision.node_id]["failed"] += 1

        problematic_nodes = []
        for node_id, stats in node_stats.items():
            if stats["total"] > 0:
                failure_rate = stats["failed"] / stats["total"]
                if failure_rate > 0.1:  # More than 10% failure rate
                    problematic_nodes.append((node_id, failure_rate))

        problematic_nodes.sort(key=lambda x: x[1], reverse=True)

        # Decision patterns
        decision_patterns = self._analyze_decision_patterns(runs)

        return PatternAnalysis(
            goal_id=goal_id,
            run_count=len(runs),
            success_rate=success_rate,
            common_failures=common_failures,
            problematic_nodes=problematic_nodes,
            decision_patterns=decision_patterns,
        )

    def compare_runs(self, run_id_1: str, run_id_2: str) -> dict[str, Any]:
        """Compare two runs to understand what differed."""
        run1 = self.storage.load_run(run_id_1)
        run2 = self.storage.load_run(run_id_2)

        if run1 is None or run2 is None:
            return {"error": "One or both runs not found"}

        return {
            "run_1": {
                "id": run1.id,
                "status": run1.status.value,
                "decisions": len(run1.decisions),
                "success_rate": run1.metrics.success_rate,
            },
            "run_2": {
                "id": run2.id,
                "status": run2.status.value,
                "decisions": len(run2.decisions),
                "success_rate": run2.metrics.success_rate,
            },
            "differences": self._find_differences(run1, run2),
        }

    # === WHAT SHOULD WE CHANGE? ===

    def suggest_improvements(self, goal_id: str) -> list[dict[str, Any]]:
        """
        Generate improvement suggestions based on run analysis.

        This is what I use to propose changes to the human engineer.
        """
        patterns = self.find_patterns(goal_id)
        if patterns is None:
            return []

        suggestions = []

        # Suggestion: Fix problematic nodes
        for node_id, failure_rate in patterns.problematic_nodes:
            suggestions.append(
                {
                    "type": "node_improvement",
                    "target": node_id,
                    "reason": f"Node has {failure_rate:.1%} failure rate",
                    "recommendation": (
                        f"Review and improve node '{node_id}' - "
                        "high failure rate suggests prompt or tool issues"
                    ),
                    "priority": "high" if failure_rate > 0.3 else "medium",
                }
            )

        # Suggestion: Address common failures
        for failure, count in patterns.common_failures:
            if count >= 2:
                suggestions.append(
                    {
                        "type": "error_handling",
                        "target": failure,
                        "reason": f"Error occurred {count} times",
                        "recommendation": f"Add handling for: {failure}",
                        "priority": "high" if count >= 5 else "medium",
                    }
                )

        # Suggestion: Overall success rate
        if patterns.success_rate < 0.8:
            suggestions.append(
                {
                    "type": "architecture",
                    "target": goal_id,
                    "reason": f"Goal success rate is only {patterns.success_rate:.1%}",
                    "recommendation": (
                        "Consider restructuring the agent graph or improving goal definition"
                    ),
                    "priority": "high",
                }
            )

        return suggestions

    def get_node_performance(self, node_id: str) -> dict[str, Any]:
        """Get performance metrics for a specific node across all runs."""
        run_ids = self.storage.get_runs_by_node(node_id)

        total_decisions = 0
        successful_decisions = 0
        total_latency = 0
        total_tokens = 0
        decision_types: dict[str, int] = defaultdict(int)

        for run_id in run_ids:
            run = self.storage.load_run(run_id)
            if run:
                for decision in run.decisions:
                    if decision.node_id == node_id:
                        total_decisions += 1
                        if decision.was_successful:
                            successful_decisions += 1
                        if decision.outcome:
                            total_latency += decision.outcome.latency_ms
                            total_tokens += decision.outcome.tokens_used
                        decision_types[decision.decision_type.value] += 1

        return {
            "node_id": node_id,
            "total_decisions": total_decisions,
            "success_rate": successful_decisions / total_decisions if total_decisions > 0 else 0,
            "avg_latency_ms": total_latency / total_decisions if total_decisions > 0 else 0,
            "total_tokens": total_tokens,
            "decision_type_distribution": dict(decision_types),
        }

    # === PRIVATE HELPERS ===

    def _generate_suggestions(
        self,
        run: Run,
        failed_decisions: list[Decision],
    ) -> list[str]:
        """Generate suggestions based on failure analysis."""
        suggestions = []

        for decision in failed_decisions:
            # Check if there were alternatives
            if len(decision.options) > 1:
                chosen = decision.chosen_option
                alternatives = [o for o in decision.options if o.id != decision.chosen_option_id]
                if alternatives:
                    alt_desc = alternatives[0].description
                    chosen_desc = chosen.description if chosen else "unknown"
                    suggestions.append(
                        f"Consider alternative: '{alt_desc}' instead of '{chosen_desc}'"
                    )

            # Check for missing context
            if not decision.input_context:
                suggestions.append(
                    f"Decision '{decision.intent}' had no input context - "
                    "ensure relevant data is passed"
                )

            # Check for constraint issues
            if decision.active_constraints:
                constraints = ", ".join(decision.active_constraints)
                suggestions.append(f"Review constraints: {constraints} - may be too restrictive")

        # Check for reported problems with suggestions
        for problem in run.problems:
            if problem.suggested_fix:
                suggestions.append(problem.suggested_fix)

        return suggestions

    def _analyze_decision_patterns(self, runs: list[Run]) -> dict[str, Any]:
        """Analyze decision patterns across runs."""
        type_counts: dict[str, int] = defaultdict(int)
        option_counts: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))

        for run in runs:
            for decision in run.decisions:
                type_counts[decision.decision_type.value] += 1

                # Track which options are chosen for similar intents
                intent_key = decision.intent[:50]  # Truncate for grouping
                if decision.chosen_option:
                    option_counts[intent_key][decision.chosen_option.description] += 1

        # Find most common choices per intent
        common_choices = {}
        for intent, choices in option_counts.items():
            if choices:
                most_common = max(choices.items(), key=lambda x: x[1])
                common_choices[intent] = {
                    "choice": most_common[0],
                    "count": most_common[1],
                    "alternatives": len(choices) - 1,
                }

        return {
            "decision_type_distribution": dict(type_counts),
            "common_choices": common_choices,
        }

    def _find_differences(self, run1: Run, run2: Run) -> list[str]:
        """Find key differences between two runs."""
        differences = []

        # Status difference
        if run1.status != run2.status:
            differences.append(f"Status: {run1.status.value} vs {run2.status.value}")

        # Decision count difference
        if len(run1.decisions) != len(run2.decisions):
            differences.append(f"Decision count: {len(run1.decisions)} vs {len(run2.decisions)}")

        # Find first divergence point
        for i, (d1, d2) in enumerate(zip(run1.decisions, run2.decisions, strict=False)):
            if d1.chosen_option_id != d2.chosen_option_id:
                differences.append(
                    f"Diverged at decision {i}: "
                    f"chose '{d1.chosen_option_id}' vs '{d2.chosen_option_id}'"
                )
                break

        # Node differences
        nodes1 = set(run1.metrics.nodes_executed)
        nodes2 = set(run2.metrics.nodes_executed)
        if nodes1 != nodes2:
            only_1 = nodes1 - nodes2
            only_2 = nodes2 - nodes1
            if only_1:
                differences.append(f"Nodes only in run 1: {only_1}")
            if only_2:
                differences.append(f"Nodes only in run 2: {only_2}")

        return differences
@@ -90,6 +90,17 @@ def get_api_key() -> str | None:
     except ImportError:
         pass
 
+    # Kimi Code subscription: read API key from ~/.kimi/config.toml
+    if llm.get("use_kimi_code_subscription"):
+        try:
+            from framework.runner.runner import get_kimi_code_token
+
+            token = get_kimi_code_token()
+            if token:
+                return token
+        except ImportError:
+            pass
+
     # Standard env-var path (covers ZAI Code and all API-key providers)
     api_key_env_var = llm.get("api_key_env_var")
     if api_key_env_var:
@@ -108,6 +119,9 @@ def get_api_base() -> str | None:
     if llm.get("use_codex_subscription"):
         # Codex subscription routes through the ChatGPT backend, not api.openai.com.
         return "https://chatgpt.com/backend-api/codex"
+    if llm.get("use_kimi_code_subscription"):
+        # Kimi Code uses an Anthropic-compatible endpoint (no /v1 suffix).
+        return "https://api.kimi.com/coding"
     return llm.get("api_base")
@@ -6,7 +6,7 @@ This module provides secure credential storage with:
- Template-based usage: {{cred.key}} patterns for injection
- Bipartisan model: Store stores values, tools define usage
- Provider system: Extensible lifecycle management (refresh, validate)
- Multiple backends: Encrypted files, env vars, HashiCorp Vault
- Multiple backends: Encrypted files, env vars

Quick Start:
    from core.framework.credentials import CredentialStore, CredentialObject
@@ -38,8 +38,6 @@ For Aden server sync:
        AdenSyncProvider,
    )

For Vault integration:
    from core.framework.credentials.vault import HashiCorpVaultStorage
"""

from .key_storage import (
@@ -1,55 +0,0 @@
"""
HashiCorp Vault integration for the credential store.

This module provides enterprise-grade secret management through
HashiCorp Vault integration.

Quick Start:
    from core.framework.credentials import CredentialStore
    from core.framework.credentials.vault import HashiCorpVaultStorage

    # Configure Vault storage
    storage = HashiCorpVaultStorage(
        url="https://vault.example.com:8200",
        # token read from VAULT_TOKEN env var
        mount_point="secret",
        path_prefix="hive/agents/prod"
    )

    # Create credential store with Vault backend
    store = CredentialStore(storage=storage)

    # Use normally - credentials are stored in Vault
    credential = store.get_credential("my_api")

Requirements:
    pip install hvac

Authentication:
    Set the VAULT_TOKEN environment variable or pass the token directly:

        export VAULT_TOKEN="hvs.xxxxxxxxxxxxx"

    For production, consider using Vault auth methods:
    - Kubernetes auth
    - AppRole auth
    - AWS IAM auth

Vault Configuration:
    Ensure KV v2 secrets engine is enabled:

        vault secrets enable -path=secret kv-v2

    Grant appropriate policies:

        path "secret/data/hive/credentials/*" {
            capabilities = ["create", "read", "update", "delete", "list"]
        }
        path "secret/metadata/hive/credentials/*" {
            capabilities = ["list", "delete"]
        }
"""

from .hashicorp import HashiCorpVaultStorage

__all__ = ["HashiCorpVaultStorage"]
@@ -1,394 +0,0 @@
"""
HashiCorp Vault storage adapter.

Provides integration with HashiCorp Vault for enterprise secret management.
Requires the 'hvac' package: uv pip install hvac
"""

from __future__ import annotations

import logging
import os
from datetime import datetime
from typing import Any

from pydantic import SecretStr

from ..models import CredentialKey, CredentialObject, CredentialType
from ..storage import CredentialStorage

logger = logging.getLogger(__name__)


class HashiCorpVaultStorage(CredentialStorage):
    """
    HashiCorp Vault storage adapter.

    Features:
    - KV v2 secrets engine support
    - Namespace support (Enterprise)
    - Automatic secret versioning
    - Audit logging via Vault

    The adapter stores credentials in Vault's KV v2 secrets engine with
    the following structure:

        {mount_point}/data/{path_prefix}/{credential_id}
        └── data:
            ├── _type: "oauth2"
            ├── access_token: "xxx"
            ├── refresh_token: "yyy"
            ├── _expires_access_token: "2024-01-26T12:00:00"
            └── _provider_id: "oauth2"

    Example:
        storage = HashiCorpVaultStorage(
            url="https://vault.example.com:8200",
            token="hvs.xxx",  # Or use VAULT_TOKEN env var
            mount_point="secret",
            path_prefix="hive/credentials"
        )

        store = CredentialStore(storage=storage)

        # Credentials are now stored in Vault
        store.save_credential(credential)
        credential = store.get_credential("my_api")

    Authentication:
        The adapter uses token-based authentication. The token can be provided:
        1. Directly via the 'token' parameter
        2. Via the VAULT_TOKEN environment variable

        For production, consider using:
        - Kubernetes auth method
        - AppRole auth method
        - AWS IAM auth method

    Requirements:
        uv pip install hvac
    """
    def __init__(
        self,
        url: str,
        token: str | None = None,
        mount_point: str = "secret",
        path_prefix: str = "hive/credentials",
        namespace: str | None = None,
        verify_ssl: bool = True,
    ):
        """
        Initialize Vault storage.

        Args:
            url: Vault server URL (e.g., https://vault.example.com:8200)
            token: Vault token. If None, reads from VAULT_TOKEN env var
            mount_point: KV secrets engine mount point (default: "secret")
            path_prefix: Path prefix for all credentials
            namespace: Vault namespace (Enterprise feature)
            verify_ssl: Whether to verify SSL certificates

        Raises:
            ImportError: If hvac is not installed
            ValueError: If authentication fails
        """
        try:
            import hvac
        except ImportError as e:
            raise ImportError(
                "HashiCorp Vault support requires 'hvac'. Install with: uv pip install hvac"
            ) from e

        self._url = url
        self._token = token or os.environ.get("VAULT_TOKEN")
        self._mount = mount_point
        self._prefix = path_prefix
        self._namespace = namespace

        if not self._token:
            raise ValueError(
                "Vault token required. Set VAULT_TOKEN env var or pass token parameter."
            )

        self._client = hvac.Client(
            url=url,
            token=self._token,
            namespace=namespace,
            verify=verify_ssl,
        )

        if not self._client.is_authenticated():
            raise ValueError("Vault authentication failed. Check token and server URL.")

        logger.info(f"Connected to HashiCorp Vault at {url}")
    def _path(self, credential_id: str) -> str:
        """Build Vault path for credential."""
        # Sanitize credential_id
        safe_id = credential_id.replace("/", "_").replace("\\", "_")
        return f"{self._prefix}/{safe_id}"

    def save(self, credential: CredentialObject) -> None:
        """Save credential to Vault KV v2."""
        path = self._path(credential.id)
        data = self._serialize_for_vault(credential)

        try:
            self._client.secrets.kv.v2.create_or_update_secret(
                path=path,
                secret=data,
                mount_point=self._mount,
            )
            logger.debug(f"Saved credential '{credential.id}' to Vault at {path}")
        except Exception as e:
            logger.error(f"Failed to save credential '{credential.id}' to Vault: {e}")
            raise

    def load(self, credential_id: str) -> CredentialObject | None:
        """Load credential from Vault."""
        path = self._path(credential_id)

        try:
            response = self._client.secrets.kv.v2.read_secret_version(
                path=path,
                mount_point=self._mount,
            )
            data = response["data"]["data"]
            return self._deserialize_from_vault(credential_id, data)
        except Exception as e:
            # Check if it's a "not found" error
            error_str = str(e).lower()
            if "not found" in error_str or "404" in error_str:
                logger.debug(f"Credential '{credential_id}' not found in Vault")
                return None
            logger.error(f"Failed to load credential '{credential_id}' from Vault: {e}")
            raise

    def delete(self, credential_id: str) -> bool:
        """Delete credential from Vault (all versions)."""
        path = self._path(credential_id)

        try:
            self._client.secrets.kv.v2.delete_metadata_and_all_versions(
                path=path,
                mount_point=self._mount,
            )
            logger.debug(f"Deleted credential '{credential_id}' from Vault")
            return True
        except Exception as e:
            error_str = str(e).lower()
            if "not found" in error_str or "404" in error_str:
                return False
            logger.error(f"Failed to delete credential '{credential_id}' from Vault: {e}")
            raise

    def list_all(self) -> list[str]:
        """List all credentials under the prefix."""
        try:
            response = self._client.secrets.kv.v2.list_secrets(
                path=self._prefix,
                mount_point=self._mount,
            )
            keys = response.get("data", {}).get("keys", [])
            # Remove trailing slashes from folder names
            return [k.rstrip("/") for k in keys]
        except Exception as e:
            error_str = str(e).lower()
            if "not found" in error_str or "404" in error_str:
                return []
            logger.error(f"Failed to list credentials from Vault: {e}")
            raise

    def exists(self, credential_id: str) -> bool:
        """Check if credential exists in Vault."""
        try:
            path = self._path(credential_id)
            self._client.secrets.kv.v2.read_secret_version(
                path=path,
                mount_point=self._mount,
            )
            return True
        except Exception:
            return False
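The `_path` sanitization above is load-bearing: KV v2 treats `/` as a path separator, so a credential id containing slashes would silently nest under a sub-path and never be found by `list_all`. A standalone sketch of just that rule (`vault_path` is an illustrative stand-in for the method):

```python
def vault_path(prefix: str, credential_id: str) -> str:
    # Slashes would create nested Vault paths, so flatten them to "_".
    safe_id = credential_id.replace("/", "_").replace("\\", "_")
    return f"{prefix}/{safe_id}"

print(vault_path("hive/credentials", "team/my_api"))
```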
    def _serialize_for_vault(self, credential: CredentialObject) -> dict[str, Any]:
        """Convert credential to Vault secret format."""
        data: dict[str, Any] = {
            "_type": credential.credential_type.value,
        }

        if credential.provider_id:
            data["_provider_id"] = credential.provider_id

        if credential.description:
            data["_description"] = credential.description

        if credential.auto_refresh:
            data["_auto_refresh"] = "true"

        # Store each key
        for key_name, key in credential.keys.items():
            data[key_name] = key.get_secret_value()

            if key.expires_at:
                data[f"_expires_{key_name}"] = key.expires_at.isoformat()

            if key.metadata:
                data[f"_metadata_{key_name}"] = str(key.metadata)

        return data

    def _deserialize_from_vault(self, credential_id: str, data: dict[str, Any]) -> CredentialObject:
        """Reconstruct credential from Vault secret."""
        # Extract metadata fields
        cred_type = CredentialType(data.pop("_type", "api_key"))
        provider_id = data.pop("_provider_id", None)
        description = data.pop("_description", "")
        auto_refresh = data.pop("_auto_refresh", "") == "true"

        # Build keys dict
        keys: dict[str, CredentialKey] = {}

        # Find all non-metadata keys
        key_names = [k for k in data.keys() if not k.startswith("_")]

        for key_name in key_names:
            value = data[key_name]

            # Check for expiration
            expires_at = None
            expires_key = f"_expires_{key_name}"
            if expires_key in data:
                try:
                    expires_at = datetime.fromisoformat(data[expires_key])
                except (ValueError, TypeError):
                    pass

            # Check for metadata
            metadata: dict[str, Any] = {}
            metadata_key = f"_metadata_{key_name}"
            if metadata_key in data:
                try:
                    import ast

                    metadata = ast.literal_eval(data[metadata_key])
                except (ValueError, SyntaxError):
                    pass

            keys[key_name] = CredentialKey(
                name=key_name,
                value=SecretStr(value),
                expires_at=expires_at,
                metadata=metadata,
            )

        return CredentialObject(
            id=credential_id,
            credential_type=cred_type,
            keys=keys,
            provider_id=provider_id,
            description=description,
            auto_refresh=auto_refresh,
        )
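The serialization convention above flattens everything into one KV secret: raw secret values at the top level, bookkeeping fields distinguished by a `_` prefix. A minimal round-trip sketch of that convention with plain dicts (no pydantic, and `serialize`/`deserialize` are illustrative names, not the framework API):

```python
def serialize(cred_type: str, secrets: dict[str, str]) -> dict[str, str]:
    """Flatten to the KV layout: values at top level, metadata "_"-prefixed."""
    data = {"_type": cred_type}
    data.update(secrets)
    return data

def deserialize(data: dict[str, str]) -> tuple[str, dict[str, str]]:
    data = dict(data)  # don't mutate the caller's dict
    cred_type = data.pop("_type", "api_key")
    # Everything not "_"-prefixed is an actual secret key.
    secrets = {k: v for k, v in data.items() if not k.startswith("_")}
    return cred_type, secrets

flat = serialize("oauth2", {"access_token": "xxx"})
print(deserialize(flat))
```
This convention does mean a secret key literally named with a leading underscore would collide with the bookkeeping fields, which is why the deserializer filters on the prefix.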
    # --- Vault-Specific Operations ---

    def get_secret_metadata(self, credential_id: str) -> dict[str, Any] | None:
        """
        Get Vault metadata for a secret (version info, timestamps, etc.).

        Args:
            credential_id: The credential identifier

        Returns:
            Metadata dict or None if not found
        """
        path = self._path(credential_id)

        try:
            response = self._client.secrets.kv.v2.read_secret_metadata(
                path=path,
                mount_point=self._mount,
            )
            return response.get("data", {})
        except Exception:
            return None

    def soft_delete(self, credential_id: str, versions: list[int] | None = None) -> bool:
        """
        Soft delete specific versions (can be recovered).

        Args:
            credential_id: The credential identifier
            versions: Version numbers to delete. If None, deletes latest.

        Returns:
            True if successful
        """
        path = self._path(credential_id)

        try:
            if versions:
                self._client.secrets.kv.v2.delete_secret_versions(
                    path=path,
                    versions=versions,
                    mount_point=self._mount,
                )
            else:
                self._client.secrets.kv.v2.delete_latest_version_of_secret(
                    path=path,
                    mount_point=self._mount,
                )
            return True
        except Exception as e:
            logger.error(f"Soft delete failed for '{credential_id}': {e}")
            return False

    def undelete(self, credential_id: str, versions: list[int]) -> bool:
        """
        Recover soft-deleted versions.

        Args:
            credential_id: The credential identifier
            versions: Version numbers to recover

        Returns:
            True if successful
        """
        path = self._path(credential_id)

        try:
            self._client.secrets.kv.v2.undelete_secret_versions(
                path=path,
                versions=versions,
                mount_point=self._mount,
            )
            return True
        except Exception as e:
            logger.error(f"Undelete failed for '{credential_id}': {e}")
            return False

    def load_version(self, credential_id: str, version: int) -> CredentialObject | None:
        """
        Load a specific version of a credential.

        Args:
            credential_id: The credential identifier
            version: Version number to load

        Returns:
            CredentialObject or None
        """
        path = self._path(credential_id)

        try:
            response = self._client.secrets.kv.v2.read_secret_version(
                path=path,
                version=version,
                mount_point=self._mount,
            )
            data = response["data"]["data"]
            return self._deserialize_from_vault(credential_id, data)
        except Exception:
            return None
@@ -73,6 +73,7 @@ class _EscalationReceiver:
    def __init__(self) -> None:
        self._event = asyncio.Event()
        self._response: str | None = None
        self._awaiting_input = True  # So inject_worker_message() can prefer us

    async def inject_event(self, content: str, *, is_client_input: bool = False) -> None:
        """Called by ExecutionStream.inject_input() when the user responds."""
@@ -101,7 +102,10 @@ class JudgeVerdict:
    """Result of judge evaluation for the event loop."""

    action: Literal["ACCEPT", "RETRY", "ESCALATE"]
    feedback: str = ""
    # None = no evaluation happened (skip_judge, tool-continue); not logged.
    # "" = evaluated but no feedback; logged with default text.
    # "..." = evaluated with feedback; logged as-is.
    feedback: str | None = None


@runtime_checkable
@@ -131,7 +135,7 @@ class SubagentJudge:
    async def evaluate(self, context: dict[str, Any]) -> JudgeVerdict:
        missing = context.get("missing_keys", [])
        if not missing:
            return JudgeVerdict(action="ACCEPT")
            return JudgeVerdict(action="ACCEPT", feedback="")

        iteration = context.get("iteration", 0)
        remaining = self._max_iterations - iteration - 1
@@ -165,7 +169,7 @@ class LoopConfig:
    max_tool_calls_per_turn: int = 30
    judge_every_n_turns: int = 1
    stall_detection_threshold: int = 3
    stall_similarity_threshold: float = 0.7
    stall_similarity_threshold: float = 0.85
    max_history_tokens: int = 32_000
    store_prefix: str = ""

@@ -347,6 +351,7 @@ class EventLoopNode(NodeProtocol):
        self._awaiting_input = False
        self._shutdown = False
        self._stream_task: asyncio.Task | None = None
        self._tool_task: asyncio.Task | None = None  # gather task while tools run
        # Track which nodes already have an action plan emitted (skip on revisit)
        self._action_plan_emitted: set[str] = set()
        # Monotonic counter for spillover file naming (web_search_1.txt, etc.)
@@ -477,23 +482,32 @@ class EventLoopNode(NodeProtocol):
        # If it doesn't exist yet, seed it with available context.
        if self._config.spillover_dir:
            _adapt_path = Path(self._config.spillover_dir) / "adapt.md"
            if not _adapt_path.exists() and ctx.accounts_prompt:
            if not _adapt_path.exists():
                _adapt_path.parent.mkdir(parents=True, exist_ok=True)
                _adapt_path.write_text(
                    f"## Identity\n{ctx.accounts_prompt}\n",
                    encoding="utf-8",
                seed = (
                    f"## Identity\n{ctx.accounts_prompt}\n"
                    if ctx.accounts_prompt
                    else "# Session Working Memory\n"
                )
                _adapt_path.write_text(seed, encoding="utf-8")
            if _adapt_path.exists():
                _adapt_text = _adapt_path.read_text(encoding="utf-8").strip()
                if _adapt_text:
                    system_prompt = (
                        f"{system_prompt}\n\n"
                        f"--- Your Memory ---\n{_adapt_text}\n--- End Memory ---\n\n"
                        'Maintain your memory by calling save_data("adapt.md", ...) '
                        'or edit_data("adapt.md", ...) as you work.\n'
                        "IMMEDIATELY save: user rules about which account/identity to use, "
                        "behavioral constraints, and preferences. "
                        "Also record session history, decisions, and working notes."
                        "--- Session Working Memory ---\n"
                        f"{_adapt_text}\n"
                        "--- End Session Working Memory ---\n\n"
                        "Maintain your session working memory by calling "
                        'save_data("adapt.md", ...) or edit_data("adapt.md", ...)'
                        " as you work.\n"
                        "This is session-scoped scratch space. "
                        "IMMEDIATELY save: account/identity rules, "
                        "behavioral constraints, and preferences specific to "
                        "this session. Also record current task state, "
                        "decisions, and working notes. "
                        "For lasting knowledge about the user, use "
                        "update_queen_memory() and append_queen_journal() instead."
                    )

        conversation = NodeConversation(
@@ -671,6 +685,7 @@ class EventLoopNode(NodeProtocol):
                queen_input_requested,
                request_system_prompt,
                request_messages,
                reported_to_parent,
            ) = await self._run_single_turn(
                ctx, conversation, tools, iteration, accumulator
            )
@@ -872,6 +887,7 @@ class EventLoopNode(NodeProtocol):
                and not outputs_set
                and not user_input_requested
                and not queen_input_requested
                and not reported_to_parent
            )
            if truly_empty and accumulator is not None:
                missing = self._get_missing_output_keys(
@@ -1322,8 +1338,8 @@ class EventLoopNode(NodeProtocol):
            # Auto-block beyond grace -- fall through to judge (6i)

            # 6h''. Worker wait for queen guidance
            # If a worker escalates with wait_for_response=true, pause here and
            # skip judge evaluation until queen injects guidance.
            # When a worker escalates, pause here and skip judge evaluation
            # until the queen injects guidance.
            if queen_input_requested:
                if self._shutdown:
                    await self._publish_loop_completed(
@@ -1465,7 +1481,7 @@ class EventLoopNode(NodeProtocol):
                continue

            # Judge evaluation (should_judge is always True here)
            verdict = await self._evaluate(
            verdict = await self._judge_turn(
                ctx,
                conversation,
                accumulator,
@@ -1544,7 +1560,7 @@ class EventLoopNode(NodeProtocol):
                    node_type="event_loop",
                    step_index=iteration,
                    verdict="ACCEPT",
                    verdict_feedback=verdict.feedback,
                    verdict_feedback=verdict.feedback or "",
                    tool_calls=logged_tool_calls,
                    llm_text=assistant_text,
                    input_tokens=turn_tokens.get("input", 0),
@@ -1587,7 +1603,7 @@ class EventLoopNode(NodeProtocol):
                    node_type="event_loop",
                    step_index=iteration,
                    verdict="ESCALATE",
                    verdict_feedback=verdict.feedback,
                    verdict_feedback=verdict.feedback or "",
                    tool_calls=logged_tool_calls,
                    llm_text=assistant_text,
                    input_tokens=turn_tokens.get("input", 0),
@@ -1599,7 +1615,7 @@ class EventLoopNode(NodeProtocol):
                    node_name=ctx.node_spec.name,
                    node_type="event_loop",
                    success=False,
                    error=f"Judge escalated: {verdict.feedback}",
                    error=f"Judge escalated: {verdict.feedback or 'no feedback'}",
                    total_steps=iteration + 1,
                    tokens_used=total_input_tokens + total_output_tokens,
                    input_tokens=total_input_tokens,
@@ -1613,7 +1629,7 @@ class EventLoopNode(NodeProtocol):
                )
                return NodeResult(
                    success=False,
                    error=f"Judge escalated: {verdict.feedback}",
                    error=f"Judge escalated: {verdict.feedback or 'no feedback'}",
                    output=accumulator.to_dict(),
                    tokens_used=total_input_tokens + total_output_tokens,
                    latency_ms=latency_ms,
@@ -1629,15 +1645,16 @@ class EventLoopNode(NodeProtocol):
                    node_type="event_loop",
                    step_index=iteration,
                    verdict="RETRY",
                    verdict_feedback=verdict.feedback,
                    verdict_feedback=verdict.feedback or "",
                    tool_calls=logged_tool_calls,
                    llm_text=assistant_text,
                    input_tokens=turn_tokens.get("input", 0),
                    output_tokens=turn_tokens.get("output", 0),
                    latency_ms=iter_latency_ms,
                )
                if verdict.feedback:
                    await conversation.add_user_message(f"[Judge feedback]: {verdict.feedback}")
                if verdict.feedback is not None:
                    fb = verdict.feedback or "[Judge returned RETRY without feedback]"
                    await conversation.add_user_message(f"[Judge feedback]: {fb}")
                continue

            # 7. Max iterations exhausted
@@ -1702,14 +1719,16 @@ class EventLoopNode(NodeProtocol):
        self._input_ready.set()

    def cancel_current_turn(self) -> None:
        """Cancel the current LLM streaming turn instantly.
        """Cancel the current LLM streaming turn or in-progress tool calls instantly.

        Unlike signal_shutdown() which permanently stops the event loop,
        this only kills the in-progress HTTP stream via task.cancel().
        this only kills the in-progress HTTP stream or tool gather task.
        The queen stays alive for the next user message.
        """
        if self._stream_task and not self._stream_task.done():
            self._stream_task.cancel()
        if self._tool_task and not self._tool_task.done():
            self._tool_task.cancel()

    async def _await_user_input(
        self,
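`cancel_current_turn` relies on the standard asyncio contract: cancelling a task makes the next `await` on it raise `CancelledError`, which the owner catches to recover. A minimal runnable sketch of that pattern (names are illustrative, not the framework's):

```python
import asyncio

async def one_turn() -> str:
    async def slow_tool() -> None:
        await asyncio.sleep(30)  # stands in for a long HTTP stream / tool

    task = asyncio.ensure_future(slow_tool())
    await asyncio.sleep(0)  # let the task actually start
    task.cancel()           # what cancel_current_turn() does
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled"  # loop survives; ready for the next message
    return "finished"

print(asyncio.run(one_turn()))
```
Catching `CancelledError` at the owner, rather than letting it propagate, is what keeps the queen alive for the next user message.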
@@ -1787,12 +1806,13 @@ class EventLoopNode(NodeProtocol):
        bool,
        str,
        list[dict[str, Any]],
        bool,
    ]:
        """Run a single LLM turn with streaming and tool execution.

        Returns (assistant_text, real_tool_results, outputs_set, token_counts, logged_tool_calls,
        user_input_requested, ask_user_prompt, ask_user_options, queen_input_requested,
        system_prompt, messages).
        system_prompt, messages, reported_to_parent).

        ``real_tool_results`` contains only results from actual tools (web_search,
        etc.), NOT from synthetic framework tools such as ``set_output``,
@@ -1802,8 +1822,8 @@ class EventLoopNode(NodeProtocol):
        ``ask_user`` during this turn. This separation lets the caller treat
        synthetic tools as framework concerns rather than tool-execution concerns.
        ``queen_input_requested`` is True when the worker called
        ``escalate(wait_for_response=true)`` and should wait for
        queen guidance before judge evaluation.
        ``escalate`` and should wait for queen guidance before judge
        evaluation.

        ``logged_tool_calls`` accumulates ALL tool calls across inner iterations
        (real tools, set_output, and discarded calls) for L3 logging. Unlike
@@ -1824,6 +1844,7 @@ class EventLoopNode(NodeProtocol):
        ask_user_prompt = ""
        ask_user_options: list[str] | None = None
        queen_input_requested = False
        reported_to_parent = False
        # Accumulate ALL tool calls across inner iterations for L3 logging.
        # Unlike real_tool_results (reset each inner iteration), this persists.
        logged_tool_calls: list[dict] = []
@@ -1977,6 +1998,7 @@ class EventLoopNode(NodeProtocol):
                    queen_input_requested,
                    final_system_prompt,
                    final_messages,
                    reported_to_parent,
                )

            # Execute tool calls — framework tools (set_output, ask_user)
@@ -2124,7 +2146,6 @@ class EventLoopNode(NodeProtocol):
                    # --- Framework-level escalate handling ---
                    reason = str(tc.tool_input.get("reason", "")).strip()
                    context = str(tc.tool_input.get("context", "")).strip()
                    # Always wait for queen guidance

                    if stream_id in ("queen", "judge"):
                        result = ToolResult(
@@ -2160,7 +2181,7 @@ class EventLoopNode(NodeProtocol):

                    result = ToolResult(
                        tool_use_id=tc.tool_use_id,
                        content="Escalation requested to hive_coder (queen); waiting for guidance.",
                        content="Escalation requested to queen; waiting for guidance.",
                        is_error=False,
                    )
                    results_by_id[tc.tool_use_id] = result
@@ -2179,6 +2200,7 @@ class EventLoopNode(NodeProtocol):

                elif tc.tool_name == "report_to_parent":
                    # --- Report from sub-agent to parent (optionally blocking) ---
                    reported_to_parent = True
                    msg = tc.tool_input.get("message", "")
                    data = tc.tool_input.get("data")
                    wait = tc.tool_input.get("wait_for_response", False)
@@ -2250,10 +2272,16 @@ class EventLoopNode(NodeProtocol):
                _dur = round(time.time() - _s, 3)
                return _r, _iso, _dur

            timed_results = await asyncio.gather(
                *(_timed_execute(tc) for tc in pending_real),
                return_exceptions=True,
            self._tool_task = asyncio.ensure_future(
                asyncio.gather(
                    *(_timed_execute(tc) for tc in pending_real),
                    return_exceptions=True,
                )
            )
            try:
                timed_results = await self._tool_task
            finally:
                self._tool_task = None
            # gather(return_exceptions=True) captures CancelledError
            # as a return value instead of propagating it. Re-raise
            # so stop_worker actually stops the execution.
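The refactor above wraps the tool `gather` in `ensure_future` so it can be cancelled from outside, and notes that `return_exceptions=True` swallows a child's `CancelledError` into the result list. Both behaviors can be demonstrated in a few lines (names here are illustrative stand-ins for the diff's `self._tool_task` pattern):

```python
import asyncio

async def run_tools() -> str:
    async def tool() -> None:
        await asyncio.sleep(30)  # stands in for a real tool call

    # Wrap gather in a future so an outside caller can cancel all tools at once.
    tool_task = asyncio.ensure_future(
        asyncio.gather(tool(), tool(), return_exceptions=True)
    )
    await asyncio.sleep(0)  # tools are now running
    tool_task.cancel()      # what cancel_current_turn() triggers
    try:
        results = await tool_task
    except asyncio.CancelledError:
        return "cancelled"
    # If gather completed anyway, a child's CancelledError is captured as a
    # VALUE in `results` (return_exceptions=True); re-raise it so the caller
    # actually stops instead of treating it as a tool result.
    for r in results:
        if isinstance(r, asyncio.CancelledError):
            raise r
    return "finished"

print(asyncio.run(run_tools()))
```
The `finally: self._tool_task = None` in the diff matters for the same reason: a stale reference would let a later `cancel_current_turn()` cancel an already-finished gather.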
@@ -2454,6 +2482,7 @@ class EventLoopNode(NodeProtocol):
                    queen_input_requested,
                    final_system_prompt,
                    final_messages,
                    reported_to_parent,
                )

            # --- Mid-turn pruning: prevent context blowup within a single turn ---
@@ -2485,6 +2514,7 @@ class EventLoopNode(NodeProtocol):
                    queen_input_requested,
                    final_system_prompt,
                    final_messages,
                    reported_to_parent,
                )

            # Tool calls processed -- loop back to stream with updated conversation
@@ -2582,7 +2612,7 @@ class EventLoopNode(NodeProtocol):
        return Tool(
            name="escalate",
            description=(
                "Escalate to the Hive Coder queen when requesting user input, "
                "Escalate to the queen when requesting user input, "
                "blocked by errors, missing "
                "credentials, or ambiguous constraints that require supervisor "
                "guidance. Include a concise reason and optional context. "
@@ -2771,7 +2801,7 @@ class EventLoopNode(NodeProtocol):
    # Judge evaluation
    # -------------------------------------------------------------------

    async def _evaluate(
    async def _judge_turn(
        self,
        ctx: NodeContext,
        conversation: NodeConversation,
@@ -2780,14 +2810,29 @@ class EventLoopNode(NodeProtocol):
        tool_results: list[dict],
        iteration: int,
    ) -> JudgeVerdict:
        """Evaluate the current state using judge or implicit logic."""
        # Short-circuit: subagent called report_to_parent(mark_complete=True)
        """Evaluate the current state using judge or implicit logic.

        Evaluation levels (in order):
        0. Short-circuits: mark_complete, skip_judge, tool-continue.
        1. Custom judge (JudgeProtocol) — full authority when set.
        2. Implicit judge — output-key check + optional conversation-aware
           quality gate (when ``success_criteria`` is defined).

        Returns a JudgeVerdict. ``feedback=None`` means no real evaluation
        happened (skip_judge, tool-continue); the caller must not inject a
        feedback message. Any non-None feedback (including ``""``) means a
        real evaluation occurred and will be logged into the conversation.
        """

        # --- Level 0: short-circuits (no evaluation) -----------------------

        if self._mark_complete_flag:
            return JudgeVerdict(action="ACCEPT")

        # Opt-out: node explicitly disables judge (e.g. conversational queen)
        if ctx.node_spec.skip_judge:
            return JudgeVerdict(action="RETRY", feedback="")
            return JudgeVerdict(action="RETRY")  # feedback=None → not logged

        # --- Level 1: custom judge -----------------------------------------

        if self._judge is not None:
            context = {
@@ -2802,81 +2847,82 @@ class EventLoopNode(NodeProtocol):
                accumulator, ctx.node_spec.output_keys, ctx.node_spec.nullable_output_keys
            ),
        }
        return await self._judge.evaluate(context)
        verdict = await self._judge.evaluate(context)
        # Ensure evaluated RETRY always carries feedback for logging.
        if verdict.action == "RETRY" and not verdict.feedback:
            return JudgeVerdict(action="RETRY", feedback="Custom judge returned RETRY.")
        return verdict

        # Implicit judge: accept when no tool calls and all output keys present
        if not tool_results:
            missing = self._get_missing_output_keys(
                accumulator, ctx.node_spec.output_keys, ctx.node_spec.nullable_output_keys
        # --- Level 2: implicit judge ---------------------------------------

        # Real tool calls were made — let the agent keep working.
        if tool_results:
            return JudgeVerdict(action="RETRY")  # feedback=None → not logged

        missing = self._get_missing_output_keys(
            accumulator, ctx.node_spec.output_keys, ctx.node_spec.nullable_output_keys
        )

        if missing:
            return JudgeVerdict(
                action="RETRY",
                feedback=(
                    f"Task incomplete. Required outputs not yet produced: {missing}. "
                    f"Follow your system prompt instructions to complete the work."
                ),
            )
        if not missing:
            # Safety check: when ALL output keys are nullable and NONE
            # have been set, the node produced nothing useful. Retry
            # instead of accepting an empty result — this prevents
            # client-facing nodes from terminating before the user
            # ever interacts, and non-client-facing nodes from
            # short-circuiting without doing their work.
            output_keys = ctx.node_spec.output_keys or []
            nullable_keys = set(ctx.node_spec.nullable_output_keys or [])
            all_nullable = output_keys and nullable_keys >= set(output_keys)
            none_set = not any(accumulator.get(k) is not None for k in output_keys)
            if all_nullable and none_set:
                return JudgeVerdict(
                    action="RETRY",
                    feedback=(
                        f"No output keys have been set yet. "
                        f"Use set_output to set at least one of: {output_keys}"
                    ),
                )

            # Client-facing nodes with no output keys are meant for
            # continuous interaction — they should not auto-accept.
            # Only exit via shutdown, max_iterations, or max_node_visits.
            # Inject tool-use pressure so models stuck in a
            # "narrate-instead-of-act" loop get corrective feedback.
            if not output_keys and ctx.node_spec.client_facing:
                return JudgeVerdict(
                    action="RETRY",
                    feedback=(
                        "STOP describing what you will do. "
                        "You have FULL access to all tools — file creation, "
                        "shell commands, MCP tools — and you CAN call them "
                        "directly in your response. Respond ONLY with tool "
                        "calls, no prose. Execute the task now."
                    ),
                )
        # All output keys present — run safety checks before accepting.

        # Level 2: conversation-aware quality check (if success_criteria set)
        if ctx.node_spec.success_criteria and ctx.llm:
            from framework.graph.conversation_judge import evaluate_phase_completion
        output_keys = ctx.node_spec.output_keys or []
        nullable_keys = set(ctx.node_spec.nullable_output_keys or [])

        verdict = await evaluate_phase_completion(
|
||||
llm=ctx.llm,
|
||||
conversation=conversation,
|
||||
phase_name=ctx.node_spec.name,
|
||||
phase_description=ctx.node_spec.description,
|
||||
success_criteria=ctx.node_spec.success_criteria,
|
||||
accumulator_state=accumulator.to_dict(),
|
||||
max_history_tokens=self._config.max_history_tokens,
|
||||
)
|
||||
if verdict.action != "ACCEPT":
|
||||
return JudgeVerdict(
|
||||
action=verdict.action,
|
||||
feedback=verdict.feedback or "Phase criteria not met.",
|
||||
)
|
||||
# All-nullable with nothing set → node produced nothing useful.
|
||||
all_nullable = output_keys and nullable_keys >= set(output_keys)
|
||||
none_set = not any(accumulator.get(k) is not None for k in output_keys)
|
||||
if all_nullable and none_set:
|
||||
return JudgeVerdict(
|
||||
action="RETRY",
|
||||
feedback=(
|
||||
f"No output keys have been set yet. "
|
||||
f"Use set_output to set at least one of: {output_keys}"
|
||||
),
|
||||
)
|
||||
|
||||
return JudgeVerdict(action="ACCEPT")
|
||||
else:
|
||||
# Client-facing with no output keys → continuous interaction node.
|
||||
# Inject tool-use pressure instead of auto-accepting.
|
||||
if not output_keys and ctx.node_spec.client_facing:
|
||||
return JudgeVerdict(
|
||||
action="RETRY",
|
||||
feedback=(
|
||||
"STOP describing what you will do. "
|
||||
"You have FULL access to all tools — file creation, "
|
||||
"shell commands, MCP tools — and you CAN call them "
|
||||
"directly in your response. Respond ONLY with tool "
|
||||
"calls, no prose. Execute the task now."
|
||||
),
|
||||
)
|
||||
|
||||
# Level 2b: conversation-aware quality check (if success_criteria set)
|
||||
if ctx.node_spec.success_criteria and ctx.llm:
|
||||
from framework.graph.conversation_judge import evaluate_phase_completion
|
||||
|
||||
verdict = await evaluate_phase_completion(
|
||||
llm=ctx.llm,
|
||||
conversation=conversation,
|
||||
phase_name=ctx.node_spec.name,
|
||||
phase_description=ctx.node_spec.description,
|
||||
success_criteria=ctx.node_spec.success_criteria,
|
||||
accumulator_state=accumulator.to_dict(),
|
||||
max_history_tokens=self._config.max_history_tokens,
|
||||
)
|
||||
if verdict.action != "ACCEPT":
|
||||
return JudgeVerdict(
|
||||
action="RETRY",
|
||||
feedback=(
|
||||
f"Task incomplete. Required outputs not yet produced: {missing}. "
|
||||
f"Follow your system prompt instructions to complete the work."
|
||||
),
|
||||
action=verdict.action,
|
||||
feedback=verdict.feedback or "Phase criteria not met.",
|
||||
)
|
||||
|
||||
# Tool calls were made -- continue loop
|
||||
return JudgeVerdict(action="RETRY", feedback="")
|
||||
return JudgeVerdict(action="ACCEPT", feedback="")
|
||||
|
||||
# -------------------------------------------------------------------
|
||||
# Helpers
|
||||
@@ -2956,8 +3002,10 @@ class EventLoopNode(NodeProtocol):
def _is_stalled(self, recent_responses: list[str]) -> bool:
"""Detect stall using n-gram similarity.

Detects when N consecutive responses have similarity >= threshold.
This catches phrases like "I'm still stuck" vs "I'm stuck".
Detects when ALL N consecutive responses are mutually similar
(>= threshold). A single dissimilar response resets the signal.
This catches phrases like "I'm still stuck" vs "I'm stuck"
without false-positives on "attempt 1" vs "attempt 2".
"""
if len(recent_responses) < self._config.stall_detection_threshold:
return False
@@ -2965,13 +3013,11 @@ class EventLoopNode(NodeProtocol):
return False

threshold = self._config.stall_similarity_threshold
# Check similarity against all recent responses (excluding self)
for i, resp in enumerate(recent_responses):
# Compare against all previous responses
for prev in recent_responses[:i]:
if self._ngram_similarity(resp, prev) >= threshold:
return True
return False
# Every consecutive pair must be similar
for i in range(1, len(recent_responses)):
if self._ngram_similarity(recent_responses[i], recent_responses[i - 1]) < threshold:
return False
return True

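The change above tightens stall detection from "any similar pair in the window" to "every consecutive pair similar". A minimal standalone sketch, assuming a Jaccard similarity over character n-grams as a stand-in for the real `_ngram_similarity` helper (whose implementation is not shown in this diff):

```python
def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    # Jaccard similarity over character n-grams — a stand-in for the
    # framework's actual similarity helper.
    def grams(s: str) -> set[str]:
        s = s.lower()
        return {s[i:i + n] for i in range(max(len(s) - n + 1, 1))}

    ga, gb = grams(a), grams(b)
    return len(ga & gb) / len(ga | gb) if ga | gb else 1.0


def is_stalled(recent: list[str], window: int = 3, threshold: float = 0.8) -> bool:
    if len(recent) < window:
        return False
    # Every consecutive pair must be similar; one dissimilar response
    # resets the signal, avoiding false positives on varied replies.
    return all(
        ngram_similarity(recent[i], recent[i - 1]) >= threshold
        for i in range(1, len(recent))
    )
```

With "any pair" semantics, two similar responses separated by genuinely different work would still trip the detector; requiring every consecutive pair to match avoids that.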
@staticmethod
def _is_transient_error(exc: BaseException) -> bool:
@@ -3050,10 +3096,11 @@ class EventLoopNode(NodeProtocol):
self,
recent_tool_fingerprints: list[list[tuple[str, str]]],
) -> tuple[bool, str]:
"""Detect doom loop using n-gram similarity on tool inputs.
"""Detect doom loop via exact fingerprint match.

Detects when N consecutive turns have similar tool calls.
Similarity applies to the canonicalized tool input strings.
Detects when N consecutive turns invoke the same tools with
identical (canonicalized) arguments. Different arguments mean
different work, so only exact matches count.

Returns (is_doom_loop, description).
"""
@@ -3066,23 +3113,12 @@ class EventLoopNode(NodeProtocol):
if not first:
return False, ""

# Convert a turn's list of (name, args) pairs to a single comparable string.
def _turn_sig(fp: list[tuple[str, str]]) -> str:
return "|".join(f"{name}:{args}" for name, args in fp)

first_sig = _turn_sig(first)
similarity_threshold = self._config.stall_similarity_threshold
similar_count = sum(
1
for fp in recent_tool_fingerprints
if self._ngram_similarity(_turn_sig(fp), first_sig) >= similarity_threshold
)

if similar_count >= threshold:
tool_names = [name for fp in recent_tool_fingerprints for name, _ in fp]
# All turns in the window must match the first exactly
if all(fp == first for fp in recent_tool_fingerprints[1:]):
tool_names = [name for name, _ in first]
desc = (
f"Doom loop detected: {similar_count}/{len(recent_tool_fingerprints)} "
f"consecutive similar tool calls ({', '.join(tool_names)})"
f"Doom loop detected: {len(recent_tool_fingerprints)} "
f"identical consecutive tool calls ({', '.join(tool_names)})"
)
return True, desc
return False, ""
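The doom-loop side moves in the opposite direction: from fuzzy n-gram similarity to exact fingerprint equality, since different arguments mean different work. A self-contained sketch of the new behavior (the function name and `threshold` default are illustrative; the matching logic mirrors the diff):

```python
def detect_doom_loop(
    fingerprints: list[list[tuple[str, str]]],
    threshold: int = 3,
) -> tuple[bool, str]:
    # Each turn is a list of (tool_name, canonicalized_args) pairs.
    if len(fingerprints) < threshold:
        return False, ""
    first = fingerprints[0]
    if not first:
        return False, ""
    # All turns in the window must match the first exactly — different
    # arguments mean different work, so only exact matches count.
    if all(fp == first for fp in fingerprints[1:]):
        tool_names = [name for name, _ in first]
        desc = (
            f"Doom loop detected: {len(fingerprints)} identical "
            f"consecutive tool calls ({', '.join(tool_names)})"
        )
        return True, desc
    return False, ""
```

Note the tool-name list is now taken from the first fingerprint only, rather than flattening every turn, so the description no longer repeats the same names N times.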
@@ -4288,22 +4324,18 @@ class EventLoopNode(NodeProtocol):

registry[escalation_id] = receiver
try:
# Stream message to user (parent's node_id so TUI shows parent talking)
await self._event_bus.emit_client_output_delta(
stream_id=ctx.node_id,
node_id=ctx.node_id,
content=message,
snapshot=message,
execution_id=ctx.execution_id,
)
# Request input (escalation_id for routing response back)
await self._event_bus.emit_client_input_requested(
stream_id=ctx.node_id,
# Escalate to the queen instead of asking the user directly.
# The queen handles the request and injects the response via
# inject_worker_message(), which finds this receiver through
# its _awaiting_input flag.
await self._event_bus.emit_escalation_requested(
stream_id=ctx.stream_id or ctx.node_id,
node_id=escalation_id,
prompt=message,
reason=f"Subagent report (wait_for_response) from {agent_id}",
context=message,
execution_id=ctx.execution_id,
)
# Block until user responds
# Block until queen responds
return await receiver.wait()
finally:
registry.pop(escalation_id, None)
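The register-emit-wait-unregister pattern in this hunk can be shown in isolation. This is a simplified stand-in (the `Receiver` class, module-level `registry`, and `demo()` driver are all illustrative; the real code routes through an event bus and the queen):

```python
import asyncio

registry: dict[str, "Receiver"] = {}


class Receiver:
    """Minimal stand-in for the escalation receiver in the diff."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue[str] = asyncio.Queue()

    async def wait(self) -> str:
        return await self._queue.get()

    def resolve(self, answer: str) -> None:
        self._queue.put_nowait(answer)


async def escalate(escalation_id: str, prompt: str) -> str:
    # Register the receiver, emit the request, block until a handler
    # resolves it, and always unregister afterwards (the finally block
    # guarantees cleanup even if the wait is cancelled).
    receiver = Receiver()
    registry[escalation_id] = receiver
    try:
        # emit_escalation_requested(...) would go here in the real code.
        return await receiver.wait()
    finally:
        registry.pop(escalation_id, None)


async def demo() -> str:
    task = asyncio.create_task(escalate("esc-1", "Need guidance"))
    await asyncio.sleep(0)              # let escalate() register itself
    registry["esc-1"].resolve("proceed")
    return await task
```

The `finally: registry.pop(...)` is what keeps the registry from leaking receivers when a waiting task is cancelled mid-escalation.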
@@ -1604,7 +1604,7 @@ class GraphExecutor:
# Return with paused status
return ExecutionResult(
success=False,
error="Execution paused by user",
error="Execution cancelled",
output=saved_memory,
steps_executed=steps,
total_tokens=total_tokens,
@@ -1,203 +0,0 @@
"""
Standardized HITL (Human-In-The-Loop) Protocol

This module defines the formal structure for pause/resume interactions
where agents need to gather input from humans.
"""

from dataclasses import dataclass, field
from enum import StrEnum
from typing import Any


class HITLInputType(StrEnum):
"""Type of input expected from human."""

FREE_TEXT = "free_text" # Open-ended text response
STRUCTURED = "structured" # Specific fields to fill
SELECTION = "selection" # Choose from options
APPROVAL = "approval" # Yes/no/modify decision
MULTI_FIELD = "multi_field" # Multiple related inputs


@dataclass
class HITLQuestion:
"""A single question to ask the human."""

id: str
question: str
input_type: HITLInputType = HITLInputType.FREE_TEXT

# For SELECTION type
options: list[str] = field(default_factory=list)

# For STRUCTURED type
fields: dict[str, str] = field(default_factory=dict) # {field_name: description}

# Metadata
required: bool = True
help_text: str = ""


@dataclass
class HITLRequest:
"""
Formal request for human input at a pause node.

This is what the agent produces when it needs human input.
"""

# Context
objective: str # What we're trying to accomplish
current_state: str # Where we are in the process

# What we need
questions: list[HITLQuestion] = field(default_factory=list)
missing_info: list[str] = field(default_factory=list)

# Guidance
instructions: str = ""
examples: list[str] = field(default_factory=list)

# Metadata
request_id: str = ""
node_id: str = ""

def to_dict(self) -> dict[str, Any]:
"""Convert to dictionary for serialization."""
return {
"objective": self.objective,
"current_state": self.current_state,
"questions": [
{
"id": q.id,
"question": q.question,
"input_type": q.input_type.value,
"options": q.options,
"fields": q.fields,
"required": q.required,
"help_text": q.help_text,
}
for q in self.questions
],
"missing_info": self.missing_info,
"instructions": self.instructions,
"examples": self.examples,
"request_id": self.request_id,
"node_id": self.node_id,
}


@dataclass
class HITLResponse:
"""
Human's response to a HITL request.

This is what gets passed back when resuming from a pause.
"""

# Original request reference
request_id: str

# Human's answers
answers: dict[str, Any] = field(default_factory=dict) # {question_id: answer}
raw_input: str = "" # Raw text if provided

# Metadata
response_time_ms: int = 0

def to_dict(self) -> dict[str, Any]:
"""Convert to dictionary for serialization."""
return {
"request_id": self.request_id,
"answers": self.answers,
"raw_input": self.raw_input,
"response_time_ms": self.response_time_ms,
}


class HITLProtocol:
"""
Standardized protocol for HITL interactions.

Usage in pause nodes:

1. Pause Node: Generates HITLRequest with questions
2. Executor: Saves state and returns request to user
3. User: Provides HITLResponse with answers
4. Resume Node: Processes response and merges into context
"""

@staticmethod
def create_request(
objective: str,
questions: list[HITLQuestion],
missing_info: list[str] | None = None,
node_id: str = "",
) -> HITLRequest:
"""Create a standardized HITL request."""
return HITLRequest(
objective=objective,
current_state="Awaiting clarification",
questions=questions,
missing_info=missing_info or [],
request_id=f"{node_id}_{hash(objective) % 10000}",
node_id=node_id,
)

@staticmethod
def parse_response(
raw_input: str,
request: HITLRequest,
use_haiku: bool = True,
) -> HITLResponse:
"""
Parse human's raw input into structured response.

Maps the raw input to the first question. For multi-question HITL,
the caller should present one question at a time.
"""
response = HITLResponse(request_id=request.request_id, raw_input=raw_input)

# If no questions, just return raw input
if not request.questions:
return response

# Map raw input to first question
response.answers[request.questions[0].id] = raw_input
return response

@staticmethod
def format_for_display(request: HITLRequest) -> str:
"""Format HITL request for user-friendly display."""
parts = []

if request.objective:
parts.append(f"📋 Objective: {request.objective}")

if request.current_state:
parts.append(f"📍 Current State: {request.current_state}")

if request.instructions:
parts.append(f"\n{request.instructions}")

if request.questions:
parts.append(f"\n❓ Questions ({len(request.questions)}):")
for i, q in enumerate(request.questions, 1):
parts.append(f"{i}. {q.question}")
if q.help_text:
parts.append(f" 💡 {q.help_text}")
if q.options:
parts.append(f" Options: {', '.join(q.options)}")

if request.missing_info:
parts.append("\n📝 Missing Information:")
for info in request.missing_info:
parts.append(f" • {info}")

if request.examples:
parts.append("\n📚 Examples:")
for example in request.examples:
parts.append(f" • {example}")

return "\n".join(parts)
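The removed module's core idea — raw user text mapped onto the first structured question of a request — is small enough to exercise with condensed stand-ins. `Question` and `Request` below are illustrative simplifications of the deleted `HITLQuestion` / `HITLRequest` dataclasses, and `parse_response` mirrors the deleted `HITLProtocol.parse_response` mapping:

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    id: str
    question: str
    options: list[str] = field(default_factory=list)


@dataclass
class Request:
    objective: str
    questions: list[Question] = field(default_factory=list)
    request_id: str = "req-1"


def parse_response(raw: str, request: Request) -> dict[str, str]:
    # Map the raw input to the first question, as the removed
    # HITLProtocol.parse_response did; multi-question requests were
    # expected to be presented one question at a time.
    if not request.questions:
        return {}
    return {request.questions[0].id: raw}


req = Request(
    objective="Collect deployment target",
    questions=[Question("env", "Which environment?", ["staging", "production"])],
)
answers = parse_response("staging", req)
```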
@@ -118,6 +118,10 @@ RATE_LIMIT_MAX_RETRIES = 10
RATE_LIMIT_BACKOFF_BASE = 2 # seconds
RATE_LIMIT_MAX_DELAY = 120 # seconds - cap to prevent absurd waits
MINIMAX_API_BASE = "https://api.minimax.io/v1"
# Kimi For Coding uses an Anthropic-compatible endpoint (no /v1 suffix).
# Claude Code integration uses this format; the /v1 OpenAI-compatible endpoint
# enforces a coding-agent whitelist that blocks unknown User-Agents.
KIMI_API_BASE = "https://api.kimi.com/coding"

# Empty-stream retries use a short fixed delay, not the rate-limit backoff.
# Conversation-structure issues are deterministic — long waits don't help.
@@ -323,9 +327,21 @@ class LiteLLMProvider(LLMProvider):
api_base: Custom API base URL (for proxies or local deployments)
**kwargs: Additional arguments passed to litellm.completion()
"""
# Kimi For Coding exposes an Anthropic-compatible endpoint at
# https://api.kimi.com/coding (the same format Claude Code uses natively).
# Translate kimi/ prefix to anthropic/ so litellm uses the Anthropic
# Messages API handler and routes to that endpoint — no special headers needed.
_original_model = model
if model.lower().startswith("kimi/"):
model = "anthropic/" + model[len("kimi/") :]
# Normalise api_base: litellm's Anthropic handler appends /v1/messages,
# so the base must be https://api.kimi.com/coding (no /v1 suffix).
# Strip a trailing /v1 in case the user's saved config has the old value.
if api_base and api_base.rstrip("/").endswith("/v1"):
api_base = api_base.rstrip("/")[:-3]
self.model = model
self.api_key = api_key
self.api_base = api_base or self._default_api_base_for_model(model)
self.api_base = api_base or self._default_api_base_for_model(_original_model)
self.extra_kwargs = kwargs
# The Codex ChatGPT backend (chatgpt.com/backend-api/codex) rejects
# several standard OpenAI params: max_output_tokens, stream_options.
@@ -350,6 +366,8 @@ class LiteLLMProvider(LLMProvider):
model_lower = model.lower()
if model_lower.startswith("minimax/") or model_lower.startswith("minimax-"):
return MINIMAX_API_BASE
if model_lower.startswith("kimi/"):
return KIMI_API_BASE
return None

def _completion_with_rate_limit_retry(
@@ -1,4 +0,0 @@
"""MCP servers for worker-bee."""

# Don't auto-import servers to avoid double-import issues when running with -m
__all__ = []

+55 −470
@@ -51,11 +51,7 @@ def register_commands(subparsers: argparse._SubParsersAction) -> None:
action="store_true",
help="Show detailed execution logs (steps, LLM calls, etc.)",
)
run_parser.add_argument(
"--tui",
action="store_true",
help="Launch interactive terminal dashboard",
)

run_parser.add_argument(
"--model",
"-m",
@@ -194,158 +190,6 @@ def register_commands(subparsers: argparse._SubParsersAction) -> None:
shell_parser.set_defaults(func=cmd_shell)

# tui command (interactive agent dashboard)
tui_parser = subparsers.add_parser(
"tui",
help="Launch interactive TUI dashboard",
description="Browse available agents and launch the terminal dashboard.",
)
tui_parser.add_argument(
"--model",
"-m",
type=str,
default=None,
help="LLM model to use (any LiteLLM-compatible name)",
)
tui_parser.set_defaults(func=cmd_tui)

# code command (Hive Coder — framework agent builder)
code_parser = subparsers.add_parser(
"code",
help="Launch Hive Coder to build agents",
description="Interactive agent builder. Describe what you want and Hive Coder builds it.",
)
code_parser.add_argument(
"--model",
"-m",
type=str,
default=None,
help="LLM model to use (any LiteLLM-compatible name)",
)
code_parser.set_defaults(func=cmd_code)

# sessions command group (checkpoint/resume management)
sessions_parser = subparsers.add_parser(
"sessions",
help="Manage agent sessions",
description="List, inspect, and manage agent execution sessions.",
)
sessions_subparsers = sessions_parser.add_subparsers(
dest="sessions_cmd",
help="Session management commands",
)

# sessions list
sessions_list_parser = sessions_subparsers.add_parser(
"list",
help="List agent sessions",
description="List all sessions for an agent.",
)
sessions_list_parser.add_argument(
"agent_path",
type=str,
help="Path to agent folder",
)
sessions_list_parser.add_argument(
"--status",
choices=["all", "active", "failed", "completed", "paused"],
default="all",
help="Filter by session status (default: all)",
)
sessions_list_parser.add_argument(
"--has-checkpoints",
action="store_true",
help="Show only sessions with checkpoints",
)
sessions_list_parser.set_defaults(func=cmd_sessions_list)

# sessions show
sessions_show_parser = sessions_subparsers.add_parser(
"show",
help="Show session details",
description="Display detailed information about a specific session.",
)
sessions_show_parser.add_argument(
"agent_path",
type=str,
help="Path to agent folder",
)
sessions_show_parser.add_argument(
"session_id",
type=str,
help="Session ID to inspect",
)
sessions_show_parser.add_argument(
"--json",
action="store_true",
help="Output as JSON",
)
sessions_show_parser.set_defaults(func=cmd_sessions_show)

# sessions checkpoints
sessions_checkpoints_parser = sessions_subparsers.add_parser(
"checkpoints",
help="List session checkpoints",
description="List all checkpoints for a session.",
)
sessions_checkpoints_parser.add_argument(
"agent_path",
type=str,
help="Path to agent folder",
)
sessions_checkpoints_parser.add_argument(
"session_id",
type=str,
help="Session ID",
)
sessions_checkpoints_parser.set_defaults(func=cmd_sessions_checkpoints)

# pause command
pause_parser = subparsers.add_parser(
"pause",
help="Pause running session",
description="Request graceful pause of a running agent session.",
)
pause_parser.add_argument(
"agent_path",
type=str,
help="Path to agent folder",
)
pause_parser.add_argument(
"session_id",
type=str,
help="Session ID to pause",
)
pause_parser.set_defaults(func=cmd_pause)

# resume command
resume_parser = subparsers.add_parser(
"resume",
help="Resume session from checkpoint",
description="Resume a paused or failed session from a checkpoint.",
)
resume_parser.add_argument(
"agent_path",
type=str,
help="Path to agent folder",
)
resume_parser.add_argument(
"session_id",
type=str,
help="Session ID to resume",
)
resume_parser.add_argument(
"--checkpoint",
"-c",
type=str,
help="Specific checkpoint ID to resume from (default: latest)",
)
resume_parser.add_argument(
"--tui",
action="store_true",
help="Resume in TUI dashboard mode",
)
resume_parser.set_defaults(func=cmd_resume)

# setup-credentials command
setup_creds_parser = subparsers.add_parser(
"setup-credentials",
@@ -577,128 +421,67 @@ def cmd_run(args: argparse.Namespace) -> int:
)
return 1

# Run the agent (with TUI or standard)
if getattr(args, "tui", False):
from framework.tui.app import AdenTUI
# Standard execution
# AgentRunner handles credential setup interactively when stdin is a TTY.
try:
runner = AgentRunner.load(
args.agent_path,
model=args.model,
)
except CredentialError as e:
print(f"\n{e}", file=sys.stderr)
return 1
except FileNotFoundError as e:
print(f"Error: {e}", file=sys.stderr)
return 1

async def run_with_tui():
try:
# Load runner inside the async loop to ensure strict loop affinity
# (only one load — avoids spawning duplicate MCP subprocesses)
# AgentRunner handles credential setup interactively when stdin is a TTY.
try:
runner = AgentRunner.load(
args.agent_path,
model=args.model,
)
except CredentialError as e:
print(f"\n{e}", file=sys.stderr)
return
except Exception as e:
print(f"Error loading agent: {e}")
return
# Prompt before starting (allows credential updates)
if sys.stdin.isatty() and not args.quiet:
runner = _prompt_before_start(args.agent_path, runner, args.model)
if runner is None:
return 1

# Prompt before starting (allows credential updates)
if sys.stdin.isatty():
runner = _prompt_before_start(args.agent_path, runner, args.model)
if runner is None:
return

# Force setup inside the loop
if runner._agent_runtime is None:
try:
runner._setup()
except CredentialError as e:
print(f"\n{e}", file=sys.stderr)
return

# Start runtime before TUI so it's ready for user input
if runner._agent_runtime and not runner._agent_runtime.is_running:
await runner._agent_runtime.start()

app = AdenTUI(
runner._agent_runtime,
resume_session=getattr(args, "resume_session", None),
resume_checkpoint=getattr(args, "checkpoint", None),
)

# TUI handles execution via ChatRepl — user submits input,
# ChatRepl calls runtime.trigger_and_wait(). No auto-launch.
await app.run_async()
except Exception as e:
import traceback

traceback.print_exc()
print(f"TUI error: {e}")

await runner.cleanup_async()
return None

asyncio.run(run_with_tui())
print("TUI session ended.")
return 0
else:
# Standard execution — load runner here (not shared with TUI path)
# AgentRunner handles credential setup interactively when stdin is a TTY.
try:
runner = AgentRunner.load(
args.agent_path,
model=args.model,
# Load session/checkpoint state for resume (headless mode)
session_state = None
resume_session = getattr(args, "resume_session", None)
checkpoint = getattr(args, "checkpoint", None)
if resume_session:
session_state = _load_resume_state(args.agent_path, resume_session, checkpoint)
if session_state is None:
print(
f"Error: Could not load session state for {resume_session}",
file=sys.stderr,
)
except CredentialError as e:
print(f"\n{e}", file=sys.stderr)
return 1
except FileNotFoundError as e:
print(f"Error: {e}", file=sys.stderr)
return 1

# Prompt before starting (allows credential updates)
if sys.stdin.isatty() and not args.quiet:
runner = _prompt_before_start(args.agent_path, runner, args.model)
if runner is None:
return 1

# Load session/checkpoint state for resume (headless mode)
session_state = None
resume_session = getattr(args, "resume_session", None)
checkpoint = getattr(args, "checkpoint", None)
if resume_session:
session_state = _load_resume_state(args.agent_path, resume_session, checkpoint)
if session_state is None:
print(
f"Error: Could not load session state for {resume_session}",
file=sys.stderr,
)
return 1
if not args.quiet:
resume_node = session_state.get("paused_at", "unknown")
if checkpoint:
print(f"Resuming from checkpoint: {checkpoint}")
else:
print(f"Resuming session: {resume_session}")
print(f"Resume point: {resume_node}")
print()

# Auto-inject user_id if the agent expects it but it's not provided
entry_input_keys = runner.graph.nodes[0].input_keys if runner.graph.nodes else []
if "user_id" in entry_input_keys and context.get("user_id") is None:
import os

context["user_id"] = os.environ.get("USER", "default_user")

if not args.quiet:
info = runner.info()
print(f"Agent: {info.name}")
print(f"Goal: {info.goal_name}")
print(f"Steps: {info.node_count}")
print(f"Input: {json.dumps(context)}")
print()
print("=" * 60)
print("Executing agent...")
print("=" * 60)
resume_node = session_state.get("paused_at", "unknown")
if checkpoint:
print(f"Resuming from checkpoint: {checkpoint}")
else:
print(f"Resuming session: {resume_session}")
print(f"Resume point: {resume_node}")
print()

result = asyncio.run(runner.run(context, session_state=session_state))
# Auto-inject user_id if the agent expects it but it's not provided
entry_input_keys = runner.graph.nodes[0].input_keys if runner.graph.nodes else []
if "user_id" in entry_input_keys and context.get("user_id") is None:
import os

context["user_id"] = os.environ.get("USER", "default_user")

if not args.quiet:
info = runner.info()
print(f"Agent: {info.name}")
print(f"Goal: {info.goal_name}")
print(f"Steps: {info.node_count}")
print(f"Input: {json.dumps(context)}")
print()
print("=" * 60)
print("Executing agent...")
print("=" * 60)
print()

result = asyncio.run(runner.run(context, session_state=session_state))

# Format output
output = {
@@ -1364,154 +1147,6 @@ def _get_framework_agents_dir() -> Path:
|
||||
return Path(__file__).resolve().parent.parent / "agents"
|
||||
|
||||
|
||||
def _launch_agent_tui(
|
||||
agent_path: str | Path,
|
||||
model: str | None = None,
|
||||
) -> int:
|
||||
"""Load an agent and launch the TUI. Shared by cmd_tui and cmd_code."""
|
||||
from framework.credentials.models import CredentialError
|
||||
from framework.runner import AgentRunner
|
||||
from framework.tui.app import AdenTUI
|
||||
|
||||
async def run_with_tui():
|
||||
# AgentRunner handles credential setup interactively when stdin is a TTY.
|
||||
try:
|
||||
runner = AgentRunner.load(
|
||||
agent_path,
|
||||
model=model,
|
||||
)
|
||||
except CredentialError as e:
|
||||
print(f"\n{e}", file=sys.stderr)
|
||||
return
|
||||
except Exception as e:
|
||||
print(f"Error loading agent: {e}")
|
||||
return
|
||||
|
||||
if runner._agent_runtime is None:
|
||||
try:
|
||||
runner._setup()
|
||||
except CredentialError as e:
|
||||
print(f"\n{e}", file=sys.stderr)
|
||||
return
|
||||
|
||||
if runner._agent_runtime and not runner._agent_runtime.is_running:
|
||||
await runner._agent_runtime.start()
|
||||
|
||||
app = AdenTUI(runner._agent_runtime)
|
||||
try:
|
||||
await app.run_async()
|
||||
except Exception as e:
|
||||
import traceback
|
||||
|
||||
traceback.print_exc()
|
||||
print(f"TUI error: {e}")
|
||||
|
||||
await runner.cleanup_async()
|
||||
|
||||
asyncio.run(run_with_tui())
|
||||
print("TUI session ended.")
|
||||
return 0
|
||||
|
||||
|
||||
def cmd_tui(args: argparse.Namespace) -> int:
"""Launch the interactive TUI dashboard with in-app agent picker."""
import logging

logging.basicConfig(level=logging.WARNING, format="%(message)s")

from framework.tui.app import AdenTUI

async def run_tui():
app = AdenTUI(
model=args.model,
)
await app.run_async()

asyncio.run(run_tui())
print("TUI session ended.")
return 0


def cmd_code(args: argparse.Namespace) -> int:
"""Launch Hive Coder with multi-graph support.

Unlike ``_launch_agent_tui``, this sets up graph lifecycle tools and
assigns ``graph_id="hive_coder"`` so the coder can load, supervise,
and restart secondary agent graphs within the same session.
"""
import logging

logging.basicConfig(level=logging.WARNING, format="%(message)s")

framework_agents_dir = _get_framework_agents_dir()
hive_coder_path = framework_agents_dir / "hive_coder"

if not (hive_coder_path / "agent.py").exists():
print("Error: Hive Coder agent not found.", file=sys.stderr)
return 1

# Ensure framework agents dir is on sys.path for import
fa_str = str(framework_agents_dir)
if fa_str not in sys.path:
sys.path.insert(0, fa_str)

from framework.credentials.models import CredentialError
from framework.runner import AgentRunner
from framework.tools.session_graph_tools import register_graph_tools
from framework.tui.app import AdenTUI

async def run_with_tui():
try:
runner = AgentRunner.load(hive_coder_path, model=args.model)
except CredentialError as e:
print(f"\n{e}", file=sys.stderr)
return
except Exception as e:
print(f"Error loading agent: {e}")
return

if runner._agent_runtime is None:
try:
runner._setup()
except CredentialError as e:
print(f"\n{e}", file=sys.stderr)
return

runtime = runner._agent_runtime

# -- Multi-graph setup --
# Tag the primary graph so events carry graph_id="hive_coder"
runtime._graph_id = "hive_coder"
runtime._active_graph_id = "hive_coder"

# Register graph lifecycle tools (load_agent, unload_agent, etc.)
register_graph_tools(runner._tool_registry, runtime)

# Refresh tool schemas AND executor so streams see the new tools.
# The executor closure references the registry dict by ref, but
# refreshing both is robust against any copy-on-read behavior.
runtime._tools = list(runner._tool_registry.get_tools().values())
runtime._tool_executor = runner._tool_registry.get_executor()

if not runtime.is_running:
await runtime.start()

app = AdenTUI(runtime)
try:
await app.run_async()
except Exception as e:
import traceback

traceback.print_exc()
print(f"TUI error: {e}")

await runner.cleanup_async()

asyncio.run(run_with_tui())
print("TUI session ended.")
return 0


def _extract_python_agent_metadata(agent_path: Path) -> tuple[str, str]:
"""Extract name and description from a Python-based agent's config.py.

@@ -1864,56 +1499,6 @@ def _interactive_multi(agents_dir: Path) -> int:
return 0


def cmd_sessions_list(args: argparse.Namespace) -> int:
"""List agent sessions."""
print("⚠ Sessions list command not yet implemented")
print("This will be available once checkpoint infrastructure is complete.")
print(f"\nAgent: {args.agent_path}")
print(f"Status filter: {args.status}")
print(f"Has checkpoints: {args.has_checkpoints}")
return 1


def cmd_sessions_show(args: argparse.Namespace) -> int:
"""Show detailed session information."""
print("⚠ Session show command not yet implemented")
print("This will be available once checkpoint infrastructure is complete.")
print(f"\nAgent: {args.agent_path}")
print(f"Session: {args.session_id}")
return 1


def cmd_sessions_checkpoints(args: argparse.Namespace) -> int:
"""List checkpoints for a session."""
print("⚠ Session checkpoints command not yet implemented")
print("This will be available once checkpoint infrastructure is complete.")
print(f"\nAgent: {args.agent_path}")
print(f"Session: {args.session_id}")
return 1


def cmd_pause(args: argparse.Namespace) -> int:
"""Pause a running session."""
print("⚠ Pause command not yet implemented")
print("This will be available once executor pause integration is complete.")
print(f"\nAgent: {args.agent_path}")
print(f"Session: {args.session_id}")
return 1


def cmd_resume(args: argparse.Namespace) -> int:
"""Resume a session from checkpoint."""
print("⚠ Resume command not yet implemented")
print("This will be available once checkpoint resume integration is complete.")
print(f"\nAgent: {args.agent_path}")
print(f"Session: {args.session_id}")
if args.checkpoint:
print(f"Checkpoint: {args.checkpoint}")
if args.tui:
print("Mode: TUI")
return 1


def cmd_setup_credentials(args: argparse.Namespace) -> int:
"""Interactive credential setup for an agent."""
from framework.credentials.setup import CredentialSetupSession

@@ -68,6 +68,7 @@ class MCPClient:
self._read_stream = None
self._write_stream = None
self._stdio_context = None  # Context manager for stdio_client
self._errlog_handle = None  # Track errlog file handle for cleanup
self._http_client: httpx.Client | None = None
self._tools: dict[str, MCPTool] = {}
self._connected = False
@@ -200,7 +201,8 @@ class MCPClient:
if os.name == "nt":
errlog = sys.stderr
else:
errlog = open(os.devnull, "w")  # noqa: SIM115
self._errlog_handle = open(os.devnull, "w")
errlog = self._errlog_handle
self._stdio_context = stdio_client(server_params, errlog=errlog)
(
self._read_stream,
@@ -475,6 +477,15 @@ class MCPClient:
finally:
self._stdio_context = None

# Third: close errlog file handle if we opened one
if self._errlog_handle is not None:
try:
self._errlog_handle.close()
except Exception as e:
logger.debug(f"Error closing errlog handle: {e}")
finally:
self._errlog_handle = None

def disconnect(self) -> None:
"""Disconnect from the MCP server."""
# Clean up persistent STDIO connection
@@ -545,6 +556,7 @@ class MCPClient:
self._write_stream = None
self._loop = None
self._loop_thread = None
self._errlog_handle = None

# Clean up HTTP client
if self._http_client:

@@ -517,6 +517,41 @@ def get_codex_account_id() -> str | None:
return None


# ---------------------------------------------------------------------------
# Kimi Code subscription token helpers
# ---------------------------------------------------------------------------


def get_kimi_code_token() -> str | None:
"""Get the API key from a Kimi Code CLI installation.

Reads the API key from ``~/.kimi/config.toml``, which is created when
the user runs ``kimi /login`` in the Kimi Code CLI.

Returns:
The API key if available, None otherwise.
"""
import tomllib

config_path = Path.home() / ".kimi" / "config.toml"
if not config_path.exists():
return None

try:
with open(config_path, "rb") as f:
config = tomllib.load(f)
providers = config.get("providers", {})
# kimi-cli stores credentials under providers.kimi-for-coding
for provider_cfg in providers.values():
if isinstance(provider_cfg, dict):
key = provider_cfg.get("api_key")
if key:
return key
except Exception:
pass
return None


@dataclass
class AgentInfo:
"""Information about an exported agent."""
@@ -1104,6 +1139,7 @@ class AgentRunner:
llm_config = config.get("llm", {})
use_claude_code = llm_config.get("use_claude_code_subscription", False)
use_codex = llm_config.get("use_codex_subscription", False)
use_kimi_code = llm_config.get("use_kimi_code_subscription", False)
api_base = llm_config.get("api_base")

api_key = None
@@ -1119,6 +1155,12 @@ class AgentRunner:
if not api_key:
print("Warning: Codex subscription configured but no token found.")
print("Run 'codex' to authenticate, then try again.")
elif use_kimi_code:
# Get API key from Kimi Code CLI config (~/.kimi/config.toml)
api_key = get_kimi_code_token()
if not api_key:
print("Warning: Kimi Code subscription configured but no key found.")
print("Run 'kimi /login' to authenticate, then try again.")

if api_key and use_claude_code:
# Use litellm's built-in Anthropic OAuth support.
@@ -1149,6 +1191,14 @@ class AgentRunner:
store=False,
allowed_openai_params=["store"],
)
elif api_key and use_kimi_code:
# Kimi Code subscription uses the Kimi coding API (OpenAI-compatible).
# The api_base is set automatically by LiteLLMProvider for kimi/ models.
self._llm = LiteLLMProvider(
model=self.model,
api_key=api_key,
api_base=api_base,
)
else:
# Local models (e.g. Ollama) don't need an API key
if self._is_local_model(self.model):
@@ -1314,6 +1364,8 @@ class AgentRunner:
return "TOGETHER_API_KEY"
elif model_lower.startswith("minimax/") or model_lower.startswith("minimax-"):
return "MINIMAX_API_KEY"
elif model_lower.startswith("kimi/"):
return "KIMI_API_KEY"
else:
# Default: assume OpenAI-compatible
return "OPENAI_API_KEY"
@@ -1334,6 +1386,8 @@ class AgentRunner:
cred_id = "anthropic"
elif model_lower.startswith("minimax/") or model_lower.startswith("minimax-"):
cred_id = "minimax"
elif model_lower.startswith("kimi/"):
cred_id = "kimi"
# Add more mappings as providers are added to LLM_CREDENTIALS

if cred_id is None:

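The `elif` chains above reduce to a first-match scan over prefix/value pairs. A data-driven sketch, using only the prefixes visible in this hunk (the full table in the source has more branches):

```python
# (prefixes, env var) pairs from the branches shown above; first match wins.
_PREFIX_ENV: list[tuple[tuple[str, ...], str]] = [
    (("minimax/", "minimax-"), "MINIMAX_API_KEY"),
    (("kimi/",), "KIMI_API_KEY"),
]

def api_key_env(model: str) -> str:
    model_lower = model.lower()
    for prefixes, env in _PREFIX_ENV:
        # str.startswith accepts a tuple, so each row can hold
        # several spellings of the same provider prefix.
        if model_lower.startswith(prefixes):
            return env
    return "OPENAI_API_KEY"  # default: assume OpenAI-compatible

print(api_key_env("kimi/k2-latest"))  # → KIMI_API_KEY
print(api_key_env("gpt-4o-mini"))     # → OPENAI_API_KEY
```

A table like this keeps the two mappings in the diff (env var and credential ID) from drifting apart as providers are added, since both can be driven from one prefix list.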
@@ -349,7 +349,7 @@ class AgentRuntime:
return
# Skip events originating from this graph's own
# executions (e.g. guardian should not fire on
# hive_coder failures — only secondary graphs).
# queen failures — only secondary graphs).
if _exclude_own and event.graph_id == self._graph_id:
return
ep_spec = self._entry_points.get(entry_point_id)
@@ -1531,6 +1531,11 @@ class AgentRuntime:
for executor in stream._active_executors.values():
for node_id, node in executor.node_registry.items():
if getattr(node, "_awaiting_input", False):
# Skip escalation receivers — those are handled
# by the queen via inject_worker_message(), not
# by the user directly.
if ":escalation:" in node_id:
continue
return node_id, graph_id
return None, None


@@ -123,7 +123,7 @@ class EventType(StrEnum):
# Custom events
CUSTOM = "custom"

# Escalation (agent requests handoff to hive_coder)
# Escalation (agent requests handoff to queen)
ESCALATION_REQUESTED = "escalation_requested"

# Worker health monitoring (judge → queen → operator)
@@ -976,7 +976,7 @@ class EventBus:
context: str = "",
execution_id: str | None = None,
) -> None:
"""Emit escalation requested event (agent wants hive_coder)."""
"""Emit escalation requested event (agent wants queen)."""
await self.publish(
AgentEvent(
type=EventType.ESCALATION_REQUESTED,

@@ -9,6 +9,7 @@ Each stream has:

import asyncio
import logging
import os
import time
import uuid
from collections import OrderedDict
@@ -240,6 +241,7 @@ class ExecutionStream:
self._active_executions: dict[str, ExecutionContext] = {}
self._execution_tasks: dict[str, asyncio.Task] = {}
self._active_executors: dict[str, GraphExecutor] = {}
self._cancel_reasons: dict[str, str] = {}
self._execution_results: OrderedDict[str, ExecutionResult] = OrderedDict()
self._execution_result_times: dict[str, float] = {}
self._completion_events: dict[str, asyncio.Event] = {}
@@ -464,7 +466,7 @@ class ExecutionStream:
node.signal_shutdown()
if hasattr(node, "cancel_current_turn"):
node.cancel_current_turn()
await self.cancel_execution(eid)
await self.cancel_execution(eid, reason="Restarted with new execution")

# When resuming, reuse the original session ID so the execution
# continues in the same session directory instead of creating a new one.
@@ -801,19 +803,20 @@ class ExecutionStream:
# Emit SSE event so the frontend knows the execution stopped.
# The executor does NOT emit on CancelledError, so there is no
# risk of double-emitting.
cancel_reason = self._cancel_reasons.pop(execution_id, "Execution cancelled")
if self._scoped_event_bus:
if has_result and result.paused_at:
await self._scoped_event_bus.emit_execution_paused(
stream_id=self.stream_id,
node_id=result.paused_at,
reason="Execution cancelled",
reason=cancel_reason,
execution_id=execution_id,
)
else:
await self._scoped_event_bus.emit_execution_failed(
stream_id=self.stream_id,
execution_id=execution_id,
error="Execution cancelled",
error=cancel_reason,
correlation_id=ctx.correlation_id,
)

@@ -961,6 +964,9 @@ class ExecutionStream:
if error:
state.result.error = error

# Stamp the owning process ID for cross-process stale detection
state.pid = os.getpid()

# Write state.json
await self._session_store.write_state(execution_id, state)
logger.debug(f"Wrote state.json for session {execution_id} (status={status})")
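The `pid` stamped into `state.json` here pairs with the stale-session check in `SessionManager` further below: on restart, a session marked active is only cleaned up if its recorded process is gone. The liveness probe can be sketched with `os.kill(pid, 0)`; `is_pid_alive` is an illustrative name under POSIX semantics, not necessarily the framework's implementation.

```python
import os

def is_pid_alive(pid: int) -> bool:
    """Best-effort check that a process with this PID exists (POSIX)."""
    try:
        # Signal 0 delivers nothing; it only performs the
        # existence/permission check for the target process.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False        # no such process: the session is stale
    except PermissionError:
        return True         # exists, but owned by another user
    return True

state_pid = os.getpid()        # what write_state stamps into state.json
print(is_pid_alive(state_pid))  # True: this process is clearly running
```

On Windows this probe behaves differently, which is one reason to treat the PID check as advisory and keep the in-memory session-ID check as the first layer.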
@@ -1054,18 +1060,24 @@ class ExecutionStream:
"""Get execution context."""
return self._active_executions.get(execution_id)

async def cancel_execution(self, execution_id: str) -> bool:
async def cancel_execution(self, execution_id: str, *, reason: str | None = None) -> bool:
"""
Cancel a running execution.

Args:
execution_id: Execution to cancel
reason: Human-readable reason for the cancellation (e.g.
"Stopped by queen", "User requested pause"). If not
provided, defaults to "Execution cancelled".

Returns:
True if cancelled, False if not found
"""
task = self._execution_tasks.get(execution_id)
if task and not task.done():
# Store the reason so the CancelledError handler can use it
# when emitting the pause/fail event.
self._cancel_reasons[execution_id] = reason or "Execution cancelled"
task.cancel()
# Wait briefly for the task to finish. Don't block indefinitely —
# the task may be stuck in a long LLM API call that doesn't

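The reason plumbing above (stash before `task.cancel()`, pop inside the `CancelledError` path) can be reduced to a self-contained sketch. Names and the stand-in workload are illustrative, not the framework's:

```python
import asyncio

cancel_reasons: dict[str, str] = {}
emitted: list[str] = []

async def run_execution(execution_id: str) -> None:
    try:
        await asyncio.sleep(60)  # stands in for the real graph execution
    except asyncio.CancelledError:
        # Pop the stashed reason, falling back like the diff does.
        emitted.append(cancel_reasons.pop(execution_id, "Execution cancelled"))
        raise  # re-raise so the task is still marked cancelled

async def main() -> None:
    task = asyncio.create_task(run_execution("exec-1"))
    await asyncio.sleep(0)  # let the task start and reach the await
    # Stash the reason BEFORE cancelling, so the handler can see it.
    cancel_reasons["exec-1"] = "Stopped by queen"
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass

asyncio.run(main())
print(emitted)  # ['Stopped by queen']
```

Keying the reason by `execution_id` rather than attaching it to the task keeps the cancel path usable from code that only holds the ID, which is exactly how the HTTP handlers later in this diff call it.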
@@ -134,6 +134,9 @@ class SessionState(BaseModel):
# Input data (for debugging/replay)
input_data: dict[str, Any] = Field(default_factory=dict)

# Process ID of the owning process (for cross-process stale session detection)
pid: int | None = None

# Isolation level (from ExecutionContext)
isolation_level: str = "shared"


@@ -1,36 +0,0 @@
"""Backward-compatibility shim.

The primary implementation is now in ``session_manager.py``.
This module re-exports ``SessionManager`` as ``AgentManager`` and
keeps ``AgentSlot`` for test compatibility.
"""

import asyncio
from dataclasses import dataclass
from pathlib import Path
from typing import Any

from framework.server.session_manager import Session, SessionManager  # noqa: F401


@dataclass
class AgentSlot:
"""Legacy data class — kept for test compatibility only.

New code should use ``Session`` from ``session_manager``.
"""

id: str
agent_path: Path
runner: Any
runtime: Any
info: Any
loaded_at: float
queen_executor: Any = None
queen_task: asyncio.Task | None = None
judge_task: asyncio.Task | None = None
escalation_sub: str | None = None


# Backward compat alias
AgentManager = SessionManager
@@ -0,0 +1,331 @@
"""Queen orchestrator — builds and runs the queen executor.

Extracted from SessionManager._start_queen() to keep session management
and queen orchestration concerns separate.
"""

from __future__ import annotations

import asyncio
import logging
from pathlib import Path
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
from framework.server.session_manager import Session

logger = logging.getLogger(__name__)


async def create_queen(
session: Session,
session_manager: Any,
worker_identity: str | None,
queen_dir: Path,
initial_prompt: str | None = None,
) -> asyncio.Task:
"""Build the queen executor and return the running asyncio task.

Handles tool registration, phase-state initialization, prompt
composition, persona hook setup, graph preparation, and the queen
event loop.
"""
from framework.agents.queen.agent import (
queen_goal,
queen_graph as _queen_graph,
)
from framework.agents.queen.nodes import (
_QUEEN_BUILDING_TOOLS,
_QUEEN_PLANNING_TOOLS,
_QUEEN_RUNNING_TOOLS,
_QUEEN_STAGING_TOOLS,
_appendices,
_building_knowledge,
_planning_knowledge,
_queen_behavior_always,
_queen_behavior_building,
_queen_behavior_planning,
_queen_behavior_running,
_queen_behavior_staging,
_queen_identity_building,
_queen_identity_planning,
_queen_identity_running,
_queen_identity_staging,
_queen_phase_7,
_queen_style,
_queen_tools_building,
_queen_tools_planning,
_queen_tools_running,
_queen_tools_staging,
_shared_building_knowledge,
)
from framework.agents.queen.nodes.thinking_hook import select_expert_persona
from framework.graph.event_loop_node import HookContext, HookResult
from framework.graph.executor import GraphExecutor
from framework.runner.tool_registry import ToolRegistry
from framework.runtime.core import Runtime
from framework.runtime.event_bus import AgentEvent, EventType
from framework.tools.queen_lifecycle_tools import (
QueenPhaseState,
register_queen_lifecycle_tools,
)

hive_home = Path.home() / ".hive"

# ---- Tool registry ------------------------------------------------
queen_registry = ToolRegistry()
import framework.agents.queen as _queen_pkg

queen_pkg_dir = Path(_queen_pkg.__file__).parent
mcp_config = queen_pkg_dir / "mcp_servers.json"
if mcp_config.exists():
try:
queen_registry.load_mcp_config(mcp_config)
logger.info("Queen: loaded MCP tools from %s", mcp_config)
except Exception:
logger.warning("Queen: MCP config failed to load", exc_info=True)

# ---- Phase state --------------------------------------------------
initial_phase = "staging" if worker_identity else "planning"
phase_state = QueenPhaseState(phase=initial_phase, event_bus=session.event_bus)
session.phase_state = phase_state

# ---- Lifecycle tools (always registered) --------------------------
register_queen_lifecycle_tools(
queen_registry,
session=session,
session_id=session.id,
session_manager=session_manager,
manager_session_id=session.id,
phase_state=phase_state,
)

# ---- Monitoring tools (only when worker is loaded) ----------------
if session.worker_runtime:
from framework.tools.worker_monitoring_tools import register_worker_monitoring_tools

register_worker_monitoring_tools(
queen_registry,
session.event_bus,
session.worker_path,
stream_id="queen",
worker_graph_id=session.worker_runtime._graph_id,
)

queen_tools = list(queen_registry.get_tools().values())
queen_tool_executor = queen_registry.get_executor()

# ---- Partition tools by phase ------------------------------------
planning_names = set(_QUEEN_PLANNING_TOOLS)
building_names = set(_QUEEN_BUILDING_TOOLS)
staging_names = set(_QUEEN_STAGING_TOOLS)
running_names = set(_QUEEN_RUNNING_TOOLS)

registered_names = {t.name for t in queen_tools}
missing_building = building_names - registered_names
if missing_building:
logger.warning(
"Queen: %d/%d building tools NOT registered: %s",
len(missing_building),
len(building_names),
sorted(missing_building),
)
logger.info("Queen: registered tools: %s", sorted(registered_names))

phase_state.planning_tools = [t for t in queen_tools if t.name in planning_names]
phase_state.building_tools = [t for t in queen_tools if t.name in building_names]
phase_state.staging_tools = [t for t in queen_tools if t.name in staging_names]
phase_state.running_tools = [t for t in queen_tools if t.name in running_names]

# ---- Cross-session memory ----------------------------------------
from framework.agents.queen.queen_memory import seed_if_missing

seed_if_missing()

# ---- Compose phase-specific prompts ------------------------------
_orig_node = _queen_graph.nodes[0]

if worker_identity is None:
worker_identity = (
"\n\n# Worker Profile\n"
"No worker agent loaded. You are operating independently.\n"
"Handle all tasks directly using your coding tools."
)

_planning_body = (
_queen_style
+ _shared_building_knowledge
+ _queen_tools_planning
+ _queen_behavior_always
+ _queen_behavior_planning
+ _planning_knowledge
+ worker_identity
)
phase_state.prompt_planning = _queen_identity_planning + _planning_body

_building_body = (
_queen_style
+ _shared_building_knowledge
+ _queen_tools_building
+ _queen_behavior_always
+ _queen_behavior_building
+ _building_knowledge
+ _queen_phase_7
+ _appendices
+ worker_identity
)
phase_state.prompt_building = _queen_identity_building + _building_body
phase_state.prompt_staging = (
_queen_identity_staging
+ _queen_style
+ _queen_tools_staging
+ _queen_behavior_always
+ _queen_behavior_staging
+ worker_identity
)
phase_state.prompt_running = (
_queen_identity_running
+ _queen_style
+ _queen_tools_running
+ _queen_behavior_always
+ _queen_behavior_running
+ worker_identity
)

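Each phase prompt above is a phase identity followed by a phase-specific body and the worker profile. The composition pattern reduces to concatenation keyed by phase; the fragment strings below are placeholders, not the real `_queen_*` prompt text.

```python
# Placeholder fragments standing in for the real _queen_* strings.
style = "\n# Style\nBe concise."
behavior_always = "\n# Always\nConfirm before destructive actions."
identities = {
    "planning": "# Queen (planning)",
    "building": "# Queen (building)",
}
phase_bodies = {
    "planning": style + behavior_always + "\n# Planning knowledge",
    "building": style + behavior_always + "\n# Building knowledge",
}

def compose_prompt(phase: str, worker_identity: str) -> str:
    # Identity first, then the shared/phase body, then the worker
    # profile — mirroring prompt_planning / prompt_building above.
    return identities[phase] + phase_bodies[phase] + worker_identity

prompt = compose_prompt("planning", "\n\n# Worker Profile\nNo worker loaded.")
print(prompt.startswith("# Queen (planning)"))  # True
```

Holding one prompt per phase (rather than rebuilding on every turn) is what lets `QueenPhaseState.get_current_prompt` hand the executor a ready string when the phase switches.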
# ---- Persona hook ------------------------------------------------
_session_llm = session.llm
_session_event_bus = session.event_bus

async def _persona_hook(ctx: HookContext) -> HookResult | None:
persona = await select_expert_persona(ctx.trigger or "", _session_llm)
if not persona:
return None
if _session_event_bus is not None:
await _session_event_bus.publish(
AgentEvent(
type=EventType.QUEEN_PERSONA_SELECTED,
stream_id="queen",
data={"persona": persona},
)
)
return HookResult(system_prompt=persona + "\n\n" + phase_state.get_current_prompt())

# ---- Graph preparation -------------------------------------------
initial_prompt_text = phase_state.get_current_prompt()

registered_tool_names = set(queen_registry.get_tools().keys())
declared_tools = _orig_node.tools or []
available_tools = [t for t in declared_tools if t in registered_tool_names]

node_updates: dict = {
"system_prompt": initial_prompt_text,
}
if set(available_tools) != set(declared_tools):
missing = sorted(set(declared_tools) - registered_tool_names)
if missing:
logger.warning("Queen: tools not available: %s", missing)
node_updates["tools"] = available_tools

adjusted_node = _orig_node.model_copy(update=node_updates)
_queen_loop_config = {
**(_queen_graph.loop_config or {}),
"hooks": {"session_start": [_persona_hook]},
}
queen_graph = _queen_graph.model_copy(
update={"nodes": [adjusted_node], "loop_config": _queen_loop_config}
)

# ---- Queen event loop --------------------------------------------
queen_runtime = Runtime(hive_home / "queen")

async def _queen_loop():
try:
executor = GraphExecutor(
runtime=queen_runtime,
llm=session.llm,
tools=queen_tools,
tool_executor=queen_tool_executor,
event_bus=session.event_bus,
stream_id="queen",
storage_path=queen_dir,
loop_config=_queen_loop_config,
execution_id=session.id,
dynamic_tools_provider=phase_state.get_current_tools,
dynamic_prompt_provider=phase_state.get_current_prompt,
)
session.queen_executor = executor

# Wire inject_notification so phase switches notify the queen LLM
async def _inject_phase_notification(content: str) -> None:
node = executor.node_registry.get("queen")
if node is not None and hasattr(node, "inject_event"):
await node.inject_event(content)

phase_state.inject_notification = _inject_phase_notification

# Auto-switch to staging when worker execution finishes
async def _on_worker_done(event):
if event.stream_id == "queen":
return
if phase_state.phase == "running":
if event.type == EventType.EXECUTION_COMPLETED:
output = event.data.get("output", {})
output_summary = ""
if output:
for key, value in output.items():
val_str = str(value)
if len(val_str) > 200:
val_str = val_str[:200] + "..."
output_summary += f"\n  {key}: {val_str}"
_out = output_summary or " (no output keys set)"
notification = (
"[WORKER_TERMINAL] Worker finished successfully.\n"
f"Output:{_out}\n"
"Report this to the user. "
"Ask if they want to continue with another run."
)
else:  # EXECUTION_FAILED
error = event.data.get("error", "Unknown error")
notification = (
"[WORKER_TERMINAL] Worker failed.\n"
f"Error: {error}\n"
"Report this to the user and help them troubleshoot."
)

node = executor.node_registry.get("queen")
if node is not None and hasattr(node, "inject_event"):
await node.inject_event(notification)

await phase_state.switch_to_staging(source="auto")

session.event_bus.subscribe(
event_types=[EventType.EXECUTION_COMPLETED, EventType.EXECUTION_FAILED],
handler=_on_worker_done,
)
session_manager._subscribe_worker_handoffs(session, executor)

logger.info(
"Queen starting in %s phase with %d tools: %s",
phase_state.phase,
len(phase_state.get_current_tools()),
[t.name for t in phase_state.get_current_tools()],
)
result = await executor.execute(
graph=queen_graph,
goal=queen_goal,
input_data={"greeting": initial_prompt or "Session started."},
session_state={"resume_session_id": session.id},
)
if result.success:
logger.warning("Queen executor returned (should be forever-alive)")
else:
logger.error(
"Queen executor failed: %s",
result.error or "(no error message)",
)
except Exception:
logger.error("Queen conversation crashed", exc_info=True)
finally:
session.queen_executor = None

return asyncio.create_task(_queen_loop())
@@ -347,7 +347,7 @@ async def handle_pause(request: web.Request) -> web.Response:

for exec_id in list(stream.active_execution_ids):
try:
ok = await stream.cancel_execution(exec_id)
ok = await stream.cancel_execution(exec_id, reason="Execution paused by user")
if ok:
cancelled.append(exec_id)
except Exception:
@@ -357,8 +357,8 @@ async def handle_pause(request: web.Request) -> web.Response:
runtime.pause_timers()

# Switch to staging (agent still loaded, ready to re-run)
if session.mode_state is not None:
await session.mode_state.switch_to_staging(source="frontend")
if session.phase_state is not None:
await session.phase_state.switch_to_staging(source="frontend")

return web.json_response(
{
@@ -400,7 +400,9 @@ async def handle_stop(request: web.Request) -> web.Response:
if hasattr(node, "cancel_current_turn"):
node.cancel_current_turn()

cancelled = await stream.cancel_execution(execution_id)
cancelled = await stream.cancel_execution(
execution_id, reason="Execution stopped by user"
)
if cancelled:
# Cancel queen's in-progress LLM turn
if session.queen_executor:

@@ -61,7 +61,7 @@ def _session_to_live_dict(session) -> dict:
"loaded_at": session.loaded_at,
"uptime_seconds": round(time.time() - session.loaded_at, 1),
"intro_message": getattr(session.runner, "intro_message", "") or "",
"queen_phase": phase_state.phase if phase_state else "building",
"queen_phase": phase_state.phase if phase_state else "planning",
}


@@ -731,7 +731,7 @@ async def handle_delete_history_session(request: web.Request) -> web.Response:

async def handle_discover(request: web.Request) -> web.Response:
"""GET /api/discover — discover agents from filesystem."""
from framework.tui.screens.agent_picker import discover_agents
from framework.agents.discovery import discover_agents

manager = _get_manager(request)
loaded_paths = {str(s.worker_path) for s in manager.list_sessions() if s.worker_path}

@@ -46,6 +46,8 @@ class Session:
    judge_task: asyncio.Task | None = None
    escalation_sub: str | None = None
    worker_handoff_sub: str | None = None
    # Memory consolidation subscription (fires on CONTEXT_COMPACTED)
    memory_consolidation_sub: str | None = None
    # Session directory resumption:
    # When set, _start_queen writes queen conversations to this existing session's
    # directory instead of creating a new one. This lets cold-restores accumulate
@@ -276,11 +278,20 @@ class SessionManager:
        When a new runtime starts, any on-disk session still marked 'active'
        is from a process that no longer exists. 'Paused' sessions are left
        intact so they remain resumable.

        Two-layer protection against corrupting live sessions:
        1. In-memory: skip any session ID currently tracked in self._sessions
           (guaranteed alive in this process).
        2. PID validation: if state.json contains a ``pid`` field, check whether
           that process is still running on the host. If it is, the session is
           owned by another healthy worker process, so leave it alone.
        """
        sessions_path = Path.home() / ".hive" / "agents" / agent_path.name / "sessions"
        if not sessions_path.exists():
            return

        live_session_ids = set(self._sessions.keys())

        for d in sessions_path.iterdir():
            if not d.is_dir() or not d.name.startswith("session_"):
                continue
@@ -291,6 +302,26 @@ class SessionManager:
                state = json.loads(state_path.read_text(encoding="utf-8"))
                if state.get("status") != "active":
                    continue

                # Layer 1: skip sessions that are alive in this process
                session_id = state.get("session_id", d.name)
                if session_id in live_session_ids or d.name in live_session_ids:
                    logger.debug(
                        "Skipping live in-memory session '%s' during stale cleanup",
                        d.name,
                    )
                    continue

                # Layer 2: skip sessions whose owning process is still alive
                recorded_pid = state.get("pid")
                if recorded_pid is not None and self._is_pid_alive(recorded_pid):
                    logger.debug(
                        "Skipping session '%s' — owning process %d is still running",
                        d.name,
                        recorded_pid,
                    )
                    continue

                state["status"] = "cancelled"
                state.setdefault("result", {})["error"] = "Stale session: runtime restarted"
                state.setdefault("timestamps", {})["updated_at"] = datetime.now().isoformat()
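The two skip layers plus the paused-session guard reduce to a single predicate over the on-disk state. A minimal standalone sketch of that decision logic (function and parameter names are ours, not the framework's API):

```python
def should_cancel(state: dict, live_ids: set, pid_alive) -> bool:
    """Return True only for truly stale 'active' sessions.

    - Non-active (e.g. paused) sessions are never touched.
    - Layer 1: sessions tracked in this process's memory are alive.
    - Layer 2: sessions whose recorded PID is still running belong
      to another healthy process.
    """
    if state.get("status") != "active":
        return False
    if state.get("session_id") in live_ids:
        return False
    pid = state.get("pid")
    if pid is not None and pid_alive(pid):
        return False
    return True
```

Keeping the predicate pure (liveness check injected as a callable) makes the cleanup policy unit-testable without touching real processes or the filesystem.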
@@ -301,6 +332,34 @@ class SessionManager:
            except (json.JSONDecodeError, OSError) as e:
                logger.warning("Failed to clean up stale session %s: %s", d.name, e)

    @staticmethod
    def _is_pid_alive(pid: int) -> bool:
        """Check whether a process with the given PID is still running."""
        import os
        import platform

        if platform.system() == "Windows":
            import ctypes

            # PROCESS_QUERY_LIMITED_INFORMATION = 0x1000
            kernel32 = ctypes.windll.kernel32
            handle = kernel32.OpenProcess(0x1000, False, pid)
            if not handle:
                # 5 is ERROR_ACCESS_DENIED, meaning the process exists but is protected
                return kernel32.GetLastError() == 5

            exit_code = ctypes.c_ulong()
            kernel32.GetExitCodeProcess(handle, ctypes.byref(exit_code))
            kernel32.CloseHandle(handle)
            # 259 is STILL_ACTIVE
            return exit_code.value == 259
        else:
            try:
                os.kill(pid, 0)
            except OSError:
                return False
            return True

    async def load_worker(
        self,
        session_id: str,
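On POSIX the liveness probe above relies on `os.kill(pid, 0)`, which checks existence and permissions without delivering a signal. Note that treating every `OSError` as "dead" also swallows `EPERM` (the process exists but is owned by another user). A hedged standalone sketch that distinguishes the two cases (function name is ours, not the framework's):

```python
import os


def pid_alive_posix(pid: int) -> bool:
    """POSIX liveness probe via signal 0.

    ProcessLookupError (ESRCH) means no such process;
    PermissionError (EPERM) means it exists but we may not signal it.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True
    return True
```

Whether EPERM should count as "alive" is a policy choice: for stale-session cleanup, counting it as alive is the conservative option, since cancelling a session owned by a live foreign process is exactly the corruption the two-layer check exists to prevent.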
@@ -325,9 +384,9 @@ class SessionManager:
            model=model,
        )

        # Notify queen about the loaded worker (skip for hive_coder itself).
        # Notify queen about the loaded worker (skip for queen itself).
        # Health judge disabled for simplicity.
        if agent_path.name != "hive_coder" and session.worker_runtime:
        if agent_path.name != "queen" and session.worker_runtime:
            # await self._start_judge(session, session.runner._storage_path)
            await self._notify_queen_worker_loaded(session)

@@ -379,6 +438,11 @@ class SessionManager:
        if session is None:
            return False

        # Capture session data for memory consolidation before teardown
        _llm = getattr(session, "llm", None)
        _storage_id = getattr(session, "queen_resume_from", None) or session_id
        _session_dir = Path.home() / ".hive" / "queen" / "session" / _storage_id

        # Stop judge
        self._stop_judge(session)
        if session.worker_handoff_sub is not None:
@@ -388,7 +452,13 @@ class SessionManager:
                pass
            session.worker_handoff_sub = None

        # Stop queen
        # Stop queen and memory consolidation subscription
        if session.memory_consolidation_sub is not None:
            try:
                session.event_bus.unsubscribe(session.memory_consolidation_sub)
            except Exception:
                pass
            session.memory_consolidation_sub = None
        if session.queen_task is not None:
            session.queen_task.cancel()
            session.queen_task = None
@@ -401,6 +471,17 @@ class SessionManager:
        except Exception as e:
            logger.error("Error cleaning up worker: %s", e)

        # Final memory consolidation — fire-and-forget so teardown isn't blocked.
        if _llm is not None and _session_dir.exists():
            import asyncio

            from framework.agents.queen.queen_memory import consolidate_queen_memory

            asyncio.create_task(
                consolidate_queen_memory(session_id, _session_dir, _llm),
                name=f"queen-memory-consolidation-{session_id}",
            )

        logger.info("Session '%s' stopped", session_id)
        return True

@@ -461,13 +542,7 @@ class SessionManager:
        are written to the ORIGINAL session's directory so the full conversation
        history accumulates in one place across server restarts.
        """
        from framework.agents.hive_coder.agent import (
            queen_goal,
            queen_graph as _queen_graph,
        )
        from framework.graph.executor import GraphExecutor
        from framework.runner.tool_registry import ToolRegistry
        from framework.runtime.core import Runtime
        from framework.server.queen_orchestrator import create_queen

        hive_home = Path.home() / ".hive"

@@ -505,284 +580,33 @@ class SessionManager:
        except OSError:
            pass

        # Register MCP coding tools
        queen_registry = ToolRegistry()
        import framework.agents.hive_coder as _hive_coder_pkg

        hive_coder_dir = Path(_hive_coder_pkg.__file__).parent
        mcp_config = hive_coder_dir / "mcp_servers.json"
        if mcp_config.exists():
            try:
                queen_registry.load_mcp_config(mcp_config)
                logger.info("Queen: loaded MCP tools from %s", mcp_config)
            except Exception:
                logger.warning("Queen: MCP config failed to load", exc_info=True)

        # Phase state for building/running phase switching
        from framework.tools.queen_lifecycle_tools import (
            QueenPhaseState,
            register_queen_lifecycle_tools,
        )

        # Start in staging when the caller provided an agent, building otherwise.
        initial_phase = "staging" if worker_identity else "building"
        phase_state = QueenPhaseState(phase=initial_phase, event_bus=session.event_bus)
        session.phase_state = phase_state

        # Always register lifecycle tools — they check session.worker_runtime
        # at call time, so they work even if no worker is loaded yet.
        register_queen_lifecycle_tools(
            queen_registry,
        session.queen_task = await create_queen(
            session=session,
            session_id=session.id,
            session_manager=self,
            manager_session_id=session.id,
            phase_state=phase_state,
            worker_identity=worker_identity,
            queen_dir=queen_dir,
            initial_prompt=initial_prompt,
        )

        # Monitoring tools need concrete worker paths — only register when present
        if session.worker_runtime:
            from framework.tools.worker_monitoring_tools import register_worker_monitoring_tools
        # Memory consolidation — triggered by context compaction events.
        # Compaction is a natural signal that "enough has happened to be worth remembering".
        _consolidation_llm = session.llm
        _consolidation_session_dir = queen_dir

            register_worker_monitoring_tools(
                queen_registry,
                session.event_bus,
                session.worker_path,
                stream_id="queen",
                worker_graph_id=session.worker_runtime._graph_id,
        async def _on_compaction(_event) -> None:
            from framework.agents.queen.queen_memory import consolidate_queen_memory

            await consolidate_queen_memory(
                session.id, _consolidation_session_dir, _consolidation_llm
            )

        queen_tools = list(queen_registry.get_tools().values())
        queen_tool_executor = queen_registry.get_executor()
        from framework.runtime.event_bus import EventType as _ET

        # Partition tools into phase-specific sets and import prompt segments
        from framework.agents.hive_coder.nodes import (
            _QUEEN_BUILDING_TOOLS,
            _QUEEN_RUNNING_TOOLS,
            _QUEEN_STAGING_TOOLS,
            _appendices,
            _gcu_building_section,
            _package_builder_knowledge,
            _queen_behavior_always,
            _queen_behavior_building,
            _queen_behavior_running,
            _queen_behavior_staging,
            _queen_identity_building,
            _queen_identity_running,
            _queen_identity_staging,
            _queen_phase_7,
            _queen_style,
            _queen_tools_building,
            _queen_tools_running,
            _queen_tools_staging,
        session.memory_consolidation_sub = session.event_bus.subscribe(
            event_types=[_ET.CONTEXT_COMPACTED],
            handler=_on_compaction,
        )

        building_names = set(_QUEEN_BUILDING_TOOLS)
        staging_names = set(_QUEEN_STAGING_TOOLS)
        running_names = set(_QUEEN_RUNNING_TOOLS)

        registered_names = {t.name for t in queen_tools}
        missing_building = building_names - registered_names
        if missing_building:
            logger.warning(
                "Queen: %d/%d building tools NOT registered: %s",
                len(missing_building),
                len(building_names),
                sorted(missing_building),
            )
        logger.info("Queen: registered tools: %s", sorted(registered_names))

        phase_state.building_tools = [t for t in queen_tools if t.name in building_names]
        phase_state.staging_tools = [t for t in queen_tools if t.name in staging_names]
        phase_state.running_tools = [t for t in queen_tools if t.name in running_names]

        # Build queen graph with adjusted prompt + tools
        _orig_node = _queen_graph.nodes[0]

        if worker_identity is None:
            worker_identity = (
                "\n\n# Worker Profile\n"
                "No worker agent loaded. You are operating independently.\n"
                "Handle all tasks directly using your coding tools."
            )

        # Compose phase-specific prompts.
        _building_body = (
            _queen_style
            + _queen_tools_building
            + _queen_behavior_always
            + _queen_behavior_building
            + _package_builder_knowledge
            + _gcu_building_section
            + _queen_phase_7
            + _appendices
            + worker_identity
        )
        phase_state.prompt_building = _queen_identity_building + _building_body
        phase_state.prompt_staging = (
            _queen_identity_staging
            + _queen_style
            + _queen_tools_staging
            + _queen_behavior_always
            + _queen_behavior_staging
            + worker_identity
        )
        phase_state.prompt_running = (
            _queen_identity_running
            + _queen_style
            + _queen_tools_running
            + _queen_behavior_always
            + _queen_behavior_running
            + worker_identity
        )

        # Build the session_start hook: selects the best-fit expert persona
        # from the user's opening message and replaces the identity prefix.
        from framework.agents.hive_coder.nodes.thinking_hook import select_expert_persona
        from framework.graph.event_loop_node import HookContext, HookResult
        from framework.runtime.event_bus import AgentEvent, EventType

        _session_llm = session.llm
        _session_event_bus = session.event_bus

        async def _persona_hook(ctx: HookContext) -> HookResult | None:
            persona = await select_expert_persona(ctx.trigger or "", _session_llm)
            if not persona:
                return None
            if _session_event_bus is not None:
                await _session_event_bus.publish(
                    AgentEvent(
                        type=EventType.QUEEN_PERSONA_SELECTED,
                        stream_id="queen",
                        data={"persona": persona},
                    )
                )
            return HookResult(system_prompt=persona + "\n\n" + _building_body)

        initial_prompt_text = phase_state.get_current_prompt()

        registered_tool_names = set(queen_registry.get_tools().keys())
        declared_tools = _orig_node.tools or []
        available_tools = [t for t in declared_tools if t in registered_tool_names]

        node_updates: dict = {
            "system_prompt": initial_prompt_text,
        }
        if set(available_tools) != set(declared_tools):
            missing = sorted(set(declared_tools) - registered_tool_names)
            if missing:
                logger.warning("Queen: tools not available: %s", missing)
            node_updates["tools"] = available_tools

        adjusted_node = _orig_node.model_copy(update=node_updates)
        _queen_loop_config = {
            **(_queen_graph.loop_config or {}),
            "hooks": {"session_start": [_persona_hook]},
        }
        queen_graph = _queen_graph.model_copy(
            update={"nodes": [adjusted_node], "loop_config": _queen_loop_config}
        )

        queen_runtime = Runtime(hive_home / "queen")

        async def _queen_loop():
            try:
                executor = GraphExecutor(
                    runtime=queen_runtime,
                    llm=session.llm,
                    tools=queen_tools,
                    tool_executor=queen_tool_executor,
                    event_bus=session.event_bus,
                    stream_id="queen",
                    storage_path=queen_dir,
                    loop_config=_queen_loop_config,
                    execution_id=session.id,
                    dynamic_tools_provider=phase_state.get_current_tools,
                    dynamic_prompt_provider=phase_state.get_current_prompt,
                )
                session.queen_executor = executor

                # Wire inject_notification so phase switches notify the queen LLM
                async def _inject_phase_notification(content: str) -> None:
                    node = executor.node_registry.get("queen")
                    if node is not None and hasattr(node, "inject_event"):
                        await node.inject_event(content)

                phase_state.inject_notification = _inject_phase_notification

                # Auto-switch to staging when worker execution finishes naturally
                # and notify the queen about the termination
                from framework.runtime.event_bus import EventType as _ET

                async def _on_worker_done(event):
                    if event.stream_id == "queen":
                        return
                    if phase_state.phase == "running":
                        # Build termination notification for the queen
                        if event.type == _ET.EXECUTION_COMPLETED:
                            output = event.data.get("output", {})
                            output_summary = ""
                            if output:
                                # Summarize key outputs for the queen
                                for key, value in output.items():
                                    val_str = str(value)
                                    if len(val_str) > 200:
                                        val_str = val_str[:200] + "..."
                                    output_summary += f"\n  {key}: {val_str}"
                            _out = output_summary or " (no output keys set)"
                            notification = (
                                "[WORKER_TERMINAL] Worker finished successfully.\n"
                                f"Output:{_out}\n"
                                "Report this to the user. "
                                "Ask if they want to continue with another run."
                            )
                        else:  # EXECUTION_FAILED
                            error = event.data.get("error", "Unknown error")
                            notification = (
                                "[WORKER_TERMINAL] Worker failed.\n"
                                f"Error: {error}\n"
                                "Report this to the user and help them troubleshoot."
                            )

                        # Inject notification to queen before phase switch
                        node = executor.node_registry.get("queen")
                        if node is not None and hasattr(node, "inject_event"):
                            await node.inject_event(notification)

                        await phase_state.switch_to_staging(source="auto")

                session.event_bus.subscribe(
                    event_types=[_ET.EXECUTION_COMPLETED, _ET.EXECUTION_FAILED],
                    handler=_on_worker_done,
                )
                self._subscribe_worker_handoffs(session, executor)

                logger.info(
                    "Queen starting in %s phase with %d tools: %s",
                    phase_state.phase,
                    len(phase_state.get_current_tools()),
                    [t.name for t in phase_state.get_current_tools()],
                )
                result = await executor.execute(
                    graph=queen_graph,
                    goal=queen_goal,
                    input_data={"greeting": initial_prompt or "Session started."},
                    session_state={"resume_session_id": session.id},
                )
                if result.success:
                    logger.warning("Queen executor returned (should be forever-alive)")
                else:
                    logger.error(
                        "Queen executor failed: %s",
                        result.error or "(no error message)",
                    )
            except Exception:
                logger.error("Queen conversation crashed", exc_info=True)
            finally:
                session.queen_executor = None

        session.queen_task = asyncio.create_task(_queen_loop())

    # ------------------------------------------------------------------
    # Judge startup / teardown
    # ------------------------------------------------------------------
@@ -37,6 +37,7 @@ class MockNodeSpec:
    client_facing: bool = False
    success_criteria: str | None = None
    system_prompt: str | None = None
    sub_agents: list = field(default_factory=list)


@dataclass
@@ -67,6 +68,7 @@ class MockEntryPoint:
    name: str = "Default"
    entry_node: str = "start"
    trigger_type: str = "manual"
    trigger_config: dict = field(default_factory=dict)


@dataclass
@@ -130,6 +132,9 @@ class MockRuntime:
    def get_stats(self):
        return {"running": True, "executions": 1}

    def get_timer_next_fire_in(self, ep_id):
        return None


class MockAgentInfo:
    name: str = "test_agent"
@@ -1556,3 +1561,106 @@ class TestErrorMiddleware:
        async with TestClient(TestServer(app)) as client:
            resp = await client.get("/api/nonexistent")
            assert resp.status == 404


class TestCleanupStaleActiveSessions:
    """Tests for _cleanup_stale_active_sessions with two-layer protection."""

    def _make_manager(self):
        from framework.server.session_manager import SessionManager

        return SessionManager()

    def _write_state(self, session_dir: Path, status: str, pid: int | None = None) -> None:
        session_dir.mkdir(parents=True, exist_ok=True)
        state: dict = {"status": status, "session_id": session_dir.name}
        if pid is not None:
            state["pid"] = pid
        (session_dir / "state.json").write_text(json.dumps(state))

    def _read_state(self, session_dir: Path) -> dict:
        return json.loads((session_dir / "state.json").read_text())

    def test_stale_session_is_cancelled(self, tmp_path, monkeypatch):
        """Truly stale active sessions (no live tracking, no PID) get cancelled."""
        monkeypatch.setattr(Path, "home", lambda: tmp_path)
        agent_path = Path("my_agent")
        sessions_dir = tmp_path / ".hive" / "agents" / "my_agent" / "sessions"
        session_dir = sessions_dir / "session_stale_001"

        self._write_state(session_dir, "active")

        mgr = self._make_manager()
        mgr._cleanup_stale_active_sessions(agent_path)

        state = self._read_state(session_dir)
        assert state["status"] == "cancelled"
        assert "Stale session" in state["result"]["error"]

    def test_live_in_memory_session_is_skipped(self, tmp_path, monkeypatch):
        """Sessions tracked in self._sessions must NOT be cancelled (Layer 1)."""
        monkeypatch.setattr(Path, "home", lambda: tmp_path)
        agent_path = Path("my_agent")
        sessions_dir = tmp_path / ".hive" / "agents" / "my_agent" / "sessions"
        session_dir = sessions_dir / "session_live_002"

        self._write_state(session_dir, "active")

        mgr = self._make_manager()
        # Simulate a live session in the manager's in-memory map
        mgr._sessions["session_live_002"] = MagicMock()

        mgr._cleanup_stale_active_sessions(agent_path)

        state = self._read_state(session_dir)
        assert state["status"] == "active", "Live in-memory session should NOT be cancelled"

    def test_session_with_live_pid_is_skipped(self, tmp_path, monkeypatch):
        """Sessions whose owning PID is still alive must NOT be cancelled (Layer 2)."""
        import os

        monkeypatch.setattr(Path, "home", lambda: tmp_path)
        agent_path = Path("my_agent")
        sessions_dir = tmp_path / ".hive" / "agents" / "my_agent" / "sessions"
        session_dir = sessions_dir / "session_pid_003"

        # Use the current process PID — guaranteed to be alive
        self._write_state(session_dir, "active", pid=os.getpid())

        mgr = self._make_manager()
        mgr._cleanup_stale_active_sessions(agent_path)

        state = self._read_state(session_dir)
        assert state["status"] == "active", "Session with live PID should NOT be cancelled"

    def test_session_with_dead_pid_is_cancelled(self, tmp_path, monkeypatch):
        """Sessions whose owning PID is dead should be cancelled."""
        monkeypatch.setattr(Path, "home", lambda: tmp_path)
        agent_path = Path("my_agent")
        sessions_dir = tmp_path / ".hive" / "agents" / "my_agent" / "sessions"
        session_dir = sessions_dir / "session_dead_004"

        # Use a PID that is almost certainly not running
        self._write_state(session_dir, "active", pid=999999999)

        mgr = self._make_manager()
        mgr._cleanup_stale_active_sessions(agent_path)

        state = self._read_state(session_dir)
        assert state["status"] == "cancelled"
        assert "Stale session" in state["result"]["error"]

    def test_paused_session_is_never_touched(self, tmp_path, monkeypatch):
        """Paused sessions should remain intact regardless of PID or tracking."""
        monkeypatch.setattr(Path, "home", lambda: tmp_path)
        agent_path = Path("my_agent")
        sessions_dir = tmp_path / ".hive" / "agents" / "my_agent" / "sessions"
        session_dir = sessions_dir / "session_paused_005"

        self._write_state(session_dir, "paused")

        mgr = self._make_manager()
        mgr._cleanup_stale_active_sessions(agent_path)

        state = self._read_state(session_dir)
        assert state["status"] == "paused", "Paused sessions must remain untouched"

@@ -1,179 +0,0 @@
"""
State Writer - Dual-write adapter for migration period.

Writes execution state to both old (Run/RunSummary) and new (state.json) formats
to maintain backward compatibility during the transition period.
"""

import logging
import os
from datetime import datetime

from framework.schemas.run import Problem, Run, RunMetrics, RunStatus
from framework.schemas.session_state import SessionState, SessionStatus
from framework.storage.concurrent import ConcurrentStorage
from framework.storage.session_store import SessionStore

logger = logging.getLogger(__name__)


class StateWriter:
    """
    Writes execution state to both old and new formats during migration.

    During the dual-write phase:
    - New format (state.json) is written when USE_UNIFIED_SESSIONS=true
    - Old format (Run/RunSummary) is always written for backward compatibility
    """

    def __init__(self, old_storage: ConcurrentStorage, session_store: SessionStore):
        """
        Initialize state writer.

        Args:
            old_storage: ConcurrentStorage for old format (runs/, summaries/)
            session_store: SessionStore for new format (sessions/*/state.json)
        """
        self.old = old_storage
        self.new = session_store
        self.dual_write_enabled = os.getenv("USE_UNIFIED_SESSIONS", "false").lower() == "true"

    async def write_execution_state(
        self,
        session_id: str,
        state: SessionState,
    ) -> None:
        """
        Write execution state to both old and new formats.

        Args:
            session_id: Session ID
            state: SessionState to write
        """
        # Write to new format if enabled
        if self.dual_write_enabled:
            try:
                await self.new.write_state(session_id, state)
                logger.debug(f"Wrote state.json for session {session_id}")
            except Exception as e:
                logger.error(f"Failed to write state.json for {session_id}: {e}")
                # Don't fail - old format is still written

        # Always write to old format for backward compatibility
        try:
            run = self._convert_to_run(state)
            await self.old.save_run(run)
            logger.debug(f"Wrote Run object for session {session_id}")
        except Exception as e:
            logger.error(f"Failed to write Run object for {session_id}: {e}")
            # This is more critical - reraise if old format fails
            raise

    def _convert_to_run(self, state: SessionState) -> Run:
        """
        Convert SessionState to legacy Run object.

        Args:
            state: SessionState to convert

        Returns:
            Run object
        """
        # Map SessionStatus to RunStatus
        status_mapping = {
            SessionStatus.ACTIVE: RunStatus.RUNNING,
            SessionStatus.PAUSED: RunStatus.RUNNING,  # Paused is still "running" in old format
            SessionStatus.COMPLETED: RunStatus.COMPLETED,
            SessionStatus.FAILED: RunStatus.FAILED,
            SessionStatus.CANCELLED: RunStatus.CANCELLED,
        }
        run_status = status_mapping.get(state.status, RunStatus.FAILED)

        # Convert timestamps
        started_at = datetime.fromisoformat(state.timestamps.started_at)
        completed_at = (
            datetime.fromisoformat(state.timestamps.completed_at)
            if state.timestamps.completed_at
            else None
        )

        # Build RunMetrics
        metrics = RunMetrics(
            total_decisions=state.metrics.decision_count,
            successful_decisions=state.metrics.decision_count
            - len(state.progress.nodes_with_failures),  # Approximate
            failed_decisions=len(state.progress.nodes_with_failures),
            total_tokens=state.metrics.total_input_tokens + state.metrics.total_output_tokens,
            total_latency_ms=state.progress.total_latency_ms,
            nodes_executed=state.metrics.nodes_executed,
            edges_traversed=state.metrics.edges_traversed,
        )

        # Convert problems (SessionState stores as dicts, Run expects Problem objects)
        problems = []
        for p_dict in state.problems:
            # Handle both old Problem objects and new dict format
            if isinstance(p_dict, dict):
                problems.append(Problem(**p_dict))
            else:
                problems.append(p_dict)

        # Convert decisions (SessionState stores as dicts, Run expects Decision objects)
        from framework.schemas.decision import Decision

        decisions = []
        for d_dict in state.decisions:
            # Handle both old Decision objects and new dict format
            if isinstance(d_dict, dict):
                try:
                    decisions.append(Decision(**d_dict))
                except Exception:
                    # Skip invalid decisions
                    continue
            else:
                decisions.append(d_dict)

        # Create Run object
        run = Run(
            id=state.session_id,  # Use session_id as run_id
            goal_id=state.goal_id,
            started_at=started_at,
            status=run_status,
            completed_at=completed_at,
            decisions=decisions,
            problems=problems,
            metrics=metrics,
            goal_description="",  # Not stored in SessionState
            input_data=state.input_data,
            output_data=state.result.output,
        )

        return run

    async def read_state(
        self,
        session_id: str,
        prefer_new: bool = True,
    ) -> SessionState | None:
        """
        Read execution state from either format.

        Args:
            session_id: Session ID
            prefer_new: If True, try new format first (default)

        Returns:
            SessionState or None if not found
        """
        if prefer_new:
            # Try new format first
            state = await self.new.read_state(session_id)
            if state:
                return state

        # Fall back to old format
        run = await self.old.load_run(session_id)
        if run:
            return SessionState.from_legacy_run(run, session_id)

        return None
@@ -71,12 +71,13 @@ class WorkerSessionAdapter:
class QueenPhaseState:
    """Mutable state container for queen operating phase.

    Three phases: building → staging → running.
    Four phases: planning → building → staging → running.
    Shared between the dynamic_tools_provider callback and tool handlers
    that trigger phase transitions.
    """

    phase: str = "building"  # "building", "staging", or "running"
    phase: str = "building"  # "planning", "building", "staging", or "running"
    planning_tools: list = field(default_factory=list)  # list[Tool]
    building_tools: list = field(default_factory=list)  # list[Tool]
    staging_tools: list = field(default_factory=list)  # list[Tool]
    running_tools: list = field(default_factory=list)  # list[Tool]
@@ -84,12 +85,15 @@ class QueenPhaseState:
    event_bus: Any = None  # EventBus — for emitting QUEEN_PHASE_CHANGED events

    # Phase-specific prompts (set by session_manager after construction)
    prompt_planning: str = ""
    prompt_building: str = ""
    prompt_staging: str = ""
    prompt_running: str = ""

    def get_current_tools(self) -> list:
        """Return tools for the current phase."""
        if self.phase == "planning":
            return list(self.planning_tools)
        if self.phase == "running":
            return list(self.running_tools)
        if self.phase == "staging":
@@ -97,12 +101,20 @@ class QueenPhaseState:
        return list(self.building_tools)

    def get_current_prompt(self) -> str:
        """Return the system prompt for the current phase."""
        if self.phase == "running":
            return self.prompt_running
        if self.phase == "staging":
            return self.prompt_staging
        return self.prompt_building
        """Return the system prompt for the current phase, with fresh memory appended."""
        if self.phase == "planning":
            base = self.prompt_planning
        elif self.phase == "running":
            base = self.prompt_running
        elif self.phase == "staging":
            base = self.prompt_staging
        else:
            base = self.prompt_building

        from framework.agents.queen.queen_memory import format_for_injection

        memory = format_for_injection()
        return base + ("\n\n" + memory if memory else "")
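The phase-to-prompt dispatch with an optional memory suffix reduces to a small pure function. A hypothetical dict-based sketch of the same pattern (names are ours, not the framework's API; `format_for_injection` is replaced by a plain `memory` argument):

```python
def compose_prompt(phase: str, prompts: dict, memory: str = "") -> str:
    """Select the phase's base prompt, falling back to 'building',
    and append formatted memory only when there is any."""
    base = prompts.get(phase, prompts["building"])
    return base + ("\n\n" + memory if memory else "")
```

Appending memory at prompt-composition time (rather than baking it into the stored prompts) keeps the per-phase prompts static while letting every turn see the freshest consolidated memory.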
async def _emit_phase_event(self) -> None:
|
||||
"""Publish a QUEEN_PHASE_CHANGED event so the frontend updates the tag."""
|
||||
@@ -128,22 +140,15 @@ class QueenPhaseState:
tool_names = [t.name for t in self.running_tools]
logger.info("Queen phase → running (source=%s, tools: %s)", source, tool_names)
await self._emit_phase_event()
if self.inject_notification:
if source == "frontend":
msg = (
"[PHASE CHANGE] The user clicked Run in the UI. Switched to RUNNING phase. "
"Worker is now executing. You have monitoring/lifecycle tools: "
+ ", ".join(tool_names)
+ "."
)
else:
msg = (
"[PHASE CHANGE] Switched to RUNNING phase. "
"Worker is executing. You now have monitoring/lifecycle tools: "
+ ", ".join(tool_names)
+ "."
)
await self.inject_notification(msg)
# Skip notification when source="tool" — the tool result already
# contains the phase change info.
if self.inject_notification and source != "tool":
await self.inject_notification(
"[PHASE CHANGE] The user clicked Run in the UI. Switched to RUNNING phase. "
"Worker is now executing. You have monitoring/lifecycle tools: "
+ ", ".join(tool_names)
+ "."
)

async def switch_to_staging(self, source: str = "tool") -> None:
"""Switch to staging phase and notify the queen.
@@ -157,26 +162,21 @@ class QueenPhaseState:
tool_names = [t.name for t in self.staging_tools]
logger.info("Queen phase → staging (source=%s, tools: %s)", source, tool_names)
await self._emit_phase_event()
if self.inject_notification:
# Skip notification when source="tool" — the tool result already
# contains the phase change info.
if self.inject_notification and source != "tool":
if source == "frontend":
msg = (
"[PHASE CHANGE] The user stopped the worker from the UI. "
"Switched to STAGING phase. Agent is still loaded. "
"Available tools: " + ", ".join(tool_names) + "."
)
elif source == "auto":
else:
msg = (
"[PHASE CHANGE] Worker execution completed. Switched to STAGING phase. "
"Agent is still loaded. Call run_agent_with_input(task) to run again. "
"Available tools: " + ", ".join(tool_names) + "."
)
else:
msg = (
"[PHASE CHANGE] Switched to STAGING phase. "
"Agent loaded and ready. Call run_agent_with_input(task) to start, "
"or stop_worker_and_edit() to go back to building. "
"Available tools: " + ", ".join(tool_names) + "."
)
await self.inject_notification(msg)

async def switch_to_building(self, source: str = "tool") -> None:
@@ -191,13 +191,35 @@ class QueenPhaseState:
tool_names = [t.name for t in self.building_tools]
logger.info("Queen phase → building (source=%s, tools: %s)", source, tool_names)
await self._emit_phase_event()
if self.inject_notification:
if self.inject_notification and source != "tool":
await self.inject_notification(
"[PHASE CHANGE] Switched to BUILDING phase. "
"Lifecycle tools removed. Full coding tools restored. "
"Call load_built_agent(path) when ready to stage."
)

async def switch_to_planning(self, source: str = "tool") -> None:
"""Switch to planning phase and notify the queen.

Args:
source: Who triggered the switch — "tool", "frontend", or "auto".
"""
if self.phase == "planning":
return
self.phase = "planning"
tool_names = [t.name for t in self.planning_tools]
logger.info("Queen phase → planning (source=%s, tools: %s)", source, tool_names)
await self._emit_phase_event()
# Skip notification when source="tool" — the tool result already
# contains the phase change info; injecting a duplicate notification
# causes the queen to respond twice.
if self.inject_notification and source != "tool":
await self.inject_notification(
"[PHASE CHANGE] Switched to PLANNING phase. "
"Coding tools removed. Discuss goals and design with the user. "
"Available tools: " + ", ".join(tool_names) + "."
)


def build_worker_profile(runtime: AgentRuntime, agent_path: Path | str | None = None) -> str:
"""Build a worker capability profile from its graph/goal definition.
@@ -423,7 +445,7 @@ def register_queen_lifecycle_tools(

# --- stop_worker ----------------------------------------------------------

async def stop_worker() -> str:
async def stop_worker(*, reason: str = "Stopped by queen") -> str:
"""Cancel all active worker executions across all graphs.

Stops the worker immediately. Returns the IDs of cancelled executions.
@@ -453,7 +475,7 @@ def register_queen_lifecycle_tools(

for exec_id in list(stream.active_execution_ids):
try:
ok = await stream.cancel_execution(exec_id)
ok = await stream.cancel_execution(exec_id, reason=reason)
if ok:
cancelled.append(exec_id)
except Exception as e:
@@ -498,6 +520,11 @@ def register_queen_lifecycle_tools(
"Use your coding tools to modify the agent, then call "
"load_built_agent(path) to stage it again."
)
# Nudge the queen to start coding instead of blocking for user input.
if phase_state is not None and phase_state.inject_notification:
await phase_state.inject_notification(
"[PHASE CHANGE] Switched to BUILDING phase. Start implementing the changes now."
)
return json.dumps(result)

_stop_edit_tool = Tool(
@@ -514,6 +541,171 @@ def register_queen_lifecycle_tools(
)
tools_registered += 1

# --- stop_worker_and_plan (Running/Staging → Planning) --------------------

async def stop_worker_and_plan() -> str:
"""Stop the worker and switch to planning phase for diagnosis."""
stop_result = await stop_worker()

# Switch to planning phase
if phase_state is not None:
await phase_state.switch_to_planning(source="tool")

result = json.loads(stop_result)
result["phase"] = "planning"
result["message"] = (
"Worker stopped. You are now in planning phase. "
"Diagnose the issue using read-only tools (checkpoints, logs, sessions), "
"discuss a fix plan with the user, then call "
"initialize_and_build_agent() to implement the fix."
)
return json.dumps(result)

_stop_plan_tool = Tool(
name="stop_worker_and_plan",
description=(
"Stop the worker and switch to planning phase for diagnosis. "
"Use this when you need to investigate an issue before fixing it. "
"After diagnosis, call initialize_and_build_agent() to switch to building."
),
parameters={"type": "object", "properties": {}},
)
registry.register(
"stop_worker_and_plan", _stop_plan_tool, lambda inputs: stop_worker_and_plan()
)
tools_registered += 1

# --- replan_agent (Building → Planning) -----------------------------------

async def replan_agent() -> str:
"""Switch from building back to planning phase.
Only use when the user explicitly asks to re-plan."""
if phase_state is not None:
if phase_state.phase != "building":
return json.dumps(
{"error": f"Cannot replan: currently in {phase_state.phase} phase."}
)
await phase_state.switch_to_planning(source="tool")
return json.dumps(
{
"status": "replanning",
"phase": "planning",
"message": (
"Switched to PLANNING phase. Coding tools removed. "
"Discuss the new design with the user."
),
}
)

_replan_tool = Tool(
name="replan_agent",
description=(
"Switch from building back to planning phase. "
"Only use when the user explicitly asks to re-plan or redesign the agent."
),
parameters={"type": "object", "properties": {}},
)
registry.register("replan_agent", _replan_tool, lambda inputs: replan_agent())
tools_registered += 1

# --- initialize_and_build_agent wrapper (Planning → Building) -------------
# With agent_name: scaffold a new agent via MCP tool, then switch to building.
# Without agent_name: just switch to building (for fixing an existing loaded agent).

_existing_init = registry._tools.get("initialize_and_build_agent")
if _existing_init is not None:
_orig_init_executor = _existing_init.executor

async def initialize_and_build_agent_wrapper(inputs: dict) -> str:
"""Wrapper: scaffold or just switch to building phase."""
agent_name = (inputs.get("agent_name") or "").strip()

# No agent_name → try to fall back to the session's current agent,
# or fail with actionable guidance.
if not agent_name:
# Try to resolve agent_name from the current session
fallback_path = getattr(session, "worker_path", None)
if fallback_path is not None:
agent_name = Path(fallback_path).name
else:
# Server path: check SessionManager
if session_manager is not None and manager_session_id:
srv_session = session_manager.get_session(manager_session_id)
if srv_session and getattr(srv_session, "worker_path", None):
fallback_path = srv_session.worker_path
agent_name = Path(fallback_path).name

if not agent_name:
return json.dumps(
{
"error": (
"No agent_name provided and no agent loaded in this session. "
"To fix: call list_agents() to find the agent name, then call "
"initialize_and_build_agent(agent_name='<name>') to scaffold it."
)
}
)

# Fall back succeeded — switch to building without scaffolding
logger.info(
"initialize_and_build_agent: no agent_name provided, "
"falling back to session agent '%s'",
agent_name,
)
if phase_state is not None:
await phase_state.switch_to_building(source="tool")
if phase_state.inject_notification:
await phase_state.inject_notification(
"[PHASE CHANGE] Switched to BUILDING phase. "
"Start implementing the fix now."
)
return json.dumps(
{
"status": "editing",
"phase": "building",
"agent_name": agent_name,
"warning": (
f"No agent_name provided — using session agent '{agent_name}'. "
f"Agent files are at exports/{agent_name}/."
),
"message": (
"Switched to BUILDING phase. Full coding tools restored. "
"Implement the fix, then call load_built_agent(path) to reload."
),
}
)

# Has agent_name → scaffold via MCP tool
result = _orig_init_executor(inputs)
# Handle both sync and async executors
if asyncio.iscoroutine(result) or asyncio.isfuture(result):
result = await result
# If result is a ToolResult, extract the text content
result_str = str(result)
if hasattr(result, "content"):
result_str = str(result.content)
try:
parsed = json.loads(result_str)
if parsed.get("success", True):
if phase_state is not None:
await phase_state.switch_to_building(source="tool")
# Inject a continuation message so the queen starts
# building immediately instead of blocking for user input.
if phase_state.inject_notification:
await phase_state.inject_notification(
"[PHASE CHANGE] Agent scaffolded and switched to BUILDING phase. "
"Start implementing the agent nodes now."
)
except (json.JSONDecodeError, KeyError, TypeError):
pass
return result_str

registry.register(
"initialize_and_build_agent",
_existing_init.tool,
lambda inputs: initialize_and_build_agent_wrapper(inputs),
)

# --- stop_worker (Running → Staging) -------------------------------------

async def stop_worker_to_staging() -> str:
@@ -1260,7 +1452,23 @@ def register_queen_lifecycle_tools(
if reg is None:
return json.dumps({"error": "Worker graph not found"})

# Find an active node that can accept injected input
# Prefer nodes that are actively waiting (e.g. escalation receivers
# blocked on queen guidance) over the main event-loop node.
for stream in reg.streams.values():
waiting = stream.get_waiting_nodes()
if waiting:
target_node_id = waiting[0]["node_id"]
ok = await stream.inject_input(target_node_id, content, is_client_input=True)
if ok:
return json.dumps(
{
"status": "delivered",
"node_id": target_node_id,
"content_preview": content[:100],
}
)

# Fallback: inject into any injectable node
for stream in reg.streams.values():
injectable = stream.get_injectable_nodes()
if injectable:
@@ -1429,6 +1637,51 @@ def register_queen_lifecycle_tools(
if not resolved_path.exists():
return json.dumps({"error": f"Agent path does not exist: {agent_path}"})

# Pre-check: verify the module exports goal/nodes/edges before
# attempting the full load. This gives the queen an actionable
# error message instead of a cryptic ImportError or TypeError.
try:
import importlib
import sys as _sys

pkg_name = resolved_path.name
parent_dir = str(resolved_path.resolve().parent)
# Temporarily put parent on sys.path for import
if parent_dir not in _sys.path:
_sys.path.insert(0, parent_dir)
# Evict stale cached modules
stale = [n for n in _sys.modules if n == pkg_name or n.startswith(f"{pkg_name}.")]
for n in stale:
del _sys.modules[n]

mod = importlib.import_module(pkg_name)
missing_attrs = [
attr for attr in ("goal", "nodes", "edges") if getattr(mod, attr, None) is None
]
if missing_attrs:
return json.dumps(
{
"error": (
f"Agent module '{pkg_name}' is missing module-level "
f"attributes: {', '.join(missing_attrs)}. "
f"Fix: in {pkg_name}/__init__.py, add "
f"'from .agent import {', '.join(missing_attrs)}' "
f"so that 'import {pkg_name}' exposes them at package level."
)
}
)
except Exception as pre_err:
return json.dumps(
{
"error": (
f"Failed to import agent module '{resolved_path.name}': {pre_err}. "
f"Fix: ensure {resolved_path.name}/__init__.py exists and can be "
f"imported without errors (check syntax, missing dependencies, "
f"and relative imports)."
)
}
)

try:
updated_session = await session_manager.load_worker(
manager_session_id,
@@ -1436,7 +1689,36 @@ def register_queen_lifecycle_tools(
)
info = updated_session.worker_info

# Switch to staging phase after successful load
# Validate that all tools declared by nodes are registered
loaded_runtime = _get_runtime()
if loaded_runtime is not None:
available_tool_names = {t.name for t in loaded_runtime._tools}
missing_by_node: dict[str, list[str]] = {}
for node in loaded_runtime.graph.nodes:
if node.tools:
missing = set(node.tools) - available_tool_names
if missing:
missing_by_node[f"{node.name} (id={node.id})"] = sorted(missing)
if missing_by_node:
# Unload the broken worker
try:
await session_manager.unload_worker(manager_session_id)
except Exception:
pass
details = "; ".join(
f"Node '{k}' missing {v}" for k, v in missing_by_node.items()
)
return json.dumps(
{
"error": (
f"Tool validation failed: {details}. "
"Fix node tool declarations or add the missing "
"tools, then try loading again."
)
}
)

# Switch to staging phase after successful load + validation
if phase_state is not None:
await phase_state.switch_to_staging()

@@ -0,0 +1,38 @@
"""Tool for the queen to write to her episodic memory.

The queen can consciously record significant moments during a session — like
writing in a diary. Semantic memory (MEMORY.md) is updated automatically at
session end and is never written by the queen directly.
"""

from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
from framework.runner.tool_registry import ToolRegistry


def write_to_diary(entry: str) -> str:
"""Write a prose entry to today's episodic memory.

Use this when something significant just happened: a pipeline went live, the
user shared an important preference, a goal was achieved or abandoned, or
you want to record something that should be remembered across sessions.

Write in first person, as you would in a private diary. Be specific — what
happened, how the user responded, what it means going forward. One or two
paragraphs is enough.

You do not need to include a timestamp or date heading; those are added
automatically.
"""
from framework.agents.queen.queen_memory import append_episodic_entry

append_episodic_entry(entry)
return "Diary entry recorded."


def register_queen_memory_tools(registry: ToolRegistry) -> None:
"""Register the episodic memory tool into the queen's tool registry."""
registry.register_function(write_to_diary)
@@ -1,6 +1,6 @@
"""Graph lifecycle tools for multi-graph sessions.

These tools allow an agent (e.g. hive_coder) to load, unload, start,
These tools allow an agent (e.g. queen) to load, unload, start,
restart, and query other agent graphs within the same runtime session.

Usage::

File diff suppressed because it is too large
@@ -1,13 +0,0 @@
"""TUI screens package."""

from .account_selection import AccountSelectionScreen
from .add_local_credential import AddLocalCredentialScreen
from .agent_picker import AgentPickerScreen
from .credential_setup import CredentialSetupScreen

__all__ = [
"AccountSelectionScreen",
"AddLocalCredentialScreen",
"AgentPickerScreen",
"CredentialSetupScreen",
]
@@ -1,111 +0,0 @@
"""Account selection ModalScreen for picking a connected account before agent start."""

from __future__ import annotations

from rich.text import Text
from textual.app import ComposeResult
from textual.binding import Binding
from textual.containers import Vertical
from textual.screen import ModalScreen
from textual.widgets import Label, OptionList
from textual.widgets._option_list import Option


class AccountSelectionScreen(ModalScreen[dict | None]):
"""Modal screen showing connected accounts for pre-run selection.

Returns the selected account dict, or None if dismissed.
"""

SCOPED_CSS = False

BINDINGS = [
Binding("escape", "dismiss_picker", "Cancel"),
]

DEFAULT_CSS = """
AccountSelectionScreen {
align: center middle;
}
#acct-container {
width: 70%;
max-width: 80;
height: 60%;
background: $surface;
border: heavy $primary;
padding: 1 2;
}
#acct-title {
text-align: center;
text-style: bold;
width: 100%;
color: $text;
}
#acct-subtitle {
text-align: center;
width: 100%;
margin-bottom: 1;
}
#acct-footer {
text-align: center;
width: 100%;
margin-top: 1;
}
"""

def __init__(self, accounts: list[dict]) -> None:
super().__init__()
self._accounts = accounts

def compose(self) -> ComposeResult:
n = len(self._accounts)
with Vertical(id="acct-container"):
yield Label("Select Account to Test", id="acct-title")
yield Label(
f"[dim]{n} connected account{'s' if n != 1 else ''}[/dim]",
id="acct-subtitle",
)
option_list = OptionList(id="acct-list")
# Group: Aden accounts first, then local
aden = [a for a in self._accounts if a.get("source") != "local"]
local = [a for a in self._accounts if a.get("source") == "local"]
ordered = aden + local
for i, acct in enumerate(ordered):
provider = acct.get("provider", "unknown")
alias = acct.get("alias", "unknown")
identity = acct.get("identity", {})
source = acct.get("source", "aden")
# Build identity label: prefer email, then username/workspace
identity_label = (
identity.get("email")
or identity.get("username")
or identity.get("workspace")
or ""
)
label = Text()
label.append(f"{provider}/", style="bold")
label.append(alias, style="bold cyan")
if source == "local":
label.append(" [local]", style="dim yellow")
if identity_label:
label.append(f" ({identity_label})", style="dim")
option_list.add_option(Option(label, id=f"acct-{i}"))
# Keep ordered list for index lookups
self._accounts = ordered
yield option_list
yield Label(
"[dim]Enter[/dim] Select [dim]Esc[/dim] Cancel",
id="acct-footer",
)

def on_mount(self) -> None:
ol = self.query_one("#acct-list", OptionList)
ol.styles.height = "1fr"

def on_option_list_option_selected(self, event: OptionList.OptionSelected) -> None:
idx = event.option_index
if 0 <= idx < len(self._accounts):
self.dismiss(self._accounts[idx])

def action_dismiss_picker(self) -> None:
self.dismiss(None)
@@ -1,244 +0,0 @@
"""Add Local Credential ModalScreen for storing named local API key accounts."""

from __future__ import annotations

from textual.app import ComposeResult
from textual.binding import Binding
from textual.containers import Vertical, VerticalScroll
from textual.screen import ModalScreen
from textual.widgets import Button, Input, Label, OptionList
from textual.widgets._option_list import Option


class AddLocalCredentialScreen(ModalScreen[dict | None]):
"""Modal screen for adding a named local API key credential.

Phase 1: Pick credential type from list.
Phase 2: Enter alias + API key, run health check, save.

Returns a dict with credential_id, alias, and identity on success, or None on cancel.
"""

BINDINGS = [
Binding("escape", "dismiss_screen", "Cancel"),
]

DEFAULT_CSS = """
AddLocalCredentialScreen {
align: center middle;
}
#alc-container {
width: 80%;
max-width: 90;
height: 80%;
background: $surface;
border: heavy $primary;
padding: 1 2;
}
#alc-title {
text-align: center;
text-style: bold;
width: 100%;
color: $text;
}
#alc-subtitle {
text-align: center;
width: 100%;
margin-bottom: 1;
}
#alc-type-list {
height: 1fr;
}
#alc-form {
height: 1fr;
}
.alc-field {
margin-bottom: 1;
height: auto;
}
.alc-field Label {
margin-bottom: 0;
}
#alc-status {
width: 100%;
height: auto;
margin-top: 1;
padding: 1;
background: $panel;
}
.alc-buttons {
height: auto;
margin-top: 1;
align: center middle;
}
.alc-buttons Button {
margin: 0 1;
}
#alc-footer {
text-align: center;
width: 100%;
margin-top: 1;
}
"""

def __init__(self) -> None:
super().__init__()
# Load credential specs that support direct API keys
self._specs: list[tuple[str, object]] = self._load_specs()
# Selected credential spec (set in phase 2)
self._selected_id: str = ""
self._selected_spec: object = None
self._phase: int = 1 # 1 = type selection, 2 = form

@staticmethod
def _load_specs() -> list[tuple[str, object]]:
"""Return (credential_id, spec) pairs for direct-API-key credentials."""
try:
from aden_tools.credentials import CREDENTIAL_SPECS

return [
(cid, spec)
for cid, spec in CREDENTIAL_SPECS.items()
if getattr(spec, "direct_api_key_supported", False)
]
except Exception:
return []

# ------------------------------------------------------------------
# Compose
# ------------------------------------------------------------------

def compose(self) -> ComposeResult:
with Vertical(id="alc-container"):
yield Label("Add Local Credential", id="alc-title")
yield Label("[dim]Store a named API key account[/dim]", id="alc-subtitle")
# Phase 1: type selection
option_list = OptionList(id="alc-type-list")
for cid, spec in self._specs:
description = getattr(spec, "description", cid)
option_list.add_option(Option(f"{cid} [dim]{description}[/dim]", id=f"type-{cid}"))
yield option_list
# Phase 2: form (hidden initially)
with VerticalScroll(id="alc-form"):
with Vertical(classes="alc-field"):
yield Label("[bold]Alias[/bold] [dim](e.g. work, personal)[/dim]")
yield Input(value="default", id="alc-alias")
with Vertical(classes="alc-field"):
yield Label("[bold]API Key[/bold]")
yield Input(placeholder="Paste API key...", password=True, id="alc-key")
yield Label("", id="alc-status")
with Vertical(classes="alc-buttons"):
yield Button("Test & Save", variant="primary", id="btn-save")
yield Button("Back", variant="default", id="btn-back")
yield Label(
"[dim]Enter[/dim] Select [dim]Esc[/dim] Cancel",
id="alc-footer",
)

def on_mount(self) -> None:
self._show_phase(1)

# ------------------------------------------------------------------
# Phase switching
# ------------------------------------------------------------------

def _show_phase(self, phase: int) -> None:
self._phase = phase
type_list = self.query_one("#alc-type-list", OptionList)
form = self.query_one("#alc-form", VerticalScroll)
if phase == 1:
type_list.display = True
form.display = False
subtitle = self.query_one("#alc-subtitle", Label)
subtitle.update("[dim]Select the credential type to add[/dim]")
else:
type_list.display = False
form.display = True
spec = self._selected_spec
description = (
getattr(spec, "description", self._selected_id) if spec else self._selected_id
)
subtitle = self.query_one("#alc-subtitle", Label)
subtitle.update(f"[dim]{self._selected_id}[/dim] {description}")
self._clear_status()
# Focus the alias input
self.query_one("#alc-alias", Input).focus()

# ------------------------------------------------------------------
# Event handlers
# ------------------------------------------------------------------

def on_option_list_option_selected(self, event: OptionList.OptionSelected) -> None:
if self._phase != 1:
return
option_id = event.option.id or ""
if option_id.startswith("type-"):
cid = option_id[5:] # strip "type-" prefix
self._selected_id = cid
self._selected_spec = next(
(spec for spec_id, spec in self._specs if spec_id == cid), None
)
self._show_phase(2)

def on_button_pressed(self, event: Button.Pressed) -> None:
if event.button.id == "btn-save":
self._do_save()
elif event.button.id == "btn-back":
self._show_phase(1)

# ------------------------------------------------------------------
# Save logic
# ------------------------------------------------------------------

def _do_save(self) -> None:
alias = self.query_one("#alc-alias", Input).value.strip() or "default"
api_key = self.query_one("#alc-key", Input).value.strip()

if not api_key:
self._set_status("[red]API key cannot be empty.[/red]")
return

self._set_status("[dim]Running health check...[/dim]")
# Disable save button while running
btn = self.query_one("#btn-save", Button)
btn.disabled = True

try:
from framework.credentials.local.registry import LocalCredentialRegistry

registry = LocalCredentialRegistry.default()
info, health_result = registry.save_account(
credential_id=self._selected_id,
alias=alias,
api_key=api_key,
run_health_check=True,
)

if health_result is not None and not health_result.valid:
self._set_status(
f"[yellow]Saved with failed health check:[/yellow] {health_result.message}\n"
"[dim]You can re-validate later via validate_credential().[/dim]"
)
else:
identity = info.identity.to_dict()
identity_str = ""
if identity:
parts = [f"{k}: {v}" for k, v in identity.items() if v]
identity_str = " " + ", ".join(parts) if parts else ""
self._set_status(f"[green]Saved:[/green] {info.storage_id}{identity_str}")
# Dismiss with result so callers can react
self.set_timer(1.0, lambda: self.dismiss(info.to_account_dict()))
return
except Exception as e:
self._set_status(f"[red]Error:[/red] {e}")
finally:
btn.disabled = False

def _set_status(self, markup: str) -> None:
self.query_one("#alc-status", Label).update(markup)

def _clear_status(self) -> None:
self.query_one("#alc-status", Label).update("")

def action_dismiss_screen(self) -> None:
self.dismiss(None)
@@ -1,362 +0,0 @@
"""Agent picker ModalScreen for selecting agents within the TUI."""

from __future__ import annotations

import json
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path

from rich.console import Group
from rich.text import Text
from textual.app import ComposeResult
from textual.binding import Binding
from textual.containers import Vertical
from textual.screen import ModalScreen
from textual.widgets import Label, OptionList, TabbedContent, TabPane
from textual.widgets._option_list import Option


class GetStartedAction(Enum):
"""Actions available in the Get Started tab."""

RUN_EXAMPLES = "run_examples"
RUN_EXISTING = "run_existing"
BUILD_EDIT = "build_edit"


@dataclass
class AgentEntry:
"""Lightweight agent metadata for the picker."""

path: Path
name: str
description: str
category: str
session_count: int = 0
node_count: int = 0
tool_count: int = 0
tags: list[str] = field(default_factory=list)
last_active: str | None = None


def _get_last_active(agent_name: str) -> str | None:
|
||||
"""Return the most recent updated_at timestamp across all sessions."""
|
||||
sessions_dir = Path.home() / ".hive" / "agents" / agent_name / "sessions"
|
||||
if not sessions_dir.exists():
|
||||
return None
|
||||
latest: str | None = None
|
||||
for session_dir in sessions_dir.iterdir():
|
||||
if not session_dir.is_dir() or not session_dir.name.startswith("session_"):
|
||||
continue
|
||||
state_file = session_dir / "state.json"
|
||||
if not state_file.exists():
|
||||
continue
|
||||
try:
|
||||
data = json.loads(state_file.read_text(encoding="utf-8"))
|
||||
ts = data.get("timestamps", {}).get("updated_at")
|
||||
if ts and (latest is None or ts > latest):
|
||||
latest = ts
|
||||
except Exception:
|
||||
continue
|
||||
return latest


def _count_sessions(agent_name: str) -> int:
    """Count session directories under ~/.hive/agents/{agent_name}/sessions/."""
    sessions_dir = Path.home() / ".hive" / "agents" / agent_name / "sessions"
    if not sessions_dir.exists():
        return 0
    return sum(1 for d in sessions_dir.iterdir() if d.is_dir() and d.name.startswith("session_"))


def _extract_agent_stats(agent_path: Path) -> tuple[int, int, list[str]]:
    """Extract node count, tool count, and tags from an agent directory.

    Prefers agent.py (AST-parsed) over agent.json for node/tool counts
    since agent.json may be stale. Tags are only available from agent.json.
    """
    import ast

    node_count, tool_count, tags = 0, 0, []

    # Try agent.py first — source of truth for nodes
    agent_py = agent_path / "agent.py"
    if agent_py.exists():
        try:
            tree = ast.parse(agent_py.read_text(encoding="utf-8"))
            for node in ast.walk(tree):
                # Find `nodes = [...]` assignment
                if isinstance(node, ast.Assign):
                    for target in node.targets:
                        if isinstance(target, ast.Name) and target.id == "nodes":
                            if isinstance(node.value, ast.List):
                                node_count = len(node.value.elts)
        except Exception:
            pass

    # Fall back to / supplement from agent.json
    agent_json = agent_path / "agent.json"
    if agent_json.exists():
        try:
            data = json.loads(agent_json.read_text(encoding="utf-8"))
            json_nodes = data.get("nodes", [])
            if node_count == 0:
                node_count = len(json_nodes)
            # Tool count: use whichever source gave us nodes, but agent.json
            # has the structured tool lists so prefer it for tool counting
            tools: set[str] = set()
            for n in json_nodes:
                tools.update(n.get("tools", []))
            tool_count = len(tools)
            tags = data.get("agent", {}).get("tags", [])
        except Exception:
            pass

    return node_count, tool_count, tags
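The AST walk used for node counting can be checked against a toy source string (the assignments below are illustrative, not from a real agent):

```python
import ast

src = "title = 'demo'\nnodes = [dict(id='a'), dict(id='b'), dict(id='c')]\n"

# Mirror the _extract_agent_stats loop: find a top-level `nodes = [...]`
# assignment and count its list elements.
node_count = 0
for node in ast.walk(ast.parse(src)):
    if isinstance(node, ast.Assign):
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == "nodes":
                if isinstance(node.value, ast.List):
                    node_count = len(node.value.elts)
print(node_count)  # 3
```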


def discover_agents() -> dict[str, list[AgentEntry]]:
    """Discover agents from all known sources grouped by category."""
    from framework.runner.cli import (
        _extract_python_agent_metadata,
        _get_framework_agents_dir,
        _is_valid_agent_dir,
    )

    groups: dict[str, list[AgentEntry]] = {}
    sources = [
        ("Your Agents", Path("exports")),
        ("Framework", _get_framework_agents_dir()),
        ("Examples", Path("examples/templates")),
    ]

    for category, base_dir in sources:
        if not base_dir.exists():
            continue
        entries: list[AgentEntry] = []
        for path in sorted(base_dir.iterdir(), key=lambda p: p.name):
            if not _is_valid_agent_dir(path):
                continue

            # config.py is source of truth for name/description
            name, desc = _extract_python_agent_metadata(path)
            config_fallback_name = path.name.replace("_", " ").title()
            used_config = name != config_fallback_name

            node_count, tool_count, tags = _extract_agent_stats(path)
            if not used_config:
                # config.py didn't provide values, fall back to agent.json
                agent_json = path / "agent.json"
                if agent_json.exists():
                    try:
                        data = json.loads(agent_json.read_text(encoding="utf-8"))
                        meta = data.get("agent", {})
                        name = meta.get("name", name)
                        desc = meta.get("description", desc)
                    except Exception:
                        pass

            entries.append(
                AgentEntry(
                    path=path,
                    name=name,
                    description=desc,
                    category=category,
                    session_count=_count_sessions(path.name),
                    node_count=node_count,
                    tool_count=tool_count,
                    tags=tags,
                    last_active=_get_last_active(path.name),
                )
            )
        if entries:
            groups[category] = entries

    return groups


def _render_agent_option(agent: AgentEntry) -> Group:
    """Build a Rich renderable for a single agent option."""
    # Line 1: name + session badge
    line1 = Text()
    line1.append(agent.name, style="bold")
    if agent.session_count:
        line1.append(f" {agent.session_count} sessions", style="dim cyan")

    # Line 2: description (word-wrapped by the widget)
    desc = agent.description if agent.description else "No description"
    line2 = Text(desc, style="dim")

    # Line 3: stats chips
    chips = Text()
    if agent.node_count:
        chips.append(f" {agent.node_count} nodes ", style="on dark_green white")
        chips.append(" ")
    if agent.tool_count:
        chips.append(f" {agent.tool_count} tools ", style="on dark_blue white")
        chips.append(" ")
    for tag in agent.tags[:3]:
        chips.append(f" {tag} ", style="on grey37 white")
        chips.append(" ")

    parts = [line1, line2]
    if chips.plain.strip():
        parts.append(chips)
    return Group(*parts)


def _render_get_started_option(title: str, description: str, icon: str = "→") -> Group:
    """Build a Rich renderable for a Get Started option."""
    line1 = Text()
    line1.append(f"{icon} ", style="bold cyan")
    line1.append(title, style="bold")
    line2 = Text(description, style="dim")
    return Group(line1, line2)


class AgentPickerScreen(ModalScreen[str | None]):
    """Modal screen showing available agents organized by tabbed categories.

    Returns the selected agent path as a string, or None if dismissed.
    For Get Started actions, returns a special prefix like "action:run_examples".
    """

    BINDINGS = [
        Binding("escape", "dismiss_picker", "Cancel"),
    ]

    DEFAULT_CSS = """
    AgentPickerScreen {
        align: center middle;
    }
    #picker-container {
        width: 90%;
        max-width: 120;
        height: 85%;
        background: $surface;
        border: heavy $primary;
        padding: 1 2;
    }
    #picker-title {
        text-align: center;
        text-style: bold;
        width: 100%;
        color: $text;
    }
    #picker-subtitle {
        text-align: center;
        width: 100%;
        margin-bottom: 1;
    }
    #picker-footer {
        text-align: center;
        width: 100%;
        margin-top: 1;
    }
    TabPane {
        padding: 0;
    }
    OptionList {
        height: 1fr;
    }
    OptionList > .option-list--option {
        padding: 1 2;
    }
    """

    def __init__(
        self,
        agent_groups: dict[str, list[AgentEntry]],
        show_get_started: bool = False,
    ) -> None:
        super().__init__()
        self._groups = agent_groups
        self._show_get_started = show_get_started
        # Map (tab_id, option_index) -> AgentEntry
        self._option_map: dict[str, dict[int, AgentEntry]] = {}

    def compose(self) -> ComposeResult:
        total = sum(len(v) for v in self._groups.values())
        with Vertical(id="picker-container"):
            yield Label("Hive Agent Launcher", id="picker-title")
            yield Label(
                f"[dim]{total} agents available[/dim]",
                id="picker-subtitle",
            )
            with TabbedContent():
                # Get Started tab (only on initial launch)
                if self._show_get_started:
                    with TabPane("Get Started", id="get-started"):
                        option_list = OptionList(id="list-get-started")
                        option_list.add_option(
                            Option(
                                _render_get_started_option(
                                    "Test and run example agents",
                                    "Try pre-built example agents to learn how Hive works",
                                    "📚",
                                ),
                                id="action:run_examples",
                            )
                        )
                        option_list.add_option(
                            Option(
                                _render_get_started_option(
                                    "Test and run existing agent",
                                    "Load and run an agent you've already built (from exports/)",
                                    "🚀",
                                ),
                                id="action:run_existing",
                            )
                        )
                        option_list.add_option(
                            Option(
                                _render_get_started_option(
                                    "Build or edit agent",
                                    "Create a new agent or modify an existing one",
                                    "🛠️ ",
                                ),
                                id="action:build_edit",
                            )
                        )
                        yield option_list

                # Agent category tabs
                for category, agents in self._groups.items():
                    tab_id = category.lower().replace(" ", "-")
                    with TabPane(f"{category} ({len(agents)})", id=tab_id):
                        option_list = OptionList(id=f"list-{tab_id}")
                        self._option_map[f"list-{tab_id}"] = {}
                        for i, agent in enumerate(agents):
                            option_list.add_option(
                                Option(
                                    _render_agent_option(agent),
                                    id=str(agent.path),
                                )
                            )
                            self._option_map[f"list-{tab_id}"][i] = agent
                        yield option_list
            yield Label(
                "[dim]Enter[/dim] Select  [dim]Tab[/dim] Switch category  [dim]Esc[/dim] Cancel",
                id="picker-footer",
            )

    def on_option_list_option_selected(self, event: OptionList.OptionSelected) -> None:
        list_id = event.option_list.id or ""

        # Handle Get Started tab options
        if list_id == "list-get-started":
            option = event.option
            if option and option.id:
                self.dismiss(option.id)  # Returns "action:run_examples", etc.
            return

        # Handle agent selection from other tabs
        idx = event.option_index
        agent_map = self._option_map.get(list_id, {})
        agent = agent_map.get(idx)
        if agent:
            self.dismiss(str(agent.path))

    def action_dismiss_picker(self) -> None:
        self.dismiss(None)
@@ -1,304 +0,0 @@
"""Credential setup ModalScreen for configuring missing agent credentials."""

from __future__ import annotations

import os

from textual.app import ComposeResult
from textual.binding import Binding
from textual.containers import Vertical, VerticalScroll
from textual.screen import ModalScreen
from textual.widgets import Button, Input, Label

from framework.credentials.setup import CredentialSetupSession, MissingCredential


class CredentialSetupScreen(ModalScreen[bool | None]):
    """Modal screen for configuring missing agent credentials.

    Shows a form with one password Input per missing credential.
    For Aden-backed credentials (``aden_supported=True``), prompts for
    ``ADEN_API_KEY`` and runs the Aden sync flow instead of storing a
    raw value.

    Returns True on successful save, or None on cancel/skip.
    """

    BINDINGS = [
        Binding("escape", "dismiss_setup", "Cancel"),
    ]

    DEFAULT_CSS = """
    CredentialSetupScreen {
        align: center middle;
    }
    #cred-container {
        width: 80%;
        max-width: 100;
        height: 80%;
        background: $surface;
        border: heavy $primary;
        padding: 1 2;
    }
    #cred-title {
        text-align: center;
        text-style: bold;
        width: 100%;
        color: $text;
    }
    #cred-subtitle {
        text-align: center;
        width: 100%;
        margin-bottom: 1;
    }
    #cred-scroll {
        height: 1fr;
    }
    .cred-entry {
        margin-bottom: 1;
        padding: 1;
        background: $panel;
        height: auto;
    }
    .cred-entry Input {
        margin-top: 1;
    }
    .cred-buttons {
        height: auto;
        margin-top: 1;
        align: center middle;
    }
    .cred-buttons Button {
        margin: 0 1;
    }
    #cred-footer {
        text-align: center;
        width: 100%;
        margin-top: 1;
    }
    """

    def __init__(self, session: CredentialSetupSession) -> None:
        super().__init__()
        self._session = session
        self._missing: list[MissingCredential] = session.missing
        # Track which credentials need Aden sync vs direct API key
        self._aden_creds: set[int] = set()
        self._needs_aden_key = False
        for i, cred in enumerate(self._missing):
            if cred.aden_supported and not cred.direct_api_key_supported:
                self._aden_creds.add(i)
                self._needs_aden_key = True

    def compose(self) -> ComposeResult:
        n = len(self._missing)
        with Vertical(id="cred-container"):
            yield Label("Credential Setup", id="cred-title")
            yield Label(
                f"[dim]{n} credential{'s' if n != 1 else ''} needed to run this agent[/dim]",
                id="cred-subtitle",
            )
            with VerticalScroll(id="cred-scroll"):
                # If any credential needs Aden, show ADEN_API_KEY input first
                if self._needs_aden_key:
                    aden_key = os.environ.get("ADEN_API_KEY", "")
                    with Vertical(classes="cred-entry"):
                        yield Label("[bold]ADEN_API_KEY[/bold]")
                        aden_names = [
                            self._missing[i].credential_name for i in sorted(self._aden_creds)
                        ]
                        yield Label(f"[dim]Required for OAuth sync: {', '.join(aden_names)}[/dim]")
                        yield Label("[cyan]Get key:[/cyan] https://hive.adenhq.com")
                        yield Input(
                            placeholder="Paste ADEN_API_KEY..."
                            if not aden_key
                            else "Already set (leave blank to keep)",
                            password=True,
                            id="key-aden",
                        )

                # Show direct API key inputs for non-Aden credentials
                for i, cred in enumerate(self._missing):
                    if i in self._aden_creds:
                        continue  # Handled via Aden sync above
                    with Vertical(classes="cred-entry"):
                        yield Label(f"[bold]{cred.env_var}[/bold]")
                        affected = cred.tools or cred.node_types
                        if affected:
                            yield Label(f"[dim]Required by: {', '.join(affected)}[/dim]")
                        if cred.description:
                            yield Label(f"[dim]{cred.description}[/dim]")
                        if cred.help_url:
                            yield Label(f"[cyan]Get key:[/cyan] {cred.help_url}")
                        yield Input(
                            placeholder="Paste API key...",
                            password=True,
                            id=f"key-{i}",
                        )
            with Vertical(classes="cred-buttons"):
                yield Button("Save & Continue", variant="primary", id="btn-save")
                yield Button("Skip", variant="default", id="btn-skip")
            yield Label(
                "[dim]Enter[/dim] Submit  [dim]Esc[/dim] Cancel",
                id="cred-footer",
            )

    def on_button_pressed(self, event: Button.Pressed) -> None:
        if event.button.id == "btn-save":
            self._save_credentials()
        elif event.button.id == "btn-skip":
            self.dismiss(None)

    def _save_credentials(self) -> None:
        """Collect inputs, store credentials, and dismiss."""
        self._session._ensure_credential_key()

        configured = 0

        # Handle Aden-backed credentials
        if self._needs_aden_key:
            aden_input = self.query_one("#key-aden", Input)
            aden_key = aden_input.value.strip()
            if aden_key:
                from framework.credentials.key_storage import save_aden_api_key

                save_aden_api_key(aden_key)
                configured += 1  # ADEN_API_KEY itself counts as configured

            # Run Aden sync for all Aden-backed creds (best-effort)
            if aden_key or os.environ.get("ADEN_API_KEY"):
                self._sync_aden_credentials()

        # Handle direct API key credentials
        for i, cred in enumerate(self._missing):
            if i in self._aden_creds:
                continue
            input_widget = self.query_one(f"#key-{i}", Input)
            value = input_widget.value.strip()
            if not value:
                continue
            try:
                self._session._store_credential(cred, value)
                configured += 1
            except Exception as e:
                self.notify(f"Error storing {cred.env_var}: {e}", severity="error")

        if configured > 0:
            self.dismiss(True)
        else:
            self.notify("No credentials configured", severity="warning", timeout=3)

    def _sync_aden_credentials(self) -> int:
        """Sync Aden-backed credentials and return count of successfully synced."""
        # Build the Aden sync components directly so we get real errors
        # instead of CredentialStore.with_aden_sync() silently falling back.
        try:
            from framework.credentials.aden import (
                AdenCachedStorage,
                AdenClientConfig,
                AdenCredentialClient,
                AdenSyncProvider,
            )
            from framework.credentials.storage import EncryptedFileStorage

            client = AdenCredentialClient(AdenClientConfig(base_url="https://api.adenhq.com"))
            provider = AdenSyncProvider(client=client)
            local_storage = EncryptedFileStorage()
            cached_storage = AdenCachedStorage(
                local_storage=local_storage,
                aden_provider=provider,
            )
        except Exception as e:
            self.notify(
                f"Aden setup error: {e}",
                severity="error",
                timeout=8,
            )
            return 0

        # Sync all integrations from Aden to get the provider index populated
        try:
            from framework.credentials import CredentialStore

            store = CredentialStore(
                storage=cached_storage,
                providers=[provider],
                auto_refresh=True,
            )
            num_synced = provider.sync_all(store)
            if num_synced == 0:
                self.notify(
                    "No active integrations found in Aden. "
                    "Connect integrations at hive.adenhq.com.",
                    severity="warning",
                    timeout=8,
                )
        except Exception as e:
            self.notify(
                f"Aden sync error: {e}",
                severity="error",
                timeout=8,
            )
            return 0

        synced = 0
        for i in sorted(self._aden_creds):
            cred = self._missing[i]
            cred_id = cred.credential_id or cred.credential_name
            if store.is_available(cred_id):
                try:
                    value = store.get_key(cred_id, cred.credential_key)
                    if value:
                        os.environ[cred.env_var] = value
                        self._persist_to_local_store(cred_id, cred.credential_key, value)
                        synced += 1
                    else:
                        self.notify(
                            f"{cred.credential_name}: key "
                            f"'{cred.credential_key}' not found "
                            f"in credential '{cred_id}'",
                            severity="warning",
                            timeout=8,
                        )
                except Exception as e:
                    self.notify(
                        f"{cred.credential_name} extraction failed: {e}",
                        severity="error",
                        timeout=8,
                    )
            else:
                self.notify(
                    f"{cred.credential_name} (id='{cred_id}') "
                    f"not found in Aden. Connect this "
                    f"integration at hive.adenhq.com first.",
                    severity="warning",
                    timeout=8,
                )
        return synced

    @staticmethod
    def _persist_to_local_store(cred_id: str, key_name: str, value: str) -> None:
        """Save a synced token to the local encrypted store under the canonical ID."""
        try:
            from pydantic import SecretStr

            from framework.credentials.models import CredentialKey, CredentialObject, CredentialType
            from framework.credentials.storage import EncryptedFileStorage

            cred_obj = CredentialObject(
                id=cred_id,
                credential_type=CredentialType.OAUTH2,
                keys={
                    key_name: CredentialKey(
                        name=key_name,
                        value=SecretStr(value),
                    ),
                },
                auto_refresh=True,
            )
            EncryptedFileStorage().save(cred_obj)
        except Exception:
            pass  # Best-effort; env var is the primary delivery mechanism

    def action_dismiss_setup(self) -> None:
        self.dismiss(None)
File diff suppressed because it is too large
@@ -1,139 +0,0 @@
"""
Native OS file dialog for PDF selection.

Launches the platform's native file picker (macOS: NSOpenPanel via osascript,
Linux: zenity/kdialog, Windows: PowerShell OpenFileDialog) in a background
thread so Textual's event loop stays responsive.

Falls back to None when no GUI is available (SSH, headless).
"""

import asyncio
import os
import subprocess
import sys
from pathlib import Path


def _has_gui() -> bool:
    """Detect whether a GUI display is available."""
    if sys.platform == "darwin":
        # macOS: GUI is available unless running over SSH without display forwarding.
        return "SSH_CONNECTION" not in os.environ or "DISPLAY" in os.environ
    elif sys.platform == "win32":
        return True
    else:
        # Linux/BSD: Need X11 or Wayland.
        return bool(os.environ.get("DISPLAY") or os.environ.get("WAYLAND_DISPLAY"))
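The Linux/BSD branch is just an environment check, which is easy to test when parameterized on a mapping instead of reading `os.environ` directly (`has_display` is an illustrative stand-in):

```python
def has_display(env: dict) -> bool:
    # Same check as the Linux/BSD branch above, parameterized for testing
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))

print(has_display({"DISPLAY": ":0"}))                 # True
print(has_display({"WAYLAND_DISPLAY": "wayland-0"}))  # True
print(has_display({}))                                # False
```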


def _linux_file_dialog() -> subprocess.CompletedProcess | None:
    """Try zenity, then kdialog, on Linux. Returns CompletedProcess or None."""
    # Try zenity (GTK)
    try:
        return subprocess.run(
            [
                "zenity",
                "--file-selection",
                "--title=Select a PDF file",
                "--file-filter=PDF files (*.pdf)|*.pdf",
            ],
            encoding="utf-8",
            capture_output=True,
            text=True,
            timeout=300,
        )
    except FileNotFoundError:
        pass

    # Try kdialog (KDE)
    try:
        return subprocess.run(
            [
                "kdialog",
                "--getopenfilename",
                ".",
                "PDF files (*.pdf)",
            ],
            encoding="utf-8",
            capture_output=True,
            text=True,
            timeout=300,
        )
    except FileNotFoundError:
        pass

    return None


def _pick_pdf_subprocess() -> Path | None:
    """Run the native file dialog. BLOCKS until user picks or cancels.

    Returns a Path on success, None on cancel or error.
    Must be called from a non-main thread (via asyncio.to_thread).
    """
    try:
        if sys.platform == "darwin":
            result = subprocess.run(
                [
                    "osascript",
                    "-e",
                    'POSIX path of (choose file of type {"com.adobe.pdf"} '
                    'with prompt "Select a PDF file")',
                ],
                encoding="utf-8",
                capture_output=True,
                text=True,
                timeout=300,
            )
        elif sys.platform == "win32":
            ps_script = (
                "Add-Type -AssemblyName System.Windows.Forms; "
                "$f = New-Object System.Windows.Forms.OpenFileDialog; "
                "$f.Filter = 'PDF files (*.pdf)|*.pdf'; "
                "$f.Title = 'Select a PDF file'; "
                "if ($f.ShowDialog() -eq 'OK') { $f.FileName }"
            )
            result = subprocess.run(
                ["powershell", "-NoProfile", "-Command", ps_script],
                encoding="utf-8",
                capture_output=True,
                text=True,
                timeout=300,
            )
        else:
            result = _linux_file_dialog()
            if result is None:
                return None

        if result.returncode != 0:
            return None

        path_str = result.stdout.strip()
        if not path_str:
            return None

        path = Path(path_str)
        if path.is_file() and path.suffix.lower() == ".pdf":
            return path

        return None

    except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
        return None


async def pick_pdf_file() -> Path | None:
    """Open a native OS file dialog to pick a PDF file.

    Non-blocking: runs the dialog subprocess in a background thread via
    asyncio.to_thread(), so the calling event loop stays responsive.

    Returns:
        Path to the selected PDF, or None if the user cancelled,
        no GUI is available, or the dialog command was not found.
    """
    if not _has_gui():
        return None

    return await asyncio.to_thread(_pick_pdf_subprocess)
@@ -1,585 +0,0 @@
"""
Graph/Tree Overview Widget - Displays real agent graph structure.

Supports rendering loops (back-edges) via right-side return channels:
arrows drawn on the right margin that visually point back up to earlier nodes.
"""

from __future__ import annotations

import re
import time

from textual.app import ComposeResult
from textual.containers import Vertical

from framework.runtime.agent_runtime import AgentRuntime
from framework.runtime.event_bus import EventType
from framework.tui.widgets.selectable_rich_log import SelectableRichLog as RichLog

# Width of each return-channel column (padding + │ + gap)
_CHANNEL_WIDTH = 5

# Regex to strip Rich markup tags for measuring visible width
_MARKUP_RE = re.compile(r"\[/?[^\]]*\]")


def _plain_len(s: str) -> int:
    """Return the visible character length of a Rich-markup string."""
    return len(_MARKUP_RE.sub("", s))
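The markup-stripping helper can be exercised standalone with the same regex:

```python
import re

_MARKUP_RE = re.compile(r"\[/?[^\]]*\]")

def plain_len(s: str) -> int:
    # Visible width after removing [tag] and [/tag] spans
    return len(_MARKUP_RE.sub("", s))

print(plain_len("[bold green]done[/bold green]"))  # 4
print(plain_len("no markup"))                      # 9
```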


class GraphOverview(Vertical):
    """Widget to display Agent execution graph/tree with real data."""

    DEFAULT_CSS = """
    GraphOverview {
        width: 100%;
        height: 100%;
        background: $panel;
    }

    GraphOverview > RichLog {
        width: 100%;
        height: 100%;
        background: $panel;
        border: none;
        scrollbar-background: $surface;
        scrollbar-color: $primary;
    }
    """

    def __init__(self, runtime: AgentRuntime):
        super().__init__()
        self.runtime = runtime
        self._override_graph = None  # Set by switch_graph() for secondary graphs
        self.active_node: str | None = None
        self.execution_path: list[str] = []
        # Per-node status strings shown next to the node in the graph display.
        # e.g. {"planner": "thinking...", "searcher": "web_search..."}
        self._node_status: dict[str, str] = {}

    @property
    def _graph(self):
        """The graph currently being displayed (may be a secondary graph)."""
        return self._override_graph or self.runtime.graph

    def switch_graph(self, graph) -> None:
        """Switch to displaying a different graph and refresh."""
        self._override_graph = graph
        self.active_node = None
        self.execution_path = []
        self._node_status = {}
        self._display_graph()

    def compose(self) -> ComposeResult:
        # Use RichLog for formatted output
        yield RichLog(id="graph-display", highlight=True, markup=True)

    def on_mount(self) -> None:
        """Display initial graph structure."""
        self._display_graph()
        # Refresh every 1s so timer countdowns stay current
        if self.runtime._timer_next_fire is not None:
            self.set_interval(1.0, self._display_graph)

    # ------------------------------------------------------------------
    # Graph analysis helpers
    # ------------------------------------------------------------------

    def _topo_order(self) -> list[str]:
        """BFS from entry_node following edges."""
        graph = self._graph
        visited: list[str] = []
        seen: set[str] = set()
        queue = [graph.entry_node]
        while queue:
            nid = queue.pop(0)
            if nid in seen:
                continue
            seen.add(nid)
            visited.append(nid)
            for edge in graph.get_outgoing_edges(nid):
                if edge.target not in seen:
                    queue.append(edge.target)
        # Append orphan nodes not reachable from entry
        for node in graph.nodes:
            if node.id not in seen:
                visited.append(node.id)
        return visited

    def _detect_back_edges(self, ordered: list[str]) -> list[dict]:
        """Find edges where target appears before (or equal to) source in topo order.

        Returns a list of dicts with keys: edge, source, target, source_idx, target_idx.
        """
        order_idx = {nid: i for i, nid in enumerate(ordered)}
        back_edges: list[dict] = []
        for node_id in ordered:
            for edge in self._graph.get_outgoing_edges(node_id):
                target_idx = order_idx.get(edge.target, -1)
                source_idx = order_idx.get(node_id, -1)
                if target_idx != -1 and target_idx <= source_idx:
                    back_edges.append(
                        {
                            "edge": edge,
                            "source": node_id,
                            "target": edge.target,
                            "source_idx": source_idx,
                            "target_idx": target_idx,
                        }
                    )
        return back_edges

    def _is_back_edge(self, source: str, target: str, order_idx: dict[str, int]) -> bool:
        """Check whether an edge from *source* to *target* is a back-edge."""
        si = order_idx.get(source, -1)
        ti = order_idx.get(target, -1)
        return ti != -1 and ti <= si
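The back-edge test reduces to an index comparison over the BFS order. On a toy adjacency list (the node names below are illustrative) it behaves like this:

```python
# Toy graph: a review -> plan edge forms a loop
edges = {"plan": ["act"], "act": ["review"], "review": ["plan", "done"]}
order = ["plan", "act", "review", "done"]  # BFS order from the entry node
idx = {n: i for i, n in enumerate(order)}

# An edge is a back-edge when its target sits at or before its source
back = [
    (src, dst)
    for src in order
    for dst in edges.get(src, [])
    if dst in idx and idx[dst] <= idx[src]
]
print(back)  # [('review', 'plan')]
```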
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Line rendering (Pass 1)
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _render_node_line(self, node_id: str) -> str:
|
||||
"""Render a single node with status symbol and optional status text."""
|
||||
graph = self._graph
|
||||
is_terminal = node_id in (graph.terminal_nodes or [])
|
||||
is_active = node_id == self.active_node
|
||||
is_done = node_id in self.execution_path and not is_active
|
||||
status = self._node_status.get(node_id, "")
|
||||
|
||||
if is_active:
|
||||
sym = "[bold green]●[/bold green]"
|
||||
elif is_done:
|
||||
sym = "[dim]✓[/dim]"
|
||||
elif is_terminal:
|
||||
sym = "[yellow]■[/yellow]"
|
||||
else:
|
||||
sym = "○"
|
||||
|
||||
if is_active:
|
||||
name = f"[bold green]{node_id}[/bold green]"
|
||||
elif is_done:
|
||||
name = f"[dim]{node_id}[/dim]"
|
||||
else:
|
||||
name = node_id
|
||||
|
||||
suffix = f" [italic]{status}[/italic]" if status else ""
|
||||
return f" {sym} {name}{suffix}"
|
||||
|
||||
def _render_edges(self, node_id: str, order_idx: dict[str, int]) -> list[str]:
|
||||
"""Render forward-edge connectors from *node_id*.
|
||||
|
||||
Back-edges are excluded here — they are drawn by the return-channel
|
||||
overlay in Pass 2.
|
||||
"""
|
||||
all_edges = self._graph.get_outgoing_edges(node_id)
|
||||
if not all_edges:
|
||||
return []
|
||||
|
||||
# Split into forward and back
|
||||
forward = [e for e in all_edges if not self._is_back_edge(node_id, e.target, order_idx)]
|
||||
|
||||
if not forward:
|
||||
# All edges are back-edges — nothing to render here
|
||||
return []
|
||||
|
||||
if len(forward) == 1:
|
||||
return [" │", " ▼"]
|
||||
|
||||
# Fan-out: show branches
|
||||
lines: list[str] = []
|
||||
for i, edge in enumerate(forward):
|
||||
connector = "└" if i == len(forward) - 1 else "├"
|
||||
cond = ""
|
||||
if edge.condition.value not in ("always", "on_success"):
|
||||
cond = f" [dim]({edge.condition.value})[/dim]"
|
||||
lines.append(f" {connector}──▶ {edge.target}{cond}")
|
||||
return lines
|
||||
|
    # ------------------------------------------------------------------
    # Return-channel overlay (Pass 2)
    # ------------------------------------------------------------------

    def _overlay_return_channels(
        self,
        lines: list[str],
        node_line_map: dict[str, int],
        back_edges: list[dict],
        available_width: int,
    ) -> list[str]:
        """Overlay right-side return channels onto the line buffer.

        Each back-edge gets a vertical channel on the right margin. Channels
        are allocated left-to-right by increasing span length so that shorter
        (inner) loops are closer to the graph body and longer (outer) loops are
        further right.

        If the terminal is too narrow to fit even one channel, we fall back to
        simple inline ``↺`` annotations instead.
        """
        if not back_edges:
            return lines

        num_channels = len(back_edges)

        # Sort by span length ascending → inner loops get nearest channel
        sorted_be = sorted(back_edges, key=lambda b: b["source_idx"] - b["target_idx"])

        # --- Insert dedicated connector lines for back-edge sources ---
        # Each back-edge source gets a blank line inserted after its node
        # section (after any forward-edge lines). We process insertions in
        # reverse order so that earlier indices remain valid.
        all_node_lines_set = set(node_line_map.values())

        insertions: list[tuple[int, int]] = []  # (insert_after_line, be_index)
        for be_idx, be in enumerate(sorted_be):
            source_node_line = node_line_map.get(be["source"])
            if source_node_line is None:
                continue
            # Walk forward to find the last line in this node's section
            last_section_line = source_node_line
            for li in range(source_node_line + 1, len(lines)):
                if li in all_node_lines_set:
                    break
                last_section_line = li
            insertions.append((last_section_line, be_idx))

        source_line_for_be: dict[int, int] = {}
        for insert_after, be_idx in sorted(insertions, reverse=True):
            insert_at = insert_after + 1
            lines.insert(insert_at, "")  # placeholder for connector
            source_line_for_be[be_idx] = insert_at
            # Shift node_line_map entries that come after the insertion point
            for nid in node_line_map:
                if node_line_map[nid] > insert_after:
                    node_line_map[nid] += 1
            # Also shift already-assigned source lines
            for prev_idx in source_line_for_be:
                if prev_idx != be_idx and source_line_for_be[prev_idx] > insert_after:
                    source_line_for_be[prev_idx] += 1

        # Recompute max content width after insertions
        max_content_w = max(_plain_len(ln) for ln in lines) if lines else 0

        # Check if we have room for channels
        channels_total_w = num_channels * _CHANNEL_WIDTH
        if max_content_w + channels_total_w + 2 > available_width:
            return self._inline_back_edge_fallback(lines, node_line_map, back_edges)

        content_pad = max_content_w + 3  # gap between content and first channel

        # Build channel info with final line positions
        channel_info: list[dict] = []
        for ch_idx, be in enumerate(sorted_be):
            target_line = node_line_map.get(be["target"])
            source_line = source_line_for_be.get(ch_idx)
            if target_line is None or source_line is None:
                continue
            col = content_pad + ch_idx * _CHANNEL_WIDTH
            channel_info.append(
                {
                    "target_line": target_line,
                    "source_line": source_line,
                    "col": col,
                }
            )

        if not channel_info:
            return lines

        # Build overlay grid — one row per line, columns for channel area
        total_width = content_pad + num_channels * _CHANNEL_WIDTH + 1
        overlay_width = total_width - max_content_w
        overlays: list[list[str]] = [[" "] * overlay_width for _ in range(len(lines))]

        for ci in channel_info:
            tl = ci["target_line"]
            sl = ci["source_line"]
            col_offset = ci["col"] - max_content_w

            if col_offset < 0 or col_offset >= overlay_width:
                continue

            # Target line: ◄──...──┐
            if 0 <= tl < len(overlays):
                for c in range(col_offset):
                    if overlays[tl][c] == " ":
                        overlays[tl][c] = "─"
                overlays[tl][col_offset] = "┐"

            # Source line: ──...──┘
            if 0 <= sl < len(overlays):
                for c in range(col_offset):
                    if overlays[sl][c] == " ":
                        overlays[sl][c] = "─"
                overlays[sl][col_offset] = "┘"

            # Vertical lines between target+1 and source-1
            for li in range(tl + 1, sl):
                if 0 <= li < len(overlays) and overlays[li][col_offset] == " ":
                    overlays[li][col_offset] = "│"

        # Merge overlays into the line strings
        result: list[str] = []
        for i, line in enumerate(lines):
            pw = _plain_len(line)
            pad = max_content_w - pw
            overlay_chars = overlays[i] if i < len(overlays) else []
            overlay_str = "".join(overlay_chars)
            overlay_trimmed = overlay_str.rstrip()
            if overlay_trimmed:
                is_target_line = any(ci["target_line"] == i for ci in channel_info)
                if is_target_line:
                    overlay_trimmed = "◄" + overlay_trimmed[1:]

                is_source_line = any(ci["source_line"] == i for ci in channel_info)
                if is_source_line and not line.strip():
                    # Inserted blank line → build └───┘ connector.
                    # " └" = 3 chars of content prefix, so remaining pad = max_content_w - 3
                    remaining_pad = max_content_w - 3
                    full = list(" " * remaining_pad + overlay_trimmed)
                    # Find the ┘ corner for this source connector
                    corner_pos = -1
                    for ci_s in channel_info:
                        if ci_s["source_line"] == i:
                            corner_pos = remaining_pad + (ci_s["col"] - max_content_w)
                            break
                    # Fill everything up to the corner with ─
                    if corner_pos >= 0:
                        for c in range(corner_pos):
                            if full[c] not in ("│", "┘", "┐"):
                                full[c] = "─"
                    connector = " └" + "".join(full).rstrip()
                    result.append(f"[dim]{connector}[/dim]")
                    continue

                colored_overlay = f"[dim]{' ' * pad}{overlay_trimmed}[/dim]"
                result.append(f"{line}{colored_overlay}")
            else:
                result.append(line)

        return result

    def _inline_back_edge_fallback(
        self,
        lines: list[str],
        node_line_map: dict[str, int],
        back_edges: list[dict],
    ) -> list[str]:
        """Fallback: add inline ↺ annotations when terminal is too narrow for channels."""
        # Group back-edges by source node
        source_to_be: dict[str, list[dict]] = {}
        for be in back_edges:
            source_to_be.setdefault(be["source"], []).append(be)

        result = list(lines)
        # Insert annotation lines after each source node's section
        offset = 0
        all_node_lines = sorted(node_line_map.values())
        for source, bes in source_to_be.items():
            source_line = node_line_map.get(source)
            if source_line is None:
                continue
            # Find end of source node section
            end_line = source_line
            for nl in all_node_lines:
                if nl > source_line:
                    end_line = nl - 1
                    break
            else:
                end_line = len(lines) - 1
            # Insert after last content line of this node's section
            insert_at = end_line + offset + 1
            for be in bes:
                cond = ""
                edge = be["edge"]
                if edge.condition.value not in ("always", "on_success"):
                    cond = f" [dim]({edge.condition.value})[/dim]"
                annotation = f" [yellow]↺[/yellow] {be['target']}{cond}"
                result.insert(insert_at, annotation)
                insert_at += 1
                offset += 1

        return result

    # ------------------------------------------------------------------
    # Main display
    # ------------------------------------------------------------------

    def _display_graph(self) -> None:
        """Display the graph as an ASCII DAG with edge connectors and loop channels."""
        display = self.query_one("#graph-display", RichLog)
        display.clear()

        graph = self._graph
        display.write(f"[bold cyan]Agent Graph:[/bold cyan] {graph.id}\n")

        ordered = self._topo_order()
        order_idx = {nid: i for i, nid in enumerate(ordered)}

        # --- Pass 1: Build line buffer ---
        lines: list[str] = []
        node_line_map: dict[str, int] = {}

        for node_id in ordered:
            node_line_map[node_id] = len(lines)
            lines.append(self._render_node_line(node_id))
            for edge_line in self._render_edges(node_id, order_idx):
                lines.append(edge_line)

        # --- Pass 2: Overlay return channels for back-edges ---
        back_edges = self._detect_back_edges(ordered)
        if back_edges:
            # Try to get actual widget width; default to a reasonable value
            try:
                available_width = self.size.width or 60
            except Exception:
                available_width = 60
            lines = self._overlay_return_channels(lines, node_line_map, back_edges, available_width)

        # Write all lines
        for line in lines:
            display.write(line)

        # Execution path footer
        if self.execution_path:
            display.write("")
            display.write(f"[dim]Path:[/dim] {' → '.join(self.execution_path[-5:])}")

        # Event sources section
        self._render_event_sources(display)

    # ------------------------------------------------------------------
    # Event sources display
    # ------------------------------------------------------------------

    def _render_event_sources(self, display: RichLog) -> None:
        """Render event source info (webhooks, timers) below the graph."""
        entry_points = self.runtime.get_entry_points()

        # Filter to non-manual entry points (webhooks, timers, events)
        event_sources = [ep for ep in entry_points if ep.trigger_type not in ("manual",)]
        if not event_sources:
            return

        display.write("")
        display.write("[bold cyan]Event Sources[/bold cyan]")

        config = self.runtime._config

        for ep in event_sources:
            if ep.trigger_type == "timer":
                cron_expr = ep.trigger_config.get("cron")
                interval = ep.trigger_config.get("interval_minutes", "?")
                schedule_label = f"cron: {cron_expr}" if cron_expr else f"every {interval} min"
                display.write(f" [green]⏱[/green] {ep.name} [dim]→ {ep.entry_node}[/dim]")
                # Show schedule + next fire countdown
                next_fire = self.runtime._timer_next_fire.get(ep.id)
                if next_fire is not None:
                    remaining = max(0, next_fire - time.monotonic())
                    hours, rem = divmod(int(remaining), 3600)
                    mins, secs = divmod(rem, 60)
                    if hours > 0:
                        countdown = f"{hours}h {mins:02d}m {secs:02d}s"
                    else:
                        countdown = f"{mins}m {secs:02d}s"
                    display.write(f" [dim]{schedule_label} — next in {countdown}[/dim]")
                else:
                    display.write(f" [dim]{schedule_label}[/dim]")

            elif ep.trigger_type in ("event", "webhook"):
                display.write(f" [yellow]⚡[/yellow] {ep.name} [dim]→ {ep.entry_node}[/dim]")
                # Show webhook endpoint if configured
                route = None
                for r in config.webhook_routes:
                    src = r.get("source_id", "")
                    if src and src in ep.id:
                        route = r
                        break
                if not route and config.webhook_routes:
                    # Fall back to first route
                    route = config.webhook_routes[0]

                if route:
                    host = config.webhook_host
                    port = config.webhook_port
                    path = route.get("path", "/webhook")
                    display.write(f" [dim]{host}:{port}{path}[/dim]")
                else:
                    event_types = ep.trigger_config.get("event_types", [])
                    if event_types:
                        display.write(f" [dim]events: {', '.join(event_types)}[/dim]")

    # ------------------------------------------------------------------
    # Public API (called by app.py)
    # ------------------------------------------------------------------

    def update_active_node(self, node_id: str) -> None:
        """Update the currently active node."""
        self.active_node = node_id
        if node_id not in self.execution_path:
            self.execution_path.append(node_id)
        self._display_graph()

    def update_execution(self, event) -> None:
        """Update the displayed node status based on execution lifecycle events."""
        if event.type == EventType.EXECUTION_STARTED:
            self._node_status.clear()
            self.execution_path.clear()
            entry_node = event.data.get("entry_node") or (
                self._graph.entry_node if self.runtime else None
            )
            if entry_node:
                self.update_active_node(entry_node)

        elif event.type == EventType.EXECUTION_COMPLETED:
            self.active_node = None
            self._node_status.clear()
            self._display_graph()

        elif event.type == EventType.EXECUTION_FAILED:
            error = event.data.get("error", "Unknown error")
            if self.active_node:
                self._node_status[self.active_node] = f"[red]FAILED: {error}[/red]"
            self.active_node = None
            self._display_graph()

    # -- Event handlers called by app.py _handle_event --

    def handle_node_loop_started(self, node_id: str) -> None:
        """A node's event loop has started."""
        self._node_status[node_id] = "thinking..."
        self.update_active_node(node_id)

    def handle_node_loop_iteration(self, node_id: str, iteration: int) -> None:
        """A node advanced to a new loop iteration."""
        self._node_status[node_id] = f"step {iteration}"
        self._display_graph()

    def handle_node_loop_completed(self, node_id: str) -> None:
        """A node's event loop completed."""
        self._node_status.pop(node_id, None)
        if self.active_node == node_id:
            self.active_node = None
        self._display_graph()

    def handle_tool_call(self, node_id: str, tool_name: str, *, started: bool) -> None:
        """Show tool activity next to the active node."""
        if started:
            self._node_status[node_id] = f"{tool_name}..."
        else:
            # Restore to generic thinking status after tool completes
            self._node_status[node_id] = "thinking..."
        self._display_graph()

    def handle_stalled(self, node_id: str, reason: str) -> None:
        """Highlight a stalled node."""
        self._node_status[node_id] = f"[red]stalled: {reason}[/red]"
        self._display_graph()

    def handle_edge_traversed(self, source_node: str, target_node: str) -> None:
        """Highlight an edge being traversed."""
        self._node_status[source_node] = f"[dim]→ {target_node}[/dim]"
        self._display_graph()
@@ -1,172 +0,0 @@
"""
Log formatting utilities and LogPane widget.

The module-level functions (format_event, extract_event_text, format_python_log)
can be used by any widget that needs to render log lines without instantiating LogPane.
"""

import logging
from datetime import datetime

from textual.app import ComposeResult
from textual.containers import Container

from framework.runtime.event_bus import AgentEvent, EventType
from framework.tui.widgets.selectable_rich_log import SelectableRichLog as RichLog

# --- Module-level formatting constants ---

EVENT_FORMAT: dict[EventType, tuple[str, str]] = {
    EventType.EXECUTION_STARTED: (">>", "bold cyan"),
    EventType.EXECUTION_COMPLETED: ("<<", "bold green"),
    EventType.EXECUTION_FAILED: ("!!", "bold red"),
    EventType.TOOL_CALL_STARTED: ("->", "yellow"),
    EventType.TOOL_CALL_COMPLETED: ("<-", "green"),
    EventType.NODE_LOOP_STARTED: ("@@", "cyan"),
    EventType.NODE_LOOP_ITERATION: ("..", "dim"),
    EventType.NODE_LOOP_COMPLETED: ("@@", "dim"),
    EventType.LLM_TURN_COMPLETE: ("◆", "green"),
    EventType.NODE_STALLED: ("!!", "bold yellow"),
    EventType.NODE_INPUT_BLOCKED: ("!!", "yellow"),
    EventType.GOAL_PROGRESS: ("%%", "blue"),
    EventType.GOAL_ACHIEVED: ("**", "bold green"),
    EventType.CONSTRAINT_VIOLATION: ("!!", "bold red"),
    EventType.STATE_CHANGED: ("~~", "dim"),
    EventType.CLIENT_INPUT_REQUESTED: ("??", "magenta"),
}

LOG_LEVEL_COLORS: dict[int, str] = {
    logging.DEBUG: "dim",
    logging.INFO: "",
    logging.WARNING: "yellow",
    logging.ERROR: "red",
    logging.CRITICAL: "bold red",
}


# --- Module-level formatting functions ---


def extract_event_text(event: AgentEvent) -> str:
    """Extract human-readable text from an event's data dict."""
    et = event.type
    data = event.data

    if et == EventType.EXECUTION_STARTED:
        return "Execution started"
    elif et == EventType.EXECUTION_COMPLETED:
        return "Execution completed"
    elif et == EventType.EXECUTION_FAILED:
        return f"Execution FAILED: {data.get('error', 'unknown')}"
    elif et == EventType.TOOL_CALL_STARTED:
        return f"Tool call: {data.get('tool_name', 'unknown')}"
    elif et == EventType.TOOL_CALL_COMPLETED:
        name = data.get("tool_name", "unknown")
        if data.get("is_error"):
            preview = str(data.get("result", ""))[:80]
            return f"Tool error: {name} - {preview}"
        return f"Tool done: {name}"
    elif et == EventType.NODE_LOOP_STARTED:
        return f"Node started: {event.node_id or 'unknown'}"
    elif et == EventType.NODE_LOOP_ITERATION:
        return f"{event.node_id or 'unknown'} iteration {data.get('iteration', '?')}"
    elif et == EventType.NODE_LOOP_COMPLETED:
        return f"Node done: {event.node_id or 'unknown'}"
    elif et == EventType.NODE_STALLED:
        reason = data.get("reason", "")
        node = event.node_id or "unknown"
        return f"Node stalled: {node} - {reason}" if reason else f"Node stalled: {node}"
    elif et == EventType.NODE_INPUT_BLOCKED:
        return f"Node input blocked: {event.node_id or 'unknown'}"
    elif et == EventType.GOAL_PROGRESS:
        return f"Goal progress: {data.get('progress', '?')}"
    elif et == EventType.GOAL_ACHIEVED:
        return "Goal achieved"
    elif et == EventType.CONSTRAINT_VIOLATION:
        return f"Constraint violated: {data.get('description', 'unknown')}"
    elif et == EventType.STATE_CHANGED:
        return f"State changed: {data.get('key', 'unknown')}"
    elif et == EventType.CLIENT_INPUT_REQUESTED:
        return "Waiting for user input"
    elif et == EventType.LLM_TURN_COMPLETE:
        stop = data.get("stop_reason", "?")
        model = data.get("model", "?")
        inp = data.get("input_tokens", 0)
        out = data.get("output_tokens", 0)
        return f"{model} → {stop} ({inp}+{out} tokens)"
    else:
        return f"{et.value}: {data}"


def format_event(event: AgentEvent) -> str:
    """Format an AgentEvent as a Rich markup string with timestamp + symbol."""
    ts = event.timestamp.strftime("%H:%M:%S")
    symbol, color = EVENT_FORMAT.get(event.type, ("--", "dim"))
    text = extract_event_text(event)
    return f"[dim]{ts}[/dim] [{color}]{symbol} {text}[/{color}]"


def format_python_log(record: logging.LogRecord) -> str:
    """Format a Python log record as a Rich markup string with timestamp and severity color."""
    ts = datetime.fromtimestamp(record.created).strftime("%H:%M:%S")
    color = LOG_LEVEL_COLORS.get(record.levelno, "")
    msg = record.getMessage()
    if color:
        return f"[dim]{ts}[/dim] [{color}]{record.levelname}[/{color}] {msg}"
    else:
        return f"[dim]{ts}[/dim] {record.levelname} {msg}"


# --- LogPane widget (kept for backward compatibility) ---


class LogPane(Container):
    """Widget to display logs with reliable rendering."""

    DEFAULT_CSS = """
    LogPane {
        width: 100%;
        height: 100%;
    }

    LogPane > RichLog {
        width: 100%;
        height: 100%;
        background: $surface;
        border: none;
        scrollbar-background: $panel;
        scrollbar-color: $primary;
    }
    """

    def compose(self) -> ComposeResult:
        yield RichLog(id="main-log", highlight=True, markup=True, auto_scroll=False)

    def write_event(self, event: AgentEvent) -> None:
        """Format an AgentEvent with timestamp + symbol and write to the log."""
        self.write_log(format_event(event))

    def write_python_log(self, record: logging.LogRecord) -> None:
        """Format a Python log record with timestamp and severity color."""
        self.write_log(format_python_log(record))

    def write_log(self, message: str) -> None:
        """Write a log message to the log pane."""
        try:
            if not self.is_mounted:
                return

            log = self.query_one("#main-log", RichLog)

            if not log.is_mounted:
                return

            was_at_bottom = log.is_vertical_scroll_end

            log.write(message)

            if was_at_bottom:
                log.scroll_end(animate=False)

        except Exception:
            pass
@@ -1,229 +0,0 @@
"""
SelectableRichLog - RichLog with mouse-driven text selection and clipboard copy.

Drop-in replacement for RichLog. Click-and-drag to select text, which is
visually highlighted. Press Ctrl+C to copy selection to clipboard (handled
by app.py). Press Escape or single-click to clear selection.
"""

from __future__ import annotations

import subprocess
import sys

from rich.segment import Segment as RichSegment
from rich.style import Style
from textual.geometry import Offset
from textual.selection import Selection
from textual.strip import Strip
from textual.widgets import RichLog

# Highlight style for selected text
_HIGHLIGHT_STYLE = Style(bgcolor="blue", color="white")


class SelectableRichLog(RichLog):
    """RichLog with mouse-driven text selection."""

    DEFAULT_CSS = """
    SelectableRichLog {
        pointer: text;
    }
    """

    def __init__(self, **kwargs) -> None:
        super().__init__(**kwargs)
        self._sel_anchor: Offset | None = None
        self._sel_end: Offset | None = None
        self._selecting: bool = False

    # -- Internal helpers --

    def _apply_highlight(self, strip: Strip) -> Strip:
        """Apply highlight with correct precedence (highlight wins over base style)."""
        segments = []
        for text, style, control in strip._segments:
            if control:
                segments.append(RichSegment(text, style, control))
            else:
                new_style = (style + _HIGHLIGHT_STYLE) if style else _HIGHLIGHT_STYLE
                segments.append(RichSegment(text, new_style, control))
        return Strip(segments, strip.cell_length)

    # -- Selection helpers --

    @property
    def selection(self) -> Selection | None:
        """Build a Selection from current anchor/end, or None if no selection."""
        if self._sel_anchor is None or self._sel_end is None:
            return None
        if self._sel_anchor == self._sel_end:
            return None
        return Selection.from_offsets(self._sel_anchor, self._sel_end)

    def _mouse_to_content(self, event_x: int, event_y: int) -> Offset:
        """Convert viewport mouse coords to content (line, col) coords."""
        scroll_x, scroll_y = self.scroll_offset
        return Offset(scroll_x + event_x, scroll_y + event_y)

    def clear_selection(self) -> None:
        """Clear any active selection."""
        had_selection = self._sel_anchor is not None
        self._sel_anchor = None
        self._sel_end = None
        self._selecting = False
        if had_selection:
            self.refresh()

    # -- Mouse handlers (left button only) --

    def on_mouse_down(self, event) -> None:
        """Start selection on left mouse button."""
        if event.button != 1:
            return
        self._sel_anchor = self._mouse_to_content(event.x, event.y)
        self._sel_end = self._sel_anchor
        self._selecting = True
        self.capture_mouse()
        self.refresh()

    def on_mouse_move(self, event) -> None:
        """Extend selection while dragging."""
        if not self._selecting:
            return
        self._sel_end = self._mouse_to_content(event.x, event.y)
        self.refresh()

    def on_mouse_up(self, event) -> None:
        """End selection on mouse release."""
        if not self._selecting:
            return
        self._selecting = False
        self.release_mouse()

        # Single-click (no drag) clears selection
        if self._sel_anchor == self._sel_end:
            self.clear_selection()

    # -- Keyboard handlers --

    def on_key(self, event) -> None:
        """Clear selection on Escape."""
        if event.key == "escape":
            self.clear_selection()

    # -- Rendering with highlight --

    def render_line(self, y: int) -> Strip:
        """Override to apply selection highlight on top of the base strip."""
        strip = super().render_line(y)

        sel = self.selection
        if sel is None:
            return strip

        # Determine which content line this viewport row corresponds to
        _, scroll_y = self.scroll_offset
        content_y = scroll_y + y

        span = sel.get_span(content_y)
        if span is None:
            return strip

        start_x, end_x = span
        cell_len = strip.cell_length
        if cell_len == 0:
            return strip

        scroll_x, _ = self.scroll_offset

        # -1 means "to end of content line" — use viewport end
        if end_x == -1:
            end_x = cell_len
        else:
            # Convert content-space x to viewport-space x
            end_x = end_x - scroll_x

        # Convert content-space x to viewport-space x
        start_x = start_x - scroll_x

        # Clamp to viewport strip bounds
        start_x = max(0, start_x)
        end_x = min(end_x, cell_len)

        if start_x >= end_x:
            return strip

        # Divide strip into [before, selected, after] and highlight the middle
        parts = strip.divide([start_x, end_x])
        if len(parts) < 2:
            return strip

        highlighted_parts: list[Strip] = []
        for i, part in enumerate(parts):
            if i == 1:
                highlighted_parts.append(self._apply_highlight(part))
            else:
                highlighted_parts.append(part)

        return Strip.join(highlighted_parts)

    # -- Text extraction & clipboard --

    def get_selected_text(self) -> str | None:
        """Extract the plain text of the current selection, or None."""
        sel = self.selection
        if sel is None:
            return None

        # Build full text from all lines
        all_text = "\n".join(strip.text for strip in self.lines)
        try:
            extracted = sel.extract(all_text)
        except (IndexError, ValueError):
            # Selection coordinates can exceed line count when the virtual
            # canvas is larger than the actual content (e.g. after scroll).
            return None
        return extracted if extracted else None

    def copy_selection(self) -> str | None:
        """Copy selected text to system clipboard. Returns text or None."""
        text = self.get_selected_text()
        if not text:
            return None
        _copy_to_clipboard(text)
        return text


def _copy_to_clipboard(text: str) -> None:
    """Copy text to system clipboard using platform-native tools."""
    try:
        if sys.platform == "darwin":
            # Pass bytes input WITHOUT the ``encoding`` kwarg: subprocess.run
            # raises TypeError when text-mode encoding is combined with bytes
            # input, and TypeError would escape the except clauses below.
            subprocess.run(["pbcopy"], input=text.encode(), check=True, timeout=5)
        elif sys.platform == "win32":
            subprocess.run(
                ["clip.exe"],
                input=text.encode("utf-16le"),
                check=True,
                timeout=5,
            )
        elif sys.platform.startswith("linux"):
            try:
                subprocess.run(
                    ["xclip", "-selection", "clipboard"],
                    input=text.encode(),
                    check=True,
                    timeout=5,
                )
            except (subprocess.SubprocessError, FileNotFoundError):
                subprocess.run(
                    ["xsel", "--clipboard", "--input"],
                    input=text.encode(),
                    check=True,
                    timeout=5,
                )
    except (subprocess.SubprocessError, FileNotFoundError):
        pass
||||
@@ -12,8 +12,8 @@ export interface LiveSession {
|
||||
loaded_at: number;
|
||||
uptime_seconds: number;
|
||||
intro_message?: string;
|
||||
/** Queen operating phase — "building", "staging", or "running" */
|
||||
queen_phase?: "building" | "staging" | "running";
|
||||
/** Queen operating phase — "planning", "building", "staging", or "running" */
|
||||
queen_phase?: "planning" | "building" | "staging" | "running";
|
||||
/** Present in 409 conflict responses when worker is still loading */
|
||||
loading?: boolean;
|
||||
}
|
||||
|
||||
@@ -31,7 +31,7 @@ interface AgentGraphProps {
|
||||
version?: string;
|
||||
runState?: RunState;
|
||||
building?: boolean;
|
||||
queenPhase?: "building" | "staging" | "running";
|
||||
queenPhase?: "planning" | "building" | "staging" | "running";
|
||||
}
|
||||
|
||||
// --- Extracted RunButton so hover state survives parent re-renders ---
|
||||
@@ -278,7 +278,7 @@ export default function AgentGraph({ nodes, title: _title, onNodeClick, onRun, o
|
||||
</span>
|
||||
)}
|
||||
</div>
|
||||
<RunButton runState={runState} disabled={nodes.length === 0 || queenPhase === "building"} onRun={handleRun} onPause={onPause ?? (() => {})} btnRef={runBtnRef} />
|
||||
<RunButton runState={runState} disabled={nodes.length === 0 || queenPhase === "building" || queenPhase === "planning"} onRun={handleRun} onPause={onPause ?? (() => {})} btnRef={runBtnRef} />
|
||||
</div>
|
||||
<div className="flex-1 flex items-center justify-center px-5">
|
||||
{building ? (
|
||||
|
||||
@@ -39,7 +39,7 @@ interface ChatPanelProps {
|
||||
/** Called when user dismisses the pending question without answering */
|
||||
onQuestionDismiss?: () => void;
|
||||
/** Queen operating phase — shown as a tag on queen messages */
|
||||
queenPhase?: "building" | "staging" | "running";
|
||||
queenPhase?: "planning" | "building" | "staging" | "running";
|
||||
}
|
||||
|
||||
const queenColor = "hsl(45,95%,58%)";
|
||||
@@ -144,7 +144,7 @@ function ToolActivityRow({ content }: { content: string }) {
|
||||
);
|
||||
}
|
||||
|
||||
const MessageBubble = memo(function MessageBubble({ msg, queenPhase }: { msg: ChatMessage; queenPhase?: "building" | "staging" | "running" }) {
|
||||
const MessageBubble = memo(function MessageBubble({ msg, queenPhase }: { msg: ChatMessage; queenPhase?: "planning" | "building" | "staging" | "running" }) {
|
||||
const isUser = msg.type === "user";
|
||||
const isQueen = msg.role === "queen";
|
||||
const color = getColor(msg.agent, msg.role);
|
||||
@@ -204,7 +204,9 @@ const MessageBubble = memo(function MessageBubble({ msg, queenPhase }: { msg: Ch
|
||||
? "running phase"
|
||||
: queenPhase === "staging"
|
||||
? "staging phase"
|
||||
: "building phase"
|
||||
: queenPhase === "planning"
|
||||
? "planning phase"
|
||||
: "building phase"
|
||||
: "Worker"}
|
||||
</span>
|
||||
</div>
|
||||
|
||||
@@ -121,7 +121,8 @@ export function sseEventToChatMessage(
|
||||
id: `paused-${event.execution_id}`,
|
||||
agent: "System",
|
||||
agentColor: "",
|
||||
content: "Execution paused by user",
|
||||
content:
|
||||
(event.data?.reason as string) || "Execution paused",
|
||||
timestamp: "",
|
||||
type: "system",
|
||||
thread,
|
||||
|
||||
@@ -255,8 +255,8 @@ interface AgentBackendState {
  /** The message ID of the current worker input request (for inline reply box) */
  workerInputMessageId: string | null;
  queenBuilding: boolean;
  /** Queen operating phase — "building" (coding), "staging" (loaded), or "running" (executing) */
  queenPhase: "building" | "staging" | "running";
  /** Queen operating phase — "planning" (design), "building" (coding), "staging" (loaded), or "running" (executing) */
  queenPhase: "planning" | "building" | "staging" | "running";
  workerRunState: "idle" | "deploying" | "running";
  currentExecutionId: string | null;
  nodeLogs: Record<string, string[]>;
@@ -291,7 +291,7 @@ function defaultAgentState(): AgentBackendState {
    awaitingInput: false,
    workerInputMessageId: null,
    queenBuilding: false,
    queenPhase: "building",
    queenPhase: "planning",
    workerRunState: "idle",
    currentExecutionId: null,
    nodeLogs: {},
@@ -892,7 +892,7 @@ export default function Workspace() {
      // failed, the throw inside the catch exits the outer try block.
      const session = liveSession!;
      const displayName = formatAgentDisplayName(session.worker_name || agentType);
      const initialPhase = session.queen_phase || (session.has_worker ? "staging" : "building");
      const initialPhase = session.queen_phase || (session.has_worker ? "staging" : "planning");
      updateAgentState(agentType, {
        sessionId: session.session_id,
        displayName,
@@ -1788,8 +1788,11 @@ export default function Workspace() {

      case "queen_phase_changed": {
        const rawPhase = event.data?.phase as string;
        const newPhase: "building" | "staging" | "running" =
          rawPhase === "running" ? "running" : rawPhase === "staging" ? "staging" : "building";
        const newPhase: "planning" | "building" | "staging" | "running" =
          rawPhase === "running" ? "running"
          : rawPhase === "staging" ? "staging"
          : rawPhase === "planning" ? "planning"
          : "building";
        updateAgentState(agentType, {
          queenPhase: newPhase,
          queenBuilding: newPhase === "building",

@@ -11,12 +11,10 @@ dependencies = [
    "litellm>=1.81.0",
    "mcp>=1.0.0",
    "fastmcp>=2.0.0",
    "textual>=1.0.0",
    "tools",
]

[project.optional-dependencies]
tui = ["textual>=0.75.0"]
webhook = ["aiohttp>=3.9.0"]
server = ["aiohttp>=3.9.0"]
testing = [

@@ -1,90 +0,0 @@
"""Tests for ChatTextArea key handling (Enter submits, Shift+Enter / Ctrl+J insert newlines)."""

import pytest
from textual.app import App, ComposeResult

from framework.tui.widgets.chat_repl import ChatTextArea


class ChatTextAreaApp(App):
    """Minimal app that mounts a ChatTextArea for testing."""

    submitted_texts: list[str]

    def compose(self) -> ComposeResult:
        yield ChatTextArea(id="input")

    def on_mount(self) -> None:
        self.submitted_texts = []

    def on_chat_text_area_submitted(self, message: ChatTextArea.Submitted) -> None:
        self.submitted_texts.append(message.text)


@pytest.fixture
def app():
    return ChatTextAreaApp()


@pytest.mark.asyncio
async def test_enter_submits_text(app):
    """Pressing Enter should post a Submitted message and clear the widget."""
    async with app.run_test() as pilot:
        await pilot.press("h", "e", "l", "l", "o")
        await pilot.press("enter")

        assert app.submitted_texts == ["hello"]


@pytest.mark.asyncio
async def test_enter_on_empty_does_not_submit(app):
    """Pressing Enter with no text should not post a Submitted message."""
    async with app.run_test() as pilot:
        await pilot.press("enter")

        assert app.submitted_texts == []


@pytest.mark.asyncio
async def test_shift_enter_inserts_newline(app):
    """Shift+Enter should insert a newline, not submit."""
    async with app.run_test() as pilot:
        widget = app.query_one("#input", ChatTextArea)

        await pilot.press("a")
        await pilot.press("shift+enter")
        await pilot.press("b")

        assert app.submitted_texts == []
        assert "\n" in widget.text
        assert widget.text.startswith("a")
        assert widget.text.endswith("b")


@pytest.mark.asyncio
async def test_ctrl_j_inserts_newline(app):
    """Ctrl+J should insert a newline (fallback for terminals without Shift+Enter)."""
    async with app.run_test() as pilot:
        widget = app.query_one("#input", ChatTextArea)

        await pilot.press("a")
        await pilot.press("ctrl+j")
        await pilot.press("b")

        assert app.submitted_texts == []
        assert "\n" in widget.text
        assert widget.text.startswith("a")
        assert widget.text.endswith("b")


@pytest.mark.asyncio
async def test_multiline_submit(app):
    """Typing multiline text via Ctrl+J then pressing Enter should submit all lines."""
    async with app.run_test() as pilot:
        await pilot.press("a")
        await pilot.press("ctrl+j")
        await pilot.press("b")
        await pilot.press("enter")

        assert len(app.submitted_texts) == 1
        assert app.submitted_texts[0] == "a\nb"
@@ -763,7 +763,7 @@ class TestClientFacingBlocking:
class TestEscalate:
    @pytest.mark.asyncio
    async def test_escalate_emits_event(self, runtime, node_spec, memory):
        """escalate() should publish ESCALATION_REQUESTED."""
        """escalate() should publish ESCALATION_REQUESTED and block for queen guidance."""
        node_spec.output_keys = []
        llm = MockStreamingLLM(
            scenarios=[
@@ -772,7 +772,6 @@ class TestEscalate:
                {
                    "reason": "tool failure",
                    "context": "HTTP 401 from upstream",
                    "wait_for_response": False,
                },
                tool_use_id="escalate_1",
            ),
@@ -789,7 +788,20 @@ class TestEscalate:

        ctx = build_ctx(runtime, node_spec, memory, llm, stream_id="worker")
        node = EventLoopNode(event_bus=bus, config=LoopConfig(max_iterations=5))

        async def queen_reply():
            await asyncio.sleep(0.05)
            await node.inject_event("Acknowledged, proceed.")

        task = asyncio.create_task(queen_reply())
        result = await node.execute(ctx)
        await task

        assert result.success is True
        assert len(received) == 1
@@ -808,7 +820,6 @@ class TestEscalate:
                {
                    "reason": "blocked",
                    "context": "dependency missing",
                    "wait_for_response": False,
                },
                tool_use_id="escalate_1",
            ),
@@ -827,7 +838,14 @@ class TestEscalate:

        ctx = build_ctx(runtime, node_spec, memory, llm, stream_id="worker")
        node = EventLoopNode(event_bus=bus, config=LoopConfig(max_iterations=5))

        async def queen_reply():
            await asyncio.sleep(0.05)
            await node.inject_event("Queen acknowledges escalation.")

        task = asyncio.create_task(queen_reply())
        result = await node.execute(ctx)
        await task

        assert result.success is True
        queen_node.inject_event.assert_awaited_once()
@@ -842,7 +860,7 @@ class TestEscalate:

    @pytest.mark.asyncio
    async def test_escalate_waits_for_queen_input_and_skips_judge(self, runtime, node_spec, memory):
        """wait_for_response=true should block for queen input before judge evaluation."""
        """escalate() should block for queen input before judge evaluation."""
        node_spec.output_keys = ["result"]
        llm = MockStreamingLLM(
            scenarios=[
@@ -851,7 +869,6 @@ class TestEscalate:
                {
                    "reason": "need direction",
                    "context": "conflicting constraints",
                    "wait_for_response": True,
                },
                tool_use_id="escalate_1",
            ),
@@ -1756,9 +1773,9 @@ class TestIsToolDoomLoop:

    def test_different_args_no_doom(self):
        node = EventLoopNode(config=LoopConfig(tool_doom_loop_threshold=3))
        fp1 = [("search", '{"q": "a"}')]
        fp2 = [("search", '{"q": "b"}')]
        fp3 = [("search", '{"q": "c"}')]
        fp1 = [("search", '{"q": "deploy kubernetes cluster to production"}')]
        fp2 = [("read_file", '{"path": "/etc/nginx/nginx.conf"}')]
        fp3 = [("execute", '{"command": "SELECT * FROM users WHERE active=true"}')]
        is_doom, _ = node._is_tool_doom_loop([fp1, fp2, fp3])
        assert is_doom is False

@@ -1886,6 +1903,7 @@ class TestToolDoomLoopIntegration:
            config=LoopConfig(
                max_iterations=10,
                tool_doom_loop_threshold=3,
                stall_similarity_threshold=1.0,  # disable fuzzy stall detection
            ),
        )
        result = await node.execute(ctx)
@@ -1941,6 +1959,7 @@ class TestToolDoomLoopIntegration:
            config=LoopConfig(
                max_iterations=10,
                tool_doom_loop_threshold=3,
                stall_similarity_threshold=1.0,  # disable fuzzy stall detection
            ),
        )
        result = await node.execute(ctx)
@@ -2005,6 +2024,7 @@ class TestToolDoomLoopIntegration:
            config=LoopConfig(
                max_iterations=10,
                tool_doom_loop_threshold=3,
                stall_similarity_threshold=1.0,  # disable fuzzy stall detection
            ),
        )
        result = await node.execute(ctx)
@@ -2056,6 +2076,7 @@ class TestToolDoomLoopIntegration:
            config=LoopConfig(
                max_iterations=10,
                tool_doom_loop_enabled=False,
                stall_similarity_threshold=1.0,  # disable fuzzy stall detection
            ),
        )
        result = await node.execute(ctx)
@@ -2144,6 +2165,7 @@ class TestToolDoomLoopIntegration:
            config=LoopConfig(
                max_iterations=10,
                tool_doom_loop_threshold=3,
                stall_similarity_threshold=1.0,  # disable fuzzy stall detection
            ),
        )
        result = await node.execute(ctx)
@@ -2206,6 +2228,7 @@ class TestToolDoomLoopIntegration:
            config=LoopConfig(
                max_iterations=10,
                tool_doom_loop_threshold=3,
                stall_similarity_threshold=1.0,  # disable fuzzy stall detection
            ),
        )
        result = await node.execute(ctx)

@@ -40,16 +40,3 @@ class TestMCPDependencies:
        from mcp.server import FastMCP

        assert FastMCP is not None


class TestMCPPackageExports:
    """Tests for the framework.mcp package exports."""

    def test_package_importable(self):
        """Test that framework.mcp package can be imported."""
        if not MCP_AVAILABLE:
            pytest.skip(MCP_SKIP_REASON)

        import framework.mcp

        assert framework.mcp is not None

@@ -970,13 +970,13 @@ class TestEscalationFlow:
        )

    @pytest.mark.asyncio
    async def test_wait_for_response_emits_client_events(
    async def test_wait_for_response_emits_escalation_event(
        self,
        runtime,
        parent_node_spec,
        subagent_node_spec,
    ):
        """Escalation should emit CLIENT_OUTPUT_DELTA and CLIENT_INPUT_REQUESTED events."""
        """Escalation should emit ESCALATION_REQUESTED to the queen."""
        from framework.graph.event_loop_node import _EscalationReceiver

        bus = EventBus()
@@ -986,7 +986,7 @@ class TestEscalationFlow:
            bus_events.append(event)

        bus.subscribe(
            event_types=[EventType.CLIENT_OUTPUT_DELTA, EventType.CLIENT_INPUT_REQUESTED],
            event_types=[EventType.ESCALATION_REQUESTED],
            handler=handler,
        )

@@ -1034,16 +1034,12 @@ class TestEscalationFlow:
        await node._execute_subagent(ctx, "researcher", "Navigate page with CAPTCHA")
        await injector

        # Should have emitted both events
        output_deltas = [e for e in bus_events if e.type == EventType.CLIENT_OUTPUT_DELTA]
        input_requests = [e for e in bus_events if e.type == EventType.CLIENT_INPUT_REQUESTED]
        # Should have emitted ESCALATION_REQUESTED
        escalation_events = [e for e in bus_events if e.type == EventType.ESCALATION_REQUESTED]

        assert len(output_deltas) >= 1, "Should emit CLIENT_OUTPUT_DELTA with the message"
        assert output_deltas[0].data["content"] == "CAPTCHA detected on page"
        assert output_deltas[0].node_id == "parent"  # Shows as parent talking

        assert len(input_requests) >= 1, "Should emit CLIENT_INPUT_REQUESTED for routing"
        assert ":escalation:" in input_requests[0].node_id  # Escalation ID for routing
        assert len(escalation_events) >= 1, "Should emit ESCALATION_REQUESTED"
        assert escalation_events[0].data["context"] == "CAPTCHA detected on page"
        assert ":escalation:" in escalation_events[0].node_id

    @pytest.mark.asyncio
    async def test_non_blocking_report_still_works(

@@ -3,9 +3,8 @@

Tests the FULL routing chain:
  ExecutionStream → GraphExecutor → EventLoopNode → _execute_subagent
  → _report_callback registers _EscalationReceiver in executor.node_registry
  → emit CLIENT_INPUT_REQUESTED with escalation_id
  → subscriber calls stream.inject_input(escalation_id, "done")
  → ExecutionStream finds _EscalationReceiver in executor.node_registry
  → emit ESCALATION_REQUESTED (queen handles the escalation)
  → queen inject_worker_message() finds _EscalationReceiver via get_waiting_nodes()
  → receiver.inject_event("done") unblocks the subagent
  → subagent continues and completes
"""
@@ -227,26 +226,30 @@ async def test_escalation_e2e_through_execution_stream(tmp_path):
    stream_holder: list[ExecutionStream] = []

    async def escalation_handler(event: AgentEvent):
        """Simulate a TUI/runner: when CLIENT_INPUT_REQUESTED arrives with
        an escalation node_id, inject the user's response via the stream."""
        """Simulate the queen: when ESCALATION_REQUESTED arrives,
        find the waiting receiver and inject the response via the stream."""
        all_events.append(event)
        if event.type == EventType.CLIENT_INPUT_REQUESTED:
            node_id = event.node_id
            if ":escalation:" in node_id:
                escalation_events.append(event)
                # Small delay to simulate user typing
                await asyncio.sleep(0.05)
                # Route through the REAL inject_input chain
                stream = stream_holder[0]
                success = await stream.inject_input(node_id, "done logging in")
                assert success, (
                    f"inject_input({node_id!r}) returned False — "
                    "escalation receiver not found in executor.node_registry"
                )
                inject_called.set()
        if event.type == EventType.ESCALATION_REQUESTED:
            escalation_events.append(event)
            # Small delay to simulate queen processing
            await asyncio.sleep(0.05)
            # Route through the REAL inject_input chain — find the waiting
            # escalation receiver via get_waiting_nodes() (mirrors what
            # inject_worker_message does in the queen lifecycle tools).
            stream = stream_holder[0]
            waiting = stream.get_waiting_nodes()
            assert waiting, "Should have a waiting escalation receiver"
            target_node_id = waiting[0]["node_id"]
            assert ":escalation:" in target_node_id
            success = await stream.inject_input(target_node_id, "done logging in")
            assert success, (
                f"inject_input({target_node_id!r}) returned False — "
                "escalation receiver not found in executor.node_registry"
            )
            inject_called.set()

    bus.subscribe(
        event_types=[EventType.CLIENT_INPUT_REQUESTED, EventType.CLIENT_OUTPUT_DELTA],
        event_types=[EventType.ESCALATION_REQUESTED],
        handler=escalation_handler,
    )

@@ -297,17 +300,7 @@ async def test_escalation_e2e_through_execution_stream(tmp_path):
    # 3. Escalation event has correct structure
    esc_event = escalation_events[0]
    assert ":escalation:" in esc_event.node_id
    assert esc_event.data["prompt"] == "Login required for LinkedIn. Please log in manually."

    # 4. CLIENT_OUTPUT_DELTA was emitted for the escalation message
    output_deltas = [
        e
        for e in all_events
        if e.type == EventType.CLIENT_OUTPUT_DELTA and "Login required" in e.data.get("content", "")
    ]
    assert len(output_deltas) >= 1, (
        "Should have emitted CLIENT_OUTPUT_DELTA with escalation message"
    )
    assert esc_event.data["context"] == "Login required for LinkedIn. Please log in manually."

    # 5. The parent node got the subagent's result
    assert "result" in result.output
@@ -444,7 +437,7 @@ async def test_escalation_cleanup_after_completion(tmp_path):
    stream_holder: list[ExecutionStream] = []

    async def auto_respond(event: AgentEvent):
        if event.type == EventType.CLIENT_INPUT_REQUESTED and ":escalation:" in event.node_id:
        if event.type == EventType.ESCALATION_REQUESTED:
            stream = stream_holder[0]

            # Snapshot the active executor's node_registry BEFORE responding
@@ -462,10 +455,13 @@ async def test_escalation_cleanup_after_completion(tmp_path):
            )

            await asyncio.sleep(0.02)
            await stream.inject_input(event.node_id, "ok")
            # Find the waiting escalation receiver and inject response
            waiting = stream.get_waiting_nodes()
            if waiting:
                await stream.inject_input(waiting[0]["node_id"], "ok")

    bus.subscribe(
        event_types=[EventType.CLIENT_INPUT_REQUESTED],
        event_types=[EventType.ESCALATION_REQUESTED],
        handler=auto_respond,
    )

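The escalation tests above all follow the same shape: a worker blocks after raising an escalation, and a concurrently scheduled "queen" task injects a reply that unblocks it. A minimal asyncio sketch of that handshake (all names here are illustrative stand-ins, not the framework's actual API):

```python
import asyncio


class EscalationReceiver:
    """Toy stand-in for a waiting escalation: blocks until a reply is injected."""

    def __init__(self) -> None:
        self._reply: asyncio.Queue[str] = asyncio.Queue(maxsize=1)

    async def inject_event(self, text: str) -> None:
        # Called by the "queen" side to unblock the waiting worker.
        await self._reply.put(text)

    async def wait_for_guidance(self) -> str:
        # Called by the worker after emitting its escalation request.
        return await self._reply.get()


async def main() -> str:
    receiver = EscalationReceiver()

    async def queen_reply() -> None:
        await asyncio.sleep(0.05)  # simulate queen processing time
        await receiver.inject_event("Acknowledged, proceed.")

    task = asyncio.create_task(queen_reply())
    guidance = await receiver.wait_for_guidance()  # worker blocks here
    await task
    return guidance


print(asyncio.run(main()))
```

The tests exercise the same pattern through the real event bus: the reply task is created before the worker starts executing, so the injection races correctly with the blocking wait.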
@@ -23,7 +23,7 @@ Done. For details, prerequisites, and troubleshooting, read on.

## What you get after setup

- **coder-tools** – Create and manage agents (scaffolding via `initialize_agent_package`, file I/O, tool discovery).
- **coder-tools** – Create and manage agents (scaffolding via `initialize_and_build_agent`, file I/O, tool discovery).
- **tools** – File operations, web search, and other agent tools.
- **Documentation** – Guided docs for building and testing agents.

@@ -130,7 +130,7 @@ MCP (Model Context Protocol) servers are configured in `.mcp.json` at the projec
}
```

The `coder-tools` server provides agent scaffolding via `initialize_agent_package` and related tools. The `tools` MCP server exposes tools including web search, PDF reading, CSV processing, and file system operations.
The `coder-tools` server provides agent scaffolding via `initialize_and_build_agent` and related tools. The `tools` MCP server exposes tools including web search, PDF reading, CSV processing, and file system operations.

## Storage

@@ -244,7 +244,7 @@ The fastest way to build agents is with the configured MCP workflow:
./quickstart.sh

# Build a new agent
Use the coder-tools MCP tools from your IDE agent chat (e.g., initialize_agent_package)
Use the coder-tools MCP tools from your IDE agent chat (e.g., initialize_and_build_agent)
```

### Agent Development Workflow
@@ -252,7 +252,7 @@ Use the coder-tools MCP tools from your IDE agent chat (e.g., initialize_agent_p
1. **Define Your Goal**

   ```
   Use the coder-tools initialize_agent_package tool
   Use the coder-tools initialize_and_build_agent tool
   Enter goal: "Build an agent that processes customer support tickets"
   ```

@@ -555,7 +555,7 @@ uv add <package>

```bash
# Option 1: Use Claude Code skill (recommended)
Use the coder-tools initialize_agent_package tool
Use the coder-tools initialize_and_build_agent tool

# Option 2: Create manually
# Note: exports/ is initially empty (gitignored). Create your agent directory:

@@ -180,7 +180,7 @@ MCP tools are also available in Cursor. To enable:

**Claude Code:**
```
Use the coder-tools initialize_agent_package tool to scaffold a new agent
Use the coder-tools initialize_and_build_agent tool to scaffold a new agent
```

**Codex CLI:**
@@ -453,7 +453,7 @@ This design allows agents in `exports/` to be:
### 2. Build Agent (Claude Code)

```
Use the coder-tools initialize_agent_package tool
Use the coder-tools initialize_and_build_agent tool
Enter goal: "Build an agent that processes customer support tickets"
```

@@ -47,7 +47,7 @@ This is the recommended way to create your first agent.
# Setup already done via quickstart.sh above

# Start Claude Code and build an agent
Use the coder-tools initialize_agent_package tool
Use the coder-tools initialize_and_build_agent tool
```

Follow the interactive prompts to:
@@ -173,7 +173,7 @@ PYTHONPATH=exports uv run python -m my_agent test --type success
1. **Dashboard**: Run `hive open` to launch the web dashboard, or `hive tui` for the terminal UI
2. **Detailed Setup**: See [environment-setup.md](./environment-setup.md)
3. **Developer Guide**: See [developer-guide.md](./developer-guide.md)
4. **Build Agents**: Use the coder-tools `initialize_agent_package` tool in Claude Code
4. **Build Agents**: Use the coder-tools `initialize_and_build_agent` tool in Claude Code
5. **Custom Tools**: Learn to integrate MCP servers
6. **Join Community**: [Discord](https://discord.com/invite/MXE49hrKDk)

+18 -16
@@ -312,11 +312,11 @@ Ship essential framework utilities: Node validation, HITL (Human-in-the-loop pau
- [x] Pause/approve workflow
- [x] State saved to checkpoint
- [x] Resume with HITLResponse merged into context
- [x] **TUI Integration**
  - [x] Chat REPL with streaming support (tui/app.py)
  - [x] Multi-graph session management
  - [x] User presence detection
  - [x] Real-time log viewing
- [x] ~~**TUI Integration**~~ *(deprecated — see AGENTS.md; use `hive open` browser UI instead)*
  - [x] ~~Chat REPL with streaming support (tui/app.py)~~
  - [x] ~~Multi-graph session management~~
  - [x] ~~User presence detection~~
  - [x] ~~Real-time log viewing~~
- [x] **Node Lifecycle Management**
  - [x] Start/stop/pause/resume in execution stream
  - [x] State persistence via checkpoint store
@@ -538,11 +538,11 @@ Release CLI tools specifically for rapid memory management and credential store
- [x] test-run, test-debug, test-list, test-stats (testing/cli.py)
- [x] Pytest integration
- [x] Test categorization
- [x] **TUI (Terminal UI)**
  - [x] Interactive chat with streaming (tui/app.py)
  - [x] Multi-graph management UI
  - [x] Log pane for real-time output
  - [x] Keyboard shortcuts (Ctrl+C, Ctrl+D, etc.)
- [x] ~~**TUI (Terminal UI)**~~ *(deprecated — see AGENTS.md; use `hive open` browser UI instead)*
  - [x] ~~Interactive chat with streaming (tui/app.py)~~
  - [x] ~~Multi-graph management UI~~
  - [x] ~~Log pane for real-time output~~
  - [x] ~~Keyboard shortcuts (Ctrl+C, Ctrl+D, etc.)~~
- [ ] **Memory Management CLI**
  - [ ] Memory inspection commands
  - [ ] Memory cleanup utilities
@@ -776,12 +776,14 @@ Implement an interactive, drag-and-drop canvas (using libraries like React Flow)
### TUI to GUI Upgrade
Port the existing Terminal User Interface (TUI) into a rich web application, allowing users to interact directly with the Queen Bee / Coding Agent via a browser chat interface.

- [x] **TUI Foundation**
  - [x] Terminal chat interface (tui/app.py)
  - [x] Streaming support
  - [x] Multi-graph management
  - [x] Log pane display
  - [x] Keyboard shortcuts
> **Note:** The TUI (`hive tui` / `tui/app.py`) is deprecated and no longer maintained (see AGENTS.md). The items below reflect legacy work completed before deprecation. New development should target the browser-based GUI (`hive open`).

- [x] ~~**TUI Foundation**~~ *(deprecated)*
  - [x] ~~Terminal chat interface (tui/app.py)~~
  - [x] ~~Streaming support~~
  - [x] ~~Multi-graph management~~
  - [x] ~~Log pane display~~
  - [x] ~~Keyboard shortcuts~~
- [ ] **Web Application**
  - [ ] Modern web UI framework setup (React/Vue/Svelte)
  - [ ] Responsive design implementation

@@ -22,7 +22,7 @@ template_name/

### Option 1: Build from template (recommended)

Use the `coder-tools` `initialize_agent_package` tool and select "From a template" to interactively pick a template, customize the goal/nodes/graph, and export a new agent.
Use the `coder-tools` `initialize_and_build_agent` tool and select "From a template" to interactively pick a template, customize the goal/nodes/graph, and export a new agent.

### Option 2: Manual copy

@@ -204,8 +204,8 @@ class DeepResearchAgent:
        """Set up the executor with all components."""
        from pathlib import Path

        storage_path = Path.home() / ".hive" / "agents" / "deep_research_agent"
        storage_path.mkdir(parents=True, exist_ok=True)
        self._storage_path = Path.home() / ".hive" / "agents" / "deep_research_agent"
        self._storage_path.mkdir(parents=True, exist_ok=True)

        self._tool_registry = ToolRegistry()

@@ -2,8 +2,13 @@
    "hive-tools": {
      "transport": "stdio",
      "command": "uv",
      "args": ["run", "python", "mcp_server.py", "--stdio"],
      "args": [
        "run",
        "python",
        "mcp_server.py",
        "--stdio"
      ],
      "cwd": "../../../tools",
      "description": "Hive tools MCP server providing web_search, web_scrape, and write_to_file"
    }
  }
}
@@ -11,26 +11,32 @@ intake_node = NodeSpec(
    node_type="event_loop",
    client_facing=True,
    max_node_visits=0,
    input_keys=["topic"],
    input_keys=["user_request"],
    output_keys=["research_brief"],
    success_criteria=(
        "The research brief is specific and actionable: it states the topic, "
        "the key questions to answer, the desired scope, and depth."
    ),
    system_prompt="""\
You are a research intake specialist. The user wants to research a topic.
Have a brief conversation to clarify what they need.
You are a research intake specialist. Your ONLY job is to have a brief conversation with the user to clarify what they want researched.

**CRITICAL: You do NOT do any research yourself.**
- You do NOT search the web
- You do NOT fetch sources
- The research happens in the NEXT stage after you complete intake
- Do NOT ask for or expect web_search or web_scrape tools

**STEP 1 — Read and respond (text only, NO tool calls):**
1. Read the topic provided
2. If it's vague, ask 1-2 clarifying questions (scope, angle, depth)
1. Read the user_request provided
2. If it's vague, ask 1-2 clarifying questions (scope, angle, depth, budget, preferences)
3. If it's already clear, confirm your understanding and ask the user to confirm

Keep it short. Don't over-ask.
Keep it short. Don't over-ask. Maximum 2 clarifying questions.

**STEP 2 — After the user confirms, call set_output:**
- set_output("research_brief", "A clear paragraph describing exactly what to research, \
what questions to answer, what scope to cover, and how deep to go.")
- set_output("research_brief", "A clear paragraph describing exactly what to research, what questions to answer, what scope to cover, and how deep to go.")

That's it. Once you call set_output, your job is done and the research node will take over.
""",
    tools=[],
)
@@ -59,6 +65,8 @@ If feedback is provided, this is a follow-up round — focus on the gaps identif

Work in phases:
1. **Search**: Use web_search with 3-5 diverse queries covering different angles.
   Prioritize authoritative sources (.edu, .gov, established publications).
   For automotive research, target: caranddriver.com, motortrend.com, edmunds.com,
   consumerreports.org, jdpower.com, and enthusiast forums.
2. **Fetch**: Use web_scrape on the most promising URLs (aim for 5-8 sources).
   Skip URLs that fail. Extract the substantive content.
3. **Analyze**: Review what you've collected. Identify key findings, themes,

@@ -116,16 +116,39 @@ customize_node = NodeSpec(
        "for each selected job, saved as HTML, and Gmail drafts created in user's inbox."
    ),
    system_prompt="""\
You are a career coach creating personalized application materials.
You are a career coach creating personalized application materials and Gmail drafts.

**CRITICAL: You MUST create Gmail drafts for each selected job using gmail_create_draft.**

**PROCESS:**
1. Create application_materials.html using save_data and append_data.
2. Generate resume customization list and professional cold email for each selected job.
3. Serve the file to the user.
4. Create Gmail drafts using gmail_create_draft.
2. For each selected job:
   a. Generate a specific resume customization list
   b. Create a professional cold outreach email
   c. **IMMEDIATELY call gmail_create_draft** with:
      - to: hiring manager or recruiter email (if available) or company email
      - subject: "Application for [Job Title] - [Your Name]"
      - html: the professional cold email in HTML format
3. Serve the application_materials.html file to the user.
4. Confirm each Gmail draft was created successfully.

**EMAIL REQUIREMENTS:**
- Professional, personalized cold outreach email
- Reference specific company details and role
- Mention 2-3 relevant qualifications from their resume
- Include clear call-to-action
- Professional email signature
- Format as HTML with proper structure

**Gmail Draft Creation:**
For each job, you MUST call gmail_create_draft(to="[email]", subject="[subject]", html="[email_html]")
- Extract company email from job listing if available
- Use generic format like "careers@[company].com" if no specific email
- Subject format: "Application for [Job Title] - [Applicant Name]"
- HTML email body with proper formatting

**FINISH:**
Call set_output("application_materials", "Completed")
Only call set_output("application_materials", "Completed") AFTER creating ALL Gmail drafts.
""",
    tools=["save_data", "append_data", "serve_file_to_user", "gmail_create_draft"],
)

+112
-15
@@ -911,6 +911,13 @@ $zaiKey = [System.Environment]::GetEnvironmentVariable("ZAI_API_KEY", "User")
if (-not $zaiKey) { $zaiKey = $env:ZAI_API_KEY }
if ($zaiKey) { $ZaiCredDetected = $true }

+ $KimiCredDetected = $false
+ $kimiConfigPath = Join-Path $env:USERPROFILE ".kimi\config.toml"
+ if (Test-Path $kimiConfigPath) { $KimiCredDetected = $true }
+ $kimiKey = [System.Environment]::GetEnvironmentVariable("KIMI_API_KEY", "User")
+ if (-not $kimiKey) { $kimiKey = $env:KIMI_API_KEY }
+ if ($kimiKey) { $KimiCredDetected = $true }

# Detect API key providers
$ProviderMenuEnvVars = @("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY", "GROQ_API_KEY", "CEREBRAS_API_KEY")
$ProviderMenuNames = @("Anthropic (Claude) - Recommended", "OpenAI (GPT)", "Google Gemini - Free tier available", "Groq - Fast, free tier", "Cerebras - Fast, free tier")

@@ -938,7 +945,9 @@ if (Test-Path $HiveConfigFile) {
$PrevEnvVar = if ($prevLlm.api_key_env_var) { $prevLlm.api_key_env_var } else { "" }
if ($prevLlm.use_claude_code_subscription) { $PrevSubMode = "claude_code" }
elseif ($prevLlm.use_codex_subscription) { $PrevSubMode = "codex" }
+ elseif ($prevLlm.use_kimi_code_subscription) { $PrevSubMode = "kimi_code" }
elseif ($prevLlm.api_base -and $prevLlm.api_base -like "*api.z.ai*") { $PrevSubMode = "zai_code" }
+ elseif ($prevLlm.api_base -and $prevLlm.api_base -like "*api.kimi.com*") { $PrevSubMode = "kimi_code" }
}
} catch { }
}

@@ -951,6 +960,7 @@ if ($PrevSubMode -or $PrevProvider) {
"claude_code" { if ($ClaudeCredDetected) { $prevCredValid = $true } }
"zai_code" { if ($ZaiCredDetected) { $prevCredValid = $true } }
"codex" { if ($CodexCredDetected) { $prevCredValid = $true } }
+ "kimi_code" { if ($KimiCredDetected) { $prevCredValid = $true } }
default {
if ($PrevEnvVar) {
$envVal = [System.Environment]::GetEnvironmentVariable($PrevEnvVar, "Process")

@@ -964,14 +974,16 @@ if ($PrevSubMode -or $PrevProvider) {
"claude_code" { $DefaultChoice = "1" }
"zai_code" { $DefaultChoice = "2" }
"codex" { $DefaultChoice = "3" }
+ "kimi_code" { $DefaultChoice = "4" }
}
if (-not $DefaultChoice) {
switch ($PrevProvider) {
- "anthropic" { $DefaultChoice = "4" }
- "openai" { $DefaultChoice = "5" }
- "gemini" { $DefaultChoice = "6" }
- "groq" { $DefaultChoice = "7" }
- "cerebras" { $DefaultChoice = "8" }
+ "anthropic" { $DefaultChoice = "5" }
+ "openai" { $DefaultChoice = "6" }
+ "gemini" { $DefaultChoice = "7" }
+ "groq" { $DefaultChoice = "8" }
+ "cerebras" { $DefaultChoice = "9" }
+ "kimi" { $DefaultChoice = "4" }
}
}
}
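The renumbering above can be summarized as two lookup tables: subscription modes keep fixed slots 1-4, and each API-key provider shifts down by one to make room for Kimi Code. This is an illustrative Python sketch of that mapping, not code from the changeset; the names `SUB_MODE_CHOICE`, `PROVIDER_CHOICE`, and `default_choice` are hypothetical.

```python
# Hypothetical sketch of the default-choice resolution: a previously
# configured subscription mode wins, otherwise the saved provider id
# maps to its (newly shifted) menu number.
SUB_MODE_CHOICE = {"claude_code": "1", "zai_code": "2", "codex": "3", "kimi_code": "4"}
PROVIDER_CHOICE = {
    "anthropic": "5", "openai": "6", "gemini": "7",
    "groq": "8", "cerebras": "9", "kimi": "4",
}

def default_choice(sub_mode: str, provider: str):
    # Subscription mode takes precedence; fall back to the provider table.
    return SUB_MODE_CHOICE.get(sub_mode) or PROVIDER_CHOICE.get(provider)
```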
@@ -1003,12 +1015,19 @@ Write-Host ") OpenAI Codex Subscription " -NoNewline
Write-Color -Text "(use your Codex/ChatGPT Plus plan)" -Color DarkGray -NoNewline
if ($CodexCredDetected) { Write-Color -Text " (credential detected)" -Color Green } else { Write-Host "" }

+ # 4) Kimi Code
+ Write-Host " " -NoNewline
+ Write-Color -Text "4" -Color Cyan -NoNewline
+ Write-Host ") Kimi Code Subscription " -NoNewline
+ Write-Color -Text "(use your Kimi Code plan)" -Color DarkGray -NoNewline
+ if ($KimiCredDetected) { Write-Color -Text " (credential detected)" -Color Green } else { Write-Host "" }

Write-Host ""
Write-Color -Text " API key providers:" -Color Cyan

- # 4-8) API key providers
+ # 5-9) API key providers
for ($idx = 0; $idx -lt $ProviderMenuEnvVars.Count; $idx++) {
- $num = $idx + 4
+ $num = $idx + 5
$envVal = [System.Environment]::GetEnvironmentVariable($ProviderMenuEnvVars[$idx], "Process")
if (-not $envVal) { $envVal = [System.Environment]::GetEnvironmentVariable($ProviderMenuEnvVars[$idx], "User") }
Write-Host " " -NoNewline

@@ -1018,7 +1037,7 @@ for ($idx = 0; $idx -lt $ProviderMenuEnvVars.Count; $idx++) {
}

Write-Host " " -NoNewline
- Write-Color -Text "9" -Color Cyan -NoNewline
+ Write-Color -Text "10" -Color Cyan -NoNewline
Write-Host ") Skip for now"
Write-Host ""

@@ -1029,16 +1048,16 @@ if ($DefaultChoice) {

while ($true) {
if ($DefaultChoice) {
- $raw = Read-Host "Enter choice (1-9) [$DefaultChoice]"
+ $raw = Read-Host "Enter choice (1-10) [$DefaultChoice]"
if ([string]::IsNullOrWhiteSpace($raw)) { $raw = $DefaultChoice }
} else {
- $raw = Read-Host "Enter choice (1-9)"
+ $raw = Read-Host "Enter choice (1-10)"
}
if ($raw -match '^\d+$') {
$num = [int]$raw
- if ($num -ge 1 -and $num -le 9) { break }
+ if ($num -ge 1 -and $num -le 10) { break }
}
- Write-Color -Text "Invalid choice. Please enter 1-9" -Color Red
+ Write-Color -Text "Invalid choice. Please enter 1-10" -Color Red
}

switch ($num) {

@@ -1102,9 +1121,20 @@ switch ($num) {
Write-Ok "Using OpenAI Codex subscription"
}
}
- { $_ -ge 4 -and $_ -le 8 } {
+ 4 {
+ # Kimi Code Subscription
+ $SubscriptionMode = "kimi_code"
+ $SelectedProviderId = "kimi"
+ $SelectedEnvVar = "KIMI_API_KEY"
+ $SelectedModel = "kimi-k2.5"
+ $SelectedMaxTokens = 32768
+ Write-Host ""
+ Write-Ok "Using Kimi Code subscription"
+ Write-Color -Text " Model: kimi-k2.5 | API: api.kimi.com/coding" -Color DarkGray
+ }
+ { $_ -ge 5 -and $_ -le 9 } {
# API key providers
- $provIdx = $num - 4
+ $provIdx = $num - 5
$SelectedEnvVar = $ProviderMenuEnvVars[$provIdx]
$SelectedProviderId = $ProviderMenuIds[$provIdx]
$providerName = $ProviderMenuNames[$provIdx] -replace ' - .*', '' # strip description

@@ -1175,7 +1205,7 @@ switch ($num) {
}
}
}
- 9 {
+ 10 {
Write-Host ""
Write-Warn "Skipped. An LLM API key is required to test and use worker agents."
Write-Host " Add your API key later by running:"

@@ -1252,6 +1282,70 @@ if ($SubscriptionMode -eq "zai_code") {
}
}

+ # For Kimi Code subscription: prompt for API key with verification + retry
+ if ($SubscriptionMode -eq "kimi_code") {
+ while ($true) {
+ $existingKimi = [System.Environment]::GetEnvironmentVariable("KIMI_API_KEY", "User")
+ if (-not $existingKimi) { $existingKimi = $env:KIMI_API_KEY }
+
+ if ($existingKimi) {
+ $masked = $existingKimi.Substring(0, [Math]::Min(4, $existingKimi.Length)) + "..." + $existingKimi.Substring([Math]::Max(0, $existingKimi.Length - 4))
+ Write-Host ""
+ Write-Color -Text " $([char]0x2B22) Current Kimi key: $masked" -Color Green
+ $apiKey = Read-Host " Press Enter to keep, or paste a new key to replace"
+ } else {
+ Write-Host ""
+ Write-Host "Get your API key from: " -NoNewline
+ Write-Color -Text "https://www.kimi.com/code" -Color Cyan
+ Write-Host ""
+ $apiKey = Read-Host "Paste your Kimi API key (or press Enter to skip)"
+ }
+
+ if ($apiKey) {
+ [System.Environment]::SetEnvironmentVariable("KIMI_API_KEY", $apiKey, "User")
+ $env:KIMI_API_KEY = $apiKey
+ Write-Host ""
+ Write-Ok "Kimi API key saved as User environment variable"
+
+ # Health check the new key
+ Write-Host " Verifying Kimi API key... " -NoNewline
+ try {
+ $hcResult = & uv run python (Join-Path $ScriptDir "scripts/check_llm_key.py") "kimi" $apiKey "https://api.kimi.com/coding" 2>$null
+ $hcJson = $hcResult | ConvertFrom-Json
+ if ($hcJson.valid -eq $true) {
+ Write-Color -Text "ok" -Color Green
+ break
+ } elseif ($hcJson.valid -eq $false) {
+ Write-Color -Text "failed" -Color Red
+ Write-Warn $hcJson.message
+ [System.Environment]::SetEnvironmentVariable("KIMI_API_KEY", $null, "User")
+ Remove-Item -Path "Env:\KIMI_API_KEY" -ErrorAction SilentlyContinue
+ Write-Host ""
+ Read-Host " Press Enter to try again"
+ } else {
+ Write-Color -Text "--" -Color Yellow
+ Write-Color -Text " Could not verify key (network issue). The key has been saved." -Color DarkGray
+ break
+ }
+ } catch {
+ Write-Color -Text "--" -Color Yellow
+ Write-Color -Text " Could not verify key (network issue). The key has been saved." -Color DarkGray
+ break
+ }
+ } elseif (-not $existingKimi) {
+ Write-Host ""
+ Write-Warn "Skipped. Add your Kimi API key later:"
+ Write-Color -Text " [System.Environment]::SetEnvironmentVariable('KIMI_API_KEY', 'your-key', 'User')" -Color Cyan
+ $SelectedEnvVar = ""
+ $SelectedProviderId = ""
+ $SubscriptionMode = ""
+ break
+ } else {
+ break
+ }
+ }
+ }
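The key-masking line in the block above keeps only the first and last four characters of the key. A minimal Python sketch of the same Substring arithmetic, purely for illustration (the `mask_key` helper is hypothetical, not part of the installer):

```python
def mask_key(key: str) -> str:
    # First four and last four characters with an ellipsis between,
    # mirroring the PowerShell Substring/Min/Max arithmetic above.
    # Note: for keys of four characters or fewer, head and tail overlap.
    head = key[: min(4, len(key))]
    tail = key[max(0, len(key) - 4):]
    return head + "..." + tail
```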

# Prompt for model if not already selected (manual provider path)
if ($SelectedProviderId -and -not $SelectedModel) {
$modelSel = Get-ModelSelection $SelectedProviderId

@@ -1287,6 +1381,9 @@ if ($SelectedProviderId) {
} elseif ($SubscriptionMode -eq "zai_code") {
$config.llm["api_base"] = "https://api.z.ai/api/coding/paas/v4"
$config.llm["api_key_env_var"] = $SelectedEnvVar
+ } elseif ($SubscriptionMode -eq "kimi_code") {
+ $config.llm["api_base"] = "https://api.kimi.com/coding"
+ $config.llm["api_key_env_var"] = $SelectedEnvVar
} else {
$config.llm["api_key_env_var"] = $SelectedEnvVar
}
+58 -24
@@ -410,7 +410,7 @@ if [ "$USE_ASSOC_ARRAYS" = true ]; then
declare -A DEFAULT_MODELS=(
["anthropic"]="claude-haiku-4-5-20251001"
["openai"]="gpt-5-mini"
- ["minimax"]="MiniMax-M2.1"
+ ["minimax"]="MiniMax-M2.5"
["gemini"]="gemini-3-flash-preview"
["groq"]="moonshotai/kimi-k2-instruct-0905"
["cerebras"]="zai-glm-4.7"

@@ -510,7 +510,7 @@ else

# Default models by provider id (parallel arrays)
MODEL_PROVIDER_IDS=(anthropic openai minimax gemini groq cerebras mistral together_ai deepseek)
- MODEL_DEFAULTS=("claude-haiku-4-5-20251001" "gpt-5-mini" "MiniMax-M2.1" "gemini-3-flash-preview" "moonshotai/kimi-k2-instruct-0905" "zai-glm-4.7" "mistral-large-latest" "meta-llama/Llama-3.3-70B-Instruct-Turbo" "deepseek-chat")
+ MODEL_DEFAULTS=("claude-haiku-4-5-20251001" "gpt-5-mini" "MiniMax-M2.5" "gemini-3-flash-preview" "moonshotai/kimi-k2-instruct-0905" "zai-glm-4.7" "mistral-large-latest" "meta-llama/Llama-3.3-70B-Instruct-Turbo" "deepseek-chat")

# Helper: get provider display name for an env var
get_provider_name() {

@@ -824,6 +824,13 @@ if [ -n "${MINIMAX_API_KEY:-}" ]; then
MINIMAX_CRED_DETECTED=true
fi

+ KIMI_CRED_DETECTED=false
+ if [ -f "$HOME/.kimi/config.toml" ]; then
+ KIMI_CRED_DETECTED=true
+ elif [ -n "${KIMI_API_KEY:-}" ]; then
+ KIMI_CRED_DETECTED=true
+ fi

# Detect API key providers
if [ "$USE_ASSOC_ARRAYS" = true ]; then
for env_var in "${!PROVIDER_NAMES[@]}"; do

@@ -859,6 +866,7 @@ try:
sub = ''
if llm.get('use_claude_code_subscription'): sub = 'claude_code'
elif llm.get('use_codex_subscription'): sub = 'codex'
+ elif llm.get('use_kimi_code_subscription'): sub = 'kimi_code'
elif llm.get('provider', '') == 'minimax' or 'api.minimax.io' in llm.get('api_base', ''): sub = 'minimax_code'
elif 'api.z.ai' in llm.get('api_base', ''): sub = 'zai_code'
print(f'PREV_SUB_MODE={sub}')

@@ -875,6 +883,7 @@ if [ -n "$PREV_SUB_MODE" ] || [ -n "$PREV_PROVIDER" ]; then
claude_code) [ "$CLAUDE_CRED_DETECTED" = true ] && PREV_CRED_VALID=true ;;
zai_code) [ "$ZAI_CRED_DETECTED" = true ] && PREV_CRED_VALID=true ;;
codex) [ "$CODEX_CRED_DETECTED" = true ] && PREV_CRED_VALID=true ;;
+ kimi_code) [ "$KIMI_CRED_DETECTED" = true ] && PREV_CRED_VALID=true ;;
*)
# API key provider — check if the env var is set
if [ -n "$PREV_ENV_VAR" ] && [ -n "${!PREV_ENV_VAR}" ]; then

@@ -889,15 +898,17 @@ if [ -n "$PREV_SUB_MODE" ] || [ -n "$PREV_PROVIDER" ]; then
zai_code) DEFAULT_CHOICE=2 ;;
codex) DEFAULT_CHOICE=3 ;;
minimax_code) DEFAULT_CHOICE=4 ;;
+ kimi_code) DEFAULT_CHOICE=5 ;;
esac
if [ -z "$DEFAULT_CHOICE" ]; then
case "$PREV_PROVIDER" in
- anthropic) DEFAULT_CHOICE=5 ;;
- openai) DEFAULT_CHOICE=6 ;;
- gemini) DEFAULT_CHOICE=7 ;;
- groq) DEFAULT_CHOICE=8 ;;
- cerebras) DEFAULT_CHOICE=9 ;;
+ anthropic) DEFAULT_CHOICE=6 ;;
+ openai) DEFAULT_CHOICE=7 ;;
+ gemini) DEFAULT_CHOICE=8 ;;
+ groq) DEFAULT_CHOICE=9 ;;
+ cerebras) DEFAULT_CHOICE=10 ;;
minimax) DEFAULT_CHOICE=4 ;;
+ kimi) DEFAULT_CHOICE=5 ;;
esac
fi
fi

@@ -936,14 +947,21 @@ else
echo -e " ${CYAN}4)${NC} MiniMax Coding Key ${DIM}(use your MiniMax coding key)${NC}"
fi

+ # 5) Kimi Code
+ if [ "$KIMI_CRED_DETECTED" = true ]; then
+ echo -e " ${CYAN}5)${NC} Kimi Code Subscription ${DIM}(use your Kimi Code plan)${NC} ${GREEN}(credential detected)${NC}"
+ else
+ echo -e " ${CYAN}5)${NC} Kimi Code Subscription ${DIM}(use your Kimi Code plan)${NC}"
+ fi

echo ""
echo -e " ${CYAN}${BOLD}API key providers:${NC}"

- # 5-9) API key providers — show (credential detected) if key already set
+ # 6-10) API key providers — show (credential detected) if key already set
PROVIDER_MENU_ENVS=(ANTHROPIC_API_KEY OPENAI_API_KEY GEMINI_API_KEY GROQ_API_KEY CEREBRAS_API_KEY)
PROVIDER_MENU_NAMES=("Anthropic (Claude) - Recommended" "OpenAI (GPT)" "Google Gemini - Free tier available" "Groq - Fast, free tier" "Cerebras - Fast, free tier")
for idx in 0 1 2 3 4; do
- num=$((idx + 5))
+ num=$((idx + 6))
env_var="${PROVIDER_MENU_ENVS[$idx]}"
if [ -n "${!env_var}" ]; then
echo -e " ${CYAN}$num)${NC} ${PROVIDER_MENU_NAMES[$idx]} ${GREEN}(credential detected)${NC}"

@@ -952,7 +970,7 @@ for idx in 0 1 2 3 4; do
fi
done

- echo -e " ${CYAN}10)${NC} Skip for now"
+ echo -e " ${CYAN}11)${NC} Skip for now"
echo ""

if [ -n "$DEFAULT_CHOICE" ]; then

@@ -962,15 +980,15 @@ fi

while true; do
if [ -n "$DEFAULT_CHOICE" ]; then
- read -r -p "Enter choice (1-10) [$DEFAULT_CHOICE]: " choice || true
+ read -r -p "Enter choice (1-11) [$DEFAULT_CHOICE]: " choice || true
choice="${choice:-$DEFAULT_CHOICE}"
else
- read -r -p "Enter choice (1-10): " choice || true
+ read -r -p "Enter choice (1-11): " choice || true
fi
- if [[ "$choice" =~ ^[0-9]+$ ]] && [ "$choice" -ge 1 ] && [ "$choice" -le 10 ]; then
+ if [[ "$choice" =~ ^[0-9]+$ ]] && [ "$choice" -ge 1 ] && [ "$choice" -le 11 ]; then
break
fi
- echo -e "${RED}Invalid choice. Please enter 1-10${NC}"
+ echo -e "${RED}Invalid choice. Please enter 1-11${NC}"
done

case $choice in

@@ -1038,46 +1056,60 @@ case $choice in
SUBSCRIPTION_MODE="minimax_code"
SELECTED_ENV_VAR="MINIMAX_API_KEY"
SELECTED_PROVIDER_ID="minimax"
- SELECTED_MODEL="MiniMax-M2.1"
- SELECTED_MAX_TOKENS=8192
+ SELECTED_MODEL="MiniMax-M2.5"
+ SELECTED_MAX_TOKENS=32768
SELECTED_API_BASE="https://api.minimax.io/v1"
PROVIDER_NAME="MiniMax"
SIGNUP_URL="https://platform.minimax.io/user-center/basic-information/interface-key"
echo ""
echo -e "${GREEN}⬢${NC} Using MiniMax coding key"
- echo -e " ${DIM}Model: MiniMax-M2.1 | API: api.minimax.io${NC}"
+ echo -e " ${DIM}Model: MiniMax-M2.5 | API: api.minimax.io${NC}"
;;
5)
+ # Kimi Code Subscription
+ SUBSCRIPTION_MODE="kimi_code"
+ SELECTED_PROVIDER_ID="kimi"
+ SELECTED_ENV_VAR="KIMI_API_KEY"
+ SELECTED_MODEL="kimi-k2.5"
+ SELECTED_MAX_TOKENS=32768
+ SELECTED_API_BASE="https://api.kimi.com/coding"
+ PROVIDER_NAME="Kimi"
+ SIGNUP_URL="https://www.kimi.com/code"
+ echo ""
+ echo -e "${GREEN}⬢${NC} Using Kimi Code subscription"
+ echo -e " ${DIM}Model: kimi-k2.5 | API: api.kimi.com/coding${NC}"
+ ;;
+ 6)
SELECTED_ENV_VAR="ANTHROPIC_API_KEY"
SELECTED_PROVIDER_ID="anthropic"
PROVIDER_NAME="Anthropic"
SIGNUP_URL="https://console.anthropic.com/settings/keys"
;;
- 6)
+ 7)
SELECTED_ENV_VAR="OPENAI_API_KEY"
SELECTED_PROVIDER_ID="openai"
PROVIDER_NAME="OpenAI"
SIGNUP_URL="https://platform.openai.com/api-keys"
;;
- 7)
+ 8)
SELECTED_ENV_VAR="GEMINI_API_KEY"
SELECTED_PROVIDER_ID="gemini"
PROVIDER_NAME="Google Gemini"
SIGNUP_URL="https://aistudio.google.com/apikey"
;;
- 8)
+ 9)
SELECTED_ENV_VAR="GROQ_API_KEY"
SELECTED_PROVIDER_ID="groq"
PROVIDER_NAME="Groq"
SIGNUP_URL="https://console.groq.com/keys"
;;
- 9)
+ 10)
SELECTED_ENV_VAR="CEREBRAS_API_KEY"
SELECTED_PROVIDER_ID="cerebras"
PROVIDER_NAME="Cerebras"
SIGNUP_URL="https://cloud.cerebras.ai/"
;;
- 10)
+ 11)
echo ""
echo -e "${YELLOW}Skipped.${NC} An LLM API key is required to test and use worker agents."
echo -e "Add your API key later by running:"

@@ -1090,7 +1122,7 @@ case $choice in
esac

# For API-key providers: prompt for key (allow replacement if already set)
- if { [ -z "$SUBSCRIPTION_MODE" ] || [ "$SUBSCRIPTION_MODE" = "minimax_code" ]; } && [ -n "$SELECTED_ENV_VAR" ]; then
+ if { [ -z "$SUBSCRIPTION_MODE" ] || [ "$SUBSCRIPTION_MODE" = "minimax_code" ] || [ "$SUBSCRIPTION_MODE" = "kimi_code" ]; } && [ -n "$SELECTED_ENV_VAR" ]; then
while true; do
CURRENT_KEY="${!SELECTED_ENV_VAR}"
if [ -n "$CURRENT_KEY" ]; then

@@ -1118,7 +1150,7 @@ if { [ -z "$SUBSCRIPTION_MODE" ] || [ "$SUBSCRIPTION_MODE" = "minimax_code" ]; }
echo -e "${GREEN}⬢${NC} API key saved to $SHELL_RC_FILE"
# Health check the new key
echo -n " Verifying API key... "
- if [ "$SUBSCRIPTION_MODE" = "minimax_code" ] && [ -n "${SELECTED_API_BASE:-}" ]; then
+ if { [ "$SUBSCRIPTION_MODE" = "minimax_code" ] || [ "$SUBSCRIPTION_MODE" = "kimi_code" ]; } && [ -n "${SELECTED_API_BASE:-}" ]; then
HC_RESULT=$(uv run python "$SCRIPT_DIR/scripts/check_llm_key.py" "$SELECTED_PROVIDER_ID" "$API_KEY" "$SELECTED_API_BASE" 2>/dev/null) || true
else
HC_RESULT=$(uv run python "$SCRIPT_DIR/scripts/check_llm_key.py" "$SELECTED_PROVIDER_ID" "$API_KEY" 2>/dev/null) || true

@@ -1238,6 +1270,8 @@ if [ -n "$SELECTED_PROVIDER_ID" ]; then
save_configuration "$SELECTED_PROVIDER_ID" "$SELECTED_ENV_VAR" "$SELECTED_MODEL" "$SELECTED_MAX_TOKENS" "" "https://api.z.ai/api/coding/paas/v4" > /dev/null
elif [ "$SUBSCRIPTION_MODE" = "minimax_code" ]; then
save_configuration "$SELECTED_PROVIDER_ID" "$SELECTED_ENV_VAR" "$SELECTED_MODEL" "$SELECTED_MAX_TOKENS" "" "$SELECTED_API_BASE" > /dev/null
+ elif [ "$SUBSCRIPTION_MODE" = "kimi_code" ]; then
+ save_configuration "$SELECTED_PROVIDER_ID" "$SELECTED_ENV_VAR" "$SELECTED_MODEL" "$SELECTED_MAX_TOKENS" "" "$SELECTED_API_BASE" > /dev/null
else
save_configuration "$SELECTED_PROVIDER_ID" "$SELECTED_ENV_VAR" "$SELECTED_MODEL" "$SELECTED_MAX_TOKENS" > /dev/null
fi

@@ -56,6 +56,53 @@ def check_openai_compatible(api_key: str, endpoint: str, name: str) -> dict:
return {"valid": False, "message": f"{name} API returned status {r.status_code}"}

+ def check_minimax(
+ api_key: str, api_base: str = "https://api.minimax.io/v1", **_: str
+ ) -> dict:
+ """Validate via chatcompletion_v2 endpoint with empty messages.
+
+ MiniMax doesn't support GET /models; their native endpoint is
+ /v1/text/chatcompletion_v2.
+ """
+ with httpx.Client(timeout=TIMEOUT) as client:
+ r = client.post(
+ f"{api_base.rstrip('/')}/text/chatcompletion_v2",
+ headers={
+ "Authorization": f"Bearer {api_key}",
+ "Content-Type": "application/json",
+ },
+ json={"model": "MiniMax-M2.5", "messages": []},
+ )
+ if r.status_code in (200, 400, 422, 429):
+ return {"valid": True, "message": "MiniMax API key valid"}
+ if r.status_code == 401:
+ return {"valid": False, "message": "Invalid MiniMax API key"}
+ if r.status_code == 403:
+ return {"valid": False, "message": "MiniMax API key lacks permissions"}
+ return {"valid": False, "message": f"MiniMax API returned status {r.status_code}"}
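Both installers consume this script's single-line JSON output and branch three ways on it: a confirmed-valid key is kept, a confirmed-invalid key is discarded and re-prompted, and anything unparseable is treated as "could not verify" and kept. A small illustrative sketch of that consumer side (the `parse_health_check` helper is hypothetical, not part of the changeset):

```python
import json

def parse_health_check(output: str):
    # Mirrors how the installers read check_llm_key.py output,
    # e.g. {"valid": true, "message": "MiniMax API key valid"}.
    # Returns (valid, message); (None, None) means "could not verify".
    try:
        data = json.loads(output)
    except (ValueError, TypeError):
        return (None, None)
    if not isinstance(data, dict):
        return (None, None)
    return (data.get("valid"), data.get("message"))
```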

+ def check_anthropic_compatible(api_key: str, endpoint: str, name: str) -> dict:
+ """POST empty messages to an Anthropic-compatible endpoint to validate key."""
+ with httpx.Client(timeout=TIMEOUT) as client:
+ r = client.post(
+ endpoint,
+ headers={
+ "x-api-key": api_key,
+ "anthropic-version": "2023-06-01",
+ "Content-Type": "application/json",
+ },
+ json={"model": "kimi-k2.5", "max_tokens": 1, "messages": []},
+ )
+ if r.status_code in (200, 400, 429):
+ return {"valid": True, "message": f"{name} API key valid"}
+ if r.status_code == 401:
+ return {"valid": False, "message": f"Invalid {name} API key"}
+ if r.status_code == 403:
+ return {"valid": False, "message": f"{name} API key lacks permissions"}
+ return {"valid": False, "message": f"{name} API returned status {r.status_code}"}

def check_gemini(api_key: str, **_: str) -> dict:
"""List models with query param auth."""
with httpx.Client(timeout=TIMEOUT) as client:

@@ -82,8 +129,11 @@ PROVIDERS = {
"cerebras": lambda key, **kw: check_openai_compatible(
key, "https://api.cerebras.ai/v1/models", "Cerebras"
),
- "minimax": lambda key, **kw: check_openai_compatible(
- key, "https://api.minimax.io/v1/models", "MiniMax"
- ),
+ "minimax": lambda key, **kw: check_minimax(key),
+ # Kimi For Coding uses an Anthropic-compatible endpoint; check via /v1/messages
+ # with empty messages (same as check_anthropic, triggers 400 not 401).
+ "kimi": lambda key, **kw: check_anthropic_compatible(
+ key, "https://api.kimi.com/coding/v1/messages", "Kimi"
+ ),
}

@@ -105,12 +155,17 @@ def main() -> None:
api_base = sys.argv[3] if len(sys.argv) > 3 else ""

try:
- if api_base:
+ if api_base and provider_id == "minimax":
+ result = check_minimax(api_key, api_base)
+ elif api_base and provider_id == "kimi":
+ # Kimi uses an Anthropic-compatible endpoint; check via /v1/messages
+ result = check_anthropic_compatible(
+ api_key, api_base.rstrip("/") + "/v1/messages", "Kimi"
+ )
+ elif api_base:
# Custom API base (ZAI or other OpenAI-compatible)
endpoint = api_base.rstrip("/") + "/models"
- name = {"zai": "ZAI", "minimax": "MiniMax"}.get(
- provider_id, "Custom provider"
- )
+ name = {"zai": "ZAI"}.get(provider_id, "Custom provider")
result = check_openai_compatible(api_key, endpoint, name)
elif provider_id in PROVIDERS:
result = PROVIDERS[provider_id](api_key)

@@ -1,7 +1,7 @@
#!/usr/bin/env python
- """Debug tool to print the queen's running phase prompt."""
+ """Debug tool to print the queen's phase-specific prompts."""

- from framework.agents.hive_coder.nodes import (
+ from framework.agents.queen.nodes import (
_appendices,
_queen_behavior_always,
_queen_behavior_running,

@@ -10,32 +10,36 @@ from framework.agents.hive_coder.nodes import (
_queen_tools_running,
)

+ _DEFAULT_WORKER_IDENTITY = (
+ "\n\n# Worker Profile\n"
+ "No worker agent loaded. You are operating independently.\n"
+ "Handle all tasks directly using your coding tools."
+ )

- def print_running_prompt(worker_identity: str | None = None) -> None:
- """Print the composed running phase prompt.
-
- Args:
- worker_identity: Optional worker identity string. If None, shows
- the "no worker loaded" placeholder.
- """
- if worker_identity is None:
- worker_identity = (
- "\n\n# Worker Profile\n"
- "No worker agent loaded. You are operating independently.\n"
- "Handle all tasks directly using your coding tools."
- )
+ def print_planning_prompt(worker_identity: str | None = None) -> None:
+ """Print the composed planning phase prompt."""
+ from framework.agents.queen.nodes import (
+ _planning_knowledge,
+ _queen_behavior_planning,
+ _queen_identity_planning,
+ _queen_tools_planning,
+ )
+
+ wi = worker_identity or _DEFAULT_WORKER_IDENTITY

prompt = (
- _queen_identity_running
+ _queen_identity_planning
+ _queen_style
- + _queen_tools_running
+ + _queen_tools_planning
+ _queen_behavior_always
- + _queen_behavior_running
- + worker_identity
+ + _queen_behavior_planning
+ + _planning_knowledge
+ + wi
)

print("=" * 80)
- print("QUEEN RUNNING PHASE PROMPT")
+ print("QUEEN PLANNING PHASE PROMPT")
print("=" * 80)
print(prompt)
print("=" * 80)

@@ -44,20 +48,16 @@ def print_running_prompt(worker_identity: str | None = None) -> None:

def print_building_prompt(worker_identity: str | None = None) -> None:
"""Print the composed building phase prompt."""
- from framework.agents.hive_coder.nodes import (
- _agent_builder_knowledge,
+ from framework.agents.queen.nodes import (
+ _building_knowledge,
+ _gcu_building_section,
_queen_behavior_building,
_queen_identity_building,
+ _queen_phase_7,
_queen_tools_building,
)

- if worker_identity is None:
- worker_identity = (
- "\n\n# Worker Profile\n"
- "No worker agent loaded. You are operating independently.\n"
- "Handle all tasks directly using your coding tools."
- )
+ wi = worker_identity or _DEFAULT_WORKER_IDENTITY

prompt = (
_queen_identity_building

@@ -65,10 +65,11 @@ def print_building_prompt(worker_identity: str | None = None) -> None:
+ _queen_tools_building
+ _queen_behavior_always
+ _queen_behavior_building
- + _agent_builder_knowledge
+ + _building_knowledge
+ + _gcu_building_section
+ + _queen_phase_7
+ _appendices
- + worker_identity
+ + wi
)

print("=" * 80)

@@ -81,18 +82,13 @@ def print_building_prompt(worker_identity: str | None = None) -> None:

def print_staging_prompt(worker_identity: str | None = None) -> None:
"""Print the composed staging phase prompt."""
- from framework.agents.hive_coder.nodes import (
+ from framework.agents.queen.nodes import (
_queen_behavior_staging,
_queen_identity_staging,
_queen_tools_staging,
)

- if worker_identity is None:
- worker_identity = (
- "\n\n# Worker Profile\n"
- "No worker agent loaded. You are operating independently.\n"
- "Handle all tasks directly using your coding tools."
- )
+ wi = worker_identity or _DEFAULT_WORKER_IDENTITY

prompt = (
_queen_identity_staging

@@ -100,7 +96,7 @@ def print_staging_prompt(worker_identity: str | None = None) -> None:
+ _queen_tools_staging
+ _queen_behavior_always
+ _queen_behavior_staging
- + worker_identity
+ + wi
)

print("=" * 80)

@@ -111,17 +107,47 @@ def print_staging_prompt(worker_identity: str | None = None) -> None:
print(f"\nTotal length: {len(prompt):,} characters")

+ def print_running_prompt(worker_identity: str | None = None) -> None:
+ """Print the composed running phase prompt.
+
+ Args:
+ worker_identity: Optional worker identity string. If None, shows
+ the "no worker loaded" placeholder.
+ """
+ wi = worker_identity or _DEFAULT_WORKER_IDENTITY
+
+ prompt = (
+ _queen_identity_running
+ + _queen_style
+ + _queen_tools_running
+ + _queen_behavior_always
+ + _queen_behavior_running
+ + wi
+ )
+
+ print("=" * 80)
+ print("QUEEN RUNNING PHASE PROMPT")
+ print("=" * 80)
+ print(prompt)
+ print("=" * 80)
+ print(f"\nTotal length: {len(prompt):,} characters")

if __name__ == "__main__":
import sys

- phase = sys.argv[1] if len(sys.argv) > 1 else "running"
+ phase = sys.argv[1] if len(sys.argv) > 1 else "planning"

if phase == "all":
+ print_planning_prompt()
+ print("\n\n")
print_building_prompt()
print("\n\n")
print_staging_prompt()
print("\n\n")
print_running_prompt()
+ elif phase == "planning":
+ print_planning_prompt()
elif phase == "building":
print_building_prompt()
elif phase == "staging":

@@ -131,6 +157,6 @@ if __name__ == "__main__":
else:
print(f"Unknown phase: {phase}")
print(
- "Usage: uv run scripts/debug_queen_prompt.py [building|staging|running|all]"
+ "Usage: uv run scripts/debug_queen_prompt.py [planning|building|staging|running|all]"
)
sys.exit(1)

@@ -1,4 +1,4 @@
- """Quick test script for initialize_agent_package."""
+ """Quick test script for initialize_and_build_agent."""

import sys
import os

@@ -14,6 +14,6 @@ import tools.coder_tools_server as srv
srv.PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), ".."))

# Access the underlying function (FastMCP wraps it as FunctionTool)
- tool = srv.initialize_agent_package
+ tool = srv.initialize_and_build_agent
result = tool.fn("richard_test2", nodes="intake,process,review")
print(result)

@@ -3,7 +3,7 @@
Coder Tools MCP Server — OpenCode-inspired coding tools.

Provides rich file I/O, fuzzy-match editing, git snapshots, and shell execution
- for the hive_coder agent. Modeled after opencode's tool architecture.
+ for the queen agent. Modeled after opencode's tool architecture.

All paths scoped to a configurable project root for safety.

@@ -1252,6 +1252,53 @@ def validate_agent_package(agent_name: str) -> str:
path_parts.append(pythonpath)
env["PYTHONPATH"] = os.pathsep.join(path_parts)

+ # Step 0: Module contract — __init__.py must expose goal, nodes, edges
+ try:
+ _contract_script = textwrap.dedent("""\
+ import importlib, json
+ mod = importlib.import_module('{agent_name}')
+ missing = [a for a in ('goal', 'nodes', 'edges') if getattr(mod, a, None) is None]
+ if missing:
+ print(json.dumps({{
+ 'valid': False,
+ 'error': (
+ "Module '{agent_name}' is missing module-level attributes: "
+ + ", ".join(missing) + ". "
+ "Fix: in {agent_name}/__init__.py, add "
+ "'from .agent import " + ", ".join(missing) + "' "
+ "so that 'import {agent_name}' exposes them at package level."
+ )
+ }}))
+ else:
+ print(json.dumps({{'valid': True}}))
+ """).format(agent_name=agent_name)
+ proc = subprocess.run(
+ ["uv", "run", "python", "-c", _contract_script],
+ capture_output=True,
+ text=True,
+ timeout=30,
+ env=env,
+ cwd=PROJECT_ROOT,
+ stdin=subprocess.DEVNULL,
+ )
+ if proc.returncode == 0:
+ result = json.loads(proc.stdout.strip())
+ steps["module_contract"] = {
+ "passed": result["valid"],
+ "output": result.get("error", "goal, nodes, edges exported correctly"),
+ }
+ else:
+ steps["module_contract"] = {
+ "passed": False,
+ "error": (
+ f"Failed to import '{agent_name}': {proc.stderr.strip()[:1000]}. "
+ f"Fix: ensure {agent_name}/__init__.py exists and can be imported "
+ f"without errors (check syntax, missing dependencies, relative imports)."
+ ),
+ }
+ except Exception as e:
+ steps["module_contract"] = {"passed": False, "error": str(e)}
|
||||
|
||||
# Step A: Class validation (subprocess for import isolation)
|
||||
try:
|
||||
proc = subprocess.run(
|
||||
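The Step 0 check added above shells out to a fresh interpreter so the import happens in isolation from the server process. The same pattern can be sketched standalone; this sketch substitutes `sys.executable` for the `uv run` wrapper and drops the server-specific `env`/`cwd` plumbing, so it is an illustration of the technique rather than the server's exact code:

```python
import json
import subprocess
import sys
import textwrap


def check_module_contract(module_name: str) -> dict:
    """Import a module in a clean subprocess and report whether it
    exposes the required module-level attributes (goal, nodes, edges)."""
    # Doubled braces survive .format() as literal braces in the script.
    script = textwrap.dedent("""\
        import importlib, json
        mod = importlib.import_module('{name}')
        missing = [a for a in ('goal', 'nodes', 'edges')
                   if getattr(mod, a, None) is None]
        print(json.dumps({{'valid': not missing, 'missing': missing}}))
    """).format(name=module_name)
    proc = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True, text=True, timeout=30,
    )
    if proc.returncode != 0:
        # Import itself failed (syntax error, missing package, ...).
        return {"valid": False, "error": proc.stderr.strip()}
    return json.loads(proc.stdout.strip())


# The stdlib 'json' module has no goal/nodes/edges, so the contract fails:
print(check_module_contract("json"))
```

Running the check in a subprocess means a broken package cannot crash or pollute the caller's interpreter, and the JSON-over-stdout handshake keeps the result machine-readable.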
@@ -1321,9 +1368,11 @@ def validate_agent_package(agent_name: str) -> str:
             result = json.loads(proc.stdout.strip())
             steps["node_completeness"] = {
                 "passed": result["valid"],
-                "output": "; ".join(result["errors"])
-                if result["errors"]
-                else "All defined nodes are in the graph",
+                "output": (
+                    "; ".join(result["errors"])
+                    if result["errors"]
+                    else "All defined nodes are in the graph"
+                ),
             }
             if not result["valid"]:
                 steps["node_completeness"]["errors"] = result["errors"]
@@ -1434,7 +1483,7 @@ def _node_var_name(node_id: str) -> str:


 @mcp.tool()
-def initialize_agent_package(agent_name: str, nodes: str | None = None) -> str:
+def initialize_and_build_agent(agent_name: str, nodes: str | None = None) -> str:
     """Scaffold a new agent package with placeholder files.

     Creates exports/{agent_name}/ with all files needed for a runnable agent:
@@ -1985,6 +2034,9 @@ def runner_loaded():
     ''',
     )

+    # Build list of all generated file paths for the caller.
+    all_file_paths = [info["path"] for info in files_written.values()]
+
     return json.dumps(
         {
             "success": True,
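The added lines collect one field from each entry of a nested dict. Assuming `files_written` maps filenames to info dicts carrying a `"path"` key (the shape the surrounding code implies; the sample values below are hypothetical), the extraction is a single comprehension over `.values()`:

```python
# Hypothetical shape of files_written, inferred from the diff:
files_written = {
    "agent.py": {"path": "exports/demo/agent.py", "bytes": 1024},
    "__init__.py": {"path": "exports/demo/__init__.py", "bytes": 128},
}

# Build list of all generated file paths for the caller.
all_file_paths = [info["path"] for info in files_written.values()]
print(all_file_paths)
```

Since Python 3.7, dicts preserve insertion order, so the resulting list follows the order in which files were written.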
@@ -1994,10 +2046,33 @@ def runner_loaded():
             "nodes": node_list,
             "files_written": files_written,
             "file_count": len(files_written),
+            "files": all_file_paths,
             "next_steps": [
                 f"Customize node definitions in exports/{agent_name}/nodes/__init__.py",
                 f"Define goal and edges in exports/{agent_name}/agent.py",
-                f'Run validate_agent_package("{agent_name}") to check structure',
+                (
+                    "IMPORTANT: All generated files are structurally complete "
+                    "with correct imports, class definition, validate() method, "
+                    "and __init__.py exports. Use edit_file to customize TODO "
+                    "placeholders — do NOT use write_file to rewrite entire files, "
+                    "as this will break imports and structure."
+                ),
+                (
+                    f"Use edit_file to customize system prompts, tools, "
+                    f"input_keys, output_keys, and success_criteria in "
+                    f"exports/{agent_name}/nodes/__init__.py"
+                ),
+                (
+                    f"Use edit_file to customize goal description, "
+                    f"success_criteria values, constraint values, edge "
+                    f"definitions, and identity_prompt in "
+                    f"exports/{agent_name}/agent.py"
+                ),
+                (
+                    "Do NOT modify: imports at top of agent.py, the class "
+                    "definition, validate() method, _build_graph()/_setup()/"
+                    "lifecycle methods, or __init__.py exports — they are "
+                    "already correct."
+                ),
+                f'Run validate_agent_package("{agent_name}") to verify structure',
             ],
         },
         indent=2,
@@ -17,6 +17,9 @@ AIRTABLE_CREDENTIALS = {
         "airtable_update_records",
         "airtable_list_bases",
         "airtable_get_base_schema",
+        "airtable_delete_records",
+        "airtable_search_records",
+        "airtable_list_collaborators",
     ],
     required=True,
     startup_required=False,
@@ -14,6 +14,9 @@ APOLLO_CREDENTIALS = {
         "apollo_enrich_company",
         "apollo_search_people",
         "apollo_search_companies",
+        "apollo_get_person_activities",
+        "apollo_list_email_accounts",
+        "apollo_bulk_enrich_people",
     ],
     required=True,
     startup_required=False,
@@ -16,6 +16,9 @@ ASANA_CREDENTIALS = {
         "asana_get_task",
         "asana_create_task",
         "asana_search_tasks",
+        "asana_update_task",
+        "asana_add_comment",
+        "asana_create_subtask",
     ],
     required=True,
     startup_required=False,
@@ -16,6 +16,9 @@ AWS_S3_CREDENTIALS = {
         "s3_get_object",
         "s3_put_object",
         "s3_delete_object",
+        "s3_copy_object",
+        "s3_get_object_metadata",
+        "s3_generate_presigned_url",
     ],
     required=True,
     startup_required=False,
@@ -42,6 +45,9 @@ AWS_S3_CREDENTIALS = {
         "s3_get_object",
         "s3_put_object",
         "s3_delete_object",
+        "s3_copy_object",
+        "s3_get_object_metadata",
+        "s3_generate_presigned_url",
     ],
     required=True,
     startup_required=False,
@@ -15,6 +15,9 @@ BREVO_CREDENTIALS = {
         "brevo_get_contact",
         "brevo_update_contact",
         "brevo_get_email_stats",
+        "brevo_list_contacts",
+        "brevo_delete_contact",
+        "brevo_list_email_campaigns",
     ],
     required=True,
     startup_required=False,
@@ -16,6 +16,9 @@ CALENDLY_CREDENTIALS = {
         "calendly_list_scheduled_events",
         "calendly_get_scheduled_event",
         "calendly_list_invitees",
+        "calendly_cancel_event",
+        "calendly_list_webhooks",
+        "calendly_get_event_type",
     ],
     required=True,
     startup_required=False,
@@ -16,6 +16,9 @@ CLOUDINARY_CREDENTIALS = {
         "cloudinary_get_resource",
         "cloudinary_delete_resource",
         "cloudinary_search",
+        "cloudinary_get_usage",
+        "cloudinary_rename_resource",
+        "cloudinary_add_tag",
     ],
     required=True,
     startup_required=False,
@@ -41,6 +44,9 @@ CLOUDINARY_CREDENTIALS = {
         "cloudinary_get_resource",
         "cloudinary_delete_resource",
         "cloudinary_search",
+        "cloudinary_get_usage",
+        "cloudinary_rename_resource",
+        "cloudinary_add_tag",
     ],
     required=True,
     startup_required=False,
@@ -60,6 +66,9 @@ CLOUDINARY_CREDENTIALS = {
         "cloudinary_get_resource",
         "cloudinary_delete_resource",
         "cloudinary_search",
+        "cloudinary_get_usage",
+        "cloudinary_rename_resource",
+        "cloudinary_add_tag",
     ],
     required=True,
     startup_required=False,
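Each of the credential hunks above follows the same shape: a per-service definition whose `tools` list gains three entries while `required` and `startup_required` are left untouched. A minimal sketch of that pattern, where `CredentialSpec` is a hypothetical stand-in for whatever helper these modules actually call:

```python
from dataclasses import dataclass, field


@dataclass
class CredentialSpec:
    """Hypothetical stand-in for the credential helper used in the diff."""
    tools: list[str] = field(default_factory=list)
    required: bool = True
    startup_required: bool = False


# Mirrors the ASANA_CREDENTIALS hunk: three original tools plus the
# three added in this changeset.
ASANA = CredentialSpec(
    tools=[
        "asana_get_task",
        "asana_create_task",
        "asana_search_tasks",
        # Newly exposed in this changeset:
        "asana_update_task",
        "asana_add_comment",
        "asana_create_subtask",
    ],
    required=True,
    startup_required=False,
)
print(len(ASANA.tools))
```

Keeping `required`/`startup_required` as explicit keyword arguments means each hunk can grow the tool list without touching the activation semantics of the credential.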
Some files were not shown because too many files have changed in this diff.