Merge remote-tracking branch 'origin/main' into feature/quickstart-credential-store
@@ -267,7 +267,7 @@ This returns JSON with all the goal, nodes, edges, and MCP server configurations
- NOT: `{"first-node-id": ["input_keys"]}` (WRONG)
- NOT: `{"first-node-id"}` (WRONG - this is a set)

**Use the example agent** at `.claude/skills/building-agents-construction/examples/online_research_agent/` as a template for file structure and patterns.
**Use the example agent** at `.claude/skills/building-agents-construction/examples/deep_research_agent/` as a template for file structure and patterns. It demonstrates: STEP 1/STEP 2 prompts, client-facing nodes, feedback loops, nullable_output_keys, and data tools.

**AFTER writing all files, tell the user:**

@@ -354,7 +354,7 @@ mcp__agent-builder__get_session_status()

## REFERENCE: System Prompt Best Practice

For event_loop nodes, instruct the LLM to use `set_output` for structured outputs:
For **internal** event_loop nodes (not client-facing), instruct the LLM to use `set_output`:

```
Use set_output(key, value) to store your results. For example:
@@ -363,71 +363,55 @@ Use set_output(key, value) to store your results. For example:
Do NOT return raw JSON. Use the set_output tool to produce outputs.
```

For **client-facing** event_loop nodes, use the STEP 1/STEP 2 pattern:

```
**STEP 1 — Respond to the user (text only, NO tool calls):**
[Present information, ask questions, etc.]

**STEP 2 — After the user responds, call set_output:**
- set_output("key", "value based on user's response")
```

This prevents the LLM from calling `set_output` before the user has had a chance to respond. The "NO tool calls" instruction in STEP 1 ensures the node blocks for user input before proceeding.

---

## CRITICAL: EventLoopNode Registration
## EventLoopNode Runtime

**`AgentRuntime` does NOT support `event_loop` nodes.** The `AgentRuntime` / `create_agent_runtime()` path creates `GraphExecutor` instances internally without passing a `node_registry`, causing all `event_loop` nodes to fail at runtime with:

```
EventLoopNode 'node-id' not found in registry. Register it with executor.register_node() before execution.
```

**The correct pattern**: Use `GraphExecutor` directly with a `node_registry` dict containing `EventLoopNode` instances:
EventLoopNodes are **auto-created** by `GraphExecutor` at runtime. Both direct `GraphExecutor` and `AgentRuntime` / `create_agent_runtime()` handle event_loop nodes automatically. No manual `node_registry` setup is needed.

```python
from framework.graph.executor import GraphExecutor, ExecutionResult
from framework.graph.event_loop_node import EventLoopNode, LoopConfig
from framework.runtime.event_bus import EventBus
from framework.runtime.core import Runtime  # REQUIRED - executor calls runtime.start_run()
# Direct execution
from framework.graph.executor import GraphExecutor
from framework.runtime.core import Runtime

# 1. Build node_registry with EventLoopNode instances
event_bus = EventBus()
node_registry = {}
for node_spec in nodes:
    if node_spec.node_type == "event_loop":
        node_registry[node_spec.id] = EventLoopNode(
            event_bus=event_bus,
            judge=None,  # implicit judge: accepts when output_keys are filled
            config=LoopConfig(
                max_iterations=50,
                max_tool_calls_per_turn=15,
                stall_detection_threshold=3,
                max_history_tokens=32000,
            ),
            tool_executor=tool_executor,
        )

# 2. Create Runtime for run tracking (GraphExecutor calls runtime.start_run())
storage_path = Path.home() / ".hive" / "my_agent"
storage_path.mkdir(parents=True, exist_ok=True)
runtime = Runtime(storage_path)

# 3. Create GraphExecutor WITH node_registry and runtime
executor = GraphExecutor(
    runtime=runtime,  # NOT None - executor needs this for run tracking
    runtime=runtime,
    llm=llm,
    tools=tools,
    tool_executor=tool_executor,
    node_registry=node_registry,  # EventLoopNode instances
    storage_path=storage_path,
)

# 4. Execute
result = await executor.execute(graph=graph, goal=goal, input_data=input_data)
```

**DO NOT use `AgentRuntime` or `create_agent_runtime()` for agents with `event_loop` nodes.**

**DO NOT pass `runtime=None` to `GraphExecutor`** — it will crash with `'NoneType' object has no attribute 'start_run'`.

---

## COMMON MISTAKES TO AVOID

1. **Using `AgentRuntime` with event_loop nodes** - `AgentRuntime` does not register EventLoopNodes. Use `GraphExecutor` directly with `node_registry`
2. **Passing `runtime=None` to GraphExecutor** - The executor calls `runtime.start_run()` internally. Always provide a `Runtime(storage_path)` instance
3. **Using tools that don't exist** - Always check `mcp__agent-builder__list_mcp_tools()` first
4. **Wrong entry_points format** - Must be `{"start": "node-id"}`, NOT a set or list
5. **Skipping validation** - Always validate nodes and graph before proceeding
6. **Not waiting for approval** - Always ask user before major steps
7. **Displaying this file** - Execute the steps, don't show documentation
1. **Using tools that don't exist** - Always check `mcp__agent-builder__list_mcp_tools()` first
2. **Wrong entry_points format** - Must be `{"start": "node-id"}`, NOT a set or list — see the sketch after this list
3. **Skipping validation** - Always validate nodes and graph before proceeding
4. **Not waiting for approval** - Always ask user before major steps
5. **Displaying this file** - Execute the steps, don't show documentation
6. **Too many thin nodes** - Prefer fewer, richer nodes (4 nodes > 8 nodes)
7. **Missing STEP 1/STEP 2 in client-facing prompts** - Client-facing nodes need explicit phases to prevent premature set_output
8. **Forgetting nullable_output_keys** - Mark input_keys that only arrive on certain edges (e.g., feedback) as nullable on the receiving node
9. **Adding framework gating for LLM behavior** - Fix prompts or use judges, not ad-hoc code
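
A minimal sketch of the `entry_points` shape (plain Python; the node id is a placeholder):

```python
# Correct: a dict mapping entry-point name to node id
entry_points = {"start": "intake"}

# Wrong: a set — there is no entry-point name
entry_points = {"intake"}

# Wrong: a list
entry_points = ["intake"]
```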
@@ -0,0 +1,24 @@
"""
Deep Research Agent - Interactive, rigorous research with TUI conversation.

Research any topic through multi-source web search, quality evaluation,
and synthesis. Features client-facing TUI interaction at key checkpoints
for user guidance and iterative deepening.
"""

from .agent import DeepResearchAgent, default_agent, goal, nodes, edges
from .config import RuntimeConfig, AgentMetadata, default_config, metadata

__version__ = "1.0.0"

__all__ = [
    "DeepResearchAgent",
    "default_agent",
    "goal",
    "nodes",
    "edges",
    "RuntimeConfig",
    "AgentMetadata",
    "default_config",
    "metadata",
]
+96
-18
@@ -1,5 +1,5 @@
"""
CLI entry point for Online Research Agent.
CLI entry point for Deep Research Agent.

Uses AgentRuntime for multi-entrypoint support with HITL pause/resume.
"""
@@ -10,7 +10,7 @@ import logging
import sys
import click

from .agent import default_agent, OnlineResearchAgent
from .agent import default_agent, DeepResearchAgent


def setup_logging(verbose=False, debug=False):
@@ -28,7 +28,7 @@ def setup_logging(verbose=False, debug=False):
@click.group()
@click.version_option(version="1.0.0")
def cli():
    """Online Research Agent - Deep-dive research with narrative reports."""
    """Deep Research Agent - Interactive, rigorous research with TUI conversation."""
    pass


@@ -59,6 +59,83 @@ def run(topic, mock, quiet, verbose, debug):
    sys.exit(0 if result.success else 1)


@cli.command()
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def tui(mock, verbose, debug):
    """Launch the TUI dashboard for interactive research."""
    setup_logging(verbose=verbose, debug=debug)

    try:
        from framework.tui.app import AdenTUI
    except ImportError:
        click.echo("TUI requires the 'textual' package. Install with: pip install textual")
        sys.exit(1)

    from pathlib import Path

    from framework.llm import LiteLLMProvider
    from framework.runner.tool_registry import ToolRegistry
    from framework.runtime.agent_runtime import create_agent_runtime
    from framework.runtime.event_bus import EventBus
    from framework.runtime.execution_stream import EntryPointSpec

    async def run_with_tui():
        agent = DeepResearchAgent()

        # Build graph and tools
        agent._event_bus = EventBus()
        agent._tool_registry = ToolRegistry()

        mcp_config_path = Path(__file__).parent / "mcp_servers.json"
        if mcp_config_path.exists():
            agent._tool_registry.load_mcp_config(mcp_config_path)

        llm = None
        if not mock:
            llm = LiteLLMProvider(
                model=agent.config.model,
                api_key=agent.config.api_key,
                api_base=agent.config.api_base,
            )

        tools = list(agent._tool_registry.get_tools().values())
        tool_executor = agent._tool_registry.get_executor()
        graph = agent._build_graph()

        storage_path = Path.home() / ".hive" / "deep_research_agent"
        storage_path.mkdir(parents=True, exist_ok=True)

        runtime = create_agent_runtime(
            graph=graph,
            goal=agent.goal,
            storage_path=storage_path,
            entry_points=[
                EntryPointSpec(
                    id="start",
                    name="Start Research",
                    entry_node="intake",
                    trigger_type="manual",
                    isolation_level="isolated",
                ),
            ],
            llm=llm,
            tools=tools,
            tool_executor=tool_executor,
        )

        await runtime.start()

        try:
            app = AdenTUI(runtime)
            await app.run_async()
        finally:
            await runtime.stop()

    asyncio.run(run_with_tui())


@cli.command()
@click.option("--json", "output_json", is_flag=True)
def info(output_json):

@@ -71,6 +148,7 @@ def info(output_json):
    click.echo(f"Version: {info_data['version']}")
    click.echo(f"Description: {info_data['description']}")
    click.echo(f"\nNodes: {', '.join(info_data['nodes'])}")
    click.echo(f"Client-facing: {', '.join(info_data['client_facing_nodes'])}")
    click.echo(f"Entry: {info_data['entry_node']}")
    click.echo(f"Terminal: {', '.join(info_data['terminal_nodes'])}")

@@ -81,6 +159,9 @@ def validate():
    validation = default_agent.validate()
    if validation["valid"]:
        click.echo("Agent is valid")
        if validation["warnings"]:
            for warning in validation["warnings"]:
                click.echo(f"  WARNING: {warning}")
    else:
        click.echo("Agent has errors:")
        for error in validation["errors"]:
@@ -91,7 +172,7 @@ def validate():
@cli.command()
@click.option("--verbose", "-v", is_flag=True)
def shell(verbose):
    """Interactive research session."""
    """Interactive research session (CLI, no TUI)."""
    asyncio.run(_interactive_shell(verbose))


@@ -99,10 +180,10 @@ async def _interactive_shell(verbose=False):
    """Async interactive shell."""
    setup_logging(verbose=verbose)

    click.echo("=== Online Research Agent ===")
    click.echo("=== Deep Research Agent ===")
    click.echo("Enter a topic to research (or 'quit' to exit):\n")

    agent = OnlineResearchAgent()
    agent = DeepResearchAgent()
    await agent.start()

    try:
@@ -118,7 +199,7 @@ async def _interactive_shell(verbose=False):
            if not topic.strip():
                continue

            click.echo("\nResearching... (this may take a few minutes)\n")
            click.echo("\nResearching...\n")

            result = await agent.trigger_and_wait("start", {"topic": topic})

@@ -128,16 +209,14 @@ async def _interactive_shell(verbose=False):

            if result.success:
                output = result.output
                if "file_path" in output:
                    click.echo(f"\nReport saved to: {output['file_path']}\n")
                if "final_report" in output:
                    click.echo("\n--- Report Preview ---\n")
                    preview = (
                        output["final_report"][:500] + "..."
                        if len(output.get("final_report", "")) > 500
                        else output.get("final_report", "")
                    )
                    click.echo(preview)
                if "report_content" in output:
                    click.echo("\n--- Report ---\n")
                    click.echo(output["report_content"])
                    click.echo("\n")
                if "references" in output:
                    click.echo("--- References ---\n")
                    for ref in output.get("references", []):
                        click.echo(f"  [{ref.get('number', '?')}] {ref.get('title', '')} - {ref.get('url', '')}")
                    click.echo("\n")
            else:
                click.echo(f"\nResearch failed: {result.error}\n")
@@ -148,7 +227,6 @@ async def _interactive_shell(verbose=False):
    except Exception as e:
        click.echo(f"Error: {e}", err=True)
        import traceback

        traceback.print_exc()
    finally:
        await agent.stop()
+72
-127
@@ -1,9 +1,8 @@
"""Agent graph construction for Online Research Agent."""
"""Agent graph construction for Deep Research Agent."""

from framework.graph import EdgeSpec, EdgeCondition, Goal, SuccessCriterion, Constraint
from framework.graph.edge import GraphSpec
from framework.graph.executor import ExecutionResult, GraphExecutor
from framework.graph.event_loop_node import EventLoopNode, LoopConfig
from framework.runtime.event_bus import EventBus
from framework.runtime.core import Runtime
from framework.llm import LiteLLMProvider
@@ -11,164 +10,132 @@ from framework.runner.tool_registry import ToolRegistry

from .config import default_config, metadata
from .nodes import (
    parse_query_node,
    search_sources_node,
    fetch_content_node,
    evaluate_sources_node,
    synthesize_findings_node,
    write_report_node,
    quality_check_node,
    save_report_node,
    intake_node,
    research_node,
    review_node,
    report_node,
)

# Goal definition
goal = Goal(
    id="comprehensive-online-research",
    name="Comprehensive Online Research",
    description="Research any topic by searching multiple sources, synthesizing information, and producing a well-structured narrative report with citations.",
    id="rigorous-interactive-research",
    name="Rigorous Interactive Research",
    description=(
        "Research any topic by searching diverse sources, analyzing findings, "
        "and producing a cited report — with user checkpoints to guide direction."
    ),
    success_criteria=[
        SuccessCriterion(
            id="source-coverage",
            description="Query 10+ diverse sources",
            id="source-diversity",
            description="Use multiple diverse, authoritative sources",
            metric="source_count",
            target=">=10",
            weight=0.20,
        ),
        SuccessCriterion(
            id="relevance",
            description="All sources directly address the query",
            metric="relevance_score",
            target="90%",
            target=">=5",
            weight=0.25,
        ),
        SuccessCriterion(
            id="synthesis",
            description="Synthesize findings into coherent narrative",
            metric="coherence_score",
            target="85%",
            weight=0.25,
        ),
        SuccessCriterion(
            id="citations",
            description="Include citations for all claims",
            id="citation-coverage",
            description="Every factual claim in the report cites its source",
            metric="citation_coverage",
            target="100%",
            weight=0.15,
            weight=0.25,
        ),
        SuccessCriterion(
            id="actionable",
            description="Report answers the user's question",
            metric="answer_completeness",
            id="user-satisfaction",
            description="User reviews findings before report generation",
            metric="user_approval",
            target="true",
            weight=0.25,
        ),
        SuccessCriterion(
            id="report-completeness",
            description="Final report answers the original research questions",
            metric="question_coverage",
            target="90%",
            weight=0.15,
            weight=0.25,
        ),
    ],
    constraints=[
        Constraint(
            id="no-hallucination",
            description="Only include information found in sources",
            description="Only include information found in fetched sources",
            constraint_type="quality",
            category="accuracy",
        ),
        Constraint(
            id="source-attribution",
            description="Every factual claim must cite its source",
            description="Every claim must cite its source with a numbered reference",
            constraint_type="quality",
            category="accuracy",
        ),
        Constraint(
            id="recency-preference",
            description="Prefer recent sources when relevant",
            constraint_type="quality",
            category="relevance",
        ),
        Constraint(
            id="no-paywalled",
            description="Avoid sources that require payment to access",
            id="user-checkpoint",
            description="Present findings to the user before writing the final report",
            constraint_type="functional",
            category="accessibility",
            category="interaction",
        ),
    ],
)

# Node list
nodes = [
    parse_query_node,
    search_sources_node,
    fetch_content_node,
    evaluate_sources_node,
    synthesize_findings_node,
    write_report_node,
    quality_check_node,
    save_report_node,
    intake_node,
    research_node,
    review_node,
    report_node,
]

# Edge definitions
edges = [
    # intake -> research
    EdgeSpec(
        id="parse-to-search",
        source="parse-query",
        target="search-sources",
        id="intake-to-research",
        source="intake",
        target="research",
        condition=EdgeCondition.ON_SUCCESS,
        priority=1,
    ),
    # research -> review
    EdgeSpec(
        id="search-to-fetch",
        source="search-sources",
        target="fetch-content",
        id="research-to-review",
        source="research",
        target="review",
        condition=EdgeCondition.ON_SUCCESS,
        priority=1,
    ),
    # review -> research (feedback loop)
    EdgeSpec(
        id="fetch-to-evaluate",
        source="fetch-content",
        target="evaluate-sources",
        condition=EdgeCondition.ON_SUCCESS,
        id="review-to-research-feedback",
        source="review",
        target="research",
        condition=EdgeCondition.CONDITIONAL,
        condition_expr="needs_more_research == True",
        priority=1,
    ),
    # review -> report (user satisfied)
    EdgeSpec(
        id="evaluate-to-synthesize",
        source="evaluate-sources",
        target="synthesize-findings",
        condition=EdgeCondition.ON_SUCCESS,
        priority=1,
    ),
    EdgeSpec(
        id="synthesize-to-write",
        source="synthesize-findings",
        target="write-report",
        condition=EdgeCondition.ON_SUCCESS,
        priority=1,
    ),
    EdgeSpec(
        id="write-to-quality",
        source="write-report",
        target="quality-check",
        condition=EdgeCondition.ON_SUCCESS,
        priority=1,
    ),
    EdgeSpec(
        id="quality-to-save",
        source="quality-check",
        target="save-report",
        condition=EdgeCondition.ON_SUCCESS,
        priority=1,
        id="review-to-report",
        source="review",
        target="report",
        condition=EdgeCondition.CONDITIONAL,
        condition_expr="needs_more_research == False",
        priority=2,
    ),
]

# Graph configuration
entry_node = "parse-query"
entry_points = {"start": "parse-query"}
entry_node = "intake"
entry_points = {"start": "intake"}
pause_nodes = []
terminal_nodes = ["save-report"]
terminal_nodes = ["report"]


class OnlineResearchAgent:
class DeepResearchAgent:
    """
    Online Research Agent - Deep-dive research with narrative reports.
    Deep Research Agent — 4-node pipeline with user checkpoints.

    Uses GraphExecutor directly with EventLoopNode instances registered
    in the node_registry for multi-turn tool execution.
    Flow: intake -> research -> review -> report
                       ^          |
                       +----------+  feedback loop (if user wants more)
    """

    def __init__(self, config=None):
@@ -188,7 +155,7 @@ class OnlineResearchAgent:
    def _build_graph(self) -> GraphSpec:
        """Build the GraphSpec."""
        return GraphSpec(
            id="online-research-agent-graph",
            id="deep-research-agent-graph",
            goal_id=self.goal.id,
            version="1.0.0",
            entry_node=self.entry_node,
@@ -201,29 +168,11 @@ class OnlineResearchAgent:
            max_tokens=self.config.max_tokens,
        )

    def _build_node_registry(self, tool_executor=None) -> dict:
        """Create EventLoopNode instances for all event_loop nodes."""
        registry = {}
        for node_spec in self.nodes:
            if node_spec.node_type == "event_loop":
                registry[node_spec.id] = EventLoopNode(
                    event_bus=self._event_bus,
                    judge=None,  # implicit judge: accept when output_keys are filled
                    config=LoopConfig(
                        max_iterations=50,
                        max_tool_calls_per_turn=15,
                        stall_detection_threshold=3,
                        max_history_tokens=32000,
                    ),
                    tool_executor=tool_executor,
                )
        return registry

    def _setup(self, mock_mode=False) -> GraphExecutor:
        """Set up the executor with all components."""
        from pathlib import Path

        storage_path = Path.home() / ".hive" / "online_research_agent"
        storage_path = Path.home() / ".hive" / "deep_research_agent"
        storage_path.mkdir(parents=True, exist_ok=True)

        self._event_bus = EventBus()
@@ -245,7 +194,6 @@ class OnlineResearchAgent:
        tools = list(self._tool_registry.get_tools().values())

        self._graph = self._build_graph()
        node_registry = self._build_node_registry(tool_executor=tool_executor)
        runtime = Runtime(storage_path)

        self._executor = GraphExecutor(
@@ -253,7 +201,8 @@ class OnlineResearchAgent:
            llm=llm,
            tools=tools,
            tool_executor=tool_executor,
            node_registry=node_registry,
            event_bus=self._event_bus,
            storage_path=storage_path,
        )

        return self._executor
@@ -317,7 +266,7 @@ class OnlineResearchAgent:
            "entry_points": self.entry_points,
            "pause_nodes": self.pause_nodes,
            "terminal_nodes": self.terminal_nodes,
            "multi_entrypoint": True,
            "client_facing_nodes": [n.id for n in self.nodes if n.client_facing],
        }

    def validate(self):
@@ -339,10 +288,6 @@ class OnlineResearchAgent:
            if terminal not in node_ids:
                errors.append(f"Terminal node '{terminal}' not found")

        for pause in self.pause_nodes:
            if pause not in node_ids:
                errors.append(f"Pause node '{pause}' not found")

        for ep_id, node_id in self.entry_points.items():
            if node_id not in node_ids:
                errors.append(
@@ -357,4 +302,4 @@ class OnlineResearchAgent:


# Create default instance
default_agent = OnlineResearchAgent()
default_agent = DeepResearchAgent()
+6
-3
@@ -32,12 +32,15 @@ class RuntimeConfig:
default_config = RuntimeConfig()


# Agent metadata
@dataclass
class AgentMetadata:
    name: str = "Online Research Agent"
    name: str = "Deep Research Agent"
    version: str = "1.0.0"
    description: str = "Research any topic by searching multiple sources, synthesizing information, and producing a well-structured narrative report with citations."
    description: str = (
        "Interactive research agent that rigorously investigates topics through "
        "multi-source search, quality evaluation, and synthesis - with TUI conversation "
        "at key checkpoints for user guidance and feedback."
    )

metadata = AgentMetadata()
+147
@@ -0,0 +1,147 @@
"""Node definitions for Deep Research Agent."""
|
||||
|
||||
from framework.graph import NodeSpec
|
||||
|
||||
# Node 1: Intake (client-facing)
|
||||
# Brief conversation to clarify what the user wants researched.
|
||||
intake_node = NodeSpec(
|
||||
id="intake",
|
||||
name="Research Intake",
|
||||
description="Discuss the research topic with the user, clarify scope, and confirm direction",
|
||||
node_type="event_loop",
|
||||
client_facing=True,
|
||||
input_keys=["topic"],
|
||||
output_keys=["research_brief"],
|
||||
system_prompt="""\
|
||||
You are a research intake specialist. The user wants to research a topic.
|
||||
Have a brief conversation to clarify what they need.
|
||||
|
||||
**STEP 1 — Read and respond (text only, NO tool calls):**
|
||||
1. Read the topic provided
|
||||
2. If it's vague, ask 1-2 clarifying questions (scope, angle, depth)
|
||||
3. If it's already clear, confirm your understanding and ask the user to confirm
|
||||
|
||||
Keep it short. Don't over-ask.
|
||||
|
||||
**STEP 2 — After the user confirms, call set_output:**
|
||||
- set_output("research_brief", "A clear paragraph describing exactly what to research, \
|
||||
what questions to answer, what scope to cover, and how deep to go.")
|
||||
""",
|
||||
tools=[],
|
||||
)
|
||||
|
||||
# Node 2: Research
|
||||
# The workhorse — searches the web, fetches content, analyzes sources.
|
||||
# One node with both tools avoids the context-passing overhead of 5 separate nodes.
|
||||
research_node = NodeSpec(
|
||||
id="research",
|
||||
name="Research",
|
||||
description="Search the web, fetch source content, and compile findings",
|
||||
node_type="event_loop",
|
||||
max_node_visits=3,
|
||||
input_keys=["research_brief", "feedback"],
|
||||
output_keys=["findings", "sources", "gaps"],
|
||||
nullable_output_keys=["feedback"],
|
||||
system_prompt="""\
|
||||
You are a research agent. Given a research brief, find and analyze sources.
|
||||
|
||||
If feedback is provided, this is a follow-up round — focus on the gaps identified.
|
||||
|
||||
Work in phases:
|
||||
1. **Search**: Use web_search with 3-5 diverse queries covering different angles.
|
||||
Prioritize authoritative sources (.edu, .gov, established publications).
|
||||
2. **Fetch**: Use web_scrape on the most promising URLs (aim for 5-8 sources).
|
||||
Skip URLs that fail. Extract the substantive content.
|
||||
3. **Analyze**: Review what you've collected. Identify key findings, themes,
|
||||
and any contradictions between sources.
|
||||
|
||||
Important:
|
||||
- Work in batches of 3-4 tool calls at a time to manage context
|
||||
- After each batch, assess whether you have enough material
|
||||
- Prefer quality over quantity — 5 good sources beat 15 thin ones
|
||||
- Track which URL each finding comes from (you'll need citations later)
|
||||
|
||||
When done, use set_output:
|
||||
- set_output("findings", "Structured summary: key findings with source URLs for each claim. \
|
||||
Include themes, contradictions, and confidence levels.")
|
||||
- set_output("sources", [{"url": "...", "title": "...", "summary": "..."}])
|
||||
- set_output("gaps", "What aspects of the research brief are NOT well-covered yet, if any.")
|
||||
""",
|
||||
tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
|
||||
)
|
||||
|
||||
# Node 3: Review (client-facing)
|
||||
# Shows the user what was found and asks whether to dig deeper or proceed.
|
||||
review_node = NodeSpec(
|
||||
id="review",
|
||||
name="Review Findings",
|
||||
description="Present findings to user and decide whether to research more or write the report",
|
||||
node_type="event_loop",
|
||||
client_facing=True,
|
||||
max_node_visits=3,
|
||||
input_keys=["findings", "sources", "gaps", "research_brief"],
|
||||
output_keys=["needs_more_research", "feedback"],
|
||||
system_prompt="""\
|
||||
Present the research findings to the user clearly and concisely.
|
||||
|
||||
**STEP 1 — Present (your first message, text only, NO tool calls):**
|
||||
1. **Summary** (2-3 sentences of what was found)
|
||||
2. **Key Findings** (bulleted, with confidence levels)
|
||||
3. **Sources Used** (count and quality assessment)
|
||||
4. **Gaps** (what's still unclear or under-covered)
|
||||
|
||||
End by asking: Are they satisfied, or do they want deeper research? \
|
||||
Should we proceed to writing the final report?
|
||||
|
||||
**STEP 2 — After the user responds, call set_output:**
|
||||
- set_output("needs_more_research", "true") — if they want more
|
||||
- set_output("needs_more_research", "false") — if they're satisfied
|
||||
- set_output("feedback", "What the user wants explored further, or empty string")
|
||||
""",
|
||||
tools=[],
|
||||
)
|
||||
|
||||
# Node 4: Report (client-facing)
|
||||
# Writes the final report and presents it to the user.
|
||||
report_node = NodeSpec(
|
||||
id="report",
|
||||
name="Write & Deliver Report",
|
||||
description="Write a cited report from the findings and present it to the user",
|
||||
node_type="event_loop",
|
||||
client_facing=True,
|
||||
input_keys=["findings", "sources", "research_brief"],
|
||||
output_keys=["delivery_status"],
|
||||
system_prompt="""\
|
||||
Write a comprehensive research report and present it to the user.
|
||||
|
||||
**STEP 1 — Write and present the report (text only, NO tool calls):**
|
||||
|
||||
Report structure:
|
||||
1. **Executive Summary** (2-3 paragraphs)
|
||||
2. **Findings** (organized by theme, with [n] citations)
|
||||
3. **Analysis** (synthesis, implications, areas of debate)
|
||||
4. **Conclusion** (key takeaways, confidence assessment)
|
||||
5. **References** (numbered list of sources cited)
|
||||
|
||||
Requirements:
|
||||
- Every factual claim must cite its source with [n] notation
|
||||
- Be objective — present multiple viewpoints where sources disagree
|
||||
- Distinguish well-supported conclusions from speculation
|
||||
- Answer the original research questions from the brief
|
||||
|
||||
End by asking the user if they have questions or want to save the report.
|
||||
|
||||
**STEP 2 — After the user responds:**
|
||||
- Answer follow-up questions from the research material
|
||||
- If they want to save, use write_to_file tool
|
||||
- When the user is satisfied: set_output("delivery_status", "completed")
|
||||
""",
|
||||
tools=["write_to_file"],
|
||||
)
|
||||
|
||||
__all__ = [
|
||||
"intake_node",
|
||||
"research_node",
|
||||
"review_node",
|
||||
"report_node",
|
||||
]
|
||||
@@ -1,80 +0,0 @@
# Online Research Agent

Deep-dive research agent that searches 10+ sources and produces comprehensive narrative reports with citations.

## Features

- Generates multiple search queries from a topic
- Searches and fetches 15+ web sources
- Evaluates and ranks sources by relevance
- Synthesizes findings into themes
- Writes narrative report with numbered citations
- Quality checks for uncited claims
- Saves report to local markdown file

## Usage

### CLI

```bash
# Show agent info
uv run python -m online_research_agent info

# Validate structure
uv run python -m online_research_agent validate

# Run research on a topic
uv run python -m online_research_agent run --topic "impact of AI on healthcare"

# Interactive shell
uv run python -m online_research_agent shell
```

### Python API

```python
from online_research_agent import default_agent

# Simple usage
result = await default_agent.run({"topic": "climate change solutions"})

# Check output
if result.success:
    print(f"Report saved to: {result.output['file_path']}")
    print(result.output['final_report'])
```

## Workflow

```
parse-query → search-sources → fetch-content → evaluate-sources
                                                      ↓
              write-report ← synthesize-findings ←────┘
                   ↓
             quality-check → save-report
```

## Output

Reports are saved to `./research_reports/` as markdown files with:

1. Executive Summary
2. Introduction
3. Key Findings (by theme)
4. Analysis
5. Conclusion
6. References

## Requirements

- Python 3.11+
- LLM provider API key (Groq, Cerebras, etc.)
- Internet access for web search/fetch

## Configuration

Edit `config.py` to change:

- `model`: LLM model (default: groq/moonshotai/kimi-k2-instruct-0905)
- `temperature`: Generation temperature (default: 0.7)
- `max_tokens`: Max tokens per response (default: 16384)
-23
@@ -1,23 +0,0 @@
"""
Online Research Agent - Deep-dive research with narrative reports.

Research any topic by searching multiple sources, synthesizing information,
and producing a well-structured narrative report with citations.
"""

from .agent import OnlineResearchAgent, default_agent, goal, nodes, edges
from .config import RuntimeConfig, AgentMetadata, default_config, metadata

__version__ = "1.0.0"

__all__ = [
    "OnlineResearchAgent",
    "default_agent",
    "goal",
    "nodes",
    "edges",
    "RuntimeConfig",
    "AgentMetadata",
    "default_config",
    "metadata",
]
-232
@@ -1,232 +0,0 @@
"""Node definitions for Online Research Agent."""

from framework.graph import NodeSpec

# Node 1: Parse Query
parse_query_node = NodeSpec(
    id="parse-query",
    name="Parse Query",
    description="Analyze the research topic and generate 3-5 diverse search queries to cover different aspects",
    node_type="event_loop",
    input_keys=["topic"],
    output_keys=["search_queries", "research_focus", "key_aspects"],
    system_prompt="""\
You are a research query strategist. Given a research topic, analyze it and generate search queries.

Your task:
1. Understand the core research question
2. Identify 3-5 key aspects to investigate
3. Generate 3-5 diverse search queries that will find comprehensive information

Use set_output to store each result:
- set_output("research_focus", "Brief statement of what we're researching")
- set_output("key_aspects", ["aspect1", "aspect2", "aspect3"])
- set_output("search_queries", ["query 1", "query 2", "query 3", "query 4", "query 5"])
""",
    tools=[],
)

# Node 2: Search Sources
search_sources_node = NodeSpec(
    id="search-sources",
    name="Search Sources",
    description="Execute web searches using the generated queries to find 15+ source URLs",
    node_type="event_loop",
    input_keys=["search_queries", "research_focus"],
    output_keys=["source_urls", "search_results_summary"],
    system_prompt="""\
You are a research assistant executing web searches. Use the web_search tool to find sources.

Your task:
1. Execute each search query using web_search tool
2. Collect URLs from search results
3. Aim for 15+ diverse sources

After searching, use set_output to store results:
- set_output("source_urls", ["url1", "url2", ...])
- set_output("search_results_summary", "Brief summary of what was found")
""",
    tools=["web_search"],
)

# Node 3: Fetch Content
fetch_content_node = NodeSpec(
    id="fetch-content",
    name="Fetch Content",
    description="Fetch and extract content from the discovered source URLs",
    node_type="event_loop",
    input_keys=["source_urls", "research_focus"],
    output_keys=["fetched_sources", "fetch_errors"],
    system_prompt="""\
You are a content fetcher. Use web_scrape tool to retrieve content from URLs.

Your task:
1. Fetch content from each source URL using web_scrape tool
2. Extract the main content relevant to the research focus
3. Track any URLs that failed to fetch

After fetching, use set_output to store results:
- set_output("fetched_sources", [{"url": "...", "title": "...", "content": "..."}])
- set_output("fetch_errors", ["url that failed", ...])
""",
    tools=["web_scrape"],
)

# Node 4: Evaluate Sources
evaluate_sources_node = NodeSpec(
    id="evaluate-sources",
    name="Evaluate Sources",
    description="Score sources for relevance and quality, filter to top 10",
    node_type="event_loop",
    input_keys=["fetched_sources", "research_focus", "key_aspects"],
    output_keys=["ranked_sources", "source_analysis"],
    system_prompt="""\
You are a source evaluator. Assess each source for quality and relevance.

Scoring criteria:
- Relevance to research focus (1-10)
- Source credibility (1-10)
- Information depth (1-10)
- Recency if relevant (1-10)

Your task:
1. Score each source
2. Rank by combined score
3. Select top 10 sources
4. Note what each source uniquely contributes

Use set_output to store results:
- set_output("ranked_sources", [{"url": "...", "title": "...", "score": 8.5}])
- set_output("source_analysis", "Overview of source quality and coverage")
""",
    tools=[],
)

# Node 5: Synthesize Findings
synthesize_findings_node = NodeSpec(
    id="synthesize-findings",
    name="Synthesize Findings",
    description="Extract key facts from sources and identify common themes",
    node_type="event_loop",
    input_keys=["ranked_sources", "research_focus", "key_aspects"],
    output_keys=["key_findings", "themes", "source_citations"],
    system_prompt="""\
You are a research synthesizer. Analyze multiple sources to extract insights.

Your task:
1. Identify key facts from each source
2. Find common themes across sources
3. Note contradictions or debates
4. Build a citation map (fact -> source URL)

Use set_output to store each result:
- set_output("key_findings", [{"finding": "...", "sources": ["url1"], "confidence": "high"}])
- set_output("themes", [{"theme": "...", "description": "...", "supporting_sources": [...]}])
- set_output("source_citations", {"fact or claim": ["url1", "url2"]})
""",
    tools=[],
)

# Node 6: Write Report
write_report_node = NodeSpec(
    id="write-report",
    name="Write Report",
    description="Generate a narrative report with proper citations",
    node_type="event_loop",
    input_keys=[
        "key_findings",
        "themes",
        "source_citations",
        "research_focus",
        "ranked_sources",
    ],
    output_keys=["report_content", "references"],
    system_prompt="""\
You are a research report writer. Create a well-structured narrative report.

Report structure:
1. Executive Summary (2-3 paragraphs)
2. Introduction (context and scope)
3. Key Findings (organized by theme)
4. Analysis (synthesis and implications)
5. Conclusion
6. References (numbered list of all sources)

Citation format: Use numbered citations like [1], [2] that correspond to the References section.

IMPORTANT:
- Every factual claim MUST have a citation
- Write in clear, professional prose
- Be objective and balanced
- Highlight areas of consensus and debate

Use set_output to store results:
- set_output("report_content", "Full markdown report text with citations...")
- set_output("references", [{"number": 1, "url": "...", "title": "..."}])
""",
    tools=[],
)

# Node 7: Quality Check
quality_check_node = NodeSpec(
    id="quality-check",
    name="Quality Check",
    description="Verify all claims have citations and report is coherent",
    node_type="event_loop",
    input_keys=["report_content", "references", "source_citations"],
    output_keys=["quality_score", "issues", "final_report"],
    system_prompt="""\
You are a quality assurance reviewer. Check the research report for issues.

Check for:
1. Uncited claims (factual statements without [n] citation)
2. Broken citations (references to non-existent numbers)
3. Coherence (logical flow between sections)
4. Completeness (all key aspects covered)
5. Accuracy (claims match source content)

If issues found, fix them in the final report.

Use set_output to store results:
- set_output("quality_score", 0.95)
- set_output("issues", [{"type": "uncited_claim", "location": "...", "fixed": true}])
- set_output("final_report", "Corrected full report with all issues fixed...")
""",
    tools=[],
)

# Node 8: Save Report
save_report_node = NodeSpec(
    id="save-report",
    name="Save Report",
    description="Write the final report to a local markdown file",
    node_type="event_loop",
    input_keys=["final_report", "references", "research_focus"],
    output_keys=["file_path", "save_status"],
    system_prompt="""\
You are a file manager. Save the research report to disk.

Your task:
1. Generate a filename from the research focus (slugified, with date)
2. Use the write_to_file tool to save the report as markdown
3. Save to the ./research_reports/ directory

Filename format: research_YYYY-MM-DD_topic-slug.md

Use set_output to store results:
- set_output("file_path", "research_reports/research_2026-01-23_topic-name.md")
- set_output("save_status", "success")
""",
    tools=["write_to_file"],
)

__all__ = [
    "parse_query_node",
    "search_sources_node",
    "fetch_content_node",
    "evaluate_sources_node",
    "synthesize_findings_node",
    "write_report_node",
    "quality_check_node",
    "save_report_node",
]
@@ -158,6 +158,43 @@ intake_node = NodeSpec(

> **Legacy Note:** The old `pause_nodes` / `entry_points` pattern still works but `client_facing=True` is preferred for new agents.

**STEP 1 / STEP 2 Prompt Pattern:** For client-facing nodes, structure the system prompt with two explicit phases:

```python
system_prompt="""\
**STEP 1 — Respond to the user (text only, NO tool calls):**
[Present information, ask questions, etc.]

**STEP 2 — After the user responds, call set_output:**
[Call set_output with the structured outputs]
"""
```

This prevents the LLM from calling `set_output` prematurely before the user has had a chance to respond.

### Node Design: Fewer, Richer Nodes

Prefer fewer nodes that do more work over many thin single-purpose nodes:

- **Bad**: 8 thin nodes (parse query → search → fetch → evaluate → synthesize → write → check → save)
- **Good**: 4 rich nodes (intake → research → review → report)

Why: Each node boundary requires serializing outputs and passing context. Fewer nodes mean the LLM retains full context of its work within the node. A research node that searches, fetches, and analyzes keeps all the source material in its conversation history.

### nullable_output_keys for Cross-Edge Inputs

When a node receives inputs that only arrive on certain edges (e.g., `feedback` only comes from a review → research feedback loop, not from intake → research), mark those keys as `nullable_output_keys`:

```python
research_node = NodeSpec(
    id="research",
    input_keys=["research_brief", "feedback"],
    nullable_output_keys=["feedback"],  # Not present on first visit
    max_node_visits=3,
    ...
)
```

## Event Loop Architecture Concepts

### How EventLoopNode Works

@@ -169,40 +206,30 @@ An event loop node runs a multi-turn loop:
4. Judge evaluates: ACCEPT (exit loop), RETRY (loop again), or ESCALATE
5. Repeat until judge ACCEPTs or max_iterations reached
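
A compressed pseudocode sketch of that loop (illustrative only — names and structure here are assumptions, not the framework's actual internals):

```python
# Hypothetical sketch of the event-loop control flow described above.
for _ in range(config.max_iterations):
    response = llm.generate(history)           # LLM produces text and/or tool calls
    results = run_tools(response.tool_calls)   # tools execute; set_output is captured separately
    history.extend(results)                    # results feed the next turn
    verdict = judge.evaluate(outputs)          # ACCEPT / RETRY / ESCALATE
    if verdict == "ACCEPT":
        break                                  # loop exits on acceptance
```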

### CRITICAL: EventLoopNode Runtime Requirements
### EventLoopNode Runtime

EventLoopNodes are **not auto-created** by the graph executor. They must be explicitly instantiated and registered in a `node_registry` dict before execution.

**Required components:**
1. **`EventLoopNode` instances** — One per event_loop NodeSpec, registered in `node_registry`
2. **`Runtime` instance** — `GraphExecutor` calls `runtime.start_run()` internally. Passing `None` crashes the executor
3. **`GraphExecutor` (not `AgentRuntime`)** — `AgentRuntime`/`create_agent_runtime()` does NOT pass `node_registry` to the internal `GraphExecutor`, so all event_loop nodes fail with "not found in registry"
EventLoopNodes are **auto-created** by `GraphExecutor` at runtime. You do NOT need to manually register them. Both `GraphExecutor` (direct) and `AgentRuntime` / `create_agent_runtime()` handle event_loop nodes automatically.

```python
# Direct execution — executor auto-creates EventLoopNodes
from framework.graph.executor import GraphExecutor
from framework.graph.event_loop_node import EventLoopNode, LoopConfig
from framework.runtime.event_bus import EventBus
from framework.runtime.core import Runtime

# Build node_registry
event_bus = EventBus()
node_registry = {}
for node_spec in nodes:
    if node_spec.node_type == "event_loop":
        node_registry[node_spec.id] = EventLoopNode(
            event_bus=event_bus,
            config=LoopConfig(max_iterations=50, max_tool_calls_per_turn=15),
            tool_executor=tool_executor,
        )

# Create executor with Runtime and node_registry
runtime = Runtime(storage_path)
executor = GraphExecutor(
    runtime=runtime,
    llm=llm,
    tools=tools,
    tool_executor=tool_executor,
    node_registry=node_registry,
    storage_path=storage_path,
)
result = await executor.execute(graph=graph, goal=goal, input_data=input_data)

# TUI execution — AgentRuntime also works
from framework.runtime.agent_runtime import create_agent_runtime
runtime = create_agent_runtime(
    graph=graph, goal=goal, storage_path=storage_path,
    entry_points=[...], llm=llm, tools=tools, tool_executor=tool_executor,
)
```

@@ -210,8 +237,12 @@ executor = GraphExecutor(

Nodes produce structured outputs by calling `set_output(key, value)` — a synthetic tool injected by the framework. When the LLM calls `set_output`, the value is stored in the output accumulator and made available to downstream nodes via shared memory.

`set_output` is NOT a real tool — it is excluded from `real_tool_results`. For client-facing nodes, this means a turn where the LLM only calls `set_output` (no other tools) is treated as a conversational boundary and will block for user input.

### JudgeProtocol

**The judge is the SOLE mechanism for acceptance decisions.** Do not add ad-hoc framework gating, output rollback, or premature rejection logic. If the LLM calls `set_output` too early, fix it with better prompts or a custom judge — not framework-level guards.

The judge controls when a node's loop exits:
- **Implicit judge** (default, no judge configured): ACCEPTs when the LLM finishes with no tool calls and all required output keys are set
- **SchemaJudge**: Validates outputs against a Pydantic model
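
As an illustration of schema-based validation, a sketch of the kind of Pydantic model a SchemaJudge could validate against (the exact SchemaJudge constructor is not shown in this diff, so the wiring below is an assumption):

```python
from pydantic import BaseModel

class ResearchOutputs(BaseModel):
    findings: str
    sources: list[dict]
    gaps: str

# Hypothetical wiring — consult the framework source for the real signature.
judge = SchemaJudge(schema=ResearchOutputs)
```
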
@@ -225,6 +256,23 @@ Controls loop behavior:

- `stall_detection_threshold` (default 3) — detects repeated identical responses
- `max_history_tokens` (default 32000) — triggers conversation compaction

### Data Tools (Spillover Management)

When tool results exceed the context window, the framework automatically saves them to a spillover directory and truncates with a hint. Nodes that produce or consume large data should include the data tools:

- `save_data(filename, data, data_dir)` — Write data to a file in the data directory
- `load_data(filename, data_dir, offset=0, limit=50)` — Read data with line-based pagination
- `list_data_files(data_dir)` — List available data files

These are real MCP tools (not synthetic). Add them to nodes that handle large tool results:

```python
research_node = NodeSpec(
    ...
    tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
)
```

### Fan-Out / Fan-In

Multiple ON_SUCCESS edges from the same source create parallel execution. All branches run concurrently via `asyncio.gather()`. Parallel event_loop nodes must have disjoint `output_keys`.
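
A minimal sketch of a fan-out (two ON_SUCCESS edges from one source; the node ids are illustrative):

```python
edges = [
    EdgeSpec(id="research-to-summarize", source="research", target="summarize",
             condition=EdgeCondition.ON_SUCCESS, priority=1),
    EdgeSpec(id="research-to-extract", source="research", target="extract",
             condition=EdgeCondition.ON_SUCCESS, priority=1),
    # Both targets run concurrently; "summarize" and "extract" must set
    # disjoint output_keys so their results don't collide in shared memory.
]
```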

@@ -61,28 +61,38 @@ For agents needing multi-turn conversations with users, use `client_facing=True`

A client-facing node streams LLM output to the user and blocks for user input between conversational turns. This replaces the old pause/resume pattern.

```python
# Client-facing node blocks for user input
# Client-facing node with STEP 1/STEP 2 prompt pattern
intake_node = NodeSpec(
    id="intake",
    name="Intake",
    description="Gather requirements from the user",
    node_type="event_loop",
    client_facing=True,
    input_keys=[],
    output_keys=["repo_url", "project_url"],
    system_prompt="You are the intake agent. Ask the user for their repo URL and project URL. When you have both, call set_output for each.",
    input_keys=["topic"],
    output_keys=["research_brief"],
    system_prompt="""\
You are an intake specialist.

**STEP 1 — Read and respond (text only, NO tool calls):**
1. Read the topic provided
2. If it's vague, ask 1-2 clarifying questions
3. If it's clear, confirm your understanding

**STEP 2 — After the user confirms, call set_output:**
- set_output("research_brief", "Clear description of what to research")
""",
)

# Internal node runs without user interaction
scanner_node = NodeSpec(
    id="scanner",
    name="Scanner",
    description="Scan the repository",
research_node = NodeSpec(
    id="research",
    name="Research",
    description="Search and analyze sources",
    node_type="event_loop",
    input_keys=["repo_url"],
    output_keys=["scan_results"],
    system_prompt="Scan the repository at {repo_url}...",
    tools=["scan_github_repo"],
    input_keys=["research_brief"],
    output_keys=["findings", "sources"],
    system_prompt="Research the topic using web_search and web_scrape...",
    tools=["web_search", "web_scrape", "load_data", "save_data"],
)
```

@@ -91,6 +101,9 @@ scanner_node = NodeSpec(
- User input is injected via `node.inject_event(text)`
- When the LLM calls `set_output` to produce structured outputs, the judge evaluates and ACCEPTs
- Internal nodes (non-client-facing) run their entire loop without blocking
- `set_output` is a synthetic tool — a turn with only `set_output` calls (no real tools) triggers user input blocking

**STEP 1/STEP 2 pattern:** Always structure client-facing prompts with explicit phases. STEP 1 is text-only conversation. STEP 2 calls `set_output` after user confirmation. This prevents the LLM from calling `set_output` prematurely before the user responds.

### When to Use client_facing

@@ -160,6 +173,12 @@ EdgeSpec(
|
||||
|
||||
## Judge Patterns
|
||||
|
||||
**Core Principle: The judge is the SOLE mechanism for acceptance decisions.** Never add ad-hoc framework gating to compensate for LLM behavior. If the LLM calls `set_output` prematurely, fix the system prompt or use a custom judge. Anti-patterns to avoid:
|
||||
- Output rollback logic
|
||||
- `_user_has_responded` flags
|
||||
- Premature set_output rejection
|
||||
- Interaction protocol injection into system prompts
|
||||
|
||||
Judges control when an event_loop node's loop exits. Choose based on validation needs.
|
||||
|
||||
### Implicit Judge (Default)
|
||||
@@ -241,15 +260,34 @@ EventLoopNode automatically manages context window usage with tiered compaction:
|
||||
|
||||
### Spillover Pattern
|
||||
|
||||
For large tool results, use `save_data()` to write to disk and pass the filename through `set_output`. This keeps the LLM context window small.
|
||||
The framework automatically truncates large tool results and saves full content to a spillover directory. The LLM receives a truncation message with instructions to use `load_data` to read the full result.
|
||||
|
||||
```
|
||||
LLM calls save_data(filename, large_data) → file written to spillover/
|
||||
LLM calls set_output("results_file", filename) → filename stored in output
|
||||
Downstream node calls load_data(filename) → reads from spillover/
|
||||
For explicit data management, use the data tools (real MCP tools, not synthetic):
|
||||
|
||||
```python
|
||||
# save_data, load_data, list_data_files are real MCP tools
|
||||
# Each takes a data_dir parameter since the MCP server is shared
|
||||
|
||||
# Saving large results
|
||||
save_data(filename="sources.json", data=large_json_string, data_dir="/path/to/spillover")
|
||||
|
||||
# Reading with pagination (line-based offset/limit)
|
||||
load_data(filename="sources.json", data_dir="/path/to/spillover", offset=0, limit=50)
|
||||
|
||||
# Listing available files
|
||||
list_data_files(data_dir="/path/to/spillover")
|
||||
```
|
||||
|
||||
The `load_data()` tool supports `offset` and `limit` parameters for paginated reading of large files.
|
||||
Add data tools to nodes that handle large tool results:
|
||||
|
||||
```python
|
||||
research_node = NodeSpec(
|
||||
...
|
||||
tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
|
||||
)
|
||||
```
|
||||
|
||||
The `data_dir` is passed by the framework (from the node's spillover directory). The LLM sees `data_dir` in truncation messages and uses it when calling `load_data`.
|
||||
|
||||
## Anti-Patterns

@@ -259,6 +297,29 @@ The `load_data()` tool supports `offset` and `limit` parameters for paginated re
- **Don't hide code in session** — Write to files as components are approved
- **Don't wait to write files** — Agent visible from first step
- **Don't batch everything** — Write incrementally, one component at a time
- **Don't create too many thin nodes** — Prefer fewer, richer nodes (see below)
- **Don't add framework gating for LLM behavior** — Fix prompts or use judges instead

### Fewer, Richer Nodes

A common mistake is splitting work into too many small single-purpose nodes. Each node boundary requires serializing outputs, losing in-context information, and adding edge complexity.

| Bad (8 thin nodes) | Good (4 rich nodes) |
|---------------------|---------------------|
| parse-query | intake (client-facing) |
| search-sources | research (search + fetch + analyze) |
| fetch-content | review (client-facing) |
| evaluate-sources | report (write + deliver) |
| synthesize-findings | |
| write-report | |
| quality-check | |
| save-report | |

**Why fewer nodes are better:**
- The LLM retains full context of its work within a single node
- A research node that searches, fetches, and analyzes keeps all source material in its conversation history
- Fewer edges mean a simpler graph and fewer failure points
- Data tools (`save_data`/`load_data`) handle context window limits within a single node

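To make the rich-node column concrete, here is a hedged sketch of a single research node that folds search, fetch, and analysis into one event loop. The `NodeSpec` fields mirror the examples above; the id, keys, and prompt text are illustrative:

```python
# Illustrative only — field names follow the NodeSpec examples in this doc;
# your actual constructor may differ.
research_node = NodeSpec(
    id="research",
    name="Research",
    node_type="event_loop",
    system_prompt=(
        "Search for sources, fetch their content, and analyze them. "
        "Use save_data() for large results, then call "
        "set_output('findings', ...) and set_output('sources_file', ...)."
    ),
    input_keys=["research_question"],
    output_keys=["findings", "sources_file"],
    tools=["web_search", "web_scrape", "save_data", "load_data", "list_data_files"],
)
```
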
### MCP Tools - Correct Usage


@@ -55,14 +55,10 @@ jobs:
      - name: Install uv
        uses: astral-sh/setup-uv@v4

      - name: Install dependencies
      - name: Install dependencies and run tests
        run: |
          cd core
          uv sync

      - name: Run tests
        run: |
          cd core
          uv run pytest tests/ -v

  test-tools:

@@ -54,7 +54,6 @@ __pycache__/
*.egg-info/
.eggs/
*.egg
uv.lock

# Generated runtime data
core/data/

+2
-3
@@ -198,9 +198,8 @@ hive/ # Repository root
│ ├── quizzes/ # Developer quizzes
│ └── i18n/ # Translations
│
├── scripts/ # Build & utility scripts
│ ├── setup-python.sh # Python environment setup
│ └── setup.sh # Legacy setup script
├── scripts/ # Utility scripts
│ └── auto-close-duplicates.ts # GitHub duplicate issue closer
│
├── quickstart.sh # Interactive setup wizard
├── ENVIRONMENT_SETUP.md # Complete Python setup guide

+10
-52
@@ -21,42 +21,18 @@ This will:
- Fix package compatibility issues (openai + litellm)
- Verify all installations

## Quick Setup (Windows – PowerShell)
## Windows Setup

Windows users can use the native PowerShell setup script.
Windows users should use **WSL (Windows Subsystem for Linux)** to set up and run agents.

Before running the script, allow script execution for the current session:

```powershell
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
```

Run setup from the project root:

```powershell
./scripts/setup-python.ps1
```

This will:

- Check Python version (requires 3.11+)
- Create a local `.venv` virtual environment
- Install the core framework package (`framework`)
- Install the tools package (`aden_tools`)
- Fix package compatibility issues (openai + litellm)
- Verify all installations

After setup, activate the virtual environment:

```powershell
.\.venv\Scripts\Activate.ps1
```

Set `PYTHONPATH` (required in every new PowerShell session):

```powershell
$env:PYTHONPATH="core;exports"
```
1. [Install WSL 2](https://learn.microsoft.com/en-us/windows/wsl/install) if you haven't already:
   ```powershell
   wsl --install
   ```
2. Open your WSL terminal, clone the repo, and run the quickstart script:
   ```bash
   ./quickstart.sh
   ```

## Alpine Linux Setup

@@ -326,12 +302,6 @@ Or run the setup script:
./quickstart.sh
```

Windows:

```powershell
./scripts/setup-python.ps1
```

### "ModuleNotFoundError: No module named 'openai._models'"

**Cause:** Outdated `openai` package (0.27.x) incompatible with `litellm`
@@ -375,12 +345,6 @@ uv pip uninstall framework tools
./quickstart.sh
```

Windows:

```powershell
./scripts/setup-python.ps1
```

## Package Structure

The Hive framework consists of three Python packages:
@@ -479,12 +443,6 @@ This design allows agents in `exports/` to be:
./quickstart.sh
```

Windows:

```powershell
./scripts/setup-python.ps1
```

### 2. Build Agent (Claude Code)

```

@@ -4,7 +4,6 @@
- **Added empty response retry logic** — LLM provider now detects empty responses (e.g. Gemini returning 200 with no content on rate limit) and retries with exponential backoff, preventing hallucinated output from the cleanup LLM
- **Added context-aware input compaction** — LLM nodes now estimate input token count before calling the model and progressively truncate the largest values if they exceed the context window budget
- **Increased rate limit retries to 10** with verbose `[retry]` and `[compaction]` logging that includes model name, finish reason, and attempt count
- **Updated setup scripts** — `scripts/setup-python.sh` now installs Playwright Chromium browser automatically for web scraping support
- **Interactive quickstart onboarding** — `quickstart.sh` rewritten as bee-themed interactive wizard that detects existing API keys (including Claude Code subscription), lets user pick ONE default LLM provider, and saves configuration to `~/.hive/configuration.json`
- **Fixed lint errors** across `hubspot_tool.py` (line length) and `agent_builder_server.py` (unused variable)

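For illustration, a minimal sketch of that compaction idea. This is not the framework's `_compact_inputs()`; the helper names and the chars-per-token heuristic are assumptions:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token (assumption).
    return len(text) // 4


def compact_inputs(inputs: dict[str, str], budget_tokens: int) -> dict[str, str]:
    """Progressively halve the largest values until under the token budget."""
    compacted = dict(inputs)
    while sum(estimate_tokens(v) for v in compacted.values()) > budget_tokens:
        key = max(compacted, key=lambda k: estimate_tokens(compacted[k]))
        if len(compacted[key]) < 8:
            break  # nothing meaningful left to trim
        compacted[key] = compacted[key][: len(compacted[key]) // 2]
    return compacted
```
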
@@ -24,8 +23,6 @@
- `tools/src/aden_tools/tools/web_scrape_tool/README.md` — Updated docs
- `tools/pyproject.toml` — Added `playwright`, `playwright-stealth` deps
- `tools/Dockerfile` — Added `playwright install chromium --with-deps`
- `scripts/setup-python.sh` — Added Playwright Chromium browser install step

### LLM Reliability
- `core/framework/llm/litellm.py` — Empty response retry + max retries 10 + verbose logging
- `core/framework/graph/node.py` — Input compaction via `_compact_inputs()`, `_estimate_tokens()`, `_get_context_limit()`
@@ -41,7 +38,6 @@
## Test plan
- [ ] Run `make lint` — passes clean
- [ ] Run `./quickstart.sh` and verify interactive flow works, config saved to `~/.hive/configuration.json`
- [ ] Run `./scripts/setup-python.sh` and verify Playwright Chromium installs
- [ ] Run `pytest tests/tools/test_web_scrape_tool.py -v`
- [ ] Run agent against a JS-heavy site and verify `web_scrape` returns rendered content
- [ ] Set `HUBSPOT_ACCESS_TOKEN` and verify HubSpot tool CRUD operations work

@@ -0,0 +1,30 @@
# TUI Text Selection and Copy Guide

## Keybindings

| Key | Action |
|---------------|-----------------------|
| `Tab` | Next panel |
| `Shift+Tab` | Previous panel |
| `Ctrl+S` | Save SVG screenshot |
| `Ctrl+O` | Command palette |
| `Q` | Quit |

## Panel Cycle Order

`Tab` cycles: **Log Pane → Graph View → Chat Input**

## Text Selection

Textual apps capture the mouse, so normal click-drag selection won't work by default. To select and copy text from any pane:

1. **Hold `Shift`** while clicking and dragging — this bypasses Textual's mouse capture and lets your terminal handle selection natively.
2. Copy with your terminal's shortcut (`Cmd+C` on macOS, `Ctrl+Shift+C` on most Linux terminals).

## Log Pane Scrolling

The log pane uses `auto_scroll=False`. New output only scrolls to the bottom when you are already at the bottom of the log. If you've scrolled up to read earlier output, it stays in place.

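For reference, a minimal Textual sketch of such a pane. The widget choice and id are assumptions, not the app's actual implementation:

```python
# Sketch only: assumes the log pane is a Textual RichLog widget.
from textual.app import App, ComposeResult
from textual.widgets import RichLog


class LogDemo(App):
    def compose(self) -> ComposeResult:
        # auto_scroll=False keeps the viewport where the user scrolled it;
        # app-level logic then follows new output only when the user is
        # already at the bottom (per the behavior described above).
        yield RichLog(auto_scroll=False, id="log-pane")
```
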
## Screenshots

`Ctrl+S` saves an SVG screenshot to the `screenshots/` directory with a timestamped filename. Open the SVG in any browser to view it.

@@ -144,19 +144,19 @@ class EventLoopNode(NodeProtocol):
    1. Try to restore from durable state (crash recovery)
    2. If no prior state, init from NodeSpec.system_prompt + input_keys
    3. Loop: drain injection queue -> stream LLM -> execute tools
       -> if client_facing + no tools: block for user input (inject_event)
       -> if not client_facing or tools present: judge evaluates
       -> if client_facing + no real tools: block for user input
       -> judge evaluates (acceptance criteria)
       (each add_* and set_output writes through to store immediately)
    4. Publish events to EventBus at each stage
    5. Write cursor after each iteration
    6. Terminate when judge returns ACCEPT, shutdown signaled, or max iterations
    7. Build output dict from OutputAccumulator

    Client-facing blocking: When ``client_facing=True`` and the LLM produces
    text without tool calls (a natural conversational turn), the node blocks
    via ``_await_user_input()`` until ``inject_event()`` or ``signal_shutdown()``
    is called. This separates blocking (node concern) from output evaluation
    (judge concern).
    Client-facing blocking: When ``client_facing=True`` and the LLM finishes
    without real tool calls (stop_reason != tool_call), the node blocks via
    ``_await_user_input()`` until ``inject_event()`` or ``signal_shutdown()``
    is called. After user input, the judge evaluates — the judge is the
    sole mechanism for acceptance decisions.

    Always returns NodeResult with retryable=False semantics. The executor
    must NOT retry event loop nodes -- retry is handled internally by the
@@ -212,8 +212,10 @@ class EventLoopNode(NodeProtocol):
        # 2. Restore or create new conversation + accumulator
        conversation, accumulator, start_iteration = await self._restore(ctx)
        if conversation is None:
            system_prompt = ctx.node_spec.system_prompt or ""

            conversation = NodeConversation(
                system_prompt=ctx.node_spec.system_prompt or "",
                system_prompt=system_prompt,
                max_history_tokens=self._config.max_history_tokens,
                output_keys=ctx.node_spec.output_keys or None,
                store=self._conversation_store,
@@ -276,15 +278,20 @@ class EventLoopNode(NodeProtocol):
                iteration,
                len(conversation.messages),
            )
            assistant_text, tool_results_list, turn_tokens = await self._run_single_turn(
                ctx, conversation, tools, iteration, accumulator
            )
            (
                assistant_text,
                real_tool_results,
                outputs_set,
                turn_tokens,
            ) = await self._run_single_turn(ctx, conversation, tools, iteration, accumulator)
            logger.info(
                "[%s] iter=%d: LLM done — text=%d chars, tool_calls=%d, tokens=%s, accumulator=%s",
                "[%s] iter=%d: LLM done — text=%d chars, real_tools=%d, "
                "outputs_set=%s, tokens=%s, accumulator=%s",
                node_id,
                iteration,
                len(assistant_text),
                len(tool_results_list),
                len(real_tool_results),
                outputs_set or "[]",
                turn_tokens,
                {k: ("set" if v is not None else "None") for k, v in accumulator.to_dict().items()},
            )
@@ -300,6 +307,31 @@ class EventLoopNode(NodeProtocol):
            if conversation.needs_compaction():
                await self._compact_tiered(ctx, conversation, accumulator)

            # 6e'''. Empty response guard — if the LLM returned nothing
            # (no text, no real tools, no set_output) and all required
            # outputs are already set, accept immediately. This prevents
            # wasted iterations when the LLM has genuinely finished its
            # work (e.g. after calling set_output in a previous turn).
            truly_empty = not assistant_text and not real_tool_results and not outputs_set
            if truly_empty and accumulator is not None:
                missing = self._get_missing_output_keys(
                    accumulator, ctx.node_spec.output_keys, ctx.node_spec.nullable_output_keys
                )
                if not missing:
                    logger.info(
                        "[%s] iter=%d: empty response but all outputs set — accepting",
                        node_id,
                        iteration,
                    )
                    await self._publish_loop_completed(stream_id, node_id, iteration + 1)
                    latency_ms = int((time.time() - start_time) * 1000)
                    return NodeResult(
                        success=True,
                        output=accumulator.to_dict(),
                        tokens_used=total_input_tokens + total_output_tokens,
                        latency_ms=latency_ms,
                    )

            # 6f. Stall detection
            recent_responses.append(assistant_text)
            if len(recent_responses) > self._config.stall_detection_threshold:
@@ -321,18 +353,17 @@ class EventLoopNode(NodeProtocol):
            # 6g. Write cursor checkpoint
            await self._write_cursor(ctx, conversation, accumulator, iteration)

            # 6h. Client-facing input wait
            logger.info(
                "[%s] iter=%d: 6h check — client_facing=%s, tool_results=%d",
                node_id,
                iteration,
                ctx.node_spec.client_facing,
                len(tool_results_list),
            )
            if ctx.node_spec.client_facing and not tool_results_list:
                # LLM finished speaking (no tool calls) on a client-facing node.
                # This is a conversational turn boundary: block for user input
                # instead of running the judge.
            # 6h. Client-facing input blocking
            #
            # For client_facing nodes, block for user input whenever the
            # LLM finishes without making real tool calls (i.e. the LLM's
            # stop_reason is not tool_call). set_output is separated from
            # real tools by _run_single_turn, so this correctly treats
            # set_output-only turns as conversational boundaries.
            #
            # After user input, always fall through to judge evaluation
            # (6i). The judge handles all acceptance decisions.
            if ctx.node_spec.client_facing and not real_tool_results:
                if self._shutdown:
                    await self._publish_loop_completed(stream_id, node_id, iteration + 1)
                    latency_ms = int((time.time() - start_time) * 1000)
@@ -347,7 +378,6 @@ class EventLoopNode(NodeProtocol):
                got_input = await self._await_user_input(ctx)
                logger.info("[%s] iter=%d: unblocked, got_input=%s", node_id, iteration, got_input)
                if not got_input:
                    # Shutdown signaled during wait
                    await self._publish_loop_completed(stream_id, node_id, iteration + 1)
                    latency_ms = int((time.time() - start_time) * 1000)
                    return NodeResult(
@@ -357,46 +387,13 @@ class EventLoopNode(NodeProtocol):
                        latency_ms=latency_ms,
                    )

                # Clear stall detection — user input resets the conversation
                recent_responses.clear()

                # For nodes with an explicit judge, fall through to judge
                # evaluation so the LLM gets structured feedback about missing
                # outputs (e.g. "Missing output keys: [...]"). Without this,
                # the LLM may generate text like "Ready to proceed!" without
                # ever calling set_output, and the judge feedback never reaches it.
                #
                # For nodes without a judge (HITL review/approval with all-
                # nullable keys), keep conversing UNLESS the LLM has already
                # set an output — in that case fall through to the implicit
                # judge which will ACCEPT and terminate the node.
                if self._judge is None:
                    has_outputs = accumulator and any(
                        v is not None for v in accumulator.to_dict().values()
                    )
                    if not has_outputs:
                        logger.info(
                            "[%s] iter=%d: no judge, no outputs, continuing",
                            node_id,
                            iteration,
                        )
                        continue
                    logger.info(
                        "[%s] iter=%d: no judge, outputs set — implicit judge",
                        node_id,
                        iteration,
                    )
                else:
                    logger.info(
                        "[%s] iter=%d: has judge, falling through to 6i",
                        node_id,
                        iteration,
                    )
                # Fall through to judge evaluation (6i)

            # 6i. Judge evaluation
            should_judge = (
                (iteration + 1) % self._config.judge_every_n_turns == 0
                or not tool_results_list  # no tool calls = natural stop
                or not real_tool_results  # no real tool calls = natural stop
            )

            logger.info("[%s] iter=%d: 6i should_judge=%s", node_id, iteration, should_judge)
@@ -406,7 +403,7 @@ class EventLoopNode(NodeProtocol):
                    conversation,
                    accumulator,
                    assistant_text,
                    tool_results_list,
                    real_tool_results,
                    iteration,
                )
                fb_preview = (verdict.feedback or "")[:200]
@@ -526,16 +523,24 @@ class EventLoopNode(NodeProtocol):
        tools: list[Tool],
        iteration: int,
        accumulator: OutputAccumulator,
    ) -> tuple[str, list[dict], dict[str, int]]:
    ) -> tuple[str, list[dict], list[str], dict[str, int]]:
        """Run a single LLM turn with streaming and tool execution.

        Returns (assistant_text, tool_results, token_counts).
        Returns (assistant_text, real_tool_results, outputs_set, token_counts).

        ``real_tool_results`` contains only results from actual tools (web_search,
        etc.), NOT from the synthetic ``set_output`` tool. ``outputs_set`` lists
        the output keys written via ``set_output`` during this turn. This
        separation lets the caller treat set_output as a framework concern
        rather than a tool-execution concern.
        """
        stream_id = ctx.node_id
        node_id = ctx.node_id
        token_counts: dict[str, int] = {"input": 0, "output": 0}
        tool_call_count = 0
        final_text = ""
        # Track output keys set via set_output across all inner iterations
        outputs_set_this_turn: list[str] = []

        # Inner tool loop: stream may produce tool calls requiring re-invocation
        while True:
@@ -606,10 +611,10 @@ class EventLoopNode(NodeProtocol):

            # If no tool calls, turn is complete
            if not tool_calls:
                return final_text, [], token_counts
                return final_text, [], outputs_set_this_turn, token_counts

            # Execute tool calls
            tool_results: list[dict] = []
            # Execute tool calls — separate real tools from set_output
            real_tool_results: list[dict] = []
            limit_hit = False
            executed_in_batch = 0
            for tc in tool_calls:
@@ -624,21 +629,21 @@ class EventLoopNode(NodeProtocol):
                    stream_id, node_id, tc.tool_use_id, tc.tool_name, tc.tool_input
                )

                # Handle set_output synthetic tool
                logger.info(
                    "[%s] tool_call: %s(%s)",
                    node_id,
                    tc.tool_name,
                    json.dumps(tc.tool_input)[:200],
                )

                if tc.tool_name == "set_output":
                    # --- Framework-level set_output handling ---
                    result = self._handle_set_output(tc.tool_input, ctx.node_spec.output_keys)
                    result = ToolResult(
                        tool_use_id=tc.tool_use_id,
                        content=result.content,
                        is_error=result.is_error,
                    )
                    # Async write-through for set_output
                    if not result.is_error:
                        value = tc.tool_input["value"]
                        # Parse JSON strings into native types so downstream
@@ -652,26 +657,27 @@ class EventLoopNode(NodeProtocol):
                        except (json.JSONDecodeError, TypeError):
                            pass
                        await accumulator.set(tc.tool_input["key"], value)
                        outputs_set_this_turn.append(tc.tool_input["key"])
                else:
                    # Execute real tool
                    # --- Real tool execution ---
                    result = await self._execute_tool(tc)
                    # Truncate large results to prevent context blowup
                    result = self._truncate_tool_result(result, tc.tool_name)
                    real_tool_results.append(
                        {
                            "tool_use_id": tc.tool_use_id,
                            "tool_name": tc.tool_name,
                            "content": result.content,
                            "is_error": result.is_error,
                        }
                    )

                # Record tool result in conversation (write-through)
                # Record tool result in conversation (both real and set_output
                # go into the conversation for LLM context continuity)
                await conversation.add_tool_result(
                    tool_use_id=tc.tool_use_id,
                    content=result.content,
                    is_error=result.is_error,
                )
                tool_results.append(
                    {
                        "tool_use_id": tc.tool_use_id,
                        "tool_name": tc.tool_name,
                        "content": result.content,
                        "is_error": result.is_error,
                    }
                )

                # Publish tool call completed
                await self._publish_tool_completed(
@@ -708,7 +714,9 @@ class EventLoopNode(NodeProtocol):
                    content=discard_msg,
                    is_error=True,
                )
                tool_results.append(
                # Discarded calls go into real_tool_results so the
                # caller sees they were attempted (for judge context).
                real_tool_results.append(
                    {
                        "tool_use_id": tc.tool_use_id,
                        "tool_name": tc.tool_name,
@@ -716,9 +724,24 @@ class EventLoopNode(NodeProtocol):
                        "is_error": True,
                    }
                )
                # Prune old tool results NOW to prevent context bloat on the
                # next turn. The char-based token estimator underestimates
                # actual API tokens, so the standard compaction check in the
                # outer loop may not trigger in time.
                protect = max(2000, self._config.max_history_tokens // 12)
                pruned = await conversation.prune_old_tool_results(
                    protect_tokens=protect,
                    min_prune_tokens=max(1000, protect // 3),
                )
                if pruned > 0:
                    logger.info(
                        "Post-limit pruning: cleared %d old tool results (budget: %d)",
                        pruned,
                        self._config.max_history_tokens,
                    )
                # Limit hit — return from this turn so the judge can
                # evaluate instead of looping back for another stream.
                return final_text, tool_results, token_counts
                return final_text, real_tool_results, outputs_set_this_turn, token_counts

            # --- Mid-turn pruning: prevent context blowup within a single turn ---
            if conversation.usage_ratio() >= 0.6:
@@ -1025,7 +1048,8 @@ class EventLoopNode(NodeProtocol):
            truncated = (
                f"[Result from {tool_name}: {len(result.content)} chars — "
                f"too large for context, saved to '{filename}'. "
                f"Use load_data('{filename}') to read the full result.]\n\n"
                f"Use load_data(filename='{filename}', data_dir='{spill_dir}') "
                f"to read the full result.]\n\n"
                f"Preview:\n{preview}…"
            )
            logger.info(
@@ -1244,9 +1268,11 @@ class EventLoopNode(NodeProtocol):

        # 5. Spillover files hint
        if self._config.spillover_dir:
            spill = self._config.spillover_dir
            parts.append(
                "NOTE: Large tool results were saved to files. "
                "Use load_data('<filename>') to read them."
                f"Use load_data(filename='<filename>', data_dir='{spill}') "
                "to read them."
            )

        # 6. Tool call history (prevent re-calling tools)

@@ -14,6 +14,7 @@ import logging
import warnings
from collections.abc import Callable
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any

from framework.graph.edge import EdgeCondition, EdgeSpec, GraphSpec
@@ -128,6 +129,9 @@ class GraphExecutor:
        cleansing_config: CleansingConfig | None = None,
        enable_parallel_execution: bool = True,
        parallel_config: ParallelExecutionConfig | None = None,
        event_bus: Any | None = None,
        stream_id: str = "",
        storage_path: str | Path | None = None,
    ):
        """
        Initialize the executor.
@@ -142,6 +146,9 @@ class GraphExecutor:
            cleansing_config: Optional output cleansing configuration
            enable_parallel_execution: Enable parallel fan-out execution (default True)
            parallel_config: Configuration for parallel execution behavior
            event_bus: Optional event bus for emitting node lifecycle events
            stream_id: Stream ID for event correlation
            storage_path: Optional base path for conversation persistence
        """
        self.runtime = runtime
        self.llm = llm
@@ -151,6 +158,9 @@ class GraphExecutor:
        self.approval_callback = approval_callback
        self.validator = OutputValidator()
        self.logger = logging.getLogger(__name__)
        self._event_bus = event_bus
        self._stream_id = stream_id
        self._storage_path = Path(storage_path) if storage_path else None

        # Initialize output cleaner
        self.cleansing_config = cleansing_config or CleansingConfig()
@@ -357,13 +367,33 @@ class GraphExecutor:
                description=f"Validation errors for {current_node_id}: {validation_errors}",
            )

        # Emit node-started event (skip event_loop nodes — they emit their own)
        if self._event_bus and node_spec.node_type != "event_loop":
            await self._event_bus.emit_node_loop_started(
                stream_id=self._stream_id, node_id=current_node_id
            )

        # Execute node
        self.logger.info("  Executing...")
        result = await node_impl.execute(ctx)

        # Emit node-completed event (skip event_loop nodes)
        if self._event_bus and node_spec.node_type != "event_loop":
            await self._event_bus.emit_node_loop_completed(
                stream_id=self._stream_id, node_id=current_node_id, iterations=1
            )

        if result.success:
            # Validate output before accepting it
            if result.output and node_spec.output_keys:
            # Validate output before accepting it.
            # Skip for event_loop nodes — their judge system is
            # the sole acceptance mechanism (see WP-8). Empty
            # strings and other flexible outputs are legitimate
            # for LLM-driven nodes that already passed the judge.
            if (
                result.output
                and node_spec.output_keys
                and node_spec.node_type != "event_loop"
            ):
                validation = self.validator.validate_all(
                    output=result.output,
                    expected_keys=node_spec.output_keys,
@@ -441,48 +471,66 @@ class GraphExecutor:
                    _is_retry = True
                    continue
                else:
                    # Max retries exceeded - fail the execution
                    # Max retries exceeded - check for failure handlers
                    self.logger.error(
                        f"  ✗ Max retries ({max_retries}) exceeded for node {current_node_id}"
                    )
                    self.runtime.report_problem(
                        severity="critical",
                        description=(
                            f"Node {current_node_id} failed after "
                            f"{max_retries} attempts: {result.error}"
                        ),
                    )
                    self.runtime.end_run(
                        success=False,
                        output_data=memory.read_all(),
                        narrative=(
                            f"Failed at {node_spec.name} after "
                            f"{max_retries} retries: {result.error}"
                        ),

                    # Check if there's an ON_FAILURE edge to follow
                    next_node = self._follow_edges(
                        graph=graph,
                        goal=goal,
                        current_node_id=current_node_id,
                        current_node_spec=node_spec,
                        result=result,  # result.success=False triggers ON_FAILURE
                        memory=memory,
                    )

                    # Calculate quality metrics
                    total_retries_count = sum(node_retry_counts.values())
                    nodes_failed = list(node_retry_counts.keys())
                    if next_node:
                        # Found a failure handler - route to it
                        self.logger.info(f"  → Routing to failure handler: {next_node}")
                        current_node_id = next_node
                        continue  # Continue execution with handler
                    else:
                        # No failure handler - terminate execution
                        self.runtime.report_problem(
                            severity="critical",
                            description=(
                                f"Node {current_node_id} failed after "
                                f"{max_retries} attempts: {result.error}"
                            ),
                        )
                        self.runtime.end_run(
                            success=False,
                            output_data=memory.read_all(),
                            narrative=(
                                f"Failed at {node_spec.name} after "
                                f"{max_retries} retries: {result.error}"
                            ),
                        )

                        return ExecutionResult(
                            success=False,
                            error=(
                                f"Node '{node_spec.name}' failed after "
                                f"{max_retries} attempts: {result.error}"
                            ),
                            output=memory.read_all(),
                            steps_executed=steps,
                            total_tokens=total_tokens,
                            total_latency_ms=total_latency,
                            path=path,
                            total_retries=total_retries_count,
                            nodes_with_failures=nodes_failed,
                            retry_details=dict(node_retry_counts),
                            had_partial_failures=len(nodes_failed) > 0,
                            execution_quality="failed",
                            node_visit_counts=dict(node_visit_counts),
                        )
                    # Calculate quality metrics
                    total_retries_count = sum(node_retry_counts.values())
                    nodes_failed = list(node_retry_counts.keys())

                    return ExecutionResult(
                        success=False,
                        error=(
                            f"Node '{node_spec.name}' failed after "
                            f"{max_retries} attempts: {result.error}"
                        ),
                        output=memory.read_all(),
                        steps_executed=steps,
                        total_tokens=total_tokens,
                        total_latency_ms=total_latency,
                        path=path,
                        total_retries=total_retries_count,
                        nodes_with_failures=nodes_failed,
                        retry_details=dict(node_retry_counts),
                        had_partial_failures=len(nodes_failed) > 0,
                        execution_quality="failed",
                        node_visit_counts=dict(node_visit_counts),
                    )

        # Check if we just executed a pause node - if so, save state and return
        # This must happen BEFORE determining next node, since pause nodes may have no edges
@@ -781,11 +829,43 @@ class GraphExecutor:
        )

        if node_spec.node_type == "event_loop":
            # Event loop nodes must be pre-registered (like function nodes)
            raise RuntimeError(
                f"EventLoopNode '{node_spec.id}' not found in registry. "
                "Register it with executor.register_node() before execution."
            # Auto-create EventLoopNode with sensible defaults.
            # Custom configs can still be pre-registered via node_registry.
            from framework.graph.event_loop_node import EventLoopNode, LoopConfig

            # Create a FileConversationStore if a storage path is available
            conv_store = None
            if self._storage_path:
                from framework.storage.conversation_store import FileConversationStore

                store_path = self._storage_path / "conversations" / node_spec.id
                conv_store = FileConversationStore(base_path=store_path)

            # Auto-configure spillover directory for large tool results.
            # When a tool result exceeds max_tool_result_chars, the full
            # content is written to spillover_dir and the agent gets a
            # truncated preview with instructions to use load_data().
            spillover = None
            if self._storage_path:
                spillover = str(self._storage_path / "data")

            node = EventLoopNode(
                event_bus=self._event_bus,
                judge=None,  # implicit judge: accept when output_keys are filled
                config=LoopConfig(
                    max_iterations=100 if node_spec.client_facing else 50,
                    max_tool_calls_per_turn=10,
                    stall_detection_threshold=3,
                    max_history_tokens=32000,
                    max_tool_result_chars=3_000,
                    spillover_dir=spillover,
                ),
                tool_executor=self.tool_executor,
                conversation_store=conv_store,
            )
            # Cache so inject_event() is reachable for client-facing input
            self.node_registry[node_spec.id] = node
            return node

        # Should never reach here due to validation above
        raise RuntimeError(f"Unhandled node type: {node_spec.node_type}")
@@ -814,9 +894,12 @@ class GraphExecutor:
                source_node_name=current_node_spec.name if current_node_spec else current_node_id,
                target_node_name=target_node_spec.name if target_node_spec else edge.target,
            ):
                # Validate and clean output before mapping inputs
                # Validate and clean output before mapping inputs.
                # Use full memory state (not just result.output) because
                # target input_keys may come from earlier nodes in the
                # graph, not only from the immediate source node.
                if self.cleansing_config.enabled and target_node_spec:
                    output_to_validate = result.output
                    output_to_validate = memory.read_all()

                    validation = self.output_cleaner.validate_output(
                        output=output_to_validate,
@@ -1012,10 +1095,13 @@ class GraphExecutor:
            branch.status = "running"

            try:
                # Validate and clean output before mapping inputs (same as _follow_edges)
                # Validate and clean output before mapping inputs (same as _follow_edges).
                # Use full memory state since target input_keys may come
                # from earlier nodes, not just the immediate source.
                if self.cleansing_config.enabled and node_spec:
                    mem_snapshot = memory.read_all()
                    validation = self.output_cleaner.validate_output(
                        output=source_result.output,
                        output=mem_snapshot,
                        source_node_id=source_node_spec.id if source_node_spec else "unknown",
                        target_node_spec=node_spec,
                    )
@@ -1026,7 +1112,7 @@ class GraphExecutor:
                            f"{branch.node_id}: {validation.errors}"
                        )
                        cleaned_output = self.output_cleaner.clean_output(
                            output=source_result.output,
                            output=mem_snapshot,
                            source_node_id=source_node_spec.id if source_node_spec else "unknown",
                            target_node_spec=node_spec,
                            validation_errors=validation.errors,
@@ -1049,12 +1135,24 @@ class GraphExecutor:
                ctx = self._build_context(node_spec, memory, goal, mapped, graph.max_tokens)
                node_impl = self._get_node_implementation(node_spec, graph.cleanup_llm_model)

                # Emit node-started event (skip event_loop nodes)
                if self._event_bus and node_spec.node_type != "event_loop":
                    await self._event_bus.emit_node_loop_started(
                        stream_id=self._stream_id, node_id=branch.node_id
                    )

                self.logger.info(
                    f"  ▶ Branch {node_spec.name}: executing (attempt {attempt + 1})"
                )
                result = await node_impl.execute(ctx)
                last_result = result

                # Emit node-completed event (skip event_loop nodes)
                if self._event_bus and node_spec.node_type != "event_loop":
                    await self._event_bus.emit_node_loop_completed(
                        stream_id=self._stream_id, node_id=branch.node_id, iterations=1
                    )

                if result.success:
                    # Write outputs to shared memory using async write
                    for key, value in result.output.items():

@@ -144,8 +144,11 @@ class OutputCleaner:
        errors = []
        warnings = []

        # Check 1: Required input keys present
        # Check 1: Required input keys present (skip nullable keys)
        nullable = set(getattr(target_node_spec, "nullable_output_keys", None) or [])
        for key in target_node_spec.input_keys:
            if key in nullable:
                continue
            if key not in output:
                errors.append(f"Missing required key: '{key}'")
                continue

@@ -572,17 +572,21 @@ class LiteLLMProvider(LLMProvider):
                # and we skip the retry path — nothing was yielded in vain.)
                has_content = accumulated_text or tool_calls_acc
                if not has_content and attempt < RATE_LIMIT_MAX_RETRIES:
                    # If the conversation ends with an assistant message,
                    # an empty stream is expected (nothing new to say).
                    # Don't retry — just flush whatever we have.
                    # If the conversation ends with an assistant or tool
                    # message, an empty stream is expected — the LLM has
                    # nothing new to say. Don't burn retries on this;
                    # let the caller (EventLoopNode) decide what to do.
                    # Typical case: client_facing node where the LLM set
                    # all outputs via set_output tool calls, and the tool
                    # results are the last messages.
                    last_role = next(
                        (m["role"] for m in reversed(full_messages) if m.get("role") != "system"),
                        None,
                    )
                    if last_role == "assistant":
                    if last_role in ("assistant", "tool"):
                        logger.debug(
                            "[stream] Empty response after assistant message — "
                            "expected, not retrying."
                            "[stream] Empty response after %s message — expected, not retrying.",
                            last_role,
                        )
                        for event in tail_events:
                            yield event

@@ -1105,17 +1105,30 @@ def validate_graph() -> str:
        errors.append(f"Unreachable nodes: {unreachable}")

    # === CONTEXT FLOW VALIDATION ===
    # Build dependency map (node_id -> list of nodes it depends on)
    # Build dependency maps — separate forward edges from feedback edges.
    # Feedback edges (priority < 0) create cycles; they must not block the
    # topological sort. Context they carry arrives on *revisits*, not on
    # the first execution of a node.
    feedback_edge_ids = {e.id for e in session.edges if e.priority < 0}
    forward_dependencies: dict[str, list[str]] = {node.id: [] for node in session.nodes}
    feedback_sources: dict[str, list[str]] = {node.id: [] for node in session.nodes}
    # Combined map kept for error-message generation (all deps)
    dependencies: dict[str, list[str]] = {node.id: [] for node in session.nodes}

    for edge in session.edges:
        if edge.target in dependencies:
            dependencies[edge.target].append(edge.source)
        if edge.target not in forward_dependencies:
            continue
        dependencies[edge.target].append(edge.source)
        if edge.id in feedback_edge_ids:
            feedback_sources[edge.target].append(edge.source)
        else:
            forward_dependencies[edge.target].append(edge.source)

    # Build output map (node_id -> keys it produces)
    node_outputs: dict[str, set[str]] = {node.id: set(node.output_keys) for node in session.nodes}

    # Compute available context for each node (what keys it can read)
    # Using topological order
    # Using topological order on the forward-edge DAG
    available_context: dict[str, set[str]] = {}
    computed = set()
    nodes_by_id = {n.id: n for n in session.nodes}
@@ -1125,7 +1138,8 @@ def validate_graph() -> str:
    # Entry nodes can only read from initial context
    initial_context_keys: set[str] = set()

    # Compute in topological order
    # Compute in topological order (forward edges only — feedback edges
    # don't block, since their context arrives on revisits)
    remaining = {n.id for n in session.nodes}
    max_iterations = len(session.nodes) * 2

@@ -1134,18 +1148,23 @@ def validate_graph() -> str:
            break

        for node_id in list(remaining):
            deps = dependencies.get(node_id, [])
            fwd_deps = forward_dependencies.get(node_id, [])

            # Can compute if all dependencies are computed (or no dependencies)
            if all(d in computed for d in deps):
                # Collect outputs from all dependencies
            # Can compute if all FORWARD dependencies are computed
            if all(d in computed for d in fwd_deps):
                # Collect outputs from all forward dependencies
                available = set(initial_context_keys)
                for dep_id in deps:
                    # Add outputs from dependency
                for dep_id in fwd_deps:
                    available.update(node_outputs.get(dep_id, set()))
                    # Also add what was available to the dependency (transitive)
                    available.update(available_context.get(dep_id, set()))

                # Also include context from already-computed feedback
                # sources (bonus, not blocking)
                for fb_src in feedback_sources.get(node_id, []):
                    if fb_src in computed:
                        available.update(node_outputs.get(fb_src, set()))
                        available.update(available_context.get(fb_src, set()))

                available_context[node_id] = available
                computed.add(node_id)
                remaining.remove(node_id)
@@ -1155,15 +1174,37 @@ def validate_graph() -> str:
    context_errors = []
    context_warnings = []
    missing_inputs: dict[str, list[str]] = {}
    feedback_only_inputs: dict[str, list[str]] = {}

    for node in session.nodes:
        available = available_context.get(node.id, set())

        for input_key in node.input_keys:
            if input_key not in available:
                if node.id not in missing_inputs:
                    missing_inputs[node.id] = []
                missing_inputs[node.id].append(input_key)
                # Check if this input is provided by a feedback source
                fb_provides = set()
                for fb_src in feedback_sources.get(node.id, []):
                    fb_provides.update(node_outputs.get(fb_src, set()))
                    fb_provides.update(available_context.get(fb_src, set()))

                if input_key in fb_provides:
                    # Input arrives via feedback edge — warn, don't error
                    if node.id not in feedback_only_inputs:
                        feedback_only_inputs[node.id] = []
                    feedback_only_inputs[node.id].append(input_key)
                else:
                    if node.id not in missing_inputs:
                        missing_inputs[node.id] = []
                    missing_inputs[node.id].append(input_key)

    # Warn about feedback-only inputs (available on revisits, not first run)
    for node_id, fb_keys in feedback_only_inputs.items():
        fb_srcs = feedback_sources.get(node_id, [])
        context_warnings.append(
            f"Node '{node_id}' input(s) {fb_keys} are only provided via "
            f"feedback edge(s) from {fb_srcs}. These will be available on "
            f"revisits but not on the first execution."
        )

    # Generate helpful error messages
    for node_id, missing in missing_inputs.items():

@@ -56,6 +56,18 @@ def register_commands(subparsers: argparse._SubParsersAction) -> None:
        action="store_true",
        help="Show detailed execution logs (steps, LLM calls, etc.)",
    )
    run_parser.add_argument(
        "--tui",
        action="store_true",
        help="Launch interactive terminal dashboard",
    )
    run_parser.add_argument(
        "--model",
        "-m",
        type=str,
        default=None,
        help="LLM model to use (any LiteLLM-compatible name)",
    )
    run_parser.set_defaults(func=cmd_run)

    # info command
@@ -205,38 +217,83 @@ def cmd_run(args: argparse.Namespace) -> int:
            print(f"Error reading input file: {e}", file=sys.stderr)
            return 1

    # Load and run agent
    try:
        runner = AgentRunner.load(
            args.agent_path,
            mock_mode=args.mock,
            model=getattr(args, "model", "claude-haiku-4-5-20251001"),
        )
    except FileNotFoundError as e:
        print(f"Error: {e}", file=sys.stderr)
        return 1
    # Run the agent (with TUI or standard)
    if getattr(args, "tui", False):
        from framework.tui.app import AdenTUI

    # Auto-inject user_id if the agent expects it but it's not provided
    entry_input_keys = runner.graph.nodes[0].input_keys if runner.graph.nodes else []
    if "user_id" in entry_input_keys and context.get("user_id") is None:
        import os
        async def run_with_tui():
            try:
                # Load runner inside the async loop to ensure strict loop affinity
                # (only one load — avoids spawning duplicate MCP subprocesses)
                try:
                    runner = AgentRunner.load(
                        args.agent_path,
                        mock_mode=args.mock,
                        model=args.model,
                        enable_tui=True,
                    )
                except Exception as e:
                    print(f"Error loading agent: {e}")
                    return

        context["user_id"] = os.environ.get("USER", "default_user")
                # Force setup inside the loop
                if runner._agent_runtime is None:
                    runner._setup()

    if not args.quiet:
        info = runner.info()
        print(f"Agent: {info.name}")
        print(f"Goal: {info.goal_name}")
        print(f"Steps: {info.node_count}")
        print(f"Input: {json.dumps(context)}")
        print()
        print("=" * 60)
        print("Executing agent...")
        print("=" * 60)
        print()
                # Start runtime before TUI so it's ready for user input
                if runner._agent_runtime and not runner._agent_runtime.is_running:
                    await runner._agent_runtime.start()

    # Run the agent
    result = asyncio.run(runner.run(context))
                app = AdenTUI(runner._agent_runtime)

                # TUI handles execution via ChatRepl — user submits input,
                # ChatRepl calls runtime.trigger_and_wait(). No auto-launch.
                await app.run_async()
            except Exception as e:
                import traceback

                traceback.print_exc()
                print(f"TUI error: {e}")

            await runner.cleanup_async()
            return None

        asyncio.run(run_with_tui())
        print("TUI session ended.")
        return 0
    else:
        # Standard execution — load runner here (not shared with TUI path)
        try:
            runner = AgentRunner.load(
                args.agent_path,
                mock_mode=args.mock,
                model=args.model,
                enable_tui=False,
            )
        except FileNotFoundError as e:
            print(f"Error: {e}", file=sys.stderr)
            return 1

        # Auto-inject user_id if the agent expects it but it's not provided
        entry_input_keys = runner.graph.nodes[0].input_keys if runner.graph.nodes else []
        if "user_id" in entry_input_keys and context.get("user_id") is None:
            import os

            context["user_id"] = os.environ.get("USER", "default_user")

        if not args.quiet:
            info = runner.info()
            print(f"Agent: {info.name}")
            print(f"Goal: {info.goal_name}")
            print(f"Steps: {info.node_count}")
            print(f"Input: {json.dumps(context)}")
            print()
            print("=" * 60)
            print("Executing agent...")
            print("=" * 60)
            print()

        result = asyncio.run(runner.run(context))

        # Format output
        output = {

@@ -362,6 +362,15 @@ class MCPClient:
            # Call tool using persistent session
            result = await self._session.call_tool(tool_name, arguments=arguments)

            # Check for server-side errors (validation failures, tool exceptions, etc.)
            if getattr(result, "isError", False):
                error_text = ""
                if result.content:
                    content_item = result.content[0]
                    if hasattr(content_item, "text"):
                        error_text = content_item.text
                raise RuntimeError(f"MCP tool '{tool_name}' failed: {error_text}")

            # Extract content
            if result.content:
                # MCP returns content as a list of content items

+212
-21
@@ -28,6 +28,33 @@ logger = logging.getLogger(__name__)
|
||||
|
||||
# Configuration paths
|
||||
HIVE_CONFIG_FILE = Path.home() / ".hive" / "configuration.json"
|
||||
|
||||
|
||||
def _ensure_credential_key_env() -> None:
|
||||
"""Load HIVE_CREDENTIAL_KEY from shell config if not already in environment.
|
||||
|
||||
The setup-credentials skill writes the encryption key to ~/.zshrc or ~/.bashrc.
|
||||
If the user hasn't sourced their config in the current shell, this reads it
|
||||
directly so the runner (and any MCP subprocesses it spawns) can unlock the
|
||||
encrypted credential store.
|
||||
|
||||
Only HIVE_CREDENTIAL_KEY is loaded this way — all other secrets (API keys, etc.)
|
||||
come from the credential store itself.
|
||||
"""
|
||||
if os.environ.get("HIVE_CREDENTIAL_KEY"):
|
||||
return
|
||||
|
||||
try:
|
||||
from aden_tools.credentials.shell_config import check_env_var_in_shell_config
|
||||
|
||||
found, value = check_env_var_in_shell_config("HIVE_CREDENTIAL_KEY")
|
||||
if found and value:
|
||||
os.environ["HIVE_CREDENTIAL_KEY"] = value
|
||||
logger.debug("Loaded HIVE_CREDENTIAL_KEY from shell config")
|
||||
except ImportError:
|
||||
pass
|
||||
|
||||
|
||||
CLAUDE_CREDENTIALS_FILE = Path.home() / ".claude" / ".credentials.json"
|
||||
|
||||
|
||||
@@ -236,6 +263,15 @@ class AgentRunner:
|
||||
result = await runner.run({"lead_id": "123"})
|
||||
"""
|
||||
|
||||
@staticmethod
|
||||
def _resolve_default_model() -> str:
|
||||
"""Resolve the default model from ~/.hive/configuration.json."""
|
||||
config = get_hive_config()
|
||||
llm = config.get("llm", {})
|
||||
if llm.get("provider") and llm.get("model"):
|
||||
return f"{llm['provider']}/{llm['model']}"
|
||||
return "anthropic/claude-sonnet-4-20250514"
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
agent_path: Path,
|
||||
@@ -243,7 +279,8 @@ class AgentRunner:
|
||||
goal: Goal,
|
||||
mock_mode: bool = False,
|
||||
storage_path: Path | None = None,
|
||||
model: str = "cerebras/zai-glm-4.7",
|
||||
model: str | None = None,
|
||||
enable_tui: bool = False,
|
||||
):
|
||||
"""
|
||||
Initialize the runner (use AgentRunner.load() instead).
|
||||
@@ -254,14 +291,15 @@ class AgentRunner:
|
||||
goal: Loaded Goal object
|
||||
mock_mode: If True, use mock LLM responses
|
||||
storage_path: Path for runtime storage (defaults to temp)
|
||||
model: Model to use - any LiteLLM-compatible model name
|
||||
(e.g., "claude-sonnet-4-20250514", "gpt-4o-mini", "gemini/gemini-pro")
|
||||
model: Model to use (reads from agent config or ~/.hive/configuration.json if None)
|
||||
enable_tui: If True, forces use of AgentRuntime with EventBus
|
||||
"""
|
||||
self.agent_path = agent_path
|
||||
self.graph = graph
|
||||
self.goal = goal
|
||||
self.mock_mode = mock_mode
|
||||
self.model = model
|
||||
self.model = model or self._resolve_default_model()
|
||||
self.enable_tui = enable_tui
|
||||
|
||||
# Set up storage
|
||||
if storage_path:
|
||||
@@ -275,6 +313,10 @@ class AgentRunner:
|
||||
self._storage_path = default_storage
|
||||
self._temp_dir = None
|
||||
|
||||
# Load HIVE_CREDENTIAL_KEY from shell config if not in env.
|
||||
# Must happen before MCP subprocesses are spawned so they inherit it.
|
||||
_ensure_credential_key_env()
|
||||
|
||||
# Initialize components
|
||||
self._tool_registry = ToolRegistry()
|
||||
self._runtime: Runtime | None = None
|
||||
@@ -296,32 +338,121 @@ class AgentRunner:
|
||||
if mcp_config_path.exists():
|
||||
self._load_mcp_servers_from_config(mcp_config_path)
|
||||
|
||||
@staticmethod
|
||||
def _import_agent_module(agent_path: Path):
|
||||
"""Import an agent package from its directory path.
|
||||
|
||||
Tries package import first (works when exports/ is on sys.path,
|
||||
which cli.py:_configure_paths() ensures). Falls back to direct
|
||||
file import of agent.py via importlib.util.
|
||||
"""
|
||||
import importlib
|
||||
|
||||
package_name = agent_path.name
|
||||
|
||||
# Try importing as a package (works when exports/ is on sys.path)
|
||||
try:
|
||||
return importlib.import_module(package_name)
|
||||
except ImportError:
|
||||
pass
|
||||
|
||||
# Fallback: import agent.py directly via file path
|
||||
import importlib.util
|
||||
|
||||
agent_py = agent_path / "agent.py"
|
||||
if not agent_py.exists():
|
||||
raise FileNotFoundError(
|
||||
f"No importable agent found at {agent_path}. "
|
||||
f"Expected a Python package with agent.py."
|
||||
)
|
||||
spec = importlib.util.spec_from_file_location(
|
||||
f"{package_name}.agent",
|
||||
agent_py,
|
||||
submodule_search_locations=[str(agent_path)],
|
||||
)
|
||||
module = importlib.util.module_from_spec(spec)
|
||||
spec.loader.exec_module(module)
|
||||
return module
|
||||
|
||||
@classmethod
|
||||
def load(
|
||||
cls,
|
||||
agent_path: str | Path,
|
||||
mock_mode: bool = False,
|
||||
storage_path: Path | None = None,
|
||||
model: str = "cerebras/zai-glm-4.7",
|
||||
model: str | None = None,
|
||||
enable_tui: bool = False,
|
||||
) -> "AgentRunner":
|
||||
"""
|
||||
Load an agent from an export folder.
|
||||
|
||||
Imports the agent's Python package and reads module-level variables
|
||||
(goal, nodes, edges, etc.) to build a GraphSpec. Falls back to
|
||||
agent.json if no Python module is found.
|
||||
|
||||
Args:
|
||||
agent_path: Path to agent folder (containing agent.json)
|
||||
agent_path: Path to agent folder
|
||||
mock_mode: If True, use mock LLM responses
|
||||
storage_path: Path for runtime storage (defaults to temp)
|
||||
model: LLM model to use (any LiteLLM-compatible model name)
|
||||
storage_path: Path for runtime storage (defaults to ~/.hive/storage/{name})
|
||||
model: LLM model to use (reads from agent's default_config if None)
|
||||
enable_tui: If True, forces use of AgentRuntime with EventBus
|
||||
|
||||
Returns:
|
||||
AgentRunner instance ready to run
|
||||
"""
|
||||
agent_path = Path(agent_path)
|
||||
|
||||
# Load agent.json
|
||||
# Try loading from Python module first (code-based agents)
|
||||
agent_py = agent_path / "agent.py"
|
||||
if agent_py.exists():
|
||||
agent_module = cls._import_agent_module(agent_path)
|
||||
|
||||
goal = getattr(agent_module, "goal", None)
|
||||
nodes = getattr(agent_module, "nodes", None)
|
||||
edges = getattr(agent_module, "edges", None)
|
||||
|
||||
if goal is None or nodes is None or edges is None:
|
||||
raise ValueError(
|
||||
f"Agent at {agent_path} must define 'goal', 'nodes', and 'edges' "
|
||||
f"in agent.py (or __init__.py)"
|
||||
)
|
||||
|
||||
# Read model and max_tokens from agent's config if not explicitly provided
|
||||
agent_config = getattr(agent_module, "default_config", None)
|
||||
if model is None:
|
||||
if agent_config and hasattr(agent_config, "model"):
|
||||
model = agent_config.model
|
||||
|
||||
max_tokens = getattr(agent_config, "max_tokens", 1024) if agent_config else 1024
|
||||
|
||||
# Build GraphSpec from module-level variables
|
||||
graph = GraphSpec(
|
||||
id=f"{agent_path.name}-graph",
|
||||
goal_id=goal.id,
|
||||
version="1.0.0",
|
||||
entry_node=getattr(agent_module, "entry_node", nodes[0].id),
|
||||
entry_points=getattr(agent_module, "entry_points", {}),
|
||||
terminal_nodes=getattr(agent_module, "terminal_nodes", []),
|
||||
pause_nodes=getattr(agent_module, "pause_nodes", []),
|
||||
nodes=nodes,
|
||||
edges=edges,
|
||||
max_tokens=max_tokens,
|
||||
)
|
||||
|
||||
return cls(
|
||||
agent_path=agent_path,
|
||||
graph=graph,
|
||||
goal=goal,
|
||||
mock_mode=mock_mode,
|
||||
storage_path=storage_path,
|
||||
model=model,
|
||||
enable_tui=enable_tui,
|
||||
)
|
||||
|
||||
# Fallback: load from agent.json (legacy JSON-based agents)
|
||||
agent_json_path = agent_path / "agent.json"
|
||||
if not agent_json_path.exists():
|
||||
raise FileNotFoundError(f"agent.json not found in {agent_path}")
|
||||
raise FileNotFoundError(f"No agent.py or agent.json found in {agent_path}")
|
||||
|
||||
with open(agent_json_path) as f:
|
||||
graph, goal = load_agent_export(f.read())
|
||||
@@ -333,6 +464,7 @@ class AgentRunner:
            mock_mode=mock_mode,
            storage_path=storage_path,
            model=model,
            enable_tui=enable_tui,
        )

    def register_tool(

@@ -471,16 +603,25 @@ class AgentRunner:
        api_key_env = self._get_api_key_env_var(self.model)
        if api_key_env and os.environ.get(api_key_env):
            self._llm = LiteLLMProvider(model=self.model)
        elif api_key_env:
            print(f"Warning: {api_key_env} not set. LLM calls will fail.")
            print(f"Set it with: export {api_key_env}=your-api-key")
        else:
            # Fall back to credential store
            api_key = self._get_api_key_from_credential_store()
            if api_key:
                self._llm = LiteLLMProvider(model=self.model, api_key=api_key)
                # Set env var so downstream code (e.g. cleanup LLM in
                # node._extract_json) can also find it
                if api_key_env:
                    os.environ[api_key_env] = api_key
            elif api_key_env:
                print(f"Warning: {api_key_env} not set. LLM calls will fail.")
                print(f"Set it with: export {api_key_env}=your-api-key")

        # Get tools for executor/runtime
        tools = list(self._tool_registry.get_tools().values())
        tool_executor = self._tool_registry.get_executor()

        if self._uses_async_entry_points:
            # Multi-entry-point mode: use AgentRuntime
        if self._uses_async_entry_points or self.enable_tui:
            # Multi-entry-point mode or TUI mode: use AgentRuntime
            self._setup_agent_runtime(tools, tool_executor)
        else:
            # Single-entry-point mode: use legacy GraphExecutor

@@ -518,6 +659,33 @@ class AgentRunner:
        # Default: assume OpenAI-compatible
        return "OPENAI_API_KEY"

    def _get_api_key_from_credential_store(self) -> str | None:
        """Get the LLM API key from the encrypted credential store.

        Maps model name to credential store ID (e.g. "anthropic/..." -> "anthropic")
        and retrieves the key via CredentialStore.get().
        """
        if not os.environ.get("HIVE_CREDENTIAL_KEY"):
            return None

        # Map model prefix to credential store ID
        model_lower = self.model.lower()
        cred_id = None
        if model_lower.startswith("anthropic/") or model_lower.startswith("claude"):
            cred_id = "anthropic"
        # Add more mappings as providers are added to LLM_CREDENTIALS

        if cred_id is None:
            return None

        try:
            from framework.credentials import CredentialStore

            store = CredentialStore.with_encrypted_storage()
            return store.get(cred_id)
        except Exception:
            return None

    def _setup_legacy_executor(self, tools: list, tool_executor: Callable | None) -> None:
        """Set up legacy single-entry-point execution using GraphExecutor."""
        # Create runtime
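The prefix-to-credential-id mapping is the only model-specific piece of the fallback. A standalone sketch of that mapping, mirroring the method above (the helper name and model strings are hypothetical):

```python
def credential_id_for(model: str) -> str | None:
    # Mirrors _get_api_key_from_credential_store's prefix mapping;
    # extend as more providers land in LLM_CREDENTIALS.
    m = model.lower()
    if m.startswith("anthropic/") or m.startswith("claude"):
        return "anthropic"
    return None

assert credential_id_for("anthropic/claude-sonnet") == "anthropic"
assert credential_id_for("openai/gpt-4o") is None  # no mapping yet, env var path applies
```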
@@ -549,6 +717,19 @@ class AgentRunner:
                )
                entry_points.append(ep)

        # If TUI enabled but no entry points (single-entry agent), create default
        if not entry_points and self.enable_tui and self.graph.entry_node:
            logger.info("Creating default entry point for TUI")
            entry_points.append(
                EntryPointSpec(
                    id="default",
                    name="Default",
                    entry_node=self.graph.entry_node,
                    trigger_type="manual",
                    isolation_level="shared",
                )
            )

        # Create AgentRuntime with all entry points
        self._agent_runtime = create_agent_runtime(
            graph=self.graph,

@@ -599,7 +780,7 @@
                error=error_msg,
            )

        if self._uses_async_entry_points:
        if self._uses_async_entry_points or self.enable_tui:
            # Multi-entry-point mode: use AgentRuntime
            return await self._run_with_agent_runtime(
                input_data=input_data or {},

@@ -891,15 +1072,25 @@
            EnvVarStorage,
        )

        # Build env mapping for fallback
        # Build env mapping for credential lookup
        env_mapping = {
            (spec.credential_id or name): spec.env_var
            for name, spec in CREDENTIAL_SPECS.items()
        }
        storage = CompositeStorage(
            primary=EncryptedFileStorage(),
            fallbacks=[EnvVarStorage(env_mapping=env_mapping)],
        )

        # Only use EncryptedFileStorage if the encryption key is configured;
        # otherwise just check env vars (avoids generating a throwaway key)
        storages: list = [EnvVarStorage(env_mapping=env_mapping)]
        if os.environ.get("HIVE_CREDENTIAL_KEY"):
            storages.insert(0, EncryptedFileStorage())

        if len(storages) == 1:
            storage = storages[0]
        else:
            storage = CompositeStorage(
                primary=storages[0],
                fallbacks=storages[1:],
            )
        store = CredentialStore(storage=storage)

        # Build reverse mappings
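The net effect is a lookup chain: encrypted file first (only when `HIVE_CREDENTIAL_KEY` is set), env vars second. A toy stand-in for the `CompositeStorage` precedence, assuming each backend's `get()` returns `None` on a miss:

```python
class TinyComposite:
    """Toy model of CompositeStorage precedence: primary first, then fallbacks in order."""

    def __init__(self, primary, fallbacks):
        self._chain = [primary, *fallbacks]

    def get(self, key):
        for store in self._chain:
            value = store.get(key)
            if value is not None:
                return value
        return None
```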
@@ -33,6 +33,11 @@ class ToolRegistry:
    4. Manually registered tools
    """

    # Framework-internal context keys injected into tool calls.
    # Stripped from LLM-facing schemas (the LLM doesn't know these values)
    # and auto-injected at call time for tools that accept them.
    CONTEXT_PARAMS = frozenset({"workspace_id", "agent_id", "session_id"})

    def __init__(self):
        self._tools: dict[str, RegisteredTool] = {}
        self._mcp_clients: list[Any] = []  # List of MCPClient instances

@@ -275,7 +280,16 @@ class ToolRegistry:
            return

        base_dir = config_path.parent
        for server_config in config.get("servers", []):

        # Support both formats:
        # {"servers": [{"name": "x", ...}]} (list format)
        # {"server-name": {"transport": ...}, ...} (dict format)
        server_list = config.get("servers", [])
        if not server_list and "servers" not in config:
            # Treat top-level keys as server names
            server_list = [{"name": name, **cfg} for name, cfg in config.items()]

        for server_config in server_list:
            cwd = server_config.get("cwd")
            if cwd and not Path(cwd).is_absolute():
                server_config["cwd"] = str((base_dir / cwd).resolve())
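Concretely, both config shapes normalize to the same server list. Shown here as Python dicts; the keys other than `name` are illustrative:

```python
list_format = {
    "servers": [
        {"name": "search", "transport": "stdio", "cwd": "servers/search"},
    ]
}
dict_format = {
    "search": {"transport": "stdio", "cwd": "servers/search"},
}

# Dict format: top-level keys become server names, as in the loader above.
servers = [{"name": name, **cfg} for name, cfg in dict_format.items()]
assert servers == list_format["servers"]
```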
@@ -333,7 +347,7 @@ class ToolRegistry:
        # Register each tool
        count = 0
        for mcp_tool in client.list_tools():
            # Convert MCP tool to framework Tool
            # Convert MCP tool to framework Tool (strips context params from LLM schema)
            tool = self._convert_mcp_tool_to_framework_tool(mcp_tool)

            # Create executor that calls the MCP server

@@ -395,6 +409,11 @@ class ToolRegistry:
        properties = input_schema.get("properties", {})
        required = input_schema.get("required", [])

        # Strip framework-internal context params from LLM-facing schema.
        # The LLM can't know these values; they're auto-injected at call time.
        properties = {k: v for k, v in properties.items() if k not in self.CONTEXT_PARAMS}
        required = [r for r in required if r not in self.CONTEXT_PARAMS]

        # Convert to framework Tool format
        tool = Tool(
            name=mcp_tool.name,
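The stripping is a plain filter over the JSON schema. A self-contained example with an illustrative tool schema:

```python
CONTEXT_PARAMS = frozenset({"workspace_id", "agent_id", "session_id"})

properties = {
    "query": {"type": "string"},
    "workspace_id": {"type": "string"},  # injected by the framework, hidden from the LLM
}
required = ["query", "workspace_id"]

llm_properties = {k: v for k, v in properties.items() if k not in CONTEXT_PARAMS}
llm_required = [r for r in required if r not in CONTEXT_PARAMS]

assert llm_properties == {"query": {"type": "string"}}
assert llm_required == ["query"]
```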
@@ -296,6 +296,25 @@ class AgentRuntime:
            raise ValueError(f"Entry point '{entry_point_id}' not found")
        return await stream.wait_for_completion(exec_id, timeout)

    async def inject_input(self, node_id: str, content: str) -> bool:
        """Inject user input into a running client-facing node.

        Routes input to the EventLoopNode identified by ``node_id``
        across all active streams. Used by the TUI ChatRepl to deliver
        user responses during client-facing node execution.

        Args:
            node_id: The node currently waiting for input
            content: The user's input text

        Returns:
            True if input was delivered, False if no matching node found
        """
        for stream in self._streams.values():
            if await stream.inject_input(node_id, content):
                return True
        return False

    async def get_goal_progress(self) -> dict[str, Any]:
        """
        Evaluate goal progress across all streams.
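From a caller's perspective, delivering a reply and checking whether anything was waiting looks like this sketch (the node id is hypothetical):

```python
async def on_user_reply(runtime, text: str) -> None:
    # Fans out across all active streams; returns False if no node matched.
    delivered = await runtime.inject_input("collect-feedback", text)
    if not delivered:
        print("No client-facing node is currently waiting for input")
```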
@@ -153,6 +153,7 @@ class ExecutionStream:
        # Execution tracking
        self._active_executions: dict[str, ExecutionContext] = {}
        self._execution_tasks: dict[str, asyncio.Task] = {}
        self._active_executors: dict[str, GraphExecutor] = {}
        self._execution_results: OrderedDict[str, ExecutionResult] = OrderedDict()
        self._execution_result_times: dict[str, float] = {}
        self._completion_events: dict[str, asyncio.Event] = {}

@@ -237,6 +238,21 @@ class ExecutionStream:
            )
        )

    async def inject_input(self, node_id: str, content: str) -> bool:
        """Inject user input into a running client-facing EventLoopNode.

        Searches active executors for a node matching ``node_id`` and calls
        its ``inject_event()`` method to unblock ``_await_user_input()``.

        Returns True if input was delivered, False otherwise.
        """
        for executor in self._active_executors.values():
            node = executor.node_registry.get(node_id)
            if node is not None and hasattr(node, "inject_event"):
                await node.inject_event(content)
                return True
        return False

    async def execute(
        self,
        input_data: dict[str, Any],

@@ -314,13 +330,21 @@ class ExecutionStream:
        # Create runtime adapter for this execution
        runtime_adapter = StreamRuntimeAdapter(self._runtime, execution_id)

        # Create executor for this execution
        # Create executor for this execution.
        # Scope storage by execution_id so each execution gets
        # fresh conversations and spillover directories.
        exec_storage = self._storage.base_path / "sessions" / execution_id
        executor = GraphExecutor(
            runtime=runtime_adapter,
            llm=self._llm,
            tools=self._tools,
            tool_executor=self._tool_executor,
            event_bus=self._event_bus,
            stream_id=self.stream_id,
            storage_path=exec_storage,
        )
        # Track executor so inject_input() can reach EventLoopNode instances
        self._active_executors[execution_id] = executor

        # Create modified graph with entry point
        # We need to override the entry_node to use our entry point

@@ -334,6 +358,9 @@ class ExecutionStream:
            session_state=ctx.session_state,
        )

        # Clean up executor reference
        self._active_executors.pop(execution_id, None)

        # Store result with retention
        self._record_execution_result(execution_id, result)
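The storage scoping is just path composition: each execution writes under its own `sessions/<execution_id>` subtree. A minimal illustration (the base path is an assumption; it varies per deployment):

```python
from pathlib import Path

base_path = Path("~/.hive/storage").expanduser()  # assumed base path
execution_id = "exec-123"
exec_storage = base_path / "sessions" / execution_id
# e.g. ~/.hive/storage/sessions/exec-123: fresh conversations and spillover per execution
```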
@@ -0,0 +1,518 @@
import logging
import time

from textual.app import App, ComposeResult
from textual.binding import Binding
from textual.containers import Container, Horizontal, Vertical
from textual.widgets import Footer, Label

from framework.runtime.agent_runtime import AgentRuntime
from framework.runtime.event_bus import AgentEvent, EventType
from framework.tui.widgets.chat_repl import ChatRepl
from framework.tui.widgets.graph_view import GraphOverview
from framework.tui.widgets.log_pane import LogPane


class StatusBar(Container):
    """Live status bar showing agent execution state."""

    DEFAULT_CSS = """
    StatusBar {
        dock: top;
        height: 1;
        background: $panel;
        color: $text;
        padding: 0 1;
    }
    StatusBar > Label {
        width: 100%;
    }
    """

    def __init__(self, graph_id: str = ""):
        super().__init__()
        self._graph_id = graph_id
        self._state = "idle"
        self._active_node: str | None = None
        self._node_detail: str = ""
        self._start_time: float | None = None
        self._final_elapsed: float | None = None

    def compose(self) -> ComposeResult:
        yield Label(id="status-content")

    def on_mount(self) -> None:
        self._refresh()
        self.set_interval(1.0, self._refresh)

    def _format_elapsed(self, seconds: float) -> str:
        total = int(seconds)
        hours, remainder = divmod(total, 3600)
        mins, secs = divmod(remainder, 60)
        if hours:
            return f"{hours}:{mins:02d}:{secs:02d}"
        return f"{mins}:{secs:02d}"

    def _refresh(self) -> None:
        parts: list[str] = []

        if self._graph_id:
            parts.append(f"[bold]{self._graph_id}[/bold]")

        if self._state == "idle":
            parts.append("[dim]○ idle[/dim]")
        elif self._state == "running":
            parts.append("[bold green]● running[/bold green]")
        elif self._state == "completed":
            parts.append("[green]✓ done[/green]")
        elif self._state == "failed":
            parts.append("[bold red]✗ failed[/bold red]")

        if self._active_node:
            node_str = f"[cyan]{self._active_node}[/cyan]"
            if self._node_detail:
                node_str += f" [dim]({self._node_detail})[/dim]"
            parts.append(node_str)

        if self._state == "running" and self._start_time:
            parts.append(f"[dim]{self._format_elapsed(time.time() - self._start_time)}[/dim]")
        elif self._final_elapsed is not None:
            parts.append(f"[dim]{self._format_elapsed(self._final_elapsed)}[/dim]")

        try:
            label = self.query_one("#status-content", Label)
            label.update(" │ ".join(parts))
        except Exception:
            pass

    def set_graph_id(self, graph_id: str) -> None:
        self._graph_id = graph_id
        self._refresh()

    def set_running(self, entry_node: str = "") -> None:
        self._state = "running"
        self._active_node = entry_node or None
        self._node_detail = ""
        self._start_time = time.time()
        self._final_elapsed = None
        self._refresh()

    def set_completed(self) -> None:
        self._state = "completed"
        if self._start_time:
            self._final_elapsed = time.time() - self._start_time
        self._active_node = None
        self._node_detail = ""
        self._start_time = None
        self._refresh()

    def set_failed(self, error: str = "") -> None:
        self._state = "failed"
        if self._start_time:
            self._final_elapsed = time.time() - self._start_time
        self._node_detail = error[:40] if error else ""
        self._start_time = None
        self._refresh()

    def set_active_node(self, node_id: str, detail: str = "") -> None:
        self._active_node = node_id
        self._node_detail = detail
        self._refresh()

    def set_node_detail(self, detail: str) -> None:
        self._node_detail = detail
        self._refresh()


class AdenTUI(App):
    TITLE = "Aden TUI Dashboard"
    COMMAND_PALETTE_BINDING = "ctrl+o"
    CSS = """
    Screen {
        layout: vertical;
        background: $surface;
    }

    #left-pane {
        width: 60%;
        height: 100%;
        layout: vertical;
        background: $surface;
    }

    GraphOverview {
        height: 40%;
        background: $panel;
        padding: 0;
    }

    LogPane {
        height: 60%;
        background: $surface;
        padding: 0;
        margin-bottom: 1;
    }

    ChatRepl {
        width: 40%;
        height: 100%;
        background: $panel;
        border-left: tall $primary;
        padding: 0;
    }

    #chat-history {
        height: 1fr;
        width: 100%;
        background: $surface;
        border: none;
        scrollbar-background: $panel;
        scrollbar-color: $primary;
    }

    RichLog {
        background: $surface;
        border: none;
        scrollbar-background: $panel;
        scrollbar-color: $primary;
    }

    Input {
        background: $surface;
        border: tall $primary;
        margin-top: 1;
    }

    Input:focus {
        border: tall $accent;
    }

    StatusBar {
        background: $panel;
        color: $text;
        height: 1;
        padding: 0 1;
    }

    Footer {
        background: $panel;
        color: $text-muted;
    }
    """

    BINDINGS = [
        Binding("q", "quit", "Quit"),
        Binding("ctrl+s", "screenshot", "Screenshot (SVG)", show=True, priority=True),
        Binding("tab", "focus_next", "Next Panel", show=True),
        Binding("shift+tab", "focus_previous", "Previous Panel", show=False),
    ]

    def __init__(self, runtime: AgentRuntime):
        super().__init__()

        self.runtime = runtime
        self.log_pane = LogPane()
        self.graph_view = GraphOverview(runtime)
        self.chat_repl = ChatRepl(runtime)
        self.status_bar = StatusBar(graph_id=runtime.graph.id)
        self.is_ready = False

    def compose(self) -> ComposeResult:
        yield self.status_bar

        yield Horizontal(
            Vertical(
                self.log_pane,
                self.graph_view,
                id="left-pane",
            ),
            self.chat_repl,
        )

        yield Footer()

    async def on_mount(self) -> None:
        """Called when app starts."""
        self.title = "Aden TUI Dashboard"

        # Add logging setup
        self._setup_logging_queue()

        # Set ready immediately so _poll_logs can process messages
        self.is_ready = True

        # Add event subscription with delay to ensure TUI is fully initialized
        self.call_later(self._init_runtime_connection)

        # Delay initial log messages until layout is fully rendered
        def write_initial_logs():
            logging.info("TUI Dashboard initialized successfully")
            logging.info("Waiting for agent execution to start...")

        # Wait for layout to be fully rendered before writing logs
        self.set_timer(0.2, write_initial_logs)

    def _setup_logging_queue(self) -> None:
        """Set up a thread-safe queue for logs."""
        try:
            import queue
            from logging.handlers import QueueHandler

            self.log_queue = queue.Queue()
            self.queue_handler = QueueHandler(self.log_queue)
            self.queue_handler.setLevel(logging.INFO)

            # Get root logger
            root_logger = logging.getLogger()

            # Remove ALL existing handlers to prevent stdout output
            # This is critical - StreamHandlers cause text to appear in header
            for handler in root_logger.handlers[:]:
                root_logger.removeHandler(handler)

            # Add ONLY our queue handler
            root_logger.addHandler(self.queue_handler)
            root_logger.setLevel(logging.INFO)

            # Suppress LiteLLM logging completely
            litellm_logger = logging.getLogger("LiteLLM")
            litellm_logger.setLevel(logging.CRITICAL)  # Only show critical errors
            litellm_logger.propagate = False  # Don't propagate to root logger

            # Start polling
            self.set_interval(0.1, self._poll_logs)
        except Exception:
            pass

    def _poll_logs(self) -> None:
        """Poll the log queue and update UI."""
        if not self.is_ready:
            return

        try:
            while not self.log_queue.empty():
                record = self.log_queue.get_nowait()
                # Filter out framework/library logs
                if record.name.startswith(("textual", "LiteLLM", "litellm")):
                    continue

                self.log_pane.write_python_log(record)
        except Exception:
            pass

    _EVENT_TYPES = [
        EventType.LLM_TEXT_DELTA,
        EventType.CLIENT_OUTPUT_DELTA,
        EventType.TOOL_CALL_STARTED,
        EventType.TOOL_CALL_COMPLETED,
        EventType.EXECUTION_STARTED,
        EventType.EXECUTION_COMPLETED,
        EventType.EXECUTION_FAILED,
        EventType.NODE_LOOP_STARTED,
        EventType.NODE_LOOP_ITERATION,
        EventType.NODE_LOOP_COMPLETED,
        EventType.CLIENT_INPUT_REQUESTED,
        EventType.NODE_STALLED,
        EventType.GOAL_PROGRESS,
        EventType.GOAL_ACHIEVED,
        EventType.CONSTRAINT_VIOLATION,
        EventType.STATE_CHANGED,
        EventType.NODE_INPUT_BLOCKED,
    ]

    _LOG_PANE_EVENTS = frozenset(_EVENT_TYPES) - {
        EventType.LLM_TEXT_DELTA,
        EventType.CLIENT_OUTPUT_DELTA,
    }

    async def _init_runtime_connection(self) -> None:
        """Subscribe to runtime events with an async handler."""
        try:
            self._subscription_id = self.runtime.subscribe_to_events(
                event_types=self._EVENT_TYPES,
                handler=self._handle_event,
            )
        except Exception:
            pass

    async def _handle_event(self, event: AgentEvent) -> None:
        """Called from the agent thread — bridge to Textual's main thread."""
        try:
            self.call_from_thread(self._route_event, event)
        except Exception:
            pass

    def _route_event(self, event: AgentEvent) -> None:
        """Route incoming events to widgets. Runs on Textual's main thread."""
        if not self.is_ready:
            return

        try:
            et = event.type

            # --- Chat REPL events ---
            if et in (EventType.LLM_TEXT_DELTA, EventType.CLIENT_OUTPUT_DELTA):
                self.chat_repl.handle_text_delta(
                    event.data.get("content", ""),
                    event.data.get("snapshot", ""),
                )
            elif et == EventType.TOOL_CALL_STARTED:
                self.chat_repl.handle_tool_started(
                    event.data.get("tool_name", "unknown"),
                    event.data.get("tool_input", {}),
                )
            elif et == EventType.TOOL_CALL_COMPLETED:
                self.chat_repl.handle_tool_completed(
                    event.data.get("tool_name", "unknown"),
                    event.data.get("result", ""),
                    event.data.get("is_error", False),
                )
            elif et == EventType.EXECUTION_COMPLETED:
                self.chat_repl.handle_execution_completed(event.data.get("output", {}))
            elif et == EventType.EXECUTION_FAILED:
                self.chat_repl.handle_execution_failed(event.data.get("error", "Unknown error"))
            elif et == EventType.CLIENT_INPUT_REQUESTED:
                self.chat_repl.handle_input_requested(
                    event.node_id or event.data.get("node_id", ""),
                )

            # --- Graph view events ---
            if et in (
                EventType.EXECUTION_STARTED,
                EventType.EXECUTION_COMPLETED,
                EventType.EXECUTION_FAILED,
            ):
                self.graph_view.update_execution(event)

            if et == EventType.NODE_LOOP_STARTED:
                self.graph_view.handle_node_loop_started(event.node_id or "")
            elif et == EventType.NODE_LOOP_ITERATION:
                self.graph_view.handle_node_loop_iteration(
                    event.node_id or "",
                    event.data.get("iteration", 0),
                )
            elif et == EventType.NODE_LOOP_COMPLETED:
                self.graph_view.handle_node_loop_completed(event.node_id or "")
            elif et == EventType.NODE_STALLED:
                self.graph_view.handle_stalled(
                    event.node_id or "",
                    event.data.get("reason", ""),
                )

            if et == EventType.TOOL_CALL_STARTED:
                self.graph_view.handle_tool_call(
                    event.node_id or "",
                    event.data.get("tool_name", "unknown"),
                    started=True,
                )
            elif et == EventType.TOOL_CALL_COMPLETED:
                self.graph_view.handle_tool_call(
                    event.node_id or "",
                    event.data.get("tool_name", "unknown"),
                    started=False,
                )

            # --- Status bar events ---
            if et == EventType.EXECUTION_STARTED:
                entry_node = event.data.get("entry_node") or (
                    self.runtime.graph.entry_node if self.runtime else ""
                )
                self.status_bar.set_running(entry_node)
            elif et == EventType.EXECUTION_COMPLETED:
                self.status_bar.set_completed()
            elif et == EventType.EXECUTION_FAILED:
                self.status_bar.set_failed(event.data.get("error", ""))
            elif et == EventType.NODE_LOOP_STARTED:
                self.status_bar.set_active_node(event.node_id or "", "thinking...")
            elif et == EventType.NODE_LOOP_ITERATION:
                self.status_bar.set_node_detail(f"step {event.data.get('iteration', '?')}")
            elif et == EventType.TOOL_CALL_STARTED:
                self.status_bar.set_node_detail(f"{event.data.get('tool_name', '')}...")
            elif et == EventType.TOOL_CALL_COMPLETED:
                self.status_bar.set_node_detail("thinking...")
            elif et == EventType.NODE_STALLED:
                self.status_bar.set_node_detail(f"stalled: {event.data.get('reason', '')}")

            # --- Log pane events ---
            if et in self._LOG_PANE_EVENTS:
                self.log_pane.write_event(event)
        except Exception:
            pass

    def save_screenshot(self, filename: str | None = None) -> str:
        """Save a screenshot of the current screen as SVG (viewable in browsers).

        Args:
            filename: Optional filename for the screenshot. If None, generates timestamp-based name.

        Returns:
            Path to the saved SVG file.
        """
        from datetime import datetime
        from pathlib import Path

        # Create screenshots directory
        screenshots_dir = Path("screenshots")
        screenshots_dir.mkdir(exist_ok=True)

        # Generate filename if not provided
        if filename is None:
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            filename = f"tui_screenshot_{timestamp}.svg"

        # Ensure .svg extension
        if not filename.endswith(".svg"):
            filename += ".svg"

        # Full path
        filepath = screenshots_dir / filename

        # Temporarily hide borders for cleaner screenshot
        chat_widget = self.query_one(ChatRepl)
        original_chat_border = chat_widget.styles.border_left
        chat_widget.styles.border_left = ("none", "transparent")

        # Hide all Input widget borders
        input_widgets = self.query("Input")
        original_input_borders = []
        for input_widget in input_widgets:
            original_input_borders.append(input_widget.styles.border)
            input_widget.styles.border = ("none", "transparent")

        try:
            # Get SVG data from Textual and save it
            svg_data = self.export_screenshot()
            filepath.write_text(svg_data, encoding="utf-8")
        finally:
            # Restore the original borders
            chat_widget.styles.border_left = original_chat_border
            for i, input_widget in enumerate(input_widgets):
                input_widget.styles.border = original_input_borders[i]

        return str(filepath)

    def action_screenshot(self) -> None:
        """Take a screenshot (bound to Ctrl+S)."""
        try:
            filepath = self.save_screenshot()
            self.notify(
                f"Screenshot saved: {filepath} (SVG - open in browser)",
                severity="information",
                timeout=5,
            )
        except Exception as e:
            self.notify(f"Screenshot failed: {e}", severity="error", timeout=5)

    async def on_unmount(self) -> None:
        """Cleanup on app shutdown."""
        self.is_ready = False
        try:
            if hasattr(self, "_subscription_id"):
                self.runtime.unsubscribe_from_events(self._subscription_id)
        except Exception:
            pass
        try:
            if hasattr(self, "queue_handler"):
                logging.getLogger().removeHandler(self.queue_handler)
        except Exception:
            pass
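Wiring the dashboard up is not shown in this file. A hedged launch sketch, assuming the module path `framework.tui.app` for the file above and a runtime built elsewhere (e.g. via `create_agent_runtime()`):

```python
from framework.tui.app import AdenTUI  # assumed module path

def run_dashboard(runtime) -> None:
    # AdenTUI subscribes to runtime events on mount and unsubscribes on unmount.
    app = AdenTUI(runtime)
    app.run()  # blocks on Textual's event loop until the user quits
```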
@@ -0,0 +1,303 @@
"""
Chat / REPL Widget - Uses RichLog for append-only, selection-safe display.

Streaming display approach:
- The processing-indicator Label is used as a live status bar during streaming
  (Label.update() replaces text in-place, unlike RichLog which is append-only).
- On EXECUTION_COMPLETED, the final output is written to RichLog as permanent history.
- Tool events are written directly to RichLog as discrete status lines.

Client-facing input:
- When a client_facing=True EventLoopNode emits CLIENT_INPUT_REQUESTED, the
  ChatRepl transitions to "waiting for input" state: input is re-enabled and
  subsequent submissions are routed to runtime.inject_input() instead of
  starting a new execution.
"""

import asyncio
import threading
from typing import Any

from textual.app import ComposeResult
from textual.containers import Vertical
from textual.widgets import Input, Label, RichLog

from framework.runtime.agent_runtime import AgentRuntime


class ChatRepl(Vertical):
    """Widget for interactive chat/REPL."""

    DEFAULT_CSS = """
    ChatRepl {
        width: 100%;
        height: 100%;
        layout: vertical;
    }

    ChatRepl > RichLog {
        width: 100%;
        height: 1fr;
        background: $surface;
        border: none;
        scrollbar-background: $panel;
        scrollbar-color: $primary;
    }

    ChatRepl > #processing-indicator {
        width: 100%;
        height: 1;
        background: $primary 20%;
        color: $text;
        text-style: bold;
        display: none;
    }

    ChatRepl > Input {
        width: 100%;
        height: auto;
        dock: bottom;
        background: $surface;
        border: tall $primary;
        margin-top: 1;
    }

    ChatRepl > Input:focus {
        border: tall $accent;
    }
    """

    def __init__(self, runtime: AgentRuntime):
        super().__init__()
        self.runtime = runtime
        self._current_exec_id: str | None = None
        self._streaming_snapshot: str = ""
        self._waiting_for_input: bool = False
        self._input_node_id: str | None = None

        # Dedicated event loop for agent execution.
        # Keeps blocking runtime code (LLM calls, MCP tools) off
        # the Textual event loop so the UI stays responsive.
        self._agent_loop = asyncio.new_event_loop()
        self._agent_thread = threading.Thread(
            target=self._agent_loop.run_forever,
            daemon=True,
            name="agent-execution",
        )
        self._agent_thread.start()

    def compose(self) -> ComposeResult:
        yield RichLog(id="chat-history", highlight=True, markup=True, auto_scroll=False, wrap=True)
        yield Label("Agent is processing...", id="processing-indicator")
        yield Input(placeholder="Enter input for agent...", id="chat-input")

    def _write_history(self, content: str) -> None:
        """Write to chat history, only auto-scrolling if user is at the bottom."""
        history = self.query_one("#chat-history", RichLog)
        was_at_bottom = history.is_vertical_scroll_end
        history.write(content)
        if was_at_bottom:
            history.scroll_end(animate=False)

    def on_mount(self) -> None:
        """Add welcome message when widget mounts."""
        history = self.query_one("#chat-history", RichLog)
        history.write("[bold cyan]Chat REPL Ready[/bold cyan] — Type your input below\n")

    async def on_input_submitted(self, message: Input.Submitted) -> None:
        """Handle input submission — either start new execution or inject input."""
        user_input = message.value.strip()
        if not user_input:
            return

        # Client-facing input: route to the waiting node
        if self._waiting_for_input and self._input_node_id:
            self._write_history(f"[bold green]You:[/bold green] {user_input}")
            message.input.value = ""

            # Disable input while agent processes the response
            chat_input = self.query_one("#chat-input", Input)
            chat_input.disabled = True
            chat_input.placeholder = "Enter input for agent..."
            self._waiting_for_input = False

            indicator = self.query_one("#processing-indicator", Label)
            indicator.update("Thinking...")

            node_id = self._input_node_id
            self._input_node_id = None

            try:
                future = asyncio.run_coroutine_threadsafe(
                    self.runtime.inject_input(node_id, user_input),
                    self._agent_loop,
                )
                await asyncio.wrap_future(future)
            except Exception as e:
                self._write_history(f"[bold red]Error delivering input:[/bold red] {e}")
            return

        # Double-submit guard: reject input while an execution is in-flight
        if self._current_exec_id is not None:
            self._write_history("[dim]Agent is still running — please wait.[/dim]")
            return

        indicator = self.query_one("#processing-indicator", Label)

        # Append user message and clear input
        self._write_history(f"[bold green]You:[/bold green] {user_input}")
        message.input.value = ""

        try:
            # Get entry point
            entry_points = self.runtime.get_entry_points()
            if not entry_points:
                self._write_history("[bold red]Error:[/bold red] No entry points")
                return

            # Determine the input key from the entry node
            entry_point = entry_points[0]
            entry_node = self.runtime.graph.get_node(entry_point.entry_node)

            if entry_node and entry_node.input_keys:
                input_key = entry_node.input_keys[0]
            else:
                input_key = "input"

            # Reset streaming state
            self._streaming_snapshot = ""

            # Show processing indicator
            indicator.update("Thinking...")
            indicator.display = True

            # Disable input while the agent is working
            chat_input = self.query_one("#chat-input", Input)
            chat_input.disabled = True

            # Submit execution to the dedicated agent loop so blocking
            # runtime code (LLM, MCP tools) never touches Textual's loop.
            # trigger() returns immediately with an exec_id; the heavy
            # execution task runs entirely on the agent thread.
            future = asyncio.run_coroutine_threadsafe(
                self.runtime.trigger(
                    entry_point_id=entry_point.id,
                    input_data={input_key: user_input},
                ),
                self._agent_loop,
            )
            # wrap_future lets us await without blocking Textual's loop
            self._current_exec_id = await asyncio.wrap_future(future)

        except Exception as e:
            indicator.display = False
            self._current_exec_id = None
            # Re-enable input on error
            chat_input = self.query_one("#chat-input", Input)
            chat_input.disabled = False
            self._write_history(f"[bold red]Error:[/bold red] {e}")

    # -- Event handlers called by app.py _handle_event --

    def handle_text_delta(self, content: str, snapshot: str) -> None:
        """Handle a streaming text token from the LLM."""
        self._streaming_snapshot = snapshot

        # Show a truncated live preview in the indicator label
        indicator = self.query_one("#processing-indicator", Label)
        preview = snapshot[-80:] if len(snapshot) > 80 else snapshot
        # Replace newlines for single-line display
        preview = preview.replace("\n", " ")
        indicator.update(
            f"Thinking: ...{preview}" if len(snapshot) > 80 else f"Thinking: {preview}"
        )

    def handle_tool_started(self, tool_name: str, tool_input: dict[str, Any]) -> None:
        """Handle a tool call starting."""
        # Update indicator to show tool activity
        indicator = self.query_one("#processing-indicator", Label)
        indicator.update(f"Using tool: {tool_name}...")

        # Write a discrete status line to history
        self._write_history(f"[dim]Tool: {tool_name}[/dim]")

    def handle_tool_completed(self, tool_name: str, result: str, is_error: bool) -> None:
        """Handle a tool call completing."""
        result_str = str(result)
        preview = result_str[:200] + "..." if len(result_str) > 200 else result_str
        preview = preview.replace("\n", " ")

        if is_error:
            self._write_history(f"[dim red]Tool {tool_name} error: {preview}[/dim red]")
        else:
            self._write_history(f"[dim]Tool {tool_name} result: {preview}[/dim]")

        # Restore thinking indicator
        indicator = self.query_one("#processing-indicator", Label)
        indicator.update("Thinking...")

    def handle_execution_completed(self, output: dict[str, Any]) -> None:
        """Handle execution finishing successfully."""
        indicator = self.query_one("#processing-indicator", Label)
        indicator.display = False

        # Write the final streaming snapshot to permanent history (if any)
        if self._streaming_snapshot:
            self._write_history(f"[bold blue]Agent:[/bold blue] {self._streaming_snapshot}")
        else:
            output_str = str(output.get("output_string", output))
            self._write_history(f"[bold blue]Agent:[/bold blue] {output_str}")
        self._write_history("")  # separator

        self._current_exec_id = None
        self._streaming_snapshot = ""
        self._waiting_for_input = False
        self._input_node_id = None

        # Re-enable input
        chat_input = self.query_one("#chat-input", Input)
        chat_input.disabled = False
        chat_input.placeholder = "Enter input for agent..."
        chat_input.focus()

    def handle_execution_failed(self, error: str) -> None:
        """Handle execution failing."""
        indicator = self.query_one("#processing-indicator", Label)
        indicator.display = False

        self._write_history(f"[bold red]Error:[/bold red] {error}")
        self._write_history("")  # separator

        self._current_exec_id = None
        self._streaming_snapshot = ""
        self._waiting_for_input = False
        self._input_node_id = None

        # Re-enable input
        chat_input = self.query_one("#chat-input", Input)
        chat_input.disabled = False
        chat_input.placeholder = "Enter input for agent..."
        chat_input.focus()

    def handle_input_requested(self, node_id: str) -> None:
        """Handle a client-facing node requesting user input.

        Transitions to 'waiting for input' state: flushes the current
        streaming snapshot to history, re-enables the input widget,
        and sets a flag so the next submission routes to inject_input().
        """
        # Flush accumulated streaming text as agent output
        if self._streaming_snapshot:
            self._write_history(f"[bold blue]Agent:[/bold blue] {self._streaming_snapshot}")
            self._streaming_snapshot = ""

        self._waiting_for_input = True
        self._input_node_id = node_id or None

        indicator = self.query_one("#processing-indicator", Label)
        indicator.update("Waiting for your input...")

        chat_input = self.query_one("#chat-input", Input)
        chat_input.disabled = False
        chat_input.placeholder = "Type your response..."
        chat_input.focus()
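The dedicated-loop bridge used throughout `ChatRepl` is standard asyncio. A self-contained reduction of the pattern, with a sleep standing in for a blocking LLM or MCP call:

```python
import asyncio
import threading

# A daemon thread runs its own event loop for blocking agent work.
agent_loop = asyncio.new_event_loop()
threading.Thread(target=agent_loop.run_forever, daemon=True).start()

async def slow_work() -> str:
    await asyncio.sleep(0.1)  # stands in for an LLM or MCP call
    return "done"

async def ui_side() -> None:
    # Schedule on the agent loop, then await the result on the UI loop
    # without blocking it, exactly what on_input_submitted does.
    future = asyncio.run_coroutine_threadsafe(slow_work(), agent_loop)
    print(await asyncio.wrap_future(future))

asyncio.run(ui_side())
```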
@@ -0,0 +1,194 @@
"""
Graph/Tree Overview Widget - Displays real agent graph structure.
"""

from textual.app import ComposeResult
from textual.containers import Vertical
from textual.widgets import RichLog

from framework.runtime.agent_runtime import AgentRuntime
from framework.runtime.event_bus import EventType


class GraphOverview(Vertical):
    """Widget to display Agent execution graph/tree with real data."""

    DEFAULT_CSS = """
    GraphOverview {
        width: 100%;
        height: 100%;
        background: $panel;
    }

    GraphOverview > RichLog {
        width: 100%;
        height: 100%;
        background: $panel;
        border: none;
        scrollbar-background: $surface;
        scrollbar-color: $primary;
    }
    """

    def __init__(self, runtime: AgentRuntime):
        super().__init__()
        self.runtime = runtime
        self.active_node: str | None = None
        self.execution_path: list[str] = []
        # Per-node status strings shown next to the node in the graph display.
        # e.g. {"planner": "thinking...", "searcher": "web_search..."}
        self._node_status: dict[str, str] = {}

    def compose(self) -> ComposeResult:
        # Use RichLog for formatted output
        yield RichLog(id="graph-display", highlight=True, markup=True)

    def on_mount(self) -> None:
        """Display initial graph structure."""
        self._display_graph()

    def _topo_order(self) -> list[str]:
        """BFS from entry_node following edges."""
        graph = self.runtime.graph
        visited: list[str] = []
        seen: set[str] = set()
        queue = [graph.entry_node]
        while queue:
            nid = queue.pop(0)
            if nid in seen:
                continue
            seen.add(nid)
            visited.append(nid)
            for edge in graph.get_outgoing_edges(nid):
                if edge.target not in seen:
                    queue.append(edge.target)
        # Append orphan nodes not reachable from entry
        for node in graph.nodes:
            if node.id not in seen:
                visited.append(node.id)
        return visited

    def _render_node_line(self, node_id: str) -> str:
        """Render a single node with status symbol and optional status text."""
        graph = self.runtime.graph
        is_terminal = node_id in (graph.terminal_nodes or [])
        is_active = node_id == self.active_node
        is_done = node_id in self.execution_path and not is_active
        status = self._node_status.get(node_id, "")

        if is_active:
            sym = "[bold green]●[/bold green]"
        elif is_done:
            sym = "[dim]✓[/dim]"
        elif is_terminal:
            sym = "[yellow]■[/yellow]"
        else:
            sym = "○"

        if is_active:
            name = f"[bold green]{node_id}[/bold green]"
        elif is_done:
            name = f"[dim]{node_id}[/dim]"
        else:
            name = node_id

        suffix = f" [italic]{status}[/italic]" if status else ""
        return f" {sym} {name}{suffix}"

    def _render_edges(self, node_id: str) -> list[str]:
        """Render edge connectors from this node to its targets."""
        edges = self.runtime.graph.get_outgoing_edges(node_id)
        if not edges:
            return []
        if len(edges) == 1:
            return [" │", " ▼"]
        # Fan-out: show branches
        lines: list[str] = []
        for i, edge in enumerate(edges):
            connector = "└" if i == len(edges) - 1 else "├"
            cond = ""
            if edge.condition.value not in ("always", "on_success"):
                cond = f" [dim]({edge.condition.value})[/dim]"
            lines.append(f" {connector}──▶ {edge.target}{cond}")
        return lines

    def _display_graph(self) -> None:
        """Display the graph as an ASCII DAG with edge connectors."""
        display = self.query_one("#graph-display", RichLog)
        display.clear()

        graph = self.runtime.graph
        display.write(f"[bold cyan]Agent Graph:[/bold cyan] {graph.id}\n")

        # Render each node in topological order with edges
        ordered = self._topo_order()
        for node_id in ordered:
            display.write(self._render_node_line(node_id))
            for edge_line in self._render_edges(node_id):
                display.write(edge_line)

        # Execution path footer
        if self.execution_path:
            display.write("")
            display.write(f"[dim]Path:[/dim] {' → '.join(self.execution_path[-5:])}")

    def update_active_node(self, node_id: str) -> None:
        """Update the currently active node."""
        self.active_node = node_id
        if node_id not in self.execution_path:
            self.execution_path.append(node_id)
        self._display_graph()

    def update_execution(self, event) -> None:
        """Update the displayed node status based on execution lifecycle events."""
        if event.type == EventType.EXECUTION_STARTED:
            self._node_status.clear()
            self.execution_path.clear()
            entry_node = event.data.get("entry_node") or (
                self.runtime.graph.entry_node if self.runtime else None
            )
            if entry_node:
                self.update_active_node(entry_node)

        elif event.type == EventType.EXECUTION_COMPLETED:
            self.active_node = None
            self._node_status.clear()
            self._display_graph()

        elif event.type == EventType.EXECUTION_FAILED:
            error = event.data.get("error", "Unknown error")
            if self.active_node:
                self._node_status[self.active_node] = f"[red]FAILED: {error}[/red]"
            self.active_node = None
            self._display_graph()

    # -- Event handlers called by app.py _handle_event --

    def handle_node_loop_started(self, node_id: str) -> None:
        """A node's event loop has started."""
        self._node_status[node_id] = "thinking..."
        self.update_active_node(node_id)

    def handle_node_loop_iteration(self, node_id: str, iteration: int) -> None:
        """A node advanced to a new loop iteration."""
        self._node_status[node_id] = f"step {iteration}"
        self._display_graph()

    def handle_node_loop_completed(self, node_id: str) -> None:
        """A node's event loop completed."""
        self._node_status.pop(node_id, None)
        self._display_graph()

    def handle_tool_call(self, node_id: str, tool_name: str, *, started: bool) -> None:
        """Show tool activity next to the active node."""
        if started:
            self._node_status[node_id] = f"{tool_name}..."
        else:
            # Restore to generic thinking status after tool completes
            self._node_status[node_id] = "thinking..."
        self._display_graph()

    def handle_stalled(self, node_id: str, reason: str) -> None:
        """Highlight a stalled node."""
        self._node_status[node_id] = f"[red]stalled: {reason}[/red]"
        self._display_graph()
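With the Rich markup tags stripped, the rendered overview looks roughly like this (node ids illustrative, one active node with a tool status, one terminal node marked ■):

```
Agent Graph: research-graph

 ✓ plan
 │
 ▼
 ● search web_search...
 │
 ▼
 ■ report

Path: plan → search
```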
@@ -0,0 +1,147 @@
"""
Log Pane Widget - Uses RichLog for reliable rendering.
"""

import logging
from datetime import datetime

from textual.app import ComposeResult
from textual.containers import Container
from textual.widgets import RichLog

from framework.runtime.event_bus import AgentEvent, EventType


class LogPane(Container):
    """Widget to display logs with reliable rendering."""

    _EVENT_FORMAT: dict[EventType, tuple[str, str]] = {
        EventType.EXECUTION_STARTED: (">>", "bold cyan"),
        EventType.EXECUTION_COMPLETED: ("<<", "bold green"),
        EventType.EXECUTION_FAILED: ("!!", "bold red"),
        EventType.TOOL_CALL_STARTED: ("->", "yellow"),
        EventType.TOOL_CALL_COMPLETED: ("<-", "green"),
        EventType.NODE_LOOP_STARTED: ("@@", "cyan"),
        EventType.NODE_LOOP_ITERATION: ("..", "dim"),
        EventType.NODE_LOOP_COMPLETED: ("@@", "dim"),
        EventType.NODE_STALLED: ("!!", "bold yellow"),
        EventType.NODE_INPUT_BLOCKED: ("!!", "yellow"),
        EventType.GOAL_PROGRESS: ("%%", "blue"),
        EventType.GOAL_ACHIEVED: ("**", "bold green"),
        EventType.CONSTRAINT_VIOLATION: ("!!", "bold red"),
        EventType.STATE_CHANGED: ("~~", "dim"),
        EventType.CLIENT_INPUT_REQUESTED: ("??", "magenta"),
    }

    _LOG_LEVEL_COLORS = {
        logging.DEBUG: "dim",
        logging.INFO: "",
        logging.WARNING: "yellow",
        logging.ERROR: "red",
        logging.CRITICAL: "bold red",
    }

    DEFAULT_CSS = """
    LogPane {
        width: 100%;
        height: 100%;
    }

    LogPane > RichLog {
        width: 100%;
        height: 100%;
        background: $surface;
        border: none;
        scrollbar-background: $panel;
        scrollbar-color: $primary;
    }
    """

    def compose(self) -> ComposeResult:
        # RichLog is designed for log display and doesn't have TextArea's rendering issues
        yield RichLog(id="main-log", highlight=True, markup=True, auto_scroll=False)

    def write_event(self, event: AgentEvent) -> None:
        """Format an AgentEvent with timestamp + symbol and write to the log."""
        ts = event.timestamp.strftime("%H:%M:%S")
        symbol, color = self._EVENT_FORMAT.get(event.type, ("--", "dim"))
        text = self._extract_event_text(event)
        self.write_log(f"[dim]{ts}[/dim] [{color}]{symbol} {text}[/{color}]")

    def _extract_event_text(self, event: AgentEvent) -> str:
        """Extract human-readable text from an event's data dict."""
        et = event.type
        data = event.data

        if et == EventType.EXECUTION_STARTED:
            return "Execution started"
        elif et == EventType.EXECUTION_COMPLETED:
            return "Execution completed"
        elif et == EventType.EXECUTION_FAILED:
            return f"Execution FAILED: {data.get('error', 'unknown')}"
        elif et == EventType.TOOL_CALL_STARTED:
            return f"Tool call: {data.get('tool_name', 'unknown')}"
        elif et == EventType.TOOL_CALL_COMPLETED:
            name = data.get("tool_name", "unknown")
            if data.get("is_error"):
                preview = str(data.get("result", ""))[:80]
                return f"Tool error: {name} - {preview}"
            return f"Tool done: {name}"
        elif et == EventType.NODE_LOOP_STARTED:
            return f"Node started: {event.node_id or 'unknown'}"
        elif et == EventType.NODE_LOOP_ITERATION:
            return f"{event.node_id or 'unknown'} iteration {data.get('iteration', '?')}"
        elif et == EventType.NODE_LOOP_COMPLETED:
            return f"Node done: {event.node_id or 'unknown'}"
        elif et == EventType.NODE_STALLED:
            reason = data.get("reason", "")
            node = event.node_id or "unknown"
            return f"Node stalled: {node} - {reason}" if reason else f"Node stalled: {node}"
        elif et == EventType.NODE_INPUT_BLOCKED:
            return f"Node input blocked: {event.node_id or 'unknown'}"
        elif et == EventType.GOAL_PROGRESS:
            return f"Goal progress: {data.get('progress', '?')}"
        elif et == EventType.GOAL_ACHIEVED:
            return "Goal achieved"
        elif et == EventType.CONSTRAINT_VIOLATION:
            return f"Constraint violated: {data.get('description', 'unknown')}"
        elif et == EventType.STATE_CHANGED:
            return f"State changed: {data.get('key', 'unknown')}"
        elif et == EventType.CLIENT_INPUT_REQUESTED:
            return "Waiting for user input"
        else:
            return f"{et.value}: {data}"

    def write_python_log(self, record: logging.LogRecord) -> None:
        """Format a Python log record with timestamp and severity color."""
        ts = datetime.fromtimestamp(record.created).strftime("%H:%M:%S")
        color = self._LOG_LEVEL_COLORS.get(record.levelno, "")
        msg = record.getMessage()
        if color:
            self.write_log(f"[dim]{ts}[/dim] [{color}]{record.levelname}[/{color}] {msg}")
        else:
            self.write_log(f"[dim]{ts}[/dim] {record.levelname} {msg}")

    def write_log(self, message: str) -> None:
        """Write a log message to the log pane."""
        try:
            # Check if widget is mounted
            if not self.is_mounted:
                return

            log = self.query_one("#main-log", RichLog)

            # Check if log is mounted
            if not log.is_mounted:
                return

            # Only auto-scroll if user is already at the bottom
            was_at_bottom = log.is_vertical_scroll_end

            log.write(message)

            if was_at_bottom:
                log.scroll_end(animate=False)

        except Exception:
            pass

@@ -18,7 +18,8 @@ dependencies = [
    "tools",
]

# [project.optional-dependencies]
[project.optional-dependencies]
tui = ["textual>=0.75.0"]

[project.scripts]
hive = "framework.cli:main"
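With the extra declared, the Textual dependency installs via the standard extras syntax, e.g. `pip install -e ".[tui]"` from the package directory (exact project layout assumed); plain installs keep the TUI optional.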
@@ -104,8 +104,10 @@ def test_event_loop_node_spec_accepted():
|
||||
# --- _get_node_implementation() tests ---
|
||||
|
||||
|
||||
def test_unregistered_event_loop_raises(runtime):
|
||||
"""An event_loop node not in the registry should raise RuntimeError."""
|
||||
def test_unregistered_event_loop_auto_creates(runtime):
|
||||
"""An event_loop node not in the registry should be auto-created."""
|
||||
from framework.graph.event_loop_node import EventLoopNode
|
||||
|
||||
spec = NodeSpec(
|
||||
id="el1",
|
||||
name="Event Loop",
|
||||
@@ -114,8 +116,10 @@ def test_unregistered_event_loop_raises(runtime):
|
||||
)
|
||||
executor = GraphExecutor(runtime=runtime)
|
||||
|
||||
with pytest.raises(RuntimeError, match="not found in registry"):
|
||||
executor._get_node_implementation(spec)
|
||||
result = executor._get_node_implementation(spec)
|
||||
assert isinstance(result, EventLoopNode)
|
||||
# Auto-created node should be cached in registry
|
||||
assert "el1" in executor.node_registry
|
||||
|
||||
|
||||
def test_registered_event_loop_returns_impl(runtime):
|
||||
|
||||
@@ -5,7 +5,7 @@ Focused on minimal success and failure scenarios.
|
||||
|
||||
import pytest
|
||||
|
||||
from framework.graph.edge import GraphSpec
|
||||
from framework.graph.edge import EdgeCondition, EdgeSpec, GraphSpec
|
||||
from framework.graph.executor import GraphExecutor
|
||||
from framework.graph.goal import Goal
|
||||
from framework.graph.node import NodeResult, NodeSpec
|
||||
@@ -130,3 +130,169 @@ async def test_executor_single_node_failure():
|
||||
assert result.success is False
|
||||
assert result.error is not None
|
||||
assert result.path == ["n1"]
|
||||
|
||||
|
||||
# ---- Fake event bus that records calls ----
|
||||
class FakeEventBus:
|
||||
def __init__(self):
|
||||
self.events = []
|
||||
|
||||
async def emit_node_loop_started(self, **kwargs):
|
||||
self.events.append(("started", kwargs))
|
||||
|
||||
async def emit_node_loop_completed(self, **kwargs):
|
||||
self.events.append(("completed", kwargs))
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_executor_emits_node_events():
|
||||
"""Executor should emit NODE_LOOP_STARTED/COMPLETED for each non-event_loop node."""
|
||||
runtime = DummyRuntime()
|
||||
event_bus = FakeEventBus()
|
||||
|
||||
graph = GraphSpec(
|
||||
id="graph-ev",
|
||||
goal_id="g-ev",
|
||||
nodes=[
|
||||
NodeSpec(
|
||||
id="n1",
|
||||
name="first",
|
||||
description="first node",
|
||||
node_type="llm_generate",
|
||||
input_keys=[],
|
||||
output_keys=["result"],
|
||||
max_retries=0,
|
||||
),
|
||||
NodeSpec(
|
||||
id="n2",
|
||||
name="second",
|
||||
description="second node",
|
||||
node_type="llm_generate",
|
||||
input_keys=["result"],
|
||||
output_keys=["result"],
|
||||
max_retries=0,
|
||||
),
|
||||
],
|
||||
edges=[
|
||||
EdgeSpec(
|
||||
id="e1",
|
||||
source="n1",
|
||||
target="n2",
|
||||
condition=EdgeCondition.ON_SUCCESS,
|
||||
),
|
||||
],
|
||||
entry_node="n1",
|
||||
terminal_nodes=["n2"],
|
||||
)
|
||||
|
||||
executor = GraphExecutor(
|
||||
runtime=runtime,
|
||||
node_registry={
|
||||
"n1": SuccessNode(),
|
||||
"n2": SuccessNode(),
|
||||
},
|
||||
event_bus=event_bus,
|
||||
stream_id="test-stream",
|
||||
)
|
||||
|
||||
goal = Goal(id="g-ev", name="event-test", description="test events")
|
||||
result = await executor.execute(graph=graph, goal=goal)
|
||||
|
||||
assert result.success is True
|
||||
assert result.path == ["n1", "n2"]
|
||||
|
||||
# Should have 4 events: started/completed for n1, then started/completed for n2
|
||||
assert len(event_bus.events) == 4
|
||||
assert event_bus.events[0] == ("started", {"stream_id": "test-stream", "node_id": "n1"})
|
||||
assert event_bus.events[1] == (
|
||||
"completed",
|
||||
{"stream_id": "test-stream", "node_id": "n1", "iterations": 1},
|
||||
)
|
||||
assert event_bus.events[2] == ("started", {"stream_id": "test-stream", "node_id": "n2"})
|
||||
assert event_bus.events[3] == (
|
||||
"completed",
|
||||
{"stream_id": "test-stream", "node_id": "n2", "iterations": 1},
|
||||
)
|
||||
|
||||
|
||||
# ---- Fake event_loop node (registered, so executor won't emit for it) ----
|
||||
class FakeEventLoopNode:
|
||||
def validate_input(self, ctx):
|
||||
return []
|
||||
|
||||
async def execute(self, ctx):
|
||||
return NodeResult(success=True, output={"result": "loop-done"}, tokens_used=1, latency_ms=1)
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_executor_skips_events_for_event_loop_nodes():
|
||||
"""Executor should NOT emit events for event_loop nodes (they emit their own)."""
|
||||
runtime = DummyRuntime()
|
||||
event_bus = FakeEventBus()
|
||||
|
||||
graph = GraphSpec(
|
||||
id="graph-el",
|
||||
goal_id="g-el",
|
||||
nodes=[
|
||||
NodeSpec(
|
||||
id="el1",
|
||||
name="event-loop-node",
|
||||
description="event loop node",
|
||||
node_type="event_loop",
|
||||
input_keys=[],
|
||||
output_keys=["result"],
|
||||
max_retries=0,
|
||||
),
|
||||
],
|
||||
edges=[],
|
||||
entry_node="el1",
|
||||
)
|
||||
|
||||
executor = GraphExecutor(
|
||||
runtime=runtime,
|
||||
node_registry={"el1": FakeEventLoopNode()},
|
||||
event_bus=event_bus,
|
||||
stream_id="test-stream",
|
||||
)
|
||||
|
||||
goal = Goal(id="g-el", name="el-test", description="test event_loop guard")
|
||||
result = await executor.execute(graph=graph, goal=goal)
|
||||
|
||||
assert result.success is True
|
||||
# No events should have been emitted — event_loop nodes are skipped
|
||||
assert len(event_bus.events) == 0
|
||||
|
||||
|
||||
@pytest.mark.asyncio
|
||||
async def test_executor_no_events_without_event_bus():
|
||||
"""Executor should work fine without an event bus (backward compat)."""
|
||||
runtime = DummyRuntime()
|
||||
|
||||
graph = GraphSpec(
|
||||
id="graph-nobus",
|
||||
goal_id="g-nobus",
|
||||
nodes=[
|
||||
NodeSpec(
|
||||
id="n1",
|
||||
name="node1",
|
||||
description="test node",
|
||||
node_type="llm_generate",
|
||||
input_keys=[],
|
||||
output_keys=["result"],
|
||||
max_retries=0,
|
||||
)
|
||||
],
|
||||
edges=[],
|
||||
entry_node="n1",
|
||||
)
|
||||
|
||||
# No event_bus passed — should not crash
|
||||
executor = GraphExecutor(
|
||||
runtime=runtime,
|
||||
node_registry={"n1": SuccessNode()},
|
||||
)
|
||||
|
||||
goal = Goal(id="g-nobus", name="nobus-test", description="no event bus")
|
||||
result = await executor.execute(graph=graph, goal=goal)
|
||||
|
||||
assert result.success is True
|
||||
|
||||
@@ -0,0 +1,360 @@
"""
Test that ON_FAILURE edges are followed when a node fails after max retries.

Verifies the fix for Issue #3449 where the executor would immediately terminate
when max retries were exceeded, without checking for ON_FAILURE edges that could
route to error handler nodes.
"""

from unittest.mock import AsyncMock, MagicMock

import pytest

from framework.graph.edge import EdgeCondition, EdgeSpec, GraphSpec
from framework.graph.executor import GraphExecutor
from framework.graph.goal import Goal
from framework.graph.node import NodeContext, NodeProtocol, NodeResult, NodeSpec
from framework.runtime.core import Runtime


class AlwaysFailsNode(NodeProtocol):
    """A node that always fails."""

    def __init__(self):
        self.attempt_count = 0

    async def execute(self, ctx: NodeContext) -> NodeResult:
        self.attempt_count += 1
        return NodeResult(success=False, error=f"Permanent error (attempt {self.attempt_count})")


class FailureHandlerNode(NodeProtocol):
    """A node that handles failures from upstream nodes."""

    def __init__(self):
        self.executed = False
        self.execute_count = 0

    async def execute(self, ctx: NodeContext) -> NodeResult:
        self.executed = True
        self.execute_count += 1
        return NodeResult(
            success=True,
            output={"handled": True, "recovery": "graceful"},
        )


class SuccessNode(NodeProtocol):
    """A node that always succeeds with configurable output."""

    def __init__(self, output: dict | None = None):
        self.execute_count = 0
        self._output = output or {"result": "ok"}

    async def execute(self, ctx: NodeContext) -> NodeResult:
        self.execute_count += 1
        return NodeResult(success=True, output=self._output)


@pytest.fixture(autouse=True)
def fast_sleep(monkeypatch):
    """Mock asyncio.sleep to avoid real delays from exponential backoff."""
    monkeypatch.setattr("asyncio.sleep", AsyncMock())


@pytest.fixture
def runtime():
    """Create a mock Runtime for testing."""
    runtime = MagicMock(spec=Runtime)
    runtime.start_run = MagicMock(return_value="test_run_id")
    runtime.decide = MagicMock(return_value="test_decision_id")
    runtime.record_outcome = MagicMock()
    runtime.end_run = MagicMock()
    runtime.report_problem = MagicMock()
    runtime.set_node = MagicMock()
    return runtime


@pytest.fixture
def goal():
    return Goal(
        id="test_goal",
        name="Test Goal",
        description="Test ON_FAILURE edge routing",
    )


@pytest.mark.asyncio
async def test_on_failure_edge_followed_after_max_retries(runtime, goal):
    """
    When a node fails after exhausting max retries, ON_FAILURE edges should
    be followed to route execution to a failure handler node.
    """
    nodes = [
        NodeSpec(
            id="failing",
            name="Failing Node",
            description="Always fails",
            node_type="function",
            output_keys=[],
            max_retries=1,
        ),
        NodeSpec(
            id="handler",
            name="Failure Handler",
            description="Handles failures",
            node_type="function",
            output_keys=["handled", "recovery"],
        ),
    ]

    edges = [
        EdgeSpec(
            id="fail_to_handler",
            source="failing",
            target="handler",
            condition=EdgeCondition.ON_FAILURE,
        ),
    ]

    graph = GraphSpec(
        id="test_graph",
        goal_id="test_goal",
        name="Test Graph",
        entry_node="failing",
        nodes=nodes,
        edges=edges,
        terminal_nodes=["handler"],
    )

    executor = GraphExecutor(runtime=runtime)
    failing_node = AlwaysFailsNode()
    handler_node = FailureHandlerNode()
    executor.register_node("failing", failing_node)
    executor.register_node("handler", handler_node)

    result = await executor.execute(graph, goal, {})

    # The handler should have executed
    assert handler_node.executed, "Failure handler was not executed"
    assert handler_node.execute_count == 1

    # Overall execution should succeed (handler recovered)
    assert result.success
    # Handler node should appear in the execution path
    assert "handler" in result.path


@pytest.mark.asyncio
async def test_no_on_failure_edge_still_terminates(runtime, goal):
    """
    When a node fails after max retries and there is no ON_FAILURE edge,
    the executor should terminate with a failure result (original behavior).
    """
    nodes = [
        NodeSpec(
            id="failing",
            name="Failing Node",
            description="Always fails",
            node_type="function",
            output_keys=[],
            max_retries=1,
        ),
    ]

    graph = GraphSpec(
        id="test_graph",
        goal_id="test_goal",
        name="Test Graph",
        entry_node="failing",
        nodes=[nodes[0]],
        edges=[],
        terminal_nodes=["failing"],
    )

    executor = GraphExecutor(runtime=runtime)
    failing_node = AlwaysFailsNode()
    executor.register_node("failing", failing_node)

    result = await executor.execute(graph, goal, {})

    assert not result.success
    assert "failed after 1 attempts" in result.error


@pytest.mark.asyncio
async def test_on_failure_edge_not_followed_on_success(runtime, goal):
    """
    ON_FAILURE edges should NOT be followed when a node succeeds.
    Only ON_SUCCESS edges should fire.
    """
    nodes = [
        NodeSpec(
            id="working",
            name="Working Node",
            description="Always succeeds",
            node_type="function",
            output_keys=["result"],
        ),
        NodeSpec(
            id="handler",
            name="Failure Handler",
            description="Should not be reached",
            node_type="function",
            output_keys=["handled"],
        ),
        NodeSpec(
            id="next",
            name="Next Node",
            description="Normal successor",
            node_type="function",
            output_keys=["done"],
        ),
    ]

    edges = [
        EdgeSpec(
            id="on_fail",
            source="working",
            target="handler",
            condition=EdgeCondition.ON_FAILURE,
        ),
        EdgeSpec(
            id="on_success",
            source="working",
            target="next",
            condition=EdgeCondition.ON_SUCCESS,
        ),
    ]

    graph = GraphSpec(
        id="test_graph",
        goal_id="test_goal",
        name="Test Graph",
        entry_node="working",
        nodes=nodes,
        edges=edges,
        terminal_nodes=["handler", "next"],
    )

    executor = GraphExecutor(runtime=runtime)
    executor.register_node("working", SuccessNode(output={"result": "ok"}))
    handler_node = FailureHandlerNode()
    executor.register_node("handler", handler_node)
    executor.register_node("next", SuccessNode(output={"done": True}))

    result = await executor.execute(graph, goal, {})

    assert result.success
    assert not handler_node.executed, "Failure handler should not run on success"
    assert "next" in result.path, "Should follow ON_SUCCESS edge to 'next'"


@pytest.mark.asyncio
async def test_on_failure_edge_with_zero_retries(runtime, goal):
    """
    ON_FAILURE edges should work even when max_retries=0 (no retries allowed).
    The node fails once and immediately routes to the failure handler.
    """
    nodes = [
        NodeSpec(
            id="fragile",
            name="Fragile Node",
            description="Fails with no retries",
            node_type="function",
            output_keys=[],
            max_retries=0,
        ),
        NodeSpec(
            id="handler",
            name="Failure Handler",
            description="Handles failures",
            node_type="function",
            output_keys=["handled", "recovery"],
        ),
    ]

    edges = [
        EdgeSpec(
            id="fail_to_handler",
            source="fragile",
            target="handler",
            condition=EdgeCondition.ON_FAILURE,
        ),
    ]

    graph = GraphSpec(
        id="test_graph",
        goal_id="test_goal",
        name="Test Graph",
        entry_node="fragile",
        nodes=nodes,
        edges=edges,
        terminal_nodes=["handler"],
    )

    executor = GraphExecutor(runtime=runtime)
    failing_node = AlwaysFailsNode()
    handler_node = FailureHandlerNode()
    executor.register_node("fragile", failing_node)
    executor.register_node("handler", handler_node)

    result = await executor.execute(graph, goal, {})

    # Should route to handler after a single failure (no retries)
    assert failing_node.attempt_count == 1
    assert handler_node.executed
    assert result.success


@pytest.mark.asyncio
async def test_on_failure_handler_appears_in_path(runtime, goal):
    """
    The failure handler node should appear in the execution path.
    """
    nodes = [
        NodeSpec(
            id="failing",
            name="Failing Node",
            description="Always fails",
            node_type="function",
            output_keys=[],
            max_retries=1,
        ),
        NodeSpec(
            id="handler",
            name="Failure Handler",
            description="Handles failures",
            node_type="function",
            output_keys=["handled", "recovery"],
        ),
    ]

    edges = [
        EdgeSpec(
            id="fail_to_handler",
            source="failing",
            target="handler",
            condition=EdgeCondition.ON_FAILURE,
        ),
    ]

    graph = GraphSpec(
        id="test_graph",
        goal_id="test_goal",
        name="Test Graph",
        entry_node="failing",
        nodes=nodes,
        edges=edges,
        terminal_nodes=["handler"],
    )

    executor = GraphExecutor(runtime=runtime)
    executor.register_node("failing", AlwaysFailsNode())
    executor.register_node("handler", FailureHandlerNode())

    result = await executor.execute(graph, goal, {})

    assert "failing" in result.path
    assert "handler" in result.path
    assert result.node_visit_counts.get("handler") == 1
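
Taken together, these tests pin down one rule: after retries are exhausted, the executor must look for an ON_FAILURE edge before terminating. A hedged sketch of that control flow, not the actual executor source (`_find_edge`, `retries_exhausted`, and the surrounding names are illustrative):

```python
# Illustrative shape of the Issue #3449 fix, not the real executor code.
# _find_edge, node, attempts, retries_exhausted, and current_node are assumed names.
if not result.success and retries_exhausted:
    failure_edge = _find_edge(graph, source=node.id, condition=EdgeCondition.ON_FAILURE)
    if failure_edge is None:
        # Original behavior: terminate with a failure result.
        return ExecutionResult(
            success=False,
            error=f"Node '{node.id}' failed after {attempts} attempts",
        )
    current_node = failure_edge.target  # route to the failure handler instead
```
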
+2
-2
@@ -83,7 +83,7 @@ git clone https://github.com/adenhq/hive.git
cd hive

# Run the Python environment setup
./scripts/setup-python.sh
./quickstart.sh
```

This installs:

@@ -236,7 +236,7 @@ hive/

```bash
# One-time setup
./scripts/setup-python.sh
./quickstart.sh

# This installs:
# - framework package (core runtime)
+1
-4
@@ -9,14 +9,11 @@
  },
  "license": "Apache-2.0",
  "scripts": {
    "setup": "echo '⚠️ This npm setup is for the archived web application. For agent development, use: ./scripts/setup-python.sh' && bash scripts/setup.sh",
    "test:duplicates": "bun test scripts/auto-close-duplicates"
  },
  "devDependencies": {
    "@types/node": "^20.10.0",
    "tsx": "^4.7.0",
    "typescript": "^5.3.0",
    "yaml": "^2.3.0"
    "typescript": "^5.3.0"
  },
  "engines": {
    "node": ">=20.0.0",
@@ -1,180 +0,0 @@
/**
 * Environment Generator Script
 *
 * Reads config.yaml and generates .env files for each service.
 * This provides a single source of truth for configuration while
 * maintaining compatibility with standard .env file workflows.
 *
 * Usage: npx tsx scripts/generate-env.ts
 */

import { readFileSync, writeFileSync, existsSync } from 'fs';
import { parse } from 'yaml';
import { join, dirname } from 'path';
import { fileURLToPath } from 'url';

const __dirname = dirname(fileURLToPath(import.meta.url));
const PROJECT_ROOT = join(__dirname, '..');

interface Config {
  app: {
    name: string;
    environment: string;
    log_level: string;
  };
  server: {
    frontend: {
      port: number;
    };
    backend: {
      port: number;
      host: string;
    };
  };
  timescaledb: {
    url: string;
    port: number;
  };
  mongodb: {
    url: string;
    database: string;
    erp_database: string;
    port: number;
  };
  redis: {
    url: string;
    port: number;
  };
  auth: {
    jwt_secret: string;
    jwt_expires_in: string;
    passphrase: string;
  };
  npm: {
    token: string;
  };
  cors: {
    origin: string;
  };
  features: {
    registration: boolean;
    rate_limiting: boolean;
    request_logging: boolean;
    mcp_server: boolean;
  };
}

function loadConfig(): Config {
  const configPath = join(PROJECT_ROOT, 'config.yaml');

  if (!existsSync(configPath)) {
    console.error('Error: config.yaml not found.');
    console.error('Run: cp config.yaml.example config.yaml');
    process.exit(1);
  }

  const configContent = readFileSync(configPath, 'utf-8');
  return parse(configContent) as Config;
}

function generateRootEnv(config: Config): string {
  return `# Generated from config.yaml - do not edit directly
# Regenerate with: npm run generate:env

# Application
NODE_ENV=${config.app.environment}
APP_NAME=${config.app.name}
LOG_LEVEL=${config.app.log_level}

# Ports
FRONTEND_PORT=${config.server.frontend.port}
BACKEND_PORT=${config.server.backend.port}
TSDB_PORT=${config.timescaledb.port}
MONGODB_PORT=${config.mongodb.port}
REDIS_PORT=${config.redis.port}

# API URL for frontend
VITE_API_URL=http://localhost:${config.server.backend.port}

# MongoDB
MONGODB_DBNAME=${config.mongodb.database}
MONGODB_ERP_DBNAME=${config.mongodb.erp_database}

# Authentication
JWT_SECRET=${config.auth.jwt_secret}
PASSPHRASE=${config.auth.passphrase}

# NPM (for Docker builds with private packages)
NPM_TOKEN=${config.npm.token}

# CORS
CORS_ORIGIN=${config.cors.origin}
`;
}

function generateFrontendEnv(config: Config): string {
  return `# Generated from config.yaml - do not edit directly
# Regenerate with: npm run generate:env

VITE_API_URL=http://localhost:${config.server.backend.port}
VITE_APP_NAME=${config.app.name}
VITE_APP_ENV=${config.app.environment}
`;
}

function generateBackendEnv(config: Config): string {
  return `# Generated from config.yaml - do not edit directly
# Regenerate with: npm run generate:env

# Server
NODE_ENV=${config.app.environment}
PORT=${config.server.backend.port}

# Application
LOG_LEVEL=${config.app.log_level}

# TimescaleDB (PostgreSQL)
TSDB_PG_URL=${config.timescaledb.url}

# MongoDB
MONGODB_URL=${config.mongodb.url}
MONGODB_DBNAME=${config.mongodb.database}
MONGODB_ERP_DBNAME=${config.mongodb.erp_database}

# Redis
REDIS_URL=${config.redis.url}

# Authentication
JWT_SECRET=${config.auth.jwt_secret}
PASSPHRASE=${config.auth.passphrase}

# Features
FEATURE_MCP_SERVER=${config.features.mcp_server}
`;
}

function main() {
  console.log('Generating environment files from config.yaml...\n');

  const config = loadConfig();

  // Generate root .env (for docker-compose)
  const rootEnvPath = join(PROJECT_ROOT, '.env');
  writeFileSync(rootEnvPath, generateRootEnv(config));
  console.log(`✓ Generated ${rootEnvPath}`);

  // Generate frontend .env
  const frontendEnvPath = join(PROJECT_ROOT, 'honeycomb', '.env');
  writeFileSync(frontendEnvPath, generateFrontendEnv(config));
  console.log(`✓ Generated ${frontendEnvPath}`);

  // Generate backend .env
  const backendEnvPath = join(PROJECT_ROOT, 'hive', '.env');
  writeFileSync(backendEnvPath, generateBackendEnv(config));
  console.log(`✓ Generated ${backendEnvPath}`);

  console.log('\nDone! Environment files have been generated.');
  console.log('\nNote: These files are git-ignored. Regenerate after editing config.yaml.');
}

main();
@@ -1,251 +0,0 @@
<#

setup-python.ps1 - Python Environment Setup for Aden Agent Framework

This script sets up the Python environment with all required packages
for building and running goal-driven agents.
#>

$ErrorActionPreference = "Stop"

# Colors for output
$RED = "Red"
$GREEN = "Green"
$YELLOW = "Yellow"
$BLUE = "Cyan"

# Get the directory where this script is located
$SCRIPT_DIR = Split-Path -Parent $MyInvocation.MyCommand.Path
$PROJECT_ROOT = Split-Path -Parent $SCRIPT_DIR

Write-Host ""
Write-Host "=================================================="
Write-Host " Aden Agent Framework - Python Setup"
Write-Host "=================================================="
Write-Host ""

# Check for Python
$pythonCmd = $null
if (Get-Command python -ErrorAction SilentlyContinue) {
    $pythonCmd = "python"
}

if (-not $pythonCmd) {
    Write-Host "Error: Python is not installed." -ForegroundColor $RED
    Write-Host "Please install Python 3.11+ from https://python.org"
    exit 1
}

# Check Python version
$versionInfo = & $pythonCmd -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')"
$major = & $pythonCmd -c "import sys; print(sys.version_info.major)"
$minor = & $pythonCmd -c "import sys; print(sys.version_info.minor)"

Write-Host "Detected Python: $versionInfo" -ForegroundColor $BLUE

if ($major -lt 3 -or ($major -eq 3 -and $minor -lt 11)) {
    Write-Host "Error: Python 3.11+ is required (found $versionInfo)" -ForegroundColor $RED
    Write-Host "Please upgrade your Python installation"
    exit 1
}

if ($minor -lt 11) {
    Write-Host "Warning: Python 3.11+ is recommended for best compatibility" -ForegroundColor $YELLOW
    Write-Host "You have Python $versionInfo which may work but is not officially supported" -ForegroundColor $YELLOW
    Write-Host ""
}

Write-Host "[OK] Python version check passed" -ForegroundColor $GREEN
Write-Host ""

# Create and activate virtual environment
Write-Host "=================================================="
Write-Host "Setting up Python Virtual Environment"
Write-Host "=================================================="
Write-Host ""

$VENV_PATH = Join-Path $PROJECT_ROOT ".venv"
$VENV_PYTHON = Join-Path $VENV_PATH "Scripts\python.exe"
$VENV_ACTIVATE = Join-Path $VENV_PATH "Scripts\Activate.ps1"

if (-not (Test-Path $VENV_PYTHON)) {
    Write-Host "Creating virtual environment at .venv..."
    & $pythonCmd -m venv $VENV_PATH
    Write-Host "[OK] Virtual environment created" -ForegroundColor $GREEN
}
else {
    Write-Host "[OK] Virtual environment already exists" -ForegroundColor $GREEN
}

# Activate venv
Write-Host "Activating virtual environment..."
& $VENV_ACTIVATE
Write-Host "[OK] Virtual environment activated" -ForegroundColor $GREEN

# From here on, always use venv python
$pythonCmd = $VENV_PYTHON

Write-Host ""

# Check for pip
try {
    & $pythonCmd -m pip --version | Out-Null
}
catch {
    Write-Host "Error: pip is not installed" -ForegroundColor $RED
    Write-Host "Please install pip for Python $versionInfo"
    exit 1
}

Write-Host "[OK] pip detected" -ForegroundColor $GREEN
Write-Host ""

# Upgrade pip, setuptools, and wheel
Write-Host "Upgrading pip, setuptools, and wheel..."
& $pythonCmd -m pip install --upgrade pip setuptools wheel
Write-Host "[OK] Core packages upgraded" -ForegroundColor $GREEN
Write-Host ""

# Install core framework package
Write-Host "=================================================="
Write-Host "Installing Core Framework Package"
Write-Host "=================================================="
Write-Host ""

Set-Location "$PROJECT_ROOT\core"

if (Test-Path "pyproject.toml") {
    Write-Host "Installing framework from core/ (editable mode)..."
    & $pythonCmd -m pip install -e . | Out-Null
    Write-Host "[OK] Framework package installed" -ForegroundColor $GREEN
}
else {
    Write-Host "[WARN] No pyproject.toml found in core/, skipping framework installation" -ForegroundColor $YELLOW
}

Write-Host ""

# Install tools package
Write-Host "=================================================="
Write-Host "Installing Tools Package (aden_tools)"
Write-Host "=================================================="
Write-Host ""

Set-Location "$PROJECT_ROOT\tools"

if (Test-Path "pyproject.toml") {
    Write-Host "Installing aden_tools from tools/ (editable mode)..."
    & $pythonCmd -m pip install -e . | Out-Null
    Write-Host "[OK] Tools package installed" -ForegroundColor $GREEN
}
else {
    Write-Host "Error: No pyproject.toml found in tools/" -ForegroundColor $RED
    exit 1
}

Write-Host ""

# Fix openai version compatibility with litellm
Write-Host "=================================================="
Write-Host "Fixing Package Compatibility"
Write-Host "=================================================="
Write-Host ""

try {
    $openaiVersion = & $pythonCmd -c "import openai; print(openai.__version__)"
}
catch {
    $openaiVersion = "not_installed"
}

if ($openaiVersion -eq "not_installed") {
    Write-Host "Installing openai package..."
    & $pythonCmd -m pip install "openai>=1.0.0" | Out-Null
    Write-Host "[OK] openai package installed" -ForegroundColor $GREEN
}
elseif ($openaiVersion.StartsWith("0.")) {
    Write-Host "Found old openai version: $openaiVersion" -ForegroundColor $YELLOW
    Write-Host "Upgrading to openai 1.x+ for litellm compatibility..."
    & $pythonCmd -m pip install --upgrade "openai>=1.0.0" | Out-Null
    $openaiVersion = & $pythonCmd -c "import openai; print(openai.__version__)"
    Write-Host "[OK] openai upgraded to $openaiVersion" -ForegroundColor $GREEN
}
else {
    Write-Host "[OK] openai $openaiVersion is compatible" -ForegroundColor $GREEN
}

Write-Host ""

# Verify installations
Write-Host "=================================================="
Write-Host "Verifying Installation"
Write-Host "=================================================="
Write-Host ""

Set-Location $PROJECT_ROOT

# Test framework import
& $pythonCmd -c "import framework" 2>$null
if ($LASTEXITCODE -eq 0) {
    Write-Host "[OK] framework package imports successfully" -ForegroundColor Green
}
else {
    Write-Host "[FAIL] framework package import failed" -ForegroundColor Red
}

# Test aden_tools import
& $pythonCmd -c "import aden_tools" 2>$null
if ($LASTEXITCODE -eq 0) {
    Write-Host "[OK] aden_tools package imports successfully" -ForegroundColor Green
}
else {
    Write-Host "[FAIL] aden_tools package import failed" -ForegroundColor Red
    exit 1
}

# Test litellm
& $pythonCmd -c "import litellm" 2>$null
if ($LASTEXITCODE -eq 0) {
    Write-Host "[OK] litellm package imports successfully" -ForegroundColor $GREEN
}
else {
    Write-Host "[WARN] litellm import had issues (may be OK if not using LLM features)" -ForegroundColor $YELLOW
}

Write-Host ""

# Print agent commands
Write-Host "=================================================="
Write-Host " Setup Complete!"
Write-Host "=================================================="
Write-Host ""
Write-Host "Python packages installed:"
Write-Host " - framework (core agent runtime)"
Write-Host " - aden_tools (tools and MCP servers)"
Write-Host " - All dependencies and compatibility fixes applied"
Write-Host ""
Write-Host "To run agents on Windows (PowerShell):"
Write-Host ""
Write-Host "1. From the project root, set PYTHONPATH:"
Write-Host " `$env:PYTHONPATH=`"exports`""
Write-Host ""
Write-Host "2. Run an agent command:"
Write-Host " uv run python -m agent_name validate"
Write-Host " uv run python -m agent_name info"
Write-Host " uv run python -m agent_name run --input '{...}'"
Write-Host ""
Write-Host "Example (support_ticket_agent):"
Write-Host " uv run python -m support_ticket_agent validate"
Write-Host " uv run python -m support_ticket_agent info"
Write-Host " uv run python -m support_ticket_agent run --input '{""ticket_content"":""..."",""customer_id"":""..."",""ticket_id"":""...""}'"
Write-Host ""
Write-Host "Notes:"
Write-Host " - Ensure the virtual environment is activated (.venv)"
Write-Host " - PYTHONPATH must be set in each new PowerShell session"
Write-Host ""
Write-Host "Documentation:"
Write-Host " $PROJECT_ROOT\README.md"
Write-Host ""
Write-Host "Agent Examples:"
Write-Host " $PROJECT_ROOT\exports\"
Write-Host ""
@@ -1,308 +0,0 @@
#!/bin/bash
#
# setup-python.sh - Python Environment Setup for Aden Agent Framework
#
# DEPRECATED: Use ./quickstart.sh instead. It does everything this script
# does plus verifies MCP configuration, Claude Code skills, and API keys.
#
# This script is kept for CI/headless environments where the extra
# verification steps in quickstart.sh are not needed.
#

set -e

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Get the directory where this script is located
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"

# Python Version
REQUIRED_PYTHON_VERSION="3.11"

# Python version split into Major and Minor
IFS='.' read -r PYTHON_MAJOR_VERSION PYTHON_MINOR_VERSION <<< "$REQUIRED_PYTHON_VERSION"

# Candidate python interpreters (tried in sequence)
POSSIBLE_PYTHONS=("python3" "python" "py")

# Selected python interpreter (initialized empty)
PYTHON_CMD=()


echo ""
echo "=================================================="
echo " Aden Agent Framework - Python Setup"
echo "=================================================="
echo ""
echo -e "${YELLOW}NOTE: Consider using ./quickstart.sh instead for a complete setup.${NC}"
echo ""

# Find an available Python interpreter
for cmd in "${POSSIBLE_PYTHONS[@]}"; do
    # Check for python interpreter
    if command -v "$cmd" >/dev/null 2>&1; then

        # Specific check for Windows 'py' launcher
        if [ "$cmd" = "py" ]; then
            CURRENT_CMD=(py -3)
        else
            CURRENT_CMD=("$cmd")
        fi

        # Check Python version
        if "${CURRENT_CMD[@]}" -c "import sys; sys.exit(0 if sys.version_info >= ($PYTHON_MAJOR_VERSION, $PYTHON_MINOR_VERSION) else 1)" >/dev/null 2>&1; then
            echo -e "${GREEN}✓${NC} interpreter detected: ${CURRENT_CMD[*]}"
            # Check for pip
            if "${CURRENT_CMD[@]}" -m pip --version >/dev/null 2>&1; then
                PYTHON_CMD=("${CURRENT_CMD[@]}")
                echo -e "${GREEN}✓${NC} pip detected"
                echo ""
                break
            else
                echo -e "${RED}✗${NC} pip not found"
                echo ""
            fi
        else
            echo -e "${RED}✗${NC} ${CURRENT_CMD[*]} not found"
            echo ""
        fi
    fi
done

# Display error message if python not found
if [ "${#PYTHON_CMD[@]}" -eq 0 ]; then
    echo -e "${RED}Error:${NC} No suitable Python interpreter found with pip installed."
    echo ""
    echo "Requirements:"
    echo " • Python $PYTHON_MAJOR_VERSION.$PYTHON_MINOR_VERSION+"
    echo " • pip installed"
    echo ""
    echo "Tried the following commands:"
    echo " ${POSSIBLE_PYTHONS[*]}"
    echo ""
    echo "Please install Python from:"
    echo " https://www.python.org/downloads/"
    exit 1
fi

# Display Python version
PYTHON_VERSION=$("${PYTHON_CMD[@]}" -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')
echo -e "${BLUE}Detected Python:${NC} $PYTHON_VERSION"
echo -e "${GREEN}✓${NC} Python version check passed"
echo ""

# Check for uv
if ! command -v uv &> /dev/null; then
    echo -e "${RED}Error: uv is not installed${NC}"
    echo "Please install uv from https://github.com/astral-sh/uv"
    exit 1
fi

echo -e "${GREEN}✓${NC} uv detected"
echo ""

# Install core framework package
echo "=================================================="
echo "Installing Core Framework Package"
echo "=================================================="
echo ""
cd "$PROJECT_ROOT/core"

# Create venv if it doesn't exist
if [ ! -d ".venv" ]; then
    echo "Creating virtual environment in core/.venv..."
    uv venv
    echo -e "${GREEN}✓${NC} Virtual environment created"
else
    echo -e "${GREEN}✓${NC} Virtual environment already exists"
fi
echo ""

if [ -f "pyproject.toml" ]; then
    echo "Installing framework from core/ (editable mode)..."
    CORE_PYTHON=".venv/bin/python"
    if uv pip install --python "$CORE_PYTHON" -e .; then
        echo -e "${GREEN}✓${NC} Framework package installed"
    else
        echo -e "${YELLOW}⚠${NC} Framework installation encountered issues (may be OK if already installed)"
    fi
else
    echo -e "${YELLOW}⚠${NC} No pyproject.toml found in core/, skipping framework installation"
fi
echo ""

# Install tools package
echo "=================================================="
echo "Installing Tools Package (aden_tools)"
echo "=================================================="
echo ""
cd "$PROJECT_ROOT/tools"

# Create venv if it doesn't exist
if [ ! -d ".venv" ]; then
    echo "Creating virtual environment in tools/.venv..."
    uv venv
    echo -e "${GREEN}✓${NC} Virtual environment created"
else
    echo -e "${GREEN}✓${NC} Virtual environment already exists"
fi
echo ""

if [ -f "pyproject.toml" ]; then
    echo "Installing aden_tools from tools/ (editable mode)..."
    TOOLS_PYTHON=".venv/bin/python"
    if uv pip install --python "$TOOLS_PYTHON" -e .; then
        echo -e "${GREEN}✓${NC} Tools package installed"
    else
        echo -e "${RED}✗${NC} Tools installation failed"
        exit 1
    fi
else
    echo -e "${RED}Error: No pyproject.toml found in tools/${NC}"
    exit 1
fi
echo ""

# Install Playwright browser for web scraping
echo "=================================================="
echo "Installing Playwright Browser"
echo "=================================================="
echo ""

if $PYTHON_CMD -c "import playwright" > /dev/null 2>&1; then
    echo "Installing Chromium browser for web scraping..."
    if $PYTHON_CMD -m playwright install chromium > /dev/null 2>&1; then
        echo -e "${GREEN}✓${NC} Playwright Chromium installed"
    else
        echo -e "${YELLOW}⚠${NC} Playwright browser install failed (web_scrape tool may not work)"
        echo " Run manually: uv run python -m playwright install chromium"
    fi
else
    echo -e "${YELLOW}⚠${NC} Playwright not found, skipping browser install"
fi
echo ""

# Fix openai version compatibility with litellm
echo "=================================================="
echo "Fixing Package Compatibility"
echo "=================================================="
echo ""

TOOLS_PYTHON="$PROJECT_ROOT/tools/.venv/bin/python"

# Check openai version in tools venv
OPENAI_VERSION=$($TOOLS_PYTHON -c "import openai; print(openai.__version__)" 2>/dev/null || echo "not_installed")

if [ "$OPENAI_VERSION" = "not_installed" ]; then
    echo "Installing openai package..."
    uv pip install --python "$TOOLS_PYTHON" "openai>=1.0.0"
    echo -e "${GREEN}✓${NC} openai package installed"
elif [[ "$OPENAI_VERSION" =~ ^0\. ]]; then
    echo -e "${YELLOW}Found old openai version: $OPENAI_VERSION${NC}"
    echo "Upgrading to openai 1.x+ for litellm compatibility..."
    uv pip install --python "$TOOLS_PYTHON" --upgrade "openai>=1.0.0"
    OPENAI_VERSION=$($TOOLS_PYTHON -c "import openai; print(openai.__version__)" 2>/dev/null)
    echo -e "${GREEN}✓${NC} openai upgraded to $OPENAI_VERSION"
else
    echo -e "${GREEN}✓${NC} openai $OPENAI_VERSION is compatible"
fi
echo ""

# Ensure exports directory exists
echo "=================================================="
echo "Checking Directory Structure"
echo "=================================================="
echo ""

if [ ! -d "$PROJECT_ROOT/exports" ]; then
    echo "Creating exports directory..."
    mkdir -p "$PROJECT_ROOT/exports"
    echo "# Agent Exports" > "$PROJECT_ROOT/exports/README.md"
    echo "" >> "$PROJECT_ROOT/exports/README.md"
    echo "This directory is the default location for generated agent packages." >> "$PROJECT_ROOT/exports/README.md"
    echo -e "${GREEN}✓${NC} Created exports directory"
else
    echo -e "${GREEN}✓${NC} exports directory exists"
fi
echo ""

# Verify installations
echo "=================================================="
echo "Verifying Installation"
echo "=================================================="
echo ""

cd "$PROJECT_ROOT"

# Test framework import using core venv
CORE_PYTHON="$PROJECT_ROOT/core/.venv/bin/python"
if [ -f "$CORE_PYTHON" ]; then
    if $CORE_PYTHON -c "import framework; print('framework OK')" > /dev/null 2>&1; then
        echo -e "${GREEN}✓${NC} framework package imports successfully"
    else
        echo -e "${RED}✗${NC} framework package import failed"
        echo -e "${YELLOW} Note: This may be OK if you don't need the framework${NC}"
    fi
else
    echo -e "${RED}✗${NC} core/.venv not found - venv creation may have failed"
    exit 1
fi

# Test aden_tools import using tools venv
TOOLS_PYTHON="$PROJECT_ROOT/tools/.venv/bin/python"
if [ -f "$TOOLS_PYTHON" ]; then
    if $TOOLS_PYTHON -c "import aden_tools; print('aden_tools OK')" > /dev/null 2>&1; then
        echo -e "${GREEN}✓${NC} aden_tools package imports successfully"
    else
        echo -e "${RED}✗${NC} aden_tools package import failed"
        exit 1
    fi
else
    echo -e "${RED}✗${NC} tools/.venv not found - venv creation may have failed"
    exit 1
fi

# Test litellm + openai compatibility using tools venv
if $TOOLS_PYTHON -c "import litellm; print('litellm OK')" > /dev/null 2>&1; then
    echo -e "${GREEN}✓${NC} litellm package imports successfully"
else
    echo -e "${YELLOW}⚠${NC} litellm import had issues (may be OK if not using LLM features)"
fi

echo ""

# Print agent commands
echo "=================================================="
echo " Setup Complete!"
echo "=================================================="
echo ""
echo "Python packages installed:"
echo " • framework (core agent runtime)"
echo " • aden_tools (tools and MCP servers)"
echo " • All dependencies and compatibility fixes applied"
echo ""
echo "To run agents, use:"
echo ""
echo " ${BLUE}# From project root:${NC}"
echo " PYTHONPATH=exports uv run python -m agent_name validate"
echo " PYTHONPATH=exports uv run python -m agent_name info"
echo " PYTHONPATH=exports uv run python -m agent_name run --input '{...}'"
echo ""
echo "Available commands for your new agent:"
echo " PYTHONPATH=exports uv run python -m support_ticket_agent validate"
echo " PYTHONPATH=exports uv run python -m support_ticket_agent info"
echo " PYTHONPATH=exports uv run python -m support_ticket_agent run --input '{\"ticket_content\":\"...\",\"customer_id\":\"...\",\"ticket_id\":\"...\"}'"
echo ""
echo "To build new agents, use Claude Code skills:"
echo " • /building-agents - Build a new agent"
echo " • /testing-agent - Test an existing agent"
echo ""
echo "Documentation: ${PROJECT_ROOT}/README.md"
echo "Agent Examples: ${PROJECT_ROOT}/exports/"
echo ""
@@ -1,79 +0,0 @@
#!/bin/bash
# Legacy Web Application Setup Script
# NOTE: This script is for the archived honeycomb/hive web application.
# For agent development, use: ./quickstart.sh

set -e

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"

echo "==================================="
echo " Legacy Web App Setup (Archived)"
echo "==================================="
echo ""
echo "⚠️  This script is for the archived web application."
echo "   For agent development, use: ./quickstart.sh"
echo ""

# Check for Node.js
if ! command -v node &> /dev/null; then
    echo "Error: Node.js is not installed."
    echo "Please install Node.js 20+ from https://nodejs.org"
    exit 1
fi

NODE_VERSION=$(node -v | cut -d'v' -f2 | cut -d'.' -f1)
if [ "$NODE_VERSION" -lt 20 ]; then
    echo "Error: Node.js 20+ is required (found v$NODE_VERSION)"
    exit 1
fi

echo "✓ Node.js $(node -v) detected"

# Check for Docker (optional)
if command -v docker &> /dev/null; then
    echo "✓ Docker $(docker --version | cut -d' ' -f3 | tr -d ',') detected"
else
    echo "⚠ Docker not found (optional, needed for containerized deployment)"
fi

echo ""

# Create config.yaml if it doesn't exist
if [ ! -f "$PROJECT_ROOT/config.yaml" ]; then
    echo "Creating config.yaml from template..."
    cp "$PROJECT_ROOT/config.yaml.example" "$PROJECT_ROOT/config.yaml"
    echo "✓ Created config.yaml"
    echo ""
    echo " Please review and edit config.yaml with your settings."
    echo ""
else
    echo "✓ config.yaml already exists"
fi

# Install dependencies
echo ""
echo "Installing dependencies..."
cd "$PROJECT_ROOT"
npm install
echo "✓ Dependencies installed"

# Generate environment files
echo ""
echo "Generating environment files from config.yaml..."
npx tsx scripts/generate-env.ts
echo "✓ Environment files generated"

echo ""
echo "==================================="
echo " Setup Complete (Legacy)"
echo "==================================="
echo ""
echo "⚠️  NOTE: The honeycomb/hive web application has been archived."
echo ""
echo "For agent development, please use:"
echo " ./quickstart.sh"
echo ""
echo "See ENVIRONMENT_SETUP.md for complete agent development guide."
echo ""
@@ -26,6 +26,7 @@ from .email_tool import register_tools as register_email
from .example_tool import register_tools as register_example
from .file_system_toolkits.apply_diff import register_tools as register_apply_diff
from .file_system_toolkits.apply_patch import register_tools as register_apply_patch
from .file_system_toolkits.data_tools import register_tools as register_data_tools
from .file_system_toolkits.execute_command_tool import (
    register_tools as register_execute_command,
)
@@ -82,6 +83,7 @@ def register_all_tools(
    register_apply_patch(mcp)
    register_grep_search(mcp)
    register_execute_command(mcp)
    register_data_tools(mcp)
    register_csv(mcp)

    return [
@@ -97,6 +99,9 @@ def register_all_tools(
        "apply_patch",
        "grep_search",
        "execute_command_tool",
        "load_data",
        "save_data",
        "list_data_files",
        "csv_read",
        "csv_write",
        "csv_append",
@@ -0,0 +1,3 @@
from .data_tools import register_tools

__all__ = ["register_tools"]
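
These exports ride the same FastMCP registration path as the other toolkits. A minimal sketch of serving the data tools standalone (the server name and the absolute import path are illustrative, not taken from the repository):

```python
# Standalone MCP server for the data tools. The "data-tools" name and the
# aden_tools import path are assumptions for this sketch.
from mcp.server.fastmcp import FastMCP

from aden_tools.file_system_toolkits.data_tools import register_tools

mcp = FastMCP("data-tools")
register_tools(mcp)

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```
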
@@ -0,0 +1,179 @@
"""
Data Tools - Load, save, and list data files for agent pipelines.

These tools let agents store large intermediate results in files and
retrieve them with pagination, keeping the LLM conversation context small.
Used in conjunction with the spillover system: when a tool result is too
large, the framework writes it to a file and the agent can load it back
with load_data().
"""

from __future__ import annotations

import json
from pathlib import Path

from mcp.server.fastmcp import FastMCP


def register_tools(mcp: FastMCP) -> None:
    """Register data management tools with the MCP server."""

    @mcp.tool()
    def save_data(filename: str, data: str, data_dir: str) -> dict:
        """
        Purpose
            Save data to a file for later retrieval by this or downstream nodes.

        When to use
            Store large results (search results, profiles, analysis) instead
            of passing them inline through set_output.
            Returns a brief summary with the filename to reference later.

        Rules & Constraints
            filename must be a simple name like 'results.json' — no paths or '..'
            data_dir must be the absolute path to the data directory

        Args:
            filename: Simple filename like 'github_users.json'. No paths or '..'.
            data: The string data to write (typically JSON).
            data_dir: Absolute path to the data directory.

        Returns:
            Dict with success status and file metadata, or error dict
        """
        if not filename or ".." in filename or "/" in filename or "\\" in filename:
            return {"error": "Invalid filename. Use simple names like 'users.json'"}
        if not data_dir:
            return {"error": "data_dir is required"}

        try:
            dir_path = Path(data_dir)
            dir_path.mkdir(parents=True, exist_ok=True)
            path = dir_path / filename
            path.write_text(data, encoding="utf-8")
            lines = data.count("\n") + 1
            return {
                "success": True,
                "filename": filename,
                "size_bytes": len(data.encode("utf-8")),
                "lines": lines,
                "preview": data[:200] + ("..." if len(data) > 200 else ""),
            }
        except Exception as e:
            return {"error": f"Failed to save data: {str(e)}"}

    @mcp.tool()
    def load_data(
        filename: str,
        data_dir: str,
        offset: int = 0,
        limit: int = 50,
    ) -> dict:
        """
        Purpose
            Load data from a previously saved file with pagination.

        When to use
            Retrieve large tool results that were spilled to disk.
            Read data saved by save_data or by the spillover system.
            Page through large files without loading everything into context.

        Rules & Constraints
            filename must match a file in data_dir
            Returns a page of lines with metadata about the full file

        Args:
            filename: The filename to load (as shown in spillover messages or save_data results).
            data_dir: Absolute path to the data directory.
            offset: 0-based line number to start reading from. Default 0.
            limit: Max number of lines to return. Default 50.

        Returns:
            Dict with content, pagination info, and metadata

        Examples:
            load_data('users.json', '/path/to/data')  # first 50 lines
            load_data('users.json', '/path/to/data', offset=50, limit=50)  # next 50
            load_data('users.json', '/path/to/data', limit=200)  # first 200 lines
        """
        if not filename or ".." in filename or "/" in filename or "\\" in filename:
            return {"error": "Invalid filename"}
        if not data_dir:
            return {"error": "data_dir is required"}

        try:
            offset = int(offset)
            limit = int(limit)
            path = Path(data_dir) / filename
            if not path.exists():
                return {"error": f"File not found: {filename}"}

            content = path.read_text(encoding="utf-8")
            size_bytes = len(content.encode("utf-8"))

            # If content is a single long line, try to pretty-print JSON so
            # line-based pagination actually works.
            all_lines = content.split("\n")
            if len(all_lines) <= 2 and size_bytes > 500:
                try:
                    parsed = json.loads(content)
                    content = json.dumps(parsed, indent=2, ensure_ascii=False)
                    all_lines = content.split("\n")
                except (json.JSONDecodeError, TypeError, ValueError):
                    pass

            total = len(all_lines)
            start = min(offset, total)
            end = min(start + limit, total)
            sliced = all_lines[start:end]

            return {
                "success": True,
                "filename": filename,
                "content": "\n".join(sliced),
                "total_lines": total,
                "size_bytes": size_bytes,
                "offset": start,
                "lines_returned": len(sliced),
                "has_more": end < total,
            }
        except Exception as e:
            return {"error": f"Failed to load data: {str(e)}"}

    @mcp.tool()
    def list_data_files(data_dir: str) -> dict:
        """
        Purpose
            List all data files in the data directory.

        When to use
            Discover what intermediate results or spillover files are available.
            Check what data was saved by previous nodes in the pipeline.

        Args:
            data_dir: Absolute path to the data directory.

        Returns:
            Dict with list of files and their sizes
        """
        if not data_dir:
            return {"error": "data_dir is required"}

        try:
            dir_path = Path(data_dir)
            if not dir_path.exists():
                return {"files": []}

            files = []
            for f in sorted(dir_path.iterdir()):
                if f.is_file():
                    files.append(
                        {
                            "filename": f.name,
                            "size_bytes": f.stat().st_size,
                        }
                    )
            return {"files": files}
        except Exception as e:
            return {"error": f"Failed to list data files: {str(e)}"}
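
In practice a node pairs these tools: save one large result, then page it back in slices, as in the sketch below. Agents invoke them as MCP tools; they are written here as direct calls for brevity, and the path and payload are invented for the example:

```python
# Illustrative round trip through the data tools above. "/tmp/agent-data"
# and the results payload are invented for this example.
summary = save_data("users.json", json.dumps(results), data_dir="/tmp/agent-data")
# -> {"success": True, "filename": "users.json", "size_bytes": ..., "preview": "..."}

page = load_data("users.json", data_dir="/tmp/agent-data", offset=0, limit=50)
while page.get("has_more"):
    page = load_data(
        "users.json",
        data_dir="/tmp/agent-data",
        offset=page["offset"] + page["lines_returned"],
        limit=50,
    )
```
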
@@ -11,6 +11,7 @@ Auto-detection: If provider="auto", tries Brave first (backward compatible), the
from __future__ import annotations

import os
import time
from typing import TYPE_CHECKING, Literal

import httpx
@@ -35,27 +36,35 @@ def register_tools(
        cse_id: str,
    ) -> dict:
        """Execute search using Google Custom Search API."""
        response = httpx.get(
            "https://www.googleapis.com/customsearch/v1",
            params={
                "key": api_key,
                "cx": cse_id,
                "q": query,
                "num": min(num_results, 10),
                "lr": f"lang_{language}",
                "gl": country,
            },
            timeout=30.0,
        )
        max_retries = 3
        for attempt in range(max_retries + 1):
            response = httpx.get(
                "https://www.googleapis.com/customsearch/v1",
                params={
                    "key": api_key,
                    "cx": cse_id,
                    "q": query,
                    "num": min(num_results, 10),
                    "lr": f"lang_{language}",
                    "gl": country,
                },
                timeout=30.0,
            )

        if response.status_code == 401:
            return {"error": "Invalid Google API key"}
        elif response.status_code == 403:
            return {"error": "Google API key not authorized or quota exceeded"}
        elif response.status_code == 429:
            return {"error": "Google rate limit exceeded. Try again later."}
        elif response.status_code != 200:
            return {"error": f"Google API request failed: HTTP {response.status_code}"}
            if response.status_code == 429 and attempt < max_retries:
                time.sleep(2**attempt)
                continue

            if response.status_code == 401:
                return {"error": "Invalid Google API key"}
            elif response.status_code == 403:
                return {"error": "Google API key not authorized or quota exceeded"}
            elif response.status_code == 429:
                return {"error": "Google rate limit exceeded. Try again later."}
            elif response.status_code != 200:
                return {"error": f"Google API request failed: HTTP {response.status_code}"}

            break

        data = response.json()
        results = []
@@ -82,26 +91,34 @@ def register_tools(
        api_key: str,
    ) -> dict:
        """Execute search using Brave Search API."""
        response = httpx.get(
            "https://api.search.brave.com/res/v1/web/search",
            params={
                "q": query,
                "count": min(num_results, 20),
                "country": country,
            },
            headers={
                "X-Subscription-Token": api_key,
                "Accept": "application/json",
            },
            timeout=30.0,
        )
        max_retries = 3
        for attempt in range(max_retries + 1):
            response = httpx.get(
                "https://api.search.brave.com/res/v1/web/search",
                params={
                    "q": query,
                    "count": min(num_results, 20),
                    "country": country,
                },
                headers={
                    "X-Subscription-Token": api_key,
                    "Accept": "application/json",
                },
                timeout=30.0,
            )

        if response.status_code == 401:
            return {"error": "Invalid Brave API key"}
        elif response.status_code == 429:
            return {"error": "Brave rate limit exceeded. Try again later."}
        elif response.status_code != 200:
            return {"error": f"Brave API request failed: HTTP {response.status_code}"}
            if response.status_code == 429 and attempt < max_retries:
                time.sleep(2**attempt)
                continue

            if response.status_code == 401:
                return {"error": "Invalid Brave API key"}
            elif response.status_code == 429:
                return {"error": "Brave rate limit exceeded. Try again later."}
            elif response.status_code != 200:
                return {"error": f"Brave API request failed: HTTP {response.status_code}"}

            break

        data = response.json()
        results = []
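
Both providers now share the same retry shape: up to three retries on HTTP 429 with exponential backoff (`2**attempt` seconds, i.e. 1 s, 2 s, 4 s), while every other status code is handled on the first pass. The pattern, condensed:

```python
# Condensed form of the retry loop added to both search backends above.
max_retries = 3
for attempt in range(max_retries + 1):
    response = httpx.get(url, params=params, timeout=30.0)
    if response.status_code == 429 and attempt < max_retries:
        time.sleep(2**attempt)  # 1s, 2s, 4s between tries
        continue
    break  # success or a non-retryable error; handled after the loop
```
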