Merge remote-tracking branch 'origin/main' into feature/quickstart-credential-store

This commit is contained in:
Timothy
2026-02-04 20:03:44 -08:00
50 changed files with 13098 additions and 1708 deletions
@@ -267,7 +267,7 @@ This returns JSON with the goal, nodes, edges, and MCP server configurations
- NOT: `{"first-node-id": ["input_keys"]}` (WRONG)
- NOT: `{"first-node-id"}` (WRONG - this is a set)
**Use the example agent** at `.claude/skills/building-agents-construction/examples/online_research_agent/` as a template for file structure and patterns.
**Use the example agent** at `.claude/skills/building-agents-construction/examples/deep_research_agent/` as a template for file structure and patterns. It demonstrates: STEP 1/STEP 2 prompts, client-facing nodes, feedback loops, nullable_output_keys, and data tools.
**AFTER writing all files, tell the user:**
@@ -354,7 +354,7 @@ mcp__agent-builder__get_session_status()
## REFERENCE: System Prompt Best Practice
For event_loop nodes, instruct the LLM to use `set_output` for structured outputs:
For **internal** event_loop nodes (not client-facing), instruct the LLM to use `set_output`:
```
Use set_output(key, value) to store your results. For example:
@@ -363,71 +363,55 @@ Use set_output(key, value) to store your results. For example:
Do NOT return raw JSON. Use the set_output tool to produce outputs.
```
For **client-facing** event_loop nodes, use the STEP 1/STEP 2 pattern:
```
**STEP 1 — Respond to the user (text only, NO tool calls):**
[Present information, ask questions, etc.]
**STEP 2 — After the user responds, call set_output:**
- set_output("key", "value based on user's response")
```
This prevents the LLM from calling `set_output` before the user has had a chance to respond. The "NO tool calls" instruction in STEP 1 ensures the node blocks for user input before proceeding.
---
## CRITICAL: EventLoopNode Registration
## EventLoopNode Runtime
**`AgentRuntime` does NOT support `event_loop` nodes.** The `AgentRuntime` / `create_agent_runtime()` path creates `GraphExecutor` instances internally without passing a `node_registry`, causing all `event_loop` nodes to fail at runtime with:
```
EventLoopNode 'node-id' not found in registry. Register it with executor.register_node() before execution.
```
**The correct pattern**: Use `GraphExecutor` directly with a `node_registry` dict containing `EventLoopNode` instances:
EventLoopNodes are **auto-created** by `GraphExecutor` at runtime. Both direct `GraphExecutor` and `AgentRuntime` / `create_agent_runtime()` handle event_loop nodes automatically. No manual `node_registry` setup is needed.
```python
from framework.graph.executor import GraphExecutor, ExecutionResult
from framework.graph.event_loop_node import EventLoopNode, LoopConfig
from framework.runtime.event_bus import EventBus
from framework.runtime.core import Runtime # REQUIRED - executor calls runtime.start_run()
# Direct execution
from framework.graph.executor import GraphExecutor
from framework.runtime.core import Runtime
# 1. Build node_registry with EventLoopNode instances
event_bus = EventBus()
node_registry = {}
for node_spec in nodes:
if node_spec.node_type == "event_loop":
node_registry[node_spec.id] = EventLoopNode(
event_bus=event_bus,
judge=None, # implicit judge: accepts when output_keys are filled
config=LoopConfig(
max_iterations=50,
max_tool_calls_per_turn=15,
stall_detection_threshold=3,
max_history_tokens=32000,
),
tool_executor=tool_executor,
)
# 2. Create Runtime for run tracking (GraphExecutor calls runtime.start_run())
storage_path = Path.home() / ".hive" / "my_agent"
storage_path.mkdir(parents=True, exist_ok=True)
runtime = Runtime(storage_path)
# 3. Create GraphExecutor WITH node_registry and runtime
executor = GraphExecutor(
runtime=runtime, # NOT None - executor needs this for run tracking
runtime=runtime,
llm=llm,
tools=tools,
tool_executor=tool_executor,
node_registry=node_registry, # EventLoopNode instances
storage_path=storage_path,
)
# 4. Execute
result = await executor.execute(graph=graph, goal=goal, input_data=input_data)
```
**DO NOT use `AgentRuntime` or `create_agent_runtime()` for agents with `event_loop` nodes.**
**DO NOT pass `runtime=None` to `GraphExecutor`** — it will crash with `'NoneType' object has no attribute 'start_run'`.
---
## COMMON MISTAKES TO AVOID
1. **Using `AgentRuntime` with event_loop nodes** - `AgentRuntime` does not register EventLoopNodes. Use `GraphExecutor` directly with `node_registry`
2. **Passing `runtime=None` to GraphExecutor** - The executor calls `runtime.start_run()` internally. Always provide a `Runtime(storage_path)` instance
3. **Using tools that don't exist** - Always check `mcp__agent-builder__list_mcp_tools()` first
4. **Wrong entry_points format** - Must be `{"start": "node-id"}`, NOT a set or list
5. **Skipping validation** - Always validate nodes and graph before proceeding
6. **Not waiting for approval** - Always ask user before major steps
7. **Displaying this file** - Execute the steps, don't show documentation
1. **Using tools that don't exist** - Always check `mcp__agent-builder__list_mcp_tools()` first
2. **Wrong entry_points format** - Must be `{"start": "node-id"}`, NOT a set or list
3. **Skipping validation** - Always validate nodes and graph before proceeding
4. **Not waiting for approval** - Always ask user before major steps
5. **Displaying this file** - Execute the steps, don't show documentation
6. **Too many thin nodes** - Prefer fewer, richer nodes (4 nodes > 8 nodes)
7. **Missing STEP 1/STEP 2 in client-facing prompts** - Client-facing nodes need explicit phases to prevent premature set_output
8. **Forgetting nullable_output_keys** - Mark input_keys that only arrive on certain edges (e.g., feedback) as nullable on the receiving node
9. **Adding framework gating for LLM behavior** - Fix prompts or use judges, not ad-hoc code
@@ -0,0 +1,24 @@
"""
Deep Research Agent - Interactive, rigorous research with TUI conversation.
Research any topic through multi-source web search, quality evaluation,
and synthesis. Features client-facing TUI interaction at key checkpoints
for user guidance and iterative deepening.
"""
from .agent import DeepResearchAgent, default_agent, goal, nodes, edges
from .config import RuntimeConfig, AgentMetadata, default_config, metadata
__version__ = "1.0.0"
__all__ = [
"DeepResearchAgent",
"default_agent",
"goal",
"nodes",
"edges",
"RuntimeConfig",
"AgentMetadata",
"default_config",
"metadata",
]
@@ -1,5 +1,5 @@
"""
CLI entry point for Online Research Agent.
CLI entry point for Deep Research Agent.
Uses AgentRuntime for multi-entrypoint support with HITL pause/resume.
"""
@@ -10,7 +10,7 @@ import logging
import sys
import click
from .agent import default_agent, OnlineResearchAgent
from .agent import default_agent, DeepResearchAgent
def setup_logging(verbose=False, debug=False):
@@ -28,7 +28,7 @@ def setup_logging(verbose=False, debug=False):
@click.group()
@click.version_option(version="1.0.0")
def cli():
"""Online Research Agent - Deep-dive research with narrative reports."""
"""Deep Research Agent - Interactive, rigorous research with TUI conversation."""
pass
@@ -59,6 +59,83 @@ def run(topic, mock, quiet, verbose, debug):
sys.exit(0 if result.success else 1)
@cli.command()
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def tui(mock, verbose, debug):
"""Launch the TUI dashboard for interactive research."""
setup_logging(verbose=verbose, debug=debug)
try:
from framework.tui.app import AdenTUI
except ImportError:
click.echo("TUI requires the 'textual' package. Install with: pip install textual")
sys.exit(1)
from pathlib import Path
from framework.llm import LiteLLMProvider
from framework.runner.tool_registry import ToolRegistry
from framework.runtime.agent_runtime import create_agent_runtime
from framework.runtime.event_bus import EventBus
from framework.runtime.execution_stream import EntryPointSpec
async def run_with_tui():
agent = DeepResearchAgent()
# Build graph and tools
agent._event_bus = EventBus()
agent._tool_registry = ToolRegistry()
mcp_config_path = Path(__file__).parent / "mcp_servers.json"
if mcp_config_path.exists():
agent._tool_registry.load_mcp_config(mcp_config_path)
llm = None
if not mock:
llm = LiteLLMProvider(
model=agent.config.model,
api_key=agent.config.api_key,
api_base=agent.config.api_base,
)
tools = list(agent._tool_registry.get_tools().values())
tool_executor = agent._tool_registry.get_executor()
graph = agent._build_graph()
storage_path = Path.home() / ".hive" / "deep_research_agent"
storage_path.mkdir(parents=True, exist_ok=True)
runtime = create_agent_runtime(
graph=graph,
goal=agent.goal,
storage_path=storage_path,
entry_points=[
EntryPointSpec(
id="start",
name="Start Research",
entry_node="intake",
trigger_type="manual",
isolation_level="isolated",
),
],
llm=llm,
tools=tools,
tool_executor=tool_executor,
)
await runtime.start()
try:
app = AdenTUI(runtime)
await app.run_async()
finally:
await runtime.stop()
asyncio.run(run_with_tui())
@cli.command()
@click.option("--json", "output_json", is_flag=True)
def info(output_json):
@@ -71,6 +148,7 @@ def info(output_json):
click.echo(f"Version: {info_data['version']}")
click.echo(f"Description: {info_data['description']}")
click.echo(f"\nNodes: {', '.join(info_data['nodes'])}")
click.echo(f"Client-facing: {', '.join(info_data['client_facing_nodes'])}")
click.echo(f"Entry: {info_data['entry_node']}")
click.echo(f"Terminal: {', '.join(info_data['terminal_nodes'])}")
@@ -81,6 +159,9 @@ def validate():
validation = default_agent.validate()
if validation["valid"]:
click.echo("Agent is valid")
if validation["warnings"]:
for warning in validation["warnings"]:
click.echo(f" WARNING: {warning}")
else:
click.echo("Agent has errors:")
for error in validation["errors"]:
@@ -91,7 +172,7 @@ def validate():
@cli.command()
@click.option("--verbose", "-v", is_flag=True)
def shell(verbose):
"""Interactive research session."""
"""Interactive research session (CLI, no TUI)."""
asyncio.run(_interactive_shell(verbose))
@@ -99,10 +180,10 @@ async def _interactive_shell(verbose=False):
"""Async interactive shell."""
setup_logging(verbose=verbose)
click.echo("=== Online Research Agent ===")
click.echo("=== Deep Research Agent ===")
click.echo("Enter a topic to research (or 'quit' to exit):\n")
agent = OnlineResearchAgent()
agent = DeepResearchAgent()
await agent.start()
try:
@@ -118,7 +199,7 @@ async def _interactive_shell(verbose=False):
if not topic.strip():
continue
click.echo("\nResearching... (this may take a few minutes)\n")
click.echo("\nResearching...\n")
result = await agent.trigger_and_wait("start", {"topic": topic})
@@ -128,16 +209,14 @@ async def _interactive_shell(verbose=False):
if result.success:
output = result.output
if "file_path" in output:
click.echo(f"\nReport saved to: {output['file_path']}\n")
if "final_report" in output:
click.echo("\n--- Report Preview ---\n")
preview = (
output["final_report"][:500] + "..."
if len(output.get("final_report", "")) > 500
else output.get("final_report", "")
)
click.echo(preview)
if "report_content" in output:
click.echo("\n--- Report ---\n")
click.echo(output["report_content"])
click.echo("\n")
if "references" in output:
click.echo("--- References ---\n")
for ref in output.get("references", []):
click.echo(f" [{ref.get('number', '?')}] {ref.get('title', '')} - {ref.get('url', '')}")
click.echo("\n")
else:
click.echo(f"\nResearch failed: {result.error}\n")
@@ -148,7 +227,6 @@ async def _interactive_shell(verbose=False):
except Exception as e:
click.echo(f"Error: {e}", err=True)
import traceback
traceback.print_exc()
finally:
await agent.stop()
@@ -1,9 +1,8 @@
"""Agent graph construction for Online Research Agent."""
"""Agent graph construction for Deep Research Agent."""
from framework.graph import EdgeSpec, EdgeCondition, Goal, SuccessCriterion, Constraint
from framework.graph.edge import GraphSpec
from framework.graph.executor import ExecutionResult, GraphExecutor
from framework.graph.event_loop_node import EventLoopNode, LoopConfig
from framework.runtime.event_bus import EventBus
from framework.runtime.core import Runtime
from framework.llm import LiteLLMProvider
@@ -11,164 +10,132 @@ from framework.runner.tool_registry import ToolRegistry
from .config import default_config, metadata
from .nodes import (
parse_query_node,
search_sources_node,
fetch_content_node,
evaluate_sources_node,
synthesize_findings_node,
write_report_node,
quality_check_node,
save_report_node,
intake_node,
research_node,
review_node,
report_node,
)
# Goal definition
goal = Goal(
id="comprehensive-online-research",
name="Comprehensive Online Research",
description="Research any topic by searching multiple sources, synthesizing information, and producing a well-structured narrative report with citations.",
id="rigorous-interactive-research",
name="Rigorous Interactive Research",
description=(
"Research any topic by searching diverse sources, analyzing findings, "
"and producing a cited report — with user checkpoints to guide direction."
),
success_criteria=[
SuccessCriterion(
id="source-coverage",
description="Query 10+ diverse sources",
id="source-diversity",
description="Use multiple diverse, authoritative sources",
metric="source_count",
target=">=10",
weight=0.20,
),
SuccessCriterion(
id="relevance",
description="All sources directly address the query",
metric="relevance_score",
target="90%",
target=">=5",
weight=0.25,
),
SuccessCriterion(
id="synthesis",
description="Synthesize findings into coherent narrative",
metric="coherence_score",
target="85%",
weight=0.25,
),
SuccessCriterion(
id="citations",
description="Include citations for all claims",
id="citation-coverage",
description="Every factual claim in the report cites its source",
metric="citation_coverage",
target="100%",
weight=0.15,
weight=0.25,
),
SuccessCriterion(
id="actionable",
description="Report answers the user's question",
metric="answer_completeness",
id="user-satisfaction",
description="User reviews findings before report generation",
metric="user_approval",
target="true",
weight=0.25,
),
SuccessCriterion(
id="report-completeness",
description="Final report answers the original research questions",
metric="question_coverage",
target="90%",
weight=0.15,
weight=0.25,
),
],
constraints=[
Constraint(
id="no-hallucination",
description="Only include information found in sources",
description="Only include information found in fetched sources",
constraint_type="quality",
category="accuracy",
),
Constraint(
id="source-attribution",
description="Every factual claim must cite its source",
description="Every claim must cite its source with a numbered reference",
constraint_type="quality",
category="accuracy",
),
Constraint(
id="recency-preference",
description="Prefer recent sources when relevant",
constraint_type="quality",
category="relevance",
),
Constraint(
id="no-paywalled",
description="Avoid sources that require payment to access",
id="user-checkpoint",
description="Present findings to the user before writing the final report",
constraint_type="functional",
category="accessibility",
category="interaction",
),
],
)
# Node list
nodes = [
parse_query_node,
search_sources_node,
fetch_content_node,
evaluate_sources_node,
synthesize_findings_node,
write_report_node,
quality_check_node,
save_report_node,
intake_node,
research_node,
review_node,
report_node,
]
# Edge definitions
edges = [
# intake -> research
EdgeSpec(
id="parse-to-search",
source="parse-query",
target="search-sources",
id="intake-to-research",
source="intake",
target="research",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
# research -> review
EdgeSpec(
id="search-to-fetch",
source="search-sources",
target="fetch-content",
id="research-to-review",
source="research",
target="review",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
# review -> research (feedback loop)
EdgeSpec(
id="fetch-to-evaluate",
source="fetch-content",
target="evaluate-sources",
condition=EdgeCondition.ON_SUCCESS,
id="review-to-research-feedback",
source="review",
target="research",
condition=EdgeCondition.CONDITIONAL,
condition_expr="needs_more_research == True",
priority=1,
),
# review -> report (user satisfied)
EdgeSpec(
id="evaluate-to-synthesize",
source="evaluate-sources",
target="synthesize-findings",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
EdgeSpec(
id="synthesize-to-write",
source="synthesize-findings",
target="write-report",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
EdgeSpec(
id="write-to-quality",
source="write-report",
target="quality-check",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
EdgeSpec(
id="quality-to-save",
source="quality-check",
target="save-report",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
id="review-to-report",
source="review",
target="report",
condition=EdgeCondition.CONDITIONAL,
condition_expr="needs_more_research == False",
priority=2,
),
]
# Graph configuration
entry_node = "parse-query"
entry_points = {"start": "parse-query"}
entry_node = "intake"
entry_points = {"start": "intake"}
pause_nodes = []
terminal_nodes = ["save-report"]
terminal_nodes = ["report"]
class OnlineResearchAgent:
class DeepResearchAgent:
"""
Online Research Agent - Deep-dive research with narrative reports.
Deep Research Agent — 4-node pipeline with user checkpoints.
Uses GraphExecutor directly with EventLoopNode instances registered
in the node_registry for multi-turn tool execution.
Flow: intake -> research -> review -> report
                   ^           |
                   +-----------+   feedback loop (if user wants more)
"""
def __init__(self, config=None):
@@ -188,7 +155,7 @@ class OnlineResearchAgent:
def _build_graph(self) -> GraphSpec:
"""Build the GraphSpec."""
return GraphSpec(
id="online-research-agent-graph",
id="deep-research-agent-graph",
goal_id=self.goal.id,
version="1.0.0",
entry_node=self.entry_node,
@@ -201,29 +168,11 @@ class OnlineResearchAgent:
max_tokens=self.config.max_tokens,
)
def _build_node_registry(self, tool_executor=None) -> dict:
"""Create EventLoopNode instances for all event_loop nodes."""
registry = {}
for node_spec in self.nodes:
if node_spec.node_type == "event_loop":
registry[node_spec.id] = EventLoopNode(
event_bus=self._event_bus,
judge=None, # implicit judge: accept when output_keys are filled
config=LoopConfig(
max_iterations=50,
max_tool_calls_per_turn=15,
stall_detection_threshold=3,
max_history_tokens=32000,
),
tool_executor=tool_executor,
)
return registry
def _setup(self, mock_mode=False) -> GraphExecutor:
"""Set up the executor with all components."""
from pathlib import Path
storage_path = Path.home() / ".hive" / "online_research_agent"
storage_path = Path.home() / ".hive" / "deep_research_agent"
storage_path.mkdir(parents=True, exist_ok=True)
self._event_bus = EventBus()
@@ -245,7 +194,6 @@ class OnlineResearchAgent:
tools = list(self._tool_registry.get_tools().values())
self._graph = self._build_graph()
node_registry = self._build_node_registry(tool_executor=tool_executor)
runtime = Runtime(storage_path)
self._executor = GraphExecutor(
@@ -253,7 +201,8 @@ class OnlineResearchAgent:
llm=llm,
tools=tools,
tool_executor=tool_executor,
node_registry=node_registry,
event_bus=self._event_bus,
storage_path=storage_path,
)
return self._executor
@@ -317,7 +266,7 @@ class OnlineResearchAgent:
"entry_points": self.entry_points,
"pause_nodes": self.pause_nodes,
"terminal_nodes": self.terminal_nodes,
"multi_entrypoint": True,
"client_facing_nodes": [n.id for n in self.nodes if n.client_facing],
}
def validate(self):
@@ -339,10 +288,6 @@ class OnlineResearchAgent:
if terminal not in node_ids:
errors.append(f"Terminal node '{terminal}' not found")
for pause in self.pause_nodes:
if pause not in node_ids:
errors.append(f"Pause node '{pause}' not found")
for ep_id, node_id in self.entry_points.items():
if node_id not in node_ids:
errors.append(
@@ -357,4 +302,4 @@ class OnlineResearchAgent:
# Create default instance
default_agent = OnlineResearchAgent()
default_agent = DeepResearchAgent()
@@ -32,12 +32,15 @@ class RuntimeConfig:
default_config = RuntimeConfig()
# Agent metadata
@dataclass
class AgentMetadata:
name: str = "Online Research Agent"
name: str = "Deep Research Agent"
version: str = "1.0.0"
description: str = "Research any topic by searching multiple sources, synthesizing information, and producing a well-structured narrative report with citations."
description: str = (
"Interactive research agent that rigorously investigates topics through "
"multi-source search, quality evaluation, and synthesis - with TUI conversation "
"at key checkpoints for user guidance and feedback."
)
metadata = AgentMetadata()
@@ -0,0 +1,147 @@
"""Node definitions for Deep Research Agent."""
from framework.graph import NodeSpec
# Node 1: Intake (client-facing)
# Brief conversation to clarify what the user wants researched.
intake_node = NodeSpec(
id="intake",
name="Research Intake",
description="Discuss the research topic with the user, clarify scope, and confirm direction",
node_type="event_loop",
client_facing=True,
input_keys=["topic"],
output_keys=["research_brief"],
system_prompt="""\
You are a research intake specialist. The user wants to research a topic.
Have a brief conversation to clarify what they need.
**STEP 1 — Read and respond (text only, NO tool calls):**
1. Read the topic provided
2. If it's vague, ask 1-2 clarifying questions (scope, angle, depth)
3. If it's already clear, confirm your understanding and ask the user to confirm
Keep it short. Don't over-ask.
**STEP 2 — After the user confirms, call set_output:**
- set_output("research_brief", "A clear paragraph describing exactly what to research, \
what questions to answer, what scope to cover, and how deep to go.")
""",
tools=[],
)
# Node 2: Research
# The workhorse — searches the web, fetches content, analyzes sources.
# One node with both tools avoids the context-passing overhead of 5 separate nodes.
research_node = NodeSpec(
id="research",
name="Research",
description="Search the web, fetch source content, and compile findings",
node_type="event_loop",
max_node_visits=3,
input_keys=["research_brief", "feedback"],
output_keys=["findings", "sources", "gaps"],
nullable_output_keys=["feedback"],
system_prompt="""\
You are a research agent. Given a research brief, find and analyze sources.
If feedback is provided, this is a follow-up round — focus on the gaps identified.
Work in phases:
1. **Search**: Use web_search with 3-5 diverse queries covering different angles.
Prioritize authoritative sources (.edu, .gov, established publications).
2. **Fetch**: Use web_scrape on the most promising URLs (aim for 5-8 sources).
Skip URLs that fail. Extract the substantive content.
3. **Analyze**: Review what you've collected. Identify key findings, themes,
and any contradictions between sources.
Important:
- Work in batches of 3-4 tool calls at a time to manage context
- After each batch, assess whether you have enough material
- Prefer quality over quantity — 5 good sources beat 15 thin ones
- Track which URL each finding comes from (you'll need citations later)
When done, use set_output:
- set_output("findings", "Structured summary: key findings with source URLs for each claim. \
Include themes, contradictions, and confidence levels.")
- set_output("sources", [{"url": "...", "title": "...", "summary": "..."}])
- set_output("gaps", "What aspects of the research brief are NOT well-covered yet, if any.")
""",
tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
)
# Node 3: Review (client-facing)
# Shows the user what was found and asks whether to dig deeper or proceed.
review_node = NodeSpec(
id="review",
name="Review Findings",
description="Present findings to user and decide whether to research more or write the report",
node_type="event_loop",
client_facing=True,
max_node_visits=3,
input_keys=["findings", "sources", "gaps", "research_brief"],
output_keys=["needs_more_research", "feedback"],
system_prompt="""\
Present the research findings to the user clearly and concisely.
**STEP 1 — Present (your first message, text only, NO tool calls):**
1. **Summary** (2-3 sentences of what was found)
2. **Key Findings** (bulleted, with confidence levels)
3. **Sources Used** (count and quality assessment)
4. **Gaps** (what's still unclear or under-covered)
End by asking: Are they satisfied, or do they want deeper research? \
Should we proceed to writing the final report?
**STEP 2 — After the user responds, call set_output:**
- set_output("needs_more_research", "true") if they want more
- set_output("needs_more_research", "false") if they're satisfied
- set_output("feedback", "What the user wants explored further, or empty string")
""",
tools=[],
)
# Node 4: Report (client-facing)
# Writes the final report and presents it to the user.
report_node = NodeSpec(
id="report",
name="Write & Deliver Report",
description="Write a cited report from the findings and present it to the user",
node_type="event_loop",
client_facing=True,
input_keys=["findings", "sources", "research_brief"],
output_keys=["delivery_status"],
system_prompt="""\
Write a comprehensive research report and present it to the user.
**STEP 1 — Write and present the report (text only, NO tool calls):**
Report structure:
1. **Executive Summary** (2-3 paragraphs)
2. **Findings** (organized by theme, with [n] citations)
3. **Analysis** (synthesis, implications, areas of debate)
4. **Conclusion** (key takeaways, confidence assessment)
5. **References** (numbered list of sources cited)
Requirements:
- Every factual claim must cite its source with [n] notation
- Be objective — present multiple viewpoints where sources disagree
- Distinguish well-supported conclusions from speculation
- Answer the original research questions from the brief
End by asking the user if they have questions or want to save the report.
**STEP 2 — After the user responds:**
- Answer follow-up questions from the research material
- If they want to save, use write_to_file tool
- When the user is satisfied: set_output("delivery_status", "completed")
""",
tools=["write_to_file"],
)
__all__ = [
"intake_node",
"research_node",
"review_node",
"report_node",
]
@@ -1,80 +0,0 @@
# Online Research Agent
Deep-dive research agent that searches 10+ sources and produces comprehensive narrative reports with citations.
## Features
- Generates multiple search queries from a topic
- Searches and fetches 15+ web sources
- Evaluates and ranks sources by relevance
- Synthesizes findings into themes
- Writes narrative report with numbered citations
- Quality checks for uncited claims
- Saves report to local markdown file
## Usage
### CLI
```bash
# Show agent info
uv run python -m online_research_agent info
# Validate structure
uv run python -m online_research_agent validate
# Run research on a topic
uv run python -m online_research_agent run --topic "impact of AI on healthcare"
# Interactive shell
uv run python -m online_research_agent shell
```
### Python API
```python
from online_research_agent import default_agent
# Simple usage
result = await default_agent.run({"topic": "climate change solutions"})
# Check output
if result.success:
print(f"Report saved to: {result.output['file_path']}")
print(result.output['final_report'])
```
## Workflow
```
parse-query → search-sources → fetch-content → evaluate-sources
                                                      ↓
write-report ← synthesize-findings ←──────────────────┘
      ↓
quality-check → save-report
```
## Output
Reports are saved to `./research_reports/` as markdown files with:
1. Executive Summary
2. Introduction
3. Key Findings (by theme)
4. Analysis
5. Conclusion
6. References
## Requirements
- Python 3.11+
- LLM provider API key (Groq, Cerebras, etc.)
- Internet access for web search/fetch
## Configuration
Edit `config.py` to change:
- `model`: LLM model (default: groq/moonshotai/kimi-k2-instruct-0905)
- `temperature`: Generation temperature (default: 0.7)
- `max_tokens`: Max tokens per response (default: 16384)
@@ -1,23 +0,0 @@
"""
Online Research Agent - Deep-dive research with narrative reports.
Research any topic by searching multiple sources, synthesizing information,
and producing a well-structured narrative report with citations.
"""
from .agent import OnlineResearchAgent, default_agent, goal, nodes, edges
from .config import RuntimeConfig, AgentMetadata, default_config, metadata
__version__ = "1.0.0"
__all__ = [
"OnlineResearchAgent",
"default_agent",
"goal",
"nodes",
"edges",
"RuntimeConfig",
"AgentMetadata",
"default_config",
"metadata",
]
@@ -1,232 +0,0 @@
"""Node definitions for Online Research Agent."""
from framework.graph import NodeSpec
# Node 1: Parse Query
parse_query_node = NodeSpec(
id="parse-query",
name="Parse Query",
description="Analyze the research topic and generate 3-5 diverse search queries to cover different aspects",
node_type="event_loop",
input_keys=["topic"],
output_keys=["search_queries", "research_focus", "key_aspects"],
system_prompt="""\
You are a research query strategist. Given a research topic, analyze it and generate search queries.
Your task:
1. Understand the core research question
2. Identify 3-5 key aspects to investigate
3. Generate 3-5 diverse search queries that will find comprehensive information
Use set_output to store each result:
- set_output("research_focus", "Brief statement of what we're researching")
- set_output("key_aspects", ["aspect1", "aspect2", "aspect3"])
- set_output("search_queries", ["query 1", "query 2", "query 3", "query 4", "query 5"])
""",
tools=[],
)
# Node 2: Search Sources
search_sources_node = NodeSpec(
id="search-sources",
name="Search Sources",
description="Execute web searches using the generated queries to find 15+ source URLs",
node_type="event_loop",
input_keys=["search_queries", "research_focus"],
output_keys=["source_urls", "search_results_summary"],
system_prompt="""\
You are a research assistant executing web searches. Use the web_search tool to find sources.
Your task:
1. Execute each search query using web_search tool
2. Collect URLs from search results
3. Aim for 15+ diverse sources
After searching, use set_output to store results:
- set_output("source_urls", ["url1", "url2", ...])
- set_output("search_results_summary", "Brief summary of what was found")
""",
tools=["web_search"],
)
# Node 3: Fetch Content
fetch_content_node = NodeSpec(
id="fetch-content",
name="Fetch Content",
description="Fetch and extract content from the discovered source URLs",
node_type="event_loop",
input_keys=["source_urls", "research_focus"],
output_keys=["fetched_sources", "fetch_errors"],
system_prompt="""\
You are a content fetcher. Use web_scrape tool to retrieve content from URLs.
Your task:
1. Fetch content from each source URL using web_scrape tool
2. Extract the main content relevant to the research focus
3. Track any URLs that failed to fetch
After fetching, use set_output to store results:
- set_output("fetched_sources", [{"url": "...", "title": "...", "content": "..."}])
- set_output("fetch_errors", ["url that failed", ...])
""",
tools=["web_scrape"],
)
# Node 4: Evaluate Sources
evaluate_sources_node = NodeSpec(
id="evaluate-sources",
name="Evaluate Sources",
description="Score sources for relevance and quality, filter to top 10",
node_type="event_loop",
input_keys=["fetched_sources", "research_focus", "key_aspects"],
output_keys=["ranked_sources", "source_analysis"],
system_prompt="""\
You are a source evaluator. Assess each source for quality and relevance.
Scoring criteria:
- Relevance to research focus (1-10)
- Source credibility (1-10)
- Information depth (1-10)
- Recency if relevant (1-10)
Your task:
1. Score each source
2. Rank by combined score
3. Select top 10 sources
4. Note what each source uniquely contributes
Use set_output to store results:
- set_output("ranked_sources", [{"url": "...", "title": "...", "score": 8.5}])
- set_output("source_analysis", "Overview of source quality and coverage")
""",
tools=[],
)
# Node 5: Synthesize Findings
synthesize_findings_node = NodeSpec(
id="synthesize-findings",
name="Synthesize Findings",
description="Extract key facts from sources and identify common themes",
node_type="event_loop",
input_keys=["ranked_sources", "research_focus", "key_aspects"],
output_keys=["key_findings", "themes", "source_citations"],
system_prompt="""\
You are a research synthesizer. Analyze multiple sources to extract insights.
Your task:
1. Identify key facts from each source
2. Find common themes across sources
3. Note contradictions or debates
4. Build a citation map (fact -> source URL)
Use set_output to store each result:
- set_output("key_findings", [{"finding": "...", "sources": ["url1"], "confidence": "high"}])
- set_output("themes", [{"theme": "...", "description": "...", "supporting_sources": [...]}])
- set_output("source_citations", {"fact or claim": ["url1", "url2"]})
""",
tools=[],
)
# Node 6: Write Report
write_report_node = NodeSpec(
id="write-report",
name="Write Report",
description="Generate a narrative report with proper citations",
node_type="event_loop",
input_keys=[
"key_findings",
"themes",
"source_citations",
"research_focus",
"ranked_sources",
],
output_keys=["report_content", "references"],
system_prompt="""\
You are a research report writer. Create a well-structured narrative report.
Report structure:
1. Executive Summary (2-3 paragraphs)
2. Introduction (context and scope)
3. Key Findings (organized by theme)
4. Analysis (synthesis and implications)
5. Conclusion
6. References (numbered list of all sources)
Citation format: Use numbered citations like [1], [2] that correspond to the References section.
IMPORTANT:
- Every factual claim MUST have a citation
- Write in clear, professional prose
- Be objective and balanced
- Highlight areas of consensus and debate
Use set_output to store results:
- set_output("report_content", "Full markdown report text with citations...")
- set_output("references", [{"number": 1, "url": "...", "title": "..."}])
""",
tools=[],
)
# Node 7: Quality Check
quality_check_node = NodeSpec(
id="quality-check",
name="Quality Check",
description="Verify all claims have citations and report is coherent",
node_type="event_loop",
input_keys=["report_content", "references", "source_citations"],
output_keys=["quality_score", "issues", "final_report"],
system_prompt="""\
You are a quality assurance reviewer. Check the research report for issues.
Check for:
1. Uncited claims (factual statements without [n] citation)
2. Broken citations (references to non-existent numbers)
3. Coherence (logical flow between sections)
4. Completeness (all key aspects covered)
5. Accuracy (claims match source content)
If issues found, fix them in the final report.
Use set_output to store results:
- set_output("quality_score", 0.95)
- set_output("issues", [{"type": "uncited_claim", "location": "...", "fixed": true}])
- set_output("final_report", "Corrected full report with all issues fixed...")
""",
tools=[],
)
# Node 8: Save Report
save_report_node = NodeSpec(
id="save-report",
name="Save Report",
description="Write the final report to a local markdown file",
node_type="event_loop",
input_keys=["final_report", "references", "research_focus"],
output_keys=["file_path", "save_status"],
system_prompt="""\
You are a file manager. Save the research report to disk.
Your task:
1. Generate a filename from the research focus (slugified, with date)
2. Use the write_to_file tool to save the report as markdown
3. Save to the ./research_reports/ directory
Filename format: research_YYYY-MM-DD_topic-slug.md
Use set_output to store results:
- set_output("file_path", "research_reports/research_2026-01-23_topic-name.md")
- set_output("save_status", "success")
""",
tools=["write_to_file"],
)
__all__ = [
"parse_query_node",
"search_sources_node",
"fetch_content_node",
"evaluate_sources_node",
"synthesize_findings_node",
"write_report_node",
"quality_check_node",
"save_report_node",
]
+70 -22
@@ -158,6 +158,43 @@ intake_node = NodeSpec(
> **Legacy Note:** The old `pause_nodes` / `entry_points` pattern still works but `client_facing=True` is preferred for new agents.
**STEP 1 / STEP 2 Prompt Pattern:** For client-facing nodes, structure the system prompt with two explicit phases:
```python
system_prompt="""\
**STEP 1 — Respond to the user (text only, NO tool calls):**
[Present information, ask questions, etc.]
**STEP 2 — After the user responds, call set_output:**
[Call set_output with the structured outputs]
"""
```
This prevents the LLM from calling `set_output` prematurely before the user has had a chance to respond.
### Node Design: Fewer, Richer Nodes
Prefer fewer nodes that do more work over many thin single-purpose nodes:
- **Bad**: 8 thin nodes (parse query → search → fetch → evaluate → synthesize → write → check → save)
- **Good**: 4 rich nodes (intake → research → review → report)
Why: Each node boundary requires serializing outputs and passing context. Fewer nodes means the LLM retains full context of its work within the node. A research node that searches, fetches, and analyzes keeps all the source material in its conversation history.
### nullable_output_keys for Cross-Edge Inputs
When a node receives inputs that only arrive on certain edges (e.g., `feedback` only comes from a review → research feedback loop, not from intake → research), mark those keys as `nullable_output_keys`:
```python
research_node = NodeSpec(
id="research",
input_keys=["research_brief", "feedback"],
nullable_output_keys=["feedback"], # Not present on first visit
max_node_visits=3,
...
)
```
## Event Loop Architecture Concepts
### How EventLoopNode Works
@@ -169,40 +206,30 @@ An event loop node runs a multi-turn loop:
4. Judge evaluates: ACCEPT (exit loop), RETRY (loop again), or ESCALATE
5. Repeat until judge ACCEPTs or max_iterations reached
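A minimal sketch of that loop shape, assuming hypothetical `llm_turn` and `judge` callables in place of the framework's internals (the real `EventLoopNode` adds streaming, durable state, stall detection, and history compaction):

```python
# Hedged sketch of the loop above; llm_turn and judge are hypothetical
# stand-ins for the framework's LLM/tool plumbing and JudgeProtocol.
from dataclasses import dataclass, field


@dataclass
class Accumulator:
    outputs: dict = field(default_factory=dict)

    def set_output(self, key: str, value) -> None:
        # The real accumulator writes through to the store immediately.
        self.outputs[key] = value


async def run_event_loop(llm_turn, judge, accumulator: Accumulator,
                         max_iterations: int = 50) -> dict:
    for iteration in range(max_iterations):
        # Steps 1-3: one LLM turn, executing any tool calls; set_output
        # lands in the accumulator as a synthetic tool.
        assistant_text = await llm_turn(accumulator)
        # Step 4: the judge decides ACCEPT / RETRY / ESCALATE.
        verdict = judge(assistant_text, accumulator)
        if verdict == "ACCEPT":
            return accumulator.outputs
        if verdict == "ESCALATE":
            raise RuntimeError(f"judge escalated at iteration {iteration}")
        # RETRY: step 5, loop again with the judge's feedback in history.
    raise RuntimeError("max_iterations reached without ACCEPT")
```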
### CRITICAL: EventLoopNode Runtime Requirements
### EventLoopNode Runtime
EventLoopNodes are **not auto-created** by the graph executor. They must be explicitly instantiated and registered in a `node_registry` dict before execution.
**Required components:**
1. **`EventLoopNode` instances** — One per event_loop NodeSpec, registered in `node_registry`
2. **`Runtime` instance** — `GraphExecutor` calls `runtime.start_run()` internally. Passing `None` crashes the executor
3. **`GraphExecutor` (not `AgentRuntime`)** — `AgentRuntime`/`create_agent_runtime()` does NOT pass `node_registry` to the internal `GraphExecutor`, so all event_loop nodes fail with "not found in registry"
EventLoopNodes are **auto-created** by `GraphExecutor` at runtime. You do NOT need to manually register them. Both `GraphExecutor` (direct) and `AgentRuntime` / `create_agent_runtime()` handle event_loop nodes automatically.
```python
# Direct execution — executor auto-creates EventLoopNodes
from framework.graph.executor import GraphExecutor
from framework.graph.event_loop_node import EventLoopNode, LoopConfig
from framework.runtime.event_bus import EventBus
from framework.runtime.core import Runtime
# Build node_registry
event_bus = EventBus()
node_registry = {}
for node_spec in nodes:
if node_spec.node_type == "event_loop":
node_registry[node_spec.id] = EventLoopNode(
event_bus=event_bus,
config=LoopConfig(max_iterations=50, max_tool_calls_per_turn=15),
tool_executor=tool_executor,
)
# Create executor with Runtime and node_registry
runtime = Runtime(storage_path)
executor = GraphExecutor(
runtime=runtime,
llm=llm,
tools=tools,
tool_executor=tool_executor,
node_registry=node_registry,
storage_path=storage_path,
)
result = await executor.execute(graph=graph, goal=goal, input_data=input_data)
# TUI execution — AgentRuntime also works
from framework.runtime.agent_runtime import create_agent_runtime
runtime = create_agent_runtime(
graph=graph, goal=goal, storage_path=storage_path,
entry_points=[...], llm=llm, tools=tools, tool_executor=tool_executor,
)
```
@@ -210,8 +237,12 @@ executor = GraphExecutor(
Nodes produce structured outputs by calling `set_output(key, value)` — a synthetic tool injected by the framework. When the LLM calls `set_output`, the value is stored in the output accumulator and made available to downstream nodes via shared memory.
`set_output` is NOT a real tool — it is excluded from `real_tool_results`. For client-facing nodes, this means a turn where the LLM only calls `set_output` (no other tools) is treated as a conversational boundary and will block for user input.
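A rough illustration of that split (the tool-call dicts here are assumed shapes, not the framework's actual types):

```python
# Illustrative only: shows why a set_output-only turn leaves
# real_tool_results empty and so reads as a conversational boundary.
def split_tool_calls(tool_calls: list[dict]) -> tuple[list[dict], list[str]]:
    real_tool_results, outputs_set = [], []
    for call in tool_calls:
        if call["name"] == "set_output":
            outputs_set.append(call["args"]["key"])  # synthetic: recorded, not executed as a tool
        else:
            real_tool_results.append(call)           # real tool: executed normally
    return real_tool_results, outputs_set


real, keys = split_tool_calls(
    [{"name": "set_output", "args": {"key": "findings", "value": "..."}}]
)
assert real == [] and keys == ["findings"]  # set_output-only turn: blocks for user input
```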
### JudgeProtocol
**The judge is the SOLE mechanism for acceptance decisions.** Do not add ad-hoc framework gating, output rollback, or premature rejection logic. If the LLM calls `set_output` too early, fix it with better prompts or a custom judge — not framework-level guards.
The judge controls when a node's loop exits:
- **Implicit judge** (default, no judge configured): ACCEPTs when the LLM finishes with no tool calls and all required output keys are set
- **SchemaJudge**: Validates outputs against a Pydantic model
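As a sketch of what schema-based acceptance looks like (the actual `SchemaJudge` constructor and verdict types may differ):

```python
# Sketch of Pydantic-backed acceptance, approximating what SchemaJudge
# does; the real class and its verdict objects may differ.
from pydantic import BaseModel, ValidationError


class ResearchOutputs(BaseModel):
    findings: str
    sources: list[dict]
    gaps: str


def evaluate(outputs: dict) -> str:
    try:
        ResearchOutputs(**outputs)
        return "ACCEPT"
    except ValidationError as e:
        # RETRY feedback tells the LLM which keys are missing or invalid
        return f"RETRY: {len(e.errors())} schema errors"
```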
@@ -225,6 +256,23 @@ Controls loop behavior:
- `stall_detection_threshold` (default 3) — detects repeated identical responses
- `max_history_tokens` (default 32000) — triggers conversation compaction
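For example, a `LoopConfig` mirroring the defaults above, as used elsewhere in these docs:

```python
from framework.graph.event_loop_node import LoopConfig

# Values mirror the documented defaults; any of them can be tuned per node.
config = LoopConfig(
    max_iterations=50,
    max_tool_calls_per_turn=15,
    stall_detection_threshold=3,   # identical responses before stall handling
    max_history_tokens=32000,      # compaction trigger
)
```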
### Data Tools (Spillover Management)
When tool results exceed the context window, the framework automatically saves them to a spillover directory and truncates with a hint. Nodes that produce or consume large data should include the data tools:
- `save_data(filename, data, data_dir)` — Write data to a file in the data directory
- `load_data(filename, data_dir, offset=0, limit=50)` — Read data with line-based pagination
- `list_data_files(data_dir)` — List available data files
These are real MCP tools (not synthetic). Add them to nodes that handle large tool results:
```python
research_node = NodeSpec(
...
tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
)
```
### Fan-Out / Fan-In
Multiple ON_SUCCESS edges from the same source create parallel execution. All branches run concurrently via `asyncio.gather()`. Parallel event_loop nodes must have disjoint `output_keys`.
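A fan-out sketch using `EdgeSpec` as shown elsewhere in these docs (the node ids are illustrative):

```python
from framework.graph import EdgeSpec, EdgeCondition

edges = [
    EdgeSpec(id="research-to-summarize", source="research", target="summarize",
             condition=EdgeCondition.ON_SUCCESS, priority=1),
    EdgeSpec(id="research-to-extract", source="research", target="extract-refs",
             condition=EdgeCondition.ON_SUCCESS, priority=1),
]
# Both targets run concurrently (asyncio.gather); their output_keys must
# be disjoint so parallel writes don't collide in shared memory.
```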
@@ -61,28 +61,38 @@ For agents needing multi-turn conversations with users, use `client_facing=True`
A client-facing node streams LLM output to the user and blocks for user input between conversational turns. This replaces the old pause/resume pattern.
```python
# Client-facing node blocks for user input
# Client-facing node with STEP 1/STEP 2 prompt pattern
intake_node = NodeSpec(
id="intake",
name="Intake",
description="Gather requirements from the user",
node_type="event_loop",
client_facing=True,
input_keys=[],
output_keys=["repo_url", "project_url"],
system_prompt="You are the intake agent. Ask the user for their repo URL and project URL. When you have both, call set_output for each.",
input_keys=["topic"],
output_keys=["research_brief"],
system_prompt="""\
You are an intake specialist.
**STEP 1 — Read and respond (text only, NO tool calls):**
1. Read the topic provided
2. If it's vague, ask 1-2 clarifying questions
3. If it's clear, confirm your understanding
**STEP 2 — After the user confirms, call set_output:**
- set_output("research_brief", "Clear description of what to research")
""",
)
# Internal node runs without user interaction
scanner_node = NodeSpec(
id="scanner",
name="Scanner",
description="Scan the repository",
research_node = NodeSpec(
id="research",
name="Research",
description="Search and analyze sources",
node_type="event_loop",
input_keys=["repo_url"],
output_keys=["scan_results"],
system_prompt="Scan the repository at {repo_url}...",
tools=["scan_github_repo"],
input_keys=["research_brief"],
output_keys=["findings", "sources"],
system_prompt="Research the topic using web_search and web_scrape...",
tools=["web_search", "web_scrape", "load_data", "save_data"],
)
```
@@ -91,6 +101,9 @@ scanner_node = NodeSpec(
- User input is injected via `node.inject_event(text)`
- When the LLM calls `set_output` to produce structured outputs, the judge evaluates and ACCEPTs
- Internal nodes (non-client-facing) run their entire loop without blocking
- `set_output` is a synthetic tool — a turn with only `set_output` calls (no real tools) triggers user input blocking
**STEP 1/STEP 2 pattern:** Always structure client-facing prompts with explicit phases. STEP 1 is text-only conversation. STEP 2 calls `set_output` after user confirmation. This prevents the LLM from calling `set_output` prematurely before the user responds.
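A rough host-side sketch of feeding input to a blocked client-facing node; `inject_event()` and `signal_shutdown()` are the documented entry points, everything else is assumed wiring:

```python
# Host-side sketch: feed user input to a blocked client-facing node.
# The node handle and read_user_input callable are assumptions here.
async def chat_with_node(node, read_user_input):
    while True:
        text = await read_user_input()   # e.g. from a TUI input widget
        if text.strip().lower() in ("quit", "exit"):
            node.signal_shutdown()       # unblocks the node and ends its loop
            break
        node.inject_event(text)          # unblocks _await_user_input();
                                         # the judge then evaluates
```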
### When to Use client_facing
@@ -160,6 +173,12 @@ EdgeSpec(
## Judge Patterns
**Core Principle: The judge is the SOLE mechanism for acceptance decisions.** Never add ad-hoc framework gating to compensate for LLM behavior. If the LLM calls `set_output` prematurely, fix the system prompt or use a custom judge. Anti-patterns to avoid:
- Output rollback logic
- `_user_has_responded` flags
- Premature set_output rejection
- Interaction protocol injection into system prompts
Judges control when an event_loop node's loop exits. Choose based on validation needs.
### Implicit Judge (Default)
@@ -241,15 +260,34 @@ EventLoopNode automatically manages context window usage with tiered compaction:
### Spillover Pattern
For large tool results, use `save_data()` to write to disk and pass the filename through `set_output`. This keeps the LLM context window small.
The framework automatically truncates large tool results and saves full content to a spillover directory. The LLM receives a truncation message with instructions to use `load_data` to read the full result.
```
LLM calls save_data(filename, large_data) → file written to spillover/
LLM calls set_output("results_file", filename) → filename stored in output
Downstream node calls load_data(filename) → reads from spillover/
For explicit data management, use the data tools (real MCP tools, not synthetic):
```python
# save_data, load_data, list_data_files are real MCP tools
# Each takes a data_dir parameter since the MCP server is shared
# Saving large results
save_data(filename="sources.json", data=large_json_string, data_dir="/path/to/spillover")
# Reading with pagination (line-based offset/limit)
load_data(filename="sources.json", data_dir="/path/to/spillover", offset=0, limit=50)
# Listing available files
list_data_files(data_dir="/path/to/spillover")
```
The `load_data()` tool supports `offset` and `limit` parameters for paginated reading of large files.
Add data tools to nodes that handle large tool results:
```python
research_node = NodeSpec(
...
tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
)
```
The `data_dir` is passed by the framework (from the node's spillover directory). The LLM sees `data_dir` in truncation messages and uses it when calling `load_data`.
## Anti-Patterns
@@ -259,6 +297,29 @@ The `load_data()` tool supports `offset` and `limit` parameters for paginated re
- **Don't hide code in session** — Write to files as components are approved
- **Don't wait to write files** — Agent visible from first step
- **Don't batch everything** — Write incrementally, one component at a time
- **Don't create too many thin nodes** — Prefer fewer, richer nodes (see below)
- **Don't add framework gating for LLM behavior** — Fix prompts or use judges instead
### Fewer, Richer Nodes
A common mistake is splitting work into too many small single-purpose nodes. Each node boundary requires serializing outputs, losing in-context information, and adding edge complexity.
| Bad (8 thin nodes) | Good (4 rich nodes) |
|---------------------|---------------------|
| parse-query | intake (client-facing) |
| search-sources | research (search + fetch + analyze) |
| fetch-content | review (client-facing) |
| evaluate-sources | report (write + deliver) |
| synthesize-findings | |
| write-report | |
| quality-check | |
| save-report | |
**Why fewer nodes are better:**
- The LLM retains full context of its work within a single node
- A research node that searches, fetches, and analyzes keeps all source material in its conversation history
- Fewer edges means simpler graph and fewer failure points
- Data tools (`save_data`/`load_data`) handle context window limits within a single node
### MCP Tools - Correct Usage
+1 -5
@@ -55,14 +55,10 @@ jobs:
- name: Install uv
uses: astral-sh/setup-uv@v4
- name: Install dependencies
- name: Install dependencies and run tests
run: |
cd core
uv sync
- name: Run tests
run: |
cd core
uv run pytest tests/ -v
test-tools:
-1
@@ -54,7 +54,6 @@ __pycache__/
*.egg-info/
.eggs/
*.egg
uv.lock
# Generated runtime data
core/data/
+2 -3
@@ -198,9 +198,8 @@ hive/ # Repository root
│ ├── quizzes/ # Developer quizzes
│ └── i18n/ # Translations
├── scripts/ # Build & utility scripts
│   ├── setup-python.sh # Python environment setup
│ └── setup.sh # Legacy setup script
├── scripts/ # Utility scripts
│   └── auto-close-duplicates.ts # GitHub duplicate issue closer
├── quickstart.sh # Interactive setup wizard
├── ENVIRONMENT_SETUP.md # Complete Python setup guide
+10 -52
@@ -21,42 +21,18 @@ This will:
- Fix package compatibility issues (openai + litellm)
- Verify all installations
## Quick Setup (Windows PowerShell)
## Windows Setup
Windows users can use the native PowerShell setup script.
Windows users should use **WSL (Windows Subsystem for Linux)** to set up and run agents.
Before running the script, allow script execution for the current session:
```powershell
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
```
Run setup from the project root:
```powershell
./scripts/setup-python.ps1
```
This will:
- Check Python version (requires 3.11+)
- Create a local `.venv` virtual environment
- Install the core framework package (`framework`)
- Install the tools package (`aden_tools`)
- Fix package compatibility issues (openai + litellm)
- Verify all installations
After setup, activate the virtual environment:
```powershell
.\.venv\Scripts\Activate.ps1
```
Set `PYTHONPATH` (required in every new PowerShell session):
```powershell
$env:PYTHONPATH="core;exports"
```
1. [Install WSL 2](https://learn.microsoft.com/en-us/windows/wsl/install) if you haven't already:
```powershell
wsl --install
```
2. Open your WSL terminal, clone the repo, and run the quickstart script:
```bash
./quickstart.sh
```
## Alpine Linux Setup
@@ -326,12 +302,6 @@ Or run the setup script:
./quickstart.sh
```
Windows:
```powershell
./scripts/setup-python.ps1
```
### "ModuleNotFoundError: No module named 'openai.\_models'"
**Cause:** Outdated `openai` package (0.27.x) incompatible with `litellm`
@@ -375,12 +345,6 @@ uv pip uninstall framework tools
./quickstart.sh
```
Windows:
```powershell
./scripts/setup-python.ps1
```
## Package Structure
The Hive framework consists of three Python packages:
@@ -479,12 +443,6 @@ This design allows agents in `exports/` to be:
./quickstart.sh
```
Windows:
```powershell
./scripts/setup-python.ps1
```
### 2. Build Agent (Claude Code)
```
-4
@@ -4,7 +4,6 @@
- **Added empty response retry logic** — LLM provider now detects empty responses (e.g. Gemini returning 200 with no content on rate limit) and retries with exponential backoff, preventing hallucinated output from the cleanup LLM
- **Added context-aware input compaction** — LLM nodes now estimate input token count before calling the model and progressively truncate the largest values if they exceed the context window budget
- **Increased rate limit retries to 10** with verbose `[retry]` and `[compaction]` logging that includes model name, finish reason, and attempt count
- **Updated setup scripts** — `scripts/setup-python.sh` now installs Playwright Chromium browser automatically for web scraping support
- **Interactive quickstart onboarding** — `quickstart.sh` rewritten as bee-themed interactive wizard that detects existing API keys (including Claude Code subscription), lets user pick ONE default LLM provider, and saves configuration to `~/.hive/configuration.json`
- **Fixed lint errors** across `hubspot_tool.py` (line length) and `agent_builder_server.py` (unused variable)
@@ -24,8 +23,6 @@
- `tools/src/aden_tools/tools/web_scrape_tool/README.md` — Updated docs
- `tools/pyproject.toml` — Added `playwright`, `playwright-stealth` deps
- `tools/Dockerfile` — Added `playwright install chromium --with-deps`
- `scripts/setup-python.sh` — Added Playwright Chromium browser install step
### LLM Reliability
- `core/framework/llm/litellm.py` — Empty response retry + max retries 10 + verbose logging
- `core/framework/graph/node.py` — Input compaction via `_compact_inputs()`, `_estimate_tokens()`, `_get_context_limit()`
@@ -41,7 +38,6 @@
## Test plan
- [ ] Run `make lint` — passes clean
- [ ] Run `./quickstart.sh` and verify interactive flow works, config saved to `~/.hive/configuration.json`
- [ ] Run `./scripts/setup-python.sh` and verify Playwright Chromium installs
- [ ] Run `pytest tests/tools/test_web_scrape_tool.py -v`
- [ ] Run agent against a JS-heavy site and verify `web_scrape` returns rendered content
- [ ] Set `HUBSPOT_ACCESS_TOKEN` and verify HubSpot tool CRUD operations work
+30
@@ -0,0 +1,30 @@
# TUI Text Selection and Copy Guide
## Keybindings
| Key | Action |
|---------------|-----------------------|
| `Tab` | Next panel |
| `Shift+Tab` | Previous panel |
| `Ctrl+S` | Save SVG screenshot |
| `Ctrl+O` | Command palette |
| `Q` | Quit |
## Panel Cycle Order
`Tab` cycles: **Log Pane → Graph View → Chat Input**
## Text Selection
Textual apps capture the mouse, so normal click-drag selection won't work by default. To select and copy text from any pane:
1. **Hold `Shift`** while clicking and dragging — this bypasses Textual's mouse capture and lets your terminal handle selection natively.
2. Copy with your terminal's shortcut (`Cmd+C` on macOS, `Ctrl+Shift+C` on most Linux terminals).
## Log Pane Scrolling
The log pane uses `auto_scroll=False`. New output only scrolls to the bottom when you are already at the bottom of the log. If you've scrolled up to read earlier output, it stays in place.
## Screenshots
`Ctrl+S` saves an SVG screenshot to the `screenshots/` directory with a timestamped filename. Open the SVG in any browser to view it.
+110 -84
@@ -144,19 +144,19 @@ class EventLoopNode(NodeProtocol):
1. Try to restore from durable state (crash recovery)
2. If no prior state, init from NodeSpec.system_prompt + input_keys
3. Loop: drain injection queue -> stream LLM -> execute tools
-> if client_facing + no tools: block for user input (inject_event)
-> if not client_facing or tools present: judge evaluates
-> if client_facing + no real tools: block for user input
-> judge evaluates (acceptance criteria)
(each add_* and set_output writes through to store immediately)
4. Publish events to EventBus at each stage
5. Write cursor after each iteration
6. Terminate when judge returns ACCEPT, shutdown signaled, or max iterations
7. Build output dict from OutputAccumulator
Client-facing blocking: When ``client_facing=True`` and the LLM produces
text without tool calls (a natural conversational turn), the node blocks
via ``_await_user_input()`` until ``inject_event()`` or ``signal_shutdown()``
is called. This separates blocking (node concern) from output evaluation
(judge concern).
Client-facing blocking: When ``client_facing=True`` and the LLM finishes
without real tool calls (stop_reason != tool_call), the node blocks via
``_await_user_input()`` until ``inject_event()`` or ``signal_shutdown()``
is called. After user input, the judge evaluates — the judge is the
sole mechanism for acceptance decisions.
Always returns NodeResult with retryable=False semantics. The executor
must NOT retry event loop nodes -- retry is handled internally by the
@@ -212,8 +212,10 @@ class EventLoopNode(NodeProtocol):
# 2. Restore or create new conversation + accumulator
conversation, accumulator, start_iteration = await self._restore(ctx)
if conversation is None:
system_prompt = ctx.node_spec.system_prompt or ""
conversation = NodeConversation(
system_prompt=ctx.node_spec.system_prompt or "",
system_prompt=system_prompt,
max_history_tokens=self._config.max_history_tokens,
output_keys=ctx.node_spec.output_keys or None,
store=self._conversation_store,
@@ -276,15 +278,20 @@ class EventLoopNode(NodeProtocol):
iteration,
len(conversation.messages),
)
assistant_text, tool_results_list, turn_tokens = await self._run_single_turn(
ctx, conversation, tools, iteration, accumulator
)
(
assistant_text,
real_tool_results,
outputs_set,
turn_tokens,
) = await self._run_single_turn(ctx, conversation, tools, iteration, accumulator)
logger.info(
"[%s] iter=%d: LLM done — text=%d chars, tool_calls=%d, tokens=%s, accumulator=%s",
"[%s] iter=%d: LLM done — text=%d chars, real_tools=%d, "
"outputs_set=%s, tokens=%s, accumulator=%s",
node_id,
iteration,
len(assistant_text),
len(tool_results_list),
len(real_tool_results),
outputs_set or "[]",
turn_tokens,
{k: ("set" if v is not None else "None") for k, v in accumulator.to_dict().items()},
)
@@ -300,6 +307,31 @@ class EventLoopNode(NodeProtocol):
if conversation.needs_compaction():
await self._compact_tiered(ctx, conversation, accumulator)
# 6e'''. Empty response guard — if the LLM returned nothing
# (no text, no real tools, no set_output) and all required
# outputs are already set, accept immediately. This prevents
# wasted iterations when the LLM has genuinely finished its
# work (e.g. after calling set_output in a previous turn).
truly_empty = not assistant_text and not real_tool_results and not outputs_set
if truly_empty and accumulator is not None:
missing = self._get_missing_output_keys(
accumulator, ctx.node_spec.output_keys, ctx.node_spec.nullable_output_keys
)
if not missing:
logger.info(
"[%s] iter=%d: empty response but all outputs set — accepting",
node_id,
iteration,
)
await self._publish_loop_completed(stream_id, node_id, iteration + 1)
latency_ms = int((time.time() - start_time) * 1000)
return NodeResult(
success=True,
output=accumulator.to_dict(),
tokens_used=total_input_tokens + total_output_tokens,
latency_ms=latency_ms,
)
# 6f. Stall detection
recent_responses.append(assistant_text)
if len(recent_responses) > self._config.stall_detection_threshold:
@@ -321,18 +353,17 @@ class EventLoopNode(NodeProtocol):
# 6g. Write cursor checkpoint
await self._write_cursor(ctx, conversation, accumulator, iteration)
# 6h. Client-facing input wait
logger.info(
"[%s] iter=%d: 6h check — client_facing=%s, tool_results=%d",
node_id,
iteration,
ctx.node_spec.client_facing,
len(tool_results_list),
)
if ctx.node_spec.client_facing and not tool_results_list:
# LLM finished speaking (no tool calls) on a client-facing node.
# This is a conversational turn boundary: block for user input
# instead of running the judge.
# 6h. Client-facing input blocking
#
# For client_facing nodes, block for user input whenever the
# LLM finishes without making real tool calls (i.e. the LLM's
# stop_reason is not tool_call). set_output is separated from
# real tools by _run_single_turn, so this correctly treats
# set_output-only turns as conversational boundaries.
#
# After user input, always fall through to judge evaluation
# (6i). The judge handles all acceptance decisions.
if ctx.node_spec.client_facing and not real_tool_results:
if self._shutdown:
await self._publish_loop_completed(stream_id, node_id, iteration + 1)
latency_ms = int((time.time() - start_time) * 1000)
@@ -347,7 +378,6 @@ class EventLoopNode(NodeProtocol):
got_input = await self._await_user_input(ctx)
logger.info("[%s] iter=%d: unblocked, got_input=%s", node_id, iteration, got_input)
if not got_input:
# Shutdown signaled during wait
await self._publish_loop_completed(stream_id, node_id, iteration + 1)
latency_ms = int((time.time() - start_time) * 1000)
return NodeResult(
@@ -357,46 +387,13 @@ class EventLoopNode(NodeProtocol):
latency_ms=latency_ms,
)
# Clear stall detection — user input resets the conversation
recent_responses.clear()
# For nodes with an explicit judge, fall through to judge
# evaluation so the LLM gets structured feedback about missing
# outputs (e.g. "Missing output keys: [...]"). Without this,
# the LLM may generate text like "Ready to proceed!" without
# ever calling set_output, and the judge feedback never reaches it.
#
# For nodes without a judge (HITL review/approval with all-
# nullable keys), keep conversing UNLESS the LLM has already
# set an output — in that case fall through to the implicit
# judge which will ACCEPT and terminate the node.
if self._judge is None:
has_outputs = accumulator and any(
v is not None for v in accumulator.to_dict().values()
)
if not has_outputs:
logger.info(
"[%s] iter=%d: no judge, no outputs, continuing",
node_id,
iteration,
)
continue
logger.info(
"[%s] iter=%d: no judge, outputs set — implicit judge",
node_id,
iteration,
)
else:
logger.info(
"[%s] iter=%d: has judge, falling through to 6i",
node_id,
iteration,
)
# Fall through to judge evaluation (6i)
# 6i. Judge evaluation
should_judge = (
(iteration + 1) % self._config.judge_every_n_turns == 0
or not tool_results_list # no tool calls = natural stop
or not real_tool_results # no real tool calls = natural stop
)
logger.info("[%s] iter=%d: 6i should_judge=%s", node_id, iteration, should_judge)
@@ -406,7 +403,7 @@ class EventLoopNode(NodeProtocol):
conversation,
accumulator,
assistant_text,
tool_results_list,
real_tool_results,
iteration,
)
fb_preview = (verdict.feedback or "")[:200]
@@ -526,16 +523,24 @@ class EventLoopNode(NodeProtocol):
tools: list[Tool],
iteration: int,
accumulator: OutputAccumulator,
) -> tuple[str, list[dict], dict[str, int]]:
) -> tuple[str, list[dict], list[str], dict[str, int]]:
"""Run a single LLM turn with streaming and tool execution.
Returns (assistant_text, tool_results, token_counts).
Returns (assistant_text, real_tool_results, outputs_set, token_counts).
``real_tool_results`` contains only results from actual tools (web_search,
etc.), NOT from the synthetic ``set_output`` tool. ``outputs_set`` lists
the output keys written via ``set_output`` during this turn. This
separation lets the caller treat set_output as a framework concern
rather than a tool-execution concern.
"""
stream_id = ctx.node_id
node_id = ctx.node_id
token_counts: dict[str, int] = {"input": 0, "output": 0}
tool_call_count = 0
final_text = ""
# Track output keys set via set_output across all inner iterations
outputs_set_this_turn: list[str] = []
# Inner tool loop: stream may produce tool calls requiring re-invocation
while True:
@@ -606,10 +611,10 @@ class EventLoopNode(NodeProtocol):
# If no tool calls, turn is complete
if not tool_calls:
return final_text, [], token_counts
return final_text, [], outputs_set_this_turn, token_counts
# Execute tool calls
tool_results: list[dict] = []
# Execute tool calls — separate real tools from set_output
real_tool_results: list[dict] = []
limit_hit = False
executed_in_batch = 0
for tc in tool_calls:
@@ -624,21 +629,21 @@ class EventLoopNode(NodeProtocol):
stream_id, node_id, tc.tool_use_id, tc.tool_name, tc.tool_input
)
# Handle set_output synthetic tool
logger.info(
"[%s] tool_call: %s(%s)",
node_id,
tc.tool_name,
json.dumps(tc.tool_input)[:200],
)
if tc.tool_name == "set_output":
# --- Framework-level set_output handling ---
result = self._handle_set_output(tc.tool_input, ctx.node_spec.output_keys)
result = ToolResult(
tool_use_id=tc.tool_use_id,
content=result.content,
is_error=result.is_error,
)
# Async write-through for set_output
if not result.is_error:
value = tc.tool_input["value"]
# Parse JSON strings into native types so downstream
@@ -652,26 +657,27 @@ class EventLoopNode(NodeProtocol):
except (json.JSONDecodeError, TypeError):
pass
await accumulator.set(tc.tool_input["key"], value)
outputs_set_this_turn.append(tc.tool_input["key"])
else:
# Execute real tool
# --- Real tool execution ---
result = await self._execute_tool(tc)
# Truncate large results to prevent context blowup
result = self._truncate_tool_result(result, tc.tool_name)
real_tool_results.append(
{
"tool_use_id": tc.tool_use_id,
"tool_name": tc.tool_name,
"content": result.content,
"is_error": result.is_error,
}
)
# Record tool result in conversation (write-through)
# Record tool result in conversation (both real and set_output
# go into the conversation for LLM context continuity)
await conversation.add_tool_result(
tool_use_id=tc.tool_use_id,
content=result.content,
is_error=result.is_error,
)
tool_results.append(
{
"tool_use_id": tc.tool_use_id,
"tool_name": tc.tool_name,
"content": result.content,
"is_error": result.is_error,
}
)
# Publish tool call completed
await self._publish_tool_completed(
@@ -708,7 +714,9 @@ class EventLoopNode(NodeProtocol):
content=discard_msg,
is_error=True,
)
tool_results.append(
# Discarded calls go into real_tool_results so the
# caller sees they were attempted (for judge context).
real_tool_results.append(
{
"tool_use_id": tc.tool_use_id,
"tool_name": tc.tool_name,
@@ -716,9 +724,24 @@ class EventLoopNode(NodeProtocol):
"is_error": True,
}
)
# Prune old tool results NOW to prevent context bloat on the
# next turn. The char-based token estimator underestimates
# actual API tokens, so the standard compaction check in the
# outer loop may not trigger in time.
protect = max(2000, self._config.max_history_tokens // 12)
pruned = await conversation.prune_old_tool_results(
protect_tokens=protect,
min_prune_tokens=max(1000, protect // 3),
)
if pruned > 0:
logger.info(
"Post-limit pruning: cleared %d old tool results (budget: %d)",
pruned,
self._config.max_history_tokens,
)
# Limit hit — return from this turn so the judge can
# evaluate instead of looping back for another stream.
return final_text, tool_results, token_counts
return final_text, real_tool_results, outputs_set_this_turn, token_counts
# --- Mid-turn pruning: prevent context blowup within a single turn ---
if conversation.usage_ratio() >= 0.6:
@@ -1025,7 +1048,8 @@ class EventLoopNode(NodeProtocol):
truncated = (
f"[Result from {tool_name}: {len(result.content)} chars — "
f"too large for context, saved to '{filename}'. "
f"Use load_data('{filename}') to read the full result.]\n\n"
f"Use load_data(filename='{filename}', data_dir='{spill_dir}') "
f"to read the full result.]\n\n"
f"Preview:\n{preview}"
)
logger.info(
@@ -1244,9 +1268,11 @@ class EventLoopNode(NodeProtocol):
# 5. Spillover files hint
if self._config.spillover_dir:
spill = self._config.spillover_dir
parts.append(
"NOTE: Large tool results were saved to files. "
"Use load_data('<filename>') to read them."
f"Use load_data(filename='<filename>', data_dir='{spill}') "
"to read them."
)
# 6. Tool call history (prevent re-calling tools)
+145 -47
@@ -14,6 +14,7 @@ import logging
import warnings
from collections.abc import Callable
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any
from framework.graph.edge import EdgeCondition, EdgeSpec, GraphSpec
@@ -128,6 +129,9 @@ class GraphExecutor:
cleansing_config: CleansingConfig | None = None,
enable_parallel_execution: bool = True,
parallel_config: ParallelExecutionConfig | None = None,
event_bus: Any | None = None,
stream_id: str = "",
storage_path: str | Path | None = None,
):
"""
Initialize the executor.
@@ -142,6 +146,9 @@ class GraphExecutor:
cleansing_config: Optional output cleansing configuration
enable_parallel_execution: Enable parallel fan-out execution (default True)
parallel_config: Configuration for parallel execution behavior
event_bus: Optional event bus for emitting node lifecycle events
stream_id: Stream ID for event correlation
storage_path: Optional base path for conversation persistence
"""
self.runtime = runtime
self.llm = llm
@@ -151,6 +158,9 @@ class GraphExecutor:
self.approval_callback = approval_callback
self.validator = OutputValidator()
self.logger = logging.getLogger(__name__)
self._event_bus = event_bus
self._stream_id = stream_id
self._storage_path = Path(storage_path) if storage_path else None
# Initialize output cleaner
self.cleansing_config = cleansing_config or CleansingConfig()
@@ -357,13 +367,33 @@ class GraphExecutor:
description=f"Validation errors for {current_node_id}: {validation_errors}",
)
# Emit node-started event (skip event_loop nodes — they emit their own)
if self._event_bus and node_spec.node_type != "event_loop":
await self._event_bus.emit_node_loop_started(
stream_id=self._stream_id, node_id=current_node_id
)
# Execute node
self.logger.info(" Executing...")
result = await node_impl.execute(ctx)
# Emit node-completed event (skip event_loop nodes)
if self._event_bus and node_spec.node_type != "event_loop":
await self._event_bus.emit_node_loop_completed(
stream_id=self._stream_id, node_id=current_node_id, iterations=1
)
if result.success:
# Validate output before accepting it
if result.output and node_spec.output_keys:
# Validate output before accepting it.
# Skip for event_loop nodes — their judge system is
# the sole acceptance mechanism (see WP-8). Empty
# strings and other flexible outputs are legitimate
# for LLM-driven nodes that already passed the judge.
if (
result.output
and node_spec.output_keys
and node_spec.node_type != "event_loop"
):
validation = self.validator.validate_all(
output=result.output,
expected_keys=node_spec.output_keys,
@@ -441,48 +471,66 @@ class GraphExecutor:
_is_retry = True
continue
else:
# Max retries exceeded - fail the execution
# Max retries exceeded - check for failure handlers
self.logger.error(
f" ✗ Max retries ({max_retries}) exceeded for node {current_node_id}"
)
self.runtime.report_problem(
severity="critical",
description=(
f"Node {current_node_id} failed after "
f"{max_retries} attempts: {result.error}"
),
)
self.runtime.end_run(
success=False,
output_data=memory.read_all(),
narrative=(
f"Failed at {node_spec.name} after "
f"{max_retries} retries: {result.error}"
),
# Check if there's an ON_FAILURE edge to follow
next_node = self._follow_edges(
graph=graph,
goal=goal,
current_node_id=current_node_id,
current_node_spec=node_spec,
result=result, # result.success=False triggers ON_FAILURE
memory=memory,
)
# Calculate quality metrics
total_retries_count = sum(node_retry_counts.values())
nodes_failed = list(node_retry_counts.keys())
if next_node:
# Found a failure handler - route to it
self.logger.info(f" → Routing to failure handler: {next_node}")
current_node_id = next_node
continue # Continue execution with handler
else:
# No failure handler - terminate execution
self.runtime.report_problem(
severity="critical",
description=(
f"Node {current_node_id} failed after "
f"{max_retries} attempts: {result.error}"
),
)
self.runtime.end_run(
success=False,
output_data=memory.read_all(),
narrative=(
f"Failed at {node_spec.name} after "
f"{max_retries} retries: {result.error}"
),
)
return ExecutionResult(
success=False,
error=(
f"Node '{node_spec.name}' failed after "
f"{max_retries} attempts: {result.error}"
),
output=memory.read_all(),
steps_executed=steps,
total_tokens=total_tokens,
total_latency_ms=total_latency,
path=path,
total_retries=total_retries_count,
nodes_with_failures=nodes_failed,
retry_details=dict(node_retry_counts),
had_partial_failures=len(nodes_failed) > 0,
execution_quality="failed",
node_visit_counts=dict(node_visit_counts),
)
# Calculate quality metrics
total_retries_count = sum(node_retry_counts.values())
nodes_failed = list(node_retry_counts.keys())
return ExecutionResult(
success=False,
error=(
f"Node '{node_spec.name}' failed after "
f"{max_retries} attempts: {result.error}"
),
output=memory.read_all(),
steps_executed=steps,
total_tokens=total_tokens,
total_latency_ms=total_latency,
path=path,
total_retries=total_retries_count,
nodes_with_failures=nodes_failed,
retry_details=dict(node_retry_counts),
had_partial_failures=len(nodes_failed) > 0,
execution_quality="failed",
node_visit_counts=dict(node_visit_counts),
)
# Check if we just executed a pause node - if so, save state and return
# This must happen BEFORE determining next node, since pause nodes may have no edges
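With this change a graph can declare an explicit failure handler. A hedged sketch of what such an edge could look like; `EdgeCondition.ON_FAILURE` semantics come from the code above, but the exact `EdgeSpec` constructor fields are assumptions:

```python
from framework.graph.edge import EdgeCondition, EdgeSpec

# Taken when "fetch-data" exhausts its retries (result.success=False).
# Field names beyond source/target/condition are illustrative.
fallback_edge = EdgeSpec(
    id="fetch-to-failure-handler",
    source="fetch-data",
    target="handle-fetch-failure",
    condition=EdgeCondition.ON_FAILURE,
)
```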
@@ -781,11 +829,43 @@ class GraphExecutor:
)
if node_spec.node_type == "event_loop":
# Event loop nodes must be pre-registered (like function nodes)
raise RuntimeError(
f"EventLoopNode '{node_spec.id}' not found in registry. "
"Register it with executor.register_node() before execution."
# Auto-create EventLoopNode with sensible defaults.
# Custom configs can still be pre-registered via node_registry.
from framework.graph.event_loop_node import EventLoopNode, LoopConfig
# Create a FileConversationStore if a storage path is available
conv_store = None
if self._storage_path:
from framework.storage.conversation_store import FileConversationStore
store_path = self._storage_path / "conversations" / node_spec.id
conv_store = FileConversationStore(base_path=store_path)
# Auto-configure spillover directory for large tool results.
# When a tool result exceeds max_tool_result_chars, the full
# content is written to spillover_dir and the agent gets a
# truncated preview with instructions to use load_data().
spillover = None
if self._storage_path:
spillover = str(self._storage_path / "data")
node = EventLoopNode(
event_bus=self._event_bus,
judge=None, # implicit judge: accept when output_keys are filled
config=LoopConfig(
max_iterations=100 if node_spec.client_facing else 50,
max_tool_calls_per_turn=10,
stall_detection_threshold=3,
max_history_tokens=32000,
max_tool_result_chars=3_000,
spillover_dir=spillover,
),
tool_executor=self.tool_executor,
conversation_store=conv_store,
)
# Cache so inject_event() is reachable for client-facing input
self.node_registry[node_spec.id] = node
return node
# Should never reach here due to validation above
raise RuntimeError(f"Unhandled node type: {node_spec.node_type}")
@@ -814,9 +894,12 @@ class GraphExecutor:
source_node_name=current_node_spec.name if current_node_spec else current_node_id,
target_node_name=target_node_spec.name if target_node_spec else edge.target,
):
# Validate and clean output before mapping inputs
# Validate and clean output before mapping inputs.
# Use full memory state (not just result.output) because
# target input_keys may come from earlier nodes in the
# graph, not only from the immediate source node.
if self.cleansing_config.enabled and target_node_spec:
output_to_validate = result.output
output_to_validate = memory.read_all()
validation = self.output_cleaner.validate_output(
output=output_to_validate,
@@ -1012,10 +1095,13 @@ class GraphExecutor:
branch.status = "running"
try:
# Validate and clean output before mapping inputs (same as _follow_edges)
# Validate and clean output before mapping inputs (same as _follow_edges).
# Use full memory state since target input_keys may come
# from earlier nodes, not just the immediate source.
if self.cleansing_config.enabled and node_spec:
mem_snapshot = memory.read_all()
validation = self.output_cleaner.validate_output(
output=source_result.output,
output=mem_snapshot,
source_node_id=source_node_spec.id if source_node_spec else "unknown",
target_node_spec=node_spec,
)
@@ -1026,7 +1112,7 @@ class GraphExecutor:
f"{branch.node_id}: {validation.errors}"
)
cleaned_output = self.output_cleaner.clean_output(
output=source_result.output,
output=mem_snapshot,
source_node_id=source_node_spec.id if source_node_spec else "unknown",
target_node_spec=node_spec,
validation_errors=validation.errors,
@@ -1049,12 +1135,24 @@ class GraphExecutor:
ctx = self._build_context(node_spec, memory, goal, mapped, graph.max_tokens)
node_impl = self._get_node_implementation(node_spec, graph.cleanup_llm_model)
# Emit node-started event (skip event_loop nodes)
if self._event_bus and node_spec.node_type != "event_loop":
await self._event_bus.emit_node_loop_started(
stream_id=self._stream_id, node_id=branch.node_id
)
self.logger.info(
f" ▶ Branch {node_spec.name}: executing (attempt {attempt + 1})"
)
result = await node_impl.execute(ctx)
last_result = result
# Emit node-completed event (skip event_loop nodes)
if self._event_bus and node_spec.node_type != "event_loop":
await self._event_bus.emit_node_loop_completed(
stream_id=self._stream_id, node_id=branch.node_id, iterations=1
)
if result.success:
# Write outputs to shared memory using async write
for key, value in result.output.items():
+4 -1
@@ -144,8 +144,11 @@ class OutputCleaner:
errors = []
warnings = []
# Check 1: Required input keys present
# Check 1: Required input keys present (skip nullable keys)
nullable = set(getattr(target_node_spec, "nullable_output_keys", None) or [])
for key in target_node_spec.input_keys:
if key in nullable:
continue
if key not in output:
errors.append(f"Missing required key: '{key}'")
continue
+10 -6
@@ -572,17 +572,21 @@ class LiteLLMProvider(LLMProvider):
# and we skip the retry path — nothing was yielded in vain.)
has_content = accumulated_text or tool_calls_acc
if not has_content and attempt < RATE_LIMIT_MAX_RETRIES:
# If the conversation ends with an assistant message,
# an empty stream is expected (nothing new to say).
# Don't retry — just flush whatever we have.
# If the conversation ends with an assistant or tool
# message, an empty stream is expected — the LLM has
# nothing new to say. Don't burn retries on this;
# let the caller (EventLoopNode) decide what to do.
# Typical case: client_facing node where the LLM set
# all outputs via set_output tool calls, and the tool
# results are the last messages.
last_role = next(
(m["role"] for m in reversed(full_messages) if m.get("role") != "system"),
None,
)
if last_role == "assistant":
if last_role in ("assistant", "tool"):
logger.debug(
"[stream] Empty response after assistant message — "
"expected, not retrying."
"[stream] Empty response after %s message — expected, not retrying.",
last_role,
)
for event in tail_events:
yield event
+56 -15
@@ -1105,17 +1105,30 @@ def validate_graph() -> str:
errors.append(f"Unreachable nodes: {unreachable}")
# === CONTEXT FLOW VALIDATION ===
# Build dependency map (node_id -> list of nodes it depends on)
# Build dependency maps — separate forward edges from feedback edges.
# Feedback edges (priority < 0) create cycles; they must not block the
# topological sort. Context they carry arrives on *revisits*, not on
# the first execution of a node.
feedback_edge_ids = {e.id for e in session.edges if e.priority < 0}
forward_dependencies: dict[str, list[str]] = {node.id: [] for node in session.nodes}
feedback_sources: dict[str, list[str]] = {node.id: [] for node in session.nodes}
# Combined map kept for error-message generation (all deps)
dependencies: dict[str, list[str]] = {node.id: [] for node in session.nodes}
for edge in session.edges:
if edge.target in dependencies:
dependencies[edge.target].append(edge.source)
if edge.target not in forward_dependencies:
continue
dependencies[edge.target].append(edge.source)
if edge.id in feedback_edge_ids:
feedback_sources[edge.target].append(edge.source)
else:
forward_dependencies[edge.target].append(edge.source)
# Build output map (node_id -> keys it produces)
node_outputs: dict[str, set[str]] = {node.id: set(node.output_keys) for node in session.nodes}
# Compute available context for each node (what keys it can read)
# Using topological order
# Using topological order on the forward-edge DAG
available_context: dict[str, set[str]] = {}
computed = set()
nodes_by_id = {n.id: n for n in session.nodes}
@@ -1125,7 +1138,8 @@ def validate_graph() -> str:
# Entry nodes can only read from initial context
initial_context_keys: set[str] = set()
# Compute in topological order
# Compute in topological order (forward edges only — feedback edges
# don't block, since their context arrives on revisits)
remaining = {n.id for n in session.nodes}
max_iterations = len(session.nodes) * 2
@@ -1134,18 +1148,23 @@ def validate_graph() -> str:
break
for node_id in list(remaining):
deps = dependencies.get(node_id, [])
fwd_deps = forward_dependencies.get(node_id, [])
# Can compute if all dependencies are computed (or no dependencies)
if all(d in computed for d in deps):
# Collect outputs from all dependencies
# Can compute if all FORWARD dependencies are computed
if all(d in computed for d in fwd_deps):
# Collect outputs from all forward dependencies
available = set(initial_context_keys)
for dep_id in deps:
# Add outputs from dependency
for dep_id in fwd_deps:
available.update(node_outputs.get(dep_id, set()))
# Also add what was available to the dependency (transitive)
available.update(available_context.get(dep_id, set()))
# Also include context from already-computed feedback
# sources (bonus, not blocking)
for fb_src in feedback_sources.get(node_id, []):
if fb_src in computed:
available.update(node_outputs.get(fb_src, set()))
available.update(available_context.get(fb_src, set()))
available_context[node_id] = available
computed.add(node_id)
remaining.remove(node_id)
@@ -1155,15 +1174,37 @@ def validate_graph() -> str:
context_errors = []
context_warnings = []
missing_inputs: dict[str, list[str]] = {}
feedback_only_inputs: dict[str, list[str]] = {}
for node in session.nodes:
available = available_context.get(node.id, set())
for input_key in node.input_keys:
if input_key not in available:
if node.id not in missing_inputs:
missing_inputs[node.id] = []
missing_inputs[node.id].append(input_key)
# Check if this input is provided by a feedback source
fb_provides = set()
for fb_src in feedback_sources.get(node.id, []):
fb_provides.update(node_outputs.get(fb_src, set()))
fb_provides.update(available_context.get(fb_src, set()))
if input_key in fb_provides:
# Input arrives via feedback edge — warn, don't error
if node.id not in feedback_only_inputs:
feedback_only_inputs[node.id] = []
feedback_only_inputs[node.id].append(input_key)
else:
if node.id not in missing_inputs:
missing_inputs[node.id] = []
missing_inputs[node.id].append(input_key)
# Warn about feedback-only inputs (available on revisits, not first run)
for node_id, fb_keys in feedback_only_inputs.items():
fb_srcs = feedback_sources.get(node_id, [])
context_warnings.append(
f"Node '{node_id}' input(s) {fb_keys} are only provided via "
f"feedback edge(s) from {fb_srcs}. These will be available on "
f"revisits but not on the first execution."
)
# Generate helpful error messages
for node_id, missing in missing_inputs.items():
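To make the forward-only pass concrete, here is a self-contained didactic reimplementation of the availability computation described above; it is not the tool's actual code, just the same idea on plain dicts (initial-context seeding is omitted):

```python
def compute_available(nodes, edges, outputs):
    # edges: (source, target, priority); priority < 0 marks feedback.
    forward = {n: [] for n in nodes}
    feedback = {n: [] for n in nodes}
    for src, dst, priority in edges:
        (feedback if priority < 0 else forward)[dst].append(src)

    available, computed, remaining = {}, set(), set(nodes)
    while remaining:
        progressed = False
        for node in list(remaining):
            # Only FORWARD deps gate computation; feedback never blocks.
            if all(dep in computed for dep in forward[node]):
                avail = set()
                for dep in forward[node]:
                    avail |= outputs[dep] | available[dep]
                # Feedback context is a bonus if its source is computed.
                for src in feedback[node]:
                    if src in computed:
                        avail |= outputs[src] | available[src]
                available[node] = avail
                computed.add(node)
                remaining.discard(node)
                progressed = True
        if not progressed:
            break  # a genuine forward cycle remains
    return available

# writer -> reviewer (forward); reviewer -> writer (feedback, priority -1)
nodes = ["writer", "reviewer"]
edges = [("writer", "reviewer", 1), ("reviewer", "writer", -1)]
outputs = {"writer": {"draft"}, "reviewer": {"revision_notes"}}
print(compute_available(nodes, edges, outputs))
# writer gets set(), reviewer gets {'draft'}; a writer input of
# "revision_notes" is a revisit-only warning, not a hard error.
```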
+85 -28
@@ -56,6 +56,18 @@ def register_commands(subparsers: argparse._SubParsersAction) -> None:
action="store_true",
help="Show detailed execution logs (steps, LLM calls, etc.)",
)
run_parser.add_argument(
"--tui",
action="store_true",
help="Launch interactive terminal dashboard",
)
run_parser.add_argument(
"--model",
"-m",
type=str,
default=None,
help="LLM model to use (any LiteLLM-compatible name)",
)
run_parser.set_defaults(func=cmd_run)
# info command
@@ -205,38 +217,83 @@ def cmd_run(args: argparse.Namespace) -> int:
print(f"Error reading input file: {e}", file=sys.stderr)
return 1
# Load and run agent
try:
runner = AgentRunner.load(
args.agent_path,
mock_mode=args.mock,
model=getattr(args, "model", "claude-haiku-4-5-20251001"),
)
except FileNotFoundError as e:
print(f"Error: {e}", file=sys.stderr)
return 1
# Run the agent (with TUI or standard)
if getattr(args, "tui", False):
from framework.tui.app import AdenTUI
# Auto-inject user_id if the agent expects it but it's not provided
entry_input_keys = runner.graph.nodes[0].input_keys if runner.graph.nodes else []
if "user_id" in entry_input_keys and context.get("user_id") is None:
import os
async def run_with_tui():
try:
# Load runner inside the async loop to ensure strict loop affinity
# (only one load — avoids spawning duplicate MCP subprocesses)
try:
runner = AgentRunner.load(
args.agent_path,
mock_mode=args.mock,
model=args.model,
enable_tui=True,
)
except Exception as e:
print(f"Error loading agent: {e}")
return
context["user_id"] = os.environ.get("USER", "default_user")
# Force setup inside the loop
if runner._agent_runtime is None:
runner._setup()
if not args.quiet:
info = runner.info()
print(f"Agent: {info.name}")
print(f"Goal: {info.goal_name}")
print(f"Steps: {info.node_count}")
print(f"Input: {json.dumps(context)}")
print()
print("=" * 60)
print("Executing agent...")
print("=" * 60)
print()
# Start runtime before TUI so it's ready for user input
if runner._agent_runtime and not runner._agent_runtime.is_running:
await runner._agent_runtime.start()
# Run the agent
result = asyncio.run(runner.run(context))
app = AdenTUI(runner._agent_runtime)
# TUI handles execution via ChatRepl — user submits input,
# ChatRepl calls runtime.trigger_and_wait(). No auto-launch.
await app.run_async()
except Exception as e:
import traceback
traceback.print_exc()
print(f"TUI error: {e}")
await runner.cleanup_async()
return None
asyncio.run(run_with_tui())
print("TUI session ended.")
return 0
else:
# Standard execution — load runner here (not shared with TUI path)
try:
runner = AgentRunner.load(
args.agent_path,
mock_mode=args.mock,
model=args.model,
enable_tui=False,
)
except FileNotFoundError as e:
print(f"Error: {e}", file=sys.stderr)
return 1
# Auto-inject user_id if the agent expects it but it's not provided
entry_input_keys = runner.graph.nodes[0].input_keys if runner.graph.nodes else []
if "user_id" in entry_input_keys and context.get("user_id") is None:
import os
context["user_id"] = os.environ.get("USER", "default_user")
if not args.quiet:
info = runner.info()
print(f"Agent: {info.name}")
print(f"Goal: {info.goal_name}")
print(f"Steps: {info.node_count}")
print(f"Input: {json.dumps(context)}")
print()
print("=" * 60)
print("Executing agent...")
print("=" * 60)
print()
result = asyncio.run(runner.run(context))
# Format output
output = {
+9
@@ -362,6 +362,15 @@ class MCPClient:
# Call tool using persistent session
result = await self._session.call_tool(tool_name, arguments=arguments)
# Check for server-side errors (validation failures, tool exceptions, etc.)
if getattr(result, "isError", False):
error_text = ""
if result.content:
content_item = result.content[0]
if hasattr(content_item, "text"):
error_text = content_item.text
raise RuntimeError(f"MCP tool '{tool_name}' failed: {error_text}")
# Extract content
if result.content:
# MCP returns content as a list of content items
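A hedged usage sketch of the new behavior; the client's public method name is assumed here to be `call_tool`:

```python
import logging

logger = logging.getLogger(__name__)

async def search(client):
    try:
        # Server-side validation failures and tool exceptions now raise
        # instead of coming back as ordinary content.
        return await client.call_tool("web_search", {"query": "aden framework"})
    except RuntimeError as exc:
        logger.warning("MCP tool failed: %s", exc)
        return None
```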
+212 -21
@@ -28,6 +28,33 @@ logger = logging.getLogger(__name__)
# Configuration paths
HIVE_CONFIG_FILE = Path.home() / ".hive" / "configuration.json"
def _ensure_credential_key_env() -> None:
"""Load HIVE_CREDENTIAL_KEY from shell config if not already in environment.
The setup-credentials skill writes the encryption key to ~/.zshrc or ~/.bashrc.
If the user hasn't sourced their config in the current shell, this reads it
directly so the runner (and any MCP subprocesses it spawns) can unlock the
encrypted credential store.
Only HIVE_CREDENTIAL_KEY is loaded this way; all other secrets (API keys, etc.)
come from the credential store itself.
"""
if os.environ.get("HIVE_CREDENTIAL_KEY"):
return
try:
from aden_tools.credentials.shell_config import check_env_var_in_shell_config
found, value = check_env_var_in_shell_config("HIVE_CREDENTIAL_KEY")
if found and value:
os.environ["HIVE_CREDENTIAL_KEY"] = value
logger.debug("Loaded HIVE_CREDENTIAL_KEY from shell config")
except ImportError:
pass
CLAUDE_CREDENTIALS_FILE = Path.home() / ".claude" / ".credentials.json"
@@ -236,6 +263,15 @@ class AgentRunner:
result = await runner.run({"lead_id": "123"})
"""
@staticmethod
def _resolve_default_model() -> str:
"""Resolve the default model from ~/.hive/configuration.json."""
config = get_hive_config()
llm = config.get("llm", {})
if llm.get("provider") and llm.get("model"):
return f"{llm['provider']}/{llm['model']}"
return "anthropic/claude-sonnet-4-20250514"
def __init__(
self,
agent_path: Path,
@@ -243,7 +279,8 @@ class AgentRunner:
goal: Goal,
mock_mode: bool = False,
storage_path: Path | None = None,
model: str = "cerebras/zai-glm-4.7",
model: str | None = None,
enable_tui: bool = False,
):
"""
Initialize the runner (use AgentRunner.load() instead).
@@ -254,14 +291,15 @@ class AgentRunner:
goal: Loaded Goal object
mock_mode: If True, use mock LLM responses
storage_path: Path for runtime storage (defaults to temp)
model: Model to use - any LiteLLM-compatible model name
(e.g., "claude-sonnet-4-20250514", "gpt-4o-mini", "gemini/gemini-pro")
model: Model to use (reads from agent config or ~/.hive/configuration.json if None)
enable_tui: If True, forces use of AgentRuntime with EventBus
"""
self.agent_path = agent_path
self.graph = graph
self.goal = goal
self.mock_mode = mock_mode
self.model = model
self.model = model or self._resolve_default_model()
self.enable_tui = enable_tui
# Set up storage
if storage_path:
@@ -275,6 +313,10 @@ class AgentRunner:
self._storage_path = default_storage
self._temp_dir = None
# Load HIVE_CREDENTIAL_KEY from shell config if not in env.
# Must happen before MCP subprocesses are spawned so they inherit it.
_ensure_credential_key_env()
# Initialize components
self._tool_registry = ToolRegistry()
self._runtime: Runtime | None = None
@@ -296,32 +338,121 @@ class AgentRunner:
if mcp_config_path.exists():
self._load_mcp_servers_from_config(mcp_config_path)
@staticmethod
def _import_agent_module(agent_path: Path):
"""Import an agent package from its directory path.
Tries package import first (works when exports/ is on sys.path,
which cli.py:_configure_paths() ensures). Falls back to direct
file import of agent.py via importlib.util.
"""
import importlib
package_name = agent_path.name
# Try importing as a package (works when exports/ is on sys.path)
try:
return importlib.import_module(package_name)
except ImportError:
pass
# Fallback: import agent.py directly via file path
import importlib.util
agent_py = agent_path / "agent.py"
if not agent_py.exists():
raise FileNotFoundError(
f"No importable agent found at {agent_path}. "
f"Expected a Python package with agent.py."
)
spec = importlib.util.spec_from_file_location(
f"{package_name}.agent",
agent_py,
submodule_search_locations=[str(agent_path)],
)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
return module
@classmethod
def load(
cls,
agent_path: str | Path,
mock_mode: bool = False,
storage_path: Path | None = None,
model: str = "cerebras/zai-glm-4.7",
model: str | None = None,
enable_tui: bool = False,
) -> "AgentRunner":
"""
Load an agent from an export folder.
Imports the agent's Python package and reads module-level variables
(goal, nodes, edges, etc.) to build a GraphSpec. Falls back to
agent.json if no Python module is found.
Args:
agent_path: Path to agent folder (containing agent.json)
agent_path: Path to agent folder
mock_mode: If True, use mock LLM responses
storage_path: Path for runtime storage (defaults to temp)
model: LLM model to use (any LiteLLM-compatible model name)
storage_path: Path for runtime storage (defaults to ~/.hive/storage/{name})
model: LLM model to use (reads from agent's default_config if None)
enable_tui: If True, forces use of AgentRuntime with EventBus
Returns:
AgentRunner instance ready to run
"""
agent_path = Path(agent_path)
# Load agent.json
# Try loading from Python module first (code-based agents)
agent_py = agent_path / "agent.py"
if agent_py.exists():
agent_module = cls._import_agent_module(agent_path)
goal = getattr(agent_module, "goal", None)
nodes = getattr(agent_module, "nodes", None)
edges = getattr(agent_module, "edges", None)
if goal is None or nodes is None or edges is None:
raise ValueError(
f"Agent at {agent_path} must define 'goal', 'nodes', and 'edges' "
f"in agent.py (or __init__.py)"
)
# Read model and max_tokens from agent's config if not explicitly provided
agent_config = getattr(agent_module, "default_config", None)
if model is None:
if agent_config and hasattr(agent_config, "model"):
model = agent_config.model
max_tokens = getattr(agent_config, "max_tokens", 1024) if agent_config else 1024
# Build GraphSpec from module-level variables
graph = GraphSpec(
id=f"{agent_path.name}-graph",
goal_id=goal.id,
version="1.0.0",
entry_node=getattr(agent_module, "entry_node", nodes[0].id),
entry_points=getattr(agent_module, "entry_points", {}),
terminal_nodes=getattr(agent_module, "terminal_nodes", []),
pause_nodes=getattr(agent_module, "pause_nodes", []),
nodes=nodes,
edges=edges,
max_tokens=max_tokens,
)
return cls(
agent_path=agent_path,
graph=graph,
goal=goal,
mock_mode=mock_mode,
storage_path=storage_path,
model=model,
enable_tui=enable_tui,
)
# Fallback: load from agent.json (legacy JSON-based agents)
agent_json_path = agent_path / "agent.json"
if not agent_json_path.exists():
raise FileNotFoundError(f"agent.json not found in {agent_path}")
raise FileNotFoundError(f"No agent.py or agent.json found in {agent_path}")
with open(agent_json_path) as f:
graph, goal = load_agent_export(f.read())
@@ -333,6 +464,7 @@ class AgentRunner:
mock_mode=mock_mode,
storage_path=storage_path,
model=model,
enable_tui=enable_tui,
)
def register_tool(
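A hedged usage sketch of the new code-based loading path; the export folder path and the import location of `AgentRunner` are illustrative:

```python
import asyncio

# model=None triggers the agent default_config / ~/.hive/configuration.json
# fallback described above.
runner = AgentRunner.load("exports/deep_research_agent", model=None)
result = asyncio.run(runner.run({"user_id": "demo"}))
```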
@@ -471,16 +603,25 @@ class AgentRunner:
api_key_env = self._get_api_key_env_var(self.model)
if api_key_env and os.environ.get(api_key_env):
self._llm = LiteLLMProvider(model=self.model)
elif api_key_env:
print(f"Warning: {api_key_env} not set. LLM calls will fail.")
print(f"Set it with: export {api_key_env}=your-api-key")
else:
# Fall back to credential store
api_key = self._get_api_key_from_credential_store()
if api_key:
self._llm = LiteLLMProvider(model=self.model, api_key=api_key)
# Set env var so downstream code (e.g. cleanup LLM in
# node._extract_json) can also find it
if api_key_env:
os.environ[api_key_env] = api_key
elif api_key_env:
print(f"Warning: {api_key_env} not set. LLM calls will fail.")
print(f"Set it with: export {api_key_env}=your-api-key")
# Get tools for executor/runtime
tools = list(self._tool_registry.get_tools().values())
tool_executor = self._tool_registry.get_executor()
if self._uses_async_entry_points:
# Multi-entry-point mode: use AgentRuntime
if self._uses_async_entry_points or self.enable_tui:
# Multi-entry-point mode or TUI mode: use AgentRuntime
self._setup_agent_runtime(tools, tool_executor)
else:
# Single-entry-point mode: use legacy GraphExecutor
@@ -518,6 +659,33 @@ class AgentRunner:
# Default: assume OpenAI-compatible
return "OPENAI_API_KEY"
def _get_api_key_from_credential_store(self) -> str | None:
"""Get the LLM API key from the encrypted credential store.
Maps model name to credential store ID (e.g. "anthropic/..." -> "anthropic")
and retrieves the key via CredentialStore.get().
"""
if not os.environ.get("HIVE_CREDENTIAL_KEY"):
return None
# Map model prefix to credential store ID
model_lower = self.model.lower()
cred_id = None
if model_lower.startswith("anthropic/") or model_lower.startswith("claude"):
cred_id = "anthropic"
# Add more mappings as providers are added to LLM_CREDENTIALS
if cred_id is None:
return None
try:
from framework.credentials import CredentialStore
store = CredentialStore.with_encrypted_storage()
return store.get(cred_id)
except Exception:
return None
def _setup_legacy_executor(self, tools: list, tool_executor: Callable | None) -> None:
"""Set up legacy single-entry-point execution using GraphExecutor."""
# Create runtime
@@ -549,6 +717,19 @@ class AgentRunner:
)
entry_points.append(ep)
# If TUI enabled but no entry points (single-entry agent), create default
if not entry_points and self.enable_tui and self.graph.entry_node:
logger.info("Creating default entry point for TUI")
entry_points.append(
EntryPointSpec(
id="default",
name="Default",
entry_node=self.graph.entry_node,
trigger_type="manual",
isolation_level="shared",
)
)
# Create AgentRuntime with all entry points
self._agent_runtime = create_agent_runtime(
graph=self.graph,
@@ -599,7 +780,7 @@ class AgentRunner:
error=error_msg,
)
if self._uses_async_entry_points:
if self._uses_async_entry_points or self.enable_tui:
# Multi-entry-point mode: use AgentRuntime
return await self._run_with_agent_runtime(
input_data=input_data or {},
@@ -891,15 +1072,25 @@ class AgentRunner:
EnvVarStorage,
)
# Build env mapping for fallback
# Build env mapping for credential lookup
env_mapping = {
(spec.credential_id or name): spec.env_var
for name, spec in CREDENTIAL_SPECS.items()
}
storage = CompositeStorage(
primary=EncryptedFileStorage(),
fallbacks=[EnvVarStorage(env_mapping=env_mapping)],
)
# Only use EncryptedFileStorage if the encryption key is configured;
# otherwise just check env vars (avoids generating a throwaway key)
storages: list = [EnvVarStorage(env_mapping=env_mapping)]
if os.environ.get("HIVE_CREDENTIAL_KEY"):
storages.insert(0, EncryptedFileStorage())
if len(storages) == 1:
storage = storages[0]
else:
storage = CompositeStorage(
primary=storages[0],
fallbacks=storages[1:],
)
store = CredentialStore(storage=storage)
# Build reverse mappings
+21 -2
@@ -33,6 +33,11 @@ class ToolRegistry:
4. Manually registered tools
"""
# Framework-internal context keys injected into tool calls.
# Stripped from LLM-facing schemas (the LLM doesn't know these values)
# and auto-injected at call time for tools that accept them.
CONTEXT_PARAMS = frozenset({"workspace_id", "agent_id", "session_id"})
def __init__(self):
self._tools: dict[str, RegisteredTool] = {}
self._mcp_clients: list[Any] = [] # List of MCPClient instances
@@ -275,7 +280,16 @@ class ToolRegistry:
return
base_dir = config_path.parent
for server_config in config.get("servers", []):
# Support both formats:
# {"servers": [{"name": "x", ...}]} (list format)
# {"server-name": {"transport": ...}, ...} (dict format)
server_list = config.get("servers", [])
if not server_list and "servers" not in config:
# Treat top-level keys as server names
server_list = [{"name": name, **cfg} for name, cfg in config.items()]
for server_config in server_list:
cwd = server_config.get("cwd")
if cwd and not Path(cwd).is_absolute():
server_config["cwd"] = str((base_dir / cwd).resolve())
@@ -333,7 +347,7 @@ class ToolRegistry:
# Register each tool
count = 0
for mcp_tool in client.list_tools():
# Convert MCP tool to framework Tool
# Convert MCP tool to framework Tool (strips context params from LLM schema)
tool = self._convert_mcp_tool_to_framework_tool(mcp_tool)
# Create executor that calls the MCP server
@@ -395,6 +409,11 @@ class ToolRegistry:
properties = input_schema.get("properties", {})
required = input_schema.get("required", [])
# Strip framework-internal context params from LLM-facing schema.
# The LLM can't know these values; they're auto-injected at call time.
properties = {k: v for k, v in properties.items() if k not in self.CONTEXT_PARAMS}
required = [r for r in required if r not in self.CONTEXT_PARAMS]
# Convert to framework Tool format
tool = Tool(
name=mcp_tool.name,
+19
@@ -296,6 +296,25 @@ class AgentRuntime:
raise ValueError(f"Entry point '{entry_point_id}' not found")
return await stream.wait_for_completion(exec_id, timeout)
async def inject_input(self, node_id: str, content: str) -> bool:
"""Inject user input into a running client-facing node.
Routes input to the EventLoopNode identified by ``node_id``
across all active streams. Used by the TUI ChatRepl to deliver
user responses during client-facing node execution.
Args:
node_id: The node currently waiting for input
content: The user's input text
Returns:
True if input was delivered, False if no matching node found
"""
for stream in self._streams.values():
if await stream.inject_input(node_id, content):
return True
return False
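A hedged sketch of what the TUI ChatRepl does with this API when a client-facing node is waiting; the node id is illustrative:

```python
delivered = await runtime.inject_input("collect-preferences", "Option B, please")
if not delivered:
    print("no client-facing node is currently waiting for input")
```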
async def get_goal_progress(self) -> dict[str, Any]:
"""
Evaluate goal progress across all streams.
+28 -1
@@ -153,6 +153,7 @@ class ExecutionStream:
# Execution tracking
self._active_executions: dict[str, ExecutionContext] = {}
self._execution_tasks: dict[str, asyncio.Task] = {}
self._active_executors: dict[str, GraphExecutor] = {}
self._execution_results: OrderedDict[str, ExecutionResult] = OrderedDict()
self._execution_result_times: dict[str, float] = {}
self._completion_events: dict[str, asyncio.Event] = {}
@@ -237,6 +238,21 @@ class ExecutionStream:
)
)
async def inject_input(self, node_id: str, content: str) -> bool:
"""Inject user input into a running client-facing EventLoopNode.
Searches active executors for a node matching ``node_id`` and calls
its ``inject_event()`` method to unblock ``_await_user_input()``.
Returns True if input was delivered, False otherwise.
"""
for executor in self._active_executors.values():
node = executor.node_registry.get(node_id)
if node is not None and hasattr(node, "inject_event"):
await node.inject_event(content)
return True
return False
async def execute(
self,
input_data: dict[str, Any],
@@ -314,13 +330,21 @@ class ExecutionStream:
# Create runtime adapter for this execution
runtime_adapter = StreamRuntimeAdapter(self._runtime, execution_id)
# Create executor for this execution
# Create executor for this execution.
# Scope storage by execution_id so each execution gets
# fresh conversations and spillover directories.
exec_storage = self._storage.base_path / "sessions" / execution_id
executor = GraphExecutor(
runtime=runtime_adapter,
llm=self._llm,
tools=self._tools,
tool_executor=self._tool_executor,
event_bus=self._event_bus,
stream_id=self.stream_id,
storage_path=exec_storage,
)
# Track executor so inject_input() can reach EventLoopNode instances
self._active_executors[execution_id] = executor
# Create modified graph with entry point
# We need to override the entry_node to use our entry point
@@ -334,6 +358,9 @@ class ExecutionStream:
session_state=ctx.session_state,
)
# Clean up executor reference
self._active_executors.pop(execution_id, None)
# Store result with retention
self._record_execution_result(execution_id, result)
+518
@@ -0,0 +1,518 @@
import logging
import time
from textual.app import App, ComposeResult
from textual.binding import Binding
from textual.containers import Container, Horizontal, Vertical
from textual.widgets import Footer, Label
from framework.runtime.agent_runtime import AgentRuntime
from framework.runtime.event_bus import AgentEvent, EventType
from framework.tui.widgets.chat_repl import ChatRepl
from framework.tui.widgets.graph_view import GraphOverview
from framework.tui.widgets.log_pane import LogPane
class StatusBar(Container):
"""Live status bar showing agent execution state."""
DEFAULT_CSS = """
StatusBar {
dock: top;
height: 1;
background: $panel;
color: $text;
padding: 0 1;
}
StatusBar > Label {
width: 100%;
}
"""
def __init__(self, graph_id: str = ""):
super().__init__()
self._graph_id = graph_id
self._state = "idle"
self._active_node: str | None = None
self._node_detail: str = ""
self._start_time: float | None = None
self._final_elapsed: float | None = None
def compose(self) -> ComposeResult:
yield Label(id="status-content")
def on_mount(self) -> None:
self._refresh()
self.set_interval(1.0, self._refresh)
def _format_elapsed(self, seconds: float) -> str:
total = int(seconds)
hours, remainder = divmod(total, 3600)
mins, secs = divmod(remainder, 60)
if hours:
return f"{hours}:{mins:02d}:{secs:02d}"
return f"{mins}:{secs:02d}"
def _refresh(self) -> None:
parts: list[str] = []
if self._graph_id:
parts.append(f"[bold]{self._graph_id}[/bold]")
if self._state == "idle":
parts.append("[dim]○ idle[/dim]")
elif self._state == "running":
parts.append("[bold green]● running[/bold green]")
elif self._state == "completed":
parts.append("[green]✓ done[/green]")
elif self._state == "failed":
parts.append("[bold red]✗ failed[/bold red]")
if self._active_node:
node_str = f"[cyan]{self._active_node}[/cyan]"
if self._node_detail:
node_str += f" [dim]({self._node_detail})[/dim]"
parts.append(node_str)
if self._state == "running" and self._start_time:
parts.append(f"[dim]{self._format_elapsed(time.time() - self._start_time)}[/dim]")
elif self._final_elapsed is not None:
parts.append(f"[dim]{self._format_elapsed(self._final_elapsed)}[/dim]")
try:
label = self.query_one("#status-content", Label)
label.update("".join(parts))
except Exception:
pass
def set_graph_id(self, graph_id: str) -> None:
self._graph_id = graph_id
self._refresh()
def set_running(self, entry_node: str = "") -> None:
self._state = "running"
self._active_node = entry_node or None
self._node_detail = ""
self._start_time = time.time()
self._final_elapsed = None
self._refresh()
def set_completed(self) -> None:
self._state = "completed"
if self._start_time:
self._final_elapsed = time.time() - self._start_time
self._active_node = None
self._node_detail = ""
self._start_time = None
self._refresh()
def set_failed(self, error: str = "") -> None:
self._state = "failed"
if self._start_time:
self._final_elapsed = time.time() - self._start_time
self._node_detail = error[:40] if error else ""
self._start_time = None
self._refresh()
def set_active_node(self, node_id: str, detail: str = "") -> None:
self._active_node = node_id
self._node_detail = detail
self._refresh()
def set_node_detail(self, detail: str) -> None:
self._node_detail = detail
self._refresh()
class AdenTUI(App):
TITLE = "Aden TUI Dashboard"
COMMAND_PALETTE_BINDING = "ctrl+o"
CSS = """
Screen {
layout: vertical;
background: $surface;
}
#left-pane {
width: 60%;
height: 100%;
layout: vertical;
background: $surface;
}
GraphOverview {
height: 40%;
background: $panel;
padding: 0;
}
LogPane {
height: 60%;
background: $surface;
padding: 0;
margin-bottom: 1;
}
ChatRepl {
width: 40%;
height: 100%;
background: $panel;
border-left: tall $primary;
padding: 0;
}
#chat-history {
height: 1fr;
width: 100%;
background: $surface;
border: none;
scrollbar-background: $panel;
scrollbar-color: $primary;
}
RichLog {
background: $surface;
border: none;
scrollbar-background: $panel;
scrollbar-color: $primary;
}
Input {
background: $surface;
border: tall $primary;
margin-top: 1;
}
Input:focus {
border: tall $accent;
}
StatusBar {
background: $panel;
color: $text;
height: 1;
padding: 0 1;
}
Footer {
background: $panel;
color: $text-muted;
}
"""
BINDINGS = [
Binding("q", "quit", "Quit"),
Binding("ctrl+s", "screenshot", "Screenshot (SVG)", show=True, priority=True),
Binding("tab", "focus_next", "Next Panel", show=True),
Binding("shift+tab", "focus_previous", "Previous Panel", show=False),
]
def __init__(self, runtime: AgentRuntime):
super().__init__()
self.runtime = runtime
self.log_pane = LogPane()
self.graph_view = GraphOverview(runtime)
self.chat_repl = ChatRepl(runtime)
self.status_bar = StatusBar(graph_id=runtime.graph.id)
self.is_ready = False
def compose(self) -> ComposeResult:
yield self.status_bar
yield Horizontal(
Vertical(
self.log_pane,
self.graph_view,
id="left-pane",
),
self.chat_repl,
)
yield Footer()
async def on_mount(self) -> None:
"""Called when app starts."""
self.title = "Aden TUI Dashboard"
# Add logging setup
self._setup_logging_queue()
# Set ready immediately so _poll_logs can process messages
self.is_ready = True
# Add event subscription with delay to ensure TUI is fully initialized
self.call_later(self._init_runtime_connection)
# Delay initial log messages until layout is fully rendered
def write_initial_logs():
logging.info("TUI Dashboard initialized successfully")
logging.info("Waiting for agent execution to start...")
# Wait for layout to be fully rendered before writing logs
self.set_timer(0.2, write_initial_logs)
def _setup_logging_queue(self) -> None:
"""Setup a thread-safe queue for logs."""
try:
import queue
from logging.handlers import QueueHandler
self.log_queue = queue.Queue()
self.queue_handler = QueueHandler(self.log_queue)
self.queue_handler.setLevel(logging.INFO)
# Get root logger
root_logger = logging.getLogger()
# Remove ALL existing handlers to prevent stdout output
# This is critical - StreamHandlers cause text to appear in header
for handler in root_logger.handlers[:]:
root_logger.removeHandler(handler)
# Add ONLY our queue handler
root_logger.addHandler(self.queue_handler)
root_logger.setLevel(logging.INFO)
# Suppress LiteLLM logging completely
litellm_logger = logging.getLogger("LiteLLM")
litellm_logger.setLevel(logging.CRITICAL) # Only show critical errors
litellm_logger.propagate = False # Don't propagate to root logger
# Start polling
self.set_interval(0.1, self._poll_logs)
except Exception:
pass
def _poll_logs(self) -> None:
"""Poll the log queue and update UI."""
if not self.is_ready:
return
try:
while not self.log_queue.empty():
record = self.log_queue.get_nowait()
# Filter out framework/library logs
if record.name.startswith(("textual", "LiteLLM", "litellm")):
continue
self.log_pane.write_python_log(record)
except Exception:
pass
_EVENT_TYPES = [
EventType.LLM_TEXT_DELTA,
EventType.CLIENT_OUTPUT_DELTA,
EventType.TOOL_CALL_STARTED,
EventType.TOOL_CALL_COMPLETED,
EventType.EXECUTION_STARTED,
EventType.EXECUTION_COMPLETED,
EventType.EXECUTION_FAILED,
EventType.NODE_LOOP_STARTED,
EventType.NODE_LOOP_ITERATION,
EventType.NODE_LOOP_COMPLETED,
EventType.CLIENT_INPUT_REQUESTED,
EventType.NODE_STALLED,
EventType.GOAL_PROGRESS,
EventType.GOAL_ACHIEVED,
EventType.CONSTRAINT_VIOLATION,
EventType.STATE_CHANGED,
EventType.NODE_INPUT_BLOCKED,
]
_LOG_PANE_EVENTS = frozenset(_EVENT_TYPES) - {
EventType.LLM_TEXT_DELTA,
EventType.CLIENT_OUTPUT_DELTA,
}
async def _init_runtime_connection(self) -> None:
"""Subscribe to runtime events with an async handler."""
try:
self._subscription_id = self.runtime.subscribe_to_events(
event_types=self._EVENT_TYPES,
handler=self._handle_event,
)
except Exception:
pass
async def _handle_event(self, event: AgentEvent) -> None:
"""Called from the agent thread — bridge to Textual's main thread."""
try:
self.call_from_thread(self._route_event, event)
except Exception:
pass
def _route_event(self, event: AgentEvent) -> None:
"""Route incoming events to widgets. Runs on Textual's main thread."""
if not self.is_ready:
return
try:
et = event.type
# --- Chat REPL events ---
if et in (EventType.LLM_TEXT_DELTA, EventType.CLIENT_OUTPUT_DELTA):
self.chat_repl.handle_text_delta(
event.data.get("content", ""),
event.data.get("snapshot", ""),
)
elif et == EventType.TOOL_CALL_STARTED:
self.chat_repl.handle_tool_started(
event.data.get("tool_name", "unknown"),
event.data.get("tool_input", {}),
)
elif et == EventType.TOOL_CALL_COMPLETED:
self.chat_repl.handle_tool_completed(
event.data.get("tool_name", "unknown"),
event.data.get("result", ""),
event.data.get("is_error", False),
)
elif et == EventType.EXECUTION_COMPLETED:
self.chat_repl.handle_execution_completed(event.data.get("output", {}))
elif et == EventType.EXECUTION_FAILED:
self.chat_repl.handle_execution_failed(event.data.get("error", "Unknown error"))
elif et == EventType.CLIENT_INPUT_REQUESTED:
self.chat_repl.handle_input_requested(
event.node_id or event.data.get("node_id", ""),
)
# --- Graph view events ---
if et in (
EventType.EXECUTION_STARTED,
EventType.EXECUTION_COMPLETED,
EventType.EXECUTION_FAILED,
):
self.graph_view.update_execution(event)
if et == EventType.NODE_LOOP_STARTED:
self.graph_view.handle_node_loop_started(event.node_id or "")
elif et == EventType.NODE_LOOP_ITERATION:
self.graph_view.handle_node_loop_iteration(
event.node_id or "",
event.data.get("iteration", 0),
)
elif et == EventType.NODE_LOOP_COMPLETED:
self.graph_view.handle_node_loop_completed(event.node_id or "")
elif et == EventType.NODE_STALLED:
self.graph_view.handle_stalled(
event.node_id or "",
event.data.get("reason", ""),
)
if et == EventType.TOOL_CALL_STARTED:
self.graph_view.handle_tool_call(
event.node_id or "",
event.data.get("tool_name", "unknown"),
started=True,
)
elif et == EventType.TOOL_CALL_COMPLETED:
self.graph_view.handle_tool_call(
event.node_id or "",
event.data.get("tool_name", "unknown"),
started=False,
)
# --- Status bar events ---
if et == EventType.EXECUTION_STARTED:
entry_node = event.data.get("entry_node") or (
self.runtime.graph.entry_node if self.runtime else ""
)
self.status_bar.set_running(entry_node)
elif et == EventType.EXECUTION_COMPLETED:
self.status_bar.set_completed()
elif et == EventType.EXECUTION_FAILED:
self.status_bar.set_failed(event.data.get("error", ""))
elif et == EventType.NODE_LOOP_STARTED:
self.status_bar.set_active_node(event.node_id or "", "thinking...")
elif et == EventType.NODE_LOOP_ITERATION:
self.status_bar.set_node_detail(f"step {event.data.get('iteration', '?')}")
elif et == EventType.TOOL_CALL_STARTED:
self.status_bar.set_node_detail(f"{event.data.get('tool_name', '')}...")
elif et == EventType.TOOL_CALL_COMPLETED:
self.status_bar.set_node_detail("thinking...")
elif et == EventType.NODE_STALLED:
self.status_bar.set_node_detail(f"stalled: {event.data.get('reason', '')}")
# --- Log pane events ---
if et in self._LOG_PANE_EVENTS:
self.log_pane.write_event(event)
except Exception:
pass
def save_screenshot(self, filename: str | None = None) -> str:
"""Save a screenshot of the current screen as SVG (viewable in browsers).
Args:
filename: Optional filename for the screenshot. If None, generates timestamp-based name.
Returns:
Path to the saved SVG file.
"""
from datetime import datetime
from pathlib import Path
# Create screenshots directory
screenshots_dir = Path("screenshots")
screenshots_dir.mkdir(exist_ok=True)
# Generate filename if not provided
if filename is None:
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
filename = f"tui_screenshot_{timestamp}.svg"
# Ensure .svg extension
if not filename.endswith(".svg"):
filename += ".svg"
# Full path
filepath = screenshots_dir / filename
# Temporarily hide borders for cleaner screenshot
chat_widget = self.query_one(ChatRepl)
original_chat_border = chat_widget.styles.border_left
chat_widget.styles.border_left = ("none", "transparent")
# Hide all Input widget borders
input_widgets = self.query("Input")
original_input_borders = []
for input_widget in input_widgets:
original_input_borders.append(input_widget.styles.border)
input_widget.styles.border = ("none", "transparent")
try:
# Get SVG data from Textual and save it
svg_data = self.export_screenshot()
filepath.write_text(svg_data, encoding="utf-8")
finally:
# Restore the original borders
chat_widget.styles.border_left = original_chat_border
for i, input_widget in enumerate(input_widgets):
input_widget.styles.border = original_input_borders[i]
return str(filepath)
def action_screenshot(self) -> None:
"""Take a screenshot (bound to Ctrl+S)."""
try:
filepath = self.save_screenshot()
self.notify(
f"Screenshot saved: {filepath} (SVG - open in browser)",
severity="information",
timeout=5,
)
except Exception as e:
self.notify(f"Screenshot failed: {e}", severity="error", timeout=5)
async def on_unmount(self) -> None:
"""Cleanup on app shutdown."""
self.is_ready = False
try:
if hasattr(self, "_subscription_id"):
self.runtime.unsubscribe_from_events(self._subscription_id)
except Exception:
pass
try:
if hasattr(self, "queue_handler"):
logging.getLogger().removeHandler(self.queue_handler)
except Exception:
pass
@@ -0,0 +1,303 @@
"""
Chat / REPL Widget - Uses RichLog for append-only, selection-safe display.
Streaming display approach:
- The processing-indicator Label is used as a live status bar during streaming
(Label.update() replaces text in-place, unlike RichLog which is append-only).
- On EXECUTION_COMPLETED, the final output is written to RichLog as permanent history.
- Tool events are written directly to RichLog as discrete status lines.
Client-facing input:
- When a client_facing=True EventLoopNode emits CLIENT_INPUT_REQUESTED, the
ChatRepl transitions to "waiting for input" state: input is re-enabled and
subsequent submissions are routed to runtime.inject_input() instead of
starting a new execution.
"""
import asyncio
import threading
from typing import Any
from textual.app import ComposeResult
from textual.containers import Vertical
from textual.widgets import Input, Label, RichLog
from framework.runtime.agent_runtime import AgentRuntime
class ChatRepl(Vertical):
"""Widget for interactive chat/REPL."""
DEFAULT_CSS = """
ChatRepl {
width: 100%;
height: 100%;
layout: vertical;
}
ChatRepl > RichLog {
width: 100%;
height: 1fr;
background: $surface;
border: none;
scrollbar-background: $panel;
scrollbar-color: $primary;
}
ChatRepl > #processing-indicator {
width: 100%;
height: 1;
background: $primary 20%;
color: $text;
text-style: bold;
display: none;
}
ChatRepl > Input {
width: 100%;
height: auto;
dock: bottom;
background: $surface;
border: tall $primary;
margin-top: 1;
}
ChatRepl > Input:focus {
border: tall $accent;
}
"""
def __init__(self, runtime: AgentRuntime):
super().__init__()
self.runtime = runtime
self._current_exec_id: str | None = None
self._streaming_snapshot: str = ""
self._waiting_for_input: bool = False
self._input_node_id: str | None = None
# Dedicated event loop for agent execution.
# Keeps blocking runtime code (LLM calls, MCP tools) off
# the Textual event loop so the UI stays responsive.
self._agent_loop = asyncio.new_event_loop()
self._agent_thread = threading.Thread(
target=self._agent_loop.run_forever,
daemon=True,
name="agent-execution",
)
self._agent_thread.start()
def compose(self) -> ComposeResult:
yield RichLog(id="chat-history", highlight=True, markup=True, auto_scroll=False, wrap=True)
yield Label("Agent is processing...", id="processing-indicator")
yield Input(placeholder="Enter input for agent...", id="chat-input")
def _write_history(self, content: str) -> None:
"""Write to chat history, only auto-scrolling if user is at the bottom."""
history = self.query_one("#chat-history", RichLog)
was_at_bottom = history.is_vertical_scroll_end
history.write(content)
if was_at_bottom:
history.scroll_end(animate=False)
def on_mount(self) -> None:
"""Add welcome message when widget mounts."""
history = self.query_one("#chat-history", RichLog)
history.write("[bold cyan]Chat REPL Ready[/bold cyan] — Type your input below\n")
async def on_input_submitted(self, message: Input.Submitted) -> None:
"""Handle input submission — either start new execution or inject input."""
user_input = message.value.strip()
if not user_input:
return
# Client-facing input: route to the waiting node
if self._waiting_for_input and self._input_node_id:
self._write_history(f"[bold green]You:[/bold green] {user_input}")
message.input.value = ""
# Disable input while agent processes the response
chat_input = self.query_one("#chat-input", Input)
chat_input.disabled = True
chat_input.placeholder = "Enter input for agent..."
self._waiting_for_input = False
indicator = self.query_one("#processing-indicator", Label)
indicator.update("Thinking...")
node_id = self._input_node_id
self._input_node_id = None
try:
future = asyncio.run_coroutine_threadsafe(
self.runtime.inject_input(node_id, user_input),
self._agent_loop,
)
await asyncio.wrap_future(future)
except Exception as e:
self._write_history(f"[bold red]Error delivering input:[/bold red] {e}")
return
# Double-submit guard: reject input while an execution is in-flight
if self._current_exec_id is not None:
self._write_history("[dim]Agent is still running — please wait.[/dim]")
return
indicator = self.query_one("#processing-indicator", Label)
# Append user message and clear input
self._write_history(f"[bold green]You:[/bold green] {user_input}")
message.input.value = ""
try:
# Get entry point
entry_points = self.runtime.get_entry_points()
if not entry_points:
self._write_history("[bold red]Error:[/bold red] No entry points")
return
# Determine the input key from the entry node
entry_point = entry_points[0]
entry_node = self.runtime.graph.get_node(entry_point.entry_node)
if entry_node and entry_node.input_keys:
input_key = entry_node.input_keys[0]
else:
input_key = "input"
# Reset streaming state
self._streaming_snapshot = ""
# Show processing indicator
indicator.update("Thinking...")
indicator.display = True
# Disable input while the agent is working
chat_input = self.query_one("#chat-input", Input)
chat_input.disabled = True
# Submit execution to the dedicated agent loop so blocking
# runtime code (LLM, MCP tools) never touches Textual's loop.
# trigger() returns immediately with an exec_id; the heavy
# execution task runs entirely on the agent thread.
future = asyncio.run_coroutine_threadsafe(
self.runtime.trigger(
entry_point_id=entry_point.id,
input_data={input_key: user_input},
),
self._agent_loop,
)
# wrap_future lets us await without blocking Textual's loop
self._current_exec_id = await asyncio.wrap_future(future)
except Exception as e:
indicator.display = False
self._current_exec_id = None
# Re-enable input on error
chat_input = self.query_one("#chat-input", Input)
chat_input.disabled = False
self._write_history(f"[bold red]Error:[/bold red] {e}")
# -- Event handlers called by app.py _handle_event --
def handle_text_delta(self, content: str, snapshot: str) -> None:
"""Handle a streaming text token from the LLM."""
self._streaming_snapshot = snapshot
# Show a truncated live preview in the indicator label
indicator = self.query_one("#processing-indicator", Label)
preview = snapshot[-80:] if len(snapshot) > 80 else snapshot
# Replace newlines for single-line display
preview = preview.replace("\n", " ")
indicator.update(
f"Thinking: ...{preview}" if len(snapshot) > 80 else f"Thinking: {preview}"
)
def handle_tool_started(self, tool_name: str, tool_input: dict[str, Any]) -> None:
"""Handle a tool call starting."""
# Update indicator to show tool activity
indicator = self.query_one("#processing-indicator", Label)
indicator.update(f"Using tool: {tool_name}...")
# Write a discrete status line to history
self._write_history(f"[dim]Tool: {tool_name}[/dim]")
def handle_tool_completed(self, tool_name: str, result: str, is_error: bool) -> None:
"""Handle a tool call completing."""
result_str = str(result)
preview = result_str[:200] + "..." if len(result_str) > 200 else result_str
preview = preview.replace("\n", " ")
if is_error:
self._write_history(f"[dim red]Tool {tool_name} error: {preview}[/dim red]")
else:
self._write_history(f"[dim]Tool {tool_name} result: {preview}[/dim]")
# Restore thinking indicator
indicator = self.query_one("#processing-indicator", Label)
indicator.update("Thinking...")
def handle_execution_completed(self, output: dict[str, Any]) -> None:
"""Handle execution finishing successfully."""
indicator = self.query_one("#processing-indicator", Label)
indicator.display = False
# Write the final streaming snapshot to permanent history (if any)
if self._streaming_snapshot:
self._write_history(f"[bold blue]Agent:[/bold blue] {self._streaming_snapshot}")
else:
output_str = str(output.get("output_string", output))
self._write_history(f"[bold blue]Agent:[/bold blue] {output_str}")
self._write_history("") # separator
self._current_exec_id = None
self._streaming_snapshot = ""
self._waiting_for_input = False
self._input_node_id = None
# Re-enable input
chat_input = self.query_one("#chat-input", Input)
chat_input.disabled = False
chat_input.placeholder = "Enter input for agent..."
chat_input.focus()
def handle_execution_failed(self, error: str) -> None:
"""Handle execution failing."""
indicator = self.query_one("#processing-indicator", Label)
indicator.display = False
self._write_history(f"[bold red]Error:[/bold red] {error}")
self._write_history("") # separator
self._current_exec_id = None
self._streaming_snapshot = ""
self._waiting_for_input = False
self._input_node_id = None
# Re-enable input
chat_input = self.query_one("#chat-input", Input)
chat_input.disabled = False
chat_input.placeholder = "Enter input for agent..."
chat_input.focus()
def handle_input_requested(self, node_id: str) -> None:
"""Handle a client-facing node requesting user input.
Transitions to 'waiting for input' state: flushes the current
streaming snapshot to history, re-enables the input widget,
and sets a flag so the next submission routes to inject_input().
"""
# Flush accumulated streaming text as agent output
if self._streaming_snapshot:
self._write_history(f"[bold blue]Agent:[/bold blue] {self._streaming_snapshot}")
self._streaming_snapshot = ""
self._waiting_for_input = True
self._input_node_id = node_id or None
indicator = self.query_one("#processing-indicator", Label)
indicator.update("Waiting for your input...")
chat_input = self.query_one("#chat-input", Input)
chat_input.disabled = False
chat_input.placeholder = "Type your response..."
chat_input.focus()
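The dedicated agent loop is the load-bearing concurrency choice in this widget: blocking runtime work runs on a second event loop in a daemon thread, and the UI awaits the result via `wrap_future`. A minimal standalone sketch of the same pattern, with illustrative names (`slow_agent_call` and `ui_side` are not framework APIs):

```python
import asyncio
import threading

# Dedicated loop on a daemon thread, mirroring ChatRepl._agent_loop.
agent_loop = asyncio.new_event_loop()
threading.Thread(target=agent_loop.run_forever, daemon=True).start()

async def slow_agent_call() -> str:
    await asyncio.sleep(1)  # stand-in for a blocking LLM / MCP tool call
    return "done"

async def ui_side() -> None:
    # Schedule on the agent loop; returns a concurrent.futures.Future.
    future = asyncio.run_coroutine_threadsafe(slow_agent_call(), agent_loop)
    # wrap_future bridges it back so this loop can await without blocking.
    print(await asyncio.wrap_future(future))

asyncio.run(ui_side())
```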
@@ -0,0 +1,194 @@
"""
Graph/Tree Overview Widget - Displays real agent graph structure.
"""
from textual.app import ComposeResult
from textual.containers import Vertical
from textual.widgets import RichLog
from framework.runtime.agent_runtime import AgentRuntime
from framework.runtime.event_bus import EventType
class GraphOverview(Vertical):
"""Widget to display Agent execution graph/tree with real data."""
DEFAULT_CSS = """
GraphOverview {
width: 100%;
height: 100%;
background: $panel;
}
GraphOverview > RichLog {
width: 100%;
height: 100%;
background: $panel;
border: none;
scrollbar-background: $surface;
scrollbar-color: $primary;
}
"""
def __init__(self, runtime: AgentRuntime):
super().__init__()
self.runtime = runtime
self.active_node: str | None = None
self.execution_path: list[str] = []
# Per-node status strings shown next to the node in the graph display.
# e.g. {"planner": "thinking...", "searcher": "web_search..."}
self._node_status: dict[str, str] = {}
def compose(self) -> ComposeResult:
# Use RichLog for formatted output
yield RichLog(id="graph-display", highlight=True, markup=True)
def on_mount(self) -> None:
"""Display initial graph structure."""
self._display_graph()
def _topo_order(self) -> list[str]:
"""BFS from entry_node following edges."""
graph = self.runtime.graph
visited: list[str] = []
seen: set[str] = set()
queue = [graph.entry_node]
while queue:
nid = queue.pop(0)
if nid in seen:
continue
seen.add(nid)
visited.append(nid)
for edge in graph.get_outgoing_edges(nid):
if edge.target not in seen:
queue.append(edge.target)
# Append orphan nodes not reachable from entry
for node in graph.nodes:
if node.id not in seen:
visited.append(node.id)
return visited
def _render_node_line(self, node_id: str) -> str:
"""Render a single node with status symbol and optional status text."""
graph = self.runtime.graph
is_terminal = node_id in (graph.terminal_nodes or [])
is_active = node_id == self.active_node
is_done = node_id in self.execution_path and not is_active
status = self._node_status.get(node_id, "")
if is_active:
sym = "[bold green]●[/bold green]"
elif is_done:
sym = "[dim]✓[/dim]"
elif is_terminal:
sym = "[yellow]■[/yellow]"
else:
sym = ""
if is_active:
name = f"[bold green]{node_id}[/bold green]"
elif is_done:
name = f"[dim]{node_id}[/dim]"
else:
name = node_id
suffix = f" [italic]{status}[/italic]" if status else ""
return f" {sym} {name}{suffix}"
def _render_edges(self, node_id: str) -> list[str]:
"""Render edge connectors from this node to its targets."""
edges = self.runtime.graph.get_outgoing_edges(node_id)
if not edges:
return []
        if len(edges) == 1:
            # Single outgoing edge: a plain vertical connector
            return ["   │", "   ▼"]
# Fan-out: show branches
lines: list[str] = []
for i, edge in enumerate(edges):
connector = "" if i == len(edges) - 1 else ""
cond = ""
if edge.condition.value not in ("always", "on_success"):
cond = f" [dim]({edge.condition.value})[/dim]"
lines.append(f" {connector}──▶ {edge.target}{cond}")
return lines
def _display_graph(self) -> None:
"""Display the graph as an ASCII DAG with edge connectors."""
display = self.query_one("#graph-display", RichLog)
display.clear()
graph = self.runtime.graph
display.write(f"[bold cyan]Agent Graph:[/bold cyan] {graph.id}\n")
# Render each node in topological order with edges
ordered = self._topo_order()
for node_id in ordered:
display.write(self._render_node_line(node_id))
for edge_line in self._render_edges(node_id):
display.write(edge_line)
# Execution path footer
if self.execution_path:
display.write("")
display.write(f"[dim]Path:[/dim] {''.join(self.execution_path[-5:])}")
def update_active_node(self, node_id: str) -> None:
"""Update the currently active node."""
self.active_node = node_id
if node_id not in self.execution_path:
self.execution_path.append(node_id)
self._display_graph()
def update_execution(self, event) -> None:
"""Update the displayed node status based on execution lifecycle events."""
if event.type == EventType.EXECUTION_STARTED:
self._node_status.clear()
self.execution_path.clear()
entry_node = event.data.get("entry_node") or (
self.runtime.graph.entry_node if self.runtime else None
)
if entry_node:
self.update_active_node(entry_node)
elif event.type == EventType.EXECUTION_COMPLETED:
self.active_node = None
self._node_status.clear()
self._display_graph()
elif event.type == EventType.EXECUTION_FAILED:
error = event.data.get("error", "Unknown error")
if self.active_node:
self._node_status[self.active_node] = f"[red]FAILED: {error}[/red]"
self.active_node = None
self._display_graph()
# -- Event handlers called by app.py _handle_event --
def handle_node_loop_started(self, node_id: str) -> None:
"""A node's event loop has started."""
self._node_status[node_id] = "thinking..."
self.update_active_node(node_id)
def handle_node_loop_iteration(self, node_id: str, iteration: int) -> None:
"""A node advanced to a new loop iteration."""
self._node_status[node_id] = f"step {iteration}"
self._display_graph()
def handle_node_loop_completed(self, node_id: str) -> None:
"""A node's event loop completed."""
self._node_status.pop(node_id, None)
self._display_graph()
def handle_tool_call(self, node_id: str, tool_name: str, *, started: bool) -> None:
"""Show tool activity next to the active node."""
if started:
self._node_status[node_id] = f"{tool_name}..."
else:
# Restore to generic thinking status after tool completes
self._node_status[node_id] = "thinking..."
self._display_graph()
def handle_stalled(self, node_id: str, reason: str) -> None:
"""Highlight a stalled node."""
self._node_status[node_id] = f"[red]stalled: {reason}[/red]"
self._display_graph()
@@ -0,0 +1,147 @@
"""
Log Pane Widget - Uses RichLog for reliable rendering.
"""
import logging
from datetime import datetime
from textual.app import ComposeResult
from textual.containers import Container
from textual.widgets import RichLog
from framework.runtime.event_bus import AgentEvent, EventType
class LogPane(Container):
"""Widget to display logs with reliable rendering."""
_EVENT_FORMAT: dict[EventType, tuple[str, str]] = {
EventType.EXECUTION_STARTED: (">>", "bold cyan"),
EventType.EXECUTION_COMPLETED: ("<<", "bold green"),
EventType.EXECUTION_FAILED: ("!!", "bold red"),
EventType.TOOL_CALL_STARTED: ("->", "yellow"),
EventType.TOOL_CALL_COMPLETED: ("<-", "green"),
EventType.NODE_LOOP_STARTED: ("@@", "cyan"),
EventType.NODE_LOOP_ITERATION: ("..", "dim"),
EventType.NODE_LOOP_COMPLETED: ("@@", "dim"),
EventType.NODE_STALLED: ("!!", "bold yellow"),
EventType.NODE_INPUT_BLOCKED: ("!!", "yellow"),
EventType.GOAL_PROGRESS: ("%%", "blue"),
EventType.GOAL_ACHIEVED: ("**", "bold green"),
EventType.CONSTRAINT_VIOLATION: ("!!", "bold red"),
EventType.STATE_CHANGED: ("~~", "dim"),
EventType.CLIENT_INPUT_REQUESTED: ("??", "magenta"),
}
_LOG_LEVEL_COLORS = {
logging.DEBUG: "dim",
logging.INFO: "",
logging.WARNING: "yellow",
logging.ERROR: "red",
logging.CRITICAL: "bold red",
}
DEFAULT_CSS = """
LogPane {
width: 100%;
height: 100%;
}
LogPane > RichLog {
width: 100%;
height: 100%;
background: $surface;
border: none;
scrollbar-background: $panel;
scrollbar-color: $primary;
}
"""
def compose(self) -> ComposeResult:
# RichLog is designed for log display and doesn't have TextArea's rendering issues
yield RichLog(id="main-log", highlight=True, markup=True, auto_scroll=False)
def write_event(self, event: AgentEvent) -> None:
"""Format an AgentEvent with timestamp + symbol and write to the log."""
ts = event.timestamp.strftime("%H:%M:%S")
symbol, color = self._EVENT_FORMAT.get(event.type, ("--", "dim"))
text = self._extract_event_text(event)
self.write_log(f"[dim]{ts}[/dim] [{color}]{symbol} {text}[/{color}]")
def _extract_event_text(self, event: AgentEvent) -> str:
"""Extract human-readable text from an event's data dict."""
et = event.type
data = event.data
if et == EventType.EXECUTION_STARTED:
return "Execution started"
elif et == EventType.EXECUTION_COMPLETED:
return "Execution completed"
elif et == EventType.EXECUTION_FAILED:
return f"Execution FAILED: {data.get('error', 'unknown')}"
elif et == EventType.TOOL_CALL_STARTED:
return f"Tool call: {data.get('tool_name', 'unknown')}"
elif et == EventType.TOOL_CALL_COMPLETED:
name = data.get("tool_name", "unknown")
if data.get("is_error"):
preview = str(data.get("result", ""))[:80]
return f"Tool error: {name} - {preview}"
return f"Tool done: {name}"
elif et == EventType.NODE_LOOP_STARTED:
return f"Node started: {event.node_id or 'unknown'}"
elif et == EventType.NODE_LOOP_ITERATION:
return f"{event.node_id or 'unknown'} iteration {data.get('iteration', '?')}"
elif et == EventType.NODE_LOOP_COMPLETED:
return f"Node done: {event.node_id or 'unknown'}"
elif et == EventType.NODE_STALLED:
reason = data.get("reason", "")
node = event.node_id or "unknown"
return f"Node stalled: {node} - {reason}" if reason else f"Node stalled: {node}"
elif et == EventType.NODE_INPUT_BLOCKED:
return f"Node input blocked: {event.node_id or 'unknown'}"
elif et == EventType.GOAL_PROGRESS:
return f"Goal progress: {data.get('progress', '?')}"
elif et == EventType.GOAL_ACHIEVED:
return "Goal achieved"
elif et == EventType.CONSTRAINT_VIOLATION:
return f"Constraint violated: {data.get('description', 'unknown')}"
elif et == EventType.STATE_CHANGED:
return f"State changed: {data.get('key', 'unknown')}"
elif et == EventType.CLIENT_INPUT_REQUESTED:
return "Waiting for user input"
else:
return f"{et.value}: {data}"
def write_python_log(self, record: logging.LogRecord) -> None:
"""Format a Python log record with timestamp and severity color."""
ts = datetime.fromtimestamp(record.created).strftime("%H:%M:%S")
color = self._LOG_LEVEL_COLORS.get(record.levelno, "")
msg = record.getMessage()
if color:
self.write_log(f"[dim]{ts}[/dim] [{color}]{record.levelname}[/{color}] {msg}")
else:
self.write_log(f"[dim]{ts}[/dim] {record.levelname} {msg}")
def write_log(self, message: str) -> None:
"""Write a log message to the log pane."""
try:
# Check if widget is mounted
if not self.is_mounted:
return
log = self.query_one("#main-log", RichLog)
# Check if log is mounted
if not log.is_mounted:
return
# Only auto-scroll if user is already at the bottom
was_at_bottom = log.is_vertical_scroll_end
log.write(message)
if was_at_bottom:
log.scroll_end(animate=False)
except Exception:
pass
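The `write_python_log` path implies a bridge from Python's `logging` module into this pane; the app-side wiring is not shown in this diff beyond a `queue_handler` being removed on unmount. A plausible minimal sketch of that bridge, with the queue and the drain interval as assumptions:

```python
import logging
import queue
from logging.handlers import QueueHandler

# Queue-based bridge: the app's teardown removes a `queue_handler`,
# so something of this shape is the likely wiring (details assumed).
log_queue: "queue.Queue[logging.LogRecord]" = queue.Queue()
logging.getLogger().addHandler(QueueHandler(log_queue))

def drain_into_pane(pane: "LogPane") -> None:
    # Called periodically on the UI thread (e.g., via App.set_interval),
    # so widget access stays single-threaded.
    while True:
        try:
            record = log_queue.get_nowait()
        except queue.Empty:
            break
        pane.write_python_log(record)
```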
@@ -18,7 +18,8 @@ dependencies = [
"tools",
]
# [project.optional-dependencies]
[project.optional-dependencies]
tui = ["textual>=0.75.0"]
[project.scripts]
hive = "framework.cli:main"
@@ -104,8 +104,10 @@ def test_event_loop_node_spec_accepted():
# --- _get_node_implementation() tests ---
def test_unregistered_event_loop_raises(runtime):
"""An event_loop node not in the registry should raise RuntimeError."""
def test_unregistered_event_loop_auto_creates(runtime):
"""An event_loop node not in the registry should be auto-created."""
from framework.graph.event_loop_node import EventLoopNode
spec = NodeSpec(
id="el1",
name="Event Loop",
@@ -114,8 +116,10 @@ def test_unregistered_event_loop_raises(runtime):
)
executor = GraphExecutor(runtime=runtime)
with pytest.raises(RuntimeError, match="not found in registry"):
executor._get_node_implementation(spec)
result = executor._get_node_implementation(spec)
assert isinstance(result, EventLoopNode)
# Auto-created node should be cached in registry
assert "el1" in executor.node_registry
def test_registered_event_loop_returns_impl(runtime):
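The renamed test pins the auto-creation contract in place of the old RuntimeError path. A rough sketch of the lookup it implies; this is reconstructed from the assertions, not copied from the framework, and the helper name and constructor arguments are assumptions:

```python
def _get_node_implementation(self, spec: NodeSpec) -> NodeProtocol:
    impl = self.node_registry.get(spec.id)
    if impl is None and spec.node_type == "event_loop":
        # New behavior: build a default EventLoopNode on demand and cache
        # it, instead of raising "not found in registry".
        impl = self._create_default_event_loop_node(spec)  # assumed helper
        self.node_registry[spec.id] = impl
    return impl
```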
@@ -5,7 +5,7 @@ Focused on minimal success and failure scenarios.
import pytest
from framework.graph.edge import GraphSpec
from framework.graph.edge import EdgeCondition, EdgeSpec, GraphSpec
from framework.graph.executor import GraphExecutor
from framework.graph.goal import Goal
from framework.graph.node import NodeResult, NodeSpec
@@ -130,3 +130,169 @@ async def test_executor_single_node_failure():
assert result.success is False
assert result.error is not None
assert result.path == ["n1"]
# ---- Fake event bus that records calls ----
class FakeEventBus:
def __init__(self):
self.events = []
async def emit_node_loop_started(self, **kwargs):
self.events.append(("started", kwargs))
async def emit_node_loop_completed(self, **kwargs):
self.events.append(("completed", kwargs))
@pytest.mark.asyncio
async def test_executor_emits_node_events():
"""Executor should emit NODE_LOOP_STARTED/COMPLETED for each non-event_loop node."""
runtime = DummyRuntime()
event_bus = FakeEventBus()
graph = GraphSpec(
id="graph-ev",
goal_id="g-ev",
nodes=[
NodeSpec(
id="n1",
name="first",
description="first node",
node_type="llm_generate",
input_keys=[],
output_keys=["result"],
max_retries=0,
),
NodeSpec(
id="n2",
name="second",
description="second node",
node_type="llm_generate",
input_keys=["result"],
output_keys=["result"],
max_retries=0,
),
],
edges=[
EdgeSpec(
id="e1",
source="n1",
target="n2",
condition=EdgeCondition.ON_SUCCESS,
),
],
entry_node="n1",
terminal_nodes=["n2"],
)
executor = GraphExecutor(
runtime=runtime,
node_registry={
"n1": SuccessNode(),
"n2": SuccessNode(),
},
event_bus=event_bus,
stream_id="test-stream",
)
goal = Goal(id="g-ev", name="event-test", description="test events")
result = await executor.execute(graph=graph, goal=goal)
assert result.success is True
assert result.path == ["n1", "n2"]
# Should have 4 events: started/completed for n1, then started/completed for n2
assert len(event_bus.events) == 4
assert event_bus.events[0] == ("started", {"stream_id": "test-stream", "node_id": "n1"})
assert event_bus.events[1] == (
"completed",
{"stream_id": "test-stream", "node_id": "n1", "iterations": 1},
)
assert event_bus.events[2] == ("started", {"stream_id": "test-stream", "node_id": "n2"})
assert event_bus.events[3] == (
"completed",
{"stream_id": "test-stream", "node_id": "n2", "iterations": 1},
)
# ---- Fake event_loop node (registered, so executor won't emit for it) ----
class FakeEventLoopNode:
def validate_input(self, ctx):
return []
async def execute(self, ctx):
return NodeResult(success=True, output={"result": "loop-done"}, tokens_used=1, latency_ms=1)
@pytest.mark.asyncio
async def test_executor_skips_events_for_event_loop_nodes():
"""Executor should NOT emit events for event_loop nodes (they emit their own)."""
runtime = DummyRuntime()
event_bus = FakeEventBus()
graph = GraphSpec(
id="graph-el",
goal_id="g-el",
nodes=[
NodeSpec(
id="el1",
name="event-loop-node",
description="event loop node",
node_type="event_loop",
input_keys=[],
output_keys=["result"],
max_retries=0,
),
],
edges=[],
entry_node="el1",
)
executor = GraphExecutor(
runtime=runtime,
node_registry={"el1": FakeEventLoopNode()},
event_bus=event_bus,
stream_id="test-stream",
)
goal = Goal(id="g-el", name="el-test", description="test event_loop guard")
result = await executor.execute(graph=graph, goal=goal)
assert result.success is True
# No events should have been emitted — event_loop nodes are skipped
assert len(event_bus.events) == 0
@pytest.mark.asyncio
async def test_executor_no_events_without_event_bus():
"""Executor should work fine without an event bus (backward compat)."""
runtime = DummyRuntime()
graph = GraphSpec(
id="graph-nobus",
goal_id="g-nobus",
nodes=[
NodeSpec(
id="n1",
name="node1",
description="test node",
node_type="llm_generate",
input_keys=[],
output_keys=["result"],
max_retries=0,
)
],
edges=[],
entry_node="n1",
)
# No event_bus passed — should not crash
executor = GraphExecutor(
runtime=runtime,
node_registry={"n1": SuccessNode()},
)
goal = Goal(id="g-nobus", name="nobus-test", description="no event bus")
result = await executor.execute(graph=graph, goal=goal)
assert result.success is True
@@ -0,0 +1,360 @@
"""
Test that ON_FAILURE edges are followed when a node fails after max retries.
Verifies the fix for Issue #3449 where the executor would immediately terminate
when max retries were exceeded, without checking for ON_FAILURE edges that could
route to error handler nodes.
"""
from unittest.mock import AsyncMock, MagicMock
import pytest
from framework.graph.edge import EdgeCondition, EdgeSpec, GraphSpec
from framework.graph.executor import GraphExecutor
from framework.graph.goal import Goal
from framework.graph.node import NodeContext, NodeProtocol, NodeResult, NodeSpec
from framework.runtime.core import Runtime
class AlwaysFailsNode(NodeProtocol):
"""A node that always fails."""
def __init__(self):
self.attempt_count = 0
async def execute(self, ctx: NodeContext) -> NodeResult:
self.attempt_count += 1
return NodeResult(success=False, error=f"Permanent error (attempt {self.attempt_count})")
class FailureHandlerNode(NodeProtocol):
"""A node that handles failures from upstream nodes."""
def __init__(self):
self.executed = False
self.execute_count = 0
async def execute(self, ctx: NodeContext) -> NodeResult:
self.executed = True
self.execute_count += 1
return NodeResult(
success=True,
output={"handled": True, "recovery": "graceful"},
)
class SuccessNode(NodeProtocol):
"""A node that always succeeds with configurable output."""
def __init__(self, output: dict | None = None):
self.execute_count = 0
self._output = output or {"result": "ok"}
async def execute(self, ctx: NodeContext) -> NodeResult:
self.execute_count += 1
return NodeResult(success=True, output=self._output)
@pytest.fixture(autouse=True)
def fast_sleep(monkeypatch):
"""Mock asyncio.sleep to avoid real delays from exponential backoff."""
monkeypatch.setattr("asyncio.sleep", AsyncMock())
@pytest.fixture
def runtime():
"""Create a mock Runtime for testing."""
runtime = MagicMock(spec=Runtime)
runtime.start_run = MagicMock(return_value="test_run_id")
runtime.decide = MagicMock(return_value="test_decision_id")
runtime.record_outcome = MagicMock()
runtime.end_run = MagicMock()
runtime.report_problem = MagicMock()
runtime.set_node = MagicMock()
return runtime
@pytest.fixture
def goal():
return Goal(
id="test_goal",
name="Test Goal",
description="Test ON_FAILURE edge routing",
)
@pytest.mark.asyncio
async def test_on_failure_edge_followed_after_max_retries(runtime, goal):
"""
When a node fails after exhausting max retries, ON_FAILURE edges should
be followed to route execution to a failure handler node.
"""
nodes = [
NodeSpec(
id="failing",
name="Failing Node",
description="Always fails",
node_type="function",
output_keys=[],
max_retries=1,
),
NodeSpec(
id="handler",
name="Failure Handler",
description="Handles failures",
node_type="function",
output_keys=["handled", "recovery"],
),
]
edges = [
EdgeSpec(
id="fail_to_handler",
source="failing",
target="handler",
condition=EdgeCondition.ON_FAILURE,
),
]
graph = GraphSpec(
id="test_graph",
goal_id="test_goal",
name="Test Graph",
entry_node="failing",
nodes=nodes,
edges=edges,
terminal_nodes=["handler"],
)
executor = GraphExecutor(runtime=runtime)
failing_node = AlwaysFailsNode()
handler_node = FailureHandlerNode()
executor.register_node("failing", failing_node)
executor.register_node("handler", handler_node)
result = await executor.execute(graph, goal, {})
# The handler should have executed
assert handler_node.executed, "Failure handler was not executed"
assert handler_node.execute_count == 1
# Overall execution should succeed (handler recovered)
assert result.success
# Handler node should appear in the execution path
assert "handler" in result.path
@pytest.mark.asyncio
async def test_no_on_failure_edge_still_terminates(runtime, goal):
"""
When a node fails after max retries and there is no ON_FAILURE edge,
the executor should terminate with a failure result (original behavior).
"""
nodes = [
NodeSpec(
id="failing",
name="Failing Node",
description="Always fails",
node_type="function",
output_keys=[],
max_retries=1,
),
]
graph = GraphSpec(
id="test_graph",
goal_id="test_goal",
name="Test Graph",
entry_node="failing",
nodes=[nodes[0]],
edges=[],
terminal_nodes=["failing"],
)
executor = GraphExecutor(runtime=runtime)
failing_node = AlwaysFailsNode()
executor.register_node("failing", failing_node)
result = await executor.execute(graph, goal, {})
assert not result.success
assert "failed after 1 attempts" in result.error
@pytest.mark.asyncio
async def test_on_failure_edge_not_followed_on_success(runtime, goal):
"""
ON_FAILURE edges should NOT be followed when a node succeeds.
Only ON_SUCCESS edges should fire.
"""
nodes = [
NodeSpec(
id="working",
name="Working Node",
description="Always succeeds",
node_type="function",
output_keys=["result"],
),
NodeSpec(
id="handler",
name="Failure Handler",
description="Should not be reached",
node_type="function",
output_keys=["handled"],
),
NodeSpec(
id="next",
name="Next Node",
description="Normal successor",
node_type="function",
output_keys=["done"],
),
]
edges = [
EdgeSpec(
id="on_fail",
source="working",
target="handler",
condition=EdgeCondition.ON_FAILURE,
),
EdgeSpec(
id="on_success",
source="working",
target="next",
condition=EdgeCondition.ON_SUCCESS,
),
]
graph = GraphSpec(
id="test_graph",
goal_id="test_goal",
name="Test Graph",
entry_node="working",
nodes=nodes,
edges=edges,
terminal_nodes=["handler", "next"],
)
executor = GraphExecutor(runtime=runtime)
executor.register_node("working", SuccessNode(output={"result": "ok"}))
handler_node = FailureHandlerNode()
executor.register_node("handler", handler_node)
executor.register_node("next", SuccessNode(output={"done": True}))
result = await executor.execute(graph, goal, {})
assert result.success
assert not handler_node.executed, "Failure handler should not run on success"
assert "next" in result.path, "Should follow ON_SUCCESS edge to 'next'"
@pytest.mark.asyncio
async def test_on_failure_edge_with_zero_retries(runtime, goal):
"""
ON_FAILURE edges should work even when max_retries=0 (no retries allowed).
The node fails once and immediately routes to the failure handler.
"""
nodes = [
NodeSpec(
id="fragile",
name="Fragile Node",
description="Fails with no retries",
node_type="function",
output_keys=[],
max_retries=0,
),
NodeSpec(
id="handler",
name="Failure Handler",
description="Handles failures",
node_type="function",
output_keys=["handled", "recovery"],
),
]
edges = [
EdgeSpec(
id="fail_to_handler",
source="fragile",
target="handler",
condition=EdgeCondition.ON_FAILURE,
),
]
graph = GraphSpec(
id="test_graph",
goal_id="test_goal",
name="Test Graph",
entry_node="fragile",
nodes=nodes,
edges=edges,
terminal_nodes=["handler"],
)
executor = GraphExecutor(runtime=runtime)
failing_node = AlwaysFailsNode()
handler_node = FailureHandlerNode()
executor.register_node("fragile", failing_node)
executor.register_node("handler", handler_node)
result = await executor.execute(graph, goal, {})
# Should route to handler after single failure (no retries)
assert failing_node.attempt_count == 1
assert handler_node.executed
assert result.success
@pytest.mark.asyncio
async def test_on_failure_handler_appears_in_path(runtime, goal):
"""
The failure handler node should appear in the execution path.
"""
nodes = [
NodeSpec(
id="failing",
name="Failing Node",
description="Always fails",
node_type="function",
output_keys=[],
max_retries=1,
),
NodeSpec(
id="handler",
name="Failure Handler",
description="Handles failures",
node_type="function",
output_keys=["handled", "recovery"],
),
]
edges = [
EdgeSpec(
id="fail_to_handler",
source="failing",
target="handler",
condition=EdgeCondition.ON_FAILURE,
),
]
graph = GraphSpec(
id="test_graph",
goal_id="test_goal",
name="Test Graph",
entry_node="failing",
nodes=nodes,
edges=edges,
terminal_nodes=["handler"],
)
executor = GraphExecutor(runtime=runtime)
executor.register_node("failing", AlwaysFailsNode())
executor.register_node("handler", FailureHandlerNode())
result = await executor.execute(graph, goal, {})
assert "failing" in result.path
assert "handler" in result.path
assert result.node_visit_counts.get("handler") == 1
Generated file: diff suppressed because it is too large.
@@ -83,7 +83,7 @@ git clone https://github.com/adenhq/hive.git
cd hive
# Run the Python environment configuration
./scripts/setup-python.sh
./quickstart.sh
```
This installs:
@@ -236,7 +236,7 @@ hive/
```bash
# One-time configuration
./scripts/setup-python.sh
./quickstart.sh
# This installs:
# - framework package (core runtime)
@@ -9,14 +9,11 @@
},
"license": "Apache-2.0",
"scripts": {
"setup": "echo '⚠️ This npm setup is for the archived web application. For agent development, use: ./scripts/setup-python.sh' && bash scripts/setup.sh",
"test:duplicates": "bun test scripts/auto-close-duplicates"
},
"devDependencies": {
"@types/node": "^20.10.0",
"tsx": "^4.7.0",
"typescript": "^5.3.0",
"yaml": "^2.3.0"
"typescript": "^5.3.0"
},
"engines": {
"node": ">=20.0.0",
@@ -1,180 +0,0 @@
/**
* Environment Generator Script
*
* Reads config.yaml and generates .env files for each service.
* This provides a single source of truth for configuration while
* maintaining compatibility with standard .env file workflows.
*
* Usage: npx tsx scripts/generate-env.ts
*/
import { readFileSync, writeFileSync, existsSync } from 'fs';
import { parse } from 'yaml';
import { join, dirname } from 'path';
import { fileURLToPath } from 'url';
const __dirname = dirname(fileURLToPath(import.meta.url));
const PROJECT_ROOT = join(__dirname, '..');
interface Config {
app: {
name: string;
environment: string;
log_level: string;
};
server: {
frontend: {
port: number;
};
backend: {
port: number;
host: string;
};
};
timescaledb: {
url: string;
port: number;
};
mongodb: {
url: string;
database: string;
erp_database: string;
port: number;
};
redis: {
url: string;
port: number;
};
auth: {
jwt_secret: string;
jwt_expires_in: string;
passphrase: string;
};
npm: {
token: string;
};
cors: {
origin: string;
};
features: {
registration: boolean;
rate_limiting: boolean;
request_logging: boolean;
mcp_server: boolean;
};
}
function loadConfig(): Config {
const configPath = join(PROJECT_ROOT, 'config.yaml');
if (!existsSync(configPath)) {
console.error('Error: config.yaml not found.');
console.error('Run: cp config.yaml.example config.yaml');
process.exit(1);
}
const configContent = readFileSync(configPath, 'utf-8');
return parse(configContent) as Config;
}
function generateRootEnv(config: Config): string {
return `# Generated from config.yaml - do not edit directly
# Regenerate with: npm run generate:env
# Application
NODE_ENV=${config.app.environment}
APP_NAME=${config.app.name}
LOG_LEVEL=${config.app.log_level}
# Ports
FRONTEND_PORT=${config.server.frontend.port}
BACKEND_PORT=${config.server.backend.port}
TSDB_PORT=${config.timescaledb.port}
MONGODB_PORT=${config.mongodb.port}
REDIS_PORT=${config.redis.port}
# API URL for frontend
VITE_API_URL=http://localhost:${config.server.backend.port}
# MongoDB
MONGODB_DBNAME=${config.mongodb.database}
MONGODB_ERP_DBNAME=${config.mongodb.erp_database}
# Authentication
JWT_SECRET=${config.auth.jwt_secret}
PASSPHRASE=${config.auth.passphrase}
# NPM (for Docker builds with private packages)
NPM_TOKEN=${config.npm.token}
# CORS
CORS_ORIGIN=${config.cors.origin}
`;
}
function generateFrontendEnv(config: Config): string {
return `# Generated from config.yaml - do not edit directly
# Regenerate with: npm run generate:env
VITE_API_URL=http://localhost:${config.server.backend.port}
VITE_APP_NAME=${config.app.name}
VITE_APP_ENV=${config.app.environment}
`;
}
function generateBackendEnv(config: Config): string {
return `# Generated from config.yaml - do not edit directly
# Regenerate with: npm run generate:env
# Server
NODE_ENV=${config.app.environment}
PORT=${config.server.backend.port}
# Application
LOG_LEVEL=${config.app.log_level}
# TimescaleDB (PostgreSQL)
TSDB_PG_URL=${config.timescaledb.url}
# MongoDB
MONGODB_URL=${config.mongodb.url}
MONGODB_DBNAME=${config.mongodb.database}
MONGODB_ERP_DBNAME=${config.mongodb.erp_database}
# Redis
REDIS_URL=${config.redis.url}
# Authentication
JWT_SECRET=${config.auth.jwt_secret}
PASSPHRASE=${config.auth.passphrase}
# Features
FEATURE_MCP_SERVER=${config.features.mcp_server}
`;
}
function main() {
console.log('Generating environment files from config.yaml...\n');
const config = loadConfig();
// Generate root .env (for docker-compose)
const rootEnvPath = join(PROJECT_ROOT, '.env');
writeFileSync(rootEnvPath, generateRootEnv(config));
console.log(`✓ Generated ${rootEnvPath}`);
// Generate frontend .env
const frontendEnvPath = join(PROJECT_ROOT, 'honeycomb', '.env');
writeFileSync(frontendEnvPath, generateFrontendEnv(config));
console.log(`✓ Generated ${frontendEnvPath}`);
// Generate backend .env
const backendEnvPath = join(PROJECT_ROOT, 'hive', '.env');
writeFileSync(backendEnvPath, generateBackendEnv(config));
console.log(`✓ Generated ${backendEnvPath}`);
console.log('\nDone! Environment files have been generated.');
console.log('\nNote: These files are git-ignored. Regenerate after editing config.yaml.');
}
main();
@@ -1,251 +0,0 @@
<#
setup-python.ps1 - Python Environment Setup for Aden Agent Framework
This script sets up the Python environment with all required packages
for building and running goal-driven agents.
#>
$ErrorActionPreference = "Stop"
# Colors for output
$RED = "Red"
$GREEN = "Green"
$YELLOW = "Yellow"
$BLUE = "Cyan"
# Get the directory where this script is located
$SCRIPT_DIR = Split-Path -Parent $MyInvocation.MyCommand.Path
$PROJECT_ROOT = Split-Path -Parent $SCRIPT_DIR
Write-Host ""
Write-Host "=================================================="
Write-Host " Aden Agent Framework - Python Setup"
Write-Host "=================================================="
Write-Host ""
# Check for Python
$pythonCmd = $null
if (Get-Command python -ErrorAction SilentlyContinue) {
$pythonCmd = "python"
}
if (-not $pythonCmd) {
Write-Host "Error: Python is not installed." -ForegroundColor $RED
Write-Host "Please install Python 3.11+ from https://python.org"
exit 1
}
# Check Python version
$versionInfo = & $pythonCmd -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')"
$major = & $pythonCmd -c "import sys; print(sys.version_info.major)"
$minor = & $pythonCmd -c "import sys; print(sys.version_info.minor)"
Write-Host "Detected Python: $versionInfo" -ForegroundColor $BLUE
if ($major -lt 3 -or ($major -eq 3 -and $minor -lt 11)) {
Write-Host "Error: Python 3.11+ is required (found $versionInfo)" -ForegroundColor $RED
Write-Host "Please upgrade your Python installation"
exit 1
}
if ($minor -lt 11) {
Write-Host "Warning: Python 3.11+ is recommended for best compatibility" -ForegroundColor $YELLOW
Write-Host "You have Python $versionInfo which may work but is not officially supported" -ForegroundColor $YELLOW
Write-Host ""
}
Write-Host "[OK] Python version check passed" -ForegroundColor $GREEN
Write-Host ""
# Create and activate virtual environment
Write-Host "=================================================="
Write-Host "Setting up Python Virtual Environment"
Write-Host "=================================================="
Write-Host ""
$VENV_PATH = Join-Path $PROJECT_ROOT ".venv"
$VENV_PYTHON = Join-Path $VENV_PATH "Scripts\python.exe"
$VENV_ACTIVATE = Join-Path $VENV_PATH "Scripts\Activate.ps1"
if (-not (Test-Path $VENV_PYTHON)) {
Write-Host "Creating virtual environment at .venv..."
& $pythonCmd -m venv $VENV_PATH
Write-Host "[OK] Virtual environment created" -ForegroundColor $GREEN
}
else {
Write-Host "[OK] Virtual environment already exists" -ForegroundColor $GREEN
}
# Activate venv
Write-Host "Activating virtual environment..."
& $VENV_ACTIVATE
Write-Host "[OK] Virtual environment activated" -ForegroundColor $GREEN
# From here on, always use venv python
$pythonCmd = $VENV_PYTHON
Write-Host ""
# Check for pip
try {
& $pythonCmd -m pip --version | Out-Null
}
catch {
Write-Host "Error: pip is not installed" -ForegroundColor $RED
Write-Host "Please install pip for Python $versionInfo"
exit 1
}
Write-Host "[OK] pip detected" -ForegroundColor $GREEN
Write-Host ""
# Upgrade pip, setuptools, and wheel
Write-Host "Upgrading pip, setuptools, and wheel..."
& $pythonCmd -m pip install --upgrade pip setuptools wheel
Write-Host "[OK] Core packages upgraded" -ForegroundColor $GREEN
Write-Host ""
# Install core framework package
Write-Host "=================================================="
Write-Host "Installing Core Framework Package"
Write-Host "=================================================="
Write-Host ""
Set-Location "$PROJECT_ROOT\core"
if (Test-Path "pyproject.toml") {
Write-Host "Installing framework from core/ (editable mode)..."
& $pythonCmd -m pip install -e . | Out-Null
Write-Host "[OK] Framework package installed" -ForegroundColor $GREEN
}
else {
Write-Host "[WARN] No pyproject.toml found in core/, skipping framework installation" -ForegroundColor $YELLOW
}
Write-Host ""
# Install tools package
Write-Host "=================================================="
Write-Host "Installing Tools Package (aden_tools)"
Write-Host "=================================================="
Write-Host ""
Set-Location "$PROJECT_ROOT\tools"
if (Test-Path "pyproject.toml") {
Write-Host "Installing aden_tools from tools/ (editable mode)..."
& $pythonCmd -m pip install -e . | Out-Null
Write-Host "[OK] Tools package installed" -ForegroundColor $GREEN
}
else {
Write-Host "Error: No pyproject.toml found in tools/" -ForegroundColor $RED
exit 1
}
Write-Host ""
# Fix openai version compatibility with litellm
Write-Host "=================================================="
Write-Host "Fixing Package Compatibility"
Write-Host "=================================================="
Write-Host ""
try {
$openaiVersion = & $pythonCmd -c "import openai; print(openai.__version__)"
}
catch {
$openaiVersion = "not_installed"
}
if ($openaiVersion -eq "not_installed") {
Write-Host "Installing openai package..."
& $pythonCmd -m pip install "openai>=1.0.0" | Out-Null
Write-Host "[OK] openai package installed" -ForegroundColor $GREEN
}
elseif ($openaiVersion.StartsWith("0.")) {
Write-Host "Found old openai version: $openaiVersion" -ForegroundColor $YELLOW
Write-Host "Upgrading to openai 1.x+ for litellm compatibility..."
& $pythonCmd -m pip install --upgrade "openai>=1.0.0" | Out-Null
$openaiVersion = & $pythonCmd -c "import openai; print(openai.__version__)"
Write-Host "[OK] openai upgraded to $openaiVersion" -ForegroundColor $GREEN
}
else {
Write-Host "[OK] openai $openaiVersion is compatible" -ForegroundColor $GREEN
}
Write-Host ""
# Verify installations
Write-Host "=================================================="
Write-Host "Verifying Installation"
Write-Host "=================================================="
Write-Host ""
Set-Location $PROJECT_ROOT
# Test framework import
& $pythonCmd -c "import framework" 2>$null
if ($LASTEXITCODE -eq 0) {
Write-Host "[OK] framework package imports successfully" -ForegroundColor Green
}
else {
Write-Host "[FAIL] framework package import failed" -ForegroundColor Red
}
# Test aden_tools import
& $pythonCmd -c "import aden_tools" 2>$null
if ($LASTEXITCODE -eq 0) {
Write-Host "[OK] aden_tools package imports successfully" -ForegroundColor Green
}
else {
Write-Host "[FAIL] aden_tools package import failed" -ForegroundColor Red
exit 1
}
# Test litellm
& $pythonCmd -c "import litellm" 2>$null
if ($LASTEXITCODE -eq 0) {
Write-Host "[OK] litellm package imports successfully" -ForegroundColor $GREEN
}
else {
Write-Host "[WARN] litellm import had issues (may be OK if not using LLM features)" -ForegroundColor $YELLOW
}
Write-Host ""
# Print agent commands
Write-Host "=================================================="
Write-Host " Setup Complete!"
Write-Host "=================================================="
Write-Host ""
Write-Host "Python packages installed:"
Write-Host " - framework (core agent runtime)"
Write-Host " - aden_tools (tools and MCP servers)"
Write-Host " - All dependencies and compatibility fixes applied"
Write-Host ""
Write-Host "To run agents on Windows (PowerShell):"
Write-Host ""
Write-Host "1. From the project root, set PYTHONPATH:"
Write-Host " `$env:PYTHONPATH=`"exports`""
Write-Host ""
Write-Host "2. Run an agent command:"
Write-Host " uv run python -m agent_name validate"
Write-Host " uv run python -m agent_name info"
Write-Host " uv run python -m agent_name run --input '{...}'"
Write-Host ""
Write-Host "Example (support_ticket_agent):"
Write-Host " uv run python -m support_ticket_agent validate"
Write-Host " uv run python -m support_ticket_agent info"
Write-Host " uv run python -m support_ticket_agent run --input '{""ticket_content"":""..."",""customer_id"":""..."",""ticket_id"":""...""}'"
Write-Host ""
Write-Host "Notes:"
Write-Host " - Ensure the virtual environment is activated (.venv)"
Write-Host " - PYTHONPATH must be set in each new PowerShell session"
Write-Host ""
Write-Host "Documentation:"
Write-Host " $PROJECT_ROOT\README.md"
Write-Host ""
Write-Host "Agent Examples:"
Write-Host " $PROJECT_ROOT\exports\"
Write-Host ""
@@ -1,308 +0,0 @@
#!/bin/bash
#
# setup-python.sh - Python Environment Setup for Aden Agent Framework
#
# DEPRECATED: Use ./quickstart.sh instead. It does everything this script
# does plus verifies MCP configuration, Claude Code skills, and API keys.
#
# This script is kept for CI/headless environments where the extra
# verification steps in quickstart.sh are not needed.
#
set -e
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Get the directory where this script is located
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
# Python Version
REQUIRED_PYTHON_VERSION="3.11"
# Python version split into Major and Minor
IFS='.' read -r PYTHON_MAJOR_VERSION PYTHON_MINOR_VERSION <<< "$REQUIRED_PYTHON_VERSION"
# Available python interpreter (follows sequence)
POSSIBLE_PYTHONS=("python3" "python" "py")
# Default python interpreter (initialized)
PYTHON_CMD=()
echo ""
echo "=================================================="
echo " Aden Agent Framework - Python Setup"
echo "=================================================="
echo ""
echo -e "${YELLOW}NOTE: Consider using ./quickstart.sh instead for a complete setup.${NC}"
echo ""
# Available Python interpreter
for cmd in "${POSSIBLE_PYTHONS[@]}"; do
# Check for python interpreter
if command -v "$cmd" >/dev/null 2>&1; then
# Specific check for Windows 'py' launcher
if [ "$cmd" = "py" ]; then
CURRENT_CMD=(py -3)
else
CURRENT_CMD=("$cmd")
fi
# Check Python version
if "${CURRENT_CMD[@]}" -c "import sys; sys.exit(0 if sys.version_info >= ($PYTHON_MAJOR_VERSION, $PYTHON_MINOR_VERSION) else 1)" >/dev/null 2>&1; then
echo -e "${GREEN}${NC} interpreter detected: ${CURRENT_CMD[@]}"
# Check for pip
if "${CURRENT_CMD[@]}" -m pip --version >/dev/null 2>&1; then
PYTHON_CMD=("${CURRENT_CMD[@]}")
echo -e "${GREEN}${NC} pip detected"
echo ""
break
else
echo -e "${RED}${NC} pip not found"
echo ""
fi
else
echo -e "${RED}${NC} ${CURRENT_CMD[@]} not found"
echo ""
fi
fi
done
# Display error message if python not found
if [ "${#PYTHON_CMD[@]}" -eq 0 ]; then
echo -e "${RED}Error:${NC} No suitable Python interpreter found with pip installed."
echo ""
echo "Requirements:"
echo " • Python $PYTHON_MAJOR_VERSION.$PYTHON_MINOR_VERSION+"
echo " • pip installed"
echo ""
echo "Tried the following commands:"
echo " ${POSSIBLE_PYTHONS[*]}"
echo ""
echo "Please install Python from:"
echo " https://www.python.org/downloads/"
exit 1
fi
# Display Python version
PYTHON_VERSION=$("${PYTHON_CMD[@]}" -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')
echo -e "${BLUE}Detected Python:${NC} $PYTHON_VERSION"
echo -e "${GREEN}${NC} Python version check passed"
echo ""
# Check for uv
if ! command -v uv &> /dev/null; then
echo -e "${RED}Error: uv is not installed${NC}"
echo "Please install uv from https://github.com/astral-sh/uv"
exit 1
fi
echo -e "${GREEN}${NC} uv detected"
echo ""
# Install core framework package
echo "=================================================="
echo "Installing Core Framework Package"
echo "=================================================="
echo ""
cd "$PROJECT_ROOT/core"
# Create venv if it doesn't exist
if [ ! -d ".venv" ]; then
echo "Creating virtual environment in core/.venv..."
uv venv
echo -e "${GREEN}${NC} Virtual environment created"
else
echo -e "${GREEN}${NC} Virtual environment already exists"
fi
echo ""
if [ -f "pyproject.toml" ]; then
echo "Installing framework from core/ (editable mode)..."
CORE_PYTHON=".venv/bin/python"
if uv pip install --python "$CORE_PYTHON" -e .; then
echo -e "${GREEN}${NC} Framework package installed"
else
echo -e "${YELLOW}${NC} Framework installation encountered issues (may be OK if already installed)"
fi
else
echo -e "${YELLOW}${NC} No pyproject.toml found in core/, skipping framework installation"
fi
echo ""
# Install tools package
echo "=================================================="
echo "Installing Tools Package (aden_tools)"
echo "=================================================="
echo ""
cd "$PROJECT_ROOT/tools"
# Create venv if it doesn't exist
if [ ! -d ".venv" ]; then
echo "Creating virtual environment in tools/.venv..."
uv venv
echo -e "${GREEN}${NC} Virtual environment created"
else
echo -e "${GREEN}${NC} Virtual environment already exists"
fi
echo ""
if [ -f "pyproject.toml" ]; then
echo "Installing aden_tools from tools/ (editable mode)..."
TOOLS_PYTHON=".venv/bin/python"
if uv pip install --python "$TOOLS_PYTHON" -e .; then
echo -e "${GREEN}${NC} Tools package installed"
else
echo -e "${RED}${NC} Tools installation failed"
exit 1
fi
else
echo -e "${RED}Error: No pyproject.toml found in tools/${NC}"
exit 1
fi
echo ""
# Install Playwright browser for web scraping
echo "=================================================="
echo "Installing Playwright Browser"
echo "=================================================="
echo ""
if $PYTHON_CMD -c "import playwright" > /dev/null 2>&1; then
echo "Installing Chromium browser for web scraping..."
if $PYTHON_CMD -m playwright install chromium > /dev/null 2>&1; then
echo -e "${GREEN}${NC} Playwright Chromium installed"
else
echo -e "${YELLOW}${NC} Playwright browser install failed (web_scrape tool may not work)"
echo " Run manually: uv run python -m playwright install chromium"
fi
else
echo -e "${YELLOW}${NC} Playwright not found, skipping browser install"
fi
echo ""
# Fix openai version compatibility with litellm
echo "=================================================="
echo "Fixing Package Compatibility"
echo "=================================================="
echo ""
TOOLS_PYTHON="$PROJECT_ROOT/tools/.venv/bin/python"
# Check openai version in tools venv
OPENAI_VERSION=$($TOOLS_PYTHON -c "import openai; print(openai.__version__)" 2>/dev/null || echo "not_installed")
if [ "$OPENAI_VERSION" = "not_installed" ]; then
echo "Installing openai package..."
uv pip install --python "$TOOLS_PYTHON" "openai>=1.0.0"
echo -e "${GREEN}${NC} openai package installed"
elif [[ "$OPENAI_VERSION" =~ ^0\. ]]; then
echo -e "${YELLOW}Found old openai version: $OPENAI_VERSION${NC}"
echo "Upgrading to openai 1.x+ for litellm compatibility..."
uv pip install --python "$TOOLS_PYTHON" --upgrade "openai>=1.0.0"
OPENAI_VERSION=$($TOOLS_PYTHON -c "import openai; print(openai.__version__)" 2>/dev/null)
echo -e "${GREEN}${NC} openai upgraded to $OPENAI_VERSION"
else
echo -e "${GREEN}${NC} openai $OPENAI_VERSION is compatible"
fi
echo ""
# Ensure exports directory exists
echo "=================================================="
echo "Checking Directory Structure"
echo "=================================================="
echo ""
if [ ! -d "$PROJECT_ROOT/exports" ]; then
echo "Creating exports directory..."
mkdir -p "$PROJECT_ROOT/exports"
echo "# Agent Exports" > "$PROJECT_ROOT/exports/README.md"
echo "" >> "$PROJECT_ROOT/exports/README.md"
echo "This directory is the default location for generated agent packages." >> "$PROJECT_ROOT/exports/README.md"
echo -e "${GREEN}${NC} Created exports directory"
else
echo -e "${GREEN}${NC} exports directory exists"
fi
echo ""
# Verify installations
echo "=================================================="
echo "Verifying Installation"
echo "=================================================="
echo ""
cd "$PROJECT_ROOT"
# Test framework import using core venv
CORE_PYTHON="$PROJECT_ROOT/core/.venv/bin/python"
if [ -f "$CORE_PYTHON" ]; then
if $CORE_PYTHON -c "import framework; print('framework OK')" > /dev/null 2>&1; then
echo -e "${GREEN}${NC} framework package imports successfully"
else
echo -e "${RED}${NC} framework package import failed"
echo -e "${YELLOW} Note: This may be OK if you don't need the framework${NC}"
fi
else
echo -e "${RED}${NC} core/.venv not found - venv creation may have failed${NC}"
exit 1
fi
# Test aden_tools import using tools venv
TOOLS_PYTHON="$PROJECT_ROOT/tools/.venv/bin/python"
if [ -f "$TOOLS_PYTHON" ]; then
if $TOOLS_PYTHON -c "import aden_tools; print('aden_tools OK')" > /dev/null 2>&1; then
echo -e "${GREEN}${NC} aden_tools package imports successfully"
else
echo -e "${RED}${NC} aden_tools package import failed"
exit 1
fi
else
echo -e "${RED}${NC} tools/.venv not found - venv creation may have failed${NC}"
exit 1
fi
# Test litellm + openai compatibility using tools venv
if $TOOLS_PYTHON -c "import litellm; print('litellm OK')" > /dev/null 2>&1; then
echo -e "${GREEN}${NC} litellm package imports successfully"
else
echo -e "${YELLOW}${NC} litellm import had issues (may be OK if not using LLM features)"
fi
echo ""
# Print agent commands
echo "=================================================="
echo " Setup Complete!"
echo "=================================================="
echo ""
echo "Python packages installed:"
echo " • framework (core agent runtime)"
echo " • aden_tools (tools and MCP servers)"
echo " • All dependencies and compatibility fixes applied"
echo ""
echo "To run agents, use:"
echo ""
echo " ${BLUE}# From project root:${NC}"
echo " PYTHONPATH=exports uv run python -m agent_name validate"
echo " PYTHONPATH=exports uv run python -m agent_name info"
echo " PYTHONPATH=exports uv run python -m agent_name run --input '{...}'"
echo ""
echo "Available commands for your new agent:"
echo " PYTHONPATH=exports uv run python -m support_ticket_agent validate"
echo " PYTHONPATH=exports uv run python -m support_ticket_agent info"
echo " PYTHONPATH=exports uv run python -m support_ticket_agent run --input '{\"ticket_content\":\"...\",\"customer_id\":\"...\",\"ticket_id\":\"...\"}'"
echo ""
echo "To build new agents, use Claude Code skills:"
echo " • /building-agents - Build a new agent"
echo " • /testing-agent - Test an existing agent"
echo ""
echo "Documentation: ${PROJECT_ROOT}/README.md"
echo "Agent Examples: ${PROJECT_ROOT}/exports/"
echo ""
@@ -1,79 +0,0 @@
#!/bin/bash
# Legacy Web Application Setup Script
# NOTE: This script is for the archived honeycomb/hive web application.
# For agent development, use: ./quickstart.sh
set -e
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
echo "==================================="
echo " Legacy Web App Setup (Archived)"
echo "==================================="
echo ""
echo "⚠️ This script is for the archived web application."
echo " For agent development, use: ./quickstart.sh"
echo ""
# Check for Node.js
if ! command -v node &> /dev/null; then
echo "Error: Node.js is not installed."
echo "Please install Node.js 20+ from https://nodejs.org"
exit 1
fi
NODE_VERSION=$(node -v | cut -d'v' -f2 | cut -d'.' -f1)
if [ "$NODE_VERSION" -lt 20 ]; then
echo "Error: Node.js 20+ is required (found v$NODE_VERSION)"
exit 1
fi
echo "✓ Node.js $(node -v) detected"
# Check for Docker (optional)
if command -v docker &> /dev/null; then
echo "✓ Docker $(docker --version | cut -d' ' -f3 | tr -d ',') detected"
else
echo "⚠ Docker not found (optional, needed for containerized deployment)"
fi
echo ""
# Create config.yaml if it doesn't exist
if [ ! -f "$PROJECT_ROOT/config.yaml" ]; then
echo "Creating config.yaml from template..."
cp "$PROJECT_ROOT/config.yaml.example" "$PROJECT_ROOT/config.yaml"
echo "✓ Created config.yaml"
echo ""
echo " Please review and edit config.yaml with your settings."
echo ""
else
echo "✓ config.yaml already exists"
fi
# Install dependencies
echo ""
echo "Installing dependencies..."
cd "$PROJECT_ROOT"
npm install
echo "✓ Dependencies installed"
# Generate environment files
echo ""
echo "Generating environment files from config.yaml..."
npx tsx scripts/generate-env.ts
echo "✓ Environment files generated"
echo ""
echo "==================================="
echo " Setup Complete (Legacy)"
echo "==================================="
echo ""
echo "⚠️ NOTE: The honeycomb/hive web application has been archived."
echo ""
echo "For agent development, please use:"
echo " ./quickstart.sh"
echo ""
echo "See ENVIRONMENT_SETUP.md for complete agent development guide."
echo ""
@@ -26,6 +26,7 @@ from .email_tool import register_tools as register_email
from .example_tool import register_tools as register_example
from .file_system_toolkits.apply_diff import register_tools as register_apply_diff
from .file_system_toolkits.apply_patch import register_tools as register_apply_patch
from .file_system_toolkits.data_tools import register_tools as register_data_tools
from .file_system_toolkits.execute_command_tool import (
register_tools as register_execute_command,
)
@@ -82,6 +83,7 @@ def register_all_tools(
register_apply_patch(mcp)
register_grep_search(mcp)
register_execute_command(mcp)
register_data_tools(mcp)
register_csv(mcp)
return [
@@ -97,6 +99,9 @@ def register_all_tools(
"apply_patch",
"grep_search",
"execute_command_tool",
"load_data",
"save_data",
"list_data_files",
"csv_read",
"csv_write",
"csv_append",
@@ -0,0 +1,3 @@
from .data_tools import register_tools
__all__ = ["register_tools"]
@@ -0,0 +1,179 @@
"""
Data Tools - Load, save, and list data files for agent pipelines.
These tools let agents store large intermediate results in files and
retrieve them with pagination, keeping the LLM conversation context small.
Used in conjunction with the spillover system: when a tool result is too
large, the framework writes it to a file and the agent can load it back
with load_data().
"""
from __future__ import annotations
import json
from pathlib import Path
from mcp.server.fastmcp import FastMCP
def register_tools(mcp: FastMCP) -> None:
"""Register data management tools with the MCP server."""
@mcp.tool()
def save_data(filename: str, data: str, data_dir: str) -> dict:
"""
Purpose
Save data to a file for later retrieval by this or downstream nodes.
When to use
Store large results (search results, profiles, analysis) instead
of passing them inline through set_output.
Returns a brief summary with the filename to reference later.
Rules & Constraints
filename must be a simple name like 'results.json'; no paths or '..'
data_dir must be the absolute path to the data directory
Args:
filename: Simple filename like 'github_users.json'. No paths or '..'.
data: The string data to write (typically JSON).
data_dir: Absolute path to the data directory.
Returns:
Dict with success status and file metadata, or error dict
"""
if not filename or ".." in filename or "/" in filename or "\\" in filename:
return {"error": "Invalid filename. Use simple names like 'users.json'"}
if not data_dir:
return {"error": "data_dir is required"}
try:
dir_path = Path(data_dir)
dir_path.mkdir(parents=True, exist_ok=True)
path = dir_path / filename
path.write_text(data, encoding="utf-8")
lines = data.count("\n") + 1
return {
"success": True,
"filename": filename,
"size_bytes": len(data.encode("utf-8")),
"lines": lines,
"preview": data[:200] + ("..." if len(data) > 200 else ""),
}
except Exception as e:
return {"error": f"Failed to save data: {str(e)}"}
@mcp.tool()
def load_data(
filename: str,
data_dir: str,
offset: int = 0,
limit: int = 50,
) -> dict:
"""
Purpose
Load data from a previously saved file with pagination.
When to use
Retrieve large tool results that were spilled to disk.
Read data saved by save_data or by the spillover system.
Page through large files without loading everything into context.
Rules & Constraints
filename must match a file in data_dir
Returns a page of lines with metadata about the full file
Args:
filename: The filename to load (as shown in spillover messages or save_data results).
data_dir: Absolute path to the data directory.
offset: 0-based line number to start reading from. Default 0.
limit: Max number of lines to return. Default 50.
Returns:
Dict with content, pagination info, and metadata
Examples:
load_data('users.json', '/path/to/data') # first 50 lines
load_data('users.json', '/path/to/data', offset=50, limit=50) # next 50
load_data('users.json', '/path/to/data', limit=200) # first 200 lines
"""
if not filename or ".." in filename or "/" in filename or "\\" in filename:
return {"error": "Invalid filename"}
if not data_dir:
return {"error": "data_dir is required"}
try:
offset = int(offset)
limit = int(limit)
path = Path(data_dir) / filename
if not path.exists():
return {"error": f"File not found: {filename}"}
content = path.read_text(encoding="utf-8")
size_bytes = len(content.encode("utf-8"))
# If content is a single long line, try to pretty-print JSON so
# line-based pagination actually works.
all_lines = content.split("\n")
if len(all_lines) <= 2 and size_bytes > 500:
try:
parsed = json.loads(content)
content = json.dumps(parsed, indent=2, ensure_ascii=False)
all_lines = content.split("\n")
except (json.JSONDecodeError, TypeError, ValueError):
pass
total = len(all_lines)
start = min(offset, total)
end = min(start + limit, total)
sliced = all_lines[start:end]
return {
"success": True,
"filename": filename,
"content": "\n".join(sliced),
"total_lines": total,
"size_bytes": size_bytes,
"offset": start,
"lines_returned": len(sliced),
"has_more": end < total,
}
except Exception as e:
return {"error": f"Failed to load data: {str(e)}"}
@mcp.tool()
def list_data_files(data_dir: str) -> dict:
"""
Purpose
List all data files in the data directory.
When to use
Discover what intermediate results or spillover files are available.
Check what data was saved by previous nodes in the pipeline.
Args:
data_dir: Absolute path to the data directory.
Returns:
Dict with list of files and their sizes
"""
if not data_dir:
return {"error": "data_dir is required"}
try:
dir_path = Path(data_dir)
if not dir_path.exists():
return {"files": []}
files = []
for f in sorted(dir_path.iterdir()):
if f.is_file():
files.append(
{
"filename": f.name,
"size_bytes": f.stat().st_size,
}
)
return {"files": files}
except Exception as e:
return {"error": f"Failed to list data files: {str(e)}"}
@@ -11,6 +11,7 @@ Auto-detection: If provider="auto", tries Brave first (backward compatible), the
from __future__ import annotations
import os
import time
from typing import TYPE_CHECKING, Literal
import httpx
@@ -35,27 +36,35 @@ def register_tools(
cse_id: str,
) -> dict:
"""Execute search using Google Custom Search API."""
response = httpx.get(
"https://www.googleapis.com/customsearch/v1",
params={
"key": api_key,
"cx": cse_id,
"q": query,
"num": min(num_results, 10),
"lr": f"lang_{language}",
"gl": country,
},
timeout=30.0,
)
max_retries = 3
for attempt in range(max_retries + 1):
response = httpx.get(
"https://www.googleapis.com/customsearch/v1",
params={
"key": api_key,
"cx": cse_id,
"q": query,
"num": min(num_results, 10),
"lr": f"lang_{language}",
"gl": country,
},
timeout=30.0,
)
if response.status_code == 401:
return {"error": "Invalid Google API key"}
elif response.status_code == 403:
return {"error": "Google API key not authorized or quota exceeded"}
elif response.status_code == 429:
return {"error": "Google rate limit exceeded. Try again later."}
elif response.status_code != 200:
return {"error": f"Google API request failed: HTTP {response.status_code}"}
if response.status_code == 429 and attempt < max_retries:
time.sleep(2**attempt)
continue
if response.status_code == 401:
return {"error": "Invalid Google API key"}
elif response.status_code == 403:
return {"error": "Google API key not authorized or quota exceeded"}
elif response.status_code == 429:
return {"error": "Google rate limit exceeded. Try again later."}
elif response.status_code != 200:
return {"error": f"Google API request failed: HTTP {response.status_code}"}
break
data = response.json()
results = []
@@ -82,26 +91,34 @@ def register_tools(
api_key: str,
) -> dict:
"""Execute search using Brave Search API."""
response = httpx.get(
"https://api.search.brave.com/res/v1/web/search",
params={
"q": query,
"count": min(num_results, 20),
"country": country,
},
headers={
"X-Subscription-Token": api_key,
"Accept": "application/json",
},
timeout=30.0,
)
max_retries = 3
for attempt in range(max_retries + 1):
response = httpx.get(
"https://api.search.brave.com/res/v1/web/search",
params={
"q": query,
"count": min(num_results, 20),
"country": country,
},
headers={
"X-Subscription-Token": api_key,
"Accept": "application/json",
},
timeout=30.0,
)
if response.status_code == 401:
return {"error": "Invalid Brave API key"}
elif response.status_code == 429:
return {"error": "Brave rate limit exceeded. Try again later."}
elif response.status_code != 200:
return {"error": f"Brave API request failed: HTTP {response.status_code}"}
if response.status_code == 429 and attempt < max_retries:
time.sleep(2**attempt)
continue
if response.status_code == 401:
return {"error": "Invalid Brave API key"}
elif response.status_code == 429:
return {"error": "Brave rate limit exceeded. Try again later."}
elif response.status_code != 200:
return {"error": f"Brave API request failed: HTTP {response.status_code}"}
break
data = response.json()
results = []
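Both providers now retry HTTP 429 with exponential backoff before surfacing an error. A sketch of that loop in isolation (the function name is illustrative, not part of the tool code):

```python
import time

import httpx


def get_with_backoff(url: str, max_retries: int = 3, **kwargs) -> httpx.Response:
    """Retry on HTTP 429, sleeping 1s, 2s, 4s between attempts; return the last response."""
    for attempt in range(max_retries + 1):
        response = httpx.get(url, **kwargs)
        if response.status_code == 429 and attempt < max_retries:
            time.sleep(2**attempt)  # exponential backoff: 2^attempt seconds
            continue
        return response
    return response  # not reached; the final attempt always returns above
```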
Generated file (+3419 lines): diff suppressed because it is too large.
Generated file (+3478 lines): diff suppressed because it is too large.