Merge pull request #5251 from vincentjiang777/main

docs: roadmap updates for architecure v3
This commit is contained in:
RichardTang-Aden
2026-02-22 20:48:43 -08:00
committed by GitHub
+725 -192
View File
@@ -125,229 +125,762 @@ flowchart TB
---
## Phase 1: Foundation
## Core Architecture & Swarm Primitives
### Backbone Architecture
### Node-Based Architecture
Implement the core execution engine where every Agent operates as an isolated, asynchronous graph of nodes.
- [ ] **Node-Based Architecture (Agent as a node)**
- [x] Object schema definition
- [x] Node wrapper SDK
- [x] Shared memory access
- [ ] Default monitoring hooks
- [x] Tool access layer
- [x] LLM integration layer (Natively supports all mainstream LLMs through LiteLLM)
- [x] Anthropic
- [x] OpenAI
- [x] Google
- [x] **Communication protocol between nodes**
- [x] **[Coding Agent] Goal Creation Session** (separate from coding session)
- [x] Instruction back and forth
- [x] Goal Object schema definition
- [x] Being able to generate the test cases
- [x] Test case validation for worker agent (Outcome driven)
- [ ] **[Coding Agent] Worker Agent Creation**
- [x] Coding Agent tools
- [ ] Use Template Agent as a start
- [x] Use our MCP tools
- [ ] **[Worker Agent] Human-in-the-Loop**
- [x] Worker Agents request with questions and options
- [x] Callback Handler System to receive events throughout execution
- [x] Tool-Based Intervention Points (tool to pause execution and request human input)
- [x] Multiple entrypoint for different event source (e.g. Human input, webhook)
- [ ] Streaming Interface for Real-time Monitoring
- [x] Request State Management
- [x] **Core Node Implementation**
- [x] NodeProtocol with JSON parsing utilities (graph/node.py)
- [x] EventLoopNode with LLM conversation management (graph/event_loop_node.py)
- [x] Flexible input/output keys with nullable output handling
- [x] Node wrapper SDK for agent creation
- [x] Tool access layer with MCP integration
- [x] **Graph Executor**
- [x] Graph traversal execution (graph/executor.py)
- [x] Node transition management
- [x] Error handling and output mapping
- [x] ExecutionResult with success/error status
- [x] **Shared Memory Access**
- [x] SharedState manager (runtime/shared_state.py)
- [x] Session-based storage (storage/session_store.py)
- [x] Isolation levels: ISOLATED, SHARED, SYNCHRONIZED
- [ ] **Default Monitoring Hooks**
- [ ] Performance metrics collection
- [ ] Resource usage tracking
- [ ] Health check endpoints
### Credential Management
### Node Protocol
Build the standard communication protocol for inter-node messaging and data passing.
- [x] **Credentials Setup Process**
- [x] Install Credential MCP
- [x] **Pluggable Credential Sources**
- [x] **Abstraction & Local Sources**
- [x] Introduce `CredentialSource` base class
- [x] Refactor existing logic into `EnvVarSource`
- [x] Implementation of Source Priority Chain mechanism
- [ ] Foundation unit tests
- [ ] **Enterprise Secret Managers**
- [x] `VaultSource` (HashiCorp Vault)
- [ ] `AWSSecretsSource` (AWS Secrets Manager)
- [ ] `AzureKeyVaultSource` (Azure Key Vault)
- [ ] Management of optional provider dependencies
- [ ] **Advanced Features**
- [x] Credential expiration and auto-refresh
- [x] **Edge Specifications**
- [x] ALWAYS: Always traverse (graph/edge.py)
- [x] ON_SUCCESS: Success-based routing
- [x] ON_FAILURE: Failure-based routing
- [x] CONDITIONAL: Expression-based routing with safe_eval
- [x] LLM_DECIDE: Goal-aware LLM-powered routing
- [x] **Event Bus System**
- [x] Full event bus implementation (runtime/event_bus.py)
- [x] LLM text deltas, tool calls, node transitions
- [x] Graph-scoped event routing for multi-agent scenarios
- [x] **Conversation Management**
- [x] NodeConversation tracks message history (graph/conversation.py)
- [x] Tool results, streaming content, metadata support
### Judge in Event Loop
A separate LLM-powered judge to determine if the workers finish their job.
- [x] **Conversation Judge (Level 2)**
- [x] Evaluates node completion against success criteria (graph/conversation_judge.py)
- [x] Reads recent conversation and assesses quality
- [x] Returns verdict: ACCEPT or RETRY with confidence scores
- [x] **Test Evaluation Judge**
- [x] Provider-agnostic (OpenAI, Anthropic, Google Gemini) (testing/llm_judge.py)
- [x] JSON response parsing for structured evaluation
- [ ] **Multi-Level Judgment Integration**
- [ ] Judge node integration with event loop
- [ ] Automatic retry logic based on judge verdict
- [ ] Judge performance monitoring
### Swarm Hierarchy
Develop the distinct behavioral logic for the Queen Bee (Orchestrator), Judge Bee (Evaluator), and Worker Bee (Executor).
- [x] **Judge Bee (Evaluator)**
- [x] Evaluation criteria framework (graph/goal.py)
- [x] Success/failure determination
- [x] Quality assessment with confidence scores
- [x] **Hive Coder Agent (Builder)**
- [x] Coder node: forever-alive event loop (agents/hive_coder/nodes/)
- [x] Guardian node: event-driven watchdog for supervised agents
- [x] Tool discovery (discover_mcp_tools)
- [x] Agent aware (list_agents, inspect sessions)
- [x] Post-build testing (run_agent_tests)
- [x] Debugging capabilities (inspect checkpoints, memory)
- [ ] **Queen Bee (Orchestrator)**
- [ ] Multi-agent coordination layer
- [ ] Task distribution logic
- [ ] Dynamic worker agent creation
- [ ] Swarm-level goal management
- [ ] **Worker Bee (Executor)**
- [ ] Worker taxonomy definition
- [ ] Worker agent templates
- [ ] Task execution patterns
### Coding Agent Workflows
Implement the Goal Creation Session via the Queen Bee and the dynamic Worker Agent Creation flow.
- [x] **Goal Creation Session**
- [x] Goal object schema definition (graph/goal.py)
- [x] SuccessCriterion: Measurable success (5+ criteria per goal)
- [x] Constraint: Hard/soft boundaries (time, cost, safety, scope, quality)
- [x] GoalStatus: DRAFT → READY → ACTIVE → COMPLETED/FAILED
- [x] Instruction back and forth in Hive Coder
- [x] Test case generation
- [x] Test case validation for worker agent
- [x] **Agent Creation Flow**
- [x] Hive Coder reads templates and discovers tools (mcp/agent_builder_server.py)
- [x] Generates agent.py, nodes/__init__.py, config.py
- [x] MCP server configuration discovery
- [x] Dynamic tool binding
- [ ] **Worker Agent Dynamic Creation**
- [ ] Template agent initialization from Queen Bee
- [ ] Runtime worker instantiation
- [ ] Worker lifecycle management
### Security Layer
Build robust, local Credential Management interfaces for secure API key handling.
- [x] **Unified Credential Store**
- [x] Multi-backend storage (credentials/store.py)
- [x] EncryptedFileStorage: Encrypted local storage (~/.hive/credentials)
- [x] EnvVarStorage: Environment variable mapping
- [x] InMemoryStorage: Testing
- [x] HashiCorp Vault: Enterprise secrets (credentials/storage.py)
- [x] Template resolution: `{{cred.key}}` patterns
- [x] Caching with TTL (default 5 min, configurable)
- [x] Thread-safe operations with RLock
- [x] **OAuth2 Providers**
- [x] Base provider pattern (credentials/oauth2/)
- [x] HubSpot provider integration
- [x] Lifecycle management (refresh tokens)
- [x] Browser opening for auth flows (tools/credentials/browser.py)
- [x] **Aden Sync Provider**
- [x] Syncs OAuth2 tokens from Aden authentication server (credentials/aden/)
- [x] Falls back to local storage if Aden unavailable
- [x] Auto-refresh on sync
- [ ] **Enterprise Secret Managers**
- [ ] AWS Secrets Manager integration
- [ ] Azure Key Vault integration
- [ ] Audit logging for compliance/tracking
- [ ] Per-environment configuration support
- [ ] **Documentation & DX**
- [ ] Comprehensive source documentation
- [ ] Example configurations for all providers
- [x] **Integration as tools coverage**
- [x] Gsuite Tools
- [x] Social Media
- [ ] Twitter(X)
- [x] Github
- [ ] Instagram
- [ ] SAAS
- [ ] Hubspot
- [ ] Slack
- [ ] Teams
- [ ] Zoom
- [ ] Stripe
- [ ] Salesforce
> [!IMPORTANT]
> **Community Contribution Wanted**: We appreciate help from the community to expand the "Integration as tools" capability. Leave an issue of the integration you want to support via Hive!
### Essential Tools
- [x] **File Use Tool Kit**
- [x] **Memory Tools**
- [x] STM Layer Tool (state-based short-term memory)
- [x] LTM Layer Tool (RLM - long-term memory)
- [ ] **Infrastructure Tools**
- [x] Runtime Log Tool (logs for coding agent)
- [x] Web Search
- [x] Web Scraper
- [x] CSV tools
- [x] PDF tools
- [ ] Excel tools
- [ ] Email Tools
- [ ] Recipe for "Add your own tools"
### Memory & File System
- [x] DB for long-term persistent memory (Filesystem as durable scratchpad pattern)
- [x] Session Local memory isolation
### Eval System (Basic)
- [x] Test Driven - Run test case for all agent iteration
- [ ] Failure recording mechanism
- [ ] SDK for defining failure conditions
- [ ] Basic observability hooks
- [ ] User-driven log analysis (OSS approach)
### Data Validation
- [x] Natively Support data validation of LLMs output with Pydantic
### Developer Experience
- [ ] **MVP Features**
- [ ] Debugging mode
- [ ] CLI tools for memory management
- [ ] CLI tools for credential management
- [ ] **MVP Resources & Documentation**
- [x] Quick start guide
- [x] Goal creation guide
- [x] Agent creation guide
- [x] GitHub Page setup
- [x] README with examples
- [x] Contributing guidelines
- [ ] Introduction Video
### Adaptiveness
- [ ] Runtime data feedback loop
- [ ] Instant Developer Feedback for improvement
### Sample Agents
- [ ] Knowledge Agent
- [ ] Blog Writer Agent
- [ ] SDR Agent
---
## Phase 2: Expansion
## Tooling Ecosystem & General Compute
### Basic Guardrails
### Sub-agents Parallel Execution
Develop the Sub-agent execution environment for parallel tasks execution. The subagents are designed with isolation for repeatability.
- [ ] Support Basic Monitoring from Agent node SDK
- [ ] SDK guardrail implementation (in node)
- [ ] Guardrail type support (Determined Condition as Guardrails)
- [x] **Multi-Graph Sessions**
- [x] Load multiple agent graphs in single session (runtime/agent_runtime.py)
- [x] Shared state between graphs
- [x] Independent execution streams
- [x] Graph lifecycle management (load/unload/start/restart)
- [x] **Concurrent Execution Management**
- [x] Max concurrent executions configuration
- [x] Isolation levels: isolated, shared, synchronized
- [ ] **Sub-agent Execution Environment**
- [ ] Isolated sub-agent runtime environment
- [ ] Task isolation mechanisms
- [ ] Result aggregation
- [ ] Error handling for parallel tasks
- [ ] Repeatability guarantees
### Agent Capability
### Browser Use Node
Implement native browser-integrated automation so agents can take over a browser for auth and agents perform the automation jobs. This node comes with a specific set of tools and system prompts.
- [ ] Streaming mode support
- [ ] Image Generation support
- [ ] Take end user input Image and flatfile understand capability
- [x] **Web Scraping with Playwright**
- [x] Headless Chromium launch (tools/web_scrape_tool/)
- [x] Stealth mode via playwright_stealth
- [x] JavaScript rendering with wait-for-domcontentloaded
- [x] CSS selector support
- [x] User-agent spoofing
- [x] Sandbox/automation detection evasion
- [x] **Browser Launch Utilities**
- [x] Platform-specific browser opening (macOS/Linux/Windows) (tools/credentials/browser.py)
- [x] OAuth2 flow integration
- [ ] **Full Browser Use Node**
- [ ] Multi-page automation workflows
- [ ] Form filling with vision-guided interactions
- [ ] Interactive screenshot capabilities
- [ ] Session management across navigations
- [ ] Browser-specific tool set
- [ ] System prompts for browser tasks
### Event-loop For Nodes (Opencode-style)
### Core Graph Framework Infra
Ship essential framework utilities: Node validation, HITL (Human-in-the-loop pause/approve), and node lifecycle management.
- [ ] **Event bus**
- [x] **Node Validation**
- [x] Pydantic-based validation
- [x] Schema enforcement
- [x] Output key validation (Level 0)
- [x] **Human-in-the-Loop (HITL)**
- [x] HITLRequest and HITLResponse protocol (graph/hitl.py)
- [x] Question types: FREE_TEXT, STRUCTURED, SELECTION, APPROVAL, MULTI_FIELD
- [x] Haiku-powered response parsing
- [x] User-friendly display formatting
- [x] Pause/approve workflow
- [x] State saved to checkpoint
- [x] Resume with HITLResponse merged into context
- [x] **TUI Integration**
- [x] Chat REPL with streaming support (tui/app.py)
- [x] Multi-graph session management
- [x] User presence detection
- [x] Real-time log viewing
- [x] **Node Lifecycle Management**
- [x] Start/stop/pause/resume in execution stream
- [x] State persistence via checkpoint store
- [x] Recovery mechanisms with checkpoint restore
- [ ] **Advanced HITL Features**
- [ ] Callback handlers for custom intervention logic
- [ ] Streaming interface for real-time monitoring
- [ ] Approval workflows at scale
### Memory System Iteration
### Infrastructure Tools
Port popular tools, and build out the Runtime Log, Audit Trail, Excel, and Email integrations.
- [x] **File Operations (36+ tools)**
- [x] read_file, write_file, edit_file (mcp/agent_builder_server.py)
- [x] list_directory, search_files
- [x] apply_diff / apply_patch for code modification (tools/file_system_toolkits/)
- [x] data_tools (CSV/Excel parsing)
- [x] **Web Tools**
- [x] Web Search (tools/web_search_tool/)
- [x] Web Scraper (tools/web_scrape_tool/)
- [x] Exa Search (tools/exa_search_tool/)
- [x] News Tool (tools/news_tool/)
- [x] SerpAPI (tools/serpapi_tool/)
- [x] **Data Tools**
- [x] CSV tools (tools/csv_tool/)
- [x] Excel tools (tools/excel_tool/)
- [x] PDF tools (tools/pdf_read_tool/)
- [x] Vision tool for image analysis (tools/vision_tool/)
- [x] Time tool (tools/time_tool/)
- [x] **Communication Tools (8 tools)**
- [x] Email tool (tools/email_tool/)
- [x] Gmail tool (tools/gmail_tool/)
- [x] Slack tool (tools/slack_tool/)
- [x] Discord tool (tools/discord_tool/)
- [x] Telegram tool (tools/telegram_tool/)
- [x] Google Docs (tools/google_docs_tool/)
- [x] Google Maps (tools/google_maps_tool/)
- [x] Cal.com (tools/calcom_tool/)
- [x] **CRM/API Integrations (5+ tools)**
- [x] HubSpot (tools/hubspot_tool/)
- [x] GitHub (tools/github_tool/)
- [x] Apollo (tools/apollo_tool/)
- [x] BigQuery (tools/bigquery_tool/)
- [x] Razorpay (tools/razorpay_tool/)
- [x] Calendar (tools/calendar_tool/)
- [x] **Security/Scanning Tools (5 tools)**
- [x] DNS Security Scanner (tools/dns_security_scanner/)
- [x] SSL/TLS Scanner (tools/ssl_tls_scanner/)
- [x] Port Scanner (tools/port_scanner/)
- [x] Subdomain Enumerator (tools/subdomain_enumerator/)
- [x] Tech Stack Detector (tools/tech_stack_detector/)
- [x] **Runtime & Logging**
- [x] Runtime Log Tool (tools/runtime_logs_tool/)
- [x] Runtime Logger with L1/L2/L3 levels (runtime/runtime_logger.py)
- [ ] **Audit Trail System**
- [ ] Decision tracing beyond logs
- [ ] Compliance reporting
- [ ] Historical query capabilities
---
## Memory, Storage & File System Capabilities
### Memory Tools
Simple pure file-based memory management
- [x] **Short-Term Memory (STM)**
- [x] SharedState manager for in-memory state (runtime/shared_state.py)
- [x] Session-based storage (storage/session_store.py)
- [x] State-based short-term memory layer
- [x] **Conversation Memory**
- [x] NodeConversation tracks message history (graph/conversation.py)
- [x] Tool results, streaming content, metadata
- [x] Context building for LLM prompts
- [ ] **Long-Term Memory (LTM)**
- [ ] Semantic indexing for memory retrieval
- [ ] RLM (Retrieval-augmented Long-term Memory) implementation
- [ ] Memory persistence beyond session
- [ ] Content-based memory search
### Durable Scratchpad
Integrate a lightweight, persistent DB for long-term memory using the filesystem-as-scratchpad pattern.
- [x] **Filesystem as Scratchpad**
- [x] File-based persistence layer (storage/)
- [x] Session store implementation
- [x] Data durability guarantees
- [x] **Checkpoint System**
- [x] Save/restore execution state (storage/checkpoint_store.py)
- [x] TTL-based cleanup
- [x] Async checkpoint support
- [x] Max age configuration
- [ ] **Message Model & Session Management**
- [ ] Introduce `Message` class with structured content types
- [ ] Implement `Session` classes for conversation state
- [ ] **Storage Migration**
- [ ] Implement granular per-message file persistence (`/message/[agentID]/...`)
- [ ] Migrate from monolithic run storage
- [ ] **Context Building & Conversation Loop**
- [ ] Implement `Message.stream(sessionID)`
- [ ] Update `EventLoopNode.execute()` for full context building
- [ ] Implement `Message.toModelMessages()` conversion
- [ ] **Proactive Compaction**
- [ ] Implement proactive overflow detection
- [ ] Develop backward-scanning pruning strategy (e.g., clearing old tool outputs)
- [ ] **Enhanced Token Tracking**
- [ ] Extend `LLMResponse` to track reasoning and cache tokens
- [ ] Integrate granular token metrics into compaction logic
- [ ] Message class with structured content types
- [ ] Session classes for conversation state
- [ ] Per-message file persistence
- [ ] Migration from monolithic run storage
### Coding Agent Support
### Memory Isolation
Enforce session-local memory isolation to prevent data bleed between concurrent agent runs.
- [ ] Claude Code
- [ ] Cursor
- [ ] Opencode
- [ ] Antigravity
- [ ] Codex CLI (in progress)
- [x] **Session Isolation**
- [x] Session-local memory implementation (storage/session_store.py)
- [x] Data bleed prevention
- [x] Concurrent run safety
- [x] Isolation levels: ISOLATED, SHARED, SYNCHRONIZED
- [x] **State Management**
- [x] SharedState with thread-safe operations (runtime/shared_state.py)
- [x] Session-scoped state access
- [ ] **Context Management**
- [ ] Message.stream(sessionID) implementation
- [ ] Full context building optimization
- [ ] Message to model conversion improvements
### File System Enhancement
### Agent Capabilities
Implement File I/O support, streaming mode, and allow users to supply custom functions as libraries/nodes.
- [ ] Semantic Search integration
- [ ] Interactive File System in product (frontend integration)
- [x] **File I/O**
- [x] File read/write operations (mcp/agent_builder_server.py)
- [x] File system navigation
- [x] Directory listing and search
- [x] **Execution Streaming**
- [x] Real-time event streaming (runtime/execution_stream.py)
- [x] Token-by-token output via event bus
- [x] Tool call streaming
- [x] **Custom Tool Integration**
- [x] MCP server discovery (mcp/agent_builder_server.py)
- [x] Dynamic tool binding
- [x] Custom tool registration
- [ ] **Streaming Mode Enhancements**
- [ ] Progressive result delivery optimization
- [ ] Backpressure handling
- [ ] **Custom Function Libraries**
- [ ] User-supplied function libraries as nodes
- [ ] Library versioning and management
- [ ] **Proactive Memory Compaction**
- [ ] Overflow detection
- [ ] Backward-scanning pruning strategy
- [ ] Token tracking integration for compaction decisions
### More Worker Tools
### File System Enhancements
Add semantic search capabilities and an interactive file system for frontend product integration.
- [ ] Custom Tool Integrator
- [ ] Integration as a tool (Credential Store & Support)
- [ ] **Core Agent Tools**
- [ ] Node Discovery Tool (find other agents in the graph)
- [ ] HITL Tool (pause execution for human approval)
- [ ] Wake-up Tool (resume agent tasks)
- [x] **File Search**
- [x] search_files tool (mcp/agent_builder_server.py)
- [x] Directory traversal
- [ ] **Semantic Search**
- [ ] Semantic indexing of files
- [ ] Natural language file search
- [ ] Content-based retrieval with embeddings
- [ ] **Interactive File System**
- [ ] Frontend file browser integration
- [ ] Real-time file system updates
- [ ] Visual file navigation in GUI
### Deployment (Self-Hosted)
---
- [ ] Worker agent docker container standardization
- [ ] Headless backend execution
- [ ] Exposed API for frontend attachment
- [ ] Local monitoring & observability
- [ ] Basic lifecycle APIs (Start, Stop, Pause, Resume)
## Eval System, DX, & Open Source Guardrails
### Deployment (Cloud)
### Eval System
Build the failure recording mechanism and an SDK for defining custom failure conditions.
- [ ] Cloud Service Options
- [ ] Support deployment to 3rd-party platforms
- [ ] Self-deploy + orchestrator connection
- [ ] **CI/CD Pipeline**
- [ ] Automated test execution
- [ ] Agent version control
- [ ] All tests must pass for deployment
- [x] **Multi-Level Evaluation**
- [x] Level 0: Output key validation (all required keys set)
- [x] Level 1: Literal checks (output_contains, output_equals)
- [x] Level 2: Conversation-aware judgment (graph/conversation_judge.py)
- [x] **Goal-Based Constraints**
- [x] Hard constraints (violation = failure) (graph/goal.py)
- [x] Soft constraints (prefer not to violate)
- [x] Categories: time, cost, safety, scope, quality
- [x] Constraint checking infrastructure
- [x] **Success Criteria Definition**
- [x] Weighted criteria (0.0-1.0)
- [x] Metrics: output_contains, output_equals, llm_judge, custom
- [x] 90% threshold for goal success
- [x] **Test Framework**
- [x] TestCase, TestResult, TestStorage classes (testing/)
- [x] LLM-based judgment for semantic evaluation (testing/llm_judge.py)
- [x] Approval CLI for manual approval workflows
- [x] Categorization and test result reporting
- [ ] **Failure Recording**
- [ ] Failure capture mechanism
- [ ] Failure analysis tools
- [ ] Historical failure tracking
- [ ] Continuous improvement loop
- [ ] **Custom Failure Conditions SDK**
- [ ] SDK for defining custom failure conditions
- [ ] Custom evaluator framework extension
- [ ] Condition validation DSL
### Developer Experience Enhancement
### Guardrails SDK
Implement deterministic condition guardrails directly in the node, complete with mitigation tracking and audit logs.
- [ ] Tool usage documentation
- [ ] Discord Support Channel
- [x] **Goal Constraints (Basic Guardrails)**
- [x] Hard/soft constraint definitions (graph/goal.py)
- [x] Constraint checking in goals
- [ ] **Deterministic Guardrails SDK**
- [ ] In-node guardrail implementation
- [ ] Condition-based guardrails
- [ ] Guardrail SDK for custom rules
- [ ] **Monitoring & Tracking**
- [ ] Mitigation tracking for violations
- [ ] Audit log system for guardrails
- [ ] Compliance reporting
- [ ] **Basic Monitoring Hooks**
- [ ] Agent node SDK monitoring hooks
- [ ] Event hook system for guardrails
- [ ] Default monitoring hooks in nodes
### More Agent Templates
### DevTools CLI
Release CLI tools specifically for rapid memory management and credential store editing.
- [ ] GTM Sales Agent (workflow)
- [ ] GTM Marketing Agent (workflow)
- [ ] Analytics Agent
- [ ] Training Agent
- [ ] Smart Entry / Form Agent (self-evolution emphasis)
- [x] **Main CLI**
- [x] Run, info, validate, list commands (cli.py)
- [x] Dispatch mode for batch execution
- [x] Shell mode for interactive use
- [x] Model selection configuration
- [x] **Testing CLI**
- [x] test-run, test-debug, test-list, test-stats (testing/cli.py)
- [x] Pytest integration
- [x] Test categorization
- [x] **TUI (Terminal UI)**
- [x] Interactive chat with streaming (tui/app.py)
- [x] Multi-graph management UI
- [x] Log pane for real-time output
- [x] Keyboard shortcuts (Ctrl+C, Ctrl+D, etc.)
- [ ] **Memory Management CLI**
- [ ] Memory inspection commands
- [ ] Memory cleanup utilities
- [ ] Session management commands
- [ ] **Credential Store CLI**
- [ ] Interactive credential editing
- [ ] Secure credential viewer
- [ ] Credential validation tools
- [ ] **Debugging Tools**
- [ ] Interactive debugging mode beyond TUI
- [ ] Breakpoint support in execution
- [ ] Step-through execution
### Cross-Platform
### Observability
Support user-driven log analysis, basic monitoring hooks from the SDK, and an interactive debugging mode.
- [ ] JavaScript / TypeScript Version SDK
- [ ] Better windows support
- [x] **Runtime Logging**
- [x] L1 (summary), L2 (detailed), L3 (tool) logging levels (runtime/runtime_logger.py)
- [x] Session logs directory storage
- [x] Audit trail for decision tracing in logs
- [x] **Event Bus Monitoring**
- [x] Real-time event streaming (runtime/event_bus.py)
- [x] LLM text deltas, tool calls, node transitions
- [x] Graph-scoped event routing
- [ ] **Log Analysis Tools**
- [ ] User-driven log analysis (OSS approach)
- [ ] Log aggregation utilities
- [ ] Log visualization tools
- [ ] **Monitoring Hooks**
- [ ] Basic observability hooks from SDK
- [ ] Performance metrics collection
- [ ] Health checks system
- [ ] **Token Tracking**
- [ ] Reasoning token tracking
- [ ] Cache token tracking
- [ ] Token metrics in compaction logic
### Developer Success
Write the Quick Start guide, detailed tool usage documentation, and set up the MVP README examples.
- [x] **Documentation**
- [x] Quick start guide
- [x] Goal creation guide
- [x] Agent creation guide
- [x] README with examples
- [x] Contributing guidelines
- [x] GitHub Page setup
- [ ] **Tool Usage Documentation**
- [ ] Comprehensive tool documentation
- [ ] Tool integration examples
- [ ] Best practices guide
- [ ] **Video Content**
- [ ] Introduction video
- [ ] Tutorial videos
- [ ] **Example Agents**
- [ ] Knowledge agent template
- [ ] Blog writer agent template
- [ ] SDR agent template
---
## Deployment, CI/CD & Community Templates
### Self-Deployment
Standardize the Docker container builds and establish headless backend execution APIs.
- [x] **Docker Support**
- [x] Python 3.11-slim base image (tools/Dockerfile)
- [x] Playwright Chromium installation
- [x] Non-root user for security
- [x] Health check endpoint
- [x] Volume mount for workspace persistence
- [x] Exposes port 4001 for MCP server
- [x] **Agent Runtime**
- [x] AgentRuntime: Top-level orchestrator (runtime/agent_runtime.py)
- [x] Multiple entry points (manual, webhook, timer, event, api)
- [x] Concurrent execution management
- [x] State persistence via session store
- [x] Outcome aggregation
- [x] **Async Entry Points**
- [x] AsyncEntryPointSpec: Webhook, timer, event triggers (graph/edge.py)
- [x] Timer config: cron expressions or interval_minutes
- [x] Event triggers for custom events
- [x] Isolation levels: isolated, shared, synchronized
- [ ] **Headless Backend Enhancements**
- [ ] Standardized backend execution APIs
- [ ] Frontend attachment interface
- [ ] Self-hosted setup guide with examples
### Lifecycle APIs
Expose basic REST/WebSocket endpoints for external control (Start, Stop, Pause, Resume).
- [x] **Webhook Server**
- [x] FastAPI-based webhook server (runtime/webhook_server.py)
- [x] Route configuration per entry point
- [x] Optional secret validation
- [x] **Graph Lifecycle Management**
- [x] Load/unload/start/restart in AgentRuntime
- [x] State persistence
- [x] Recovery mechanisms
- [ ] **REST API Endpoints**
- [ ] Start endpoint for agent execution
- [ ] Stop endpoint for graceful shutdown
- [ ] Pause endpoint for execution suspension
- [ ] Resume endpoint for continuation
- [ ] Status query endpoint for monitoring
- [ ] **WebSocket API**
- [ ] Real-time event streaming to clients
- [ ] Bidirectional communication
- [ ] Connection management with reconnection
### CI/CD Pipelines
Implement automated test execution, agent version control, and mandatory test-passing for deployment.
- [x] **Test Execution**
- [x] Test framework with pytest integration (testing/)
- [x] Test result reporting
- [x] Test CLI commands (test-run, test-debug, etc.)
- [ ] **Automated Testing Pipeline**
- [ ] CI integration (GitHub Actions, etc.)
- [ ] Mandatory test-passing gates
- [ ] Coverage reporting
- [ ] **Version Control**
- [ ] Agent versioning system
- [ ] Semantic versioning for agents
- [ ] Version compatibility checks
- [ ] **Deployment Automation**
- [ ] Continuous deployment pipeline
- [ ] Rollback mechanisms
- [ ] Blue-green deployment support
### Distribution
Launch the official PyPI package, Docker Hub image, and the community Discord channel.
- [ ] **Package Distribution**
- [ ] Official PyPI package
- [ ] Docker Hub image publication
- [ ] Version release automation
- [ ] Installation documentation
- [ ] **Community Channels**
- [ ] Discord channel setup
- [ ] Community support structure
- [ ] Contribution guidelines enforcement
- [ ] **Cloud Deployment**
- [ ] AWS Lambda integration
- [ ] GCP Cloud Functions support
- [ ] Azure Functions support
- [ ] 3rd-party platform integrations
- [ ] Self-deploy with orchestrator connection
### Example Agents
Ship ~20 ready-to-use templates including GTM Sales, Marketing, Analytics, Training, and Smart Entry agents.
- [x] **Hive Coder Agent**
- [x] Agent builder template (agents/hive_coder/)
- [x] Guardian node for supervision
- [ ] **Sales & Marketing Agents**
- [ ] GTM Sales Agent (workflow automation)
- [ ] GTM Marketing Agent (campaign management)
- [ ] Lead generation agent
- [ ] Email campaign agent
- [ ] Social media agent
- [ ] **Analytics & Insights Agents**
- [ ] Analytics Agent (data analysis)
- [ ] Data processing agent
- [ ] Report generation agent
- [ ] Dashboard agent
- [ ] **Training & Education Agents**
- [ ] Training Agent (onboarding)
- [ ] Content creation agent
- [ ] Knowledge base agent
- [ ] Documentation agent
- [ ] **Automation & Forms Agents**
- [ ] Smart Entry / Form Agent (self-evolution emphasis)
- [ ] Data validation agent
- [ ] Workflow automation agent
- [ ] Integration agent
- [ ] **Additional Templates**
- [ ] Customer support agent
- [ ] Document processing agent
- [ ] Scheduling agent
- [ ] Research agent
- [ ] Code review agent
---
## Open Hive
### Local API Gateway
Build a lightweight local server (e.g., FastAPI or Node) that securely exposes the Hive framework's core Event Bus and Memory Layer to the local browser environment.
- [x] **MCP Server Foundation**
- [x] FastMCP server implementation (mcp/agent_builder_server.py)
- [x] Agent builder tools exposed
- [x] Port 4001 exposed in Docker
- [x] **Event Bus Architecture**
- [x] Event Bus implementation (runtime/event_bus.py)
- [x] Real-time event streaming
- [x] Graph-scoped event routing
- [ ] **Local API Gateway**
- [ ] Lightweight local server (FastAPI or Node)
- [ ] Secure authentication layer for browser
- [ ] CORS and security configuration
- [ ] Event Bus API endpoints for browser access
- [ ] Event subscription management for frontend
- [ ] **Memory Layer API**
- [ ] Memory read/write endpoints
- [ ] Session management API for frontend
- [ ] Memory visualization data endpoints
### Visual Graph Explorer
Implement an interactive, drag-and-drop canvas (using libraries like React Flow) to visualize the Worker Graph, Queen Bee, and active execution paths in real-time.
- [ ] **Graph Visualization**
- [ ] React Flow integration
- [ ] Worker Graph rendering from agent definitions
- [ ] Node type visualization (EventLoop, Function, etc.)
- [ ] Edge visualization with condition types
- [ ] Active execution path highlighting
- [ ] **Interactive Features**
- [ ] Drag-and-drop canvas for graph editing
- [ ] Node editing capabilities
- [ ] Real-time graph updates during execution
- [ ] Zoom and pan controls
- [ ] Node inspection on click
- [ ] **Integration with Runtime**
- [ ] Live execution visualization
- [ ] Node state indicators
- [ ] Edge traversal animation
### TUI to GUI Upgrade
Port the existing Terminal User Interface (TUI) into a rich web application, allowing users to interact directly with the Queen Bee / Coding Agent via a browser chat interface.
- [x] **TUI Foundation**
- [x] Terminal chat interface (tui/app.py)
- [x] Streaming support
- [x] Multi-graph management
- [x] Log pane display
- [x] Keyboard shortcuts
- [ ] **Web Application**
- [ ] Modern web UI framework setup (React/Vue/Svelte)
- [ ] Responsive design implementation
- [ ] Cross-browser compatibility
- [ ] **Chat Interface**
- [ ] Browser-based chat UI
- [ ] Hive Coder interaction (Queen Bee proxy)
- [ ] Coding Agent interface
- [ ] Message history and search
- [ ] Rich message formatting (markdown, code blocks)
- [ ] **TUI Feature Parity**
- [ ] All TUI commands in GUI
- [ ] Keyboard shortcuts in browser
- [ ] Command palette (Cmd+K style)
### Memory & State Inspector
Create a UI component to inspect the Shared Memory and Write-Through Conversation Memory, allowing developers to click on any node and see exactly what it is thinking.
- [x] **Runtime Logs Tool**
- [x] Inspect agent session logs (tools/runtime_logs_tool/)
- [x] Session state retrieval (mcp/agent_builder_server.py)
- [ ] **Memory Inspector UI**
- [ ] Shared Memory visualization
- [ ] Conversation memory view (NodeConversation display)
- [ ] Memory search and filter
- [ ] Memory timeline view
- [ ] **Node State Inspection**
- [ ] Click-to-inspect functionality
- [ ] Node thought process display (LLM reasoning)
- [ ] State history timeline per node
- [ ] Input/output inspection
- [ ] **Debug Tools**
- [ ] Memory diff viewer (state changes between nodes)
- [ ] State snapshot comparison
- [ ] Memory leak detection
### Local Control Panel
Build a dashboard for localized Credential Management (editing the ~/.hive/credentials store safely) and swarm lifecycle management (Start, Pause, Kill, and HITL approvals).
- [x] **Credential Management Backend**
- [x] CredentialStore with file/env/vault backends (credentials/store.py)
- [x] OAuth2 provider support (credentials/oauth2/)
- [x] Template resolution and caching
- [ ] **Credential Management Dashboard**
- [ ] Safe credential editing interface (web UI)
- [ ] ~/.hive/credentials store management UI
- [ ] Credential validation and testing UI
- [ ] Encryption status display
- [ ] OAuth2 flow initiation from browser
- [ ] **Swarm Lifecycle Management**
- [ ] Start/Stop controls for agents
- [ ] Pause/Resume functionality
- [ ] Kill process management
- [ ] HITL approval interface in browser
- [ ] Multi-agent orchestration view
- [ ] **Monitoring Dashboard**
- [ ] Active agents display
- [ ] Resource usage monitoring (CPU, memory, tokens)
- [ ] Performance metrics visualization
- [ ] Execution history
### Local Model Integration
Build native frontend configurations to easily connect Open Hive's backend to local open-source inference engines like Ollama, keeping the entire stack offline and private.
- [x] **LLM Integration Layer**
- [x] Provider-agnostic LLM support via LiteLLM (graph/event_loop_node.py)
- [x] Model configuration in agent definitions
- [ ] **Local Model Support**
- [ ] Ollama integration and configuration
- [ ] Local LLM configuration UI
- [ ] Model selection and management dashboard
- [ ] Model performance monitoring
- [ ] **Offline Mode**
- [ ] Full offline functionality (no cloud API calls)
- [ ] Local-only execution mode flag
- [ ] Privacy-first architecture enforcement
- [ ] Local model fallback mechanisms
- [ ] **Model Configuration**
- [ ] Easy model switching in UI
- [ ] Model parameter tuning (temperature, top_p, etc.)
- [ ] Performance optimization settings
- [ ] Multi-model support (different models per node)
- [ ] Model cost tracking for local models
### Cross-Platform Support
- [ ] **JavaScript/TypeScript SDK**
- [ ] TypeScript SDK development
- [ ] npm package distribution
- [ ] Node.js runtime support
- [ ] Browser runtime support
- [ ] **Platform Compatibility**
- [ ] Windows support improvements
- [ ] macOS optimization
- [ ] Linux distribution support
### Coding Agent Integration
- [ ] **IDE Integrations**
- [ ] Claude Code integration
- [ ] Cursor integration
- [ ] Opencode integration
- [ ] Antigravity integration
- [ ] Codex CLI integration (in progress)