Compare commits

1019 Commits

Author SHA1 Message Date
Timothy 65c8e1653c chore: lint 2026-03-17 15:31:36 -07:00
Timothy 58e4fa918c feat: make worker node aware of boundaries 2026-03-17 15:28:41 -07:00
Timothy @aden d2eb86e534 Merge pull request #6540 from sundaram2021/fix/make-windows-compatibility
fix make test compatibility on windows
2026-03-17 11:41:32 -07:00
mma2027 23a7b080eb test: add comprehensive test suite for safe_eval (#4015)
* test: add comprehensive test suite for safe_eval sandboxed evaluator

Adds 113 tests across 14 test classes covering the full surface area of
the safe_eval expression evaluator used by edge conditions:

- Literals, data structures, arithmetic, unary/binary/boolean operators
- Short-circuit semantics for `and`/`or` (including guard patterns)
- Ternary expressions, variable lookup, subscript/attribute access
- Whitelisted function and method calls
- Security boundaries (private attrs, disallowed AST nodes, blocked builtins)
- Real-world EdgeSpec.condition_expr patterns from graph executor usage

* style: fix import sort order

---------

Co-authored-by: mma2027 <mma2027@users.noreply.github.com>
Co-authored-by: hundao <alchemy_wimp@hotmail.com>
2026-03-18 01:01:31 +08:00
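The short-circuit and security-boundary behavior listed in the commit above can be illustrated with a very small AST walker. This is only a sketch under stated assumptions: `tiny_safe_eval` and its operator tables are hypothetical names, and the real safe_eval supports far more (function calls, attribute access, ternaries) than is shown here.

```python
import ast
import operator

# Hypothetical, minimal evaluator for illustration; not the project's safe_eval.
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}
_CMP_OPS = {
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
    ast.Gt: operator.gt,
    ast.Lt: operator.lt,
    ast.Is: operator.is_,
    ast.IsNot: operator.is_not,
}


def tiny_safe_eval(expr, variables=None):
    """Evaluate a small expression subset by walking its AST."""
    variables = variables or {}

    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            # Private names and unknown names are rejected outright.
            if node.id.startswith("_") or node.id not in variables:
                raise ValueError(f"unknown or private name: {node.id}")
            return variables[node.id]
        if isinstance(node, ast.BoolOp):
            # Short-circuit: stop at the first operand that decides the result.
            is_and = isinstance(node.op, ast.And)
            result = None
            for value_node in node.values:
                result = ev(value_node)
                if is_and and not result:
                    return result
                if not is_and and result:
                    return result
            return result
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](ev(node.left), ev(node.right))
        if (
            isinstance(node, ast.Compare)
            and len(node.ops) == 1
            and type(node.ops[0]) in _CMP_OPS
        ):
            return _CMP_OPS[type(node.ops[0])](ev(node.left), ev(node.comparators[0]))
        # Anything not explicitly allowed (calls, attributes, lambdas) is refused.
        raise ValueError(f"disallowed node: {type(node).__name__}")

    return ev(ast.parse(expr, mode="eval"))
```

With this shape, a guard such as `x is not None and x > 3` never evaluates `x > 3` when `x` is `None`, which is the guard pattern the test suite description mentions.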
mma2027 bf39bcdec9 fixed race condition deadlock, missing short-circuit eval, unhandled format exceptions (#4012) 2026-03-18 00:36:54 +08:00
Richard Tang 0276632491 Merge branch 'feat/graph-improvements' 2026-03-17 07:34:10 -07:00
RichardTang-Aden ae2993d0d1 Merge pull request #6528 from Antiarin/feat/trigger-nodes-in-draft-graph
Restore trigger nodes in the new flowchart
2026-03-16 20:54:36 -07:00
RichardTang-Aden d14d71f760 Merge pull request #6549 from aden-hive/staging
release 0.7.2
2026-03-16 20:44:47 -07:00
Richard Tang ef6efc2f55 chore: lint and dead code 2026-03-16 20:44:03 -07:00
Antiarin 738641d35f fix: correct trigger target, label, and SSE event data
- Add name and entry_node to all trigger SSE events (TRIGGER_AVAILABLE,
  TRIGGER_ACTIVATED, TRIGGER_DEACTIVATED) so frontend gets correct data
  immediately instead of guessing
- Use ep.entry_node from backend in polling instead of guessing first
  non-trigger node
- Compute cronToLabel from trigger config during polling so pill labels
  show human-readable schedule
- Fix AsyncMock for event_bus.publish in tests
2026-03-17 09:07:10 +05:30
Antiarin 22f5534f08 fix: ensure Queen calls remove_trigger when user asks to remove scheduler
Added explicit prompt guidance requiring the Queen to call the
remove_trigger tool instead of just saying "it's removed."
2026-03-17 09:07:10 +05:30
Antiarin b79e7eca73 feat: live update trigger pill and detail panel on save
- Handle trigger_updated SSE event to update graph node label and
  config in real time when cron or task is saved
- Use cronToLabel for human-readable schedule display in detail panel
- Add "Saved" button feedback for Save Cron and Save Task (2s toast)
- Update trigger pill label to reflect new schedule on cron save
2026-03-17 09:07:10 +05:30
Antiarin 28250dc45e feat: support cron editing via trigger update API
- Extend PATCH /triggers/{id} to accept trigger_config with cron
  validation via croniter and active timer restart
- Add TRIGGER_UPDATED SSE event so frontend updates in real time
- Update frontend API client to use updateTrigger with config support
- Add tests for task update, cron restart, and invalid cron rejection
2026-03-17 09:07:10 +05:30
Antiarin fe5df6a87a feat: restore trigger node rendering in DraftGraph
Trigger nodes (scheduler, webhook, etc.) stopped appearing after the
v0.7.0 refactor because DraftGraph had no trigger awareness.

- Extract shared utilities (cssVar, truncateLabel, trigger colors/icons,
  useTriggerColors, cronToLabel) into lib/graphUtils.ts
- Render trigger pills above the draft flowchart with pill shape, icons,
  countdown timers, active/inactive status, and click handling
- Draw dashed edges from trigger pills to the correct draft node using
  flowchartMap lookup
- Name all trigger layout constants, fix countdown text color bug
- Include trigger pill extent in SVG viewBox width

Closes #6344
2026-03-17 09:07:10 +05:30
Richard Tang 07e4b593dd fix: write config when changing model with existing key 2026-03-16 20:23:20 -07:00
Timothy 497591bf3b Merge remote-tracking branch 'origin/feat/hive-llm-support' into staging 2026-03-16 19:49:21 -07:00
Timothy a2a3e334d6 Merge branch 'feature/node-node-comm-by-file' into staging 2026-03-16 19:48:45 -07:00
Timothy 1ccbfaf800 Merge branch 'feature/agent-skills' into staging 2026-03-16 19:48:36 -07:00
Timothy a9afa0555c chore: lint 2026-03-16 19:43:19 -07:00
Timothy 83b2183cf0 Merge branch 'feature/agent-skills' into feature/node-node-comm-by-file 2026-03-16 19:37:46 -07:00
bryan c2dea88398 refactor: active node always displaying 2026-03-16 19:30:44 -07:00
Timothy f49e7a760e fix: skill memory keys breaking unrestricted node permissions
Only extend read_keys/write_keys with skill memory keys when the
list was already non-empty (restricted). An empty list means "allow
all" — adding _-prefixed skill keys to an empty list accidentally
activated the permission check and blocked legitimate reads.
2026-03-16 19:27:48 -07:00
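The rule described in the commit above (an empty key list means "allow all", so skill keys may only be appended to an already restricted list) can be sketched as follows. The function names `extend_with_skill_keys` and `can_read` are hypothetical and are not taken from the repository.

```python
def extend_with_skill_keys(keys, skill_keys):
    """Return keys plus skill_keys, but only if keys is already restricted.

    Assumption for this sketch: an empty list means "allow all".
    """
    if not keys:
        # Stay unrestricted; appending here would switch the check on.
        return list(keys)
    return list(keys) + [k for k in skill_keys if k not in keys]


def can_read(key, read_keys):
    """Empty read_keys allows everything; otherwise the key must be listed."""
    return not read_keys or key in read_keys
```

Extending an empty list would turn it into a non-empty allowlist containing only the `_`-prefixed skill keys, which is how legitimate reads ended up blocked.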
bryan dc95c88da0 chore: linter update 2026-03-16 19:22:51 -07:00
Timothy 6e0255ebec fix: lint E501 line-too-long and auto-format 2026-03-16 19:21:27 -07:00
bryan b51e688d1a feat: transition when loading 2026-03-16 19:17:16 -07:00
Timothy 379d3df46b feat: file path first data passing 2026-03-16 19:14:45 -07:00
bryan b77a3031fe refactor: update flowchart.json for templates 2026-03-16 17:27:28 -07:00
bryan c10eea04ec refactor: update graph node colors 2026-03-16 17:26:57 -07:00
Richard Tang 491a3f24da chore: Suppress noisy LiteLLM INFO logs 2026-03-16 16:45:23 -07:00
Timothy c7d70e0fb1 fix: skill injection, tool call timeout 2026-03-16 16:26:16 -07:00
Richard Tang d59f8e99cb chore: prompt users to go to discord for hive key 2026-03-16 16:09:47 -07:00
Richard Tang 0a91b49417 feat: add validation and config for baseURL 2026-03-16 16:07:13 -07:00
Timothy ced64541b9 Merge remote-tracking branch 'origin/main' into feature/agent-skills 2026-03-16 15:45:00 -07:00
Timothy 3c30cfe02b Merge branch 'chore/fix-workspace-queen-message' into feature/agent-skills 2026-03-16 14:52:03 -07:00
Timothy 0d6267bcf1 fix: add delegation notice 2026-03-16 14:49:33 -07:00
Richard Tang b47175d1df feat: add hive llm spec in the quickstart 2026-03-16 14:10:30 -07:00
Timothy 6f23a30eed fix: skill lifecycle to runtime 2026-03-16 13:46:49 -07:00
Sundaram Kumar Jha ff7b5c7e27 fix: prepend ~/.local/bin to PATH so uv is found in Git Bash on Windows 2026-03-17 01:28:25 +05:30
bryan 69f0ff7ac9 chore: linter update 2026-03-16 12:22:29 -07:00
bryan c3f13c50eb docs: remove stale iso 5807 references 2026-03-16 12:22:01 -07:00
bryan 5477408d40 chore: code quality updates 2026-03-16 12:18:46 -07:00
bryan 9fad385ddf fix: return staging phase for disk-loaded agents to prevent false planning loader 2026-03-16 12:14:20 -07:00
bryan cf44ee1d9b refactor: remove AgentGraph, extract shared types, add resizable graph panel 2026-03-16 12:13:56 -07:00
bryan 4ab33a39d6 chore: add generated flowchart.json for template agents 2026-03-16 12:13:29 -07:00
bryan ae19121802 test: add tests for flowchart_utils classification and remap 2026-03-16 12:13:16 -07:00
bryan b518525418 docs: update flowchart schema for 9 types with new color palette 2026-03-16 12:13:06 -07:00
bryan ac3fe38b33 refactor: remove dead shape cases and update imports 2026-03-16 12:12:50 -07:00
bryan 3c6a30fcae refactor: trim queen prompt to 9 flowchart types with dark theme colors 2026-03-16 12:12:35 -07:00
bryan 2ced873fb5 refactor: extract flowchart utils into dedicated module with fallback generation 2026-03-16 12:12:17 -07:00
Timothy @aden ab995d8b96 Merge pull request #6530 from aden-hive/chore/fix-workspace-queen-message
fix(micro-fix): queen message display
2026-03-16 10:52:57 -07:00
Timothy c2e560fc07 fix: queen message display 2026-03-16 10:30:05 -07:00
Timothy 19f7ae862e fix: skill loading log 2026-03-16 10:14:33 -07:00
Timothy 5e9f74744a fix: google sheet tools account param 2026-03-16 10:14:05 -07:00
Timothy 7787179a5a Merge branch 'main' into feature/agent-skills 2026-03-16 09:14:29 -07:00
Timothy @aden b63205b91a Merge pull request #6010 from Antiarin/feat/notion-tool-docs-and-improvements
feat: add Notion tool README, improve tool logic, and expand test coverage
2026-03-16 08:36:11 -07:00
Timothy @aden 347bccb9ee Merge branch 'main' into feat/notion-tool-docs-and-improvements 2026-03-16 08:10:43 -07:00
Timothy @aden 9d83f0298f Merge pull request #6385 from Waryjustice/fix/google-sheets-credentials-orphan
fix: make state.json progress writes atomic in GraphExecutor
2026-03-16 07:25:13 -07:00
Hundao 7f7e8b4dff docs: update Windows guidance to reflect native support (#6519)
quickstart.ps1 and hive.ps1 provide full native Windows support.
Update README, CONTRIBUTING, and environment-setup docs to stop
recommending WSL as the primary path. Also add Windows alternatives
for make check/test commands in CONTRIBUTING.md.

Fixes #3835
Fixes #3839
2026-03-16 15:52:42 +08:00
Sundaram Kumar Jha f48a7380f5 Add command sanitizer module and enhance command validation (#6217)
* feat(tools): add command sanitizer module with blocklists for shell injection prevention

* fix(tools): validate commands in execute_command_tool before execution

* fix(tools): validate commands in coder_tools_server run_command before execution

* test(tools): add 109 tests for command sanitizer covering safe, blocked, and edge cases

* fix(tools): normalize executable sanitizer matching

Replace str.strip()-style usage with explicit .exe suffix normalization in sanitizer paths to satisfy Ruff B005 while preserving blocking behavior for executable names.

Also apply the same normalization in coder_tools_server fallback sanitizer and clean a test-file formatting lint issue.

* fix(tools): harden command sanitizer handling

Normalize executable path matching, tighten python -c detection, and remove the duplicated coder_tools_server fallback by importing the shared sanitizer reliably.

Document the shell=True limitation in the command runners and add regression tests for absolute executable paths plus quoted python -c forms.
2026-03-16 14:46:53 +08:00
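A rough sketch of the executable-name normalization this commit describes: take the last path component, lowercase it, and drop an explicit `.exe` suffix before checking a blocklist. `BLOCKED_EXECUTABLES`, `normalize_executable`, and `is_blocked` are hypothetical names, the blocklist contents are invented for illustration, and POSIX-style quoting is assumed when splitting the command.

```python
import re
import shlex

BLOCKED_EXECUTABLES = {"rm", "shutdown"}  # hypothetical blocklist for illustration


def normalize_executable(token):
    """Reduce a command token to a bare, lowercase executable name."""
    name = re.split(r"[\\/]", token)[-1].lower()
    # Remove the suffix explicitly; a multi-character str.rstrip(".exe") would
    # strip any trailing '.', 'e', or 'x' characters, which Ruff B005 flags.
    if name.endswith(".exe"):
        name = name[: -len(".exe")]
    return name


def is_blocked(command):
    """Return True if the first token of command names a blocked executable."""
    tokens = shlex.split(command)
    if not tokens:
        return False
    return normalize_executable(tokens[0]) in BLOCKED_EXECUTABLES
```

A check like this inspects only the first token, so it inherits the `shell=True` limitation the commit documents: chained or substituted commands need separate handling.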
Gaurav Singh 3c7f129d86 fix(executor): enforce branch timeout and memory conflict strategy in parallel execution (#6504)
ParallelExecutionConfig.branch_timeout_seconds and memory_conflict_strategy
were declared but never read by any code. This caused branches to run
indefinitely and memory conflicts to go undetected.

Changes:
- Wrap parallel branch tasks with asyncio.wait_for() using configured timeout
- Switch asyncio.gather to return_exceptions=True so one timeout doesn't cancel siblings
- Handle asyncio.TimeoutError in result processing loop
- Implement last_wins/first_wins/error memory conflict strategies
- Track which branch wrote which key during fan-out for conflict detection
- Add 6 new tests covering timeout and conflict scenarios

Closes #5706
2026-03-16 14:31:09 +08:00
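The changes listed above can be sketched with the standard asyncio calls the commit names. `run_branches` and `merge_writes` are hypothetical helpers written for this illustration; only `asyncio.wait_for`, `asyncio.gather(..., return_exceptions=True)`, and the strategy names `last_wins`, `first_wins`, and `error` come from the commit message.

```python
import asyncio


async def run_branches(branches, timeout_seconds):
    """Run branch coroutines concurrently, each under its own timeout."""
    tasks = [
        asyncio.wait_for(coro, timeout=timeout_seconds)
        for coro in branches.values()
    ]
    # return_exceptions=True keeps one timeout from cancelling sibling branches.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    outcome = {}
    for name, result in zip(branches, results):
        if isinstance(result, asyncio.TimeoutError):
            outcome[name] = "timeout"
        elif isinstance(result, Exception):
            outcome[name] = f"error: {result}"
        else:
            outcome[name] = result
    return outcome


def merge_writes(writes, strategy="last_wins"):
    """Merge (branch, key, value) writes under a conflict strategy."""
    merged, owner = {}, {}
    for branch, key, value in writes:
        if key in merged and owner[key] != branch:
            if strategy == "error":
                raise ValueError(f"{key!r} written by {owner[key]} and {branch}")
            if strategy == "first_wins":
                continue  # keep the earlier branch's value
        merged[key] = value
        owner[key] = branch
    return merged
```

Tracking the owning branch per key is what makes conflict detection possible; without it, a later write simply overwrites an earlier one with no record that two branches disagreed.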
RichardTang-Aden 4533b27aa1 Merge pull request #6249 from aden-hive/fix/episodic-memory-access
fix: deduplicate queen memory tools into shared list
2026-03-15 20:26:29 -07:00
Richard Tang 3adf268c29 chore: ruff lint 2026-03-15 20:25:21 -07:00
Richard Tang ac8579900f Merge remote-tracking branch 'origin/main' into fix/episodic-memory-access 2026-03-15 20:23:13 -07:00
Richard Tang abbaaa68f3 Merge remote-tracking branch 'origin/main' 2026-03-15 20:19:32 -07:00
Richard Tang 11089093ef chore: remove deprecated step in quickstart 2026-03-15 20:05:23 -07:00
RichardTang-Aden 99b7cb07d5 Merge pull request #6300 from Nupreeth/docs/notion-tool-readme
docs(notion): add Notion tool README
2026-03-15 20:03:17 -07:00
RichardTang-Aden 70d61ae67a Merge pull request #6389 from saschabuehrle/micro-fix/issue-6015-step-numbering
micro-fix: remove vestigial duplicate Step 3 header in quickstart.sh
2026-03-15 20:01:36 -07:00
Richard Tang dd054815a3 docs: update product image 2026-03-15 19:56:17 -07:00
Timothy 8e5eaae9dd chore(micro-fix): windows string ops compatibility fix 2026-03-15 17:05:41 -07:00
Hundao 2d0128eb5c fix: declare croniter dependency and fail loudly on missing import (#6405)
croniter is used for cron-based timer entry points but was never
declared in pyproject.toml. A fresh install would silently skip
all cron triggers. Add croniter>=1.4.0 to dependencies and raise
RuntimeError instead of silently continuing on ImportError.

Fixes #5353
2026-03-15 18:29:05 +08:00
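The "fail loudly" half of this fix might look roughly like the sketch below. `require_dependency` is a hypothetical helper, and the module name is passed as a parameter so the example does not itself depend on croniter being installed.

```python
import importlib


def require_dependency(module_name, feature):
    """Import module_name or raise RuntimeError naming the feature that needs it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Surface the missing dependency instead of silently skipping the feature.
        raise RuntimeError(
            f"{feature} requires '{module_name}', which is not installed"
        ) from exc
```

Raising at the point of use turns a silently skipped cron trigger into an immediate, explainable startup error.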
Milton Adina 06f1d4dcef docs: add Windows quickstart.ps1 instructions to getting-started.md (#5668)
- Add Windows (PowerShell) section alongside Linux/macOS
- Reference .\quickstart.ps1 for native Windows users
- Add Set-ExecutionPolicy note for script execution
- Link to environment-setup.md for WSL alternatives
2026-03-15 18:05:39 +08:00
Gowtham Tadikamalla 0e7b11b5b2 fix(llm): warn when litellm monkey-patches fail to apply due to ImportError (#5757)
Closes #5753

_patch_litellm_anthropic_oauth and _patch_litellm_metadata_nonetype
silently return when litellm internal modules change. This adds
logger.warning() calls so operators are alerted when patches cannot be
applied, instead of encountering cryptic 401 or TypeError at runtime.

Co-authored-by: GowthamT-1610 <gowthamt@umd.edu>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 17:59:36 +08:00
kalp patel 291b78f934 fix: prune ~/.hive/failed_requests/ to prevent unbounded disk growth (#5725)
Add MAX_FAILED_REQUEST_DUMPS = 50 cap and _prune_failed_request_dumps()
helper. After each _dump_failed_request() call the oldest files beyond
the cap are deleted so the directory never grows without bound.

Fixes #5696
2026-03-15 17:33:46 +08:00
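A minimal sketch of the pruning approach described above, assuming "oldest" is decided by file modification time. `prune_oldest` is a hypothetical stand-in for `_prune_failed_request_dumps()`; only the constant name and its value of 50 come from the commit message.

```python
from pathlib import Path

MAX_FAILED_REQUEST_DUMPS = 50  # cap named in the commit message


def prune_oldest(directory, max_files=MAX_FAILED_REQUEST_DUMPS):
    """Delete the oldest files in directory so at most max_files remain."""
    files = sorted(
        (p for p in Path(directory).iterdir() if p.is_file()),
        key=lambda p: p.stat().st_mtime,
    )
    # Everything except the newest max_files entries is excess.
    excess = files[:-max_files] if max_files > 0 else files
    for path in excess:
        path.unlink(missing_ok=True)
    return len(excess)
```

Calling a helper like this after every dump keeps the directory bounded without needing a separate cleanup job.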
Vaibhav Kumar e196a03972 Fix LLMJudge OpenAI fallback to use LiteLLM provider (#5674) 2026-03-15 17:22:37 +08:00
Ishan Chaurasia a0abe2685d fix: preserve custom session ids in runtime logs (#6241)
* fix: preserve custom session ids in runtime logs

Treat any execution stored under sessions/<id> as a session-backed run so custom IDs stay visible in worker-session browsing and unified log APIs. Add regression coverage for custom IDs across executor path selection, log directory creation, and API listing.

Made-with: Cursor

* fix: ignore stray session directories in listing

Keep the session_ prefix as the fast path for worker session discovery, but allow custom IDs when a backing state.json exists. This avoids ghost directories in the UI while preserving the custom session ID support from the original fix.

Made-with: Cursor
2026-03-15 16:08:54 +08:00
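The discovery rule from the second bullet (accept the `session_` prefix as a fast path, and accept custom IDs only when a backing `state.json` exists) could be sketched like this. `list_session_ids` is a hypothetical name and the flat directory layout is an assumption.

```python
from pathlib import Path


def list_session_ids(sessions_dir):
    """Return the IDs of directories that look like real sessions."""
    ids = []
    for entry in sorted(Path(sessions_dir).iterdir()):
        if not entry.is_dir():
            continue
        # Fast path: the conventional prefix needs no further check.
        # Custom IDs count only when a state.json backs them.
        if entry.name.startswith("session_") or (entry / "state.json").exists():
            ids.append(entry.name)
    return ids
```

Requiring `state.json` for non-prefixed names is what filters out stray directories while still surfacing custom session IDs.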
SRI LIKHITA ADRU e8f642c8b6 fix(credentials): aden_api_key delete returns 404 when not found, san… (#6340)
* fix(credentials): aden_api_key delete returns 404 when not found, sanitize 500 errors

* style: restore warning log for unexpected delete errors

---------

Co-authored-by: hundao <alchemy_wimp@hotmail.com>
2026-03-15 15:56:32 +08:00
Abhilash Puli 6260f628eb feat(tools): add HuggingFace inference, embedding, and endpoint tools (#6132)
* feat(tools): add HuggingFace inference, embedding, and endpoint tools

* fix: resolve ruff E501 lint issues

* style: fix formatting and restore Hub API error message

* style: format test file

---------

Co-authored-by: hundao <alchemy_wimp@hotmail.com>
2026-03-15 15:44:18 +08:00
Sundaram Kumar Jha 4a4f17ed40 fix quickstart guide for windows (#6264)
* fix(windows): verify uv is runnable before launch

* fix(windows): use validated uv path for kimi health check

* fix(windows): dedupe uv discovery and keep quickstart scoped

* chore: refresh uv lockfile
2026-03-15 15:19:15 +08:00
Fernando Mano 36dcf2025b Feature: #5871 - Improve developer agent logging: simplify terminal output (#6388) 2026-03-15 15:13:22 +08:00
Aryan Nandanwar 85c70c94e6 fix: queen bee multiple response error resolved (#5962)
* fix: queen bee multiple response error resolved

* fix: queen bee multiple response error resolved updates

* fix: added chatmsg.phas and reconsileoptimizeuser

* fix: cleaned up blank lines

* style: fix formatting in workspace.tsx

---------

Co-authored-by: hundao <alchemy_wimp@hotmail.com>
2026-03-15 15:07:24 +08:00
saschabuehrle 336e82ba22 micro-fix: remove vestigial duplicate Step 3 header in quickstart.sh (fixes #6015) 2026-03-14 18:07:59 +01:00
Waryjustice f2ddd1051d fix: make state.json progress writes atomic
Use atomic_write for GraphExecutor._write_progress and log persistence failures instead of silently swallowing exceptions. Add regression tests for atomic write usage and warning logs on write failure.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-14 18:52:25 +05:30
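One common way to make a JSON write atomic is to write to a temporary file in the same directory and then rename it over the target. The sketch below illustrates that pattern together with the "log instead of swallow" behavior; `atomic_write_json` is a hypothetical helper and is not the project's `atomic_write`.

```python
import json
import logging
import os
import tempfile

logger = logging.getLogger(__name__)


def atomic_write_json(path, data):
    """Write data to path via a temp file and os.replace; return True on success."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = None
    try:
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in one step on the same filesystem.
        os.replace(tmp_path, path)
        return True
    except OSError as exc:
        # Log the failure rather than silently swallowing it.
        logger.warning("failed to persist %s: %s", path, exc)
        if tmp_path and os.path.exists(tmp_path):
            os.unlink(tmp_path)
        return False
```

Readers of the target file then see either the old contents or the new contents, never a half-written file.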
Aaryann Chandola 2dd60c8d52 Merge branch 'aden-hive:main' into feat/notion-tool-docs-and-improvements 2026-03-14 10:58:01 +05:30
Richard Tang ff01c1fd99 chore: release v0.7.1 — Chrome-native GCU, browser isolation, dummy agent tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-13 20:39:46 -07:00
RichardTang-Aden 421b25fdb7 Merge pull request #6313 from prasoonmhwr/bugFix/add_tab_ui
bugFix: micro-fix add tab UI
2026-03-13 20:29:30 -07:00
Richard Tang 795c3c33e2 docs: readme update 2026-03-13 20:26:44 -07:00
RichardTang-Aden 97821f4d80 Merge pull request #6346 from aden-hive/fix/session-resume-new-agent
fix: save json path for the new agent update meta.json when loaded worker
2026-03-13 20:19:48 -07:00
RichardTang-Aden 505e1e30fd Merge branch 'main' into fix/session-resume-new-agent 2026-03-13 20:19:36 -07:00
Timothy 3fb2b285fb chore: add star history widget 2026-03-13 20:17:35 -07:00
RichardTang-Aden a76109840c Merge pull request #6345 from aden-hive/feat/gcu-updates
feat: GCU browser cleanup, draft loading state, and inner_turn message fix
2026-03-13 20:16:38 -07:00
Timothy 1db8484402 Merge branch 'main' into feature/agent-skills 2026-03-13 20:05:47 -07:00
RichardTang-Aden 39212350ba Merge pull request #6342 from aden-hive/ci/level-2-dummy-agent-testing
Add Level 2 dummy agent end-to-end tests
2026-03-13 19:42:34 -07:00
Richard Tang f3399fe95b chore: ruff lint 2026-03-13 19:39:44 -07:00
Richard Tang d02e1155ed feat: dummy agent tests 2026-03-13 19:39:14 -07:00
bryan 7ede3ba171 feat: queen upsert fix 2026-03-13 19:34:26 -07:00
Timothy cdaec8a837 feat: agent skills 2026-03-13 18:56:34 -07:00
Richard Tang 2272491cf5 chore: remove dead code 2026-03-13 18:10:43 -07:00
RichardTang-Aden bb38cb974f Merge pull request #6333 from aden-hive/fix/new-agent-resume
Fix: new agent resume and GCU browser improvements
2026-03-13 17:20:49 -07:00
bryan 635d2976f4 feat: show loading spinner in draft panel during planning phase 2026-03-13 16:40:33 -07:00
bryan 4e1525880d feat: clean up browser profile after top-level GCU node execution 2026-03-13 16:40:20 -07:00
Richard Tang b80559df68 chore: ruff lint 2026-03-13 16:38:50 -07:00
RichardTang-Aden 08d93ef90a Merge pull request #6331 from RichardTang-Aden/main
fix: generate worker mcp.json correctly in initialize_agent_package
2026-03-13 15:35:18 -07:00
Richard Tang 22bf035522 chore: fix lint 2026-03-13 15:35:01 -07:00
Richard Tang 15944a42ab fix: generate worker mcp file correctly 2026-03-13 15:30:28 -07:00
Richard Tang 8440ec70ba chore: document the difference between runner mode run() and start() 2026-03-13 15:28:18 -07:00
Timothy eacf2520cf chore: skills prd 2026-03-13 15:22:09 -07:00
Richard Tang def4f62a51 fix: update meta.json when loaded worker 2026-03-13 14:05:57 -07:00
bryan b0c5bcd210 chore: update tab management guidelines and add concurrent subagent patterns 2026-03-13 14:04:40 -07:00
bryan 2fe1343343 feat: inject unique browser profile per GCU subagent 2026-03-13 14:03:21 -07:00
bryan de0dcff50f feat: add tab origin/age metadata and per-subagent profile isolation 2026-03-13 14:02:15 -07:00
Richard Tang 20427e213a fix: update meta.json when loaded worker 2026-03-13 13:52:15 -07:00
bryan 1fb5c6337a fix: anchor worker monitoring to queen's session ID on cold-restore 2026-03-13 12:50:50 -07:00
Timothy @aden 1e74f194a1 Update authors in MCP Server Registry document 2026-03-13 12:15:50 -07:00
Timothy 08157d2bd6 chore(docs): bounty program - standard 2026-03-13 12:10:21 -07:00
Timothy ef036257a9 docs(mcp): MCP integration PRD 2026-03-13 11:56:33 -07:00
Timothy 16ce984c74 chore: add default context limit on windows quickstart 2026-03-13 10:04:49 -07:00
bryan 1e8b5b96eb Merge branch 'main' into feat/gcu-updates 2026-03-13 09:26:06 -07:00
Prasoon Mahawar 094ba89f19 Merge branch 'main' of https://github.com/prasoonmhwr/hive into bugFix/add_tab_ui 2026-03-13 18:59:44 +05:30
Prasoon Mahawar 7008c9f310 bugFix: UI overflow issue when creating multiple agents – “Add tab” dropdown partially hidden 2026-03-13 18:58:38 +05:30
Prasoon Mahawar 94d7cbacc2 Revert "bugFix: Clipboard write in SystemPromptTab lacks error handling and may show false Copied feedback"
This reverts commit bddc2b413a.
2026-03-13 18:55:52 +05:30
Prasoon Mahawar bddc2b413a bugFix: Clipboard write in SystemPromptTab lacks error handling and may show false Copied feedback 2026-03-13 18:23:36 +05:30
Nupreeth 48c8fb7fff docs(notion): add Notion tool README 2026-03-13 12:03:48 +05:30
RichardTang-Aden 52b1a3f472 Merge pull request #6282 from aden-hive/feat/refactor-session
Refactor session lifecycle with flowchart planning and triggers
2026-03-12 21:15:10 -07:00
Richard Tang 079e00c8f7 Merge remote-tracking branch 'origin/main' into feat/refactor-session 2026-03-12 21:13:15 -07:00
Richard Tang 60bba38941 chore: ruff lint 2026-03-12 21:01:47 -07:00
Richard Tang ea8e7b11c6 Merge remote-tracking branch 'origin/feature/flowchart-linked-experimental' into feat/refactor-session 2026-03-12 20:54:08 -07:00
Richard Tang 3dc2b25b01 fix: adding the trigger helpers 2026-03-12 20:53:45 -07:00
bryan 543b90b34f chore: tooltip update 2026-03-12 20:50:39 -07:00
Richard Tang 2ad78ec8a2 Merge remote-tracking branch 'origin/feature/flowchart-linked-experimental' into feat/refactor-session 2026-03-12 20:48:09 -07:00
Timothy 412658e9f2 fix: remove subagent shapes 2026-03-12 20:46:09 -07:00
Richard Tang 9bfddec322 fix: missing _FLOWCHART_TYPES reference 2026-03-12 20:43:03 -07:00
Timothy bbd9c10169 fix: decision node cannot have subagents 2026-03-12 20:36:04 -07:00
Richard Tang 51fdc4ddde fix: always new session for new agent 2026-03-12 20:34:42 -07:00
Richard Tang 04685d33ca fix: solve the problem from merge conflict 2026-03-12 20:28:25 -07:00
Richard Tang 729a0e0cec fix: resolve merge conflict 2026-03-12 20:23:58 -07:00
bryan 2bcb0cacee added pause/run button 2026-03-12 20:15:25 -07:00
Timothy 44bf191f53 fix: no orphaned node by bfs 2026-03-12 20:04:00 -07:00
Richard Tang 993b31f19b Merge remote-tracking branch 'origin/feature/flowchart-linked-experimental' into feat/refactor-session 2026-03-12 20:00:45 -07:00
Richard Tang 41b3b9619f Merge remote-tracking branch 'origin/feature/flowchart-linked-experimental' into feature/flowchart-linked-experimental 2026-03-12 19:45:45 -07:00
Richard Tang 2a4fe4020c feat: force the planning agent to ask questions 2026-03-12 19:45:07 -07:00
Ishan Chaurasia 9d1f268078 fix(server): honor session_id in one-step session creation (#6233)
Align POST /api/sessions behavior across queen-only and one-step worker creation so callers can rely on deterministic session IDs. Add a regression test covering the forwarded session_id contract.

Made-with: Cursor
2026-03-13 10:43:12 +08:00
bryan 2185e127b1 style: coder tools formatting and template quote fixes 2026-03-12 19:39:53 -07:00
bryan 99ed885fd0 fix: add cached_tokens to finish event test assertion 2026-03-12 19:39:53 -07:00
bryan d8a390a685 feat: flowchart rendering in DraftGraph with node shapes and layout 2026-03-12 19:39:53 -07:00
bryan f50cf1735b feat: CSS variable theming for agent graph components 2026-03-12 19:39:53 -07:00
bryan 04eb57f54e feat: auto-load worker on cold restore when queen resumes 2026-03-12 19:39:53 -07:00
bryan 7378408eb8 feat: add flowchart type system and draft-to-graph dissolution 2026-03-12 19:39:53 -07:00
bryan cf05420417 style: formatting and import cleanup across framework modules 2026-03-12 19:38:55 -07:00
Timothy f5ed4c7d43 fix: validate orphaned gcu node 2026-03-12 19:38:44 -07:00
Timothy 5547432b6e fix: queen defaults to global max context tokens 2026-03-12 19:29:14 -07:00
Ishan Chaurasia 336557d7c7 fix: pass browser_wait text as data (#6235)
Pass browser_wait text through Playwright's function argument channel so quoted and multiline strings do not break the generated wait expression. Add a regression test covering text that previously would have been interpolated unsafely.

Made-with: Cursor
2026-03-13 10:08:16 +08:00
Timothy 87c172227c fix: mandate flowchart topology correction 2026-03-12 19:03:46 -07:00
Richard Tang c2c4929de8 feat: remove the phase in the label 2026-03-12 18:55:24 -07:00
Timothy a978338738 fix: allow replanning 2026-03-12 18:54:01 -07:00
Timothy 8eb59b1f66 fix: mandate usage of ask tools and change pending behavior 2026-03-12 18:34:15 -07:00
Richard Tang f9d5f95936 Merge remote-tracking branch 'origin/feature/flowchart-linked-experimental' into feat/refactor-session 2026-03-12 18:32:26 -07:00
Timothy 651e99ffe3 Merge branch 'feature/multiple-asks' into feature/flowchart-linked-experimental 2026-03-12 17:57:11 -07:00
Timothy 2564f1b948 feat: allow multiple questions 2026-03-12 17:56:58 -07:00
Richard Tang c01cd528d2 feat: planning phase prompt improvements 2026-03-12 17:44:06 -07:00
bryan 2434c86cdf docs: clarify two-step escalation relay protocol in queen prompt 2026-03-12 16:50:17 -07:00
Timothy bc194ee4e9 Merge branch 'main' into feature/flowchart-linked-experimental 2026-03-12 16:50:17 -07:00
bryan c4a5e621aa docs: update GCU prompt with popup tracking and close_all guidance 2026-03-12 16:50:06 -07:00
bryan 0f5b83d86a feat: add browser_close_all tool for bulk tab cleanup 2026-03-12 16:49:55 -07:00
bryan b5aadcd51e feat: auto-track popup pages and improve session startup logging 2026-03-12 16:49:46 -07:00
bryan 290d2f6823 feat: add --no-startup-window to Chrome launch flags 2026-03-12 16:49:36 -07:00
Timothy @aden 2bac100c03 Merge pull request #6283 from vincentjiang777/main
docs: rename and expand contributing guidelines
2026-03-12 16:46:59 -07:00
Timothy @aden 425d37f868 Merge branch 'main' into main 2026-03-12 16:44:29 -07:00
Vincent Jiang 99b127e2da docs: revert filename to CONTRIBUTING.md for GitHub compliance
Changed HOW_TO_CONTRIBUTE.md back to CONTRIBUTING.md to comply with
GitHub's standard for contributing guidelines files.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 16:42:42 -07:00
Timothy 43b759bf61 fix: ensure flowchart existence 2026-03-12 16:40:18 -07:00
Vincent Jiang 20d8d52f12 docs: rename and expand contributing guidelines
Renamed CONTRIBUTING.md to HOW_TO_CONTRIBUTE.md and significantly expanded
the documentation with detailed sections on development setup, OS support,
tooling requirements, performance metrics, and contribution workflows.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 16:29:13 -07:00
Richard Tang 944567dc31 chore: ruff lint 2026-03-12 16:23:13 -07:00
nightcityblade 7e09588e4e fix: reject path-like agent names in hive dispatch --agents (#6211)
Validate that agent names passed to --agents do not contain path
separators. Previously, passing 'exports/my_agent' would result in
the doubled path 'exports/exports/my_agent' with a confusing error.
Now a clear error message is shown suggesting the correct usage.

Fixes #6208

Co-authored-by: nightcityblade <nightcityblade@gmail.com>
2026-03-12 16:22:37 -07:00
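The validation described in this commit can be approximated as below. `validate_agent_name` is a hypothetical function, and the error wording is invented for illustration; the commit only states that path separators are rejected with a clear message.

```python
def validate_agent_name(name):
    """Return name unchanged, or raise ValueError if it looks like a path."""
    if "/" in name or "\\" in name:
        # Suggest the last path component as the probable intended name.
        bare = name.replace("\\", "/").rstrip("/").split("/")[-1]
        raise ValueError(
            f"agent name {name!r} looks like a path; try {bare!r} instead"
        )
    return name
```

Rejecting the input up front avoids building a doubled path such as `exports/exports/my_agent` and failing later with a less helpful error.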
Priyanka Bhallamudi 7bf69d2263 fix: read nodes from graph object in discovery.py for correct node count (#6227)
Co-authored-by: Lakshmi Priyanka Bhallamudi <priyanka@Lakshmis-MacBook-Air.local>
2026-03-12 16:22:37 -07:00
bryan 99d2b0c003 chore: update readme 2026-03-12 16:22:37 -07:00
bryan 8868416baa chore: update the tests and readme 2026-03-12 16:22:37 -07:00
bryan 405b120674 feat: fixed google credentials to use the google oauth credential 2026-03-12 16:22:37 -07:00
Trisha 66a7b43199 [bug:6117:docs]: fix inconsistent configuration and troubleshooting guidance (#6118) 2026-03-12 16:22:36 -07:00
Trisha a8f9d83723 docs: fix typos and awkward copy (#6115)
* [bug:6109:README]: fix typos and awkward copy

* trigger ci

* rerun checks
2026-03-12 16:22:36 -07:00
bryan d95d5804ca fix: align the credential functions to be the same 2026-03-12 16:22:36 -07:00
Richard Tang 674cf05601 feat: track the number of runs 2026-03-12 15:19:13 -07:00
Timothy 86349c78d0 Merge branch 'feature/guardrails' into feature/flowchart-linked-experimental 2026-03-12 15:11:12 -07:00
Timothy 2232f49191 fix: queen flowcharting behavior 2026-03-12 15:10:32 -07:00
Richard Tang 6fa71fa27d feat: track queen phase by message 2026-03-12 14:58:35 -07:00
Vincent Jiang 1ac9ba69d6 docs: replace recipe examples with 100 sample agent prompts
Replace individual recipe READMEs with a comprehensive collection of 100 real-world agent prompt examples across marketing, sales, operations, engineering, and finance. This provides users with a broader range of use case inspiration in a single, organized reference document.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 14:46:09 -07:00
Vincent Jiang 9e16be8f03 docs: replace recipe examples with 100 sample agent prompts
Replace individual recipe READMEs with a comprehensive collection of 100 real-world agent prompt examples across marketing, sales, operations, engineering, and finance. This provides users with a broader range of use case inspiration in a single, organized reference document.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-12 14:44:32 -07:00
Richard Tang 8c7065ad37 refactor: remove the parts conversion logic 2026-03-12 14:36:27 -07:00
Richard Tang a18ed5bbe6 feat: restore queen phase 2026-03-12 14:29:01 -07:00
bryan 9f3339650d chore: linter update 2026-03-12 14:27:17 -07:00
bryan d5e5d3e83d feat: add subagent activity tracking to queen status and instructions 2026-03-12 14:26:49 -07:00
bryan 5ea27dda09 refactor: update GCU system prompt for auto-snapshots and batching 2026-03-12 14:26:38 -07:00
bryan 6f9066ef20 feat: return auto-snapshot from browser interaction tools 2026-03-12 14:26:24 -07:00
bryan c37185732a feat: kill orphaned Chrome processes on GCU server shutdown 2026-03-12 14:26:05 -07:00
bryan 0c900fb50e refactor: clean session startup and add page lifecycle management 2026-03-12 14:25:16 -07:00
bryan 4d3ac28878 feat: launch Chrome on macOS via open -n to coexist with user's browser 2026-03-12 14:24:55 -07:00
bryan 270c1f8c50 fix: use lazy %-formatting in subagent completion log to avoid f-string in logger 2026-03-12 14:24:30 -07:00
bryan 3d0859d06a fix: stop clearing credentials_required on modal close to prevent infinite loop 2026-03-12 14:24:14 -07:00
Timothy 8f55170c1e fix: compaction ratio reporting 2026-03-12 14:17:42 -07:00
Richard Tang ed3d4bfe33 feat: resume cold session from event logs 2026-03-12 14:07:57 -07:00
Timothy 31a98a5f95 feat: cached token handling 2026-03-12 14:03:58 -07:00
Timothy 7667b773f2 fix: 18x tool discovery efficiency by progressive disclosure 2026-03-12 13:12:43 -07:00
Timothy 49560260de fix: token counts 2026-03-12 11:52:08 -07:00
Richard Tang 596ce9878d feat: unique run id 2026-03-12 11:09:36 -07:00
Timothy 1cc75f89bd feat: replanning 2026-03-12 09:55:42 -07:00
bryan ffe47c0f71 fix: credential modal eating errors, banner stays open 2026-03-12 09:41:53 -07:00
Timothy bb3c69cff1 fix: proper guardrail on combined context window 2026-03-12 09:37:17 -07:00
Timothy 70d11f537e feat: merge subagent nodes 2026-03-12 09:06:41 -07:00
Timothy b15dd2f623 fix: better logging 2026-03-12 09:03:29 -07:00
Timothy ce308312ae fix: usage tracking 2026-03-12 08:56:33 -07:00
bryan bf4652db4b fix: share event bus so tool events are visible to parent 2026-03-12 08:41:34 -07:00
bryan 2acd526b71 feat: dynamic viewport sizing and suppress Chrome warning bar 2026-03-12 08:40:49 -07:00
bryan df71834e4b refactor: switch from Playwright browser to system Chrome via CDP 2026-03-12 08:39:43 -07:00
nightcityblade f757c724cc fix: reject path-like agent names in hive dispatch --agents (#6211)
Validate that agent names passed to --agents do not contain path
separators. Previously, passing 'exports/my_agent' would result in
the doubled path 'exports/exports/my_agent' with a confusing error.
Now a clear error message is shown suggesting the correct usage.

Fixes #6208

Co-authored-by: nightcityblade <nightcityblade@gmail.com>
2026-03-12 21:11:02 +08:00
Priyanka Bhallamudi a4c758403e fix: read nodes from graph object in discovery.py for correct node count (#6227)
Co-authored-by: Lakshmi Priyanka Bhallamudi <priyanka@Lakshmis-MacBook-Air.local>
2026-03-12 18:34:47 +08:00
Timothy bc3c5a5899 fix: allow memory tool to be used in all phases 2026-03-11 20:10:24 -07:00
Timothy a67563850b feat: flowchart reconciliation 2026-03-11 19:58:27 -07:00
Bryan @ Aden b48465b778 Merge pull request #6230 from aden-hive/feat/google-doc-credential-alignment
micro-fix: Feat/google doc credential alignment
2026-03-12 02:52:03 +00:00
bryan d3baaaab24 chore: update readme 2026-03-11 19:48:00 -07:00
Timothy c764b4dc3b Merge branch 'main' into feature/flowchart-linked-experimental 2026-03-11 19:12:51 -07:00
bryan ad6077bd7b chore: update the tests and readme 2026-03-11 19:12:38 -07:00
Timothy ce2a91b1c0 feat: flowchart mapping 2026-03-11 19:12:25 -07:00
bryan c2e7afeb5e feat: fixed google credentials to use the google oauth credential 2026-03-11 19:12:25 -07:00
Timothy 0c9680ca89 feat: dissolution graph structure 2026-03-11 18:38:17 -07:00
Richard Tang 726016d24a fix: remove the duplicated session logic 2026-03-11 17:11:03 -07:00
Richard Tang 4895cea08a chore: lint and micro-fix 2026-03-11 16:55:29 -07:00
Richard Tang c9723a3ff2 feat(wip): always resume the previous session 2026-03-11 16:48:31 -07:00
Richard Tang 6cb73a6fea refactor: remove the remaining old trigger format and change the trigger format in examples to the latest format 2026-03-11 16:13:37 -07:00
Richard Tang 0c7f43f595 refactor: remove reference of the unused session judge 2026-03-11 16:01:00 -07:00
Richard Tang ea5cfcc5d6 refactor: remove the unused session judge 2026-03-11 15:57:19 -07:00
Richard Tang 34e85019c3 feat: stop supporting the old scheduler 2026-03-11 15:54:48 -07:00
Timothy 8011b72673 fix: flowchart display 2026-03-11 15:41:55 -07:00
RichardTang-Aden d87dfca1ab Merge pull request #6075 from aden-hive/fix/credential-function-alignment
fix: align the credential functions to be the same
2026-03-11 15:11:57 -07:00
Richard Tang c979dba958 fix: reference error from the rename 2026-03-11 14:33:42 -07:00
Richard Tang b4caa045e1 Merge remote-tracking branch 'origin/main' into feat/agent-trigger 2026-03-11 14:32:36 -07:00
Timothy b0fd4bc356 fix: draft flowchart display 2026-03-11 11:05:33 -07:00
Trisha a79d7de482 [bug:6117:docs]: fix inconsistent configuration and troubleshooting guidance (#6118) 2026-03-11 14:41:54 +08:00
Trisha e5e57302fa docs: fix typos and awkward copy (#6115)
* [bug:6109:README]: fix typos and awkward copy

* trigger ci

* rerun checks
2026-03-11 14:38:37 +08:00
Emmanuel Nwanguma c69cf1aea5 test(security): add comprehensive unit tests for 7 security scanning tools (#6151)
* test(security): add comprehensive unit tests for 7 security scanning tools

Add dedicated test files for all security scanning tools:
- test_dns_security_scanner.py (12 tests)
- test_http_headers_scanner.py (13 tests)
- test_ssl_tls_scanner.py (14 tests)
- test_subdomain_enumerator.py (15 tests)
- test_port_scanner.py (17 tests)
- test_tech_stack_detector.py (20 tests)
- test_risk_scorer.py (24 tests)

Total: 115 new tests covering:
- Input validation and cleaning
- Connection error handling
- Core scanning logic with mocked responses
- Grade/risk calculation
- Edge cases

Fixes #5920

* fix(tests): strengthen weak assertions in security scanner tests

- SSL scanner: replace always-true `or` assertions with specific checks
  that verify hostname stripping actually happened
- Port scanner: verify timeout clamp value, not just absence of error
- DNS scanner: remove unused helper method

---------

Co-authored-by: hundao <alchemy_wimp@hotmail.com>
2026-03-11 13:29:11 +08:00
Emmanuel Nwanguma 2f4cd8c36f fix(credentials): improve exception handling in key_storage.py (#6153)
Replace bare except Exception: clauses with specific exception handling:

- delete_aden_api_key(): Catch FileNotFoundError, PermissionError at debug
  level; log unexpected errors at WARNING with exc_info=True
- _read_credential_key_file(): Catch FileNotFoundError, PermissionError at
  debug level; log unexpected errors at WARNING with exc_info=True
- _read_aden_from_encrypted_store(): Catch FileNotFoundError, PermissionError,
  KeyError at debug level; log unexpected errors at WARNING with exc_info=True

This makes credential issues easier to diagnose by:
- Logging unexpected errors at WARNING level (visible in production)
- Including full stack traces with exc_info=True
- Keeping expected failures (file not found, permissions) at debug level

Fixes #5931
2026-03-11 13:05:10 +08:00
Aaryann Chandola 6f571e6d00 [BUG] fix: use ReplaceFileW for atomic writes on Windows to preserve ACLs (#5849)
* [BUG] fix: use ReplaceFileW for atomic writes on Windows to preserve ACLs

* fix: ensure atomic_replace checks for Windows API availability
2026-03-11 12:59:14 +08:00
Emmanuel Nwanguma 31bc84106f test: add API integration tests for hubspot, intercom, google_docs tools (#6167)
Resolves #5921

- test_hubspot_tool.py: 51 tests covering 15 MCP tools
- test_intercom_tool.py: 50 tests covering 11 MCP tools
- test_google_docs_tool.py: 57 tests covering 11 MCP tools
2026-03-11 12:55:03 +08:00
Timothy bdd6194203 feature: hive flowchart at planning phase 2026-03-10 19:54:02 -07:00
RichardTang-Aden fd79dceb0f Merge pull request #6166 from aden-hive/fix/subagent-reply-stall
micro-fix: update escalation tests for new ESCALATION_REQUESTED flow
2026-03-10 19:47:00 -07:00
Richard Tang ad50139d67 chore: lint 2026-03-10 19:46:35 -07:00
Richard Tang 12fb40c110 test: update escalation tests for ESCALATION_REQUESTED flow
Tests were asserting the old CLIENT_OUTPUT_DELTA + CLIENT_INPUT_REQUESTED
pattern; the fix in 89ccd66f routes escalations through the queen via
ESCALATION_REQUESTED instead.
2026-03-10 19:45:21 -07:00
RichardTang-Aden 738e469d96 Merge pull request #6165 from aden-hive/feature/provider-moonshotai-kimi
feat: support MoonShot AI Kimi subscription
2026-03-10 19:39:25 -07:00
Timothy 80ccbcc827 chore: lint 2026-03-10 19:37:18 -07:00
RichardTang-Aden 08fac31a9d Merge pull request #6159 from aden-hive/fix/subagent-reply-stall
fix: route subagent report_to_parent escalations to queen instead of user
2026-03-10 18:24:33 -07:00
Richard Tang 89ccd66fb9 fix: subagent _EscalationReceiver 2026-03-10 18:21:50 -07:00
Timothy 7c47e367de feat: support moonshotai kimi subscription 2026-03-10 18:03:44 -07:00
Timothy b8741bf94c fix: queen agent system prompt hooks 2026-03-10 16:25:07 -07:00
Aaryann Chandola e82133741c Merge branch 'aden-hive:main' into feat/notion-tool-docs-and-improvements 2026-03-11 04:23:20 +05:30
RichardTang-Aden c90dcbb32f Merge pull request #6152 from aden-hive/refactor/remove-dead-code
refactor: remove deprecated codes
2026-03-10 15:31:34 -07:00
Richard Tang ac3a5f5e93 chore: remove the ai generated temp doc 2026-03-10 15:29:21 -07:00
Timothy 1ccfdbbf7d chore: minimax key check 2026-03-10 15:24:09 -07:00
Timothy 1de37d2747 chore: lint 2026-03-10 15:00:14 -07:00
Timothy 2aefdf5b5f refactor: remove deprecated codes 2026-03-10 14:57:54 -07:00
Antiarin 5076278dcb feat(notion): register Notion tool in verified and unverified registration functions
- Added the Notion tool registration to the _register_verified function.
- Removed the Notion tool registration from the _register_unverified function to ensure proper handling.
2026-03-11 02:45:51 +05:30
Antiarin 2398e04e11 docs(notion): add README for Notion tool with setup instructions and usage examples
- Introduced a comprehensive README.md for the Notion tool.
- Included setup instructions for the Notion API token and credential store configuration.
- Documented available tools and their functionalities.
- Provided usage examples for searching, creating, updating, and managing pages and databases.
2026-03-11 02:45:41 +05:30
Antiarin d00f321627 test(notion): add comprehensive tests for error handling and credential store in Notion tool
- Implemented tests for HTTP error codes, timeouts, and generic exceptions in _request.
- Added tests to verify the use of credential store when provided.
- Enhanced tests for notion_search to include filter types and page size clamping.
- Updated test assertions for successful responses from notion_get_page.
2026-03-11 02:45:30 +05:30
Antiarin e76b6cb575 feat(notion): enhance Notion tool functionality with new block types and improved page creation
- Added BlockType enum for various Notion block types.
- Updated notion_create_page to allow specifying parent_page_id and title_property.
- Enhanced notion_query_database to support sorting and pagination.
- Introduced notion_create_database for creating databases under a parent page.
- Improved error handling for required parameters in page and database creation.
2026-03-11 02:45:12 +05:30
Hundao 4caaa79900 Merge pull request #5988 from roberthallers/docs/fix-tui-deprecation-5941
docs: fix TUI deprecation inconsistency in roadmap
2026-03-10 16:46:41 +08:00
Hundao 296089d4cd Merge pull request #6108 from Hundao/fix/subagent-judge-feedback
fix: SubagentJudge and implicit judge return feedback=None on ACCEPT
2026-03-10 15:39:29 +08:00
hundao cae5f971cf fix: update test assertions for newly added tools
Tool counts and expected lists were outdated after new tools were added
to stripe, linear, apollo, discord, and google_analytics.
2026-03-10 15:36:12 +08:00
hundao bac716eea3 fix: pass feedback="" on evaluated ACCEPT verdicts in SubagentJudge and implicit judge
Fixes #6107
2026-03-10 15:24:39 +08:00
Navya Bijoy 14daf672e8 Fix: SessionManager._cleanup_stale_active_sessions indiscriminately cancels healthy concurrent agent sessions (#6081)
* fixes a bug in the SessionManager

* chore: remove debug print from test

---------

Co-authored-by: hundao <alchemy_wimp@hotmail.com>
2026-03-10 15:18:11 +08:00
Emmanuel Nwanguma e352ae5145 fix(mcp): close errlog file handle to prevent resource leak (#6094)
Track the errlog file handle opened on non-Windows systems and
properly close it during cleanup to prevent file descriptor leaks.

Changes:
- Add _errlog_handle instance variable to track the file handle
- Store handle reference when opening os.devnull
- Close handle in _cleanup_stdio_async() after other cleanup
- Clear reference in disconnect() for safety

Fixes #6002
2026-03-10 15:06:51 +08:00
Pushkal a58ffc2669 fix(server): use session.phase_state instead of session.mode_state in handle_pause (#6069)
The handle_pause endpoint referenced session.mode_state (lines 360-361),
which does not exist on the Session dataclass. This caused an
AttributeError every time the pause endpoint reached the phase transition
step, preventing the queen phase from transitioning to staging and
returning a 500 error to the frontend.

Changed to session.phase_state, consistent with handle_stop (line 412),
handle_run (line 75), and the Session dataclass definition
(session_manager.py line 44).
2026-03-10 15:03:19 +08:00
RichardTang-Aden 3fefea52be Merge pull request #6102 from aden-hive/micro-fix/report-to-parent-empty-check
micro-fix: track reported_to_parent to prevent false empty-turn detection
2026-03-09 21:12:23 -07:00
Richard Tang 06fd045b3e micro-fix: track reported_to_parent to prevent false empty-turn detection
Turns that call report_to_parent were incorrectly treated as "truly
empty" because the flag was not propagated. Thread it through
_run_single_turn and include it in the empty-turn guard.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 21:10:47 -07:00
RichardTang-Aden 2e43d2af46 Merge pull request #6100 from aden-hive/feature/integration-extended
micro-fix: wrong reference for hive_coder
2026-03-09 19:52:35 -07:00
Richard Tang 2c9790c65d Merge remote-tracking branch 'origin' into feature/integration-extended 2026-03-09 19:52:17 -07:00
Richard Tang 9700ac71bb micro-fix: wrong reference for hive_coder 2026-03-09 19:50:07 -07:00
RichardTang-Aden 61ed67b068 Merge pull request #6097 from aden-hive/feature/integration-extended
Expand integration tool coverage across 40 vendors
2026-03-09 19:47:34 -07:00
Richard Tang c3bea8685a Merge remote-tracking branch 'origin/main' into feature/integration-extended 2026-03-09 19:47:21 -07:00
RichardTang-Aden 98c57b795a Merge pull request #6050 from aden-hive/feat/queen-planning-phase
Add queen planning phase, global memory, and refactor hive_coder
2026-03-09 19:46:23 -07:00
Richard Tang 9be1d03b5c chore: ruff lint 2026-03-09 19:45:36 -07:00
Richard Tang 0d09510539 Merge remote-tracking branch 'origin/main' into feat/queen-planning-phase 2026-03-09 19:42:10 -07:00
Richard Tang 639c37ba17 feat: prompt to init the agent 2026-03-09 19:34:01 -07:00
Richard Tang 2258c23254 Merge branch 'feature/queen-global-memory' into feat/queen-planning-phase 2026-03-09 19:11:32 -07:00
Richard Tang 9714ea106d feat: improve initialize_and_build_agent clarity 2026-03-09 18:54:48 -07:00
Timothy f4ad500177 chore: lint 2026-03-09 18:53:01 -07:00
Timothy 9154a4d9f8 fix: resolve E501 line-too-long lint errors across 7 tool files 2026-03-09 18:51:01 -07:00
Timothy add6efe6f1 fix(micro-fix): increase stall threshold 2026-03-09 18:40:13 -07:00
Richard Tang 7ceb1efd02 fix: replace old tool name reference 2026-03-09 18:40:01 -07:00
Timothy a29ecf8435 chore(micro-fix): fix ci test blockage 2026-03-09 18:27:21 -07:00
Richard Tang d0ba5ef4f4 fix: update the wrong variable name 2026-03-09 18:12:29 -07:00
Richard Tang 860f637491 feat: add validation for module import 2026-03-09 17:53:50 -07:00
Richard Tang acb2cab317 feat: minor prompt change for switching to building mode 2026-03-09 17:41:23 -07:00
Richard Tang b453806918 feat: execution end message 2026-03-09 17:29:58 -07:00
Richard Tang 7ba8a0f51b feat: strengthen validation logic when loading 2026-03-09 17:08:20 -07:00
Richard Tang f6f398b6b1 feat: add GCU knowledge to planning 2026-03-09 17:02:13 -07:00
Timothy c4b22fa5c4 feat(postgres): update credential spec with new tool names 2026-03-09 16:47:27 -07:00
Timothy 0e64f977cd feat(postgres): add table stats, indexes, and foreign keys tools
Add pg_get_table_stats for row counts and size info,
pg_list_indexes for index details, and pg_get_foreign_keys
for relationship discovery with both outgoing and incoming FKs.
2026-03-09 16:47:09 -07:00
Timothy f24c9708fc feat(lusha): update credential spec with new tool names 2026-03-09 16:45:33 -07:00
Timothy bb4436e277 feat(lusha): add bulk enrich, technologies, and decision makers tools
Add lusha_bulk_enrich_persons for batch enrichment,
lusha_get_technologies for company tech stack lookup, and
lusha_search_decision_makers for senior contact discovery.
2026-03-09 16:45:17 -07:00
Timothy 795f66c90b feat(gsc): update credential spec with new tool names 2026-03-09 16:44:33 -07:00
Timothy 9ef6d51573 feat(gsc): add top queries, top pages, and delete sitemap tools
Add gsc_top_queries and gsc_top_pages convenience wrappers for
click-sorted analytics, and gsc_delete_sitemap for sitemap removal.
2026-03-09 16:44:20 -07:00
Timothy 3fed4e3409 feat(aws-s3): update credential specs with new tool names 2026-03-09 16:43:37 -07:00
Timothy 670e69f2ce feat(aws-s3): add copy, metadata, and presigned URL tools
Add s3_copy_object for copying within/between buckets,
s3_get_object_metadata for HEAD-based metadata retrieval, and
s3_generate_presigned_url for temporary access URL generation.
2026-03-09 16:42:46 -07:00
Timothy f6c4747905 feat(pushover): update credential spec with new tool names 2026-03-09 16:42:04 -07:00
Timothy 7b78f6c12f feat(pushover): add cancel receipt, glance update, and limits tools
Add pushover_cancel_receipt for stopping emergency retries,
pushover_send_glance for widget data updates, and
pushover_get_limits for checking message usage.
2026-03-09 16:41:52 -07:00
Timothy 1c75100f59 feat(news): update credential spec with new tool names 2026-03-09 16:41:15 -07:00
Timothy b325e103c6 feat(news): add latest, by-source, and by-topic search tools
Add news_latest for breaking news without query, news_by_source
for source-filtered articles, and news_by_topic for topic-based
discovery with automatic date ranges.
2026-03-09 16:40:54 -07:00
Timothy aef2d2d474 feat(serpapi): update credential spec with new tool names 2026-03-09 16:40:05 -07:00
Timothy 95a2b6711e feat(serpapi): add cited-by, profile search, and Google web search tools
Add scholar_cited_by for finding papers citing a given paper,
scholar_search_profiles for author profile discovery, and
serpapi_google_search for structured Google web results.
2026-03-09 16:38:50 -07:00
Timothy 7fb5e8145c feat(exa-search): update credential spec with new tool names 2026-03-09 16:37:56 -07:00
Timothy 8e45d0df83 feat(exa-search): add news, papers, and company search tools
Add exa_search_news, exa_search_papers, and exa_search_companies
convenience wrappers with pre-configured category filters and
automatic date/domain filtering.
2026-03-09 16:37:44 -07:00
Richard Tang 8d4657c13e Merge branch 'feat/queen-planning-phase' into feature/queen-global-memory 2026-03-09 16:10:42 -07:00
Timothy 3d175a6d54 feat(greenhouse): update credential spec with new tool names
Add greenhouse_list_offers, greenhouse_add_candidate_note, greenhouse_list_scorecards.
2026-03-09 16:02:53 -07:00
Timothy b9debaf957 feat(greenhouse): add list offers, candidate notes, and scorecards tools
- greenhouse_list_offers: GET /offers or /applications/{id}/offers
- greenhouse_add_candidate_note: POST /candidates/{id}/activity_feed/notes
- greenhouse_list_scorecards: GET /applications/{id}/scorecards
- Add _post helper for POST requests
2026-03-09 16:02:08 -07:00
Richard Tang bdcbcff6f3 feat: better instruction for planning mode switch 2026-03-09 16:01:34 -07:00
Timothy d2d7bdc374 feat(brevo): update credential spec with new tool names
Add brevo_list_contacts, brevo_delete_contact, brevo_list_email_campaigns.
2026-03-09 16:01:16 -07:00
Timothy 40e494b15d feat(brevo): add list contacts, delete contact, and list campaigns tools
- brevo_list_contacts: GET /contacts with pagination and modified_since filter
- brevo_delete_contact: DELETE /contacts/{email} to remove contacts
- brevo_list_email_campaigns: GET /emailCampaigns with status filter and stats
2026-03-09 16:00:42 -07:00
Timothy b5e840c0cb feat(quickbooks): update credential specs with new tool names
Add quickbooks_list_invoices, quickbooks_get_customer, quickbooks_create_payment
to both credential specs (token and realm_id).
2026-03-09 15:59:46 -07:00
Timothy f3d74c9ae4 feat(quickbooks): add list invoices, get customer, and create payment tools
- quickbooks_list_invoices: query invoices with status/customer filters
- quickbooks_get_customer: GET /customer/{id} with address and contact info
- quickbooks_create_payment: POST /payment with optional invoice linking
2026-03-09 15:59:23 -07:00
Richard Tang a22b321692 feat: improve phase switching tools 2026-03-09 15:33:03 -07:00
Timothy 2e7dbad118 feat(cloudinary): update credential specs with new tool names
Add cloudinary_get_usage, cloudinary_rename_resource, cloudinary_add_tag
to all three credential specs (cloud_name, key, secret).
2026-03-09 15:31:42 -07:00
Timothy 6183d1b65b feat(cloudinary): add usage, rename, and add tag tools
- cloudinary_get_usage: GET /usage for storage, bandwidth, transformation limits
- cloudinary_rename_resource: POST /rename to change public_id
- cloudinary_add_tag: POST /tags to add tags to resources
2026-03-09 15:31:22 -07:00
Timothy 09931e6d98 feat(twitter): update credential spec with new tool names
Add twitter_get_user_followers, twitter_get_tweet_replies, twitter_get_list_tweets.
2026-03-09 15:25:21 -07:00
Timothy cb394127d1 feat(twitter): add user followers, tweet replies, and list tweets tools
- twitter_get_user_followers: GET /users/{id}/followers with profile details
- twitter_get_tweet_replies: search recent replies via conversation_id
- twitter_get_list_tweets: GET /lists/{id}/tweets with author expansion
2026-03-09 15:21:47 -07:00
Timothy 588fa1f9ea feat(google-analytics): update credential spec with new tool names
Add ga_get_user_demographics, ga_get_conversion_events, ga_get_landing_pages.
2026-03-09 15:21:09 -07:00
Timothy 73325c280c feat(google-analytics): add demographics, conversion events, and landing pages tools
- ga_get_user_demographics: country/language/device breakdown
- ga_get_conversion_events: event counts, conversions, and revenue
- ga_get_landing_pages: top landing pages with bounce rate and session duration
2026-03-09 15:20:51 -07:00
Timothy 8c5ae8ffa8 feat(docker-hub): update credential spec with new tool names
Add docker_hub_get_tag_detail, docker_hub_delete_tag, docker_hub_list_webhooks.
2026-03-09 15:19:58 -07:00
Timothy 7389423c70 feat(docker-hub): add tag detail, delete tag, and list webhooks tools
- docker_hub_get_tag_detail: GET /repositories/{repo}/tags/{tag} with image architectures
- docker_hub_delete_tag: DELETE /repositories/{repo}/tags/{tag}
- docker_hub_list_webhooks: GET /repositories/{repo}/webhooks
- Add _delete helper for DELETE requests
2026-03-09 15:18:46 -07:00
Timothy 20c15446a7 feat(apollo): update credential spec with new tool names
Add apollo_get_person_activities, apollo_list_email_accounts,
apollo_bulk_enrich_people.
2026-03-09 15:17:38 -07:00
Richard Tang c05c30dd9a feat: add meta agent tools to planning 2026-03-09 15:14:34 -07:00
Timothy bcd2fb76bd feat(apollo): add person activities, email accounts, and bulk enrich tools
- apollo_get_person_activities: GET /activities for contact activity history
- apollo_list_email_accounts: GET /email_accounts for connected sending accounts
- apollo_bulk_enrich_people: POST /people/bulk_match for batch enrichment (up to 10)
2026-03-09 15:03:21 -07:00
Timothy 5fb97ab6df feat(calendly): update credential spec with new tool names
Add calendly_cancel_event, calendly_list_webhooks, calendly_get_event_type.
2026-03-09 15:00:46 -07:00
Timothy 0224ebc800 feat(calendly): add cancel event, list webhooks, and get event type tools
- calendly_cancel_event: POST /scheduled_events/{id}/cancellation
- calendly_list_webhooks: GET /webhook_subscriptions for org/user scope
- calendly_get_event_type: GET /event_types/{id} for meeting template details
- Add _post helper for POST requests
2026-03-09 15:00:34 -07:00
Timothy af88f7299a feat(pagerduty): update credential specs with new tool names
Add pagerduty_list_oncalls, pagerduty_add_incident_note,
pagerduty_list_escalation_policies to api_key spec.
Add pagerduty_add_incident_note to from_email spec (write operation).
2026-03-09 14:59:53 -07:00
Timothy 81729706ae feat(pagerduty): add oncalls, incident notes, and escalation policies tools
- pagerduty_list_oncalls: GET /oncalls with schedule/policy filters
- pagerduty_add_incident_note: POST /incidents/{id}/notes to add notes
- pagerduty_list_escalation_policies: GET /escalation_policies with search
2026-03-09 14:59:33 -07:00
Timothy bbb1b43ebe feat(airtable): update credential spec with new tool names
Add airtable_delete_records, airtable_search_records, airtable_list_collaborators.
2026-03-09 14:58:57 -07:00
Timothy 70ed5fa8df feat(airtable): add delete records, search records, and list collaborators tools
- airtable_delete_records: DELETE records by comma-separated IDs (up to 10)
- airtable_search_records: search records using FIND formula for partial matching
- airtable_list_collaborators: list base collaborators via meta API
- Add _delete helper for DELETE requests
2026-03-09 14:58:42 -07:00
Timothy 312db6620d feat(reddit): update credential specs with new tool names
Add reddit_get_subreddit_info, reddit_get_post_detail, reddit_get_user_posts
to both credential specs (client_id and client_secret).
2026-03-09 14:57:50 -07:00
Timothy 93c1fc5488 feat(reddit): add subreddit info, post detail, and user posts tools
- reddit_get_subreddit_info: GET /r/{name}/about for subscriber count, description
- reddit_get_post_detail: GET /by_id/t3_{id} for full post details with flair, ratios
- reddit_get_user_posts: GET /user/{name}/submitted for user's post history
2026-03-09 14:57:33 -07:00
Richard Tang 90762f275b feat: give planning mode the load tool 2026-03-09 14:55:53 -07:00
Timothy 801443027d feat(pipedrive): update credential spec with new tool names
Add pipedrive_update_deal, pipedrive_create_person, pipedrive_create_activity
to the credential spec tools list.
2026-03-09 14:54:22 -07:00
Timothy ca2ead76cd feat(pipedrive): add deal update, person creation, and activity creation tools
Add pipedrive_update_deal, pipedrive_create_person, and
pipedrive_create_activity tools using Pipedrive REST API v1.
2026-03-09 14:52:27 -07:00
Timothy d562144a6d feat(confluence): register new tools in credential specs
Add confluence_update_page, confluence_delete_page, and
confluence_get_page_children to all three Confluence credential specs.
2026-03-09 14:51:39 -07:00
Timothy af7fb7da27 feat(confluence): add page update, delete, and children listing tools
Add confluence_update_page, confluence_delete_page, and
confluence_get_page_children tools using Confluence REST API v2.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 14:51:26 -07:00
Timothy c17dd63b4a feat(intercom): register new tools in credential spec
Add intercom_close_conversation, intercom_create_contact, and
intercom_list_conversations to Intercom credential spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 14:50:49 -07:00
Timothy 866db289e2 feat(intercom): add close conversation, create contact, and list conversations tools
Add close_conversation, create_contact, and list_conversations client
methods plus intercom_close_conversation, intercom_create_contact, and
intercom_list_conversations MCP tools using Intercom API v2.11.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 14:50:30 -07:00
Timothy b4ac5e9607 feat(gitlab): register new tools in credential spec
Add gitlab_update_issue, gitlab_get_merge_request, and
gitlab_create_merge_request_note to GitLab credential spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 14:49:01 -07:00
Timothy 3ca7af4242 feat(gitlab): add issue update, MR detail, and MR comment tools
Add _put helper and gitlab_update_issue, gitlab_get_merge_request,
and gitlab_create_merge_request_note tools using GitLab REST API v4.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 14:48:40 -07:00
Richard Tang 2b12a9c91a Merge remote-tracking branch 'origin/feature/queen-global-memory' into feature/queen-global-memory 2026-03-09 14:47:27 -07:00
Richard Tang 9a94595a42 feat: extract the shared knowledge between planning and building 2026-03-09 14:45:31 -07:00
Richard Tang e1540dfaa6 refactor: drop hive code CLI 2026-03-09 14:30:13 -07:00
Richard Tang 4f5ac6d1b1 refactor: rename hive_coder to queen and extract queen orchestrator 2026-03-09 14:23:31 -07:00
Richard Tang c87d7b13da refactor: rename hive_coder to queen and extract queen orchestrator 2026-03-09 14:23:16 -07:00
Timothy c4acf0b659 fix: memory consolidation hook, simplify generated memory files 2026-03-09 14:15:01 -07:00
RichardTang-Aden 5e1ab3ca37 Merge pull request #5029 from karthik-kotra/docs/setup-troubleshooting
docs(setup): add troubleshooting steps for common WSL setup issues
2026-03-09 14:06:28 -07:00
Timothy 79c32c9f47 feat(slack): register new tools in credential spec
Add slack_get_channel_info, slack_list_files, and slack_get_file_info
to Slack credential spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:58:14 -07:00
Timothy 35ee29a843 feat(slack): add channel info, file listing, and file detail tools
Add get_channel_info, list_files, and get_file_info client methods
plus slack_get_channel_info, slack_list_files, and slack_get_file_info
MCP tools using Slack Web API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:57:45 -07:00
Timothy 573aea1d9c feat(stripe): register new tools in credential spec
Add stripe_list_disputes, stripe_list_events, and
stripe_create_checkout_session to Stripe credential spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:56:20 -07:00
Timothy 6ecbc30293 feat(stripe): add disputes, events, and checkout session tools
Add list_disputes, list_events, and create_checkout_session client
methods plus stripe_list_disputes, stripe_list_events, and
stripe_create_checkout_session MCP tools using Stripe API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:56:07 -07:00
Timothy 843b1f2e1d feat(linear): register new tools in credential spec
Add linear_cycles_list, linear_issue_comments_list, and
linear_issue_relation_create to Linear credential spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:54:48 -07:00
Timothy 89f6c8e4ef feat(linear): add cycle listing, issue comments, and issue relations tools
Add list_cycles, list_issue_comments, and create_issue_relation client
methods plus linear_cycles_list, linear_issue_comments_list, and
linear_issue_relation_create MCP tools using Linear GraphQL API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:52:09 -07:00
Timothy 304ac07bd8 feat(zoom): register new tools in credential spec
Add zoom_update_meeting, zoom_list_meeting_participants, and
zoom_list_meeting_registrants to Zoom credential spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:50:27 -07:00
Timothy 82f0684b83 feat(zoom): add meeting update, participants, and registrants tools
Add zoom_update_meeting (PATCH), zoom_list_meeting_participants
(past meeting attendees), and zoom_list_meeting_registrants
using Zoom REST API v2.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:45:11 -07:00
Timothy 963c37dc31 feat(twilio): register new tools in credential specs
Add twilio_list_phone_numbers, twilio_list_calls, and
twilio_delete_message to both Twilio credential specs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:41:26 -07:00
Timothy c02da3ba5a feat(twilio): add phone number listing, call history, and message deletion tools
Add twilio_list_phone_numbers, twilio_list_calls, and
twilio_delete_message tools using Twilio REST API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:40:58 -07:00
Timothy 7f34e95ec6 feat(shopify): register new tools in credential specs
Add shopify_update_product, shopify_get_customer, and
shopify_create_draft_order to both Shopify credential specs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:40:28 -07:00
Timothy f2998fe098 feat(shopify): add product update, customer detail, and draft order tools
Add shopify_update_product, shopify_get_customer, and
shopify_create_draft_order tools using Shopify Admin REST API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:40:15 -07:00
Timothy 323a2489b8 feat(zendesk): register new tools in credential specs
Add zendesk_get_ticket_comments, zendesk_add_ticket_comment, and
zendesk_list_users to all three Zendesk credential specs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:39:35 -07:00
Timothy f6d1cd640e feat(zendesk): add ticket comments and user listing tools
Add zendesk_get_ticket_comments, zendesk_add_ticket_comment, and
zendesk_list_users tools using Zendesk Support API v2.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:39:25 -07:00
Timothy ddf89a04fe feat(asana): update credential spec for new tools
Register asana_update_task, asana_add_comment, and
asana_create_subtask in the Asana credential spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:35:16 -07:00
Timothy c5dc89f5ee feat(asana): add update_task, add_comment, create_subtask tools
Add _put helper and three new Asana MCP tools:
- asana_update_task: modify name, notes, completion, due date, assignee
- asana_add_comment: post comment stories on tasks
- asana_create_subtask: create subtasks under existing tasks

API ref: https://developers.asana.com/docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:35:05 -07:00
Timothy 6ade34b759 feat(trello): register get_card, create_list, search_cards tools
Add three new Trello MCP tools:
- trello_get_card: retrieve full card details with members/checklists/attachments
- trello_create_list: create new lists on boards
- trello_search_cards: full-text search across cards with board scoping

Update credential spec to include the new tool names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:20:43 -07:00
Timothy 09d5f0a9df feat(trello): add client methods for get_card, create_list, search
Add TrelloClient methods for:
- get_card: GET /1/cards/{id} with members, checklists, attachments
- create_list: POST /1/lists to create new board lists
- search: GET /1/search for full-text search across cards

API ref: https://developer.atlassian.com/cloud/trello/rest/api-group-cards/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:19:59 -07:00
Timothy a60d63cca2 feat(github): register list_commits, create_release, list_workflow_runs
Add three new GitHub MCP tools:
- github_list_commits: query commits with author/date/branch filters
- github_create_release: create tagged releases with notes and draft support
- github_list_workflow_runs: monitor CI/CD pipeline runs with status filters

Update credential spec to include the new tool names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:19:16 -07:00
Timothy 8616975fc5 feat(github): add client methods for commits, releases, workflow runs
Add _GitHubClient methods for:
- list_commits: GET /repos/{owner}/{repo}/commits with sha/author/date filters
- create_release: POST /repos/{owner}/{repo}/releases with tag, notes, draft
- list_workflow_runs: GET /repos/{owner}/{repo}/actions/runs with filters

API ref: https://docs.github.com/en/rest

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:18:33 -07:00
Timothy e5ae919d8f feat(telegram): register get_chat_member_count, send_video, set_description
Add three new Telegram MCP tools:
- telegram_get_chat_member_count: retrieve group/channel membership size
- telegram_send_video: send video files via URL or file_id
- telegram_set_chat_description: update group/channel descriptions

Update credential spec to include the new tool names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:17:30 -07:00
Timothy 8e7f5eaaba feat(telegram): add client methods for member count, video, description
Add _TelegramClient methods for:
- get_chat_member_count: getChatMemberCount API endpoint
- send_video: sendVideo with caption, parse_mode, duration support
- set_chat_description: setChatDescription for groups/channels

API ref: https://core.telegram.org/bots/api

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:13:06 -07:00
Timothy 4d1ff8b054 feat(salesforce): update credential spec for new tools
Register salesforce_delete_record, salesforce_search_records, and
salesforce_get_record_count in both Salesforce credential specs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:12:25 -07:00
Timothy 9fa81e8599 feat(salesforce): add delete_record, search_records, get_record_count
Add three new Salesforce MCP tools:
- salesforce_delete_record: DELETE /sobjects/{type}/{id}
- salesforce_search_records: SOSL full-text search via /search/
- salesforce_get_record_count: efficient COUNT() query for any SObject

API ref: https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:12:11 -07:00
Timothy cf8e19b059 feat(discord): register get_channel, create_reaction, delete_message tools
Add three new Discord MCP tools:
- discord_get_channel: retrieve channel metadata (name, topic, type)
- discord_create_reaction: add emoji reactions to messages
- discord_delete_message: remove messages from channels

Update credential spec to include the new tool names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:11:25 -07:00
Timothy dfa3f60fcf feat(discord): add client methods for get_channel, reactions, delete
Add _DiscordClient methods for:
- get_channel: retrieve channel metadata via GET /channels/{id}
- create_reaction: add emoji reaction via PUT reactions endpoint
- delete_message: remove a message via DELETE messages endpoint

API ref: https://discord.com/developers/docs/resources

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:10:49 -07:00
Timothy b795f1b253 feat(notion): update credential spec for new tools
Register notion_update_page, notion_archive_page, and
notion_append_blocks in the Notion credential spec.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:06:20 -07:00
Timothy 73423c0dd2 feat(notion): add update_page, archive_page, append_blocks tools
Add three new Notion MCP tools:
- notion_update_page: modify page properties via PATCH /pages/{id}
- notion_archive_page: archive or restore pages
- notion_append_blocks: add paragraphs, headings, lists, todos, etc.

API ref: https://developers.notion.com/reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:06:08 -07:00
Timothy 3d844e1539 feat(jira): update credential spec for new tools
Register jira_update_issue, jira_list_transitions, and
jira_transition_issue in all three Jira credential specs
(domain, email, token).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:04:32 -07:00
Timothy b619119eb5 feat(jira): add update_issue, list_transitions, transition_issue tools
Add three new Jira MCP tools:
- jira_update_issue: modify summary, description, priority, labels, assignee
- jira_list_transitions: discover available status transitions for an issue
- jira_transition_issue: move an issue to a new status with optional comment

API ref: https://developer.atlassian.com/cloud/jira/platform/rest/v3/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:04:19 -07:00
Timothy b00ed4fc70 feat(hubspot): register delete_object, list/create_associations tools
Add three new MCP tools:
- hubspot_delete_object: archive contacts, companies, or deals
- hubspot_list_associations: query links between CRM objects (v4 API)
- hubspot_create_association: link two CRM records together

Update credential spec to include the new tool names.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:03:37 -07:00
Timothy 5ec5fbe998 feat(hubspot): add client methods for delete, associations
Add _HubSpotClient methods for:
- delete_object: archive a CRM object via DELETE /crm/v3/objects
- list_associations: query associations via GET /crm/v4/objects associations endpoint
- create_association: link two CRM objects via PUT /crm/v4/objects associations endpoint

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 13:02:49 -07:00
Richard Tang 2ed814455a Merge branch 'feat/queen-planning-phase' into feature/queen-global-memory 2026-03-09 12:57:23 -07:00
Timothy ad1a4ef0c3 fix: cancellation button 2026-03-09 12:48:20 -07:00
Timothy 2111c808a9 feat: queen memory v1 2026-03-09 11:55:39 -07:00
Bryan @ Aden 402bb38267 Merge pull request #6079 from Waryjustice/fix/google-sheets-credentials-orphan
fix(credentials): remove orphaned google_sheets.py credential spec
2026-03-09 18:37:27 +00:00
Waryjustice 0a55928872 fix(credentials): remove orphaned google_sheets.py credential spec
The google_sheets.py file defined GOOGLE_SHEETS_CREDENTIALS (an API-key
based credential for reading public sheets via GOOGLE_SHEETS_API_KEY) but
was never wired into the package:

- Never imported in credentials/__init__.py
- Never merged into CREDENTIAL_SPECS
- Never listed in __all__
- Tool never calls credentials.get('google_sheets_key') — uses 'google' (OAuth2)
- Tool names in the spec were stale and did not match actual function names

The 'google' credential in email.py already correctly covers all Google
Sheets tools via OAuth2. This file was dead code with no referencing
imports anywhere in the repository.

Closes #6077

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-09 23:44:26 +05:30
Richard Tang cdf76ae3b9 fix: eventloop test 2026-03-09 10:23:56 -07:00
bryan 4ad0d0e077 fix: align the credential functions to be the same 2026-03-09 10:14:21 -07:00
Richard Tang 42d0592941 refactor: judge evaluate 2026-03-09 10:09:15 -07:00
Richard Tang 1de7cf821d fix: handle judge with empty message 2026-03-09 09:58:29 -07:00
Timothy 4ea8540e25 fix: better logging for memory consolidation event 2026-03-08 20:44:40 -07:00
Timothy bfa3b8e0f6 fix: queen memory health 2026-03-08 20:28:53 -07:00
Richard Tang 55eccfd75f feat: intake node prompt in planning mode 2026-03-08 20:27:24 -07:00
Timothy 1e994a77b5 feat: queen agent global memory 2026-03-08 19:54:46 -07:00
Richard Tang d12afeb35d chore: ruff lint 2026-03-08 19:46:49 -07:00
Timothy @aden b55a77634b Delete .github/ISSUE_TEMPLATE/link-discord.yml 2026-03-08 19:44:48 -07:00
Richard Tang e84fefd319 feat: separate the queen and worker tools in prompts 2026-03-08 19:40:30 -07:00
bryan cba0ec110f fix: linter update 2026-03-08 19:37:57 -07:00
bryan 0256e0c944 Merge branch 'main' into feat/agent-trigger 2026-03-08 19:28:36 -07:00
Bryan @ Aden f7db603922 Merge pull request #6048 from aden-hive/fix/draft-email-tool
(micro-fix): draft email tool
2026-03-09 02:26:58 +00:00
bryan b4a47a12ff fix: linter formatting 2026-03-08 19:26:06 -07:00
bryan 2228851b16 feat: added reply in thread to draft email tool 2026-03-08 19:24:38 -07:00
Richard Tang d2b510014d feat: adjust tools and knowledge separation between planning and building 2026-03-08 19:21:50 -07:00
Bryan @ Aden ed0a211906 Merge pull request #6047 from aden-hive/fix/reply-email-tool
(micro-fix): reply email tool
2026-03-09 02:00:03 +00:00
bryan 63744ddaef fix: update to pass linter 2026-03-08 18:58:50 -07:00
bryan 82331acb77 feat: update reply email tool to contain the email thread in the body 2026-03-08 18:53:53 -07:00
Richard Tang 3ed5fda448 feat: planning phase for the queen 2026-03-08 18:49:45 -07:00
bryan 4d9d0362a0 fixes to make the timer trigger properly 2026-03-08 18:44:42 -07:00
Timothy @aden b96bbcaa72 Merge pull request #6044 from Amdev-5/fix/e501-coder-tools-server-6043
fix: E501 line too long in coder_tools_server.py
2026-03-08 17:39:59 -07:00
Timothy edfa49bf7a fix: ci test 2026-03-08 17:29:36 -07:00
RichardTang-Aden eb9e4ed23c Merge pull request #5955 from akshajtiwari/ci-first-issue
CI: add uv caching, improve PR requirements workflow
2026-03-08 17:17:22 -07:00
Amdev-5 fed9e90271 fix: E501 line too long in coder_tools_server.py
Break ternary expression across multiple lines to satisfy
the 100-char line length limit.

Fixes #6043
2026-03-09 05:35:45 +05:30
bryan f474d0bc8e Merge branch 'main' into feat/agent-trigger 2026-03-08 16:59:14 -07:00
bryan 6a0681b9aa feat: fixing phase 4, continuing to test 2026-03-08 16:52:00 -07:00
Timothy ca565ae664 fix: validate agent package for orphaned nodes 2026-03-07 09:29:48 -08:00
Timothy 42ce97e0fc fix: agent package validation - no orphaned nodes 2026-03-07 08:47:01 -08:00
Akshaj Tiwari bea17b5f79 simplify label creation logic by assuming label pre-exists 2026-03-07 19:02:04 +05:30
Akshaj Tiwari ab0d5ce8d3 change pr.updated_at to pr.created_at for the grace period check 2026-03-07 18:58:36 +05:30
Akshaj Tiwari b374d5119a resolving the ci.yml issues by using enable-cache instead of manual caching 2026-03-07 18:49:17 +05:30
Robert Hallers 7a467ef9b8 docs: mark TUI as deprecated in roadmap to match CLAUDE.md
Resolves inconsistency between CLAUDE.md/AGENTS.md (TUI deprecated) and
docs/roadmap.md (TUI listed as completed feature).

- Strike through TUI items in 3 roadmap sections
- Add deprecation note to TUI-to-GUI upgrade section
- Reference AGENTS.md and hive open as replacement

Fixes #5941

Signed-off-by: Robert Hallers <robert@terplabs.ai>
2026-03-07 02:36:04 -05:00
RichardTang-Aden 9129b4a42e Merge pull request #5975 from aden-hive/feat/queen-responsibility
feat: separate queen responsibility by phases
2026-03-06 19:21:53 -08:00
bryan c7e634851b feat: phase 4 of trigger plan 2026-03-06 19:21:32 -08:00
Richard Tang e906646d49 Merge remote-tracking branch 'origin/feature/thinking-hook' into feat/queen-responsibility 2026-03-06 19:15:29 -08:00
Timothy 086a532521 fix: skip queen judge, turn off aggressive compaction 2026-03-06 19:14:25 -08:00
Richard Tang 19dd40ed3a chore: ruff lint 2026-03-06 19:11:53 -08:00
Richard Tang 196f3d645f feat: building phase prompts improvements 2026-03-06 18:59:12 -08:00
Richard Tang 80fd91d175 feat: building phase prompt optimization 2026-03-06 18:53:33 -08:00
Richard Tang 695410f880 Merge remote-tracking branch 'origin/feature/thinking-hook' into feat/queen-responsibility 2026-03-06 18:39:21 -08:00
Timothy 009c62dac6 chore: re-organize hooks 2026-03-06 18:38:20 -08:00
Richard Tang 27c8904341 feat: limit the tool description in simple mode 2026-03-06 18:37:08 -08:00
Richard Tang ddfce58071 feat: simplify building prompts 2026-03-06 18:29:44 -08:00
bryan cdb7155960 feat: phase 3 of trigger plan 2026-03-06 18:07:26 -08:00
Richard Tang 1bb850bdbe Merge remote-tracking branch 'origin/feature/thinking-hook' into feat/queen-responsibility 2026-03-06 17:59:33 -08:00
Richard Tang 5019633ba3 fix: remove wrong tool examples 2026-03-06 17:58:50 -08:00
Timothy @aden b0fd8b83f0 Merge pull request #5896 from VasuBansal7576/codex/pr-minimax-single
fix: add minimax provider mapping and stream fallback
2026-03-06 17:54:41 -08:00
Richard Tang 2dc58eeeb0 fix: add missing gcu prompts 2026-03-06 17:48:37 -08:00
Richard Tang 50ab55ded5 feat: loading improvement 2026-03-06 17:35:17 -08:00
bryan 3f7790c26a feat: phase 2 of trigger plan 2026-03-06 17:22:57 -08:00
Timothy 9f656577a2 fix: turn signal edge case 2026-03-06 17:18:54 -08:00
Richard Tang 5c87b4b194 Merge remote-tracking branch 'origin/feature/thinking-hook' into feat/queen-responsibility 2026-03-06 17:10:25 -08:00
Timothy 7f866b24a1 Merge branch 'feat/queen-responsibility' into feature/thinking-hook 2026-03-06 17:07:12 -08:00
Timothy 5eb623e931 fix: back to back compaction edge case 2026-03-06 17:06:53 -08:00
Richard Tang 5583896429 fix: add back gcu instruction 2026-03-06 17:05:47 -08:00
Timothy 3a8321b975 chore: fix compaction reference 2026-03-06 16:59:44 -08:00
bryan 5676b115f4 Merge branch 'feat/queen-responsibility' into feat/agent-trigger 2026-03-06 16:58:06 -08:00
Richard Tang 8443ec87a6 fix: output key that terminated the queen 2026-03-06 16:55:10 -08:00
Richard Tang 1e06e87f4c feat: improve validation 2026-03-06 16:34:28 -08:00
Richard Tang e2558e3f95 fix: llm friendly input 2026-03-06 16:04:58 -08:00
Richard Tang 1344d3bb8e feat: add node parameters and fix problems for initialize_agent_package 2026-03-06 15:54:27 -08:00
Timothy dc7ec6c058 Merge branch 'feat/queen-responsibility' into feature/thinking-hook 2026-03-06 15:35:17 -08:00
Richard Tang 0ba781609a refactor: remove unused builder functions 2026-03-06 15:32:45 -08:00
bryan 61c59d57e8 feat: phase 1 of trigger plan 2026-03-06 15:11:36 -08:00
Richard Tang 5ce230b0a6 refactor: move the coder tools 2026-03-06 14:56:19 -08:00
Timothy ff1527a77a fix: thinking hook max token 2026-03-06 14:53:32 -08:00
Timothy 252dea0bc3 Merge branch 'feat/queen-responsibility' into feature/thinking-hook 2026-03-06 14:37:02 -08:00
Timothy 126cbe529f feat: queen thinking hook 2026-03-06 14:30:10 -08:00
Richard Tang 207a0a0ca5 feat: fix for mcp tools and templates 2026-03-06 14:27:20 -08:00
Richard Tang d9f502173b feat: queen building improvements 2026-03-06 13:51:49 -08:00
Richard Tang f090ce4d5a fix: duplicated session calls 2026-03-06 12:28:37 -08:00
Richard Tang 1f7efcd940 feat: queen prompt optimization 2026-03-06 12:27:08 -08:00
Akshaj Tiwari fbbbaadd1e remove workflow_dispatch trigger from PR requirements workflows(forgot this commit) 2026-03-07 00:59:56 +05:30
Richard Tang 4de140a170 Merge remote-tracking branch 'origin/main' into feat/queen-responsibility 2026-03-06 11:17:03 -08:00
Richard Tang ed5cfb93a4 fix: catch Cannot write to closing transport error 2026-03-06 11:15:13 -08:00
Richard Tang 891bf08477 feat: re-organized queen prompt 2026-03-06 11:13:15 -08:00
Akshaj Tiwari 37651e534f add PR requirements warning and enforcement workflow and remove the workflow dispatch trigger 2026-03-07 00:39:35 +05:30
Akshaj Tiwari df63c3e781 add the pr requirement changes and remove the workflow dispatch option from ci.yml(tested) 2026-03-06 23:44:45 +05:30
Akshaj Tiwari 838da4a16e style: fix ruff import ordering 2026-03-06 22:57:14 +05:30
Akshaj Tiwari e916d573f6 adding workflow dispatch for testing 2026-03-06 22:51:19 +05:30
Richard Tang 08d51bb377 refactor: remove the old coderagent for TUI 2026-03-06 09:20:49 -08:00
Akshaj Tiwari fa5ebf19a4 first commit with the cache and working directory attributes 2026-03-06 22:49:06 +05:30
Richard Tang 17de0efcaf feat: rename escalation tool 2026-03-06 08:48:16 -08:00
Timothy 4099603a91 chore: lint 2026-03-06 08:21:40 -08:00
Vasu Bansal 988a58c1b7 fix: harden legacy agent.json loading error handling 2026-03-06 20:31:59 +05:30
Vasu Bansal cbc7ec3a32 fix: resolve aden client import duplication after rebase 2026-03-06 16:13:49 +05:30
Vasu Bansal 07d4bf8044 fix: resolve ruff import-order failure in aden client 2026-03-06 16:10:02 +05:30
Vasu Bansal e302e93ac9 chore: retrigger ci 2026-03-06 16:10:02 +05:30
Vasu Bansal 80f5a363d2 fix: address minimax review feedback in quickstart and provider wiring 2026-03-06 16:10:02 +05:30
Vasu Bansal 7b5b6d2c51 fix: add minimax provider mapping and stream fallback 2026-03-06 16:10:02 +05:30
Timothy 0b1fd72e49 chore: lint 2026-03-05 21:28:17 -08:00
Timothy 353f5c31a2 chore: lint 2026-03-05 21:24:21 -08:00
Richard Tang ad40b049ae feat: update the escalate tool 2026-03-05 20:53:33 -08:00
Richard Tang 42c9a11b1a feat: remove the terminal node for queen 2026-03-05 20:42:28 -08:00
Richard Tang c3fb1885c3 feat: make terminal node validation warning 2026-03-05 20:20:49 -08:00
Richard Tang bb413bad1f feat: prompts to allow user to override 2026-03-05 19:50:11 -08:00
Timothy afef3cb66a Merge branch 'fix/output-cleaner' into feat/queen-responsibility 2026-03-05 19:48:28 -08:00
Timothy b1f3d931cd fix: remove output cleaner 2026-03-05 19:48:13 -08:00
Richard Tang 297b24e061 feat: instruction for running phase to handle escalation 2026-03-05 19:47:13 -08:00
Richard Tang 775a0fa511 chore: prompt debug tool 2026-03-05 19:35:52 -08:00
Richard Tang 77abed89b9 feat: queen identity 2026-03-05 19:29:04 -08:00
Richard Tang e2aeb72d49 feat: queen identity 2026-03-05 19:25:59 -08:00
Richard Tang 2b440f84f0 feat: add the termination session back to queen 2026-03-05 19:10:49 -08:00
Richard Tang 8fd66a12c5 Merge remote-tracking branch 'origin/feat/queen-responsibility' into feat/queen-responsibility 2026-03-05 19:03:27 -08:00
Richard Tang f23d5a3ff5 feat: add terminal node back in graph 2026-03-05 19:02:41 -08:00
Timothy be8ec867e5 fix: better stall detection 2026-03-05 18:51:36 -08:00
Timothy b2ba42e541 Merge branch 'feat/queen-responsibility' into feat/worker-progressive-disclosure 2026-03-05 15:26:09 -08:00
Richard Tang 94d0038e03 chore: remove duplicates in anti-patterns 2026-03-05 14:54:46 -08:00
Timothy e1bf300e3c feat: progressive disclosure of runtime data 2026-03-05 14:44:02 -08:00
Timothy @aden f36add83f0 Merge pull request #5901 from aden-hive/fix/bom-safe-json-load
fix(micro-fix): bom safe json loading
2026-03-05 14:29:37 -08:00
Timothy Zhang a57d58e8d4 fix: bom safe json loading 2026-03-05 14:27:15 -08:00
Richard Tang c6b922e831 feat: condense framework guide 2026-03-05 14:26:03 -08:00
Richard Tang 71d12a7904 feat: condensed queen building prompts 2026-03-05 14:18:03 -08:00
Richard Tang 24c25d408c feat: remove unused prompts 2026-03-05 14:13:02 -08:00
Richard Tang 2e99fc9fe5 feat: change graph guidelines 2026-03-05 14:08:52 -08:00
Richard Tang c1f066b8ba feat: add gcu and validation in initialize_agent_package 2026-03-05 14:01:43 -08:00
Richard Tang e7a6074800 fix: prevent duplicate session creation when starting from home 2026-03-05 13:44:39 -08:00
Richard Tang 719942d29a fix: bug for multiple session calls 2026-03-05 13:04:17 -08:00
Richard Tang 190450a2b2 refactor: skip judge logic improvement 2026-03-05 12:38:18 -08:00
Richard Tang 44d609b719 feat: allow judge to wait queen input 2026-03-05 12:33:27 -08:00
Richard Tang 8c9892f9f6 feat: re-organized the tools for list mcp tools 2026-03-05 12:06:57 -08:00
Bryan @ Aden 85c204a442 Merge pull request #5403 from jackthepunished/feat/telegram-tool-expansion
feat(tools): expand Telegram tool with message management, media, and chat info operations
2026-03-05 19:48:05 +00:00
Timothy @aden 56075a25a3 Merge pull request #5884 from aden-hive/feature/hive-as-a-game
feat(micro-fix): link discord github template
2026-03-05 10:55:13 -08:00
Timothy 2b0a6779cc feat: link discord github template 2026-03-05 10:54:30 -08:00
Timothy @aden b9ddce9d41 Merge pull request #5881 from aden-hive/docs-contributor-registration---Timothy
docs: Add TimothyZhang7 to contributors list
2026-03-05 10:37:26 -08:00
Timothy @aden 0c85406bc2 Add TimothyZhang7 to contributors list 2026-03-05 10:36:19 -08:00
Timothy @aden 1051134594 Merge pull request #5878 from aden-hive/feature/hive-as-a-game
chore: fix repo owner
2026-03-05 10:32:20 -08:00
jackthepunished 653d24df9d fix: address review — use POST for getChat, return raw API responses
- Change get_chat client method from httpx.get+params to httpx.post+json
  to avoid URL-encoding issues with @username chat IDs
- Remove {"success": True} normalization from delete_message,
  send_chat_action, pin_message, and unpin_message MCP tools;
  return raw Telegram API response consistently
- Update corresponding test mocks and assertions to match
2026-03-05 20:40:46 +03:00
jackthepunished b687fa9e94 feat(tools): expand Telegram tool with message management, media, and chat info operations
Add 8 new operations to the Telegram Bot tool, bringing it from 2 to 10
operations. This covers message lifecycle (edit, delete, forward), media
(send photo), chat info (get chat), UX (typing indicators), and pin
management — making the tool practical for agent workflows beyond
fire-and-forget messaging.

New operations:
- telegram_edit_message: edit previously sent messages
- telegram_delete_message: delete messages
- telegram_forward_message: forward between chats
- telegram_send_photo: send photos via URL or file_id
- telegram_send_chat_action: show typing/uploading indicators
- telegram_get_chat: retrieve chat metadata
- telegram_pin_message: pin important messages
- telegram_unpin_message: unpin stale messages

Also includes input validation for chat actions, credential spec updates,
central registry wiring, and 31 new tests (52 total).

Closes #4808
2026-03-05 20:40:45 +03:00
Timothy c7f0ab0444 chore: fix repo owner 2026-03-05 09:33:02 -08:00
Timothy @aden 93bf373a5b Merge pull request #5869 from aden-hive/feature/hive-as-a-game
feat: integration bounty program with Lurkr XP, Discord roles, and automated tracking
2026-03-05 09:29:48 -08:00
Timothy 2d87042a70 fix: bad chars 2026-03-05 09:00:51 -08:00
Timothy 8a28abb7b8 fix: github actions 2026-03-05 09:00:06 -08:00
Emmanuel Nwanguma 0cdfbac5a1 docs(tools): add README for brevo, csv, runtime_logs, account_info tools (#5602)
* docs(tools): add README for brevo, csv, runtime_logs, account_info tools

- brevo_tool: Transactional email/SMS and contact management via Brevo API
- csv_tool: Read, write, query CSV files with DuckDB SQL support
- runtime_logs_tool: Query three-level runtime logging system
- account_info_tool: Query connected accounts and identities

* docs: fix runtime_logs_tool README to match implementation

- query_runtime_logs: add missing status values (degraded, in_progress, needs_attention)
- query_runtime_log_details: add missing needs_attention_only parameter
- query_runtime_log_raw: fix step_type -> step_index (int, not str)
- Fix file names: nodes.jsonl -> details.jsonl, steps.jsonl -> tool_logs.jsonl
- Fix error handling examples to match actual code

---------

Co-authored-by: hundao <alchemy_wimp@hotmail.com>
2026-03-05 18:51:47 +08:00
alidevh 29a3ae471f fix(config): add logging for config parse errors (#4955)
Co-authored-by: alihassan <239741857+alidevh@users.noreply.github.com>
2026-03-05 18:22:34 +08:00
singhhnitin 9c0f56f027 Improve indirect variable expansion for provider API key detection (#5504)
Co-authored-by: Nitin Singh <nitinsingh3323@gmail.com>
2026-03-05 18:12:14 +08:00
Hundao 462e303a6e ci: skip POSIX permission tests on Windows (temporary, see #5842) (#5847)
Windows does not support POSIX file permissions, causing 4 test
failures on Windows CI. Skip these tests until the proper
ReplaceFileW fix lands.
2026-03-05 17:50:05 +08:00
Hundao a84b3c7867 fix: validate agent.json before parsing in AgentRunner.load() (#5846)
Use is_file() instead of exists() to reject directories, and check
for empty content before passing to json parser. Prevents raw
tracebacks on invalid agent.json inputs.

Fixes #5787
2026-03-05 17:23:46 +08:00
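The validation described in a84b3c7867 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual body of AgentRunner.load(): the helper name load_agent_json and the choice of ValueError are hypothetical.

```python
import json
from pathlib import Path


def load_agent_json(path: Path) -> dict:
    """Hypothetical helper: validate an agent.json path before parsing it."""
    # is_file() is False for directories and missing paths,
    # whereas exists() would accept a directory.
    if not path.is_file():
        raise ValueError(f"{path} is not a regular file")

    text = path.read_text(encoding="utf-8")
    # Reject empty or whitespace-only content before it reaches the parser.
    if not text.strip():
        raise ValueError(f"{path} is empty")

    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # Surface a short message instead of a raw traceback.
        raise ValueError(f"{path} is not valid JSON: {exc}") from exc
```

Checking is_file() first means a directory named agent.json fails with a clear message rather than an OS-level read error.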
Anushka Punekar 606267d053 fix(cli): validate --output path before agent execution in cmd_run (#5838)
* fix(cli): validate --output path before agent execution in cmd_run

* style: fix indentation and formatting

---------

Co-authored-by: hundao <alchemy_wimp@hotmail.com>
2026-03-05 16:51:33 +08:00
Timothy @aden 35791ae478 Merge pull request #5834 from aden-hive/fix/quickstart-tweaks
chore(micro-fix): tweak quickstart
2026-03-04 20:06:40 -08:00
Timothy 10f0002080 chore: tweak quickstart 2026-03-04 20:05:16 -08:00
Bryan @ Aden 60bff4107d Merge pull request #5833 from aden-hive/feat/google-scopes
micro-fix: quickstart build failing
2026-03-05 04:03:55 +00:00
bryan be11fa4b29 fix: quickstart build failing 2026-03-04 20:02:23 -08:00
Richard Tang 6ade844722 feat: escalation implementation 2026-03-04 19:59:02 -08:00
Bryan @ Aden da8bc796d3 Merge pull request #5832 from aden-hive/feat/google-scopes
(micro-fix): chore: updating tool tests
2026-03-05 03:53:25 +00:00
bryan 429619379e fix: linter issues 2026-03-04 19:50:25 -08:00
bryan 0fecedbbbf chore: updating tool tests 2026-03-04 19:47:55 -08:00
Timothy @aden a2244ada75 Merge pull request #5764 from aden-hive/feat/google-scopes
Feat/google scopes
2026-03-04 19:43:30 -08:00
bryan 7608ba9290 Merge branch 'main' into feat/google-scopes 2026-03-04 19:40:46 -08:00
Richard Tang b9a3c67fea feat: dynamically load the system prompt in the context 2026-03-04 19:40:20 -08:00
bryan f5f3396d5c chore: update icons of sample agents 2026-03-04 19:38:14 -08:00
bryan ed80ae80f0 feat: twitter news sample agent 2026-03-04 19:37:34 -08:00
Timothy c7a47c71f0 fix: simplify game plan 2026-03-04 19:15:27 -08:00
Richard Tang 219bbe00fc feat: move guardrail to validation 2026-03-04 19:11:37 -08:00
Timothy @aden b14b8f8c52 Merge pull request #5815 from levxn/bug/agent-sessions
Restore sessions on server restart | resume conversations where they left off | fix unhandled error in event routes
2026-03-04 19:10:38 -08:00
bryan df1a83d475 feat/local-business-sample-agent 2026-03-04 19:09:02 -08:00
bryan 5b7727cfd1 fix: permanent top bar 2026-03-04 19:08:20 -08:00
Timothy 93e270dafb fix: change initial plan 2026-03-04 19:01:24 -08:00
Timothy be675dbb17 fix: restructure docs 2026-03-04 18:59:20 -08:00
Timothy 1c24848db3 feat: implement hive github repo and discord as a connected game 2026-03-04 18:52:42 -08:00
Richard Tang ef6af5404f refactor: new builder flow 2026-03-04 18:42:20 -08:00
Richard Tang b7d57f3d49 feat: fix run_command and remove agent search 2026-03-04 18:04:42 -08:00
Richard Tang 58c892babb Merge remote-tracking branch 'origin/main' into feat/queen-responsibility 2026-03-04 18:03:03 -08:00
Timothy @aden 4b5ec796bc Merge pull request #5829 from aden-hive/feat/remove-old-session-status-tools
fix: remove the reference in the coder agent init
2026-03-04 17:42:34 -08:00
Richard Tang 24df4729ca fix: remove the reference in the coder agent init 2026-03-04 17:40:28 -08:00
Richard Tang 9e2004e33b Merge remote-tracking branch 'origin/main' into feat/queen-responsibility 2026-03-04 17:30:09 -08:00
Timothy @aden 1e6538efac Merge pull request #5828 from aden-hive/feat/remove-old-session-status-tools
Remove deprecated get_agent_session_state and get_agent_session_memory tools
2026-03-04 17:29:35 -08:00
Richard Tang f9e53f58af refactor: remove old get_agent_session_state and get_agent_session_memory tools 2026-03-04 17:23:10 -08:00
Timothy 41388efc31 fix: Windows compat — guard os.fchmod and remove deleted LLM_CREDENTIALS import
os.fchmod does not exist on Windows; guard with hasattr check.
Remove LLM_CREDENTIALS reference from test (module deleted in e1db3a4).
2026-03-04 17:22:21 -08:00
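The hasattr guard mentioned in 41388efc31 above might look roughly like this. It is a sketch only: the function name restrict_permissions and the 0o600 default are assumptions for illustration and are not taken from the repository.

```python
import os


def restrict_permissions(fd: int, mode: int = 0o600) -> bool:
    """Hypothetical helper: tighten permissions on an open file descriptor.

    Returns True if os.fchmod was applied, False if the platform lacks it.
    """
    # The commit notes that os.fchmod does not exist on Windows,
    # so check for it before calling.
    if hasattr(os, "fchmod"):
        os.fchmod(fd, mode)
        return True
    return False
```

Returning a flag rather than raising lets callers decide whether a missing fchmod is acceptable on the current platform.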
Timothy @aden fab5ce6fd0 Merge pull request #5824 from aden-hive/chore/fix-tool-tests
chore(micro-fix): fix test
2026-03-04 17:16:10 -08:00
Richard Tang b8be3056ed feat: list agent tool instruction 2026-03-04 17:03:45 -08:00
Timothy 207d6baee5 chore: fix test 2026-03-04 16:49:39 -08:00
Timothy @aden fec72bb2b6 Merge pull request #5294 from Antiarin/feat/hashline-edit-tool
[Integration]feat: add hashline anchor-based file editing tool
2026-03-04 16:38:13 -08:00
Richard Tang 39029b82d6 refactor: remove the coding agent check in quickstart 2026-03-04 16:27:33 -08:00
Richard Tang 232890b970 docs: remove the reference of the old agent skills 2026-03-04 16:23:30 -08:00
Timothy c4c4c24c59 Merge branch 'main' into feat/hashline-edit-tool 2026-03-04 16:23:07 -08:00
Richard Tang 13a8e28ae2 refactor: remove all old unused skills 2026-03-04 16:18:28 -08:00
bryan 917c7706ea chore: lint fix 2026-03-04 16:14:56 -08:00
bryan 8fadcd5b21 Merge branch 'main' into feat/google-scopes 2026-03-04 16:12:31 -08:00
Timothy @aden 2005ba2dca Merge pull request #5823 from aden-hive/micro-fix/lint
chore(micro-fix): lint
2026-03-04 16:11:51 -08:00
Richard Tang 34a44aa83c chore: update remaining reference for queen mode 2026-03-04 16:11:28 -08:00
Timothy 557d5fd6e5 chore: lint 2026-03-04 16:10:35 -08:00
Richard Tang 8468c45dc2 refactor: rename the queen mode to queen phase for clarity 2026-03-04 16:10:15 -08:00
Timothy @aden 79d2a15f95 Merge pull request #5814 from fermano/feature/windows-filesysten
Feature/windows filesystem
2026-03-04 16:07:33 -08:00
Richard Tang d2c3649566 refactor: re-organize the agent initialization tools 2026-03-04 16:06:00 -08:00
Timothy ab32e44128 style: ruff format fixes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 16:00:13 -08:00
Bryan @ Aden 047059f85f Merge pull request #5774 from kostasuser01gr/docs/4780-roadmap-updates
docs: update roadmap to reflect completed features (refs #4780)
2026-03-04 23:59:56 +00:00
Timothy e8364f616d Merge remote-tracking branch 'origin/main' into feature/windows-filesysten 2026-03-04 15:59:49 -08:00
Bryan @ Aden 9098c9b6c6 Merge pull request #5785 from code-Miracle49/fix/remove-duplicate-execute-subagent
micro-fix: remove duplicate _execute_subagent method in EventLoopNode
2026-03-04 23:55:36 +00:00
Timothy 84fd9ebac8 style: fix E501 line-too-long lint errors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 15:53:45 -08:00
Timothy @aden 23d5d76d56 Merge pull request #5822 from aden-hive/fix/anthropic-vendor-issue
fix: remove hardcoded Anthropic dependencies from core framework
2026-03-04 15:46:16 -08:00
Timothy b0c86588b6 chore: lint 2026-03-04 15:44:11 -08:00
Timothy 5aff1f9489 chore: lint 2026-03-04 15:43:59 -08:00
Timothy Zhang 199cb3d8cc fix: stdin conflicts 2026-03-04 14:45:02 -08:00
Fernando Mano a98a4ca0b6 feature(WindowsFilesystemSupport): #5677 - Windows File System Support and Testing -- fixing lint issues 2026-03-04 21:22:11 -03:00
Fernando Mano c4f49aadfa feature(WindowsFilesystemSupport): #5677 - Windows File System Support and Testing -- fixing lint issues 2026-03-04 21:20:33 -03:00
Fernando Mano ca5ac389cf feature(WindowsFilesystemSupport): #5677 - Windows File System Support and Testing -- fixing lint issues 2026-03-04 21:15:08 -03:00
Fernando Mano 7a658f7953 feature(WindowsFilesystemSupport): #5677 - Windows File System Support and Testing -- fixing lint issues 2026-03-04 21:05:49 -03:00
Timothy e05fc99da7 Merge branch 'main' into fix/anthropic-vendor-issue 2026-03-04 14:43:27 -08:00
Bryan @ Aden 787090667e Merge pull request #5816 from aden-hive/fix/pause-stop-worker
(micro-fix): update pause in pipeline uses stop_worker like queen
2026-03-04 21:50:23 +00:00
Antiarin 80b36b4052 fix: CRLF double-conversion in hashline edit and add large file skip reporting
- Replace joined.replace("\n", "\r\n") with re.sub(r"(?<!\r)\n", "\r\n", joined) to prevent \r\n in replace op new_content from becoming \r\r\n (fixed in both hashline_edit.py and file_ops.py)
- Track and report skipped large files in grep_search instead of silently skipping them
- Extract HASHLINE_MAX_FILE_BYTES constant to hashline.py as single source of truth, imported by view_file, grep_search, hashline_edit, and file_ops
- Add tests for CRLF replace op (both copies) and large file skip reporting
2026-03-05 03:12:35 +05:30
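The CRLF fix described in this commit can be illustrated with a minimal sketch. The regular expression is the one quoted in the commit message; the function names `to_crlf` and `naive_to_crlf` are hypothetical and are not taken from hashline_edit.py or file_ops.py.

```python
import re


def naive_to_crlf(text: str) -> str:
    """Old behaviour: every LF becomes CRLF, even one already preceded by CR."""
    return text.replace("\n", "\r\n")


def to_crlf(text: str) -> str:
    """Convert only bare LF to CRLF; the lookbehind skips LF that follows CR."""
    return re.sub(r"(?<!\r)\n", "\r\n", text)


# Hypothetical input mixing an existing CRLF with a bare LF.
mixed = "a\r\nb\nc"
doubled = naive_to_crlf(mixed)  # the existing CRLF turns into CR CR LF
fixed = to_crlf(mixed)          # the existing CRLF is left alone
```

The negative lookbehind `(?<!\r)` is what keeps content that already contains `\r\n` from being converted a second time.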
bryan 0b8ed521c0 fix:update pause in pipeline uses stop_worker like queen 2026-03-04 13:42:20 -08:00
levxn 1ec7c5545f fixing lints and formatting 2026-03-05 02:59:57 +05:30
levxn cc6b6760c3 enables resume from where it was left off 2026-03-05 02:34:23 +05:30
Levin 26aed90ab2 Merge branch 'aden-hive:main' into bug/agent-sessions 2026-03-05 02:32:56 +05:30
Timothy 1c58ccb0c1 chore: lint 2026-03-04 12:45:27 -08:00
Timothy 79b80fe817 feat: coder tools to also support hashline editing 2026-03-04 12:41:07 -08:00
Antiarin c0f3841af7 feat: add file size check in grep_search to skip large files and switch case in hashline edit
- Implemented a check to skip files larger than 10MB in the grep_search function to optimize memory usage.
2026-03-04 12:41:07 -08:00
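A size guard of the kind this commit describes might look like the sketch below. It is not the repository's grep_search: the function name `grep_files`, its return shape, and the exact byte value of the limit (10 * 1024 * 1024) are assumptions made for illustration. Returning the skipped paths alongside the matches reflects the later change that reports skipped files instead of dropping them silently.

```python
import os

# Assumed value for the 10MB limit mentioned in the commit message.
MAX_FILE_BYTES = 10 * 1024 * 1024


def grep_files(paths, needle, max_bytes=MAX_FILE_BYTES):
    """Search files for a substring, skipping any file larger than max_bytes.

    Returns (matches, skipped): matches is a list of (path, line_number, line)
    tuples, and skipped lists the paths that exceeded the size limit.
    """
    matches, skipped = [], []
    for path in paths:
        if os.path.getsize(path) > max_bytes:
            skipped.append(str(path))  # report the skip rather than hide it
            continue
        with open(path, encoding="utf-8", errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if needle in line:
                    matches.append((str(path), lineno, line.rstrip("\n")))
    return matches, skipped
```

Checking the size with `os.path.getsize` before opening the file means an oversized file is never read into memory at all.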
Antiarin 2b7d9bc471 feat:Updating the docs 2026-03-04 12:41:07 -08:00
Antiarin 98dc493a39 feat: Add cross-tool hashing anchor, grep search, and all viewfile 2026-03-04 12:41:07 -08:00
Antiarin cfaa57b28d feat:Add hashing tool 2026-03-04 12:40:25 -08:00
Timothy @aden 219e603de6 Merge pull request #5813 from aden-hive/refactor/quickstart
Refactor/quickstart
2026-03-04 12:27:45 -08:00
Timothy @aden 7663a5bce8 Merge pull request #5797 from Waryjustice/fix/windows-browser-auto-open
fix: browser auto-open after quickstart does not work on Windows
2026-03-04 12:27:35 -08:00
Timothy f2841b945d chore: lint 2026-03-04 12:24:08 -08:00
bryan faff64c413 chore: agents.md update 2026-03-04 12:12:27 -08:00
Timothy 6fbcdc1d87 fix: auto install node 20 2026-03-04 12:11:29 -08:00
bryan 69a11af949 chore: best effort alignment of windows quickstart 2026-03-04 11:43:50 -08:00
bryan 9ef272020e chore: added llm key health check 2026-03-04 11:35:12 -08:00
Richard Tang 71226d9625 fix: logger schema mismatch 2026-03-04 10:55:28 -08:00
bryan 258cfe7de5 chore: added easy way to update llm provider key 2026-03-04 10:42:57 -08:00
bryan 0d53b21133 chore: doc updates about hive open 2026-03-04 10:33:34 -08:00
Richard Tang 9102328d1c feat: make LLM logger by default on 2026-03-04 10:30:17 -08:00
Fernando Mano 704a0fd63a feature(WindowsFilesystemSupport): #5677 - Windows File System Support and Testing -- remove testing code and prepare for PR 2026-03-04 15:09:06 -03:00
bryan 0ccb28ffab fix: enter to use previously configured 2026-03-04 10:05:59 -08:00
Fernando Mano bf4101ac38 feature(WindowsFilesystemSupport): #5677 - Windows File System Support and Testing -- remove testing code and prepare for PR 2026-03-04 15:03:02 -03:00
bryan b30b571b44 chore: update recommended models 2026-03-04 09:54:29 -08:00
bryan bc44c3a401 chore: make gcu enabled by default 2026-03-04 09:52:42 -08:00
bryan 7fbf57cbb7 fix: linter update 2026-03-04 09:52:16 -08:00
Fernando Mano bc349e8fde feature(WindowsFilesystemSupport): #5677 - Windows File System Support and Testing -- remove testing code and prepare for PR 2026-03-04 14:41:42 -03:00
bryan 67d094f51a fix: tool tests 2026-03-04 09:22:34 -08:00
bryan 873af04c6e fix: utilize mac keychain for claude code subscription 2026-03-04 09:22:12 -08:00
Shaurya Singh 2f0439dca8 Merge branch 'main' into fix/windows-browser-auto-open 2026-03-04 22:50:39 +05:30
Fernando Mano 8470c6a980 feature(WindowsFilesystemSupport): #5677 - Windows File System Support and Testing 2026-03-04 14:16:48 -03:00
Levin 43092ba1d7 Merge branch 'aden-hive:main' into bug/agent-sessions 2026-03-04 22:33:40 +05:30
bryan 1920192656 feat: hive open cmd 2026-03-04 08:55:18 -08:00
bryan 61487db481 chore: linter fixes 2026-03-04 08:44:04 -08:00
Waryjustice f56feaf821 fix: browser auto-open after quickstart does not work on Windows 2026-03-04 22:12:53 +05:30
Timothy @aden 4cbd5a4c6c Merge pull request #5786 from osb910/fix/charmap-decode-error
fix(core): add utf-8 encoding to backend open calls (micro-fix)
2026-03-04 08:39:10 -08:00
Timothy 65aa5629e8 chore: fix lint 2026-03-04 08:34:01 -08:00
bryan c42c8ba505 Merge branch 'main' into feat/google-scopes 2026-03-04 08:25:29 -08:00
Omar Shareef 7193d09bed formatting warning fix 2026-03-04 16:43:46 +02:00
Omar Shareef 49f8fae0b4 fix: systematically enforce UTF-8 encoding across tools and core to fix Windows charmap decode errors 2026-03-04 16:04:53 +02:00
Omar Shareef e1a490756e fix: systematically enforce UTF-8 encoding across tools and core to fix Windows charmap decode errors 2026-03-04 15:58:03 +02:00
code-Miracle49 c313ea7ee2 micro-fix: remove duplicate _execute_subagent method in EventLoopNode 2026-03-04 12:54:43 +01:00
Omar Shareef 91bfaf36e3 fix(core): add utf-8 encoding to backend open calls
This fixes a charmap decoding error on Windows when opening agent files without explicitly specifying the encoding.
2026-03-04 13:32:59 +02:00
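The charmap error arises because `open()` without an `encoding` argument uses the platform's locale encoding, which on Windows is often a legacy code page such as cp1252. A minimal sketch of the fix pattern, assuming a hypothetical helper `load_agent_json` (not a function from the repository):

```python
import json


def load_agent_json(path):
    """Read a JSON file with an explicit encoding.

    Passing encoding="utf-8" makes the result independent of the
    platform default, so non-ASCII content decodes the same everywhere.
    """
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

The same explicit `encoding="utf-8"` applies to write calls, so files round-trip identically across operating systems.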
levxn e3ea9212dd latest upstream and MC resolved in workspace.tsx 2026-03-04 12:08:14 +05:30
kostasuser01gr 99d41d8cc6 docs: update roadmap to reflect completed features (refs #4780) 2026-03-04 08:37:14 +02:00
levxn 8988c1e760 session management and ability to converse from where the chat was left off, fix v1 2026-03-04 11:40:44 +05:30
Timothy @aden 465adf5b1f Merge pull request #5767 from aden-hive/feat/integrations
Feat/integrations
2026-03-03 22:04:08 -08:00
RichardTang-Aden 132d00d166 Merge pull request #5769 from aden-hive/queen-mode-separation
Queen mode separation: building, staging, and running modes
2026-03-03 21:31:23 -08:00
bryan b1a5f8e730 chore: tool test fixes 2026-03-03 21:01:19 -08:00
Richard Tang a604fee3aa chore: mode label update 2026-03-03 20:47:35 -08:00
Timothy 8018325923 style: fix all ruff lint errors (E501, E722, E741, F841)
- Break long lines (E501) across 25+ files
- Replace bare except with except Exception (E722)
- Rename ambiguous variable `l` to `item` (E741)
- Prefix unused variables with underscore (F841)
2026-03-03 20:42:30 -08:00
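For readers unfamiliar with the rule codes, the sketch below shows the shape of three of the fixes listed in this commit. The functions are hypothetical examples written for illustration, not code from the repository.

```python
def parse_ints(items):
    """Parse what can be parsed, skipping entries that are not integers."""
    parsed = []
    for item in items:  # E741: loop variable was the ambiguous name `l`
        try:
            parsed.append(int(item))
        except Exception:  # E722: was a bare `except:`
            continue
    return parsed


def response_body(response):
    """Return the body of a response dictionary."""
    # F841: the local is assigned but never read, so it is prefixed with
    # an underscore to mark it as intentionally unused.
    _headers = response.get("headers")
    return response["body"]
```

Unlike a bare `except:`, `except Exception` does not catch `KeyboardInterrupt` or `SystemExit`, so interrupts still propagate.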
Richard Tang 3f86bd4009 chore: lint fix 2026-03-03 20:39:04 -08:00
Timothy b4cf10214b chore: lint issues 2026-03-03 20:38:30 -08:00
Bryan @ Aden c7818c2c33 Merge pull request #5766 from aden-hive/fix/credential-modal-delete
(micro-fix): Fix/credential modal delete
2026-03-04 04:38:23 +00:00
Timothy e421bcc326 chore: lint issues 2026-03-03 20:36:28 -08:00
Richard Tang 09e5a4dcc0 chore: frontend verbiage 2026-03-03 20:31:26 -08:00
Richard Tang ce08c44235 feat: improve ui indicator 2026-03-03 20:28:32 -08:00
Richard Tang e743234324 fix: strengthen prompt to collect user intent 2026-03-03 20:23:53 -08:00
Timothy 9b76ac48b7 chore: new dependency 2026-03-03 20:23:10 -08:00
bryan 06a9adb051 chore: linter fix 2026-03-03 20:15:42 -08:00
Richard Tang 6ae16345a8 fix: reference err from merging 2026-03-03 20:15:37 -08:00
Richard Tang 8daaf000b1 Merge remote-tracking branch 'origin/feat/question-widget' into queen-mode-separation 2026-03-03 20:09:10 -08:00
bryan 9ce753055c feat: meeting scheduler agent 2026-03-03 20:01:58 -08:00
bryan 0ce87b5155 refactor: update calendar list events tool 2026-03-03 20:01:42 -08:00
Richard Tang 273f411eee feat: replace the reload agent to stop worker 2026-03-03 20:01:27 -08:00
Richard Tang 6929cecf8a fix: tag for frontend 2026-03-03 19:53:18 -08:00
Richard Tang 9221a7ff03 Merge remote-tracking branch 'origin/queen-mode-separation' into queen-mode-separation 2026-03-03 19:43:33 -08:00
Richard Tang a6089c5b3b feat: returning queen bee status when starting session 2026-03-03 19:43:04 -08:00
Richard Tang a7ee972b32 feat: enable the frontend to cancel the current queen run and sync queen mode 2026-03-03 19:30:55 -08:00
Richard Tang c817989b99 feat: allow frontend change to control mode 2026-03-03 19:29:33 -08:00
Richard Tang 2272a6854c refactor: consolidate discorver_mcp_tools and list_agent_tools 2026-03-03 19:08:58 -08:00
Timothy 040fc1ee8d feat: corrected agent generation guidelines 2026-03-03 18:53:40 -08:00
Richard Tang f00b8d7b8c fix: update the initial state condition 2026-03-03 18:35:24 -08:00
Timothy @aden 6c8c6d7048 Merge pull request #5234 from Antiarin/fix/guardian-self-trigger-loop
fix(tui): fix pause/stop to cancel all running tasks across all graphs
2026-03-03 18:17:15 -08:00
Richard Tang f27ef52c7a feat: update queen initial state 2026-03-03 18:15:51 -08:00
Richard Tang 0a2ff1db97 feat: new queen stages and tools 2026-03-03 18:07:47 -08:00
Timothy 6da48eac6f feat: split tool loading into verified and unverified tiers
register_all_tools() now only loads verified (stable) tools by default.
Pass include_unverified=True to also load new/community integrations.
This prevents unverified tools from being loaded in production.

Also fixes duplicate register_brevo and register_pushover calls.
2026-03-03 17:54:45 -08:00
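The two-tier loading described here could be sketched as follows. Only the names `register_all_tools` and `include_unverified` come from the commit message; the registry shape, the tier dictionaries, and the tool names are assumptions for illustration, and the real function's signature may differ.

```python
from typing import Callable, Dict, List

# Hypothetical tiers: tool name -> zero-argument callable that runs the tool.
VERIFIED_TOOLS: Dict[str, Callable[[], str]] = {
    "example_verified": lambda: "verified tool ran",
}
UNVERIFIED_TOOLS: Dict[str, Callable[[], str]] = {
    "example_community": lambda: "community tool ran",
}


def register_all_tools(
    registry: Dict[str, Callable[[], str]],
    include_unverified: bool = False,
) -> List[str]:
    """Register verified tools, plus unverified ones only when asked to.

    Returns the sorted names present in the registry afterwards.
    """
    tiers = [VERIFIED_TOOLS]
    if include_unverified:
        tiers.append(UNVERIFIED_TOOLS)
    for tier in tiers:
        for name, tool in tier.items():
            # Skipping names already present avoids duplicate registration.
            registry.setdefault(name, tool)
    return sorted(registry)
```

Defaulting `include_unverified` to `False` means a caller has to opt in explicitly before any unverified integration is loaded.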
Timothy 638ff04e24 fix: remove duplicate community tool directories and fix credential wiring
- Remove s3_tool (duplicate of aws_s3_tool), power_bi_tool (duplicate of
  powerbi_tool), x_tool (duplicate of twitter_tool)
- Remove integrations/plaid (duplicate of plaid_tool), integrations/sap_s4hana
  (duplicate of sap_tool), stray tools/mssql.py
- Add help key to credential error responses across 14 tool modules
- Fix health checker registry keys (calendly -> calendly_pat, lusha -> lusha_api_key)
- Add health_check_endpoint to calendly and lusha credential specs
- Fix Trello env var (TRELLO_TOKEN -> TRELLO_API_TOKEN) and remove duplicate
  Trello specs from hubspot.py
- Add credential_group="aws" to AWS S3 and Redshift specs sharing env vars
- Update conftest UNREGISTERED_COMMUNITY_MODULES to only contain mssql_tool
2026-03-03 17:46:28 -08:00
bryan d0e7aa14b6 fix: hide delete button for Aden-managed credentials 2026-03-03 17:36:04 -08:00
bryan 59fee56c54 fix: share server credential store with runner to avoid redundant Aden syncs 2026-03-03 17:35:24 -08:00
bryan 2207306169 fix: resolve MCP server cwd from repo root instead of agent path 2026-03-03 17:34:51 -08:00
Richard Tang 8ff2e91f2d feat: add queen agent building and running mode switching 2026-03-03 16:01:41 -08:00
bryan 730370a007 test: update calendar and health check tests 2026-03-03 15:42:22 -08:00
bryan f87909109c refactor: simplify health check system 2026-03-03 15:42:07 -08:00
bryan d6a6d8b5ef refactor: unify Google OAuth to single credential 2026-03-03 15:41:53 -08:00
bryan 57563abfa7 feat: add Google Sheets tool 2026-03-03 15:41:26 -08:00
RichardTang-Aden 4066962ade Merge pull request #5751 from aden-hive/load-new-session-from-home
Fix new session from home and add email reply agent template
2026-03-03 14:48:17 -08:00
Richard Tang 0f26e34f09 fix: improve the reply template 2026-03-03 14:45:07 -08:00
Richard Tang d76e436e3d fix: new session should have their own id 2026-03-03 14:44:51 -08:00
Timothy 4ff531dec7 fix: update expected health checkers set (add calendly, zoho_crm) 2026-03-03 14:10:34 -08:00
Timothy 4f8b3d7aff fix: update credential specs for community Linear/Trello tools, skip unregistered community modules 2026-03-03 14:09:04 -08:00
Timothy 210fa9c474 fix: use community Brevo implementation (6 tools), remove orphaned x_tool test 2026-03-03 14:06:00 -08:00
Timothy 25361cac8c fix: align tests with community implementations, revert Reddit to httpx (praw unavailable) 2026-03-03 14:02:33 -08:00
Timothy 28defebd6d fix: remove community youtube_transcript tool.py requiring uninstalled SDK 2026-03-03 13:58:45 -08:00
Timothy d58f3103dd fix: guard register_tools for s3_tool and mssql_tool when SDK not available 2026-03-03 13:54:46 -08:00
Timothy 5d1ed35660 fix: remove shell heredoc artifacts from community power_bi_tool 2026-03-03 13:52:20 -08:00
Timothy 1f3e305534 fix: guard optional SDK imports (boto3, pyodbc) and remove s3_tool registration 2026-03-03 13:51:04 -08:00
Timothy 7d8fdd279c fix: revert Asana to httpx-based implementation (asana SDK not available) 2026-03-03 13:33:35 -08:00
Timothy bb061b770f merge: incorporate QuickBooks community PR #4158
# Conflicts:
#	examples/templates/deep_research_agent/config.py
#	examples/templates/tech_news_reporter/config.py
#	tools/README.md
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/credentials/quickbooks.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/quickbooks_tool/__init__.py
#	tools/src/aden_tools/tools/quickbooks_tool/quickbooks_tool.py
#	tools/tests/tools/test_quickbooks_tool.py
2026-03-03 13:27:04 -08:00
Timothy a8768b9ed6 merge: incorporate MSSQL community PR #4200
# Conflicts:
#	tools/pyproject.toml
#	tools/src/aden_tools/credentials/integrations.py
#	tools/src/aden_tools/tools/__init__.py
2026-03-03 13:26:36 -08:00
Timothy b437aa5f6c merge: incorporate Linear community PR #3585
# Conflicts:
#	.claude/skills/hive-credentials/SKILL.md
#	tools/README.md
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/linear_tool/__init__.py
#	tools/src/aden_tools/tools/linear_tool/linear_tool.py
2026-03-03 13:24:57 -08:00
Timothy 9248182570 merge: incorporate Trello community PR #3376
# Conflicts:
#	tools/README.md
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/trello_tool/__init__.py
#	tools/src/aden_tools/tools/trello_tool/trello_tool.py
#	tools/tests/tools/test_trello_tool.py
2026-03-03 13:24:23 -08:00
Timothy 7c77c7170f merge: incorporate YouTube Transcript community PR #3520
# Conflicts:
#	tools/pyproject.toml
#	tools/src/aden_tools/tools/__init__.py
2026-03-03 13:22:46 -08:00
Timothy 85fcb6516c merge: incorporate Redshift community PR #3533
# Conflicts:
#	tools/pyproject.toml
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/redshift_tool/__init__.py
#	tools/src/aden_tools/tools/redshift_tool/redshift_tool.py
#	tools/tests/tools/test_redshift_tool.py
2026-03-03 13:17:41 -08:00
Timothy e8e76d85f7 merge: incorporate Pushover community PR #5424
# Conflicts:
#	tools/src/aden_tools/tools/pushover_tool/__init__.py
#	tools/src/aden_tools/tools/pushover_tool/pushover_tool.py
2026-03-03 13:17:18 -08:00
Timothy 5aaa5ae4d5 merge: incorporate Twitter/X community PR #3807
# Conflicts:
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/tests/test_credentials.py
2026-03-03 13:16:45 -08:00
Timothy c3a8ee9c7b merge: incorporate Calendly community PR #3947
# Conflicts:
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/credentials/calendly.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/calendly_tool/__init__.py
#	tools/src/aden_tools/tools/calendly_tool/calendly_tool.py
#	tools/tests/test_health_checks.py
#	tools/tests/tools/test_calendly_tool.py
2026-03-03 13:14:20 -08:00
Timothy 5d07a8aba5 merge: incorporate Airtable community PR #3953
# Conflicts:
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/credentials/airtable.py
#	tools/src/aden_tools/credentials/health_check.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/airtable_tool/__init__.py
#	tools/src/aden_tools/tools/airtable_tool/airtable_tool.py
#	tools/tests/test_health_checks.py
#	tools/tests/tools/test_airtable_tool.py
2026-03-03 13:13:47 -08:00
Timothy d18e0594b8 merge: incorporate Reddit community PR #3963
# Conflicts:
#	tools/pyproject.toml
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/credentials/health_check.py
#	tools/src/aden_tools/credentials/reddit.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/reddit_tool/__init__.py
#	tools/src/aden_tools/tools/reddit_tool/reddit_tool.py
#	tools/tests/tools/test_reddit_tool.py
#	uv.lock
2026-03-03 13:12:55 -08:00
Timothy 26dcc86a24 merge: incorporate Zoho CRM community PR #4713
# Conflicts:
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/zoho_crm_tool/__init__.py
#	tools/src/aden_tools/tools/zoho_crm_tool/zoho_crm_tool.py
#	tools/tests/test_health_checks.py
2026-03-03 13:11:51 -08:00
Timothy e928ad19e5 merge: incorporate Lusha community PR #4714
# Conflicts:
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/credentials/lusha.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/lusha_tool/__init__.py
#	tools/src/aden_tools/tools/lusha_tool/lusha_tool.py
#	tools/tests/tools/test_lusha_tool.py
2026-03-03 13:11:33 -08:00
Timothy 6768aaa575 merge: incorporate Apify community PR #4770
# Conflicts:
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/credentials/apify.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/apify_tool/__init__.py
#	tools/src/aden_tools/tools/apify_tool/apify_tool.py
#	tools/tests/tools/test_apify_tool.py
2026-03-03 13:10:45 -08:00
Timothy f561aacbfc merge: incorporate Attio community PR #4832
# Conflicts:
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/credentials/attio.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/attio_tool/__init__.py
#	tools/src/aden_tools/tools/attio_tool/attio_tool.py
2026-03-03 13:10:09 -08:00
RichardTang-Aden af1ece40c2 Merge pull request #5742 from aden-hive/load-new-session-from-home
Load new session from home
2026-03-03 13:09:44 -08:00
Timothy d9edd7adf7 merge: incorporate Asana community PR #4857
# Conflicts:
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/credentials/asana.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/asana_tool/__init__.py
#	tools/tests/tools/test_asana_tool.py
2026-03-03 13:08:30 -08:00
Richard Tang 3541fab363 feat: add uv instruction to agents 2026-03-03 13:06:50 -08:00
Richard Tang 1160dceeff feat: agents.md for agent collaboration 2026-03-03 13:06:09 -08:00
Timothy b4a5323009 merge: incorporate Brevo community PR #5136
# Conflicts:
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/credentials/brevo.py
#	tools/src/aden_tools/tools/brevo_tool/__init__.py
#	tools/src/aden_tools/tools/brevo_tool/brevo_tool.py
2026-03-03 13:04:29 -08:00
Timothy ade8b5b9a7 merge: incorporate Databricks community PR #5428
# Conflicts:
#	tools/src/aden_tools/credentials/__init__.py
#	tools/src/aden_tools/credentials/databricks.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/src/aden_tools/tools/databricks_tool/__init__.py
#	tools/src/aden_tools/tools/databricks_tool/databricks_tool.py
#	tools/tests/tools/test_databricks_tool.py
2026-03-03 13:02:30 -08:00
Timothy e4ace3d484 merge: incorporate YouTube community PR #5673 (resolve conflicts, preserve README) 2026-03-03 12:29:32 -08:00
Timothy f3dd25adc5 merge: incorporate Power BI community PR #4341 2026-03-03 12:27:06 -08:00
Timothy ec251f8168 merge: incorporate SAP S/4HANA community PR #5519 2026-03-03 12:27:02 -08:00
Timothy 1bb9579dc5 merge: incorporate Plaid community PR #5518 2026-03-03 12:26:56 -08:00
Timothy 7ebf4146ce merge: incorporate AWS S3 community PR #5521 2026-03-03 12:26:50 -08:00
Richard Tang a8db4cb2f5 fix: mcp path 2026-03-03 12:19:32 -08:00
Richard Tang 24433396dd feat: use send instead of draft for email reply agent 2026-03-03 12:04:44 -08:00
Richard Tang 02bdf17641 chore: move the email reply sample agent 2026-03-03 11:59:14 -08:00
Timothy e0e05f3488 chore: register Obsidian tool in tool/credential registries 2026-03-03 11:55:12 -08:00
Timothy c92f2510c8 test: add Obsidian tool unit tests (read, write, append, search, list, active) 2026-03-03 11:55:12 -08:00
Timothy ea1fbe9ee1 chore: add Obsidian credential spec (REST API key) 2026-03-03 11:55:11 -08:00
Timothy 84a0be0179 feat: add Obsidian knowledge management integration (#3741)
6 tools: obsidian_read_note, obsidian_write_note, obsidian_append_note,
obsidian_search, obsidian_list_files, obsidian_get_active.
Uses Local REST API plugin with Bearer token auth. Supports vault
browsing, full-text search, and note CRUD with frontmatter metadata.
2026-03-03 11:55:04 -08:00
RichardTang-Aden 54f5c0dc91 Merge pull request #5735 from aden-hive/docs/readme/v6
docs: reorder section in documentation
2026-03-03 11:54:09 -08:00
Richard Tang adf1a10318 docs: reorder section in documentation 2026-03-03 11:53:05 -08:00
RichardTang-Aden e2a679a265 Merge pull request #5734 from aden-hive/docs/readme/v6
docs: add running screenshot and update the coding agent instruction
2026-03-03 11:50:56 -08:00
Richard Tang a3916a6932 docs: add running screenshot and update the coding agent instruction 2026-03-03 11:49:19 -08:00
Timothy 1b5780461e chore: register Langfuse tool in tool/credential registries 2026-03-03 11:42:49 -08:00
Timothy c8d35b63a4 test: add Langfuse tool unit tests (traces, scores, prompts) 2026-03-03 11:42:49 -08:00
Timothy feb1ebae04 chore: add Langfuse credential specs (public key, secret key) 2026-03-03 11:42:48 -08:00
Timothy efe49d0a5b feat: add Langfuse LLM observability integration (#5322)
6 tools: langfuse_list_traces, langfuse_get_trace, langfuse_list_scores,
langfuse_create_score, langfuse_list_prompts, langfuse_get_prompt.
Uses HTTP Basic Auth with public/secret key pair. Supports cloud and
self-hosted instances with offset-based pagination.
2026-03-03 11:41:11 -08:00
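HTTP Basic Auth with a key pair amounts to base64-encoding `username:password` into an `Authorization` header. A minimal sketch, assuming the public key serves as the username and the secret key as the password; the helper name `basic_auth_header` is hypothetical and not taken from the tool's source.

```python
import base64
from typing import Dict


def basic_auth_header(public_key: str, secret_key: str) -> Dict[str, str]:
    """Build an HTTP Basic Authorization header from a key pair.

    Assumption: public key is the username, secret key is the password.
    """
    raw = f"{public_key}:{secret_key}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

Most HTTP clients accept such a header dictionary directly, or build the same value themselves from a username and password pair.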
Timothy e50a5ea22a chore: register Zoom and n8n tools in tool/credential registries 2026-03-03 11:31:25 -08:00
Timothy 6382c94d0a test: add n8n tool unit tests (workflows, executions, activate/deactivate) 2026-03-03 11:31:21 -08:00
Timothy 58ce84c9cc chore: add n8n credential specs (API key, base URL) 2026-03-03 11:31:20 -08:00
Timothy 08fd6ff765 feat: add n8n workflow automation integration (#2931)
6 tools: n8n_list_workflows, n8n_get_workflow, n8n_activate_workflow,
n8n_deactivate_workflow, n8n_list_executions, n8n_get_execution.
Uses X-N8N-API-KEY header auth with configurable base URL.
Supports cursor-based pagination and execution status filtering.
2026-03-03 11:31:15 -08:00
Timothy a9cb79909c test: add Zoom tool unit tests (user, meetings, recordings) 2026-03-03 11:31:07 -08:00
Timothy 852f8ccd94 chore: add Zoom credential spec (Server-to-Server OAuth token) 2026-03-03 11:31:07 -08:00
Timothy 9388ef3e99 feat: add Zoom meeting management integration (#2867)
6 tools: zoom_get_user, zoom_list_meetings, zoom_get_meeting,
zoom_create_meeting, zoom_delete_meeting, zoom_list_recordings.
Uses Server-to-Server OAuth Bearer token. Supports token-based
pagination and cloud recording retrieval by date range.
2026-03-03 11:31:00 -08:00
Timothy 04afb0c4bb chore: register Salesforce and Shopify tools in tool/credential registries 2026-03-03 11:22:40 -08:00
Timothy a07fd44de3 test: add Shopify tool unit tests (orders, products, customers, search) 2026-03-03 11:22:35 -08:00
Timothy f6c1b13846 chore: add Shopify credential specs (access token, store name) 2026-03-03 11:22:35 -08:00
Timothy 654fa3dd1f feat: add Shopify Admin REST API integration - orders, products, customers (#2984)
6 tools: shopify_list_orders, shopify_get_order, shopify_list_products,
shopify_get_product, shopify_list_customers, shopify_search_customers.
Uses X-Shopify-Access-Token header auth with store subdomain.
2026-03-03 11:22:29 -08:00
Timothy 8183449d27 test: add Salesforce CRM tool unit tests (SOQL, CRUD, describe, list objects) 2026-03-03 11:22:16 -08:00
Timothy a9acfb86ad chore: add Salesforce credential specs (access token, instance URL) 2026-03-03 11:22:15 -08:00
Timothy d7d070ac5f feat: add Salesforce CRM integration - SOQL, records, and metadata (#2916)
6 tools: salesforce_soql_query, salesforce_get_record, salesforce_create_record,
salesforce_update_record, salesforce_describe_object, salesforce_list_objects.
Uses OAuth2 Bearer token auth with instance URL. Supports pagination via
nextRecordsUrl and field-level describe with picklist values.
2026-03-03 11:22:08 -08:00
RichardTang-Aden ead51f1eb6 Merge pull request #5732 from aden-hive/docs/readme/v6
docs: update README and sync all i18n translations
2026-03-03 11:19:06 -08:00
Timothy 8c01b573ce chore: register Redshift and SAP S/4HANA in tool/credential registries 2026-03-03 11:11:12 -08:00
Timothy 7744f21b9d test: add SAP S/4HANA tool unit tests (POs, partners, products, sales orders) 2026-03-03 11:11:08 -08:00
Timothy 9ed23a235f chore: add SAP S/4HANA credential specs (base URL, username, password) 2026-03-03 11:11:07 -08:00
Timothy e88328321f feat: add SAP S/4HANA Cloud read-only procurement integration (#3182) 2026-03-03 11:11:06 -08:00
Timothy a4c516bea1 test: add Redshift tool unit tests (execute, describe, results, databases, tables) 2026-03-03 11:11:00 -08:00
Timothy 1c932a04ef chore: add Redshift credential specs (AWS access key, secret key) 2026-03-03 11:11:00 -08:00
Timothy 76d34be4c2 feat: add Amazon Redshift Data API integration - SQL and schema browsing (#3267) 2026-03-03 11:10:59 -08:00
Timothy d6e8afe316 chore: register Azure SQL and Kafka in tool/credential registries 2026-03-03 11:03:31 -08:00
Timothy a04f2bcf99 test: add Kafka tool unit tests (topics, produce, consumer groups) 2026-03-03 11:03:27 -08:00
Timothy c138e7c638 chore: add Kafka credential specs (REST URL, cluster ID) 2026-03-03 11:03:27 -08:00
Timothy fc08c7007f feat: add Apache Kafka integration via Confluent REST Proxy (#4774) 2026-03-03 11:03:26 -08:00
Timothy d559bb3446 test: add Azure SQL tool unit tests (servers, databases, firewall rules) 2026-03-03 11:03:18 -08:00
Timothy 55a8c39e4b chore: add Azure SQL credential specs (token, subscription ID) 2026-03-03 11:03:17 -08:00
Timothy 02d6f10e5f feat: add Azure SQL Database management integration (#3377) 2026-03-03 11:03:16 -08:00
Timothy 77428a91cc chore: register Power BI and Snowflake in tool/credential registries 2026-03-03 10:56:46 -08:00
Timothy 51403dc276 test: add Snowflake tool unit tests (execute, status, cancel) 2026-03-03 10:56:43 -08:00
Timothy 914a07a35d chore: add Snowflake credential specs (account, token) 2026-03-03 10:56:42 -08:00
Timothy 3c70d7b424 feat: add Snowflake SQL REST API integration (#3230) 2026-03-03 10:56:41 -08:00
Timothy ce1ee4ff17 test: add Power BI tool unit tests (workspaces, datasets, reports, refresh) 2026-03-03 10:56:35 -08:00
Timothy fca41d9bda chore: add Power BI credential spec (POWERBI_ACCESS_TOKEN) 2026-03-03 10:56:34 -08:00
Timothy ff889e02f7 feat: add Power BI integration - workspaces, datasets, reports (#3973) 2026-03-03 10:56:34 -08:00
Richard Tang cbd2c86bbf docs: sync all i18n READMEs with primary README 2026-03-03 10:53:11 -08:00
Timothy 43ab460462 chore: register Terraform Cloud and Lusha in tool/credential registries 2026-03-03 10:49:21 -08:00
Timothy caa06e266b test: add Lusha tool unit tests (enrich, search, usage) 2026-03-03 10:49:17 -08:00
Timothy 3622ca78ee chore: add Lusha credential spec (LUSHA_API_KEY) 2026-03-03 10:49:17 -08:00
Timothy 019e3f9659 feat: add Lusha B2B contact and company enrichment integration (#3461) 2026-03-03 10:49:16 -08:00
Timothy 208cb579a2 test: add Terraform Cloud tool unit tests (workspaces, runs) 2026-03-03 10:49:09 -08:00
Timothy 17de7e4485 chore: add Terraform Cloud credential spec (TFC_TOKEN) 2026-03-03 10:49:08 -08:00
Timothy 810616eee1 feat: add Terraform Cloud integration - workspaces and runs (#4773) 2026-03-03 10:48:41 -08:00
Timothy 191f583669 chore: register Twitter/X and Tines in tool/credential registries 2026-03-03 10:35:46 -08:00
Timothy 1d638cc18e test: add Tines tool unit tests (stories, actions, logs) 2026-03-03 10:35:42 -08:00
Timothy 3efa1f3b88 chore: add Tines credential specs (domain, api_key) 2026-03-03 10:35:42 -08:00
Timothy 4daa33db09 feat: add Tines integration - security automation stories and actions
Implements 5 tools via Tines REST API:
- tines_list_stories: List workflow stories with search/filter
- tines_get_story: Get story details including entry/exit agents
- tines_list_actions: List actions (agents) in stories
- tines_get_action: Get action details with sources/receivers
- tines_get_action_logs: Get action execution logs by level

Uses Bearer token auth with tenant domain.
2026-03-03 10:35:37 -08:00
Timothy fab2fb0056 test: add Twitter/X tool unit tests (search, user, timeline, tweet) 2026-03-03 10:35:29 -08:00
Timothy ce885c120e chore: add Twitter/X credential spec (bearer_token) 2026-03-03 10:35:28 -08:00
Timothy 75b53c47ff feat: add Twitter/X integration - tweet search and user lookup via API v2
Implements 4 tools via X API v2:
- twitter_search_tweets: Search recent tweets with query operators
- twitter_get_user: Get user profile by username
- twitter_get_user_tweets: Get user timeline
- twitter_get_tweet: Get tweet details by ID

Uses Bearer token auth (app-only, read access).
2026-03-03 10:35:21 -08:00
Timothy 2936f73707 chore: register AWS S3 and QuickBooks in tool/credential registries 2026-03-03 10:22:46 -08:00
Timothy e26426b138 test: add QuickBooks tool unit tests (query, entities, invoices) 2026-03-03 10:22:42 -08:00
Timothy 62cacb8e28 chore: add QuickBooks credential specs (access_token, realm_id) 2026-03-03 10:22:42 -08:00
Timothy f3e37190ce feat: add QuickBooks Online integration - accounting API
Implements 5 tools via QuickBooks Online API v3:
- quickbooks_query: Query entities with SQL-like syntax
- quickbooks_get_entity: Get entity by type and ID
- quickbooks_create_customer: Create customers
- quickbooks_create_invoice: Create invoices with line items
- quickbooks_get_company_info: Get company details

Uses OAuth 2.0 Bearer token auth. Supports sandbox mode.
2026-03-03 10:22:35 -08:00
Timothy 0863bbbd2f test: add AWS S3 tool unit tests (buckets, objects, get, put, delete) 2026-03-03 10:22:25 -08:00
Timothy b23fa1daad chore: add AWS S3 credential specs (access_key_id, secret_access_key) 2026-03-03 10:22:24 -08:00
Timothy 05cc1ce599 feat: add AWS S3 integration - object storage via REST API with SigV4
Implements 5 tools via AWS S3 REST API:
- s3_list_buckets: List all buckets in the account
- s3_list_objects: List objects with prefix/delimiter filtering
- s3_get_object: Get object content and metadata
- s3_put_object: Upload text objects
- s3_delete_object: Delete objects

Uses AWS Signature V4 signing (no boto3 dependency).
2026-03-03 10:22:16 -08:00
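Signing requests without boto3 means deriving the Signature V4 signing key by hand. The sketch below shows the standard HMAC-SHA256 key-derivation chain using only the standard library. The function names are hypothetical and this is not the tool's actual code; building the canonical request and the string to sign is omitted.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of msg under key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(
    secret_key: str, date_stamp: str, region: str, service: str = "s3"
) -> bytes:
    """Derive the SigV4 signing key.

    date_stamp is the request date as YYYYMMDD. Each step keys the next
    HMAC with the previous result: date, region, service, then the
    literal terminator "aws4_request".
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature placed in the Authorization header."""
    return hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()
```

Because the key depends on date, region, and service, a derived key is valid only within that scope; the long-term secret itself is never sent with the request.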
RichardTang-Aden a1c045fd91 Merge pull request #5727 from aden-hive/docs/readme/v6
Docs: Remove TUI references from README
2026-03-03 10:14:13 -08:00
Timothy e6939f8d51 chore: register PagerDuty and Calendly in tool/credential registries 2026-03-03 10:13:18 -08:00
Timothy 801fef12e1 test: add Calendly tool unit tests (user, events, invitees) 2026-03-03 10:13:14 -08:00
Timothy 5845629175 chore: add Calendly credential spec (personal_access_token) 2026-03-03 10:13:13 -08:00
Timothy 11b916301a feat: add Calendly integration - scheduling events and invitees
Implements 5 tools via Calendly API v2:
- calendly_get_current_user: Get user URI and profile info
- calendly_list_event_types: List meeting templates
- calendly_list_scheduled_events: List booked meetings with date filters
- calendly_get_scheduled_event: Get event details by URI
- calendly_list_invitees: List invitees for an event

Uses Bearer token auth (Personal Access Token).
2026-03-03 10:13:07 -08:00
Timothy aa5d80b1d2 test: add PagerDuty tool unit tests (incidents, services) 2026-03-03 10:13:02 -08:00
Timothy aa5f990acd chore: add PagerDuty credential specs (api_key, from_email) 2026-03-03 10:13:01 -08:00
Timothy 9764c82c2a feat: add PagerDuty integration - incident management and services
Implements 5 tools via PagerDuty REST API v2:
- pagerduty_list_incidents: List incidents with status/urgency/date filters
- pagerduty_get_incident: Get incident details by ID
- pagerduty_create_incident: Create incidents on a service
- pagerduty_update_incident: Acknowledge or resolve incidents
- pagerduty_list_services: List services with name search

Uses Token auth header, From header for write operations.
2026-03-03 10:12:55 -08:00
Richard Tang f921846879 docs: update the latest features from recent changes 2026-03-03 10:12:43 -08:00
Richard Tang a370403b16 docs: update readme instructions 2026-03-03 10:06:13 -08:00
Timothy 543a71eb6c chore: register MongoDB and Airtable in tool/credential registries 2026-03-03 10:06:12 -08:00
Timothy 8285593c13 test: add Airtable tool unit tests (records, bases, schema) 2026-03-03 10:06:08 -08:00
Timothy 6fbfe773fb chore: add Airtable credential spec (personal_access_token) 2026-03-03 10:06:07 -08:00
Timothy a8c54b1e5f feat: add Airtable integration - record CRUD and base metadata
Implements 6 tools via Airtable Web API:
- airtable_list_records: List records with filters, sort, field selection
- airtable_get_record: Get a single record by ID
- airtable_create_records: Create up to 10 records per request
- airtable_update_records: Partial update up to 10 records per request
- airtable_list_bases: List accessible bases
- airtable_get_base_schema: Get table and field schema for a base

Uses Bearer token auth (Personal Access Token).
2026-03-03 10:06:03 -08:00
Timothy a5323abfca test: add MongoDB tool unit tests (find, insert, update, delete, aggregate) 2026-03-03 10:05:53 -08:00
Timothy ba4df2d2c4 chore: add MongoDB credential specs (data_api_url, api_key, data_source) 2026-03-03 10:05:52 -08:00
Timothy 6510633a8c feat: add MongoDB Atlas Data API integration - document CRUD and aggregation
Implements 6 tools via MongoDB Atlas Data API:
- mongodb_find: Find documents with filters, projection, sort, limit
- mongodb_find_one: Find a single document
- mongodb_insert_one: Insert a document
- mongodb_update_one: Update a document with MongoDB operators
- mongodb_delete_one: Delete a document
- mongodb_aggregate: Run aggregation pipelines

Uses API key auth header. All endpoints are POST.
2026-03-03 10:05:42 -08:00
Timothy 9172e5f46b chore: register Twilio and Zendesk in tool/credential registries 2026-03-03 09:56:14 -08:00
Timothy ed3e3848c0 test: add Zendesk tool unit tests (list, get, create, update, search) 2026-03-03 09:56:10 -08:00
Timothy ee90185d5c chore: add Zendesk credential specs (subdomain, email, api_token) 2026-03-03 09:56:09 -08:00
Timothy 6eb2633677 feat: add Zendesk integration - ticket management and search
Implements 5 tools via Zendesk Support API v2:
- zendesk_list_tickets: List tickets with status/sort filters
- zendesk_get_ticket: Get ticket details by ID
- zendesk_create_ticket: Create tickets with priority/type/tags
- zendesk_update_ticket: Update ticket fields and add comments
- zendesk_search_tickets: Search tickets with Zendesk query syntax

Uses Basic auth (email/token:api_token).
2026-03-03 09:56:00 -08:00
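The `email/token:api_token` form in the Zendesk commit above is the username and password pair that gets Base64-encoded for HTTP Basic auth. A small sketch of building that header, assuming a hypothetical helper name `zendesk_auth_header` that does not come from the commit:

```python
import base64


def zendesk_auth_header(email: str, api_token: str) -> dict:
    """Build a Basic auth header from "<email>/token:<api_token>"."""
    raw = f"{email}/token:{api_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
```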
Timothy c1f215dcf2 test: add Twilio tool unit tests (SMS, WhatsApp, list, get) 2026-03-03 09:55:50 -08:00
Timothy 97cc9a1045 chore: add Twilio credential specs (account_sid, auth_token) 2026-03-03 09:55:49 -08:00
Timothy 5f7b02a4b7 feat: add Twilio integration - SMS and WhatsApp messaging
Implements 4 tools via Twilio REST API:
- twilio_send_sms: Send SMS messages
- twilio_send_whatsapp: Send WhatsApp messages
- twilio_list_messages: List message history with filters
- twilio_get_message: Get message details by SID

Uses Basic auth (AccountSID:AuthToken), form-urlencoded POST.
2026-03-03 09:55:43 -08:00
Richard Tang ad6d504ea4 docs: remove TUI in the readme 2026-03-03 09:52:06 -08:00
Timothy e696b41a0e chore: register GitLab and Google Sheets in tool/credential registries 2026-03-03 09:49:23 -08:00
Timothy 1f9acc6135 test: add Google Sheets tool unit tests (metadata, read, batch read) 2026-03-03 09:49:23 -08:00
Timothy 7e8699cb4b chore: add Google Sheets credential spec (api_key) 2026-03-03 09:49:22 -08:00
Timothy fd4fc657d6 feat: add Google Sheets integration - read spreadsheet data via API v4
3 tools: sheets_get_spreadsheet, sheets_read_range, sheets_batch_read.
Uses API key auth for read-only access to public spreadsheets.
2026-03-03 09:49:21 -08:00
Timothy 34403648b9 test: add GitLab tool unit tests (projects, issues, MRs) 2026-03-03 09:49:15 -08:00
Timothy 3795d50eb9 chore: add GitLab credential spec (personal access token) 2026-03-03 09:49:14 -08:00
Timothy 80515dde5a feat: add GitLab integration - projects, issues, merge requests
6 tools: gitlab_list_projects, gitlab_get_project, gitlab_list_issues,
gitlab_get_issue, gitlab_create_issue, gitlab_list_merge_requests.
Supports GitLab.com and self-hosted via configurable base URL.
2026-03-03 09:49:13 -08:00
Timothy efcd296d83 chore: register Notion and Jira tools in tool/credential registries 2026-03-03 09:43:32 -08:00
Timothy 802cb292b0 test: add Jira tool unit tests (issues, projects, comments) 2026-03-03 09:43:32 -08:00
Timothy 8e55f74d73 chore: add Jira credential specs (domain, email, api_token) 2026-03-03 09:43:31 -08:00
Timothy 3d810485a0 feat: add Jira integration - issues, projects, comments via REST API v3
6 tools: jira_search_issues, jira_get_issue, jira_create_issue,
jira_list_projects, jira_get_project, jira_add_comment. Uses Basic auth
with email + API token and Atlassian Document Format for text fields.
2026-03-03 09:43:30 -08:00
Timothy 94cfd48661 test: add Notion tool unit tests (search, pages, databases) 2026-03-03 09:43:16 -08:00
Timothy 87c8e741f3 chore: add Notion credential spec (api_token) 2026-03-03 09:43:15 -08:00
Timothy d0e92ed18d feat: add Notion integration - pages, databases, and search
5 tools: notion_search, notion_get_page, notion_create_page,
notion_query_database, notion_get_database. Uses Bearer auth
with Notion internal integration token.
2026-03-03 09:43:14 -08:00
Richard Tang 88640f9222 feat: email reply sample agent 2026-03-03 09:41:20 -08:00
Timothy 1927045519 chore: register Greenhouse and YouTube Transcript in tool/credential registries 2026-03-03 09:36:47 -08:00
Timothy 68cffb86c9 test: add YouTube Transcript tool unit tests (get, list transcripts) 2026-03-03 09:36:47 -08:00
Timothy 5bec989647 feat: add YouTube Transcript integration - captions and transcript retrieval
2 tools: youtube_get_transcript, youtube_list_transcripts.
Uses youtube-transcript-api library, no API key required.
2026-03-03 09:36:46 -08:00
Timothy 66f5d2f36c test: add Greenhouse tool unit tests (jobs, candidates, applications) 2026-03-03 09:36:40 -08:00
Timothy 941f815254 chore: add Greenhouse credential spec (api_token) 2026-03-03 09:36:39 -08:00
Timothy 42afd10518 feat: add Greenhouse integration - ATS jobs, candidates, applications
6 tools: greenhouse_list_jobs, greenhouse_get_job, greenhouse_list_candidates,
greenhouse_get_candidate, greenhouse_list_applications, greenhouse_get_application.
Uses Harvest API v1 with Basic auth (API token).
2026-03-03 09:36:38 -08:00
Timothy 3efa285a59 chore: register Cloudinary and Reddit tools in tool/credential registries 2026-03-03 09:31:22 -08:00
Timothy 4f2b4172b4 test: add Reddit tool unit tests (search, posts, comments, user) 2026-03-03 09:31:18 -08:00
Timothy 0d7de71b94 chore: add Reddit credential specs (client_id, client_secret) 2026-03-03 09:31:17 -08:00
Timothy f0f5b4bede feat: add Reddit integration - search, posts, comments, user info
4 tools: reddit_search, reddit_get_posts, reddit_get_comments, reddit_get_user.
Uses OAuth2 client_credentials flow for app-only access.
2026-03-03 09:31:17 -08:00
Timothy bfd27e97d3 test: add Cloudinary tool unit tests (upload, list, get, delete, search) 2026-03-03 09:31:10 -08:00
Timothy f2def27390 chore: add Cloudinary credential specs (cloud_name, api_key, api_secret) 2026-03-03 09:31:10 -08:00
Timothy b3f7bd6cc0 feat: add Cloudinary integration - upload, manage, search media assets
5 tools: cloudinary_upload, cloudinary_list_resources, cloudinary_get_resource,
cloudinary_delete_resource, cloudinary_search. Uses Basic auth with
API key/secret and supports image, video, and raw resource types.
2026-03-03 09:31:09 -08:00
Timothy 0e8e78dc5b chore: register Trello and Confluence tools in tool/credential registries 2026-03-03 09:22:03 -08:00
Timothy b259d85776 test: add Confluence tool tests (9 tests) 2026-03-03 09:22:02 -08:00
Timothy 175d9c3b7c feat: add Confluence credential spec with Basic auth (email + API token) 2026-03-03 09:21:55 -08:00
Timothy a2a810aabf feat: add Confluence integration - spaces, pages, content search via CQL 2026-03-03 09:21:54 -08:00
Timothy 175c7cfd51 test: add Trello tool tests (12 tests) 2026-03-03 09:21:47 -08:00
Timothy 5ada973d38 feat: add Trello credential spec with API key and token auth 2026-03-03 09:21:39 -08:00
Timothy 0103276136 feat: add Trello integration - boards, lists, cards management 2026-03-03 09:21:37 -08:00
Timothy 1d9e8ec138 chore: register HuggingFace tool in tool/credential registries 2026-03-03 09:11:59 -08:00
Timothy 83ac2e71bb test: add HuggingFace tool tests (10 tests) 2026-03-03 09:11:56 -08:00
Timothy 0b35a729a7 feat: add HuggingFace credential spec with token auth 2026-03-03 09:11:55 -08:00
Timothy 56723a519a feat: add HuggingFace Hub integration - models, datasets, spaces search 2026-03-03 09:11:49 -08:00
Timothy ebff394c76 chore: register Plaid tool in tool/credential registries 2026-03-03 09:08:44 -08:00
Timothy ceecc97bc8 test: add Plaid tool tests (13 tests) 2026-03-03 09:08:40 -08:00
Timothy 313154f880 feat: add Plaid credential spec with client_id and secret auth 2026-03-03 09:08:38 -08:00
Timothy 3eb6417cdc feat: add Plaid integration - accounts, balances, transactions, institutions 2026-03-03 09:08:29 -08:00
Timothy 1b35d6ca0a chore: register Pinecone tool in tool/credential registries 2026-03-03 09:05:20 -08:00
Timothy 1d89f0ba9d test: add Pinecone tool tests (18 tests) 2026-03-03 09:05:16 -08:00
Timothy 864df0e21a feat: add Pinecone credential spec with API key auth 2026-03-03 09:05:14 -08:00
Timothy 3f626decc4 feat: add Pinecone vector database integration - indexes, vectors, queries 2026-03-03 09:05:06 -08:00
Timothy bf1760b1a9 chore: register DuckDuckGo tool in tool registry 2026-03-03 08:56:06 -08:00
Timothy 8a58ea6344 test: add DuckDuckGo tool tests (6 tests) 2026-03-03 08:56:06 -08:00
Timothy 662ff4c35f feat: add DuckDuckGo search integration - web search, news, images 2026-03-03 08:56:01 -08:00
Timothy af02352b49 chore: register Linear tool in tool/credential registries 2026-03-03 08:43:41 -08:00
Timothy db9f987d46 test: add Linear tool tests (10 tests) 2026-03-03 08:43:41 -08:00
Timothy 8490ce1389 feat: add Linear credential spec with API key auth 2026-03-03 08:43:41 -08:00
Timothy 55ea9a56a4 feat: add Linear integration - issues, projects, teams, search via GraphQL 2026-03-03 08:43:41 -08:00
Timothy bd2381b10d chore: register Asana tool in tool/credential registries 2026-03-03 08:40:02 -08:00
Timothy 443de755bd test: add Asana tool tests (12 tests) 2026-03-03 08:40:02 -08:00
Timothy 55ec5f14ee feat: add Asana credential spec with PAT auth 2026-03-03 08:40:02 -08:00
Timothy 2e019302c9 feat: add Asana integration - tasks, projects, workspaces, search 2026-03-03 08:40:02 -08:00
Timothy b1e829644b chore: register Yahoo Finance tool in tool registry 2026-03-03 08:36:20 -08:00
Timothy 18f773e91b test: add Yahoo Finance tool tests (8 tests) 2026-03-03 08:36:19 -08:00
Timothy 987cfee930 feat: add Yahoo Finance integration - quotes, history, financials, company info 2026-03-03 08:36:19 -08:00
Timothy 57f6b8498a chore: register Google Search Console tool in tool/credential registries 2026-03-03 08:34:30 -08:00
Timothy 9f0d35977c test: add Google Search Console tool tests (10 tests) 2026-03-03 08:34:30 -08:00
Timothy e5910bbf2f feat: add Google Search Console credential spec with OAuth2 auth 2026-03-03 08:34:30 -08:00
Timothy 0015bf7b38 feat: add Google Search Console integration - analytics, sitemaps, URL inspection 2026-03-03 08:34:30 -08:00
Timothy a6b9234abb chore: register Zoho CRM tool in tool/credential registries 2026-03-03 08:32:13 -08:00
Timothy 086f3942b8 test: add Zoho CRM tool tests (12 tests) 2026-03-03 08:32:13 -08:00
Timothy 924f4abede feat: add Zoho CRM credential spec with OAuth token auth 2026-03-03 08:32:13 -08:00
Timothy 02be91cb08 feat: add Zoho CRM integration - leads, contacts, deals, accounts, notes 2026-03-03 08:32:13 -08:00
Timothy c2298393ab chore: register Apify tool in tool/credential registries 2026-03-03 08:29:33 -08:00
Timothy 4b8c63bf6e test: add Apify tool tests (11 tests) 2026-03-03 08:29:33 -08:00
Timothy e089c3b72c feat: add Apify credential spec with API token auth 2026-03-03 08:29:33 -08:00
Timothy a93983b5db feat: add Apify integration - actors, runs, datasets, key-value stores 2026-03-03 08:29:27 -08:00
Timothy 20f6329004 chore: register Attio tool in tool/credential registries 2026-03-03 08:25:12 -08:00
Timothy 3c2cf71c47 test: add Attio tool tests (14 tests) 2026-03-03 08:25:08 -08:00
Timothy 56288c3137 feat: add Attio credential spec with API key auth 2026-03-03 08:25:04 -08:00
Timothy 79188921a5 feat: add Attio CRM integration - records, lists, notes, tasks 2026-03-03 08:24:58 -08:00
RichardTang-Aden 65962ddf58 Merge pull request #5709 from aden-hive/load-new-session-from-home
Fix new session creation when submitting prompt from home page
2026-03-03 08:20:20 -08:00
Timothy 5ab66008ae chore: register Pipedrive tool in tool/credential registries 2026-03-03 08:18:45 -08:00
Timothy f38c9ee049 test: add Pipedrive tool tests (16 tests) 2026-03-03 08:18:41 -08:00
Timothy 86f5e71ec2 feat: add Pipedrive credential spec with API token auth 2026-03-03 08:18:29 -08:00
Timothy 1e15cc8495 feat: add Pipedrive CRM integration - deals, contacts, orgs, activities, pipelines 2026-03-03 08:18:24 -08:00
Richard Tang bba44430c4 chore: ignore local dev skills 2026-03-03 08:17:32 -08:00
Timothy 077d82ad82 chore: register Docker Hub tool in tool/credential registries 2026-03-03 08:14:27 -08:00
Timothy e4cf7f3da2 test: add Docker Hub tool tests (9 tests) 2026-03-03 08:14:24 -08:00
Timothy e3bdc9e8d7 feat: add Docker Hub credential spec with PAT auth 2026-03-03 08:14:20 -08:00
Timothy f1c1c9aab3 feat: add Docker Hub integration - search, repos, tags, image details 2026-03-03 08:14:15 -08:00
Richard Tang 69c71d77fb fix: load-new-session from home 2026-03-03 08:09:22 -08:00
Timothy 4860739a2f chore: register Vercel in tool/credential registries (#5044) 2026-03-03 08:08:16 -08:00
Timothy 791ee40cd6 test: add Vercel tool unit tests (#5044) 2026-03-03 08:08:12 -08:00
Timothy e0191ac52b feat: add Vercel credential spec (#5044) 2026-03-03 08:08:07 -08:00
Timothy e0724df196 feat: add Vercel tool - deployments, projects, domains, env vars (#5044) 2026-03-03 08:08:00 -08:00
Timothy 2a56294638 chore: register Databricks in tool/credential registries (#5167) 2026-03-03 08:05:25 -08:00
Timothy d5cd557013 test: add Databricks tool unit tests (#5167) 2026-03-03 08:05:21 -08:00
Timothy 2a43f23a3d feat: add Databricks credential spec (#5167) 2026-03-03 08:05:03 -08:00
Timothy 69af8f569a feat: add Databricks tool - SQL, jobs, clusters, workspace (#5167) 2026-03-03 08:04:34 -08:00
Timothy 0e86dbcc9b chore: register Redis tool in tool/credential registries (#5370) 2026-03-03 08:01:43 -08:00
Timothy 92c75aa6f5 test: add Redis tool unit tests (#5370) 2026-03-03 08:01:37 -08:00
Timothy be41d848e5 feat: add Redis credential spec (#5370) 2026-03-03 08:01:32 -08:00
Timothy f7c299f6f0 feat: add Redis tool implementation - KV, hash, list, pub/sub (#5370) 2026-03-03 08:01:25 -08:00
Timothy b6a0f65a09 feat: add Pushover push notification integration (#5415)
4 tools: pushover_send, pushover_validate_user, pushover_list_sounds,
pushover_check_receipt. Supports priority levels, HTML, sounds, TTL.
All 12 unit tests and 13 conformance tests passing.
2026-03-03 07:58:29 -08:00
Timothy 1e7b0068ed chore: register Supabase tool in tool/credential registries 2026-03-03 07:54:34 -08:00
Timothy de5105f313 feat: add Supabase integration - DB, Auth, Edge Functions (#5489)
7 tools: supabase_select, supabase_insert, supabase_update, supabase_delete,
supabase_auth_signup, supabase_auth_signin, supabase_edge_invoke.
All 19 unit tests and 13 conformance tests passing.
2026-03-03 07:54:27 -08:00
Timothy 6d32f1bb36 chore: register YouTube and Microsoft Graph tools in tool/credential registries 2026-03-03 07:51:33 -08:00
Timothy 9c316cee28 feat: add Microsoft Graph integration - Outlook, Teams, OneDrive (#5601)
11 tools: outlook_list_messages, outlook_get_message, outlook_send_mail,
teams_list_teams, teams_list_channels, teams_send_channel_message,
teams_get_channel_messages, onedrive_search_files, onedrive_list_files,
onedrive_download_file, onedrive_upload_file.
All 15 unit tests and 13 conformance tests passing.
2026-03-03 07:47:49 -08:00
Timothy 6af4f2d6e6 feat: add YouTube Data API integration (#5603)
8 tools: search_videos, get_video_details, get_channel, list_channel_videos,
get_playlist, search_channels, get_video_comments, get_video_categories.
All 17 unit tests and 13 conformance tests passing.
2026-03-03 07:47:34 -08:00
levxn 7c7b60a5e9 every session loads properly without any issue 2026-03-03 19:46:27 +05:30
levxn 3f0b8bff5b fixes a minor unhandled error in event routes 2026-03-03 18:53:43 +05:30
Amdev-5 57651900f1 Merge remote-tracking branch 'origin/main' into lusha 2026-03-03 18:46:12 +05:30
Amdev-5 46b0617018 Merge remote-tracking branch 'origin/main' into lusha
# Conflicts:
#	tools/src/aden_tools/credentials/health_check.py
#	tools/src/aden_tools/tools/__init__.py
#	tools/tests/test_health_checks.py
2026-03-03 18:34:54 +05:30
levxn 91190cf82d restarts with previous session continuity 2026-03-03 17:48:01 +05:30
Aaryann Chandola 87a26db779 Merge branch 'aden-hive:main' into fix/guardian-self-trigger-loop 2026-03-03 11:56:15 +05:30
P Gokul Sree Chandra 7d9bd2e86b feat(tools): add YouTube Data API integration
- Implement 6 YouTube API tools (search videos, get video/channel details, list channel videos, get playlist items, search channels)
- Add YOUTUBE_API_KEY credential spec with help_url and description
- Register YouTube tool in tools/__init__.py
- Add comprehensive test coverage (18 tests) with mocking
- Add detailed README with setup instructions and examples
- Use httpx for HTTP requests to YouTube Data API v3
- Verified with real API integration testing

Implements #5603
2026-03-03 07:35:04 +05:30
Antiarin 20ef5cb14f test(runtime): add async test for canceling multiple tasks across streams 2026-03-03 05:54:42 +05:30
Antiarin 2c3ec7e74c fix(tui): fix pause/stop to cancel all running tasks across all graphs 2026-03-03 05:30:20 +05:30
Amdev-5 cce073dbdb fix(lusha): add pagination and empty filter validation
- Expose page parameter on search_people and search_companies
  (client + MCP tool) enabling access beyond the first 50 results
- Add guard requiring at least one filter on both search endpoints
  to prevent broad requests that burn API credits
- Add unit tests for pagination and empty filter validation
2026-03-02 10:20:08 +05:30
Vasu Bansal 6a92588264 fix(plaid): update v0.6 credential compatibility and stabilize tests 2026-03-01 01:16:16 +05:30
Vasu Bansal 276aad6f0d feat: add Plaid banking integration
- Implement Plaid connector for account balances
- Add transaction history retrieval
- Include GL reconciliation functionality
- Add institution metadata lookup
- Include comprehensive tests and documentation

Closes #4016
2026-03-01 01:16:16 +05:30
Vasu Bansal 10620bda4f fix(sap): update credential-store compatibility and test imports 2026-03-01 01:07:00 +05:30
Vasu Bansal c214401a00 feat(integration): add SAP S/4HANA connector
Add complete SAP S/4HANA integration with:
- Connector for OData API access
- Credential management following Hive patterns
- Unit tests with mocked responses
- Documentation and usage examples

Refs #3182
2026-03-01 01:07:00 +05:30
Vasu Bansal 260ac33324 fix(s3): support v0.6 credential refs and register S3 tools 2026-03-01 00:56:22 +05:30
Vasu Bansal d4cd643860 feat: add AWS S3 integration for cloud object storage
- Add S3Storage class with upload, download, list, delete operations
- Support IAM roles, environment variables, and credential store
- Implement retry logic with adaptive backoff
- Add MCP tools: s3_upload, s3_download, s3_list, s3_delete, s3_check_credentials
- Include comprehensive tests with moto mocking
- Add documentation for setup and IAM permissions

Closes #3012
2026-03-01 00:54:57 +05:30
IamSayeed dc16cfda21 Merge branch 'main' into feature/add-asana-integration 2026-02-28 11:28:43 +05:30
Timothy e1db3a4af9 fix: remove hardcoded Anthropic logic 2026-02-27 10:23:59 -08:00
Navya Bijoy ddd30a950d Integration: add Databricks MCP tool integration
Implements the Databricks MCP tool integration for the Hive agent framework
2026-02-26 21:01:59 +05:30
KRYSTALM7 3ca0e63d54 feat(tools): add Pushover push notification integration
Closes #5415
2026-02-26 13:54:34 +00:00
Shivam Shahi– oss/acc 0f8627f17a format 2026-02-22 00:25:15 +05:30
Utkarsh Singh cd0cf69099 feat(tools): add Brevo transactional email and SMS integration
- Add brevo_tool with 6 MCP tools: brevo_send_email, brevo_send_sms,
  brevo_create_contact, brevo_get_contact, brevo_update_contact,
  brevo_get_email_stats
- Add CredentialSpec for BREVO_API_KEY in credentials/brevo.py
- Register brevo_tool in tools/__init__.py and credentials/__init__.py
- Add README with setup instructions and usage examples
- Add 34 unit tests covering all tools, validation and error handling

Closes #5127
2026-02-20 13:19:07 +05:30
Amdev-5 9744363342 fix(lusha): address PR review round 2 — structured filters, pagination, correct types
- search_people: replaced freetext searchText concatenation with proper
  structured Lusha API filters (jobTitles, seniority as list[int],
  departments, locations as dict, company_names, industry_ids, search_text)
- search_companies: added locations, company_names, search_text params;
  made all params optional for flexible queries
- Pagination: exposed limit param (clamped 10-50 per Lusha API constraints)
  on both search tools, replacing hardcoded size=25
- get_signals: changed ids from list[str] to list[int], removed internal
  str-to-int conversion as Lusha IDs are always numeric
- seniority type corrected to list[int] (API rejects string-encoded values
  despite OpenAPI spec suggesting strings — verified via live integration)
- Unit tests updated for all changes (19/19 pass)

Verified against live Lusha API: all 6 tools return correct responses.
2026-02-17 22:00:09 +05:30
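The `limit` clamping described in the Lusha commit above (10 to 50 per request) can be sketched as a one-line helper. The name `clamp_limit` and its defaults are assumptions for illustration, not the commit's code:

```python
def clamp_limit(limit: int, minimum: int = 10, maximum: int = 50) -> int:
    """Force a requested page size into the [minimum, maximum] range."""
    return max(minimum, min(maximum, limit))
```

Clamping rather than rejecting out-of-range values keeps a caller's request usable while still respecting the API's bounds.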
Amdev-5 6fe8439e94 fix(lusha): use mainIndustriesIds for company search, safer credential handling
- search_companies: replace names filter with mainIndustriesIds (numeric
  industry IDs) per Lusha API schema. Parameter changed from
  industry: str to industry_ids: list[int] | None.
- _get_api_key: return None instead of raising TypeError on unexpected
  credential type. Lets _get_client handle it with the standard error dict
  pattern used across all tools.
- Updated unit tests for new industry_ids parameter and added test for
  non-string credential handling.
2026-02-17 21:33:02 +05:30
Amdev-5 8e61ffe377 fix(tools): remove invalid searchText field from Lusha prospecting filters
Lusha API rejects filters.companies.include.searchText (HTTP 400).
Replaced with valid 'names' field in search_companies and removed
redundant company searchText from search_people. Updated unit tests.
2026-02-17 21:33:02 +05:30
Amdev-5 723476f7a7 feat(tools): add Lusha MCP integration with credentials and health checks 2026-02-17 21:33:02 +05:30
karthik-kotra 41cd11d5c9 docs(setup): add troubleshooting steps for common WSL setup issues 2026-02-17 07:30:00 +00:00
IamSayeed 0f253027ae Merge branch 'main' into feature/add-asana-integration 2026-02-17 12:20:01 +05:30
Sayeed Rizwan 6053895a82 fix(asana): resolve PR feedback - refactor client, fix specs, add tests 2026-02-17 12:18:06 +05:30
Shivam Shahi– oss/acc ceffa38717 Merge branch 'main' into feat/zoho-crm 2026-02-17 02:46:29 +05:30
Your hh3538962 ae205fa3f2 fix(tools): address Power BI integration code review feedback
- Fix export endpoint: /Export -> /ExportTo
- Add 202 Accepted response handling
- Add notifyOption to refresh_dataset API call
- Rename format parameter to export_format (avoid shadowing builtin)
- Add PNG support to export formats
- All critical API issues from review addressed
2026-02-16 14:00:09 +05:00
Shivam Shahi– oss/acc 669a05892b Merge branch 'main' into feat/zoho-crm 2026-02-15 21:47:52 +05:30
IamSayeed 4898a9759a Merge branch 'main' into feature/add-asana-integration 2026-02-15 13:07:15 +05:30
Sayeed Rizwan 2c2fa25580 fix: Resolve merge conflicts in credential and tool registries 2026-02-15 13:00:23 +05:30
Sayeed Rizwan 56496d7dbd feat: Add Asana integration for project management automation
- Implement 25 MCP tools for comprehensive Asana operations
  - Task management (create, update, search, delete, complete, comment, subtask)
  - Project management (create, update, list, get tasks)
  - Workspace & team operations (list workspaces, get users)
  - Section management for Kanban workflows
  - Tag and custom field support

- Add Personal Access Token (PAT) authentication
- Use official asana>=3.2.0 Python SDK (v5+ API)
- Include comprehensive error handling with ApiException
- Add 5 unit tests with 100% pass rate
- Provide detailed documentation and usage examples

Technical Details:
- Uses asana.ApiClient with Configuration pattern
- Implements workspace resolution by name or GID
- Handles paginated responses automatically
- Follows CredentialStoreAdapter pattern
- Matches existing tool structure (slack_tool, github_tool)

Closes #4156
2026-02-15 11:33:17 +05:30
y0sif dd0696e44d chore: resolve merge conflicts with main 2026-02-14 21:38:44 +02:00
y0sif dcda273e0b chore: resolve merge conflicts with main 2026-02-14 21:32:33 +02:00
y0sif f3b159c650 docs(tools): document Attio CRM in README 2026-02-14 21:23:47 +02:00
y0sif 06df037e28 chore: add Attio credentials to test spec file 2026-02-14 21:22:55 +02:00
y0sif e814e516d1 chore: add Attio credentials to init file 2026-02-14 21:21:37 +02:00
y0sif 0375e068ed test(tools): add Attio tool tests 2026-02-14 21:20:03 +02:00
y0sif 34ffc533d3 feat(tools): add Attio CRM integration 2026-02-14 21:19:14 +02:00
mubarakar95 ea2ea1a4ae Merge branch 'main' into integration/apify 2026-02-14 17:53:39 +05:30
mubarakar95 9e11947687 style: apply ruff formatting to apify_tool.py 2026-02-14 17:22:35 +05:30
mubarakar95 47117281e1 fix(test): resolve E501 line too long in test_apify_tool.py 2026-02-14 17:22:33 +05:30
mubarakar95 032dd13f5a feat(tools): implement Apify integration with 4 tools and comprehensive tests
- Added credential spec with health check endpoint
- Implemented apify_run_actor (sync/async execution)
- Implemented apify_get_dataset (result retrieval)
- Implemented apify_get_run (status checking)
- Implemented apify_search_actors (marketplace search)
- Created comprehensive README with examples and use cases
- Added 24 unit tests with mocked API responses
- All tests passing, conformance validated, linting clean

Resolves: #4510
2026-02-14 17:22:25 +05:30
mubarakar95 13d8ebbeff feat: Add Apify integration (issue #4510)
Implements comprehensive Apify integration for web scraping and automation:

- Added 4 new tools: apify_run_actor, apify_get_dataset, apify_get_run, apify_search_actors
- Credential management for APIFY_API_TOKEN with health check
- Support for synchronous (wait=True) and asynchronous (wait=False) actor execution
- Actor ID validation and comprehensive error handling
- Full test coverage (26 tests passing)
- README with usage examples and documentation

Addresses #4510
2026-02-14 11:53:56 +05:30
Shivam Shahi– oss/acc 2efa0e01df ruff format fix 2026-02-14 00:35:30 +05:30
Shivam Shahi– oss/acc 6044369fdf feat(tools): add Zoho CRM v8 integration with OAuth2 and MCP tools
Add Zoho CRM MCP integration for lead/contact/account/deal workflows with notes support. Implements 5 MCP tools:
- zoho_crm_search: Search Leads/Contacts/Accounts/Deals by criteria or word with pagination
- zoho_crm_get_record: Fetch a single record by module and ID
- zoho_crm_create_record: Create records with pass-through field payloads
- zoho_crm_update_record: Update records by ID with partial field payloads
- zoho_crm_add_note: Create notes linked to CRM records via Parent_Id mapping

Features:
- Zoho OAuth2 provider added in core credentials (refresh-token flow)
- Zoho auth format: Authorization: Zoho-oauthtoken <token>
- Region/DC-aware routing using accounts domain/region + api_domain usage
- Persisted DC metadata on refresh (api_domain/accounts_domain/location)
- Credential spec and health check registration for zoho_crm
- Tool registration and allowed-tool list updates
- Normalized tool responses with retriable 429 handling
- README with setup, auth modes, usage, and testing instructions
- Comprehensive unit/integration coverage updates for tool, provider, and health checks

Validation:
- Scoped ruff lint/format checks passed
- Targeted test suite passed: 563 passed, 18 skipped

Closes #4418
2026-02-13 18:28:12 +05:30
RichardTang-Aden 97440f9e8a Merge branch 'main' into feature/x-twitter-integration 2026-02-11 17:13:33 -08:00
Your hh3538962 765f7cae58 feat(tools): add get_datasets, get_reports, and export_report functions to Power BI integration 2026-02-11 22:19:51 +05:00
Your hh3538962 b455c8a2ad Merge remote-tracking branch 'origin/main' into feat/power-bi-integration 2026-02-11 22:07:00 +05:00
Sapna vishnoi da25e0ffa5 Merge branch 'main' into feat/redshift-integration 2026-02-11 13:42:26 +05:30
Your hh3538962 e07703c01f feat(tools): add Power BI integration - initial structure with workspace and dataset refresh functions 2026-02-10 13:23:32 +05:00
mishrapravin114 a4abf3eb2b Merge upstream/main: resolve conflicts with Apollo integration
- Keep both APOLLO_CREDENTIALS and AIRTABLE_CREDENTIALS
- Keep both apollo_tool and airtable_tool imports (alphabetical)

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-10 00:25:17 +05:30
mishrapravin114 269d72d073 Merge upstream/main: resolve conflicts with Apollo integration
- Keep both APOLLO_CREDENTIALS and CALENDLY_CREDENTIALS
- Keep both apollo_tool and calendly_tool imports (alphabetical)

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-10 00:20:17 +05:30
mishrapravin114 c8f5dccbd2 docs(airtable): add rate limit section to README
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-10 00:17:49 +05:30
mishrapravin114 8b797ee73f feat(airtable): add rate limit retry and retry_after
- Add 429 handling with retry_after from Retry-After header
- Add _request_with_retry (2 retries) for all API calls
- Update tests to use httpx.request

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-10 00:17:37 +05:30
mishrapravin114 de38adb1e4 feat(calendly): add rate limit handling, retry, 7-day validation
- Add 429 handling with retry_after from Retry-After header
- Add _request_with_retry (2 retries) for all API calls
- Validate get_availability date range <= 7 days
- Update tests to use httpx.request

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-10 00:16:37 +05:30
Sapna vishnoi c169bcc5d8 Merge branch 'main' into feat/redshift-integration 2026-02-09 23:32:08 +05:30
kubrakaradirek 80ea286beb fix: resolve complex merge conflicts and restore integrations 2026-02-09 16:09:43 +03:00
kubrakaradirek 3499be782e feat: implement MSSQL tool with schema discovery closes #3377 2026-02-09 15:32:57 +03:00
Gordon Ng 16603ae49c Test MCP 2026-02-09 01:48:49 -05:00
Gordon Ng bf6bd9ce7f test mcp 2026-02-09 01:48:46 -05:00
Gordon Ng a54c0f6f46 update 2026-02-09 01:20:25 -05:00
Gordon Ng beeed11d48 update 2026-02-09 01:11:33 -05:00
Manas Dutta 25331590a7 feat(reddit): add Reddit health checker and update tool functions 2026-02-08 19:26:01 +05:30
GastonAQS bff9f8976e Merge branch 'main' into feature/add-trello-integration 2026-02-07 15:57:48 -03:00
Manas Dutta b71628e211 Merge branch 'main' into feature/reddit-integration 2026-02-07 19:35:02 +05:30
Manas Dutta 8c1cb1f55b feat: add Reddit integration with 18 MCP tools
Implements Reddit API integration for community management and content monitoring.

Features:
- Search & Monitoring: search posts/comments, get subreddit feeds (new/hot), get posts/comments (6 tools)
- Content Creation: submit posts, reply, edit, delete comments (5 tools)
- User Engagement: get profiles, upvote, downvote, save posts (4 tools)
- Moderation: remove/approve posts, ban users (3 tools)

Implementation:
- OAuth 2.0 authentication via REDDIT_CREDENTIALS
- PRAW library for Reddit API integration
- Comprehensive error handling and validation
- Full test coverage (25 tests passing)

Resolves #3595
2026-02-07 18:38:59 +05:30
mishrapravin114 66214384a9 fix: add register_airtable import and fix ruff I001 import order
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-07 17:18:26 +05:30
mishrapravin114 6d6646887c feat(tools): add Airtable bases and records integration
- Add Airtable tool with 5 MCP tools:
  - airtable_list_bases
  - airtable_list_tables
  - airtable_list_records (with filter/sort)
  - airtable_create_record
  - airtable_update_record
- Add AIRTABLE_CREDENTIALS with credentialSpec + credentialStore
- Add AirtableHealthChecker for token validation
- Add README with setup and usage
- Add unit tests (9 tests total)

Fixes #2911

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-07 17:14:46 +05:30
mishrapravin114 6f8db0ed08 style: apply ruff format to calendly and health check files
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-07 17:00:05 +05:30
mishrapravin114 6aaf6836ea fix(calendly): resolve ruff lint errors (UP017, E501)
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-07 16:58:48 +05:30
mishrapravin114 4f2348f50e feat(tools): add Calendly scheduling integration
- Add Calendly tool with 4 MCP tools:
  - calendly_list_event_types
  - calendly_get_availability
  - calendly_get_booking_link
  - calendly_cancel_event
- Add CALENDLY_CREDENTIALS with credentialSpec + credentialStore
- Add CalendlyHealthChecker for token validation
- Add README with setup and usage
- Add unit tests (12 tests total)

Fixes #2930

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-07 16:51:27 +05:30
RichardTang-Aden deb7f2f72a Merge pull request #3814 from Amdev-5/feature/x-twitter-integration
fix(tests): update credential group test for X integration
2026-02-06 09:16:42 -08:00
Amdev-5 d989d9c65a fix(tests): update credential group test for X integration
Add test_x_credentials_share_credential_group to verify all X credentials
share the 'x' credential group. Update test_credential_group_default_empty
to account for X credentials alongside existing Google exceptions.
2026-02-06 22:17:40 +05:30
bryan 4173c606ab Merge feature/x-twitter-final-integration from Amdev-5/hive - X (Twitter) tool with DM support 2026-02-06 08:03:43 -08:00
Amdev-5 a01430d20f Merge verification fixes into PR branch 2026-02-06 16:42:56 +05:30
Amdev-5 2a8f775732 feat(tools): enhance X tool with DM support and robust error handling
- Added `x_send_dm` tool using v2 endpoint (`POST /dm_conversations/with/:id/messages`) for reliable 1:1 messaging.
- Fixed 403 Forbidden payload validation errors by simplifying DM payload structure.
- Enhanced `_handle_response` to verify `x_tool.py` returns raw API error details for 403/400 responses, aiding in permission debugging.
- Updated `demo_x_tools.py` to support standard `.env` variable names (e.g., `X_API_KEY`) and added user lookup for DM testing.
- Added unit tests covering new DM functionality and payload verification in `test_x_tool.py`.
- Audited credential handling: Read-only tools (Search/Mentions) correctly use Bearer Token, while Write tools (Post/Reply/Delete/DM) enforce OAuth 1.0a User Context.

Verified with live API tests (see PR description for logs).
2026-02-06 15:48:20 +05:30
Sapna vishnoi 4a0d9b2855 Merge branch 'main' into feat/redshift-integration 2026-02-05 11:44:09 +05:30
y0sif 92c65d69ea chore: resolve merge conflicts with main 2026-02-05 07:13:36 +02:00
Yosif Soliman 910a8968c4 fix(linear): correct GraphQL variable type for workflow states query 2026-02-05 07:00:28 +02:00
Sapna vishnoi cdb4679c5a Merge branch 'main' into feat/redshift-integration 2026-02-05 00:05:38 +05:30
Sapna.Vishnoi 1a9dce89b4 feat(tools): Add Amazon Redshift integration
- Implement 5 core functions for data warehouse querying
- Add boto3 integration with Redshift Data API
- Security: Read-only SELECT queries by default
- Full credential store support
- 26/26 tests passing (100% coverage)
- Complete documentation with examples
2026-02-04 23:58:35 +05:30
Aneesh cf1e4d7f88 Merge remote-tracking branch 'origin/main' into feature/youtube-transcript 2026-02-04 19:46:52 +05:30
Aneesh f2f0b4fc61 feat(tools): add youtube transcript integration via youtube-transcript-api 2026-02-04 19:24:40 +05:30
y0sif b21dd25181 fix(linear): handle credential decryption errors gracefully, handle mcp tool issue with credentials 2026-02-04 05:21:23 +02:00
y0sif 04a18bcbe5 docs(tools): document Linear integration in README and setup credentials claude skill 2026-02-04 04:05:15 +02:00
y0sif 7f66dd67eb feat(linear): add OAuth setup instructions 2026-02-04 04:03:37 +02:00
y0sif cfa03b89c8 test(tools): add comprehensive Linear tool tests 2026-02-04 03:47:28 +02:00
y0sif 9866d7a22b feat(tools): add Linear project management integration 2026-02-04 03:47:03 +02:00
GastonAQS 331a6e442f feat: add Trello integration tools and API client 2026-02-03 10:32:25 -03:00
Sashank Thapa 1c2295b2b5 Merge branch 'adenhq:main' into feature/twitter-x-mcp-tool 2026-02-03 16:20:45 +05:30
Sashank Thapa fa43ca3785 Merge branch 'adenhq:main' into feature/twitter-x-mcp-tool 2026-01-31 16:26:39 +05:30
kozuedoingregression b4a2c3bd14 ruff formatting and lint fixes 2026-01-31 16:18:16 +05:30
kozuedoingregression 2d4ec4f462 lint fix 2026-01-31 16:14:25 +05:30
kozuedoingregression 1e8b933da0 add X (Twitter) integration tool 2026-01-31 15:49:16 +05:30
Aneesh 48b1e0e038 Docs: clarify agent creation assumptions in Getting Started 2026-01-28 22:49:30 +05:30
746 changed files with 101301 additions and 34740 deletions
-9
@@ -1,9 +0,0 @@
{
  "mcpServers": {
    "agent-builder": {
      "command": "uv",
      "args": ["run", "--directory", "core", "-m", "framework.mcp.agent_builder_server"],
      "disabled": false
    }
  }
}
-1
@@ -1 +0,0 @@
../../.claude/skills/hive
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-concepts
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-create
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-credentials
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-patterns
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-test
-5
@@ -1,5 +0,0 @@
---
description: hive-concepts
---
use hive-concepts skill
-5
@@ -1,5 +0,0 @@
---
description: hive-create
---
use hive-create skill
-5
@@ -1,5 +0,0 @@
---
description: hive-credentials
---
use hive-credentials skill
-5
@@ -1,5 +0,0 @@
---
description: hive-patterns
---
use hive-patterns skill
-5
@@ -1,5 +0,0 @@
---
description: hive-test
---
use hive-test skill
-5
@@ -1,5 +0,0 @@
---
description: hive
---
use hive skill
-1
@@ -1 +0,0 @@
../../.claude/skills/hive
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-concepts
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-create
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-credentials
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-patterns
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-test
+2 -20
@@ -1,34 +1,16 @@
{
"permissions": {
"allow": [
"mcp__agent-builder__create_session",
"mcp__agent-builder__set_goal",
"mcp__agent-builder__add_node",
"mcp__agent-builder__add_edge",
"mcp__agent-builder__configure_loop",
"mcp__agent-builder__add_mcp_server",
"mcp__agent-builder__validate_graph",
"mcp__agent-builder__export_graph",
"mcp__agent-builder__load_session_by_id",
"Bash(git status:*)",
"Bash(gh run view:*)",
"Bash(uv run:*)",
"Bash(env:*)",
"mcp__agent-builder__test_node",
"mcp__agent-builder__list_mcp_tools",
"Bash(python -m py_compile:*)",
"Bash(python -m pytest:*)",
"Bash(source:*)",
"mcp__agent-builder__update_node",
"mcp__agent-builder__check_missing_credentials",
"mcp__agent-builder__list_stored_credentials",
"Bash(find:*)",
"mcp__agent-builder__run_tests",
"Bash(PYTHONPATH=core:exports:tools/src uv run pytest:*)",
"mcp__agent-builder__list_agent_sessions",
"mcp__agent-builder__generate_constraint_tests",
"mcp__agent-builder__generate_success_tests"
"Bash(PYTHONPATH=core:exports:tools/src uv run pytest:*)"
]
},
"enabledMcpjsonServers": ["agent-builder", "tools"]
"enabledMcpjsonServers": ["tools"]
}
-399
@@ -1,399 +0,0 @@
---
name: hive-concepts
description: Core concepts for goal-driven agents - architecture, node types (event_loop, function), tool discovery, and workflow overview. Use when starting agent development or need to understand agent fundamentals.
license: Apache-2.0
metadata:
author: hive
version: "2.0"
type: foundational
part_of: hive
---
# Building Agents - Core Concepts
Foundational knowledge for building goal-driven agents as Python packages.
## Architecture: Python Services (Not JSON Configs)
Agents are built as Python packages:
```
exports/my_agent/
├── __init__.py # Package exports
├── __main__.py # CLI (run, info, validate, shell)
├── agent.py # Graph construction (goal, edges, agent class)
├── nodes/__init__.py # Node definitions (NodeSpec)
├── config.py # Runtime config
└── README.md # Documentation
```
**Key Principle: Agent is visible and editable during build**
- Files created immediately as components are approved
- User can watch files grow in their editor
- No session state - just direct file writes
- No "export" step - agent is ready when build completes
## Core Concepts
### Goal
Success criteria and constraints (written to agent.py)
```python
goal = Goal(
    id="research-goal",
    name="Technical Research Agent",
    description="Research technical topics thoroughly",
    success_criteria=[
        SuccessCriterion(
            id="completeness",
            description="Cover all aspects of topic",
            metric="coverage_score",
            target=">=0.9",
            weight=0.4,
        ),
        # 3-5 success criteria total
    ],
    constraints=[
        Constraint(
            id="accuracy",
            description="All information must be verified",
            constraint_type="hard",
            category="quality",
        ),
        # 1-5 constraints total
    ],
)
```
### Node
Unit of work (written to nodes/__init__.py)
**Node Types:**
- `event_loop` — Multi-turn streaming loop with tool execution and judge-based evaluation. Works with or without tools.
- `function` — Deterministic Python operations. No LLM involved.
```python
search_node = NodeSpec(
    id="search-web",
    name="Search Web",
    description="Search for information and extract results",
    node_type="event_loop",
    input_keys=["query"],
    output_keys=["search_results"],
    system_prompt="Search the web for: {query}. Use the web_search tool to find results, then call set_output to store them.",
    tools=["web_search"],
)
```
**NodeSpec Fields for Event Loop Nodes:**
| Field | Default | Description |
|-------|---------|-------------|
| `client_facing` | `False` | If True, streams output to user and blocks for input between turns |
| `nullable_output_keys` | `[]` | Output keys that may remain unset (for mutually exclusive outputs) |
| `max_node_visits` | `1` | Max times this node executes per run. Set >1 for feedback loop targets |
### Edge
Connection between nodes (written to agent.py)
**Edge Conditions:**
- `on_success` — Proceed if node succeeds (most common)
- `on_failure` — Handle errors
- `always` — Always proceed
- `conditional` — Based on expression evaluating node output
**Edge Priority:**
Priority controls evaluation order when multiple edges leave the same node. Higher priority edges are evaluated first. Use negative priority for feedback edges (edges that loop back to earlier nodes).
```python
# Forward edge (evaluated first)
EdgeSpec(
    id="review-to-campaign",
    source="review",
    target="campaign-builder",
    condition=EdgeCondition.CONDITIONAL,
    condition_expr="output.get('approved_contacts') is not None",
    priority=1,
)

# Feedback edge (evaluated after forward edges)
EdgeSpec(
    id="review-feedback",
    source="review",
    target="extractor",
    condition=EdgeCondition.CONDITIONAL,
    condition_expr="output.get('redo_extraction') is not None",
    priority=-1,
)
```
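The priority rule can be sketched outside the framework. This is a minimal illustration, not the executor's actual code: `SketchEdge` and `pick_next_edge` are hypothetical names, the conditions are plain Python callables instead of `condition_expr` strings, and it assumes the first edge whose condition holds is the one taken.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SketchEdge:
    """Hypothetical stand-in for EdgeSpec; not the framework class."""
    id: str
    target: str
    priority: int
    predicate: Callable[[dict], bool]


def pick_next_edge(edges: list[SketchEdge], output: dict) -> Optional[SketchEdge]:
    # Sort descending so higher-priority (forward) edges are checked
    # before negative-priority (feedback) edges.
    for edge in sorted(edges, key=lambda e: e.priority, reverse=True):
        if edge.predicate(output):
            return edge  # assumption: first matching edge wins
    return None


edges = [
    SketchEdge("review-feedback", "extractor", -1,
               lambda out: out.get("redo_extraction") is not None),
    SketchEdge("review-to-campaign", "campaign-builder", 1,
               lambda out: out.get("approved_contacts") is not None),
]

# Both conditions hold here, so priority decides which edge is considered first.
both_set = {"approved_contacts": ["a@example.com"], "redo_extraction": True}
chosen = pick_next_edge(edges, both_set)
```

Under these assumptions the forward edge is chosen whenever both conditions hold, and the feedback edge is only reached when the forward condition fails.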
### Client-Facing Nodes
For multi-turn conversations with the user, set `client_facing=True` on a node. The node will:
- Stream its LLM output directly to the end user
- Block for user input between conversational turns
- Resume when new input is injected via `inject_event()`
```python
intake_node = NodeSpec(
    id="intake",
    name="Intake",
    description="Gather requirements from the user",
    node_type="event_loop",
    client_facing=True,
    input_keys=[],
    output_keys=["repo_url", "project_url"],
    system_prompt="You are the intake agent. Ask the user for the repo URL and project URL.",
)
```
> **Legacy Note:** The old `pause_nodes` / `entry_points` pattern still works but `client_facing=True` is preferred for new agents.
**STEP 1 / STEP 2 Prompt Pattern:** For client-facing nodes, structure the system prompt with two explicit phases:
```python
system_prompt="""\
**STEP 1 — Respond to the user (text only, NO tool calls):**
[Present information, ask questions, etc.]
**STEP 2 — After the user responds, call set_output:**
[Call set_output with the structured outputs]
"""
```
This prevents the LLM from calling `set_output` prematurely before the user has had a chance to respond.
### Node Design: Fewer, Richer Nodes
Prefer fewer nodes that do more work over many thin single-purpose nodes:
- **Bad**: 8 thin nodes (parse query → search → fetch → evaluate → synthesize → write → check → save)
- **Good**: 4 rich nodes (intake → research → review → report)
Why: Each node boundary requires serializing outputs and passing context. Fewer nodes means the LLM retains full context of its work within the node. A research node that searches, fetches, and analyzes keeps all the source material in its conversation history.
### nullable_output_keys for Cross-Edge Inputs
When a node receives inputs that only arrive on certain edges (e.g., `feedback` only comes from a review → research feedback loop, not from intake → research), mark those keys as `nullable_output_keys`:
```python
research_node = NodeSpec(
    id="research",
    input_keys=["research_brief", "feedback"],
    nullable_output_keys=["feedback"],  # Not present on first visit
    max_node_visits=3,
    ...
)
```
## Event Loop Architecture Concepts
### How EventLoopNode Works
An event loop node runs a multi-turn loop:
1. LLM receives system prompt + conversation history
2. LLM responds (text and/or tool calls)
3. Tool calls are executed, results added to conversation
4. Judge evaluates: ACCEPT (exit loop), RETRY (loop again), or ESCALATE
5. Repeat until judge ACCEPTs or max_iterations reached
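The five steps above can be sketched as a small, framework-free loop. Everything here is illustrative: `run_loop`, `Verdict`, and the stub callables are hypothetical, and returning `None` when the iteration budget runs out is an assumption, since the text does not say what the framework does in that case.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    RETRY = "retry"
    ESCALATE = "escalate"


def run_loop(llm_turn, run_tool, judge, max_iterations=50):
    """Toy event loop; llm_turn, run_tool and judge are caller-supplied callables."""
    history = []
    for iteration in range(1, max_iterations + 1):
        text, tool_calls = llm_turn(history)           # steps 1-2: LLM responds
        history.append(("assistant", text))
        for call in tool_calls:                         # step 3: execute tool calls
            history.append(("tool", run_tool(call)))
        verdict = judge(history, tool_calls)            # step 4: judge evaluates
        if verdict is not Verdict.RETRY:
            return verdict, iteration                   # ACCEPT or ESCALATE exits
    return None, max_iterations                         # assumption: budget exhausted


def scripted_llm(history):
    # First turn requests a tool; once a tool result exists, finish with text only.
    if not any(role == "tool" for role, _ in history):
        return "searching", ["web_search"]
    return "done", []


def echo_tool(call):
    return f"result of {call}"


def simple_judge(history, tool_calls):
    return Verdict.RETRY if tool_calls else Verdict.ACCEPT


verdict, iterations = run_loop(scripted_llm, echo_tool, simple_judge)
```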
### EventLoopNode Runtime
EventLoopNodes are **auto-created** by `GraphExecutor` at runtime. You do NOT need to manually register them. Both `GraphExecutor` (direct) and `AgentRuntime` / `create_agent_runtime()` handle event_loop nodes automatically.
```python
# Direct execution — executor auto-creates EventLoopNodes
from framework.graph.executor import GraphExecutor
from framework.runtime.core import Runtime

runtime = Runtime(storage_path)
executor = GraphExecutor(
    runtime=runtime,
    llm=llm,
    tools=tools,
    tool_executor=tool_executor,
    storage_path=storage_path,
)
result = await executor.execute(graph=graph, goal=goal, input_data=input_data)

# TUI execution — AgentRuntime also works
from framework.runtime.agent_runtime import create_agent_runtime

runtime = create_agent_runtime(
    graph=graph, goal=goal, storage_path=storage_path,
    entry_points=[...], llm=llm, tools=tools, tool_executor=tool_executor,
)
```
### set_output
Nodes produce structured outputs by calling `set_output(key, value)` — a synthetic tool injected by the framework. When the LLM calls `set_output`, the value is stored in the output accumulator and made available to downstream nodes via shared memory.
`set_output` is NOT a real tool — it is excluded from `real_tool_results`. For client-facing nodes, this means a turn where the LLM only calls `set_output` (no other tools) is treated as a conversational boundary and will block for user input.
### JudgeProtocol
**The judge is the SOLE mechanism for acceptance decisions.** Do not add ad-hoc framework gating, output rollback, or premature rejection logic. If the LLM calls `set_output` too early, fix it with better prompts or a custom judge — not framework-level guards.
The judge controls when a node's loop exits:
- **Implicit judge** (default, no judge configured): ACCEPTs when the LLM finishes with no tool calls and all required output keys are set
- **SchemaJudge**: Validates outputs against a Pydantic model
- **Custom judges**: Implement `evaluate(context) -> JudgeVerdict`
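The implicit judge's rule can be written as a single predicate. This is a sketch of the rule as described, not the framework's implementation: `implicit_accept` is a hypothetical name, and treating `nullable_output_keys` as "not required" is an assumption based on the field description earlier in this document.

```python
def implicit_accept(tool_calls, outputs, output_keys, nullable_output_keys=()):
    """Return True when the turn had no tool calls and every required key is set."""
    # Assumption: nullable keys may stay unset without blocking acceptance.
    required = set(output_keys) - set(nullable_output_keys)
    return not tool_calls and required.issubset(outputs)


# Turn still calling tools: keep looping even though the output is set.
still_working = implicit_accept(["web_search"], {"search_results": "..."}, ["search_results"])

# No tool calls and all required keys set: accept.
finished = implicit_accept([], {"search_results": "..."}, ["search_results"])

# No tool calls but a required key is missing: do not accept.
missing_key = implicit_accept([], {}, ["search_results"])
```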
### LoopConfig
Controls loop behavior:
- `max_iterations` (default 50) — prevents infinite loops
- `max_tool_calls_per_turn` (default 10) — limits tool calls per LLM response
- `tool_call_overflow_margin` (default 0.5) — wiggle room before discarding extra tool calls (50% means hard cutoff at 150% of limit)
- `stall_detection_threshold` (default 3) — detects repeated identical responses
- `max_history_tokens` (default 32000) — triggers conversation compaction
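The overflow margin is easiest to read as arithmetic. The helper below is a hypothetical illustration of the "150% of limit" statement; the function name and the rounding-down of fractional cutoffs are assumptions.

```python
import math


def hard_tool_call_cutoff(max_tool_calls_per_turn=10, tool_call_overflow_margin=0.5):
    """Largest number of tool calls tolerated in one turn before extras are discarded."""
    # limit * (1 + margin): with the defaults, 10 * 1.5 = 15.
    # Assumption: a fractional result is rounded down.
    return math.floor(max_tool_calls_per_turn * (1 + tool_call_overflow_margin))


default_cutoff = hard_tool_call_cutoff()
```

With the documented defaults, `default_cutoff` is 15, i.e. 150% of the 10-call limit.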
### Data Tools (Spillover Management)
When tool results exceed the context window, the framework automatically saves them to a spillover directory and truncates with a hint. Nodes that produce or consume large data should include the data tools:
- `save_data(filename, data)` — Write data to a file in the data directory
- `load_data(filename, offset=0, limit=50)` — Read data with line-based pagination
- `list_data_files()` — List available data files
- `serve_file_to_user(filename, label="")` — Get a clickable file:// URI for the user
Note: `data_dir` is a framework-injected context parameter — the LLM never sees or passes it. `GraphExecutor.execute()` sets it per-execution via `contextvars`, so data tools and spillover always share the same session-scoped directory.
These are real MCP tools (not synthetic). Add them to nodes that handle large tool results:
```python
research_node = NodeSpec(
    ...
    tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
)
```
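Line-based pagination, as the `load_data(filename, offset=0, limit=50)` signature suggests, might look roughly like this. The sketch works on an in-memory string; `paginate_lines` and its return shape are hypothetical and do not describe the real tool's response format.

```python
def paginate_lines(text, offset=0, limit=50):
    """Return one page of lines plus the offset to request next."""
    lines = text.splitlines()
    page = lines[offset:offset + limit]
    next_offset = offset + len(page)
    return {
        "lines": page,
        "next_offset": next_offset,
        "has_more": next_offset < len(lines),
    }


# 120 lines read in pages of 50: two full pages, then a final page of 20.
document = "\n".join(f"row {i}" for i in range(120))
first_page = paginate_lines(document)
last_page = paginate_lines(document, offset=100)
```

Reading in pages keeps each tool result small enough to stay in the conversation, which is the point of spilling large results to disk in the first place.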
### Fan-Out / Fan-In
Multiple ON_SUCCESS edges from the same source create parallel execution. All branches run concurrently via `asyncio.gather()`. Parallel event_loop nodes must have disjoint `output_keys`.
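A rough sketch of fan-out with a disjointness check, using only the standard library. The branch runner is a stub and all function names are hypothetical; only the use of `asyncio.gather()` and the disjoint `output_keys` requirement come from the text above.

```python
import asyncio
from itertools import combinations


def assert_disjoint_outputs(branches: dict[str, list[str]]) -> None:
    """Raise if any two parallel branches declare the same output key."""
    for (a, keys_a), (b, keys_b) in combinations(branches.items(), 2):
        overlap = set(keys_a) & set(keys_b)
        if overlap:
            raise ValueError(f"{a} and {b} both write {sorted(overlap)}")


async def run_branch(name: str, keys: list[str]) -> dict:
    await asyncio.sleep(0)  # stand-in for real node work
    return {key: f"{name}:{key}" for key in keys}


async def fan_out(branches: dict[str, list[str]]) -> dict:
    assert_disjoint_outputs(branches)
    results = await asyncio.gather(
        *(run_branch(name, keys) for name, keys in branches.items())
    )
    merged = {}
    for result in results:
        merged.update(result)  # safe because the keys are disjoint
    return merged


merged = asyncio.run(fan_out({"news": ["headlines"], "papers": ["abstracts"]}))
```

Disjoint keys matter because the merge step has no sensible way to pick a winner when two concurrent branches write the same key.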
### max_node_visits
Controls how many times a node can execute in one graph run. Default is 1. Set it higher for nodes that are targets of feedback edges (review-reject loops). Set it to 0 for unlimited visits (guarded by max_steps).
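The visit limit amounts to a per-node counter check. The following is a hypothetical sketch of that rule (the `may_visit` helper is not a framework function), including the special case where 0 means unlimited.

```python
from collections import Counter


def may_visit(node_id: str, visits: Counter, max_node_visits: int = 1) -> bool:
    """Return True if the node may run again in this graph run."""
    if max_node_visits == 0:
        return True  # 0 means unlimited; another guard such as max_steps must bound the run
    return visits[node_id] < max_node_visits


visits = Counter()
allowed = []
for _ in range(4):
    ok = may_visit("research", visits, max_node_visits=3)
    allowed.append(ok)
    if ok:
        visits["research"] += 1
```

With `max_node_visits=3`, the first three attempts are allowed and the fourth is refused.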
## Tool Discovery & Validation
**CRITICAL:** Before adding a node with tools, you MUST verify the tools exist.
Tools are provided by MCP servers. Never assume a tool exists - always discover dynamically.
### Step 1: Register MCP Server (if not already done)
```python
mcp__agent-builder__add_mcp_server(
name="tools",
transport="stdio",
command="python",
args='["mcp_server.py", "--stdio"]',
cwd="../tools"
)
```
### Step 2: Discover Available Tools
```python
# List all tools from all registered servers
mcp__agent-builder__list_mcp_tools()
# Or list tools from a specific server
mcp__agent-builder__list_mcp_tools(server_name="tools")
```
### Step 3: Validate Before Adding Nodes
Before writing a node with `tools=[...]`:
1. Call `list_mcp_tools()` to get available tools
2. Check each tool in your node exists in the response
3. If a tool doesn't exist:
- **DO NOT proceed** with the node
- Inform the user: "The tool 'X' is not available. Available tools are: ..."
- Ask if they want to use an alternative or proceed without the tool
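The validation step can be sketched as a plain set check. The list of available tool names is assumed to have already been extracted from a `list_mcp_tools()` response; the helper names and the example tool lists are hypothetical.

```python
def find_missing_tools(requested, available):
    """Return requested tool names that the discovery response did not list."""
    available_set = set(available)
    return [name for name in requested if name not in available_set]


def unavailable_tool_message(missing, available):
    """Build the user-facing message, one sentence per missing tool."""
    listing = ", ".join(sorted(available))
    return "\n".join(
        f"The tool '{name}' is not available. Available tools are: {listing}"
        for name in missing
    )


# Hypothetical discovery result and node definition.
available_tools = ["web_search", "web_scrape", "load_data"]
node_tools = ["web_search", "pdf_reader"]

missing = find_missing_tools(node_tools, available_tools)
message = unavailable_tool_message(missing, available_tools)
```

If `missing` is non-empty, the node should not be written until the user has chosen an alternative or agreed to drop the tool.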
### Tool Validation Anti-Patterns
- **Never assume a tool exists** - always call `list_mcp_tools()` first
- **Never write a node with unverified tools** - validate before writing
- **Never silently drop tools** - if a tool doesn't exist, inform the user
- **Never guess tool names** - use exact names from discovery response
## Workflow Overview: Incremental File Construction
```
1. CREATE PACKAGE → mkdir + write skeletons
2. DEFINE GOAL → Write to agent.py + config.py
3. FOR EACH NODE:
   - Propose design (event_loop for LLM work, function for deterministic)
   - User approves
   - Write to nodes/__init__.py IMMEDIATELY
   - (Optional) Validate with test_node
4. CONNECT EDGES → Update agent.py
   - Use priority for feedback edges (negative priority)
   - (Optional) Validate with validate_graph
5. FINALIZE → Write agent class to agent.py
6. DONE - Agent ready at exports/my_agent/
```
**Files are written immediately. MCP tools are optional and serve only for validation and testing bookkeeping.**
## When to Use This Skill
Use hive-concepts when:
- Starting a new agent project and need to understand fundamentals
- Need to understand agent architecture before building
- Want to validate tool availability before proceeding
- Learning about node types, edges, and graph execution
**Next Steps:**
- Ready to build? → Use `hive-create` skill
- Need patterns and examples? → Use `hive-patterns` skill
## MCP Tools for Validation
After writing files, optionally use MCP tools for validation:
**test_node** - Validate node configuration with mock inputs
```python
mcp__agent-builder__test_node(
    node_id="search-web",
    test_input='{"query": "test query"}',
    mock_llm_response='{"results": "mock output"}'
)
```
**validate_graph** - Check graph structure
```python
mcp__agent-builder__validate_graph()
# Returns: unreachable nodes, missing connections, event_loop validation, etc.
```
**configure_loop** - Set event loop parameters
```python
mcp__agent-builder__configure_loop(
    max_iterations=50,
    max_tool_calls_per_turn=10,
    stall_detection_threshold=3,
    max_history_tokens=32000
)
```
**Key Point:** Files are written FIRST. MCP tools are for validation only.
## Related Skills
- **hive-create** - Step-by-step building process
- **hive-patterns** - Best practices: judges, feedback edges, fan-out, context management
- **hive** - Complete workflow orchestrator
- **hive-test** - Test and validate completed agents
File diff suppressed because it is too large
@@ -1,24 +0,0 @@
"""
Deep Research Agent - Interactive, rigorous research with TUI conversation.
Research any topic through multi-source web search, quality evaluation,
and synthesis. Features client-facing TUI interaction at key checkpoints
for user guidance and iterative deepening.
"""
from .agent import DeepResearchAgent, default_agent, goal, nodes, edges
from .config import RuntimeConfig, AgentMetadata, default_config, metadata
__version__ = "1.0.0"
__all__ = [
    "DeepResearchAgent",
    "default_agent",
    "goal",
    "nodes",
    "edges",
    "RuntimeConfig",
    "AgentMetadata",
    "default_config",
    "metadata",
]
@@ -1,241 +0,0 @@
"""
CLI entry point for Deep Research Agent.
Uses AgentRuntime for multi-entrypoint support with HITL pause/resume.
"""
import asyncio
import json
import logging
import sys
import click
from .agent import default_agent, DeepResearchAgent
def setup_logging(verbose=False, debug=False):
    """Configure logging for execution visibility."""
    if debug:
        level, fmt = logging.DEBUG, "%(asctime)s %(name)s: %(message)s"
    elif verbose:
        level, fmt = logging.INFO, "%(message)s"
    else:
        level, fmt = logging.WARNING, "%(levelname)s: %(message)s"
    logging.basicConfig(level=level, format=fmt, stream=sys.stderr)
    logging.getLogger("framework").setLevel(level)


@click.group()
@click.version_option(version="1.0.0")
def cli():
    """Deep Research Agent - Interactive, rigorous research with TUI conversation."""
    pass
@cli.command()
@click.option("--topic", "-t", type=str, required=True, help="Research topic")
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--quiet", "-q", is_flag=True, help="Only output result JSON")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def run(topic, mock, quiet, verbose, debug):
    """Execute research on a topic."""
    if not quiet:
        setup_logging(verbose=verbose, debug=debug)
    context = {"topic": topic}
    result = asyncio.run(default_agent.run(context, mock_mode=mock))
    output_data = {
        "success": result.success,
        "steps_executed": result.steps_executed,
        "output": result.output,
    }
    if result.error:
        output_data["error"] = result.error
    click.echo(json.dumps(output_data, indent=2, default=str))
    sys.exit(0 if result.success else 1)
@cli.command()
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def tui(mock, verbose, debug):
    """Launch the TUI dashboard for interactive research."""
    setup_logging(verbose=verbose, debug=debug)
    try:
        from framework.tui.app import AdenTUI
    except ImportError:
        click.echo(
            "TUI requires the 'textual' package. Install with: pip install textual"
        )
        sys.exit(1)
    from pathlib import Path
    from framework.llm import LiteLLMProvider
    from framework.runner.tool_registry import ToolRegistry
    from framework.runtime.agent_runtime import create_agent_runtime
    from framework.runtime.event_bus import EventBus
    from framework.runtime.execution_stream import EntryPointSpec

    async def run_with_tui():
        agent = DeepResearchAgent()
        # Build graph and tools
        agent._event_bus = EventBus()
        agent._tool_registry = ToolRegistry()
        storage_path = Path.home() / ".hive" / "agents" / "deep_research_agent"
        storage_path.mkdir(parents=True, exist_ok=True)
        mcp_config_path = Path(__file__).parent / "mcp_servers.json"
        if mcp_config_path.exists():
            agent._tool_registry.load_mcp_config(mcp_config_path)
        llm = None
        if not mock:
            llm = LiteLLMProvider(
                model=agent.config.model,
                api_key=agent.config.api_key,
                api_base=agent.config.api_base,
            )
        tools = list(agent._tool_registry.get_tools().values())
        tool_executor = agent._tool_registry.get_executor()
        graph = agent._build_graph()
        runtime = create_agent_runtime(
            graph=graph,
            goal=agent.goal,
            storage_path=storage_path,
            entry_points=[
                EntryPointSpec(
                    id="start",
                    name="Start Research",
                    entry_node="intake",
                    trigger_type="manual",
                    isolation_level="isolated",
                ),
            ],
            llm=llm,
            tools=tools,
            tool_executor=tool_executor,
        )
        await runtime.start()
        try:
            app = AdenTUI(runtime)
            await app.run_async()
        finally:
            await runtime.stop()

    asyncio.run(run_with_tui())
@cli.command()
@click.option("--json", "output_json", is_flag=True)
def info(output_json):
    """Show agent information."""
    info_data = default_agent.info()
    if output_json:
        click.echo(json.dumps(info_data, indent=2))
    else:
        click.echo(f"Agent: {info_data['name']}")
        click.echo(f"Version: {info_data['version']}")
        click.echo(f"Description: {info_data['description']}")
        click.echo(f"\nNodes: {', '.join(info_data['nodes'])}")
        click.echo(f"Client-facing: {', '.join(info_data['client_facing_nodes'])}")
        click.echo(f"Entry: {info_data['entry_node']}")
        click.echo(f"Terminal: {', '.join(info_data['terminal_nodes'])}")
@cli.command()
def validate():
    """Validate agent structure."""
    validation = default_agent.validate()
    if validation["valid"]:
        click.echo("Agent is valid")
        if validation["warnings"]:
            for warning in validation["warnings"]:
                click.echo(f" WARNING: {warning}")
    else:
        click.echo("Agent has errors:")
        for error in validation["errors"]:
            click.echo(f" ERROR: {error}")
    sys.exit(0 if validation["valid"] else 1)
@cli.command()
@click.option("--verbose", "-v", is_flag=True)
def shell(verbose):
    """Interactive research session (CLI, no TUI)."""
    asyncio.run(_interactive_shell(verbose))


async def _interactive_shell(verbose=False):
    """Async interactive shell."""
    setup_logging(verbose=verbose)
    click.echo("=== Deep Research Agent ===")
    click.echo("Enter a topic to research (or 'quit' to exit):\n")
    agent = DeepResearchAgent()
    await agent.start()
    try:
        while True:
            try:
                topic = await asyncio.get_event_loop().run_in_executor(
                    None, input, "Topic> "
                )
                if topic.lower() in ["quit", "exit", "q"]:
                    click.echo("Goodbye!")
                    break
                if not topic.strip():
                    continue
                click.echo("\nResearching...\n")
                result = await agent.trigger_and_wait("start", {"topic": topic})
                if result is None:
                    click.echo("\n[Execution timed out]\n")
                    continue
                if result.success:
                    output = result.output
                    if "report_content" in output:
                        click.echo("\n--- Report ---\n")
                        click.echo(output["report_content"])
                        click.echo("\n")
                    if "references" in output:
                        click.echo("--- References ---\n")
                        for ref in output.get("references", []):
                            click.echo(
                                f" [{ref.get('number', '?')}] {ref.get('title', '')} - {ref.get('url', '')}"
                            )
                        click.echo("\n")
                else:
                    click.echo(f"\nResearch failed: {result.error}\n")
            except KeyboardInterrupt:
                click.echo("\nGoodbye!")
                break
            except Exception as e:
                click.echo(f"Error: {e}", err=True)
                import traceback
                traceback.print_exc()
    finally:
        await agent.stop()


if __name__ == "__main__":
    cli()
@@ -1,358 +0,0 @@
"""Agent graph construction for Deep Research Agent."""
from pathlib import Path
from framework.graph import EdgeSpec, EdgeCondition, Goal, SuccessCriterion, Constraint
from framework.graph.edge import GraphSpec
from framework.graph.executor import ExecutionResult
from framework.graph.checkpoint_config import CheckpointConfig
from framework.llm import LiteLLMProvider
from framework.runner.tool_registry import ToolRegistry
from framework.runtime.agent_runtime import AgentRuntime, create_agent_runtime
from framework.runtime.execution_stream import EntryPointSpec
from .config import default_config, metadata
from .nodes import (
    intake_node,
    research_node,
    review_node,
    report_node,
)
# Goal definition
goal = Goal(
    id="rigorous-interactive-research",
    name="Rigorous Interactive Research",
    description=(
        "Research any topic by searching diverse sources, analyzing findings, "
        "and producing a cited report — with user checkpoints to guide direction."
    ),
    success_criteria=[
        SuccessCriterion(
            id="source-diversity",
            description="Use multiple diverse, authoritative sources",
            metric="source_count",
            target=">=5",
            weight=0.25,
        ),
        SuccessCriterion(
            id="citation-coverage",
            description="Every factual claim in the report cites its source",
            metric="citation_coverage",
            target="100%",
            weight=0.25,
        ),
        SuccessCriterion(
            id="user-satisfaction",
            description="User reviews findings before report generation",
            metric="user_approval",
            target="true",
            weight=0.25,
        ),
        SuccessCriterion(
            id="report-completeness",
            description="Final report answers the original research questions",
            metric="question_coverage",
            target="90%",
            weight=0.25,
        ),
    ],
    constraints=[
        Constraint(
            id="no-hallucination",
            description="Only include information found in fetched sources",
            constraint_type="quality",
            category="accuracy",
        ),
        Constraint(
            id="source-attribution",
            description="Every claim must cite its source with a numbered reference",
            constraint_type="quality",
            category="accuracy",
        ),
        Constraint(
            id="user-checkpoint",
            description="Present findings to the user before writing the final report",
            constraint_type="functional",
            category="interaction",
        ),
    ],
)
# Node list
nodes = [
intake_node,
research_node,
review_node,
report_node,
]
# Edge definitions
edges = [
# intake -> research
EdgeSpec(
id="intake-to-research",
source="intake",
target="research",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
# research -> review
EdgeSpec(
id="research-to-review",
source="research",
target="review",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
# review -> research (feedback loop)
EdgeSpec(
id="review-to-research-feedback",
source="review",
target="research",
condition=EdgeCondition.CONDITIONAL,
condition_expr="needs_more_research == True",
priority=1,
),
# review -> report (user satisfied)
EdgeSpec(
id="review-to-report",
source="review",
target="report",
condition=EdgeCondition.CONDITIONAL,
condition_expr="needs_more_research == False",
priority=2,
),
# report -> research (user wants deeper research on current topic)
EdgeSpec(
id="report-to-research",
source="report",
target="research",
condition=EdgeCondition.CONDITIONAL,
condition_expr="str(next_action).lower() == 'more_research'",
priority=2,
),
# report -> intake (user wants a new topic — default when not more_research)
EdgeSpec(
id="report-to-intake",
source="report",
target="intake",
condition=EdgeCondition.CONDITIONAL,
condition_expr="str(next_action).lower() != 'more_research'",
priority=1,
),
]
# Graph configuration
entry_node = "intake"
entry_points = {"start": "intake"}
pause_nodes = []
terminal_nodes = []
class DeepResearchAgent:
"""
Deep Research Agent: a 4-node pipeline with user checkpoints.
Flow: intake -> research -> review -> report
                ^           |
                +-- feedback loop (if user wants more)
Uses AgentRuntime for proper session management:
- Session-scoped storage (sessions/{session_id}/)
- Checkpointing for resume capability
- Runtime logging
- Data folder for save_data/load_data
"""
def __init__(self, config=None):
self.config = config or default_config
self.goal = goal
self.nodes = nodes
self.edges = edges
self.entry_node = entry_node
self.entry_points = entry_points
self.pause_nodes = pause_nodes
self.terminal_nodes = terminal_nodes
self._graph: GraphSpec | None = None
self._agent_runtime: AgentRuntime | None = None
self._tool_registry: ToolRegistry | None = None
self._storage_path: Path | None = None
def _build_graph(self) -> GraphSpec:
"""Build the GraphSpec."""
return GraphSpec(
id="deep-research-agent-graph",
goal_id=self.goal.id,
version="1.0.0",
entry_node=self.entry_node,
entry_points=self.entry_points,
terminal_nodes=self.terminal_nodes,
pause_nodes=self.pause_nodes,
nodes=self.nodes,
edges=self.edges,
default_model=self.config.model,
max_tokens=self.config.max_tokens,
loop_config={
"max_iterations": 100,
"max_tool_calls_per_turn": 30,
"max_history_tokens": 32000,
},
conversation_mode="continuous",
identity_prompt=(
"You are a rigorous research agent. You search for information "
"from diverse, authoritative sources, analyze findings critically, "
"and produce well-cited reports. You never fabricate information — "
"every claim must trace back to a source you actually retrieved."
),
)
def _setup(self, mock_mode=False) -> None:
"""Set up the agent runtime with sessions, checkpoints, and logging."""
self._storage_path = Path.home() / ".hive" / "agents" / "deep_research_agent"
self._storage_path.mkdir(parents=True, exist_ok=True)
self._tool_registry = ToolRegistry()
mcp_config_path = Path(__file__).parent / "mcp_servers.json"
if mcp_config_path.exists():
self._tool_registry.load_mcp_config(mcp_config_path)
llm = None
if not mock_mode:
llm = LiteLLMProvider(
model=self.config.model,
api_key=self.config.api_key,
api_base=self.config.api_base,
)
tool_executor = self._tool_registry.get_executor()
tools = list(self._tool_registry.get_tools().values())
self._graph = self._build_graph()
checkpoint_config = CheckpointConfig(
enabled=True,
checkpoint_on_node_start=False,
checkpoint_on_node_complete=True,
checkpoint_max_age_days=7,
async_checkpoint=True,
)
entry_point_specs = [
EntryPointSpec(
id="default",
name="Default",
entry_node=self.entry_node,
trigger_type="manual",
isolation_level="shared",
)
]
self._agent_runtime = create_agent_runtime(
graph=self._graph,
goal=self.goal,
storage_path=self._storage_path,
entry_points=entry_point_specs,
llm=llm,
tools=tools,
tool_executor=tool_executor,
checkpoint_config=checkpoint_config,
)
async def start(self, mock_mode=False) -> None:
"""Set up and start the agent runtime."""
if self._agent_runtime is None:
self._setup(mock_mode=mock_mode)
if not self._agent_runtime.is_running:
await self._agent_runtime.start()
async def stop(self) -> None:
"""Stop the agent runtime and clean up."""
if self._agent_runtime and self._agent_runtime.is_running:
await self._agent_runtime.stop()
self._agent_runtime = None
async def trigger_and_wait(
self,
entry_point: str = "default",
input_data: dict | None = None,
timeout: float | None = None,
session_state: dict | None = None,
) -> ExecutionResult | None:
"""Execute the graph and wait for completion."""
if self._agent_runtime is None:
raise RuntimeError("Agent not started. Call start() first.")
return await self._agent_runtime.trigger_and_wait(
entry_point_id=entry_point,
input_data=input_data or {},
session_state=session_state,
)
async def run(
self, context: dict, mock_mode=False, session_state=None
) -> ExecutionResult:
"""Run the agent (convenience method for single execution)."""
await self.start(mock_mode=mock_mode)
try:
result = await self.trigger_and_wait(
"default", context, session_state=session_state
)
return result or ExecutionResult(success=False, error="Execution timeout")
finally:
await self.stop()
def info(self):
"""Get agent information."""
return {
"name": metadata.name,
"version": metadata.version,
"description": metadata.description,
"goal": {
"name": self.goal.name,
"description": self.goal.description,
},
"nodes": [n.id for n in self.nodes],
"edges": [e.id for e in self.edges],
"entry_node": self.entry_node,
"entry_points": self.entry_points,
"pause_nodes": self.pause_nodes,
"terminal_nodes": self.terminal_nodes,
"client_facing_nodes": [n.id for n in self.nodes if n.client_facing],
}
def validate(self):
"""Validate agent structure."""
errors = []
warnings = []
node_ids = {node.id for node in self.nodes}
for edge in self.edges:
if edge.source not in node_ids:
errors.append(f"Edge {edge.id}: source '{edge.source}' not found")
if edge.target not in node_ids:
errors.append(f"Edge {edge.id}: target '{edge.target}' not found")
if self.entry_node not in node_ids:
errors.append(f"Entry node '{self.entry_node}' not found")
for terminal in self.terminal_nodes:
if terminal not in node_ids:
errors.append(f"Terminal node '{terminal}' not found")
for ep_id, node_id in self.entry_points.items():
if node_id not in node_ids:
errors.append(
f"Entry point '{ep_id}' references unknown node '{node_id}'"
)
return {
"valid": len(errors) == 0,
"errors": errors,
"warnings": warnings,
}
# Create default instance
default_agent = DeepResearchAgent()
@@ -1,26 +0,0 @@
"""Runtime configuration."""
from dataclasses import dataclass
from framework.config import RuntimeConfig
default_config = RuntimeConfig()
@dataclass
class AgentMetadata:
name: str = "Deep Research Agent"
version: str = "1.0.0"
description: str = (
"Interactive research agent that rigorously investigates topics through "
"multi-source search, quality evaluation, and synthesis - with TUI conversation "
"at key checkpoints for user guidance and feedback."
)
intro_message: str = (
"Hi! I'm your deep research assistant. Tell me a topic and I'll investigate it "
"thoroughly — searching multiple sources, evaluating quality, and synthesizing "
"a comprehensive report. What would you like me to research?"
)
metadata = AgentMetadata()
@@ -1,213 +0,0 @@
"""Node definitions for Deep Research Agent."""
from framework.graph import NodeSpec
# Node 1: Intake (client-facing)
# Brief conversation to clarify what the user wants researched.
intake_node = NodeSpec(
id="intake",
name="Research Intake",
description="Discuss the research topic with the user, clarify scope, and confirm direction",
node_type="event_loop",
client_facing=True,
max_node_visits=0,
input_keys=["topic"],
output_keys=["research_brief"],
success_criteria=(
"The research brief is specific and actionable: it states the topic, "
"the key questions to answer, the desired scope, and depth."
),
system_prompt="""\
You are a research intake specialist. The user wants to research a topic.
Have a brief conversation to clarify what they need.
**STEP 1 - Read and respond (text only, NO tool calls):**
1. Read the topic provided
2. If it's vague, ask 1-2 clarifying questions (scope, angle, depth)
3. If it's already clear, confirm your understanding and ask the user to confirm
Keep it short. Don't over-ask.
**STEP 2 - After the user confirms, call set_output:**
- set_output("research_brief", "A clear paragraph describing exactly what to research, \
what questions to answer, what scope to cover, and how deep to go.")
""",
tools=[],
)
# Node 2: Research
# The workhorse — searches the web, fetches content, analyzes sources.
# One node with both tools avoids the context-passing overhead of 5 separate nodes.
research_node = NodeSpec(
id="research",
name="Research",
description="Search the web, fetch source content, and compile findings",
node_type="event_loop",
max_node_visits=0,
input_keys=["research_brief", "feedback"],
output_keys=["findings", "sources", "gaps"],
nullable_output_keys=["feedback"],
success_criteria=(
"Findings reference at least 3 distinct sources with URLs. "
"Key claims are substantiated by fetched content, not generated."
),
system_prompt="""\
You are a research agent. Given a research brief, find and analyze sources.
If feedback is provided, this is a follow-up round; focus on the gaps identified.
Work in phases:
1. **Search**: Use web_search with 3-5 diverse queries covering different angles.
Prioritize authoritative sources (.edu, .gov, established publications).
2. **Fetch**: Use web_scrape on the most promising URLs (aim for 5-8 sources).
Skip URLs that fail. Extract the substantive content.
3. **Analyze**: Review what you've collected. Identify key findings, themes,
and any contradictions between sources.
Important:
- Work in batches of 3-4 tool calls at a time, never more than 10 per turn
- After each batch, assess whether you have enough material
- Prefer quality over quantity: 5 good sources beat 15 thin ones
- Track which URL each finding comes from (you'll need citations later)
- Call set_output for each key in a SEPARATE turn (not in the same turn as other tool calls)
Context management:
- Your tool results are automatically saved to files. After compaction, the file \
references remain in the conversation; use load_data() to recover any content you need.
- Use append_data('research_notes.md', ...) to maintain a running log of key findings \
as you go. This survives compaction and helps the report node produce a detailed report.
When done, use set_output (one key at a time, separate turns):
- set_output("findings", "Structured summary: key findings with source URLs for each claim. \
Include themes, contradictions, and confidence levels.")
- set_output("sources", [{"url": "...", "title": "...", "summary": "..."}])
- set_output("gaps", "What aspects of the research brief are NOT well-covered yet, if any.")
""",
tools=[
"web_search",
"web_scrape",
"load_data",
"save_data",
"append_data",
"list_data_files",
],
)
# Node 3: Review (client-facing)
# Shows the user what was found and asks whether to dig deeper or proceed.
review_node = NodeSpec(
id="review",
name="Review Findings",
description="Present findings to user and decide whether to research more or write the report",
node_type="event_loop",
client_facing=True,
max_node_visits=0,
input_keys=["findings", "sources", "gaps", "research_brief"],
output_keys=["needs_more_research", "feedback"],
success_criteria=(
"The user has been presented with findings and has explicitly indicated "
"whether they want more research or are ready for the report."
),
system_prompt="""\
Present the research findings to the user clearly and concisely.
**STEP 1 - Present (your first message, text only, NO tool calls):**
1. **Summary** (2-3 sentences of what was found)
2. **Key Findings** (bulleted, with confidence levels)
3. **Sources Used** (count and quality assessment)
4. **Gaps** (what's still unclear or under-covered)
End by asking: Are they satisfied, or do they want deeper research? \
Should we proceed to writing the final report?
**STEP 2 - After the user responds, call set_output:**
- set_output("needs_more_research", "true") if they want more
- set_output("needs_more_research", "false") if they're satisfied
- set_output("feedback", "What the user wants explored further, or empty string")
""",
tools=[],
)
# Node 4: Report (client-facing)
# Writes an HTML report, serves the link to the user, and answers follow-ups.
report_node = NodeSpec(
id="report",
name="Write & Deliver Report",
description="Write a cited HTML report from the findings and present it to the user",
node_type="event_loop",
client_facing=True,
max_node_visits=0,
input_keys=["findings", "sources", "research_brief"],
output_keys=["delivery_status", "next_action"],
success_criteria=(
"An HTML report has been saved, the file link has been presented to the user, "
"and the user has indicated what they want to do next."
),
system_prompt="""\
Write a research report as an HTML file and present it to the user.
IMPORTANT: save_data requires TWO separate arguments: filename and data.
Call it like: save_data(filename="report.html", data="<html>...</html>")
Do NOT use _raw, do NOT nest arguments inside a JSON string.
**STEP 1 - Write and save the HTML report (tool calls, NO text to user yet):**
Build a clean HTML document. Keep the HTML concise; aim for clarity over length.
Use minimal embedded CSS (a few lines of style, not a full framework).
Report structure:
- Title & date
- Executive Summary (2-3 paragraphs)
- Key Findings (organized by theme, with [n] citation links)
- Analysis (synthesis, implications)
- Conclusion (key takeaways)
- References (numbered list with clickable URLs)
Requirements:
- Every factual claim must cite its source with [n] notation
- Be objective: present multiple viewpoints where sources disagree
- Answer the original research questions from the brief
- If findings appear incomplete or summarized, call list_data_files() and load_data() \
to access the detailed source material from the research phase. The research node's \
tool results and research_notes.md contain the full data.
Save the HTML:
save_data(filename="report.html", data="<html>...</html>")
Then get the clickable link:
serve_file_to_user(filename="report.html", label="Research Report")
If save_data fails, simplify and shorten the HTML, then retry.
**STEP 2 - Present the link to the user (text only, NO tool calls):**
Tell the user the report is ready and include the file:// URI from
serve_file_to_user so they can click it to open. Give a brief summary
of what the report covers. Ask if they have questions or want to continue.
**STEP 3 - After the user responds:**
- Answer any follow-up questions from the research material
- When the user is ready to move on, ask what they'd like to do next:
- Research a new topic?
- Dig deeper into the current topic?
- Then call set_output:
- set_output("delivery_status", "completed")
- set_output("next_action", "new_topic") if they want a new topic
- set_output("next_action", "more_research") if they want deeper research
""",
tools=[
"save_data",
"append_data",
"edit_data",
"serve_file_to_user",
"load_data",
"list_data_files",
],
)
__all__ = [
"intake_node",
"research_node",
"review_node",
"report_node",
]
@@ -1,640 +0,0 @@
---
name: hive-credentials
description: Set up and install credentials for an agent. Detects missing credentials from agent config, collects them from the user, and stores them securely in the local encrypted store at ~/.hive/credentials.
license: Apache-2.0
metadata:
author: hive
version: "2.3"
type: utility
---
# Setup Credentials
Interactive credential setup for agents with multiple authentication options. Detects what's missing, offers auth method choices, validates with health checks, and stores credentials securely.
## When to Use
- Before running or testing an agent for the first time
- When `AgentRunner.run()` fails with "missing required credentials"
- When a user asks to configure credentials for an agent
- After building a new agent that uses tools requiring API keys
## Workflow
### Step 1: Identify the Agent
Determine which agent needs credentials. The user will either:
- Name the agent directly (e.g., "set up credentials for hubspot-agent")
- Have an agent directory open (check `exports/` for agent dirs)
- Be working on an agent in the current session
Locate the agent's directory under `exports/{agent_name}/`.
### Step 2: Detect Missing Credentials
Use the `check_missing_credentials` MCP tool to detect what the agent needs and what's already configured. This tool loads the agent, inspects its required tools and node types, maps them to credentials via `CREDENTIAL_SPECS`, and checks both the encrypted store and environment variables.
```
check_missing_credentials(agent_path="exports/{agent_name}")
```
The tool returns a JSON response:
```json
{
  "agent": "exports/{agent_name}",
  "missing": [
    {
      "credential_name": "brave_search",
      "env_var": "BRAVE_SEARCH_API_KEY",
      "description": "Brave Search API key for web search",
      "help_url": "https://brave.com/search/api/",
      "tools": ["web_search"]
    }
  ],
  "available": [
    {
      "credential_name": "anthropic",
      "env_var": "ANTHROPIC_API_KEY",
      "source": "encrypted_store"
    }
  ],
  "total_missing": 1,
  "ready": false
}
```
**If `ready` is true (nothing missing):** Report all credentials as configured and skip Steps 3-5. Example:
```
All required credentials are already configured:
✓ anthropic (ANTHROPIC_API_KEY)
✓ brave_search (BRAVE_SEARCH_API_KEY)
Your agent is ready to run!
```
**If credentials are missing:** Continue to Step 3 with the `missing` list.
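The branch on `ready` can be sketched as follows. This is a minimal illustration, assuming the tool response arrives as a JSON string shaped like the example above; `summarize_check` is a hypothetical helper and not part of the framework.

```python
import json


def summarize_check(response_text: str) -> tuple[bool, list[str]]:
    """Return (ready, lines to show the user) for a check_missing_credentials response.

    Hypothetical helper: assumes the JSON shape shown in Step 2.
    """
    data = json.loads(response_text)

    if data.get("ready"):
        # Nothing missing: list what is already configured.
        lines = ["All required credentials are already configured:"]
        for cred in data.get("available", []):
            lines.append(f"✓ {cred['credential_name']} ({cred['env_var']})")
        return True, lines

    # Something is missing: list each credential and the tools that need it.
    missing = data.get("missing", [])
    lines = [f"{len(missing)} credential(s) still need setup:"]
    for cred in missing:
        tools = ", ".join(cred.get("tools", []))
        lines.append(f"- {cred['credential_name']} ({cred['env_var']}), used by: {tools}")
    return False, lines
```

When the first element is `False`, the entries of `missing` are what Step 3 iterates over.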
### Step 3: Present Auth Options for Each Missing Credential
For each missing credential, check what authentication methods are available:
```python
from aden_tools.credentials import CREDENTIAL_SPECS

spec = CREDENTIAL_SPECS.get("hubspot")
if spec:
    # Determine available auth options
    auth_options = []
    if spec.aden_supported:
        auth_options.append("aden")
    if spec.direct_api_key_supported:
        auth_options.append("direct")
    auth_options.append("custom")  # Always available

    # Get setup info
    setup_info = {
        "env_var": spec.env_var,
        "description": spec.description,
        "help_url": spec.help_url,
        "api_key_instructions": spec.api_key_instructions,
    }
```
Present the available options using AskUserQuestion:
```
Choose how to configure HUBSPOT_ACCESS_TOKEN:
1) Aden Platform (OAuth) (Recommended)
Secure OAuth2 flow via hive.adenhq.com
- Quick setup with automatic token refresh
- No need to manage API keys manually
2) Direct API Key
Enter your own API key manually
- Requires creating a HubSpot Private App
- Full control over scopes and permissions
3) Local Credential Setup (Advanced)
Programmatic configuration for CI/CD
- For automated deployments
- Requires manual API calls
```
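The mapping from spec flags to numbered menu entries can be sketched like this. `SpecFlags` is a hypothetical stand-in for the two boolean attributes read from a `CREDENTIAL_SPECS` entry, used so the example runs without `aden_tools` installed; `MENU_LABELS` and `build_menu` are likewise illustrative names.

```python
from dataclasses import dataclass


@dataclass
class SpecFlags:
    """Hypothetical stand-in for the two flags read from a credential spec."""

    aden_supported: bool
    direct_api_key_supported: bool


# Labels copied from the menu shown in Step 3.
MENU_LABELS = {
    "aden": "Aden Platform (OAuth) (Recommended)",
    "direct": "Direct API Key",
    "custom": "Local Credential Setup (Advanced)",
}


def build_menu(spec: SpecFlags) -> list[str]:
    """Return numbered menu lines for the auth methods a spec supports."""
    options = []
    if spec.aden_supported:
        options.append("aden")
    if spec.direct_api_key_supported:
        options.append("direct")
    options.append("custom")  # always offered
    return [f"{i}) {MENU_LABELS[key]}" for i, key in enumerate(options, start=1)]
```

Numbering is derived from the filtered list, so a credential without Aden support shows "Direct API Key" as option 1 rather than leaving a gap.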
### Step 4: Execute Auth Flow Based on User Choice
#### Prerequisite: Ensure HIVE_CREDENTIAL_KEY Is Available
Before storing any credentials, verify `HIVE_CREDENTIAL_KEY` is set (needed to encrypt/decrypt the local store). Check both the current session and shell config:
```bash
# Check current session
printenv HIVE_CREDENTIAL_KEY > /dev/null 2>&1 && echo "session: set" || echo "session: not set"
# Check shell config files
for f in ~/.zshrc ~/.bashrc ~/.profile; do [ -f "$f" ] && grep -q 'HIVE_CREDENTIAL_KEY' "$f" && echo "$f"; done
```
- **In current session** — proceed to store credentials
- **In shell config but NOT in current session** — run `source ~/.zshrc` (or `~/.bashrc`) first, then proceed
- **Not set anywhere** — `EncryptedFileStorage` will auto-generate one. After storing, tell the user to persist it: `export HIVE_CREDENTIAL_KEY="{generated_key}"` in their shell profile
> **⚠️ IMPORTANT: After adding `HIVE_CREDENTIAL_KEY` to the user's shell config, always display:**
> ```
> ⚠️ Environment variables were added to your shell config.
> Open a NEW TERMINAL for them to take effect outside this session.
> ```
#### Option 1: Aden Platform (OAuth)
This is the recommended flow for supported integrations (HubSpot, etc.).
**How Aden OAuth Works:**
The ADEN_API_KEY represents a user who has already completed OAuth authorization on Aden's platform. When users sign up and connect integrations on Aden, those OAuth tokens are stored server-side. Having an ADEN_API_KEY means:
1. User has an Aden account
2. User has already authorized integrations (HubSpot, etc.) via OAuth on Aden
3. We just need to sync those credentials down to the local credential store
**4.1a. Check for ADEN_API_KEY**
```python
import os
aden_key = os.environ.get("ADEN_API_KEY")
```
If it is not set, guide the user to get one from Aden (this is where they complete OAuth):
```python
from aden_tools.credentials import open_browser, get_aden_setup_url
# Open browser to Aden - user will sign up and connect integrations there
url = get_aden_setup_url() # https://hive.adenhq.com
success, msg = open_browser(url)
print("Please sign in to Aden and connect your integrations (HubSpot, etc.).")
print("Once done, copy your API key and return here.")
```
Ask user to provide the ADEN_API_KEY they received.
**4.1b. Save ADEN_API_KEY to Shell Config**
With user approval, persist ADEN_API_KEY to their shell config:
```python
from aden_tools.credentials import (
    detect_shell,
    add_env_var_to_shell_config,
    get_shell_source_command,
)

shell_type = detect_shell()  # 'bash', 'zsh', or 'unknown'

# Ask user for approval before modifying shell config
# If approved:
success, config_path = add_env_var_to_shell_config(
    "ADEN_API_KEY",
    user_provided_key,
    comment="Aden Platform (OAuth) API key",
)
if success:
    source_cmd = get_shell_source_command()
    print(f"Saved to {config_path}")
    print(f"Run: {source_cmd}")
```
> **⚠️ IMPORTANT: After adding `ADEN_API_KEY` to the user's shell config, always display:**
> ```
> ⚠️ Environment variables were added to your shell config.
> Open a NEW TERMINAL for them to take effect outside this session.
> ```
Also save to `~/.hive/configuration.json` for the framework:
```python
import json
from pathlib import Path

config_path = Path.home() / ".hive" / "configuration.json"
config = json.loads(config_path.read_text()) if config_path.exists() else {}

config["aden"] = {
    "api_key_configured": True,
    "api_url": "https://api.adenhq.com",
}

config_path.parent.mkdir(parents=True, exist_ok=True)
config_path.write_text(json.dumps(config, indent=2))
```
**4.1c. Sync Credentials from Aden Server**
Since the user has already authorized integrations on Aden, use the one-liner factory method:
```python
from core.framework.credentials import CredentialStore

# This single call handles everything:
# - Creates encrypted local storage at ~/.hive/credentials
# - Configures Aden client from ADEN_API_KEY env var
# - Syncs all credentials from Aden server automatically
store = CredentialStore.with_aden_sync(
    base_url="https://api.adenhq.com",
    auto_sync=True,  # Syncs on creation
)

# Check what was synced
synced = store.list_credentials()
print(f"Synced credentials: {synced}")

# If the required credential wasn't synced, the user hasn't authorized it on Aden yet
if "hubspot" not in synced:
    print("HubSpot not found in your Aden account.")
    print("Please visit https://hive.adenhq.com to connect HubSpot, then try again.")
```
For more control over the sync process:
```python
from core.framework.credentials import CredentialStore
from core.framework.credentials.aden import (
    AdenCredentialClient,
    AdenClientConfig,
    AdenSyncProvider,
)

# Create client (API key loaded from ADEN_API_KEY env var)
client = AdenCredentialClient(AdenClientConfig(
    base_url="https://api.adenhq.com",
))

# Create provider and store
provider = AdenSyncProvider(client=client)
store = CredentialStore.with_encrypted_storage()

# Manual sync
synced_count = provider.sync_all(store)
print(f"Synced {synced_count} credentials from Aden")
```
**4.1d. Run Health Check**
```python
from aden_tools.credentials import check_credential_health

# Get the token from the store
cred = store.get_credential("hubspot")
token = cred.keys["access_token"].value.get_secret_value()

result = check_credential_health("hubspot", token)
if result.valid:
    print("HubSpot credentials validated successfully!")
else:
    print(f"Validation failed: {result.message}")
    # Offer to retry the OAuth flow
```
#### Option 2: Direct API Key
For users who prefer manual API key management.
**4.2a. Show Setup Instructions**
```python
from aden_tools.credentials import CREDENTIAL_SPECS

spec = CREDENTIAL_SPECS.get("hubspot")
if spec and spec.api_key_instructions:
    print(spec.api_key_instructions)
    # Output:
    # To get a HubSpot Private App token:
    # 1. Go to HubSpot Settings > Integrations > Private Apps
    # 2. Click "Create a private app"
    # 3. Name your app (e.g., "Hive Agent")
    # ...
if spec and spec.help_url:
    print(f"More info: {spec.help_url}")
```
**4.2b. Collect API Key from User**
Use AskUserQuestion to securely collect the API key:
```
Please provide your HubSpot access token:
(This will be stored securely in ~/.hive/credentials)
```
**4.2c. Run Health Check Before Storing**
```python
from aden_tools.credentials import check_credential_health

result = check_credential_health("hubspot", user_provided_token)
if not result.valid:
    print(f"Warning: {result.message}")
    # Ask user if they want to:
    # 1. Try a different token
    # 2. Continue anyway (not recommended)
```
**4.2d. Store in Local Encrypted Store**
```python
from core.framework.credentials import CredentialStore, CredentialObject, CredentialKey
from pydantic import SecretStr

store = CredentialStore.with_encrypted_storage()

cred = CredentialObject(
    id="hubspot",
    name="HubSpot Access Token",
    keys={
        "access_token": CredentialKey(
            name="access_token",
            value=SecretStr(user_provided_token),
        )
    },
)
store.save_credential(cred)
```
**4.2e. Export to Current Session**
```bash
export HUBSPOT_ACCESS_TOKEN="the-value"
```
#### Option 3: Local Credential Setup (Advanced)
For programmatic/CI/CD setups.
**4.3a. Show Documentation**
```
For advanced credential management, you can use the CredentialStore API directly:
from core.framework.credentials import CredentialStore, CredentialObject, CredentialKey
from pydantic import SecretStr
store = CredentialStore.with_encrypted_storage()
cred = CredentialObject(
id="hubspot",
name="HubSpot Access Token",
keys={"access_token": CredentialKey(name="access_token", value=SecretStr("..."))}
)
store.save_credential(cred)
For CI/CD environments:
- Set HIVE_CREDENTIAL_KEY for encryption
- Pre-populate ~/.hive/credentials programmatically
- Or use environment variables directly (HUBSPOT_ACCESS_TOKEN)
Documentation: See core/framework/credentials/README.md
```
### Step 5: Record Configuration Method
Track which auth method was used for each credential in `~/.hive/configuration.json`:
```python
import json
from pathlib import Path
from datetime import datetime

config_path = Path.home() / ".hive" / "configuration.json"
config = json.loads(config_path.read_text()) if config_path.exists() else {}

if "credential_methods" not in config:
    config["credential_methods"] = {}

config["credential_methods"]["hubspot"] = {
    "method": "aden",  # or "direct" or "custom"
    "configured_at": datetime.now().isoformat(),
}

# Create ~/.hive first so the write also succeeds on a fresh machine
config_path.parent.mkdir(parents=True, exist_ok=True)
config_path.write_text(json.dumps(config, indent=2))
```
### Step 6: Verify All Credentials
Use the `verify_credentials` MCP tool to confirm everything is properly configured:
```
verify_credentials(agent_path="exports/{agent_name}")
```
The tool returns:
```json
{
"agent": "exports/{agent_name}",
"ready": true,
"missing_credentials": [],
"warnings": [],
"errors": []
}
```
If `ready` is true, report success. If `missing_credentials` is non-empty, identify what failed and loop back to Step 3 for the remaining credentials.
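The decision after verification can be sketched as a small function. This is illustrative only: `next_action` is a hypothetical helper, it assumes `missing_credentials` is a list of credential names, and the "inspect_errors" outcome is an assumption, since the text above does not say what to do when `ready` is false but nothing is listed as missing.

```python
def next_action(verify_result: dict) -> tuple[str, list[str]]:
    """Pick the follow-up for a verify_credentials response.

    Returns (action, details), where action is one of
    "done", "back_to_step_3", or "inspect_errors".
    """
    missing = list(verify_result.get("missing_credentials", []))
    if missing:
        # Loop back to Step 3 for whatever is still missing.
        return "back_to_step_3", missing
    if verify_result.get("ready"):
        return "done", []
    # Assumed fallback: not ready, nothing missing, so surface the errors.
    return "inspect_errors", list(verify_result.get("errors", []))
```

Checking `missing_credentials` before `ready` keeps the loop conservative: a response that lists missing credentials is never reported as success.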
## Health Check Reference
Health checks validate credentials by making lightweight API calls:
| Credential | Endpoint | What It Checks |
| --------------- | --------------------------------------- | --------------------------------- |
| `anthropic` | `POST /v1/messages` | API key validity |
| `brave_search` | `GET /res/v1/web/search?q=test&count=1` | API key validity |
| `google_search` | `GET /customsearch/v1?q=test&num=1` | API key + CSE ID validity |
| `github` | `GET /user` | Token validity, user identity |
| `hubspot` | `GET /crm/v3/objects/contacts?limit=1` | Bearer token validity, CRM scopes |
| `resend` | `GET /domains` | API key validity |
```python
from aden_tools.credentials import check_credential_health, HealthCheckResult
result: HealthCheckResult = check_credential_health("hubspot", token_value)
# result.valid: bool
# result.message: str
# result.details: dict (status_code, rate_limited, etc.)
```
## Encryption Key (HIVE_CREDENTIAL_KEY)
The local encrypted store requires `HIVE_CREDENTIAL_KEY` to encrypt/decrypt credentials.
- If the user doesn't have one, `EncryptedFileStorage` will auto-generate one and log it
- The user MUST persist this key (e.g., in `~/.bashrc`/`~/.zshrc` or a secrets manager)
- Without this key, stored credentials cannot be decrypted
**Shell config rule:** Only TWO keys belong in shell config (`~/.zshrc`/`~/.bashrc`):
- `HIVE_CREDENTIAL_KEY` — encryption key for the credential store
- `ADEN_API_KEY` — Aden platform auth key (needed before the store can sync)
All other API keys (Brave, Google, HubSpot, etc.) must go in the encrypted store only. **Never offer to add them to shell config.**
If `HIVE_CREDENTIAL_KEY` is not set:
1. Let the store generate one
2. Tell the user to save it: `export HIVE_CREDENTIAL_KEY="{generated_key}"`
3. Recommend adding it to `~/.bashrc` or their shell profile
## Security Rules
- **NEVER** log, print, or echo credential values in tool output
- **NEVER** store credentials in plaintext files, git-tracked files, or agent configs
- **NEVER** hardcode credentials in source code
- **NEVER** offer to save API keys to shell config (`~/.zshrc`/`~/.bashrc`) — the **only** keys that belong in shell config are `HIVE_CREDENTIAL_KEY` and `ADEN_API_KEY`. All other credentials (Brave, Google, HubSpot, GitHub, Resend, etc.) go in the encrypted store only.
- **ALWAYS** use `SecretStr` from Pydantic when handling credential values in Python
- **ALWAYS** use the local encrypted store (`~/.hive/credentials`) for persistence
- **ALWAYS** run health checks before storing credentials (when possible)
- **ALWAYS** verify credentials were stored by re-running validation, not by reading them back
- When modifying `~/.bashrc` or `~/.zshrc`, confirm with the user first
## Credential Sources Reference
All credential specs are defined in `tools/src/aden_tools/credentials/`:
| File | Category | Credentials | Aden Supported |
| ----------------- | ------------- | --------------------------------------------- | -------------- |
| `llm.py` | LLM Providers | `anthropic` | No |
| `search.py` | Search Tools | `brave_search`, `google_search`, `google_cse` | No |
| `email.py` | Email | `resend` | No |
| `integrations.py` | Integrations | `github`, `hubspot`, `google_calendar_oauth` | No / Yes |
**Note:** Additional LLM providers (Cerebras, Groq, OpenAI) are handled by LiteLLM via environment
variables (`CEREBRAS_API_KEY`, `GROQ_API_KEY`, `OPENAI_API_KEY`) but are not yet in CREDENTIAL_SPECS.
Add them to `llm.py` as needed.
To check what's registered:
```python
from aden_tools.credentials import CREDENTIAL_SPECS
for name, spec in CREDENTIAL_SPECS.items():
    print(f"{name}: aden={spec.aden_supported}, direct={spec.direct_api_key_supported}")
```
## Migration: CredentialManager → CredentialStore
**CredentialManager is deprecated.** Use CredentialStore instead.
| Old (Deprecated) | New (Recommended) |
| ----------------------------------------- | -------------------------------------------------------------------- |
| `CredentialManager()` | `CredentialStore.with_encrypted_storage()` |
| `creds.get("hubspot")` | `store.get("hubspot")` or `store.get_key("hubspot", "access_token")` |
| `creds.validate_for_tools(tools)` | Use `store.is_available(cred_id)` per credential |
| `creds.get_auth_options("hubspot")` | Check `CREDENTIAL_SPECS["hubspot"].aden_supported` |
| `creds.get_setup_instructions("hubspot")` | Access `CREDENTIAL_SPECS["hubspot"]` directly |
**Why migrate?**
- **CredentialStore** supports encrypted storage, multi-key credentials, template resolution, and automatic token refresh
- **CredentialManager** only reads from environment variables and .env files (no encryption, no refresh)
- **CredentialStoreAdapter** exists for backward compatibility during migration
```python
# Old way (deprecated)
from aden_tools.credentials import CredentialManager
creds = CredentialManager()
token = creds.get("hubspot")
# New way (recommended)
from core.framework.credentials import CredentialStore
store = CredentialStore.with_encrypted_storage()
token = store.get("hubspot")
# With Aden sync (recommended for OAuth integrations)
store = CredentialStore.with_aden_sync()
token = store.get_key("hubspot", "access_token")
```
## Example Session
```
User: /hive-credentials for my research-agent
Agent: Let me check what credentials your research-agent needs.
[Calls check_missing_credentials(agent_path="exports/research-agent")]
→ Returns:
available: anthropic (encrypted_store), brave_search (encrypted_store)
missing: google_search (GOOGLE_API_KEY), google_cse (GOOGLE_CSE_ID)
ready: false
Agent: 2 of 4 required credentials are already configured. Only Google Custom
Search needs setup (2 values).
--- Setting up Google Custom Search (google_search + google_cse) ---
This requires two values that work together.
[Checks HIVE_CREDENTIAL_KEY before storing]
$ printenv HIVE_CREDENTIAL_KEY > /dev/null 2>&1 && echo "set" || echo "not set"
set
First, the Google API Key:
1. Go to https://console.cloud.google.com/apis/credentials
2. Create a new project (or select an existing one)
3. Enable the "Custom Search API" from the API Library
4. Go to Credentials > Create Credentials > API Key
5. Copy the generated API key
[AskUserQuestion: "Please provide your Google API key:"]
[User provides key]
Now, the Custom Search Engine ID:
1. Go to https://programmablesearchengine.google.com/controlpanel/all
2. Click "Add" to create a new search engine
3. Under "What to search", select "Search the entire web"
4. Give your search engine a name
5. Click "Create"
6. Copy the Search Engine ID (cx value)
[AskUserQuestion: "Please provide your Google CSE ID:"]
[User provides ID]
[Runs health check with both values - GET /customsearch/v1?q=test&num=1 → 200 OK]
[Stores both in local encrypted store, exports to env]
✓ Google Custom Search credentials valid
[Calls verify_credentials(agent_path="exports/research-agent")]
→ Returns: ready: true, missing_credentials: []
All credentials are now configured:
✓ anthropic (ANTHROPIC_API_KEY) — already in encrypted store
✓ brave_search (BRAVE_SEARCH_API_KEY) — already in encrypted store
✓ google_search (GOOGLE_API_KEY) — stored in encrypted store
✓ google_cse (GOOGLE_CSE_ID) — stored in encrypted store
┌─────────────────────────────────────────────────────────────────────────────┐
│ ✅ CREDENTIALS CONFIGURED │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ OPEN A NEW TERMINAL before running commands below. │
│ Environment variables were saved to your shell config but │
│ only take effect in new terminal sessions. │
│ │
│ NEXT STEPS: │
│ │
│ 1. RUN YOUR AGENT: │
│ │
│ hive tui │
│ │
│ 2. IF YOU ENCOUNTER ISSUES, USE THE DEBUGGER: │
│ │
│ /hive-debugger │
│ │
│ The debugger analyzes runtime logs, identifies retry loops, tool │
│ failures, stalled execution, and provides actionable fix suggestions. │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
-385
@@ -1,385 +0,0 @@
---
name: hive-patterns
description: Best practices, patterns, and examples for building goal-driven agents. Includes client-facing interaction, feedback edges, judge patterns, fan-out/fan-in, context management, and anti-patterns.
license: Apache-2.0
metadata:
  author: hive
  version: "2.0"
  type: reference
  part_of: hive
---
# Building Agents - Patterns & Best Practices
Design patterns, examples, and best practices for building robust goal-driven agents.
**Prerequisites:** Complete agent structure using `hive-create`.
## Practical Example: Hybrid Workflow
How to build a node using both direct file writes and optional MCP validation:
```python
# 1. WRITE TO FILE FIRST (Primary - makes it visible)
node_code = '''
search_node = NodeSpec(
id="search-web",
node_type="event_loop",
input_keys=["query"],
output_keys=["search_results"],
system_prompt="Search the web for: {query}. Use web_search, then call set_output to store results.",
tools=["web_search"],
)
'''
Edit(
file_path="exports/research_agent/nodes/__init__.py",
old_string="# Nodes will be added here",
new_string=node_code
)
# 2. OPTIONALLY VALIDATE WITH MCP (Secondary - bookkeeping)
validation = mcp__agent-builder__test_node(
node_id="search-web",
test_input='{"query": "python tutorials"}',
mock_llm_response='{"search_results": [...mock results...]}'
)
```
**User experience:**
- Immediately sees node in their editor (from step 1)
- Gets validation feedback (from step 2)
- Can edit the file directly if needed
## Multi-Turn Interaction Patterns
For agents needing multi-turn conversations with users, use `client_facing=True` on event_loop nodes.
### Client-Facing Nodes
A client-facing node streams LLM output to the user and blocks for user input between conversational turns. This replaces the old pause/resume pattern.
```python
# Client-facing node with STEP 1/STEP 2 prompt pattern
intake_node = NodeSpec(
id="intake",
name="Intake",
description="Gather requirements from the user",
node_type="event_loop",
client_facing=True,
input_keys=["topic"],
output_keys=["research_brief"],
system_prompt="""\
You are an intake specialist.
**STEP 1 — Read and respond (text only, NO tool calls):**
1. Read the topic provided
2. If it's vague, ask 1-2 clarifying questions
3. If it's clear, confirm your understanding
**STEP 2 — After the user confirms, call set_output:**
- set_output("research_brief", "Clear description of what to research")
""",
)
# Internal node runs without user interaction
research_node = NodeSpec(
id="research",
name="Research",
description="Search and analyze sources",
node_type="event_loop",
input_keys=["research_brief"],
output_keys=["findings", "sources"],
system_prompt="Research the topic using web_search and web_scrape...",
tools=["web_search", "web_scrape", "load_data", "save_data"],
)
```
**How it works:**
- Client-facing nodes stream LLM text to the user and block for input after each response
- User input is injected via `node.inject_event(text)`
- When the LLM calls `set_output` to produce structured outputs, the judge evaluates and ACCEPTs
- Internal nodes (non-client-facing) run their entire loop without blocking
- `set_output` is a synthetic tool — a turn with only `set_output` calls (no real tools) triggers user input blocking
**STEP 1/STEP 2 pattern:** Always structure client-facing prompts with explicit phases. STEP 1 is text-only conversation. STEP 2 calls `set_output` after user confirmation. This prevents the LLM from calling `set_output` prematurely before the user responds.
### When to Use client_facing
| Scenario | client_facing | Why |
| ----------------------------------- | :-----------: | ---------------------- |
| Gathering user requirements | Yes | Need user input |
| Human review/approval checkpoint | Yes | Need human decision |
| Data processing (scanning, scoring) | No | Runs autonomously |
| Report generation | No | No user input needed |
| Final confirmation before action | Yes | Need explicit approval |
> **Legacy Note:** The `pause_nodes` / `entry_points` pattern still works for backward compatibility but `client_facing=True` is preferred for new agents.
## Edge-Based Routing and Feedback Loops
### Conditional Edge Routing
Multiple conditional edges from the same source replace the old `router` node type. Each edge checks a condition on the node's output.
```python
# Node with mutually exclusive outputs
review_node = NodeSpec(
id="review",
name="Review",
node_type="event_loop",
client_facing=True,
output_keys=["approved_contacts", "redo_extraction"],
nullable_output_keys=["approved_contacts", "redo_extraction"],
max_node_visits=3,
system_prompt="Present the contact list to the operator. If they approve, call set_output('approved_contacts', ...). If they want changes, call set_output('redo_extraction', 'true').",
)
# Forward edge (positive priority, evaluated first)
EdgeSpec(
id="review-to-campaign",
source="review",
target="campaign-builder",
condition=EdgeCondition.CONDITIONAL,
condition_expr="output.get('approved_contacts') is not None",
priority=1,
)
# Feedback edge (negative priority, evaluated after forward edges)
EdgeSpec(
id="review-feedback",
source="review",
target="extractor",
condition=EdgeCondition.CONDITIONAL,
condition_expr="output.get('redo_extraction') is not None",
priority=-1,
)
```
**Key concepts:**
- `nullable_output_keys`: Lists output keys that may remain unset. The node sets exactly one of the mutually exclusive keys per execution.
- `max_node_visits`: Must be >1 on the feedback target (extractor) so it can re-execute. Default is 1.
- `priority`: Positive = forward edge (evaluated first). Negative = feedback edge. The executor tries forward edges first; if none match, falls back to feedback edges.
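The priority rule can be sketched in a few lines. This is an illustration only, not the framework's executor: `Edge` and `pick_next_edge` are hypothetical names, and conditions are plain Python callables here rather than `condition_expr` strings.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Edge:
    """Hypothetical stand-in for EdgeSpec, reduced to what routing needs."""
    id: str
    target: str
    priority: int
    condition: Callable[[dict], bool]


def pick_next_edge(edges: list, output: dict) -> Optional[Edge]:
    """Return the first matching edge, trying higher priorities first."""
    for edge in sorted(edges, key=lambda e: e.priority, reverse=True):
        if edge.condition(output):
            return edge
    return None


edges = [
    # Feedback edge: negative priority, only reached if no forward edge matches
    Edge("review-feedback", "extractor", -1,
         lambda out: out.get("redo_extraction") is not None),
    # Forward edge: positive priority, evaluated first
    Edge("review-to-campaign", "campaign-builder", 1,
         lambda out: out.get("approved_contacts") is not None),
]

approved = pick_next_edge(edges, {"approved_contacts": ["a@example.com"]})
rejected = pick_next_edge(edges, {"redo_extraction": "true"})
```

Sorting by descending priority means the declaration order of the edges does not matter, which is why the feedback edge can be listed first in this sketch.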
### Routing Decision Table
| Pattern | Old Approach | New Approach |
| ---------------------- | ----------------------- | --------------------------------------------- |
| Conditional branching | `router` node | Conditional edges with `condition_expr` |
| Binary approve/reject | `pause_nodes` + resume | `client_facing=True` + `nullable_output_keys` |
| Loop-back on rejection | Manual entry_points | Feedback edge with `priority=-1` |
| Multi-way routing | Router with routes dict | Multiple conditional edges with priorities |
## Judge Patterns
**Core Principle: The judge is the SOLE mechanism for acceptance decisions.** Never add ad-hoc framework gating to compensate for LLM behavior. If the LLM calls `set_output` prematurely, fix the system prompt or use a custom judge. Anti-patterns to avoid:
- Output rollback logic
- `_user_has_responded` flags
- Premature set_output rejection
- Interaction protocol injection into system prompts
Judges control when an event_loop node's loop exits. Choose based on validation needs.
### Implicit Judge (Default)
When no judge is configured, the implicit judge ACCEPTs when:
- The LLM finishes its response with no tool calls
- All required output keys have been set via `set_output`
Best for simple nodes where "all outputs set" is sufficient validation.
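The two conditions above can be read as a small predicate. The sketch below is an assumption about how they combine, not the framework's code: `implicit_accept` is a hypothetical name, and treating `nullable_output_keys` as "not required" is inferred from the routing section rather than confirmed by the judge implementation.

```python
def implicit_accept(tool_calls, outputs, output_keys, nullable_output_keys=()):
    """Return True when both implicit-judge conditions hold (illustrative)."""
    if tool_calls:
        # The LLM is still calling tools, so the turn is not finished
        return False
    # Assumption: nullable keys are allowed to stay unset
    required = [k for k in output_keys if k not in nullable_output_keys]
    return all(outputs.get(k) is not None for k in required)


done = implicit_accept([], {"findings": "...", "sources": ["a"]},
                       ["findings", "sources"])
still_missing = implicit_accept([], {"findings": "..."},
                                ["findings", "sources"])
still_working = implicit_accept(["web_search"], {"findings": "...", "sources": ["a"]},
                                ["findings", "sources"])
```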
### SchemaJudge
Validates outputs against a Pydantic model. Use when you need structural validation.
```python
from pydantic import BaseModel, ValidationError

# JudgeVerdict is provided by the framework (import not shown)


class ScannerOutput(BaseModel):
    github_users: list[dict]  # Must be a list of user objects


class SchemaJudge:
    def __init__(self, output_model: type[BaseModel]):
        self._model = output_model

    async def evaluate(self, context: dict) -> JudgeVerdict:
        # Ask for any missing keys before attempting schema validation
        missing = context.get("missing_keys", [])
        if missing:
            return JudgeVerdict(
                action="RETRY",
                feedback=f"Missing output keys: {missing}. Use set_output to provide them.",
            )
        try:
            self._model.model_validate(context["output_accumulator"])
            return JudgeVerdict(action="ACCEPT")
        except ValidationError as e:
            # Feed the validation error back so the LLM can correct its output
            return JudgeVerdict(action="RETRY", feedback=str(e))
```
### When to Use Which Judge
| Judge | Use When | Example |
| --------------- | ------------------------------------- | ---------------------- |
| Implicit (None) | Output keys are sufficient validation | Simple data extraction |
| SchemaJudge | Need structural validation of outputs | API response parsing |
| Custom | Domain-specific validation logic | Score must be 0.0-1.0 |
## Fan-Out / Fan-In (Parallel Execution)
Multiple ON_SUCCESS edges from the same source trigger parallel execution. All branches run concurrently via `asyncio.gather()`.
```python
# Scanner fans out to Profiler and Scorer in parallel
EdgeSpec(id="scanner-to-profiler", source="scanner", target="profiler",
condition=EdgeCondition.ON_SUCCESS)
EdgeSpec(id="scanner-to-scorer", source="scanner", target="scorer",
condition=EdgeCondition.ON_SUCCESS)
# Both fan in to Extractor
EdgeSpec(id="profiler-to-extractor", source="profiler", target="extractor",
condition=EdgeCondition.ON_SUCCESS)
EdgeSpec(id="scorer-to-extractor", source="scorer", target="extractor",
condition=EdgeCondition.ON_SUCCESS)
```
**Requirements:**
- Parallel event_loop nodes must have **disjoint output_keys** (no key written by both)
- Only one parallel branch may contain a `client_facing` node
- Fan-in node receives outputs from all completed branches in shared memory
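The disjoint-keys requirement is easy to check before running a graph. The helper below is a hypothetical sketch (the name `find_output_key_conflicts` and the plain-dict input are assumptions, not part of the framework); it reports every output key claimed by more than one parallel branch.

```python
def find_output_key_conflicts(branches: dict) -> dict:
    """Map each output key written by more than one branch to its writers."""
    writers = {}
    for node_id, keys in branches.items():
        for key in keys:
            writers.setdefault(key, []).append(node_id)
    return {key: nodes for key, nodes in writers.items() if len(nodes) > 1}


# Disjoint keys: safe to run in parallel
ok = find_output_key_conflicts({
    "profiler": ["profiles"],
    "scorer": ["scores"],
})

# Both branches write "summary": not safe
clash = find_output_key_conflicts({
    "profiler": ["profiles", "summary"],
    "scorer": ["scores", "summary"],
})
```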
## Context Management Patterns
### Tiered Compaction
EventLoopNode automatically manages context window usage with tiered compaction:
1. **Pruning** — Old tool results replaced with compact placeholders (zero-cost, no LLM call)
2. **Normal compaction** — LLM summarizes older messages
3. **Aggressive compaction** — Keeps only recent messages + summary
4. **Emergency** — Hard reset with tool history preservation
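To make the first tier concrete, here is a rough sketch of pruning. It is not EventLoopNode's implementation: the message shape (`role` / `content` dicts), the `keep_last` parameter, and the placeholder wording are all assumptions chosen for illustration.

```python
def prune_tool_results(messages, keep_last=2):
    """Replace all but the newest `keep_last` tool results with placeholders."""
    tool_indexes = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    # Guard keep_last=0 explicitly, because a [:-0] slice would be empty
    to_prune = set(tool_indexes[:-keep_last]) if keep_last else set(tool_indexes)
    pruned = []
    for i, message in enumerate(messages):
        if i in to_prune:
            placeholder = f"[tool result pruned: {len(message['content'])} chars]"
            pruned.append({**message, "content": placeholder})
        else:
            pruned.append(message)
    return pruned


messages = [
    {"role": "user", "content": "find sources"},
    {"role": "tool", "content": "A" * 500},
    {"role": "assistant", "content": "reading results"},
    {"role": "tool", "content": "B" * 300},
    {"role": "tool", "content": "C" * 100},
]
pruned = prune_tool_results(messages, keep_last=2)
```

No LLM call is involved, which is what makes this tier effectively free compared with the summarizing tiers.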
### Spillover Pattern
The framework automatically truncates large tool results and saves full content to a spillover directory. The LLM receives a truncation message with instructions to use `load_data` to read the full result.
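A minimal sketch of the spillover idea, under stated assumptions: `spill_if_large`, the `max_chars` threshold, and the wording of the truncation message are hypothetical, and the real framework chooses the spillover directory itself.

```python
import tempfile
from pathlib import Path


def spill_if_large(result: str, spill_dir: Path, filename: str,
                   max_chars: int = 200) -> str:
    """Return the result unchanged if small, else save it and return a preview."""
    if len(result) <= max_chars:
        return result
    spill_dir.mkdir(parents=True, exist_ok=True)
    (spill_dir / filename).write_text(result, encoding="utf-8")
    return (
        result[:max_chars]
        + f"\n[truncated: {len(result)} chars total; "
        + f"call load_data(filename={filename!r}) to read the full result]"
    )


spill_dir = Path(tempfile.mkdtemp()) / "spillover"
short = spill_if_large("ok", spill_dir, "small.txt")
long_text = "x" * 1000
preview = spill_if_large(long_text, spill_dir, "big.txt")
```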
For explicit data management, use the data tools (real MCP tools, not synthetic):
```python
# save_data, load_data, list_data_files, serve_file_to_user are real MCP tools
# data_dir is auto-injected by the framework — the LLM never sees it
# Saving large results
save_data(filename="sources.json", data=large_json_string)
# Reading with pagination (line-based offset/limit)
load_data(filename="sources.json", offset=0, limit=50)
# Listing available files
list_data_files()
# Serving a file to the user as a clickable link
serve_file_to_user(filename="report.html", label="Research Report")
```
Add data tools to nodes that handle large tool results:
```python
research_node = NodeSpec(
...
tools=["web_search", "web_scrape", "load_data", "save_data", "list_data_files"],
)
```
`data_dir` is a framework context parameter — auto-injected at call time. `GraphExecutor.execute()` sets it per-execution via `ToolRegistry.set_execution_context(data_dir=...)` (using `contextvars` for concurrency safety), ensuring it matches the session-scoped spillover directory.
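The concurrency-safety claim rests on how `contextvars` behaves under asyncio: each task gets its own copy of the context. The sketch below demonstrates that property with the standard library only; `data_dir_var` and `run_execution` are illustrative names, not the framework's `ToolRegistry` API.

```python
import asyncio
from contextvars import ContextVar

data_dir_var = ContextVar("data_dir")


async def run_execution(data_dir: str) -> str:
    # Each execution sets its own value at the start
    data_dir_var.set(data_dir)
    # Yield control so the two executions interleave
    await asyncio.sleep(0)
    # The value read back is still this execution's own
    return data_dir_var.get()


async def main():
    return await asyncio.gather(
        run_execution("/tmp/session-a"),
        run_execution("/tmp/session-b"),
    )


results = asyncio.run(main())
```

A module-level global would be overwritten by whichever execution set it last; the context variable keeps the two sessions apart.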
## Anti-Patterns
### What NOT to Do
- **Don't rely on `export_graph`** — Write files immediately, not at end
- **Don't hide code in session** — Write to files as components are approved
- **Don't wait to write files** — Agent visible from first step
- **Don't batch everything** — Write incrementally, one component at a time
- **Don't create too many thin nodes** — Prefer fewer, richer nodes (see below)
- **Don't add framework gating for LLM behavior** — Fix prompts or use judges instead
### Fewer, Richer Nodes
A common mistake is splitting work into too many small single-purpose nodes. Each node boundary requires serializing outputs, losing in-context information, and adding edge complexity.
| Bad (8 thin nodes) | Good (4 rich nodes) |
| ------------------- | ----------------------------------- |
| parse-query | intake (client-facing) |
| search-sources | research (search + fetch + analyze) |
| fetch-content | review (client-facing) |
| evaluate-sources | report (write + deliver) |
| synthesize-findings | |
| write-report | |
| quality-check | |
| save-report | |
**Why fewer nodes are better:**
- The LLM retains full context of its work within a single node
- A research node that searches, fetches, and analyzes keeps all source material in its conversation history
- Fewer edges means simpler graph and fewer failure points
- Data tools (`save_data`/`load_data`) handle context window limits within a single node
### MCP Tools - Correct Usage
**MCP tools OK for:**
- `test_node` — Validate node configuration with mock inputs
- `validate_graph` — Check graph structure
- `configure_loop` — Set event loop parameters
- `create_session` — Track session state for bookkeeping
**Just don't:** Use MCP as the primary construction method or rely on export_graph
## Error Handling Patterns
### Graceful Failure with Fallback
```python
edges = [
# Success path
EdgeSpec(id="api-success", source="api-call", target="process-results",
condition=EdgeCondition.ON_SUCCESS),
# Fallback on failure
EdgeSpec(id="api-to-fallback", source="api-call", target="fallback-cache",
condition=EdgeCondition.ON_FAILURE, priority=1),
# Report if fallback also fails
EdgeSpec(id="fallback-to-error", source="fallback-cache", target="report-error",
condition=EdgeCondition.ON_FAILURE, priority=1),
]
```
## Handoff to Testing
When the agent is complete, transition to the testing phase:
### Pre-Testing Checklist
- [ ] Agent structure validates: `uv run python -m agent_name validate`
- [ ] All nodes defined in `nodes/__init__.py`
- [ ] All edges connect valid nodes with correct priorities
- [ ] Feedback edge targets have `max_node_visits > 1`
- [ ] Client-facing nodes have meaningful system prompts
- [ ] Agent can be imported: `from exports.agent_name import default_agent`
## Related Skills
- **hive-concepts** — Fundamental concepts (node types, edges, event loop architecture)
- **hive-create** — Step-by-step building process
- **hive-test** — Test and validate agents
- **hive** — Complete workflow orchestrator
---
**Remember: Agent is actively constructed, visible the whole time. No hidden state. No surprise exports. Just transparent, incremental file building.**
-940
@@ -1,940 +0,0 @@
---
name: hive-test
description: Iterative agent testing with session recovery. Execute, analyze, fix, resume from checkpoints. Use when testing an agent, debugging test failures, or verifying fixes without re-running from scratch.
---
# Agent Testing
Test agents iteratively: execute, analyze failures, fix, resume from checkpoint, repeat.
## When to Use
- Testing a newly built agent against its goal
- Debugging a failing agent iteratively
- Verifying fixes without re-running expensive early nodes
- Running final regression tests before deployment
## Prerequisites
1. Agent package at `exports/{agent_name}/` (built with `/hive-create`)
2. Credentials configured (`/hive-credentials`)
3. `ANTHROPIC_API_KEY` set (or appropriate LLM provider key)
**Path distinction** (critical — don't confuse these):
- `exports/{agent_name}/` — agent source code (edit here)
- `~/.hive/agents/{agent_name}/` — runtime data: sessions, checkpoints, logs (read here)
---
## The Iterative Test Loop
This is the core workflow. Don't re-run the entire agent when a late node fails — analyze, fix, and resume from the last clean checkpoint.
```
┌──────────────────────────────────────┐
│ PHASE 1: Generate Test Scenarios     │
│ Goal → synthetic test inputs + tests │
└──────────────┬───────────────────────┘
               ↓
┌──────────────────────────────────────┐
│ PHASE 2: Execute                     │◄────────────────┐
│ Run agent (CLI or pytest)            │                 │
└──────────────┬───────────────────────┘                 │
               ↓                                         │
            Pass? ──yes──► PHASE 6: Final Verification   │
               │                                         │
              no                                         │
               ↓                                         │
┌──────────────────────────────────────┐                 │
│ PHASE 3: Analyze                     │                 │
│ Session + runtime logs + checkpoints │                 │
└──────────────┬───────────────────────┘                 │
               ↓                                         │
┌──────────────────────────────────────┐                 │
│ PHASE 4: Fix                         │                 │
│ Prompt / code / graph / goal         │                 │
└──────────────┬───────────────────────┘                 │
               ↓                                         │
┌──────────────────────────────────────┐                 │
│ PHASE 5: Recover & Resume            │─────────────────┘
│ Checkpoint resume OR fresh re-run    │
└──────────────────────────────────────┘
```
---
### Phase 1: Generate Test Scenarios
Create synthetic tests from the agent's goal, constraints, and success criteria.
#### Step 1a: Read the goal
```python
# Read goal from agent.py
Read(file_path="exports/{agent_name}/agent.py")
# Extract the Goal definition and convert to JSON string
```
#### Step 1b: Get test guidelines
```python
# Get constraint test guidelines
generate_constraint_tests(
goal_id="your-goal-id",
goal_json='{"id": "...", "constraints": [...]}',
agent_path="exports/{agent_name}"
)
# Get success criteria test guidelines
generate_success_tests(
goal_id="your-goal-id",
goal_json='{"id": "...", "success_criteria": [...]}',
node_names="intake,research,review,report",
tool_names="web_search,web_scrape",
agent_path="exports/{agent_name}"
)
```
These return `file_header`, `test_template`, `constraints_formatted`/`success_criteria_formatted`, and `test_guidelines`. They do NOT generate test code — you write the tests.
#### Step 1c: Write tests
```python
Write(
file_path=result["output_file"],
content=result["file_header"] + "\n\n" + your_test_code
)
```
#### Test writing rules
- Every test MUST be `async` with `@pytest.mark.asyncio`
- Every test MUST accept `runner, auto_responder, mock_mode` fixtures
- Use `await auto_responder.start()` before running, `await auto_responder.stop()` in `finally`
- Use `await runner.run(input_dict)` — this goes through AgentRunner → AgentRuntime → ExecutionStream
- Access output via `result.output.get("key")` — NEVER `result.output["key"]`
- `result.success=True` means no exception, NOT goal achieved — always check output
- Write 8-15 tests total, not 30+
- Each real test costs ~3 seconds + LLM tokens
- NEVER use `default_agent.run()` — it bypasses the runtime (no sessions, no logs, client-facing nodes hang)
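The rules above translate into a test body shaped roughly like the sketch below. To keep it runnable without the framework, `FakeRunner` and `FakeAutoResponder` are hypothetical stand-ins for the real `runner` and `auto_responder` fixtures, and the `"report"` output key is an assumed example. In a real suite the function would be named `test_*` and decorated with `@pytest.mark.asyncio`.

```python
import asyncio
from types import SimpleNamespace


class FakeAutoResponder:
    """Stand-in for the auto_responder fixture (assumption for this sketch)."""
    def __init__(self):
        self.running = False

    async def start(self):
        self.running = True

    async def stop(self):
        self.running = False


class FakeRunner:
    """Stand-in for the runner fixture; returns a canned result."""
    async def run(self, input_dict):
        return SimpleNamespace(
            success=True,
            output={"report": f"Report on {input_dict['query']}"},
        )


async def scenario_report_produced(runner, auto_responder, mock_mode):
    await auto_responder.start()
    try:
        result = await runner.run({"query": "test topic"})
    finally:
        # Always stop the responder, even if the run raises
        await auto_responder.stop()
    # success only means "no exception", so check the output as well
    assert result.success
    report = result.output.get("report")  # .get(), never ["report"]
    assert report, "agent finished without producing a report"


responder = FakeAutoResponder()
asyncio.run(scenario_report_produced(FakeRunner(), responder, mock_mode=False))
```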
#### Step 1d: Check existing tests
Before generating, check if tests already exist:
```python
list_tests(
goal_id="your-goal-id",
agent_path="exports/{agent_name}"
)
```
---
### Phase 2: Execute
There are two execution paths; use the one that fits your situation.
#### Iterative debugging (for complex agents)
Run the agent via CLI. This creates sessions with checkpoints at `~/.hive/agents/{agent_name}/sessions/`:
```bash
uv run hive run exports/{agent_name} --input '{"query": "test topic"}'
```
Sessions and checkpoints are saved automatically.
**Client-facing nodes**: Agents with `client_facing=True` nodes (interactive conversation) work in headless mode when run from a real terminal — the agent streams output to stdout and reads user input from stdin via a `>>> ` prompt. In non-interactive shells (like Claude Code's Bash tool), client-facing nodes will hang because there is no stdin. For testing interactive agents from Claude Code, use `run_tests` with mock mode or have the user run the agent manually in their terminal.
#### Automated regression (for CI or final verification)
Use the `run_tests` MCP tool to run all pytest tests:
```python
run_tests(
goal_id="your-goal-id",
agent_path="exports/{agent_name}"
)
```
Returns structured results:
```json
{
"overall_passed": false,
"summary": {"total": 12, "passed": 10, "failed": 2, "pass_rate": "83.3%"},
"test_results": [{"test_name": "test_success_source_diversity", "status": "failed"}],
"failures": [{"test_name": "test_success_source_diversity", "details": "..."}]
}
```
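A short sketch of how the structured result might be consumed, using only the fields shown in the sample above. The variable names are illustrative, and the payload is the sample itself rather than real tool output.

```python
import json

raw = """
{
  "overall_passed": false,
  "summary": {"total": 12, "passed": 10, "failed": 2, "pass_rate": "83.3%"},
  "test_results": [{"test_name": "test_success_source_diversity", "status": "failed"}],
  "failures": [{"test_name": "test_success_source_diversity", "details": "..."}]
}
"""

result = json.loads(raw)

# Names of the failed tests, ready to pass to debug_test one at a time
failed = [t["test_name"] for t in result["test_results"] if t["status"] == "failed"]

# Recompute the pass rate from the counts as a sanity check
summary = result["summary"]
rate = f"{summary['passed'] / summary['total']:.1%}"
```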
**Options:**
```python
# Run only constraint tests
run_tests(goal_id, agent_path, test_types='["constraint"]')
# Stop on first failure
run_tests(goal_id, agent_path, fail_fast=True)
# Parallel execution
run_tests(goal_id, agent_path, parallel=4)
```
**Note:** `run_tests` uses `AgentRunner` with `tmp_path` storage, so sessions are isolated per test run. For checkpoint-based recovery with persistent sessions, use CLI execution. Use `run_tests` for quick regression checks and final verification.
---
### Phase 3: Analyze Failures
When a test fails, drill down systematically. Don't guess — use the tools.
#### Step 3a: Get error category
```python
debug_test(
goal_id="your-goal-id",
test_name="test_success_source_diversity",
agent_path="exports/{agent_name}"
)
```
Returns error category (`IMPLEMENTATION_ERROR`, `ASSERTION_FAILURE`, `TIMEOUT`, `IMPORT_ERROR`, `API_ERROR`) plus full traceback and suggestions.
#### Step 3b: Find the failed session
```python
list_agent_sessions(
agent_work_dir="~/.hive/agents/{agent_name}",
status="failed",
limit=5
)
```
Returns session list with IDs, timestamps, current_node (where it failed), execution_quality.
#### Step 3c: Inspect session state
```python
get_agent_session_state(
agent_work_dir="~/.hive/agents/{agent_name}",
session_id="session_20260209_143022_abc12345"
)
```
Returns execution path, which node was current, step count, timestamps — but excludes memory values (to avoid context bloat). Shows `memory_keys` and `memory_size` instead.
#### Step 3d: Examine runtime logs (L2/L3)
```python
# L2: Per-node success/failure, retry counts
query_runtime_log_details(
agent_work_dir="~/.hive/agents/{agent_name}",
run_id="session_20260209_143022_abc12345",
needs_attention_only=True
)
# L3: Exact LLM responses, tool call inputs/outputs
query_runtime_log_raw(
agent_work_dir="~/.hive/agents/{agent_name}",
run_id="session_20260209_143022_abc12345",
node_id="research"
)
```
#### Step 3e: Inspect memory data
```python
# See what data a node actually produced
get_agent_session_memory(
agent_work_dir="~/.hive/agents/{agent_name}",
session_id="session_20260209_143022_abc12345",
key="research_results"
)
```
#### Step 3f: Find recovery points
```python
list_agent_checkpoints(
agent_work_dir="~/.hive/agents/{agent_name}",
session_id="session_20260209_143022_abc12345",
is_clean="true"
)
```
Returns checkpoint summaries with IDs, types (`node_start`, `node_complete`), which node, and `is_clean` flag. Clean checkpoints are safe resume points.
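Choosing a resume point from that list can be sketched as follows. The dictionary field names (`checkpoint_id`, `node_id`, `type`, `is_clean`) and the oldest-first ordering are assumptions for illustration; check the actual tool output before relying on them.

```python
def pick_resume_checkpoint(checkpoints, failed_node):
    """Return the last clean node_complete checkpoint before the failing node."""
    best = None
    for cp in checkpoints:  # assumed to be ordered oldest first
        if cp["node_id"] == failed_node:
            break
        if cp["is_clean"] and cp["type"] == "node_complete":
            best = cp
    return best


checkpoints = [
    {"checkpoint_id": "cp_node_complete_intake_143025", "node_id": "intake",
     "type": "node_complete", "is_clean": True},
    {"checkpoint_id": "cp_node_complete_research_143030", "node_id": "research",
     "type": "node_complete", "is_clean": True},
    {"checkpoint_id": "cp_node_start_review_143031", "node_id": "review",
     "type": "node_start", "is_clean": False},
]

resume_from = pick_resume_checkpoint(checkpoints, failed_node="review")
```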
#### Step 3g: Compare checkpoints (optional)
To understand what changed between two points in execution:
```python
compare_agent_checkpoints(
agent_work_dir="~/.hive/agents/{agent_name}",
session_id="session_20260209_143022_abc12345",
checkpoint_id_before="cp_node_complete_research_143030",
checkpoint_id_after="cp_node_complete_review_143115"
)
```
Returns memory diff (added/removed/changed keys) and execution path diff.
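The memory diff is conceptually a three-way comparison of two dictionaries. The sketch below shows the idea with plain dicts; `diff_memory` is a hypothetical helper, and the real tool's output format may differ.

```python
def diff_memory(before: dict, after: dict) -> dict:
    """Classify keys as added, removed, or changed between two snapshots."""
    added = sorted(after.keys() - before.keys())
    removed = sorted(before.keys() - after.keys())
    changed = sorted(k for k in before.keys() & after.keys()
                     if before[k] != after[k])
    return {"added": added, "removed": removed, "changed": changed}


after_research = {"research_brief": "solar storage", "findings": ["a", "b"]}
after_review = {"research_brief": "solar storage", "findings": ["a"],
                "approved": True}

diff = diff_memory(after_research, after_review)
```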
---
### Phase 4: Fix Based on Root Cause
Use the analysis from Phase 3 to determine what to fix and where.
| Root Cause | What to Fix | Where to Edit |
|------------|------------|---------------|
| **Prompt issue** — LLM produces wrong output format, misses instructions | Node `system_prompt` | `exports/{agent}/nodes/__init__.py` |
| **Code bug** — TypeError, KeyError, logic error in Python | Agent code | `exports/{agent}/agent.py`, `nodes/__init__.py` |
| **Graph issue** — wrong routing, missing edge, bad condition_expr | Edges, node config | `exports/{agent}/agent.py` |
| **Tool issue** — MCP tool fails, wrong config, missing credential | Tool config | `exports/{agent}/mcp_servers.json`, `/hive-credentials` |
| **Goal issue** — success criteria too strict/vague, wrong constraints | Goal definition | `exports/{agent}/agent.py` (goal section) |
| **Test issue** — test expectations don't match actual agent behavior | Test code | `exports/{agent}/tests/test_*.py` |
#### Fix strategies by error category
**IMPLEMENTATION_ERROR** (TypeError, AttributeError, KeyError):
```python
# Read the failing code
Read(file_path="exports/{agent_name}/nodes/__init__.py")
# Fix the bug
Edit(
file_path="exports/{agent_name}/nodes/__init__.py",
old_string="results.get('videos')",
new_string="(results or {}).get('videos', [])"
)
```
**ASSERTION_FAILURE** (test assertions fail but agent ran successfully):
- Check if the agent's output is actually wrong → fix the prompt
- Check if the test's expectations are unrealistic → fix the test
- Use `get_agent_session_memory` to see what the agent actually produced
**TIMEOUT / STALL** (agent runs too long):
- Check `node_visit_counts` for feedback loops hitting max_node_visits
- Check L3 logs for tool calls that hang
- Reduce `max_iterations` in loop_config or fix the prompt to converge faster
**API_ERROR** (connection, rate limit, auth):
- Verify credentials with `/hive-credentials`
- Check MCP server configuration
---
### Phase 5: Recover & Resume
After fixing the agent, decide whether to resume or re-run.
#### When to resume from checkpoint
Resume when ALL of these are true:
- The fix is to a node that comes AFTER existing clean checkpoints
- Clean checkpoints exist (from a CLI execution with checkpointing)
- The early nodes are expensive (web scraping, API calls, long LLM chains)
```bash
# Resume from the last clean checkpoint before the failing node
uv run hive run exports/{agent_name} \
--resume-session session_20260209_143022_abc12345 \
--checkpoint cp_node_complete_research_143030
```
This skips all nodes before the checkpoint and only re-runs the fixed node onward.
#### When to re-run from scratch
Re-run when ANY of these are true:
- The fix is to the entry node or an early node
- No checkpoints exist (e.g., agent was run via `run_tests`)
- The agent is fast (2-3 nodes, completes in seconds)
- You changed the graph structure (added/removed nodes/edges)
```bash
uv run hive run exports/{agent_name} --input '{"query": "test topic"}'
```
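The two checklists amount to one decision, which can be written down as a small helper. This is only a restatement of the criteria above; `should_resume` and its parameter names are hypothetical.

```python
def should_resume(*, fix_after_checkpoint: bool, has_clean_checkpoint: bool,
                  early_nodes_expensive: bool, graph_changed: bool = False) -> bool:
    """Resume only when every resume condition holds and the graph is unchanged."""
    if graph_changed:
        # Added or removed nodes/edges: start from scratch
        return False
    return fix_after_checkpoint and has_clean_checkpoint and early_nodes_expensive


# Late-node prompt fix, clean checkpoints from a CLI run, costly scraping upstream
resume = should_resume(fix_after_checkpoint=True, has_clean_checkpoint=True,
                       early_nodes_expensive=True)

# Agent was only run via run_tests, so no checkpoints exist
rerun = should_resume(fix_after_checkpoint=True, has_clean_checkpoint=False,
                      early_nodes_expensive=True)
```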
#### Inspecting a checkpoint before resuming
```python
get_agent_checkpoint(
agent_work_dir="~/.hive/agents/{agent_name}",
session_id="session_20260209_143022_abc12345",
checkpoint_id="cp_node_complete_research_143030"
)
```
Returns the full checkpoint: shared_memory snapshot, execution_path, current_node, next_node, is_clean.
#### Loop back to Phase 2
After resuming or re-running, check if the fix worked. If not, go back to Phase 3.
---
### Phase 6: Final Verification
Once the iterative fix loop converges (the agent produces correct output), run the full automated test suite:
```python
run_tests(
goal_id="your-goal-id",
agent_path="exports/{agent_name}"
)
```
All tests should pass. If not, repeat the loop for remaining failures.
---
## Credential Requirements
**CRITICAL: Testing requires ALL credentials the agent depends on.** This includes both the LLM API key AND any tool-specific credentials (HubSpot, Brave Search, etc.).
### Prerequisites
Before running agent tests, you MUST collect ALL required credentials from the user.
**Step 1: LLM API Key (always required)**
```bash
export ANTHROPIC_API_KEY="your-key-here"
```
**Step 2: Tool-specific credentials (depends on agent's tools)**
Inspect the agent's `mcp_servers.json` and tool configuration to determine which tools the agent uses, then check for all required credentials:
```python
from aden_tools.credentials import CredentialManager, CREDENTIAL_SPECS
creds = CredentialManager()
# Determine which tools the agent uses (from agent.json or mcp_servers.json)
agent_tools = [...] # e.g., ["hubspot_search_contacts", "web_search", ...]
# Find all missing credentials for those tools
missing = creds.get_missing_for_tools(agent_tools)
```
Common tool credentials:
| Tool | Env Var | Help URL |
|------|---------|----------|
| HubSpot CRM | `HUBSPOT_ACCESS_TOKEN` | https://developers.hubspot.com/docs/api/private-apps |
| Brave Search | `BRAVE_SEARCH_API_KEY` | https://brave.com/search/api/ |
| Google Search | `GOOGLE_SEARCH_API_KEY` + `GOOGLE_SEARCH_CX` | https://developers.google.com/custom-search |
**Why ALL credentials are required:**
- Tests need to execute the agent's LLM nodes to validate behavior
- Tools with missing credentials will return error dicts instead of real data
- Mock mode bypasses everything, providing no confidence in real-world performance
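The credential check above can be approximated without the framework, which may help when explaining to a user which variables still need to be set. This is a minimal sketch, not the `CredentialManager` implementation: the `TOOL_ENV_VARS` mapping, the tool names in it, and the `missing_env_vars` helper are illustrative assumptions built from the table above.

```python
import os

# Hypothetical mapping built from the table above. The real tool names and
# lookup logic live in aden_tools.credentials; this is only an illustration.
TOOL_ENV_VARS = {
    "hubspot": ["HUBSPOT_ACCESS_TOKEN"],
    "brave_search": ["BRAVE_SEARCH_API_KEY"],
    "google_search": ["GOOGLE_SEARCH_API_KEY", "GOOGLE_SEARCH_CX"],
}


def missing_env_vars(tools, env=None):
    """Return the env var names the given tools need but that are unset."""
    env = os.environ if env is None else env
    missing = []
    for tool in tools:
        for var in TOOL_ENV_VARS.get(tool, []):
            # Treat empty strings as unset and avoid duplicate entries
            if not env.get(var) and var not in missing:
                missing.append(var)
    return missing
```

Passing `env` explicitly keeps the helper testable; in normal use it falls back to `os.environ`.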
### Mock Mode Limitations
Mock mode (`--mock` flag or `MOCK_MODE=1`) is **ONLY for structure validation**:
- Validates graph structure (nodes, edges, connections)
- Validates that `AgentRunner.load()` succeeds and the agent is importable
- Does NOT execute event_loop agents — MockLLMProvider never calls `set_output`, so event_loop nodes loop forever
- Does NOT test LLM reasoning, content quality, or constraint validation
- Does NOT test real API integrations or tool use
**Bottom line:** If you're testing whether an agent achieves its goal, you MUST use real credentials.
### Enforcing Credentials in Tests
When writing tests, **ALWAYS include credential checks**:
```python
import os

import pytest
from aden_tools.credentials import CredentialManager

pytestmark = pytest.mark.skipif(
    not CredentialManager().is_available("anthropic") and not os.environ.get("MOCK_MODE"),
    reason="API key required for real testing. Set ANTHROPIC_API_KEY or use MOCK_MODE=1."
)


@pytest.fixture(scope="session", autouse=True)
def check_credentials():
    """Ensure ALL required credentials are set for real testing."""
    creds = CredentialManager()
    mock_mode = os.environ.get("MOCK_MODE")

    if not creds.is_available("anthropic"):
        if mock_mode:
            print("\nRunning in MOCK MODE - structure validation only")
        else:
            pytest.fail(
                "\nANTHROPIC_API_KEY not set!\n"
                "Set API key: export ANTHROPIC_API_KEY='your-key-here'\n"
                "Or run structure validation: MOCK_MODE=1 pytest exports/{agent}/tests/"
            )

    if not mock_mode:
        agent_tools = []  # Update per agent
        missing = creds.get_missing_for_tools(agent_tools)
        if missing:
            lines = ["\nMissing tool credentials!"]
            for name in missing:
                spec = creds.specs.get(name)
                if spec:
                    lines.append(f"  {spec.env_var} - {spec.description}")
            pytest.fail("\n".join(lines))
```
### User Communication
When the user asks to test an agent, **ALWAYS check for ALL credentials first**:
1. **Identify the agent's tools** from `mcp_servers.json`
2. **Check ALL required credentials** using `CredentialManager`
3. **Ask the user to provide any missing credentials** before proceeding
4. Collect ALL missing credentials in a single prompt — not one at a time
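Step 4 above can be sketched as a small formatter that turns a list of missing variables into one message. The function name `format_credential_request` and the wording of the message are assumptions for illustration, not part of the framework.

```python
def format_credential_request(missing_vars):
    """Build one message that asks for every missing credential at once."""
    if not missing_vars:
        return "All required credentials are set."
    lines = ["The following credentials are missing. Please provide all of them:"]
    # One export line per variable, so the user can answer in a single pass
    lines.extend(f'  export {var}="your-value-here"' for var in missing_vars)
    return "\n".join(lines)
```

Sending the whole list in one prompt avoids a back-and-forth where each test run uncovers one more missing key.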
---
## Safe Test Patterns
### OutputCleaner
The framework automatically validates and cleans node outputs using a fast LLM at edge traversal time. Tests should still use safe patterns because OutputCleaner may not catch all issues.
### Safe Access (REQUIRED)
```python
# UNSAFE - will crash on missing keys
approval = result.output["approval_decision"]
category = result.output["analysis"]["category"]

# SAFE - use .get() with defaults
output = result.output or {}
approval = output.get("approval_decision", "UNKNOWN")

# SAFE - type check before operations
analysis = output.get("analysis", {})
if isinstance(analysis, dict):
    category = analysis.get("category", "unknown")

# SAFE - handle JSON parsing trap (LLM response as string)
import json
recommendation = output.get("recommendation", "{}")
if isinstance(recommendation, str):
    try:
        parsed = json.loads(recommendation)
        if isinstance(parsed, dict):
            approval = parsed.get("approval_decision", "UNKNOWN")
    except json.JSONDecodeError:
        approval = "UNKNOWN"
elif isinstance(recommendation, dict):
    approval = recommendation.get("approval_decision", "UNKNOWN")

# SAFE - type check before iteration
items = output.get("items", [])
if isinstance(items, list):
    for item in items:
        ...
```
### Helper Functions for conftest.py
```python
import json
import re

import pytest


def _parse_json_from_output(result, key):
    """Parse JSON from agent output (framework may store full LLM response as string)."""
    output = result.output or {}
    value = output.get(key, "")
    try:
        # Strip Markdown code fences before parsing. A non-string value
        # (for example an already-parsed dict) raises TypeError here.
        json_text = re.sub(r'```json\s*|\s*```', '', value).strip()
        return json.loads(json_text)
    except (json.JSONDecodeError, TypeError):
        # Not JSON, or not a string: return the stored value unchanged
        return output.get(key)


def safe_get_nested(result, key_path, default=None):
    """Safely get nested value from result.output."""
    output = result.output or {}
    current = output
    for key in key_path:
        if isinstance(current, dict):
            current = current.get(key)
        elif isinstance(current, str):
            # A nested level may be a JSON string; try to parse it
            try:
                json_text = re.sub(r'```json\s*|\s*```', '', current).strip()
                parsed = json.loads(json_text)
                if isinstance(parsed, dict):
                    current = parsed.get(key)
                else:
                    return default
            except json.JSONDecodeError:
                return default
        else:
            return default
    return current if current is not None else default


# Make available in tests
pytest.parse_json_from_output = _parse_json_from_output
pytest.safe_get_nested = safe_get_nested
```
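The fence-stripping step used by both helpers can be seen in isolation in the sketch below. It assumes the LLM wrapped its JSON in a Markdown fence tagged `json`; `strip_json_fence` is a hypothetical name, and the fence marker is built with string multiplication only so the example does not contain a literal fence.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built programmatically for readability
FENCE_PATTERN = re.compile(FENCE + r"json\s*|\s*" + FENCE)


def strip_json_fence(text):
    """Remove a Markdown JSON fence wrapper, if present, and trim whitespace."""
    return FENCE_PATTERN.sub("", text).strip()


# Simulated raw LLM response stored as a string in agent output
raw = FENCE + 'json\n{"approval_decision": "APPROVED"}\n' + FENCE
parsed = json.loads(strip_json_fence(raw))
```

Text without a fence passes through unchanged, so the same call is safe on plain JSON strings.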
### ExecutionResult Fields
**`result.success=True` means NO exception, NOT goal achieved**
```python
# WRONG
assert result.success
# RIGHT
assert result.success, f"Agent failed: {result.error}"
output = result.output or {}
approval = output.get("approval_decision")
assert approval == "APPROVED", f"Expected APPROVED, got {approval}"
```
All fields:
- `success: bool` — Completed without exception (NOT goal achieved!)
- `output: dict` — Complete memory snapshot (may contain raw strings)
- `error: str | None` — Error message if failed
- `steps_executed: int` — Number of nodes executed
- `total_tokens: int` — Cumulative token usage
- `total_latency_ms: int` — Total execution time
- `path: list[str]` — Node IDs traversed (may repeat in feedback loops)
- `paused_at: str | None` — Node ID if paused
- `session_state: dict` — State for resuming
- `node_visit_counts: dict[str, int]` — Visit counts per node (feedback loop testing)
- `execution_quality: str` — "clean", "degraded", or "failed"
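The difference between "completed" and "goal achieved" can be illustrated with a stand-in object. `FakeExecutionResult` below is a hypothetical class that mirrors only a few of the fields listed above; it is not the framework's `ExecutionResult`, and `goal_achieved` is an illustrative helper.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeExecutionResult:
    """Hypothetical stand-in with a subset of the fields listed above."""
    success: bool
    output: Optional[dict] = None
    error: Optional[str] = None
    path: list = field(default_factory=list)
    execution_quality: str = "clean"


def goal_achieved(result, key, expected):
    """True only if the run completed AND the output holds the expected value."""
    output = result.output or {}
    return bool(result.success) and output.get(key) == expected


# The run finished without an exception, yet the goal was not met
finished_but_rejected = FakeExecutionResult(
    success=True, output={"approval_decision": "REJECTED"}
)
```

Here `finished_but_rejected.success` is `True` while `goal_achieved(finished_but_rejected, "approval_decision", "APPROVED")` is `False`, which is the case a bare `assert result.success` would miss.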
### Test Count Guidance
**Write 8-15 tests, not 30+**
- 2-3 tests per success criterion
- 1 happy path test
- 1 boundary/edge case test
- 1 error handling test (optional)
Each real test costs ~3 seconds + LLM tokens. 12 tests = ~36 seconds, $0.12.
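The arithmetic behind that estimate, as a rough sketch. The per-test dollar figure is an assumption derived from the numbers above ($0.12 across 12 tests); real cost depends on the model and on how many tokens each run uses.

```python
SECONDS_PER_TEST = 3      # rough figure quoted above
DOLLARS_PER_TEST = 0.01   # assumption: $0.12 / 12 tests


def estimate_suite_cost(num_tests):
    """Return (seconds, dollars) for a sequential run of num_tests real tests."""
    seconds = num_tests * SECONDS_PER_TEST
    dollars = round(num_tests * DOLLARS_PER_TEST, 2)
    return seconds, dollars
```

Under these assumptions a 30-test suite takes about 90 seconds and $0.30 per run, which adds up quickly inside an iterative fix loop.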
---
## Test Patterns
### Happy Path
```python
@pytest.mark.asyncio
async def test_happy_path(runner, auto_responder, mock_mode):
    """Test normal successful execution."""
    await auto_responder.start()
    try:
        result = await runner.run({"query": "python tutorials"})
    finally:
        await auto_responder.stop()
    assert result.success, f"Agent failed: {result.error}"
    output = result.output or {}
    assert output.get("report"), "No report produced"
```
### Boundary Condition
```python
@pytest.mark.asyncio
async def test_minimum_sources(runner, auto_responder, mock_mode):
    """Test at minimum source threshold."""
    await auto_responder.start()
    try:
        result = await runner.run({"query": "niche topic"})
    finally:
        await auto_responder.stop()
    assert result.success, f"Agent failed: {result.error}"
    output = result.output or {}
    sources = output.get("sources", [])
    if isinstance(sources, list):
        assert len(sources) >= 3, f"Expected >= 3 sources, got {len(sources)}"
```
### Error Handling
```python
@pytest.mark.asyncio
async def test_empty_input(runner, auto_responder, mock_mode):
    """Test graceful handling of empty input."""
    await auto_responder.start()
    try:
        result = await runner.run({"query": ""})
    finally:
        await auto_responder.stop()
    # Agent should either fail gracefully or produce an error message
    output = result.output or {}
    assert not result.success or output.get("error"), "Should handle empty input"
```
### Feedback Loop
```python
@pytest.mark.asyncio
async def test_feedback_loop_terminates(runner, auto_responder, mock_mode):
    """Test that feedback loops don't run forever."""
    await auto_responder.start()
    try:
        result = await runner.run({"query": "test"})
    finally:
        await auto_responder.stop()
    visits = result.node_visit_counts or {}
    for node_id, count in visits.items():
        assert count <= 5, f"Node {node_id} visited {count} times (possible infinite loop)"
```
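Visit counts can also be cross-checked against `result.path`, assuming the path records every node visit in order (the field list above notes that IDs may repeat in feedback loops). `runaway_nodes` is a hypothetical helper, and the limit of 5 simply matches the test above.

```python
from collections import Counter


def runaway_nodes(path, max_visits=5):
    """Return {node_id: visits} for nodes visited more than max_visits times."""
    counts = Counter(path)
    return {node: n for node, n in counts.items() if n > max_visits}


# Example path: research and review bounce back and forth six times
example_path = ["intake"] + ["research", "review"] * 6 + ["report"]
offenders = runaway_nodes(example_path)
```

An empty result means no node exceeded the limit; a non-empty one names the nodes worth inspecting in the L2/L3 logs.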
---
## MCP Tool Reference
### Phase 1: Test Generation
```python
# Check existing tests
list_tests(goal_id, agent_path)
# Get constraint test guidelines (returns templates, NOT generated tests)
generate_constraint_tests(goal_id, goal_json, agent_path)
# Returns: output_file, file_header, test_template, constraints_formatted, test_guidelines
# Get success criteria test guidelines
generate_success_tests(goal_id, goal_json, node_names, tool_names, agent_path)
# Returns: output_file, file_header, test_template, success_criteria_formatted, test_guidelines
```
### Phase 2: Execution
```python
# Automated regression (no checkpoints, fresh runs)
run_tests(goal_id, agent_path, test_types='["all"]', parallel=-1, fail_fast=False)
# Run only specific test types
run_tests(goal_id, agent_path, test_types='["constraint"]')
run_tests(goal_id, agent_path, test_types='["success"]')
```
```bash
# Iterative debugging with checkpoints (via CLI)
uv run hive run exports/{agent_name} --input '{"query": "test"}'
```
### Phase 3: Analysis
```python
# Debug a specific failed test
debug_test(goal_id, test_name, agent_path)
# Find failed sessions
list_agent_sessions(agent_work_dir, status="failed", limit=5)
# Inspect session state (excludes memory values)
get_agent_session_state(agent_work_dir, session_id)
# Inspect memory data
get_agent_session_memory(agent_work_dir, session_id, key="research_results")
# Runtime logs: L1 summaries
query_runtime_logs(agent_work_dir, status="needs_attention")
# Runtime logs: L2 per-node details
query_runtime_log_details(agent_work_dir, run_id, needs_attention_only=True)
# Runtime logs: L3 tool/LLM raw data
query_runtime_log_raw(agent_work_dir, run_id, node_id="research")
# Find clean checkpoints
list_agent_checkpoints(agent_work_dir, session_id, is_clean="true")
# Compare checkpoints (memory diff)
compare_agent_checkpoints(agent_work_dir, session_id, cp_before, cp_after)
```
### Phase 5: Recovery
```python
# Inspect checkpoint before resuming
get_agent_checkpoint(agent_work_dir, session_id, checkpoint_id)
# Empty checkpoint_id = latest checkpoint
```
```bash
# Resume from checkpoint via CLI (headless)
uv run hive run exports/{agent_name} \
--resume-session {session_id} --checkpoint {checkpoint_id}
```
---
## Anti-Patterns
| Don't | Do Instead |
|-------|-----------|
| Use `default_agent.run()` in tests | Use `runner.run()` with `auto_responder` fixtures (goes through AgentRuntime) |
| Re-run entire agent when a late node fails | Resume from last clean checkpoint |
| Treat `result.success` as goal achieved | Check `result.output` for actual criteria |
| Access `result.output["key"]` directly | Use `result.output.get("key")` |
| Fix random things hoping tests pass | Analyze L2/L3 logs to find root cause first |
| Write 30+ tests | Write 8-15 focused tests |
| Skip credential check | Use `/hive-credentials` before testing |
| Confuse `exports/` with `~/.hive/agents/` | Code in `exports/`, runtime data in `~/.hive/` |
| Use `run_tests` for iterative debugging | Use headless CLI with checkpoints for iterative debugging |
| Use headless CLI for final regression | Use `run_tests` for automated regression |
| Use `--tui` from Claude Code | Use headless `run` command — TUI hangs in non-interactive shells |
| Test client-facing nodes from Claude Code | Use mock mode, or have the user run the agent in their terminal |
| Run tests without reading goal first | Always understand the goal before writing tests |
| Skip Phase 3 analysis and guess | Use session + log tools to identify root cause |
---
## Example Walkthrough: Deep Research Agent
A complete iteration showing the test loop for an agent with nodes: `intake → research → review → report`.
### Phase 1: Generate tests
```python
# Read the goal
Read(file_path="exports/deep_research_agent/agent.py")
# Get success criteria test guidelines
result = generate_success_tests(
goal_id="rigorous-interactive-research",
goal_json='{"id": "rigorous-interactive-research", "success_criteria": [{"id": "source-diversity", "target": ">=5"}, {"id": "citation-coverage", "target": "100%"}, {"id": "report-completeness", "target": "90%"}]}',
node_names="intake,research,review,report",
tool_names="web_search,web_scrape",
agent_path="exports/deep_research_agent"
)
# Write tests
Write(
file_path=result["output_file"],
content=result["file_header"] + "\n\n" + test_code
)
```
### Phase 2: First execution
```python
run_tests(
goal_id="rigorous-interactive-research",
agent_path="exports/deep_research_agent",
fail_fast=True
)
```
Result: `test_success_source_diversity` fails — agent only found 2 sources instead of 5.
### Phase 3: Analyze
```python
# Debug the failing test
debug_test(
goal_id="rigorous-interactive-research",
test_name="test_success_source_diversity",
agent_path="exports/deep_research_agent"
)
# → ASSERTION_FAILURE: Expected >= 5 sources, got 2
# Find the session
list_agent_sessions(
agent_work_dir="~/.hive/agents/deep_research_agent",
status="completed",
limit=1
)
# → session_20260209_150000_abc12345
# See what the research node produced
get_agent_session_memory(
agent_work_dir="~/.hive/agents/deep_research_agent",
session_id="session_20260209_150000_abc12345",
key="research_results"
)
# → Only 2 web_search calls made, each returned 1 source
# Check the LLM's behavior in the research node
query_runtime_log_raw(
agent_work_dir="~/.hive/agents/deep_research_agent",
run_id="session_20260209_150000_abc12345",
node_id="research"
)
# → LLM called web_search only twice, then called set_output
```
Root cause: The research node's prompt doesn't tell the LLM to search for at least 5 diverse sources. It stops after the first couple of searches.
### Phase 4: Fix the prompt
```python
Read(file_path="exports/deep_research_agent/nodes/__init__.py")
Edit(
file_path="exports/deep_research_agent/nodes/__init__.py",
old_string='system_prompt="Search for information on the user\'s topic."',
new_string='system_prompt="Search for information on the user\'s topic. You MUST find at least 5 diverse, authoritative sources. Use multiple different search queries to ensure source diversity. Do not stop searching until you have at least 5 distinct sources."'
)
```
### Phase 5: Resume from checkpoint
For this example, the fix is to the `research` node. If we had run via CLI with checkpointing, we could resume from the checkpoint after `intake` to skip re-running intake:
```python
# Check if clean checkpoint exists after intake
list_agent_checkpoints(
    agent_work_dir="~/.hive/agents/deep_research_agent",
    session_id="session_20260209_150000_abc12345",
    is_clean="true"
)
# → cp_node_complete_intake_150005
```
```bash
# Resume from after intake, re-run research with fixed prompt
uv run hive run exports/deep_research_agent \
  --resume-session session_20260209_150000_abc12345 \
  --checkpoint cp_node_complete_intake_150005
```
Or for this simple case (intake is fast), just re-run:
```bash
uv run hive run exports/deep_research_agent --input '{"topic": "test"}'
```
### Phase 6: Final verification
```python
run_tests(
goal_id="rigorous-interactive-research",
agent_path="exports/deep_research_agent"
)
# → All 12 tests pass
```
---
## Test File Structure
```
exports/{agent_name}/
├── agent.py ← Agent to test (goal, nodes, edges)
├── nodes/__init__.py ← Node implementations (prompts, config)
├── config.py ← Agent configuration
├── mcp_servers.json ← Tool server config
└── tests/
├── conftest.py ← Shared fixtures + safe access helpers
├── test_constraints.py ← Constraint tests
├── test_success_criteria.py ← Success criteria tests
└── test_edge_cases.py ← Edge case tests
```
## Integration with Other Skills
| Scenario | From | To | Action |
|----------|------|----|--------|
| Agent built, ready to test | `/hive-create` | `/hive-test` | Generate tests, start loop |
| Prompt fix needed | `/hive-test` Phase 4 | Direct edit | Edit `nodes/__init__.py`, resume |
| Goal definition wrong | `/hive-test` Phase 4 | `/hive-create` | Update goal, may need rebuild |
| Missing credentials | `/hive-test` Phase 3 | `/hive-credentials` | Set up credentials |
| Complex runtime failure | `/hive-test` Phase 3 | `/hive-debugger` | Deep L1/L2/L3 analysis |
| All tests pass | `/hive-test` Phase 6 | Done | Agent validated |
@@ -1,333 +0,0 @@
# Example: Iterative Testing of a Research Agent
This example walks through the full iterative test loop for a research agent that searches the web, reviews findings, and produces a cited report.
## Agent Structure
```
exports/deep_research_agent/
├── agent.py # Goal + graph: intake → research → review → report
├── nodes/__init__.py # Node definitions (system_prompt, input/output keys)
├── config.py # Model config
├── mcp_servers.json # Tools: web_search, web_scrape
└── tests/ # Test files (we'll create these)
```
**Goal:** "Rigorous Interactive Research" — find 5+ diverse sources, cite every claim, produce a complete report.
---
## Phase 1: Generate Tests
### Read the goal
```python
Read(file_path="exports/deep_research_agent/agent.py")
# Extract: goal_id="rigorous-interactive-research"
# success_criteria: source-diversity (>=5), citation-coverage (100%), report-completeness (90%)
# constraints: no-hallucination, source-attribution
```
### Get test guidelines
```python
result = generate_success_tests(
goal_id="rigorous-interactive-research",
goal_json='{"id": "rigorous-interactive-research", "success_criteria": [{"id": "source-diversity", "description": "Use multiple diverse sources", "target": ">=5"}, {"id": "citation-coverage", "description": "Every claim cites its source", "target": "100%"}, {"id": "report-completeness", "description": "Report answers the research questions", "target": "90%"}]}',
node_names="intake,research,review,report",
tool_names="web_search,web_scrape",
agent_path="exports/deep_research_agent"
)
```
### Write tests
```python
Write(
    file_path="exports/deep_research_agent/tests/test_success_criteria.py",
    content=result["file_header"] + '''

@pytest.mark.asyncio
async def test_success_source_diversity(runner, auto_responder, mock_mode):
    """At least 5 diverse sources are found."""
    await auto_responder.start()
    try:
        result = await runner.run({"query": "impact of remote work on productivity"})
    finally:
        await auto_responder.stop()
    assert result.success, f"Agent failed: {result.error}"
    output = result.output or {}
    sources = output.get("sources", [])
    if isinstance(sources, list):
        assert len(sources) >= 5, f"Expected >= 5 sources, got {len(sources)}"


@pytest.mark.asyncio
async def test_success_citation_coverage(runner, auto_responder, mock_mode):
    """Every factual claim in the report cites its source."""
    await auto_responder.start()
    try:
        result = await runner.run({"query": "climate change effects on agriculture"})
    finally:
        await auto_responder.stop()
    assert result.success, f"Agent failed: {result.error}"
    output = result.output or {}
    report = output.get("report", "")
    # Check that report contains numbered references
    assert "[1]" in str(report) or "[source" in str(report).lower(), "Report lacks citations"


@pytest.mark.asyncio
async def test_success_report_completeness(runner, auto_responder, mock_mode):
    """Report addresses the original research question."""
    query = "pros and cons of nuclear energy"
    await auto_responder.start()
    try:
        result = await runner.run({"query": query})
    finally:
        await auto_responder.stop()
    assert result.success, f"Agent failed: {result.error}"
    output = result.output or {}
    report = output.get("report", "")
    assert len(str(report)) > 200, f"Report too short: {len(str(report))} chars"


@pytest.mark.asyncio
async def test_empty_query_handling(runner, auto_responder, mock_mode):
    """Agent handles empty input gracefully."""
    await auto_responder.start()
    try:
        result = await runner.run({"query": ""})
    finally:
        await auto_responder.stop()
    output = result.output or {}
    assert not result.success or output.get("error"), "Should handle empty query"


@pytest.mark.asyncio
async def test_feedback_loop_terminates(runner, auto_responder, mock_mode):
    """Feedback loop between review and research terminates."""
    await auto_responder.start()
    try:
        result = await runner.run({"query": "quantum computing basics"})
    finally:
        await auto_responder.stop()
    visits = result.node_visit_counts or {}
    for node_id, count in visits.items():
        assert count <= 5, f"Node {node_id} visited {count} times"
'''
)
```
---
## Phase 2: First Execution
```python
run_tests(
goal_id="rigorous-interactive-research",
agent_path="exports/deep_research_agent",
fail_fast=True
)
```
**Result:**
```json
{
"overall_passed": false,
"summary": {"total": 5, "passed": 3, "failed": 2, "pass_rate": "60.0%"},
"failures": [
{"test_name": "test_success_source_diversity", "details": "AssertionError: Expected >= 5 sources, got 2"},
{"test_name": "test_success_citation_coverage", "details": "AssertionError: Report lacks citations"}
]
}
```
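A result of this shape can be reduced to a pass rate and a list of failing test names before moving on to Phase 3. The sketch below assumes the `run_tests` result is available as a JSON string with the keys shown above; `summarize_failures` is an illustrative helper, not a framework function.

```python
import json


def summarize_failures(raw_result):
    """Return (pass_rate, failed_test_names) from a run_tests-style JSON string."""
    result = json.loads(raw_result)
    summary = result.get("summary", {})
    total = summary.get("total", 0)
    passed = summary.get("passed", 0)
    # Guard against division by zero when no tests were collected
    pass_rate = f"{100.0 * passed / total:.1f}%" if total else "n/a"
    failed = [f.get("test_name", "<unknown>") for f in result.get("failures", [])]
    return pass_rate, failed


sample = json.dumps({
    "overall_passed": False,
    "summary": {"total": 5, "passed": 3, "failed": 2},
    "failures": [
        {"test_name": "test_success_source_diversity"},
        {"test_name": "test_success_citation_coverage"},
    ],
})
rate, names = summarize_failures(sample)
```

Each name in `names` can then be passed to `debug_test` one at a time, starting with the failure closest to the beginning of the graph.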
---
## Phase 3: Analyze (Iteration 1)
### Debug the first failure
```python
debug_test(
goal_id="rigorous-interactive-research",
test_name="test_success_source_diversity",
agent_path="exports/deep_research_agent"
)
# Category: ASSERTION_FAILURE — Expected >= 5 sources, got 2
```
### Find the session and inspect memory
```python
list_agent_sessions(
agent_work_dir="~/.hive/agents/deep_research_agent",
status="completed",
limit=1
)
# → session_20260209_150000_abc12345
get_agent_session_memory(
agent_work_dir="~/.hive/agents/deep_research_agent",
session_id="session_20260209_150000_abc12345",
key="research_results"
)
# → Only 2 sources found. LLM stopped searching after 2 queries.
```
### Check LLM behavior in the research node
```python
query_runtime_log_raw(
agent_work_dir="~/.hive/agents/deep_research_agent",
run_id="session_20260209_150000_abc12345",
node_id="research"
)
# → LLM called web_search twice, got results, immediately called set_output.
# → Prompt doesn't instruct it to find at least 5 sources.
```
**Root cause:** The research node's system_prompt doesn't specify minimum source requirements.
---
## Phase 4: Fix (Iteration 1)
```python
Read(file_path="exports/deep_research_agent/nodes/__init__.py")
# Fix the research node prompt
Edit(
file_path="exports/deep_research_agent/nodes/__init__.py",
old_string='system_prompt="Search for information on the user\'s topic using web search."',
new_string='system_prompt="Search for information on the user\'s topic using web search. You MUST find at least 5 diverse, authoritative sources. Use multiple different search queries with varied keywords. Do NOT call set_output until you have gathered at least 5 distinct sources from different domains."'
)
```
---
## Phase 5: Recover & Resume (Iteration 1)
The fix is to the `research` node. Since this was a `run_tests` execution (no checkpoints), we re-run from scratch:
```python
run_tests(
goal_id="rigorous-interactive-research",
agent_path="exports/deep_research_agent",
fail_fast=True
)
```
**Result:**
```json
{
"overall_passed": false,
"summary": {"total": 5, "passed": 4, "failed": 1, "pass_rate": "80.0%"},
"failures": [
{"test_name": "test_success_citation_coverage", "details": "AssertionError: Report lacks citations"}
]
}
```
Source diversity now passes. Citation coverage still fails.
---
## Phase 3: Analyze (Iteration 2)
```python
debug_test(
goal_id="rigorous-interactive-research",
test_name="test_success_citation_coverage",
agent_path="exports/deep_research_agent"
)
# Category: ASSERTION_FAILURE — Report lacks citations
# Check what the report node produced
list_agent_sessions(
agent_work_dir="~/.hive/agents/deep_research_agent",
status="completed",
limit=1
)
# → session_20260209_151500_def67890
get_agent_session_memory(
agent_work_dir="~/.hive/agents/deep_research_agent",
session_id="session_20260209_151500_def67890",
key="report"
)
# → Report text exists but uses no numbered references.
# → Sources are in memory but report node doesn't cite them.
```
**Root cause:** The report node's prompt doesn't instruct the LLM to include numbered citations.
---
## Phase 4: Fix (Iteration 2)
```python
Edit(
file_path="exports/deep_research_agent/nodes/__init__.py",
old_string='system_prompt="Write a comprehensive report based on the research findings."',
new_string='system_prompt="Write a comprehensive report based on the research findings. You MUST include numbered citations [1], [2], etc. for every factual claim. At the end, include a References section listing all sources with their URLs. Every claim must be traceable to a specific source."'
)
```
---
## Phase 5: Resume (Iteration 2)
The fix is to the `report` node (the last node). To demonstrate checkpoint recovery, run via CLI:
```bash
# Run via CLI to get checkpoints
uv run hive run exports/deep_research_agent --input '{"topic": "climate change effects"}'
```
```python
# After it runs, find the clean checkpoint before report
list_agent_checkpoints(
    agent_work_dir="~/.hive/agents/deep_research_agent",
    session_id="session_20260209_152000_ghi34567",
    is_clean="true"
)
# → cp_node_complete_review_152100 (after review, before report)
```
```bash
# Resume: skips intake, research, review entirely
uv run hive run exports/deep_research_agent \
  --resume-session session_20260209_152000_ghi34567 \
  --checkpoint cp_node_complete_review_152100
```
Only the `report` node re-runs with the fixed prompt, using research data from the checkpoint.
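Choosing the checkpoint to resume from can be sketched as a filter over the checkpoint list. The dict shape below is an assumption: `is_clean` and `next_node` are named in the checkpoint description earlier in this document, `checkpoint_id` mirrors the tool parameter, and the list is assumed to be ordered oldest first. The intake and research checkpoint IDs in the example are made up for illustration.

```python
def pick_resume_checkpoint(checkpoints, node_to_rerun):
    """Return the id of the latest clean checkpoint taken just before node_to_rerun."""
    candidates = [
        cp for cp in checkpoints
        if cp.get("is_clean") and cp.get("next_node") == node_to_rerun
    ]
    # Assumes checkpoints are ordered oldest first, so the last match is newest
    return candidates[-1]["checkpoint_id"] if candidates else None


# Hypothetical checkpoint listing for the session above
example_checkpoints = [
    {"checkpoint_id": "cp_node_complete_intake_152005", "is_clean": True, "next_node": "research"},
    {"checkpoint_id": "cp_node_complete_research_152040", "is_clean": True, "next_node": "review"},
    {"checkpoint_id": "cp_node_complete_review_152100", "is_clean": True, "next_node": "report"},
]
chosen = pick_resume_checkpoint(example_checkpoints, "report")
```

If no clean checkpoint precedes the fixed node, the helper returns `None` and a full re-run is the fallback.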
---
## Phase 6: Final Verification
```python
run_tests(
goal_id="rigorous-interactive-research",
agent_path="exports/deep_research_agent"
)
```
**Result:**
```json
{
"overall_passed": true,
"summary": {"total": 5, "passed": 5, "failed": 0, "pass_rate": "100.0%"}
}
```
All tests pass.
---
## Summary
| Iteration | Failure | Root Cause | Fix | Recovery |
|-----------|---------|------------|-----|----------|
| 1 | Source diversity (2 < 5) | Research prompt too vague | Added "at least 5 sources" to prompt | Re-run (no checkpoints) |
| 2 | No citations in report | Report prompt lacks citation instructions | Added citation requirements | Checkpoint resume (skipped 3 nodes) |
**Key takeaways:**
- Phase 3 analysis (session memory + L3 logs) identified root causes without guessing
- Checkpoint recovery in iteration 2 saved time by skipping 3 expensive nodes
- Final `run_tests` confirms all scenarios pass end-to-end
@@ -1,526 +0,0 @@
---
name: hive
description: Complete workflow for building, implementing, and testing goal-driven agents. Orchestrates hive-* skills. Use when starting a new agent project, unsure which skill to use, or need end-to-end guidance.
license: Apache-2.0
metadata:
author: hive
version: "2.0"
type: workflow-orchestrator
orchestrates:
- hive-concepts
- hive-create
- hive-patterns
- hive-test
- hive-credentials
- hive-debugger
---
# Agent Development Workflow
**THIS IS AN EXECUTABLE WORKFLOW. DO NOT explore the codebase or read source files. ROUTE to the correct skill IMMEDIATELY.**
When this skill is loaded, **ALWAYS use the AskUserQuestion tool** to present options:
```
Use AskUserQuestion with these options:
- "Build a new agent" → Then invoke /hive-create
- "Test an existing agent" → Then invoke /hive-test
- "Learn agent concepts" → Then invoke /hive-concepts
- "Optimize agent design" → Then invoke /hive-patterns
- "Set up credentials" → Then invoke /hive-credentials
- "Debug a failing agent" → Then invoke /hive-debugger
- "Other" (please describe what you want to achieve)
```
**DO NOT:** Read source files, explore the codebase, search for code, or do any investigation before routing. The sub-skills handle all of that.
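The routing rule amounts to a lookup table from menu choice to skill. The sketch below is only an illustration of that mapping; `ROUTES` and `route` are hypothetical names, and nothing in the skill system is assumed to work this way internally.

```python
# Menu labels and target skills copied from the option list above
ROUTES = {
    "Build a new agent": "/hive-create",
    "Test an existing agent": "/hive-test",
    "Learn agent concepts": "/hive-concepts",
    "Optimize agent design": "/hive-patterns",
    "Set up credentials": "/hive-credentials",
    "Debug a failing agent": "/hive-debugger",
}


def route(choice):
    """Return the skill for a menu choice, or None for "Other" or free text."""
    return ROUTES.get(choice)
```

A `None` result corresponds to the "Other" option, where the user's description decides the next step.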
---
Complete Standard Operating Procedure (SOP) for building production-ready goal-driven agents.
## Overview
This workflow orchestrates specialized skills to take you from initial concept to production-ready agent:
1. **Understand Concepts** → `/hive-concepts` (optional)
2. **Build Structure** → `/hive-create`
3. **Optimize Design** → `/hive-patterns` (optional)
4. **Setup Credentials** → `/hive-credentials` (if agent uses tools requiring API keys)
5. **Test & Validate** → `/hive-test`
6. **Debug Issues** → `/hive-debugger` (if agent fails at runtime)
## When to Use This Workflow
Use this meta-skill when:
- Starting a new agent from scratch
- Unclear which skill to use first
- Need end-to-end guidance for agent development
- Want consistent, repeatable agent builds
**Skip this workflow** if:
- You only need to test an existing agent → use `/hive-test` directly
- You know exactly which phase you're in → use specific skill directly
## Quick Decision Tree
```
"Need to understand agent concepts" → hive-concepts
"Build a new agent" → hive-create
"Optimize my agent design" → hive-patterns
"Need client-facing nodes or feedback loops" → hive-patterns
"Set up API keys for my agent" → hive-credentials
"Test my agent" → hive-test
"My agent is failing/stuck/has errors" → hive-debugger
"Not sure what I need" → Read phases below, then decide
"Agent has structure but needs implementation" → See agent directory STATUS.md
```
## Phase 0: Understand Concepts (Optional)
**Skill**: `/hive-concepts`
**Input**: Questions about agent architecture
### When to Use
- First time building an agent
- Need to understand node types, edges, goals
- Want to validate tool availability
- Learning about event loop architecture and client-facing nodes
### What This Phase Provides
- Architecture overview (Python packages, not JSON)
- Core concepts (Goal, Node, Edge, Event Loop, Judges)
- Tool discovery and validation procedures
- Workflow overview
**Skip this phase** if you already understand agent fundamentals.
## Phase 1: Build Agent Structure
**Skill**: `/hive-create`
**Input**: User requirements ("Build an agent that...") or a template to start from
### What This Phase Does
Creates the complete agent architecture:
- Package structure (`exports/agent_name/`)
- Goal with success criteria and constraints
- Workflow graph (nodes and edges)
- Node specifications
- CLI interface
- Documentation
### Process
1. **Create package** - Directory structure with skeleton files
2. **Define goal** - Success criteria and constraints written to agent.py
3. **Design nodes** - Each node approved and written incrementally
4. **Connect edges** - Workflow graph with conditional routing
5. **Finalize** - Agent class, exports, and documentation
### Outputs
- ✅ `exports/agent_name/` package created
- ✅ Goal defined in agent.py
- ✅ 3-5 success criteria defined
- ✅ 1-5 constraints defined
- ✅ 5-10 nodes specified in nodes/__init__.py
- ✅ 8-15 edges connecting workflow
- ✅ Validated structure (passes `uv run python -m agent_name validate`)
- ✅ README.md with usage instructions
- ✅ CLI commands (info, validate, run, shell)
### Success Criteria
You're ready for Phase 2 when:
- Agent structure validates without errors
- All nodes and edges are defined
- CLI commands work (info, validate)
- You see: "Agent complete: exports/agent_name/"
### Common Outputs
The hive-create skill produces:
```
exports/agent_name/
├── __init__.py (package exports)
├── __main__.py (CLI interface)
├── agent.py (goal, graph, agent class)
├── nodes/__init__.py (node specifications)
├── config.py (configuration)
├── implementations.py (may be created for Python functions)
└── README.md (documentation)
```
### Next Steps
**If the structure is complete and validated:**
→ Check `exports/agent_name/STATUS.md` or `IMPLEMENTATION_GUIDE.md`
→ These files explain implementation options
→ You may need to add Python functions or MCP tools (not covered by current skills)
**If you want to optimize the design:**
→ Proceed to Phase 1.5 (hive-patterns)
**If you are ready to test:**
→ Proceed to Phase 2
## Phase 1.5: Optimize Design (Optional)
**Skill**: `/hive-patterns`
**Input**: Completed agent structure
### When to Use
- Want to add client-facing blocking or feedback edges
- Need judge patterns for output validation
- Want fan-out/fan-in (parallel execution)
- Need error handling patterns
- Want best practices guidance
### What This Phase Provides
- Client-facing interaction patterns
- Feedback edge routing with nullable output keys
- Judge patterns (implicit, SchemaJudge)
- Fan-out/fan-in parallel execution
- Context management and spillover patterns
- Anti-patterns to avoid
**Skip this phase** if your agent design is straightforward.
## Phase 2: Test & Validate
**Skill**: `/hive-test`
**Input**: Working agent from Phase 1
### What This Phase Does
Guides the creation and execution of a comprehensive test suite:
- Constraint tests
- Success criteria tests
- Edge case tests
- Integration tests
### Process
1. **Analyze agent** - Read goal, constraints, success criteria
2. **Generate tests** - The calling agent writes pytest files in `exports/agent_name/tests/` using hive-test guidelines and templates
3. **User approval** - Review and approve each test
4. **Run evaluation** - Execute tests and collect results
5. **Debug failures** - Identify and fix issues
6. **Iterate** - Repeat until all tests pass
### Outputs
- ✅ Test files in `exports/agent_name/tests/`
- ✅ Test report with pass/fail metrics
- ✅ Coverage of all success criteria
- ✅ Coverage of all constraints
- ✅ Edge case handling verified
### Success Criteria
You're done when:
- All tests pass
- All success criteria validated
- All constraints verified
- Agent handles edge cases
- Test coverage is comprehensive
### Next Steps
**Agent ready for:**
- Production deployment
- Integration into larger systems
- Documentation and handoff
- Continuous monitoring
## Phase Transitions
### From Phase 1 to Phase 2
**Trigger signals:**
- "Agent complete: exports/..."
- Structure validation passes
- README indicates implementation complete
**Before proceeding:**
- Verify agent can be imported: `from exports.agent_name import default_agent`
- Check if implementation is needed (see STATUS.md or IMPLEMENTATION_GUIDE.md)
- Confirm agent executes without import errors
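The import check above can be scripted. The following is a minimal sketch, not part of the hive tooling: `can_import` is a hypothetical helper name, and it assumes the repository root is on `sys.path` so that `exports.agent_name` resolves.

```python
import importlib
from typing import Optional


def can_import(module_name: str, attribute: Optional[str] = None) -> bool:
    """Return True if the module imports cleanly and, if given, exposes `attribute`."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # An agent package can fail at import time with any error
        # (missing dependency, syntax error, bad config), so catch broadly.
        return False
    return attribute is None or hasattr(module, attribute)


# Example (replace agent_name with the real package name):
#   can_import("exports.agent_name", "default_agent")
```

A `False` result only says that the import or attribute lookup failed; run the import by hand to see the actual traceback.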
### Skipping Phases
**When to skip Phase 1:**
- Agent structure already exists
- Only need to add tests
- Modifying existing agent
**When to skip Phase 2:**
- Prototyping or exploring
- Agent not production-bound
- Manual testing sufficient
## Common Patterns
### Pattern 1: Complete New Build (Simple)
```
User: "Build an agent that monitors files"
→ Use /hive-create
→ Agent structure created
→ Use /hive-test
→ Tests created and passing
→ Done: Production-ready agent
```
### Pattern 1b: Complete New Build (With Learning)
```
User: "Build an agent (first time)"
→ Use /hive-concepts (understand concepts)
→ Use /hive-create (build structure)
→ Use /hive-patterns (optimize design)
→ Use /hive-test (validate)
→ Done: Production-ready agent
```
### Pattern 1c: Build from Template
```
User: "Build an agent based on the deep research template"
→ Use /hive-create
→ Select "From a template" path
→ Pick template, name new agent
→ Review/modify goal, nodes, graph
→ Agent exported with customizations
→ Use /hive-test
→ Done: Customized agent
```
### Pattern 2: Test Existing Agent
```
User: "Test my agent at exports/my_agent"
→ Skip Phase 1
→ Use /hive-test directly
→ Tests created
→ Done: Validated agent
```
### Pattern 3: Iterative Development
```
User: "Build an agent"
→ Use /hive-create (Phase 1)
→ Implementation needed (see STATUS.md)
→ [User implements functions]
→ Use /hive-test (Phase 2)
→ Tests reveal bugs
→ [Fix bugs manually]
→ Re-run tests
→ Done: Working agent
```
### Pattern 4: Agent with Review Loops and HITL Checkpoints
```
User: "Build an agent with human review and feedback loops"
→ Use /hive-concepts (learn event loop, client-facing nodes)
→ Use /hive-create (build structure with feedback edges)
→ Use /hive-patterns (implement client-facing + feedback patterns)
→ Use /hive-test (validate review flows and edge routing)
→ Done: Agent with HITL checkpoints and review loops
```
## Skill Dependencies
```
hive (meta-skill)
├── hive-concepts (foundational)
│ ├── Architecture concepts (event loop, judges)
│ ├── Node types (event_loop, function)
│ ├── Edge routing and priority
│ ├── Tool discovery procedures
│ └── Workflow overview
├── hive-create (procedural)
│ ├── Creates package structure
│ ├── Defines goal
│ ├── Adds nodes (event_loop, function)
│ ├── Connects edges with priority routing
│ ├── Finalizes agent class
│ └── Requires: hive-concepts
├── hive-patterns (reference)
│ ├── Client-facing interaction patterns
│ ├── Feedback edges and review loops
│ ├── Judge patterns (implicit, SchemaJudge)
│ ├── Fan-out/fan-in parallel execution
│ └── Context management and anti-patterns
├── hive-credentials (utility)
│ ├── Detects missing credentials
│ ├── Offers auth method choices (Aden OAuth, direct API key)
│ ├── Stores securely in ~/.hive/credentials
│ └── Validates with health checks
├── hive-test (validation)
│ ├── Reads agent goal
│ ├── Generates tests
│ ├── Runs evaluation
│ └── Reports results
└── hive-debugger (troubleshooting)
  ├── Monitors runtime logs (L1/L2/L3)
  ├── Identifies retry loops, tool failures
  ├── Categorizes issues (10 categories)
  └── Provides fix recommendations
```
## Troubleshooting
### "Agent structure won't validate"
- Check that node IDs match between `nodes/__init__.py` and `agent.py`
- Verify all edges reference valid node IDs
- Ensure `entry_node` exists in the nodes list
- Run: `PYTHONPATH=exports uv run python -m agent_name validate`
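The first three checks can be approximated by hand while debugging. This is a rough sketch under stated assumptions: `find_structure_errors` is a hypothetical helper, and representing edges as plain `(source, target)` ID pairs is an assumption for illustration, not the framework's edge type. The `validate` command remains the authoritative check.

```python
def find_structure_errors(node_ids, edges, entry_node):
    """Return a list of human-readable problems; an empty list means the checks passed.

    node_ids:   iterable of node ID strings
    edges:      iterable of (source_id, target_id) pairs (assumed shape)
    entry_node: ID of the node where execution starts
    """
    known = set(node_ids)
    errors = []
    if entry_node not in known:
        errors.append(f"entry node {entry_node!r} is not in the node list")
    for source, target in edges:
        for endpoint in (source, target):
            if endpoint not in known:
                errors.append(
                    f"edge {source!r} -> {target!r} references unknown node {endpoint!r}"
                )
    return errors
```

Feeding it the IDs from `nodes/__init__.py` and the edge endpoints from `agent.py` shows which ID is misspelled or missing.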
### "Agent has structure but won't run"
- Check for STATUS.md or IMPLEMENTATION_GUIDE.md in agent directory
- Implementation may be needed (Python functions or MCP tools)
- This is expected - hive-create creates structure, not implementation
- See implementation guide for completion options
### "Tests are failing"
- Review test output for specific failures
- Check agent goal and success criteria
- Verify constraints are met
- Use `/hive-test` to debug and iterate
- Fix agent code and re-run tests
### "Agent is failing at runtime"
- Use `/hive-debugger` to analyze runtime logs
- The debugger identifies retry loops, tool failures, and stalled execution
- Get actionable fix recommendations with code changes
- Monitor the agent in real-time during TUI sessions
### "Not sure which phase I'm in"
Run these checks:
```bash
# Check if agent structure exists
ls exports/my_agent/agent.py
# Check if it validates
PYTHONPATH=exports uv run python -m my_agent validate
# Check if tests exist
ls exports/my_agent/tests/
# If structure exists and validates → Phase 2 (testing)
# If structure doesn't exist → Phase 1 (building)
# If tests exist but failing → Debug phase
```
## Best Practices
### For Phase 1 (Building)
1. **Start with clear requirements** - Know what the agent should do
2. **Define success criteria early** - Measurable goals drive design
3. **Keep nodes focused** - One responsibility per node
4. **Use descriptive names** - Node IDs should explain purpose
5. **Validate incrementally** - Check structure after each major addition
### For Phase 2 (Testing)
1. **Test constraints first** - Hard requirements must pass
2. **Mock external dependencies** - Use mock mode for LLMs/APIs
3. **Cover edge cases** - Test failures, not just success paths
4. **Iterate quickly** - Fix one test at a time
5. **Document test patterns** - Future tests follow same structure
### General Workflow
1. **Use version control** - Git commit after each phase
2. **Document decisions** - Update README with changes
3. **Keep iterations small** - Build → Test → Fix → Repeat
4. **Preserve working states** - Tag successful iterations
5. **Learn from failures** - Failed tests reveal design issues
## Exit Criteria
You're done with the workflow when:
- ✅ Agent structure validates
- ✅ All tests pass
- ✅ Success criteria met
- ✅ Constraints verified
- ✅ Documentation complete
- ✅ Agent ready for deployment
## Additional Resources
- **hive-concepts**: See `.claude/skills/hive-concepts/SKILL.md`
- **hive-create**: See `.claude/skills/hive-create/SKILL.md`
- **hive-patterns**: See `.claude/skills/hive-patterns/SKILL.md`
- **hive-test**: See `.claude/skills/hive-test/SKILL.md`
- **Agent framework docs**: See `core/README.md`
- **Example agents**: See `exports/` directory
## Summary
This workflow provides a proven path from concept to production-ready agent:
1. **Learn** with `/hive-concepts` → Understand fundamentals (optional)
2. **Build** with `/hive-create` → Get validated structure
3. **Optimize** with `/hive-patterns` → Apply best practices (optional)
4. **Configure** with `/hive-credentials` → Set up API keys (if needed)
5. **Test** with `/hive-test` → Get verified functionality
6. **Debug** with `/hive-debugger` → Fix runtime issues (if needed)
The workflow is **flexible** - skip phases as needed, iterate freely, and adapt to your specific requirements. The goal is **production-ready agents** built with **consistent, repeatable processes**.
## Skill Selection Guide
**Choose hive-concepts when:**
- First time building agents
- Need to understand event loop architecture
- Validating tool availability
- Learning about node types, edges, and judges
**Choose hive-create when:**
- Actually building an agent
- Have clear requirements
- Ready to write code
- Want step-by-step guidance
- Want to start from an existing template and customize it
**Choose hive-patterns when:**
- Agent structure complete
- Need client-facing nodes or feedback edges
- Implementing review loops or fan-out/fan-in
- Want judge patterns or context management
- Want best practices
**Choose hive-test when:**
- Agent structure complete
- Ready to validate functionality
- Need comprehensive test coverage
- Testing feedback loops, output keys, or fan-out
**Choose hive-debugger when:**
- Agent is failing or stuck at runtime
- Seeing retry loops or escalations
- Tool calls are failing
- Need to understand why a node isn't completing
- Want real-time monitoring of agent execution
@@ -1,199 +0,0 @@
# Example: File Monitor Agent
This example shows the complete /hive workflow in action for building a file monitoring agent.
## Initial Request
```
User: "Build an agent that monitors ~/Downloads and copies new files to ~/Documents"
```
## Phase 1: Building (20 minutes)
### Step 1: Create Structure
Agent invokes `/hive-create` skill and:
1. Creates `exports/file_monitor_agent/` package
2. Writes skeleton files (`__init__.py`, `__main__.py`, `agent.py`, etc.)
**Output**: Package structure visible immediately
### Step 2: Define Goal
```python
goal = Goal(
    id="file-monitor-copy",
    name="Automated File Monitor & Copy",
    success_criteria=[
        # 100% detection rate
        # 100% copy success
        # 100% conflict resolution
        # >99% uptime
    ],
    constraints=[
        # Preserve originals
        # Handle errors gracefully
        # Track state
        # Respect permissions
    ]
)
```
**Output**: Goal written to agent.py
### Step 3: Design Nodes
7 nodes approved and written incrementally:
1. `initialize-state` - Set up tracking
2. `list-downloads` - Scan directory
3. `identify-new-files` - Find new files
4. `check-for-new-files` - Router
5. `copy-files` - Copy with conflict resolution
6. `update-state` - Mark as processed
7. `wait-interval` - Sleep between cycles
**Output**: All nodes in `nodes/__init__.py`
### Step 4: Connect Edges
8 edges connecting the workflow loop:
```
initialize → list → identify → check → wait → list (loop)
                                 ↓      ↑
                               copy → update
```
**Output**: Edges written to agent.py
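For reference, one reading of the diagram as an explicit edge list is sketched below. The short node names come from the diagram (the full IDs are listed in Step 3). The branch conditions in the comments are assumptions, and plain tuples stand in for whatever edge type the framework actually uses.

```python
# (source, target) pairs; eight edges over seven nodes.
EDGES = [
    ("initialize", "list"),
    ("list", "identify"),
    ("identify", "check"),
    ("check", "copy"),    # assumed: taken when new files were found
    ("check", "wait"),    # assumed: taken when nothing is new
    ("copy", "update"),
    ("update", "wait"),
    ("wait", "list"),     # loop back for the next cycle
]


def successors(node):
    """Return the nodes reachable from `node` in one step, in declaration order."""
    return [target for source, target in EDGES if source == node]
```

Only the router node (`check`) has more than one outgoing edge, so every cycle passes through `wait` before the next scan.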
### Step 5: Finalize
```bash
$ PYTHONPATH=exports uv run python -m file_monitor_agent validate
✓ Agent is valid
$ PYTHONPATH=exports uv run python -m file_monitor_agent info
Agent: File Monitor & Copy Agent
Nodes: 7
Edges: 8
```
**Phase 1 Complete**: Structure validated ✅
### Status After Phase 1
```
exports/file_monitor_agent/
├── __init__.py ✅ (exports)
├── __main__.py ✅ (CLI)
├── agent.py ✅ (goal, graph, agent class)
├── nodes/__init__.py ✅ (7 nodes)
├── config.py ✅ (configuration)
├── implementations.py ✅ (Python functions)
├── README.md ✅ (documentation)
├── IMPLEMENTATION_GUIDE.md ✅ (next steps)
└── STATUS.md ✅ (current state)
```
**Note**: An implementation gap remains at this stage: the data flow between nodes still needs to be connected (see STATUS.md).
## Phase 2: Testing (25 minutes)
### Step 1: Analyze Agent
Agent invokes `/hive-test` skill and:
1. Reads goal from `exports/file_monitor_agent/agent.py`
2. Identifies 4 success criteria to test
3. Identifies 4 constraints to verify
4. Plans test coverage
### Step 2: Generate Tests
Creates test files:
```
exports/file_monitor_agent/tests/
├── conftest.py (fixtures)
├── test_constraints.py (4 constraint tests)
├── test_success_criteria.py (4 success tests)
└── test_edge_cases.py (error handling)
```
Tests approved incrementally by user.
### Step 3: Run Tests
```bash
$ PYTHONPATH=exports uv run pytest exports/file_monitor_agent/tests/
test_constraints.py::test_preserves_originals PASSED
test_constraints.py::test_handles_errors PASSED
test_constraints.py::test_tracks_state PASSED
test_constraints.py::test_respects_permissions PASSED
test_success_criteria.py::test_detects_all_files PASSED
test_success_criteria.py::test_copies_all_files PASSED
test_success_criteria.py::test_resolves_conflicts PASSED
test_success_criteria.py::test_continuous_run PASSED
test_edge_cases.py::test_empty_directory PASSED
test_edge_cases.py::test_permission_denied PASSED
test_edge_cases.py::test_disk_full PASSED
test_edge_cases.py::test_large_files PASSED
========================== 12 passed in 3.42s ==========================
```
**Phase 2 Complete**: All tests pass ✅
## Final Output
**Production-Ready Agent:**
```bash
# Run the agent
./RUN_AGENT.sh
# Or manually
PYTHONPATH=exports uv run python -m file_monitor_agent run
```
**Capabilities:**
- Monitors ~/Downloads continuously
- Copies new files to ~/Documents
- Resolves conflicts with timestamps
- Handles errors gracefully
- Tracks processed files
- Runs as background service
**Total Time**: ~45 minutes from concept to production
## Key Learnings
1. **Incremental building** - Files written immediately, visible throughout
2. **Validation early** - Structure validated before moving to implementation
3. **Test-driven** - Tests reveal real behavior
4. **Documentation included** - README, STATUS, and guides auto-generated
5. **Repeatable process** - Same workflow for any agent type
## Variations
**For simpler agents:**
- Fewer nodes (3-5 instead of 7)
- Simpler workflow (linear instead of looping)
- Faster build time (10-15 minutes)
**For complex agents:**
- More nodes (10-15+)
- Multiple subgraphs
- Pause/resume points for human-in-the-loop
- Longer build time (45-60 minutes)
The workflow scales to your needs!
-7
@@ -1,7 +0,0 @@
# Project-level Codex config for Hive.
# Keep this file minimal: MCP connectivity + skill discovery.
[mcp_servers.agent-builder]
command = "uv"
args = ["run", "--directory", "core", "-m", "framework.mcp.agent_builder_server"]
cwd = "."
-20
@@ -1,20 +0,0 @@
{
  "mcpServers": {
    "agent-builder": {
      "command": "python",
      "args": ["-m", "framework.mcp.agent_builder_server"],
      "cwd": "core",
      "env": {
        "PYTHONPATH": "../tools/src"
      }
    },
    "tools": {
      "command": "python",
      "args": ["mcp_server.py", "--stdio"],
      "cwd": "tools",
      "env": {
        "PYTHONPATH": "src"
      }
    }
  }
}
-1
@@ -1 +0,0 @@
../../.claude/skills/hive
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-concepts
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-create
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-credentials
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-patterns
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-test
@@ -0,0 +1,89 @@
name: Integration Bounty
description: A bounty task for the integration contribution program
title: "[Bounty]: "
labels: []
body:
- type: markdown
attributes:
value: |
## Integration Bounty
This issue is part of the [Integration Bounty Program](../../docs/bounty-program/README.md).
**Claim this bounty** by commenting below — a maintainer will assign you within 24 hours.
- type: dropdown
id: bounty-type
attributes:
label: Bounty Type
options:
- "Test a Tool (20 pts)"
- "Write Docs (20 pts)"
- "Code Contribution (30 pts)"
- "New Integration (75 pts)"
validations:
required: true
- type: dropdown
id: difficulty
attributes:
label: Difficulty
options:
- Easy
- Medium
- Hard
validations:
required: true
- type: input
id: tool-name
attributes:
label: Tool Name
description: The integration this bounty targets (e.g., `airtable`, `salesforce`)
placeholder: e.g., airtable
validations:
required: true
- type: textarea
id: description
attributes:
label: Description
description: What needs to be done to complete this bounty.
placeholder: |
Describe the specific task, including:
- What the contributor needs to do
- Links to relevant files in the repo
- Any setup requirements (API keys, accounts, etc.)
validations:
required: true
- type: textarea
id: acceptance-criteria
attributes:
label: Acceptance Criteria
description: What "done" looks like. The PR or report must meet all criteria.
placeholder: |
- [ ] Criterion 1
- [ ] Criterion 2
- [ ] CI passes
validations:
required: true
- type: textarea
id: relevant-files
attributes:
label: Relevant Files
description: Links to tool directory, credential spec, health check file, etc.
placeholder: |
- Tool: `tools/src/aden_tools/tools/{tool_name}/`
- Credential spec: `tools/src/aden_tools/credentials/{category}.py`
- Health checks: `tools/src/aden_tools/credentials/health_check.py`
- type: textarea
id: resources
attributes:
label: Resources
description: Links to API docs, examples, or guides that will help the contributor.
placeholder: |
- [Building Tools Guide](../../tools/BUILDING_TOOLS.md)
- [Tool README Template](../../docs/bounty-program/templates/tool-readme-template.md)
- API docs: https://...
@@ -0,0 +1,78 @@
name: Standard Bounty
description: A bounty task for general framework contributions (not integration-specific)
title: "[Bounty]: "
labels: []
body:
- type: markdown
attributes:
value: |
## Standard Bounty
This issue is part of the [Bounty Program](../../docs/bounty-program/README.md).
**Claim this bounty** by commenting below — a maintainer will assign you within 24 hours.
- type: dropdown
id: bounty-size
attributes:
label: Bounty Size
options:
- "Small (10 pts)"
- "Medium (30 pts)"
- "Large (75 pts)"
- "Extreme (150 pts)"
validations:
required: true
- type: dropdown
id: difficulty
attributes:
label: Difficulty
options:
- Easy
- Medium
- Hard
validations:
required: true
- type: textarea
id: description
attributes:
label: Description
description: What needs to be done to complete this bounty.
placeholder: |
Describe the specific task, including:
- What the contributor needs to do
- Links to relevant files in the repo
- Any context or motivation for the change
validations:
required: true
- type: textarea
id: acceptance-criteria
attributes:
label: Acceptance Criteria
description: What "done" looks like. The PR must meet all criteria.
placeholder: |
- [ ] Criterion 1
- [ ] Criterion 2
- [ ] CI passes
validations:
required: true
- type: textarea
id: relevant-files
attributes:
label: Relevant Files
description: Links to files or directories related to this bounty.
placeholder: |
- `path/to/file.py`
- `path/to/directory/`
- type: textarea
id: resources
attributes:
label: Resources
description: Links to docs, issues, or external references that will help.
placeholder: |
- Related issue: #XXXX
- Docs: https://...
+37
@@ -0,0 +1,37 @@
name: Bounty completed
description: Awards points and notifies Discord when a bounty PR is merged
on:
  pull_request:
    types: [closed]
jobs:
  bounty-notify:
    if: >
      github.event.pull_request.merged == true &&
      contains(join(github.event.pull_request.labels.*.name, ','), 'bounty:')
    runs-on: ubuntu-latest
    timeout-minutes: 5
    permissions:
      contents: read
      pull-requests: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Setup Bun
        uses: oven-sh/setup-bun@v2
        with:
          bun-version: latest
      - name: Award XP and notify Discord
        run: bun run scripts/bounty-tracker.ts notify
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITHUB_REPOSITORY_OWNER: ${{ github.repository_owner }}
          GITHUB_REPOSITORY_NAME: ${{ github.event.repository.name }}
          DISCORD_WEBHOOK_URL: ${{ secrets.DISCORD_BOUNTY_WEBHOOK_URL }}
          LURKR_API_KEY: ${{ secrets.LURKR_API_KEY }}
          LURKR_GUILD_ID: ${{ secrets.LURKR_GUILD_ID }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
+18 -7
@@ -5,7 +5,7 @@ on:
branches: [main]
pull_request:
branches: [main]
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
@@ -24,6 +24,8 @@ jobs:
- name: Install uv
uses: astral-sh/setup-uv@v4
with:
enable-cache: true
- name: Install dependencies
run: uv sync --project core --group dev
@@ -54,16 +56,21 @@ jobs:
- name: Install uv
uses: astral-sh/setup-uv@v4
with:
enable-cache: true
- name: Install dependencies and run tests
working-directory: core
run: |
cd core
uv sync
uv run pytest tests/ -v
test-tools:
name: Test Tools
runs-on: ubuntu-latest
name: Test Tools (${{ matrix.os }})
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-latest, windows-latest]
steps:
- uses: actions/checkout@v4
@@ -74,10 +81,12 @@ jobs:
- name: Install uv
uses: astral-sh/setup-uv@v4
with:
enable-cache: true
- name: Install dependencies and run tests
working-directory: tools
run: |
cd tools
uv sync --extra dev
uv run pytest tests/ -v
@@ -95,10 +104,12 @@ jobs:
- name: Install uv
uses: astral-sh/setup-uv@v4
with:
enable-cache: true
- name: Install dependencies
working-directory: core
run: |
cd core
uv sync
- name: Validate exported agents
+126
@@ -0,0 +1,126 @@
name: Link Discord account
description: Auto-creates a PR to add contributor to contributors.yml when a link-discord issue is opened
on:
issues:
types: [opened]
jobs:
link-discord:
if: contains(github.event.issue.labels.*.name, 'link-discord')
runs-on: ubuntu-latest
timeout-minutes: 2
permissions:
contents: write
issues: write
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Parse issue and update contributors.yml
uses: actions/github-script@v7
with:
script: |
const fs = require('fs');
const issue = context.payload.issue;
const githubUsername = issue.user.login;
// Parse the issue body for form fields
const body = issue.body || '';
// Extract Discord ID — look for the numeric value after the "Discord User ID" heading
const discordMatch = body.match(/### Discord User ID\s*\n\s*(\d{17,20})/);
if (!discordMatch) {
await github.rest.issues.createComment({
...context.repo,
issue_number: issue.number,
body: `Could not find a valid Discord ID in the issue body. Please make sure you entered a numeric ID (17-20 digits), not a username.\n\nExample: \`123456789012345678\``
});
await github.rest.issues.update({
...context.repo,
issue_number: issue.number,
state: 'closed',
state_reason: 'not_planned'
});
return;
}
const discordId = discordMatch[1];
// Extract display name (optional)
const nameMatch = body.match(/### Display Name \(optional\)\s*\n\s*(.+)/);
const displayName = nameMatch ? nameMatch[1].trim() : '';
// Check if user already exists
const yml = fs.readFileSync('contributors.yml', 'utf-8');
if (yml.includes(`github: ${githubUsername}`)) {
await github.rest.issues.createComment({
...context.repo,
issue_number: issue.number,
body: `@${githubUsername} is already in \`contributors.yml\`. If you need to update your Discord ID, please edit the file directly via PR.`
});
await github.rest.issues.update({
...context.repo,
issue_number: issue.number,
state: 'closed',
state_reason: 'completed'
});
return;
}
// Append entry to contributors.yml
let entry = ` - github: ${githubUsername}\n discord: "${discordId}"`;
if (displayName && displayName !== '_No response_') {
entry += `\n name: ${displayName}`;
}
entry += '\n';
const updated = yml.trimEnd() + '\n' + entry;
fs.writeFileSync('contributors.yml', updated);
// Set outputs for commit step
core.exportVariable('GITHUB_USERNAME', githubUsername);
core.exportVariable('DISCORD_ID', discordId);
core.exportVariable('ISSUE_NUMBER', issue.number.toString());
- name: Create PR
run: |
# Check if there are changes
if git diff --quiet contributors.yml; then
echo "No changes to contributors.yml"
exit 0
fi
BRANCH="docs/link-discord-${GITHUB_USERNAME}"
git config user.name "github-actions[bot]"
git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
git checkout -b "$BRANCH"
git add contributors.yml
git commit -m "docs: link @${GITHUB_USERNAME} to Discord"
git push origin "$BRANCH"
gh pr create \
--title "docs: link @${GITHUB_USERNAME} to Discord" \
--body "Adds @${GITHUB_USERNAME} (Discord \`${DISCORD_ID}\`) to \`contributors.yml\` for bounty XP tracking.
Closes #${ISSUE_NUMBER}" \
--base main \
--head "$BRANCH" \
--label "link-discord"
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Notify on issue
uses: actions/github-script@v7
with:
script: |
const username = process.env.GITHUB_USERNAME;
const issueNumber = parseInt(process.env.ISSUE_NUMBER);
await github.rest.issues.createComment({
...context.repo,
issue_number: issueNumber,
body: `A PR has been created to link your account. A maintainer will merge it shortly — once merged, you'll receive XP and Discord pings when your bounty PRs are merged.`
});
@@ -0,0 +1,54 @@
# Closes PRs that still have the `pr-requirements-warning` label
# after contributors were warned in pr-requirements.yml.
name: PR Requirements Enforcement
on:
schedule:
- cron: "0 0 * * *" # runs every day once at midnight
jobs:
enforce:
name: Close PRs still failing contribution requirements
runs-on: ubuntu-latest
permissions:
pull-requests: write
issues: write
steps:
- name: Close PRs still failing requirements
uses: actions/github-script@v7
with:
script: |
const { owner, repo } = context.repo;
const prs = await github.paginate(github.rest.pulls.list, {
owner,
repo,
state: "open",
per_page: 100
});
for (const pr of prs) {
// Skip draft PRs — author may still be actively working toward compliance
if (pr.draft) continue;
const labels = pr.labels.map(l => l.name);
if (!labels.includes("pr-requirements-warning")) continue;
const gracePeriod = 24 * 60 * 60 * 1000;
const lastUpdated = new Date(pr.created_at);
const now = new Date();
if (now - lastUpdated < gracePeriod) {
console.log(`Skipping PR #${pr.number} — still within grace period`);
continue;
}
const prNumber = pr.number;
const prAuthor = pr.user.login;
await github.rest.issues.createComment({
owner,
repo,
issue_number: prNumber,
body: `Closing PR because the contribution requirements were not resolved within the 24-hour grace period.
If this was closed in error, feel free to reopen the PR after fixing the requirements.`
});
await github.rest.pulls.update({
owner,
repo,
pull_number: prNumber,
state: "closed"
});
console.log(`Closed PR #${prNumber} by ${prAuthor} (PR requirements were not met)`);
}
+31 -17
@@ -43,9 +43,10 @@ jobs:
console.log(` Found issue references: ${issueNumbers.length > 0 ? issueNumbers.join(', ') : 'none'}`);
if (issueNumbers.length === 0) {
const message = `## PR Closed - Requirements Not Met
const message = `## PR Requirements Warning
This PR has been automatically closed because it doesn't meet the requirements.
This PR does not meet the contribution requirements.
If the issue is not fixed within ~24 hours, it may be automatically closed.
**Missing:** No linked issue found.
@@ -67,14 +68,15 @@ jobs:
**Why is this required?** See #472 for details.`;
const comments = await github.rest.issues.listComments({
const comments = await github.paginate(github.rest.issues.listComments, {
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
per_page: 100,
});
const botComment = comments.data.find(
(c) => c.user.type === 'Bot' && c.body.includes('PR Closed - Requirements Not Met')
const botComment = comments.find(
(c) => c.user.type === 'Bot' && c.body.includes('PR Requirements Warning')
);
if (!botComment) {
@@ -86,11 +88,11 @@ jobs:
});
}
await github.rest.pulls.update({
await github.rest.issues.addLabels({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
issue_number: prNumber,
labels: ['pr-requirements-warning'],
});
core.setFailed('PR must reference an issue');
@@ -132,9 +134,10 @@ jobs:
`#${i.number} (assignees: ${i.assignees.length > 0 ? i.assignees.join(', ') : 'none'})`
).join(', ');
const message = `## PR Closed - Requirements Not Met
const message = `## PR Requirements Warning
This PR has been automatically closed because it doesn't meet the requirements.
This PR does not meet the contribution requirements.
If the issue is not fixed within ~24 hours, it may be automatically closed.
**PR Author:** @${prAuthor}
**Found issues:** ${issueList}
@@ -157,14 +160,15 @@ jobs:
**Why is this required?** See #472 for details.`;
const comments = await github.rest.issues.listComments({
const comments = await github.paginate(github.rest.issues.listComments, {
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
per_page: 100,
});
const botComment = comments.data.find(
(c) => c.user.type === 'Bot' && c.body.includes('PR Closed - Requirements Not Met')
const botComment = comments.find(
(c) => c.user.type === 'Bot' && c.body.includes('PR Requirements Warning')
);
if (!botComment) {
@@ -176,14 +180,24 @@ jobs:
});
}
await github.rest.pulls.update({
await github.rest.issues.addLabels({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
issue_number: prNumber,
labels: ['pr-requirements-warning'],
});
core.setFailed('PR author must be assigned to the linked issue');
} else {
console.log(`PR requirements met! Issue #${issueWithAuthorAssigned} has ${prAuthor} as assignee.`);
}
try {
await github.rest.issues.removeLabel({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
name: "pr-requirements-warning"
});
}catch (error){
//ignore if label doesn't exist
}
}
+40
@@ -0,0 +1,40 @@
name: Weekly bounty leaderboard
description: Posts the integration bounty leaderboard to Discord every Monday
on:
  schedule:
    # Every Monday at 9:00 UTC
    - cron: "0 9 * * 1"
  workflow_dispatch:
    inputs:
      since_date:
        description: "Only count PRs merged after this date (YYYY-MM-DD). Leave empty for all-time."
        required: false
jobs:
  leaderboard:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    permissions:
      contents: read
      pull-requests: read
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Setup Bun
        uses: oven-sh/setup-bun@v2
        with:
          bun-version: latest
      - name: Post leaderboard to Discord
        run: bun run scripts/bounty-tracker.ts leaderboard
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITHUB_REPOSITORY_OWNER: ${{ github.repository_owner }}
          GITHUB_REPOSITORY_NAME: ${{ github.event.repository.name }}
          DISCORD_WEBHOOK_URL: ${{ secrets.DISCORD_BOUNTY_WEBHOOK_URL }}
          LURKR_API_KEY: ${{ secrets.LURKR_API_KEY }}
          LURKR_GUILD_ID: ${{ secrets.LURKR_GUILD_ID }}
          SINCE_DATE: ${{ github.event.inputs.since_date || '' }}
+1 -2
@@ -67,8 +67,6 @@ temp/
exports/*
.agent-builder-sessions/*
.claude/settings.local.json
.claude/skills/ship-it/
@@ -79,3 +77,4 @@ core/tests/*dumps/*
screenshots/*
.gemini/*
+1 -7
@@ -1,9 +1,3 @@
{
"mcpServers": {
"agent-builder": {
"command": "uv",
"args": ["run", "-m", "framework.mcp.agent_builder_server"],
"cwd": "core"
}
}
"mcpServers": {}
}
-30
@@ -1,30 +0,0 @@
{
  "mcpServers": {
    "agent-builder": {
      "command": "uv",
      "args": [
        "run",
        "python",
        "-m",
        "framework.mcp.agent_builder_server"
      ],
      "cwd": "core",
      "env": {
        "PYTHONPATH": "../tools/src"
      }
    },
    "tools": {
      "command": "uv",
      "args": [
        "run",
        "python",
        "mcp_server.py",
        "--stdio"
      ],
      "cwd": "tools",
      "env": {
        "PYTHONPATH": "src"
      }
    }
  }
}
-1
@@ -1 +0,0 @@
../../.claude/skills/hive
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-concepts
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-create
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-credentials
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-debugger
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-patterns
-1
@@ -1 +0,0 @@
../../.claude/skills/hive-test
-1
@@ -1 +0,0 @@
../../.claude/skills/triage-issue
-7
@@ -1,7 +0,0 @@
{
  "recommendations": [
    "charliermarsh.ruff",
    "editorconfig.editorconfig",
    "ms-python.python"
  ]
}
+150 -27
@@ -1,17 +1,149 @@
# Release Notes
## v0.7.1
**Release Date:** March 13, 2026
**Tag:** v0.7.1
### Chrome-Native Browser Control
v0.7.1 replaces Playwright with direct Chrome DevTools Protocol (CDP) integration. The GCU now launches the user's system Chrome via `open -n` on macOS, connects over CDP, and manages browser lifecycle end-to-end -- no extra browser binary required.
---
### Highlights
#### System Chrome via CDP
The entire GCU browser stack has been rewritten:
- **Chrome finder & launcher** -- New `chrome_finder.py` discovers installed Chrome and `chrome_launcher.py` manages process lifecycle with `--remote-debugging-port`
- **Coexist with user's browser** -- `open -n` on macOS launches a separate Chrome instance so the user's tabs stay untouched
- **Dynamic viewport sizing** -- Viewport auto-sizes to the available display area, suppressing Chrome warning bars
- **Orphan cleanup** -- Chrome processes are killed on GCU server shutdown to prevent leaks
- **`--no-startup-window`** -- Chrome launches without opening a window until a page is needed
#### Per-Subagent Browser Isolation
Each GCU subagent gets its own Chrome user-data directory, preventing cookie/session cross-contamination:
- Unique browser profiles injected per subagent
- Profiles cleaned up after top-level GCU node execution
- Tab origin and age metadata tracked per subagent
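
The isolation described above can be sketched roughly as follows. This is a minimal illustration, not the GCU implementation: `make_profile_dir`, `chrome_args`, and `cleanup_profiles` are hypothetical helper names, and the port numbers are arbitrary. The only Chrome flags used are `--user-data-dir`, `--remote-debugging-port`, and the `--no-startup-window` flag mentioned in these notes.

```python
import shutil
import tempfile
from pathlib import Path


def make_profile_dir(subagent_id: str) -> Path:
    """Create a fresh, unique Chrome user-data directory for one subagent (hypothetical helper)."""
    return Path(tempfile.mkdtemp(prefix=f"gcu_profile_{subagent_id}_"))


def chrome_args(profile_dir: Path, debug_port: int) -> list[str]:
    """Build Chrome flags that point the browser at the isolated profile (hypothetical helper)."""
    return [
        f"--user-data-dir={profile_dir}",
        f"--remote-debugging-port={debug_port}",
        "--no-startup-window",
    ]


def cleanup_profiles(profile_dirs: list[Path]) -> None:
    """Delete the profile directories once the top-level node has finished."""
    for directory in profile_dirs:
        shutil.rmtree(directory, ignore_errors=True)


# One profile per subagent, so cookies and sessions never share a directory.
profiles = {name: make_profile_dir(name) for name in ("researcher", "scraper")}
for port, (name, path) in enumerate(profiles.items(), start=9222):
    print(name, chrome_args(path, port))

cleanup_profiles(list(profiles.values()))
```

Because each subagent receives its own directory, removing one profile cannot affect another subagent's session state.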
#### Dummy Agent Testing Framework
A comprehensive test suite for validating agent graph patterns without LLM calls:
- 8 test modules covering echo, pipeline, branch, parallel merge, retry, feedback loop, worker, and GCU subagent patterns
- Shared fixtures and a `run_all.py` runner for CI integration
- Subagent lifecycle tests
---
### What's New
#### GCU Browser
- **Switch from Playwright to system Chrome via CDP** -- Direct CDP connection replaces Playwright dependency. (@bryanadenhq)
- **Chrome finder and launcher modules** -- `chrome_finder.py` and `chrome_launcher.py` for cross-platform Chrome discovery and process management. (@bryanadenhq)
- **Dynamic viewport sizing** -- Auto-size viewport and suppress Chrome warning bar. (@bryanadenhq)
- **Per-subagent browser profile isolation** -- Unique user-data directories per subagent with cleanup. (@bryanadenhq)
- **Tab origin/age metadata** -- Track which subagent opened each tab and when. (@bryanadenhq)
- **`browser_close_all` tool** -- Bulk tab cleanup for agents managing many pages. (@bryanadenhq)
- **Auto-track popup pages** -- Popups are automatically captured and tracked. (@bryanadenhq)
- **Auto-snapshot from browser interactions** -- Browser interaction tools return screenshots automatically. (@bryanadenhq)
- **Kill orphaned Chrome processes** -- GCU server shutdown cleans up lingering Chrome instances. (@bryanadenhq)
- **`--no-startup-window` Chrome flag** -- Prevent empty window on launch. (@bryanadenhq)
- **Launch Chrome via `open -n` on macOS** -- Coexist with the user's running browser. (@bryanadenhq)
#### Framework & Runtime
- **Session resume fix for new agents** -- Correctly resume sessions when a new agent is loaded. (@bryanadenhq)
- **Queen upsert fix** -- Prevent duplicate queen entries on session restore. (@bryanadenhq)
- **Anchor worker monitoring to queen's session ID on cold-restore** -- Worker monitors reconnect to the correct queen after restart. (@bryanadenhq)
- **Update meta.json when loading workers** -- Worker metadata stays in sync with runtime state. (@RichardTang-Aden)
- **Generate worker MCP file correctly** -- Fix MCP config generation for spawned workers. (@RichardTang-Aden)
- **Share event bus so tool events are visible to parent** -- Tool execution events propagate up to parent graphs. (@bryanadenhq)
- **Subagent activity tracking in queen status** -- Queen instructions include live subagent status. (@bryanadenhq)
- **GCU system prompt updates** -- Auto-snapshots, batching, popup tracking, and close_all guidance. (@bryanadenhq)
#### Frontend
- **Loading spinner in draft panel** -- Shows spinner during planning phase instead of blank panel. (@bryanadenhq)
- **Fix credential modal errors** -- Modal no longer eats errors; banner stays visible. (@bryanadenhq)
- **Fix credentials_required loop** -- Stop clearing the flag on modal close to prevent infinite re-prompting. (@bryanadenhq)
- **Fix "Add tab" dropdown overflow** -- Dropdown no longer hidden when many agents are open. (@prasoonmhwr)
#### Testing
- **Dummy agent test framework** -- 8 test modules (echo, pipeline, branch, parallel merge, retry, feedback loop, worker, GCU subagent) with shared fixtures and CI runner. (@bryanadenhq)
- **Subagent lifecycle tests** -- Validate subagent spawn and completion flows. (@bryanadenhq)
#### Documentation & Infrastructure
- **MCP integration PRD** -- Product requirements for MCP server registry. (@TimothyZhang7)
- **Skills registry PRD** -- Product requirements for skill registry system. (@bryanadenhq)
- **Bounty program updates** -- Standard bounty issue template and updated contributor guide. (@bryanadenhq)
- **Windows quickstart** -- Add default context limit for PowerShell setup. (@bryanadenhq)
- **Remove deprecated files** -- Clean up `setup_mcp.py`, `verify_mcp.py`, `antigravity-setup.md`, and `setup-antigravity-mcp.sh`. (@bryanadenhq)
---
### Bug Fixes
- Fix credential modal eating errors and banner staying open
- Stop clearing `credentials_required` on modal close to prevent infinite loop
- Share event bus so tool events are visible to parent graph
- Use lazy %-formatting in subagent completion log to avoid f-string in logger
- Anchor worker monitoring to queen's session ID on cold-restore
- Update meta.json when loading workers
- Generate worker MCP file correctly
- Fix "Add tab" dropdown partially hidden when creating multiple agents
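
The lazy %-formatting fix listed above follows a general `logging` pattern, sketched here outside the framework. The logger name and the `CountingStr` helper are assumptions made for this example only; they are not taken from the Hive codebase.

```python
import logging

logger = logging.getLogger("subagent_demo")  # hypothetical logger name
logger.setLevel(logging.WARNING)  # DEBUG records are discarded


class CountingStr:
    """Counts how often it is converted to text, to show when formatting runs."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.calls = 0

    def __str__(self) -> str:
        self.calls += 1
        return self.text


result = CountingStr("done")

# Eager: the f-string is built before the logger checks the level.
logger.debug(f"subagent finished: {result}")
eager_calls = result.calls

# Lazy: the logger only formats the message if the record is actually emitted.
logger.debug("subagent finished: %s", result)
lazy_calls = result.calls - eager_calls

print(eager_calls, lazy_calls)
```

With DEBUG disabled, the f-string call still pays the formatting cost, while the %-style call skips it.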
---
### Community Contributors
- **Prasoon Mahawar** (@prasoonmhwr) -- Fix UI overflow on agent tab dropdown
- **Richard Tang** (@RichardTang-Aden) -- Worker MCP generation and meta.json fixes
---
### Upgrading
```bash
git pull origin main
uv sync
```
The Playwright dependency is no longer required for GCU browser operations. Chrome must be installed on the host system.
---
## v0.7.0
**Release Date:** March 5, 2026
**Tag:** v0.7.0
Session management refactor release.
---
## v0.5.1
**Release Date:** February 18, 2026
**Tag:** v0.5.1
## The Hive Gets a Brain
### The Hive Gets a Brain
v0.5.1 is our most ambitious release yet. Hive agents can now **build other agents** -- the new Hive Coder meta-agent writes, tests, and fixes agent packages from natural language. The runtime grows multi-graph support so one session can orchestrate multiple agents simultaneously. The TUI gets a complete overhaul with an in-app agent picker, live streaming, and seamless escalation to the Coder. And we're now provider-agnostic: Claude Code subscriptions, OpenAI-compatible endpoints, and any LiteLLM-supported model work out of the box.
---
## Highlights
### Highlights
### Hive Coder -- The Agent That Builds Agents
#### Hive Coder -- The Agent That Builds Agents
A native meta-agent that lives inside the framework at `core/framework/agents/hive_coder/`. Give it a natural-language specification and it produces a complete agent package -- goal definition, node prompts, edge routing, MCP tool wiring, tests, and all boilerplate files.
@@ -30,7 +162,7 @@ The Coder ships with:
- **Coder Tools MCP server** -- file I/O, fuzzy-match editing, git snapshots, and sandboxed shell execution (`tools/coder_tools_server.py`)
- **Test generation** -- structural tests for forever-alive agents that don't hang on `runner.run()`
### Multi-Graph Agent Runtime
#### Multi-Graph Agent Runtime
`AgentRuntime` now supports loading, managing, and switching between multiple agent graphs within a single session. Six new lifecycle tools give agents (and the TUI) full control:
@@ -44,7 +176,7 @@ await runtime.add_graph("exports/deep_research_agent")
The Hive Coder uses multi-graph internally -- when you escalate from a worker agent, the Coder loads as a separate graph while the worker stays alive in the background.
### TUI Revamp
#### TUI Revamp
The Terminal UI gets a ground-up rebuild with five major additions:
@@ -54,7 +186,7 @@ The Terminal UI gets a ground-up rebuild with five major additions:
- **PDF attachments** -- `/attach` and `/detach` commands with native OS file dialog (macOS, Linux, Windows)
- **Multi-graph commands** -- `/graphs`, `/graph <id>`, `/load <path>`, `/unload <id>` for managing agent graphs in-session
### Provider-Agnostic LLM Support
#### Provider-Agnostic LLM Support
Hive is no longer Anthropic-only. v0.5.1 adds first-class support for:
@@ -66,9 +198,9 @@ The quickstart script auto-detects Claude Code subscriptions and ZAI Code instal
---
## What's New
### What's New
### Architecture & Runtime
#### Architecture & Runtime
- **Hive Coder meta-agent** -- Natural-language agent builder with reference docs, guardian watchdog, and `hive code` CLI command. (@TimothyZhang7)
- **Multi-graph agent sessions** -- `add_graph`/`remove_graph` on AgentRuntime with 6 lifecycle tools (`load_agent`, `unload_agent`, `start_agent`, `restart_agent`, `list_agents`, `get_user_presence`). (@TimothyZhang7)
@@ -79,7 +211,7 @@ The quickstart script auto-detects Claude Code subscriptions and ZAI Code instal
- **Pre-start confirmation prompt** -- Interactive prompt before agent execution allowing credential updates or abort. (@RichardTang-Aden)
- **Event bus multi-graph support** -- `graph_id` on events, `filter_graph` on subscriptions, `ESCALATION_REQUESTED` event type, `exclude_own_graph` filter. (@TimothyZhang7)
### TUI Improvements
#### TUI Improvements
- **In-app agent picker** (Ctrl+A) -- Tabbed modal for browsing agents with metadata badges (nodes, tools, sessions, tags). (@TimothyZhang7)
- **Runtime-optional TUI startup** -- Launches without a pre-loaded agent, shows agent picker on startup. (@TimothyZhang7)
@@ -89,7 +221,7 @@ The quickstart script auto-detects Claude Code subscriptions and ZAI Code instal
- **Multi-graph TUI commands** -- `/graphs`, `/graph <id>`, `/load <path>`, `/unload <id>`. (@TimothyZhang7)
- **Agent Guardian watchdog** -- Event-driven monitor that catches secondary agent failures and triggers automatic remediation, with `--no-guardian` CLI flag. (@TimothyZhang7)
### New Tool Integrations
#### New Tool Integrations
| Tool | Description | Contributor |
| ---------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------ |
@@ -99,7 +231,7 @@ The quickstart script auto-detects Claude Code subscriptions and ZAI Code instal
| **Google Docs** | Document creation, reading, and editing with OAuth credential support | @haliaeetusvocifer |
| **Gmail enhancements** | Expanded mail operations for inbox management | @bryanadenhq |
### Infrastructure
#### Infrastructure
- **Default node type → `event_loop`** -- `NodeSpec.node_type` defaults to `"event_loop"` instead of `"llm_tool_use"`. (@TimothyZhang7)
- **Default `max_node_visits` → 0 (unlimited)** -- Nodes default to unlimited visits, reducing friction for feedback loops and forever-alive agents. (@TimothyZhang7)
@@ -112,7 +244,7 @@ The quickstart script auto-detects Claude Code subscriptions and ZAI Code instal
---
## Bug Fixes
### Bug Fixes
- Flush WIP accumulator outputs on cancel/failure so edge conditions see correct values on resume
- Stall detection state preserved across resume (no more resets on checkpoint restore)
@@ -125,13 +257,13 @@ The quickstart script auto-detects Claude Code subscriptions and ZAI Code instal
- Fix email agent version conflicts (@RichardTang-Aden)
- Fix coder tool timeouts (120s for tests, 300s cap for commands)
## Documentation
### Documentation
- Clarify installation and prevent root pip install misuse (@paarths-collab)
---
## Agent Updates
### Agent Updates
- **Email Inbox Management** -- Consolidate `gmail_inbox_guardian` and `inbox_management` into a single unified agent with updated prompts and config. (@RichardTang-Aden, @bryanadenhq)
- **Job Hunter** -- Updated node prompts, config, and agent metadata; added PDF resume selection. (@bryanadenhq)
@@ -141,7 +273,7 @@ The quickstart script auto-detects Claude Code subscriptions and ZAI Code instal
---
## Breaking Changes
### Breaking Changes
- **Deprecated node types raise `RuntimeError`** -- `llm_tool_use`, `llm_generate`, `function`, `router`, `human_input` now fail instead of warning. Migrate to `event_loop`.
- **`NodeSpec.node_type` defaults to `"event_loop"`** (was `"llm_tool_use"`)
@@ -150,7 +282,7 @@ The quickstart script auto-detects Claude Code subscriptions and ZAI Code instal
---
## Community Contributors
### Community Contributors
A huge thank you to everyone who contributed to this release:
@@ -165,14 +297,14 @@ A huge thank you to everyone who contributed to this release:
---
## Upgrading
### Upgrading
```bash
git pull origin main
uv sync
```
### Migration Guide
#### Migration Guide
If your agents use deprecated node types, update them:
@@ -196,12 +328,3 @@ hive code
# Or from TUI -- press Ctrl+E to escalate
hive tui
```
---
## What's Next
- **Agent-to-agent communication** -- one agent's output triggers another agent's entry point
- **Cost visibility** -- detailed runtime log of LLM costs per node and per session
- **Persistent webhook subscriptions** -- survive agent restarts without re-registering
- **Remote agent deployment** -- run agents as long-lived services with HTTP APIs
+32 -13
View File
@@ -1,27 +1,46 @@
.PHONY: lint format check test install-hooks help frontend-install frontend-dev frontend-build
.PHONY: lint format check test test-tools test-live test-all install-hooks help frontend-install frontend-dev frontend-build
# ── Ensure uv is findable in Git Bash on Windows ──────────────────────────────
# uv installs to ~/.local/bin on Windows/Linux/macOS. Git Bash may not include
# this in PATH by default, so we prepend it here.
export PATH := $(HOME)/.local/bin:$(PATH)
# ── Targets ───────────────────────────────────────────────────────────────────
help: ## Show this help
@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | \
awk 'BEGIN {FS = ":.*?## "}; {printf " \033[36m%-15s\033[0m %s\n", $$1, $$2}'
lint: ## Run ruff linter and formatter (with auto-fix)
cd core && ruff check --fix .
cd tools && ruff check --fix .
cd core && ruff format .
cd tools && ruff format .
cd core && uv run ruff check --fix .
cd tools && uv run ruff check --fix .
cd core && uv run ruff format .
cd tools && uv run ruff format .
format: ## Run ruff formatter
cd core && ruff format .
cd tools && ruff format .
cd core && uv run ruff format .
cd tools && uv run ruff format .
check: ## Run all checks without modifying files (CI-safe)
cd core && ruff check .
cd tools && ruff check .
cd core && ruff format --check .
cd tools && ruff format --check .
cd core && uv run ruff check .
cd tools && uv run ruff check .
cd core && uv run ruff format --check .
cd tools && uv run ruff format --check .
test: ## Run all tests
test: ## Run all tests (core + tools, excludes live)
cd core && uv run python -m pytest tests/ -v
cd tools && uv run python -m pytest -v
test-tools: ## Run tool tests only (mocked, no credentials needed)
cd tools && uv run python -m pytest -v
test-live: ## Run live integration tests (requires real API credentials)
cd tools && uv run python -m pytest -m live -s -o "addopts=" --log-cli-level=INFO
test-all: ## Run everything including live tests
cd core && uv run python -m pytest tests/ -v
cd tools && uv run python -m pytest -v
cd tools && uv run python -m pytest -m live -s -o "addopts=" --log-cli-level=INFO
install-hooks: ## Install pre-commit hooks
uv pip install pre-commit
@@ -34,4 +53,4 @@ frontend-dev: ## Start frontend dev server
cd core/frontend && npm run dev
frontend-build: ## Build frontend for production
cd core/frontend && npm run build
cd core/frontend && npm run build
+22 -16
View File
@@ -27,7 +27,7 @@
<img src="https://img.shields.io/badge/Multi--Agent-Systems-blue?style=flat-square" alt="Multi-Agent" />
<img src="https://img.shields.io/badge/Headless-Development-purple?style=flat-square" alt="Headless" />
<img src="https://img.shields.io/badge/Human--in--the--Loop-orange?style=flat-square" alt="HITL" />
<img src="https://img.shields.io/badge/Production--Ready-red?style=flat-square" alt="Production" />
<img src="https://img.shields.io/badge/Browser-Use-red?style=flat-square" alt="Browser Use" />
</p>
<p align="center">
<img src="https://img.shields.io/badge/OpenAI-supported-412991?style=flat-square&logo=openai" alt="OpenAI" />
@@ -37,7 +37,7 @@
## Overview
Build autonomous, reliable, self-improving AI agents without hardcoding workflows. Define your goal through conversation with hive coding agent(queen), and the framework generates a node graph with dynamically created connection code. When things break, the framework captures failure data, evolves the agent through the coding agent, and redeploys. Built-in human-in-the-loop nodes, credential management, and real-time monitoring give you control without sacrificing adaptability.
Generate a swarm of worker agents with a coding agent (queen) that controls them. Define your goal through conversation with the Hive queen, and the framework generates a node graph with dynamically created connection code. When things break, the framework captures failure data, evolves the agent through the coding agent, and redeploys. Built-in human-in-the-loop nodes, browser use, credential management, and real-time monitoring give you control without sacrificing adaptability.
Visit [adenhq.com](https://adenhq.com) for complete documentation, examples, and guides.
@@ -45,7 +45,7 @@ Visit [adenhq.com](https://adenhq.com) for complete documentation, examples, and
## Who Is Hive For?
Hive is designed for developers and teams who want to build **production-grade AI agents** without manually wiring complex workflows.
Hive is designed for developers and teams who want to build many **autonomous AI agents** fast without manually wiring complex workflows.
Hive is a good fit if you:
@@ -82,8 +82,9 @@ Use Hive when you need:
- Python 3.11+ for agent development
- An LLM provider that powers the agents
- **ripgrep (optional, recommended on Windows):** The `search_files` tool uses ripgrep for faster file search. If not installed, a Python fallback is used. On Windows: `winget install BurntSushi.ripgrep` or `scoop install ripgrep`
> **Note for Windows Users:** It is strongly recommended to use **WSL (Windows Subsystem for Linux)** or **Git Bash** to run this framework. Some core automation scripts may not execute correctly in standard Command Prompt or PowerShell.
> **Windows Users:** Native Windows is supported via `quickstart.ps1` and `hive.ps1`. Run these in PowerShell 5.1+. WSL is also an option but not required.
### Installation
@@ -110,37 +111,36 @@ This sets up:
- **LLM provider** - Interactive default model configuration
- All required Python dependencies with `uv`
- At last, it will initiate the open hive interface in your browser
- Finally, it will open the Hive interface in your browser
<img width="2500" height="1214" alt="home-screen" src="https://github.com/user-attachments/assets/134d897f-5e75-4874-b00b-e0505f6b45c4" />
> **Tip:** To reopen the dashboard later, run `hive open` from the project directory.
### Build Your First Agent
Type the agent you want to build in the home input box
Type the agent you want to build in the home input box. The queen is going to ask you questions and work out a solution with you.
<img width="2500" height="1214" alt="Image" src="https://github.com/user-attachments/assets/1ce19141-a78b-46f5-8d64-dbf987e048f4" />
### Use Template Agents
Click "Try a sample agent" and check the templates. You can run a templates directly or choose to build your version on top of the existing template.
Click "Try a sample agent" and check the templates. You can run a template directly or choose to build your version on top of the existing template.
### Run Agents
Now you can run an agent by selectiing the agent (either an existing agent or example agent). You can click the Run button on the top left, or talk to the queen agent and it can run the agent for you.
Now you can run an agent by selecting the agent (either an existing agent or example agent). You can click the Run button on the top left, or talk to the queen agent and it can run the agent for you.
<img width="2500" height="1214" alt="Image" src="https://github.com/user-attachments/assets/71c38206-2ad5-49aa-bde8-6698d0bc55f5" />
<img width="2549" height="1174" alt="Screenshot 2026-03-12 at 9 27 36PM" src="https://github.com/user-attachments/assets/7c7d30fa-9ceb-4c23-95af-b1caa405547d" />
## Features
- **Browser-Use** - Control the browser on your computer to achieve hard tasks
- **Parallel Execution** - Execute the generated graph in parallel. This way you can have multiple agent compelteing the jobs for you
- **Parallel Execution** - Execute the generated graph in parallel. This way you can have multiple agents completing the jobs for you
- **[Goal-Driven Generation](docs/key_concepts/goals_outcome.md)** - Define objectives in natural language; the coding agent generates the agent graph and connection code to achieve them
- **[Adaptiveness](docs/key_concepts/evolution.md)** - Framework captures failures, calibrates according to the objectives, and evolves the agent graph
- **[Dynamic Node Connections](docs/key_concepts/graph.md)** - No predefined edges; connection code is generated by any capable LLM based on your goals
- **SDK-Wrapped Nodes** - Every node gets shared memory, local RLM memory, monitoring, tools, and LLM access out of the box
- **[Human-in-the-Loop](docs/key_concepts/graph.md#human-in-the-loop)** - Intervention nodes that pause execution for human input with configurable timeouts and escalation
- **Real-time Observability** - WebSocket streaming for live monitoring of agent execution, decisions, and node-to-node communication
- **Production-Ready** - Self-hostable, built for scale and reliability
## Integration
@@ -389,10 +389,6 @@ Hive generates your entire agent system from natural language goals using a codi
Yes, Hive is fully open-source under the Apache License 2.0. We actively encourage community contributions and collaboration.
**Q: Can Hive handle complex, production-scale use cases?**
Yes. Hive is explicitly designed for production environments with features like automatic failure recovery, real-time observability, cost controls, and horizontal scaling support. The framework handles both simple automations and complex multi-agent workflows.
**Q: Does Hive support human-in-the-loop workflows?**
Yes, Hive fully supports [human-in-the-loop](docs/key_concepts/graph.md#human-in-the-loop) workflows through intervention nodes that pause execution for human input. These include configurable timeouts and escalation policies, allowing seamless collaboration between human experts and AI agents.
@@ -417,6 +413,16 @@ Visit [docs.adenhq.com](https://docs.adenhq.com/) for complete guides, API refer
Contributions are welcome! Fork the repository, create your feature branch, implement your changes, and submit a pull request. See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
## Star History
<a href="https://star-history.com/#aden-hive/hive&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=aden-hive/hive&type=Date&theme=dark" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=aden-hive/hive&type=Date" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=aden-hive/hive&type=Date" />
</picture>
</a>
---
<p align="center">
+2 -2
View File
@@ -39,8 +39,8 @@ We consider security research conducted in accordance with this policy to be:
## Security Best Practices for Users
1. **Keep Updated**: Always run the latest version
2. **Secure Configuration**: Review `config.yaml` settings, especially in production
3. **Environment Variables**: Never commit `.env` files or `config.yaml` with secrets
2. **Secure Configuration**: Review your `~/.hive/configuration.json`, `.mcp.json`, and environment variable settings, especially in production
3. **Environment Variables**: Never commit `.env` files or any configuration files that contain secrets
4. **Network Security**: Use HTTPS in production, configure firewalls appropriately
5. **Database Security**: Use strong passwords, limit network access
+31
View File
@@ -0,0 +1,31 @@
perf: reduce subprocess spawning in quickstart scripts (#4427)
## Problem
Windows process creation (CreateProcess) is 10-100x slower than Linux fork/exec.
The quickstart scripts were spawning 4+ separate `uv run python -c "import X"`
processes to verify imports, adding ~600ms overhead on Windows.
## Solution
Consolidated all import checks into a single batch script that checks multiple
modules in one subprocess call, reducing spawn overhead by ~75%.
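
A minimal sketch of the batching idea, assuming nothing about the real `scripts/check_requirements.py` beyond what is described above. The function name `check_imports_in_subprocess` and the JSON output format are hypothetical. It spawns one child interpreter for the whole module list instead of one per module.

```python
import json
import subprocess
import sys

# Code run inside the single child interpreter; module names arrive via argv.
_CHILD_CODE = (
    "import importlib, json, sys\n"
    "out = {}\n"
    "for name in sys.argv[1:]:\n"
    "    try:\n"
    "        importlib.import_module(name)\n"
    "        out[name] = None\n"
    "    except Exception as exc:\n"
    "        out[name] = type(exc).__name__\n"
    "print(json.dumps(out))\n"
)


def check_imports_in_subprocess(modules: list[str]) -> dict:
    """Check every module in ONE subprocess; map each name to None or an error type."""
    proc = subprocess.run(
        [sys.executable, "-c", _CHILD_CODE, *modules],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)


report = check_imports_in_subprocess(["json", "no_such_module_xyz"])
for module, error in report.items():
    print(module, "ok" if error is None else f"failed ({error})")
```

The process-creation cost is paid once regardless of how many modules are checked, which is where the savings on Windows would come from.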
## Changes
- **New**: `scripts/check_requirements.py` - Batched import checker
- **New**: `scripts/test_check_requirements.py` - Test suite
- **New**: `scripts/benchmark_quickstart.ps1` - Performance benchmark tool
- **Modified**: `quickstart.ps1` - Updated import verification (2 sections)
- **Modified**: `quickstart.sh` - Updated import verification
## Performance Impact
**Benchmark results on Windows:**
- Before: ~19.8 seconds for import checks
- After: ~4.9 seconds for import checks
- **Improvement: 14.9 seconds saved (75.2% faster)**
## Testing
- ✅ All functional tests pass (`scripts/test_check_requirements.py`)
- ✅ Quickstart scripts work correctly on Windows
- ✅ Error handling verified (invalid imports reported correctly)
- ✅ Performance benchmark confirms 75%+ improvement
Fixes #4427
+27
View File
@@ -0,0 +1,27 @@
# Identity mapping: GitHub username -> Discord ID
#
# This file links GitHub accounts to Discord accounts for the
# Integration Bounty Program. When a bounty PR is merged, the
# GitHub Action uses this file to ping the contributor on Discord.
#
# HOW TO ADD YOURSELF:
# Open a "Link Discord Account" issue:
# https://github.com/aden-hive/hive/issues/new?template=link-discord.yml
# A GitHub Action will automatically add your entry here.
#
# To find your Discord ID:
# 1. Open Discord Settings > Advanced > Enable Developer Mode
# 2. Right-click your name > Copy User ID
#
# Format:
# - github: your-github-username
# discord: "your-discord-id" # quotes required (it's a number)
# name: Your Display Name # optional
contributors:
# - github: example-user
# discord: "123456789012345678"
# name: Example User
- github: TimothyZhang7
discord: "408460790061072384"
name: Timothy@Aden
-1
View File
@@ -1,5 +1,4 @@
exports/
docs/
.agent-builder-sessions/
.pytest_cache/
**/__pycache__/
-5
View File
@@ -1,10 +1,5 @@
{
"mcpServers": {
"agent-builder": {
"command": "python",
"args": ["-m", "framework.mcp.agent_builder_server"],
"cwd": "core"
},
"tools": {
"command": "python",
"args": ["-m", "aden_tools.mcp_server", "--stdio"],
+10 -11
View File
@@ -1,17 +1,16 @@
# MCP Server Guide - Agent Builder
# MCP Server Guide - Agent Building Tools
This guide covers the MCP (Model Context Protocol) server for building goal-driven agents.
> **Note:** The standalone `agent-builder` MCP server (`framework.mcp.agent_builder_server`) has been replaced. Agent building is now done via the `coder-tools` server's `initialize_and_build_agent` tool, with underlying logic in `tools/coder_tools_server.py`.
This guide covers the MCP tools available for building goal-driven agents.
## Setup
### Quick Setup
```bash
# Using the setup script (recommended)
python setup_mcp.py
# Or using bash
./setup_mcp.sh
# Run the quickstart script (recommended)
./quickstart.sh
```
### Manual Configuration
@@ -21,10 +20,10 @@ Add to your MCP client configuration (e.g., Claude Desktop):
```json
{
"mcpServers": {
"agent-builder": {
"command": "python",
"args": ["-m", "framework.mcp.agent_builder_server"],
"cwd": "/path/to/goal-agent"
"coder-tools": {
"command": "uv",
"args": ["run", "coder_tools_server.py", "--stdio"],
"cwd": "/path/to/hive/tools"
}
}
}
+4 -59
View File
@@ -17,66 +17,11 @@ Framework provides a runtime framework that captures **decisions**, not just act
uv pip install -e .
```
## MCP Server Setup
## Agent Building
The framework includes an MCP (Model Context Protocol) server for building agents. To set up the MCP server:
Agent scaffolding is handled by the `coder-tools` MCP server, which provides the `initialize_and_build_agent` tool and related utilities. The package generation logic lives directly in `tools/coder_tools_server.py`.
### Automated Setup
**Using bash (Linux/macOS):**
```bash
./setup_mcp.sh
```
**Using Python (cross-platform):**
```bash
python setup_mcp.py
```
The setup script will:
1. Install the framework package
2. Install MCP dependencies (mcp, fastmcp)
3. Create/verify `.mcp.json` configuration
4. Test the MCP server module
### Manual Setup
If you prefer manual setup:
```bash
# Install framework
uv pip install -e .
# Install MCP dependencies
uv pip install mcp fastmcp
# Test the server
uv run python -m framework.mcp.agent_builder_server
```
### Using with MCP Clients
To use the agent builder with Claude Desktop or other MCP clients, add this to your MCP client configuration:
```json
{
"mcpServers": {
"agent-builder": {
"command": "python",
"args": ["-m", "framework.mcp.agent_builder_server"],
"cwd": "/path/to/hive/core"
}
}
}
```
The MCP server provides tools for:
- Creating agent building sessions
- Defining goals with success criteria
- Adding nodes (event_loop only)
- Connecting nodes with edges
- Validating and exporting agent graphs
- Testing nodes and full agent graphs
See the [Getting Started Guide](../docs/getting-started.md) for building agents.
## Quick Start
@@ -145,7 +90,7 @@ uv run python -m framework test-debug <agent_path> <test_name>
uv run python -m framework test-list <agent_path>
```
For detailed testing workflows, see the [hive-test skill](../.claude/skills/hive-test/SKILL.md).
For detailed testing workflows, see [developer-guide.md](../docs/developer-guide.md).
### Analyzing Agent Behavior with Builder
-740
View File
@@ -1,740 +0,0 @@
#!/usr/bin/env python3
"""
EventLoopNode WebSocket Demo
Real LLM, real FileConversationStore, real EventBus.
Streams EventLoopNode execution to a browser via WebSocket.
Usage:
cd /home/timothy/oss/hive/core
python demos/event_loop_wss_demo.py
Then open http://localhost:8765 in your browser.
"""
import asyncio
import json
import logging
import sys
import tempfile
from http import HTTPStatus
from pathlib import Path
import httpx
import websockets
from bs4 import BeautifulSoup
from websockets.http11 import Request, Response
# Add core, tools, and hive root to path
_CORE_DIR = Path(__file__).resolve().parent.parent
_HIVE_DIR = _CORE_DIR.parent
sys.path.insert(0, str(_CORE_DIR)) # framework.*
sys.path.insert(0, str(_HIVE_DIR / "tools" / "src")) # aden_tools.*
sys.path.insert(0, str(_HIVE_DIR)) # core.framework.* (for aden_tools imports)
import os # noqa: E402
from aden_tools.credentials import CREDENTIAL_SPECS, CredentialStoreAdapter # noqa: E402
from core.framework.credentials import CredentialStore # noqa: E402
from framework.credentials.storage import ( # noqa: E402
CompositeStorage,
EncryptedFileStorage,
EnvVarStorage,
)
from framework.graph.event_loop_node import EventLoopNode, LoopConfig # noqa: E402
from framework.graph.node import NodeContext, NodeSpec, SharedMemory # noqa: E402
from framework.llm.litellm import LiteLLMProvider # noqa: E402
from framework.llm.provider import Tool # noqa: E402
from framework.runner.tool_registry import ToolRegistry # noqa: E402
from framework.runtime.core import Runtime # noqa: E402
from framework.runtime.event_bus import EventBus, EventType # noqa: E402
from framework.storage.conversation_store import FileConversationStore # noqa: E402
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(message)s")
logger = logging.getLogger("demo")
# -------------------------------------------------------------------------
# Persistent state (shared across WebSocket connections)
# -------------------------------------------------------------------------
STORE_DIR = Path(tempfile.mkdtemp(prefix="hive_demo_"))
STORE = FileConversationStore(STORE_DIR / "conversation")
RUNTIME = Runtime(STORE_DIR / "runtime")
LLM = LiteLLMProvider(model="claude-sonnet-4-5-20250929")
# -------------------------------------------------------------------------
# Tool Registry — real tools via ToolRegistry (same pattern as GraphExecutor)
# -------------------------------------------------------------------------
TOOL_REGISTRY = ToolRegistry()
# Credential store: Aden sync (OAuth2 tokens) + encrypted files + env var fallback
_env_mapping = {name: spec.env_var for name, spec in CREDENTIAL_SPECS.items()}
_local_storage = CompositeStorage(
primary=EncryptedFileStorage(),
fallbacks=[EnvVarStorage(env_mapping=_env_mapping)],
)
if os.environ.get("ADEN_API_KEY"):
try:
from framework.credentials.aden import ( # noqa: E402
AdenCachedStorage,
AdenClientConfig,
AdenCredentialClient,
AdenSyncProvider,
)
_client = AdenCredentialClient(AdenClientConfig(base_url="https://api.adenhq.com"))
_provider = AdenSyncProvider(client=_client)
_storage = AdenCachedStorage(
local_storage=_local_storage,
aden_provider=_provider,
)
_cred_store = CredentialStore(storage=_storage, providers=[_provider], auto_refresh=True)
_synced = _provider.sync_all(_cred_store)
logger.info("Synced %d credentials from Aden", _synced)
except Exception as e:
logger.warning("Aden sync unavailable: %s", e)
_cred_store = CredentialStore(storage=_local_storage)
else:
logger.info("ADEN_API_KEY not set, using local credential storage")
_cred_store = CredentialStore(storage=_local_storage)
CREDENTIALS = CredentialStoreAdapter(_cred_store)
# Debug: log which credentials resolved
for _name in ["brave_search", "hubspot", "anthropic"]:
_val = CREDENTIALS.get(_name)
if _val:
logger.debug("credential %s: OK (len=%d)", _name, len(_val))
else:
logger.debug("credential %s: not found", _name)
# --- web_search (Brave Search API) ---
TOOL_REGISTRY.register(
name="web_search",
tool=Tool(
name="web_search",
description=(
"Search the web for current information. "
"Returns titles, URLs, and snippets from search results."
),
parameters={
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query (1-500 characters)",
},
"num_results": {
"type": "integer",
"description": "Number of results to return (1-20, default 10)",
},
},
"required": ["query"],
},
),
executor=lambda inputs: _exec_web_search(inputs),
)
def _exec_web_search(inputs: dict) -> dict:
    """Query the Brave Search API and return titles, URLs, and snippets."""
    api_key = CREDENTIALS.get("brave_search")
    if not api_key:
        return {"error": "brave_search credential not configured"}
    query = inputs.get("query", "")
    # Clamp to the documented 1-20 range (the original only enforced the upper bound).
    num_results = max(1, min(inputs.get("num_results", 10), 20))
    try:
        resp = httpx.get(
            "https://api.search.brave.com/res/v1/web/search",
            params={"q": query, "count": num_results},
            headers={"X-Subscription-Token": api_key, "Accept": "application/json"},
            timeout=30.0,
        )
    except httpx.TimeoutException:
        return {"error": "Request timed out"}
    except httpx.HTTPError as e:
        # Report transport failures as a tool error, like the other executors do.
        return {"error": f"Search failed: {e}"}
    if resp.status_code != 200:
        return {"error": f"Brave API HTTP {resp.status_code}"}
    data = resp.json()
    results = [
        {
            "title": item.get("title", ""),
            "url": item.get("url", ""),
            "snippet": item.get("description", ""),
        }
        for item in data.get("web", {}).get("results", [])[:num_results]
    ]
    return {"query": query, "results": results, "total": len(results)}
# --- web_scrape (httpx + BeautifulSoup, no playwright for sync compat) ---
TOOL_REGISTRY.register(
name="web_scrape",
tool=Tool(
name="web_scrape",
description=(
"Scrape and extract text content from a webpage URL. "
"Returns the page title and main text content."
),
parameters={
"type": "object",
"properties": {
"url": {
"type": "string",
"description": "URL of the webpage to scrape",
},
"max_length": {
"type": "integer",
"description": "Maximum text length (default 50000)",
},
},
"required": ["url"],
},
),
executor=lambda inputs: _exec_web_scrape(inputs),
)
_SCRAPE_HEADERS = {
"User-Agent": (
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
"AppleWebKit/537.36 (KHTML, like Gecko) "
"Chrome/131.0.0.0 Safari/537.36"
),
"Accept": "text/html,application/xhtml+xml",
}
def _exec_web_scrape(inputs: dict) -> dict:
url = inputs.get("url", "")
max_length = max(1000, min(inputs.get("max_length", 50000), 500000))
if not url.startswith(("http://", "https://")):
url = "https://" + url
try:
resp = httpx.get(url, timeout=30.0, follow_redirects=True, headers=_SCRAPE_HEADERS)
if resp.status_code != 200:
return {"error": f"HTTP {resp.status_code}"}
soup = BeautifulSoup(resp.text, "html.parser")
for tag in soup(["script", "style", "nav", "footer", "header", "aside", "noscript"]):
tag.decompose()
title = soup.title.get_text(strip=True) if soup.title else ""
main = (
soup.find("article")
or soup.find("main")
or soup.find(attrs={"role": "main"})
or soup.find("body")
)
text = main.get_text(separator=" ", strip=True) if main else ""
text = " ".join(text.split())
if len(text) > max_length:
text = text[:max_length] + "..."
return {"url": url, "title": title, "content": text, "length": len(text)}
except httpx.TimeoutException:
return {"error": "Request timed out"}
except Exception as e:
return {"error": f"Scrape failed: {e}"}
# --- HubSpot CRM tools (optional, requires HUBSPOT_ACCESS_TOKEN) ---
_HUBSPOT_API = "https://api.hubapi.com"
def _hubspot_headers() -> dict | None:
    """Build HubSpot auth headers, or return None when no token is configured."""
    token = CREDENTIALS.get("hubspot")
    if not token:
        logger.debug("HubSpot token: not found")
        return None
    # Log only the length; never write any part of the token to the logs.
    logger.debug("HubSpot token: OK (len=%d)", len(token))
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
def _exec_hubspot_search(inputs: dict) -> dict:
headers = _hubspot_headers()
if not headers:
return {"error": "hubspot credential not configured"}
object_type = inputs.get("object_type", "contacts")
query = inputs.get("query", "")
limit = min(inputs.get("limit", 10), 100)
body: dict = {"limit": limit}
if query:
body["query"] = query
try:
resp = httpx.post(
f"{_HUBSPOT_API}/crm/v3/objects/{object_type}/search",
headers=headers,
json=body,
timeout=30.0,
)
if resp.status_code != 200:
return {"error": f"HubSpot API HTTP {resp.status_code}: {resp.text[:200]}"}
return resp.json()
except httpx.TimeoutException:
return {"error": "Request timed out"}
except Exception as e:
return {"error": f"HubSpot error: {e}"}
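# The executors above all report failures as {"error": ...} dicts rather than
# raising. A minimal sketch of how that convention could be factored into a
# decorator follows. The names errors_as_dict and echo_tool are hypothetical
# and are not part of the framework or of this demo.

```python
import functools
from typing import Callable

ToolFn = Callable[[dict], dict]


def errors_as_dict(label: str) -> Callable[[ToolFn], ToolFn]:
    """Hypothetical decorator: turn any exception into an {"error": ...} result."""

    def decorate(fn: ToolFn) -> ToolFn:
        @functools.wraps(fn)
        def wrapper(inputs: dict) -> dict:
            try:
                return fn(inputs)
            except Exception as exc:  # demo-level catch-all, as in the executors above
                return {"error": f"{label} error: {exc}"}

        return wrapper

    return decorate


@errors_as_dict("Echo")
def echo_tool(inputs: dict) -> dict:
    # Toy executor used only to exercise the decorator.
    return {"echo": inputs["text"]}
```

# With a wrapper like this, each executor body would only need the happy path;
# whether that is worth the indirection in a short demo is a judgment call.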
TOOL_REGISTRY.register(
name="hubspot_search",
tool=Tool(
name="hubspot_search",
description=(
"Search HubSpot CRM objects (contacts, companies, or deals). "
"Returns matching records with their properties."
),
parameters={
"type": "object",
"properties": {
"object_type": {
"type": "string",
"description": "CRM object type: 'contacts', 'companies', or 'deals'",
},
"query": {
"type": "string",
"description": "Search query (name, email, domain, etc.)",
},
"limit": {
"type": "integer",
"description": "Max results (1-100, default 10)",
},
},
"required": ["object_type"],
},
),
executor=lambda inputs: _exec_hubspot_search(inputs),
)
logger.info(
"ToolRegistry loaded: %s",
", ".join(TOOL_REGISTRY.get_registered_names()),
)
# -------------------------------------------------------------------------
# HTML page (embedded)
# -------------------------------------------------------------------------
HTML_PAGE = ( # noqa: E501
"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>EventLoopNode Live Demo</title>
<style>
* { box-sizing: border-box; margin: 0; padding: 0; }
body {
font-family: 'SF Mono', 'Fira Code', monospace;
background: #0d1117; color: #c9d1d9;
height: 100vh; display: flex; flex-direction: column;
}
header {
background: #161b22; padding: 12px 20px;
border-bottom: 1px solid #30363d;
display: flex; align-items: center; gap: 16px;
}
header h1 { font-size: 16px; color: #58a6ff; font-weight: 600; }
.status {
font-size: 12px; padding: 3px 10px; border-radius: 12px;
background: #21262d; color: #8b949e;
}
.status.running { background: #1a4b2e; color: #3fb950; }
.status.done { background: #1a3a5c; color: #58a6ff; }
.status.error { background: #4b1a1a; color: #f85149; }
.chat { flex: 1; overflow-y: auto; padding: 16px; }
.msg {
margin: 8px 0; padding: 10px 14px; border-radius: 8px;
line-height: 1.6; white-space: pre-wrap; word-wrap: break-word;
}
.msg.user { background: #1a3a5c; color: #58a6ff; }
.msg.assistant { background: #161b22; color: #c9d1d9; }
.msg.event {
background: transparent; color: #8b949e; font-size: 11px;
padding: 4px 14px; border-left: 3px solid #30363d;
}
.msg.event.loop { border-left-color: #58a6ff; }
.msg.event.tool { border-left-color: #d29922; }
.msg.event.stall { border-left-color: #f85149; }
.input-bar {
padding: 12px 16px; background: #161b22;
border-top: 1px solid #30363d; display: flex; gap: 8px;
}
.input-bar input {
flex: 1; background: #0d1117; border: 1px solid #30363d;
color: #c9d1d9; padding: 8px 12px; border-radius: 6px;
font-family: inherit; font-size: 14px; outline: none;
}
.input-bar input:focus { border-color: #58a6ff; }
.input-bar button {
background: #238636; color: #fff; border: none;
padding: 8px 20px; border-radius: 6px; cursor: pointer;
font-family: inherit; font-weight: 600;
}
.input-bar button:hover { background: #2ea043; }
.input-bar button:disabled {
background: #21262d; color: #484f58; cursor: not-allowed;
}
.input-bar button.clear { background: #da3633; }
.input-bar button.clear:hover { background: #f85149; }
</style>
</head>
<body>
<header>
<h1>EventLoopNode Live</h1>
<span id="status" class="status">Idle</span>
<span id="iter" class="status" style="display:none">Step 0</span>
</header>
<div id="chat" class="chat"></div>
<div class="input-bar">
<input id="input" type="text"
placeholder="Ask anything..." autofocus />
<button id="go" onclick="run()">Send</button>
<button class="clear"
onclick="clearConversation()">Clear</button>
</div>
<script>
let ws = null;
let currentAssistantEl = null;
let iterCount = 0;
const chat = document.getElementById('chat');
const status = document.getElementById('status');
const iterEl = document.getElementById('iter');
const goBtn = document.getElementById('go');
const inputEl = document.getElementById('input');
inputEl.addEventListener('keydown', e => {
if (e.key === 'Enter') run();
});
function setStatus(text, cls) {
status.textContent = text;
status.className = 'status ' + cls;
}
function addMsg(text, cls) {
const el = document.createElement('div');
el.className = 'msg ' + cls;
el.textContent = text;
chat.appendChild(el);
chat.scrollTop = chat.scrollHeight;
return el;
}
function connect() {
ws = new WebSocket('ws://' + location.host + '/ws');
ws.onopen = () => {
setStatus('Ready', 'done');
goBtn.disabled = false;
};
ws.onmessage = handleEvent;
ws.onerror = () => { setStatus('Error', 'error'); };
ws.onclose = () => {
setStatus('Reconnecting...', '');
goBtn.disabled = true;
setTimeout(connect, 2000);
};
}
function handleEvent(msg) {
const evt = JSON.parse(msg.data);
if (evt.type === 'llm_text_delta') {
if (currentAssistantEl) {
currentAssistantEl.textContent += evt.content;
chat.scrollTop = chat.scrollHeight;
}
}
else if (evt.type === 'ready') {
setStatus('Ready', 'done');
if (currentAssistantEl && !currentAssistantEl.textContent)
currentAssistantEl.remove();
goBtn.disabled = false;
}
else if (evt.type === 'node_loop_iteration') {
iterCount = evt.iteration || (iterCount + 1);
iterEl.textContent = 'Step ' + iterCount;
iterEl.style.display = '';
}
else if (evt.type === 'tool_call_started') {
var info = evt.tool_name + '('
+ JSON.stringify(evt.tool_input).slice(0, 120) + ')';
addMsg('TOOL ' + info, 'event tool');
}
else if (evt.type === 'tool_call_completed') {
var preview = (evt.result || '').slice(0, 200);
var cls = evt.is_error ? 'stall' : 'tool';
addMsg('RESULT ' + evt.tool_name + ': ' + preview,
'event ' + cls);
currentAssistantEl = addMsg('', 'assistant');
}
else if (evt.type === 'result') {
setStatus('Session ended', evt.success ? 'done' : 'error');
if (evt.error) addMsg('ERROR ' + evt.error, 'event stall');
if (currentAssistantEl && !currentAssistantEl.textContent)
currentAssistantEl.remove();
goBtn.disabled = false;
}
else if (evt.type === 'node_stalled') {
addMsg('STALLED ' + evt.reason, 'event stall');
}
else if (evt.type === 'cleared') {
chat.innerHTML = '';
iterCount = 0;
iterEl.textContent = 'Step 0';
iterEl.style.display = 'none';
setStatus('Ready', 'done');
goBtn.disabled = false;
}
}
function run() {
const text = inputEl.value.trim();
if (!text || !ws || ws.readyState !== 1) return;
addMsg(text, 'user');
currentAssistantEl = addMsg('', 'assistant');
inputEl.value = '';
setStatus('Running', 'running');
goBtn.disabled = true;
ws.send(JSON.stringify({ topic: text }));
}
function clearConversation() {
if (ws && ws.readyState === 1) {
ws.send(JSON.stringify({ command: 'clear' }));
}
}
connect();
</script>
</body>
</html>"""
)
# -------------------------------------------------------------------------
# WebSocket handler
# -------------------------------------------------------------------------
async def handle_ws(websocket):
"""Persistent WebSocket: long-lived EventLoopNode with client_facing blocking."""
global STORE
# -- Event forwarding (WebSocket ← EventBus) ----------------------------
bus = EventBus()
async def forward_event(event):
try:
payload = {"type": event.type.value, **event.data}
if event.node_id:
payload["node_id"] = event.node_id
await websocket.send(json.dumps(payload))
except Exception:
pass
bus.subscribe(
event_types=[
EventType.NODE_LOOP_STARTED,
EventType.NODE_LOOP_ITERATION,
EventType.NODE_LOOP_COMPLETED,
EventType.LLM_TEXT_DELTA,
EventType.TOOL_CALL_STARTED,
EventType.TOOL_CALL_COMPLETED,
EventType.NODE_STALLED,
],
handler=forward_event,
)
# -- Per-connection state -----------------------------------------------
node = None
loop_task = None
tools = list(TOOL_REGISTRY.get_tools().values())
tool_executor = TOOL_REGISTRY.get_executor()
node_spec = NodeSpec(
id="assistant",
name="Chat Assistant",
description="A conversational assistant that remembers context across messages",
node_type="event_loop",
client_facing=True,
system_prompt=(
"You are a helpful assistant with access to tools. "
"You can search the web, scrape webpages, and query HubSpot CRM. "
"Use tools when the user asks for current information or external data. "
"You have full conversation history, so you can reference previous messages."
),
)
# -- Ready callback: subscribe to CLIENT_INPUT_REQUESTED on the bus ---
async def on_input_requested(event):
try:
await websocket.send(json.dumps({"type": "ready"}))
except Exception:
pass
bus.subscribe(
event_types=[EventType.CLIENT_INPUT_REQUESTED],
handler=on_input_requested,
)
async def start_loop(first_message: str):
"""Create an EventLoopNode and run it as a background task."""
nonlocal node, loop_task
memory = SharedMemory()
ctx = NodeContext(
runtime=RUNTIME,
node_id="assistant",
node_spec=node_spec,
memory=memory,
input_data={},
llm=LLM,
available_tools=tools,
)
node = EventLoopNode(
event_bus=bus,
config=LoopConfig(max_iterations=10_000, max_history_tokens=32_000),
conversation_store=STORE,
tool_executor=tool_executor,
)
await node.inject_event(first_message)
async def _run():
try:
result = await node.execute(ctx)
try:
await websocket.send(
json.dumps(
{
"type": "result",
"success": result.success,
"output": result.output,
"error": result.error,
"tokens": result.tokens_used,
}
)
)
except Exception:
pass
logger.info(f"Loop ended: success={result.success}, tokens={result.tokens_used}")
except websockets.exceptions.ConnectionClosed:
logger.info("Loop stopped: WebSocket closed")
except Exception as e:
logger.exception("Loop error")
try:
await websocket.send(
json.dumps(
{
"type": "result",
"success": False,
"error": str(e),
"output": {},
}
)
)
except Exception:
pass
loop_task = asyncio.create_task(_run())
    async def stop_loop():
        """Signal the node and wait for the loop task to finish."""
        nonlocal node, loop_task
        if loop_task and not loop_task.done():
            if node:
                node.signal_shutdown()
            try:
                await asyncio.wait_for(loop_task, timeout=5.0)
            except (asyncio.TimeoutError, asyncio.CancelledError):
                # asyncio.TimeoutError is an alias of the builtin TimeoutError
                # only from Python 3.11 on, so name it explicitly.
                loop_task.cancel()
        node = None
        loop_task = None
# -- Message loop (runs for the lifetime of this WebSocket) -------------
try:
async for raw in websocket:
try:
msg = json.loads(raw)
except Exception:
continue
# Clear command
if msg.get("command") == "clear":
import shutil
await stop_loop()
await STORE.close()
conv_dir = STORE_DIR / "conversation"
if conv_dir.exists():
shutil.rmtree(conv_dir)
STORE = FileConversationStore(conv_dir)
await websocket.send(json.dumps({"type": "cleared"}))
logger.info("Conversation cleared")
continue
topic = msg.get("topic", "")
if not topic:
continue
if node is None:
# First message — spin up the loop
logger.info(f"Starting persistent loop: {topic}")
await start_loop(topic)
else:
# Subsequent message — inject into the running loop
logger.info(f"Injecting message: {topic}")
await node.inject_event(topic)
except websockets.exceptions.ConnectionClosed:
pass
finally:
await stop_loop()
logger.info("WebSocket closed, loop stopped")
# -------------------------------------------------------------------------
# HTTP handler for serving the HTML page
# -------------------------------------------------------------------------
async def process_request(connection, request: Request):
"""Serve HTML on GET /, upgrade to WebSocket on /ws."""
if request.path == "/ws":
return None # let websockets handle the upgrade
# Serve the HTML page for any other path
return Response(
HTTPStatus.OK,
"OK",
websockets.Headers({"Content-Type": "text/html; charset=utf-8"}),
HTML_PAGE.encode(),
)
# -------------------------------------------------------------------------
# Main
# -------------------------------------------------------------------------
async def main():
port = 8765
async with websockets.serve(
handle_ws,
"0.0.0.0",
port,
process_request=process_request,
):
logger.info(f"Demo running at http://localhost:{port}")
logger.info("Open in your browser and enter a topic to research.")
await asyncio.Future() # run forever
if __name__ == "__main__":
asyncio.run(main())
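# The browser page and handle_ws exchange small JSON messages: the client sends
# {"topic": ...} or {"command": "clear"}, and the server sends events keyed by
# "type". A minimal sketch of that framing follows, assuming only the fields
# visible in the demo above. The helper names (topic_message, clear_message,
# describe_event) are hypothetical, and the log-line formatting is illustrative.

```python
import json


def topic_message(text: str) -> str:
    """Frame a user message the way the page's run() does."""
    return json.dumps({"topic": text})


def clear_message() -> str:
    """Frame the clear command the way clearConversation() does."""
    return json.dumps({"command": "clear"})


def describe_event(raw: str) -> str:
    """Render one server event as a single log line."""
    evt = json.loads(raw)
    kind = evt.get("type", "")
    if kind == "llm_text_delta":
        return evt.get("content", "")
    if kind == "tool_call_started":
        return f"TOOL {evt.get('tool_name')}({json.dumps(evt.get('tool_input'))[:120]})"
    if kind == "tool_call_completed":
        return f"RESULT {evt.get('tool_name')}: {str(evt.get('result') or '')[:200]}"
    if kind == "result":
        return "ERROR " + evt["error"] if evt.get("error") else "Session ended"
    # Any other event type is shown by name only.
    return kind.upper()
```

# A non-browser client could send topic_message(...) over the /ws socket and
# pass each received frame through describe_event to get a plain-text trace.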
@@ -1,930 +0,0 @@
#!/usr/bin/env python3
"""
Two-Node ContextHandoff Demo
Demonstrates ContextHandoff between two EventLoopNode instances:
Node A (Researcher) -> ContextHandoff -> Node B (Analyst)
Real LLM, real FileConversationStore, real EventBus.
Streams both nodes to a browser via WebSocket.
Usage:
cd /home/timothy/oss/hive/core
python demos/handoff_demo.py
Then open http://localhost:8766 in your browser.
"""
import asyncio
import json
import logging
import sys
import tempfile
from http import HTTPStatus
from pathlib import Path
import httpx
import websockets
from bs4 import BeautifulSoup
from websockets.http11 import Request, Response
# Add core, tools, and hive root to path
_CORE_DIR = Path(__file__).resolve().parent.parent
_HIVE_DIR = _CORE_DIR.parent
sys.path.insert(0, str(_CORE_DIR)) # framework.*
sys.path.insert(0, str(_HIVE_DIR / "tools" / "src")) # aden_tools.*
sys.path.insert(0, str(_HIVE_DIR)) # core.framework.* (for aden_tools imports)
from aden_tools.credentials import CREDENTIAL_SPECS, CredentialStoreAdapter # noqa: E402
from core.framework.credentials import CredentialStore # noqa: E402
from framework.credentials.storage import ( # noqa: E402
CompositeStorage,
EncryptedFileStorage,
EnvVarStorage,
)
from framework.graph.context_handoff import ContextHandoff # noqa: E402
from framework.graph.conversation import NodeConversation # noqa: E402
from framework.graph.event_loop_node import EventLoopNode, LoopConfig # noqa: E402
from framework.graph.node import NodeContext, NodeSpec, SharedMemory # noqa: E402
from framework.llm.litellm import LiteLLMProvider # noqa: E402
from framework.llm.provider import Tool # noqa: E402
from framework.runner.tool_registry import ToolRegistry # noqa: E402
from framework.runtime.core import Runtime # noqa: E402
from framework.runtime.event_bus import EventBus, EventType # noqa: E402
from framework.storage.conversation_store import FileConversationStore # noqa: E402
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(name)s %(message)s")
logger = logging.getLogger("handoff_demo")
# -------------------------------------------------------------------------
# Persistent state
# -------------------------------------------------------------------------
STORE_DIR = Path(tempfile.mkdtemp(prefix="hive_handoff_"))
RUNTIME = Runtime(STORE_DIR / "runtime")
LLM = LiteLLMProvider(model="claude-sonnet-4-5-20250929")
# -------------------------------------------------------------------------
# Credentials
# -------------------------------------------------------------------------
# Composite credential store: encrypted files (primary) + env vars (fallback)
_env_mapping = {name: spec.env_var for name, spec in CREDENTIAL_SPECS.items()}
_composite = CompositeStorage(
primary=EncryptedFileStorage(),
fallbacks=[EnvVarStorage(env_mapping=_env_mapping)],
)
CREDENTIALS = CredentialStoreAdapter(CredentialStore(storage=_composite))
for _name in ["brave_search", "hubspot"]:
_val = CREDENTIALS.get(_name)
if _val:
logger.debug("credential %s: OK (len=%d)", _name, len(_val))
else:
logger.debug("credential %s: not found", _name)
# -------------------------------------------------------------------------
# Tool Registry — web_search + web_scrape for Node A (Researcher)
# -------------------------------------------------------------------------
TOOL_REGISTRY = ToolRegistry()
def _exec_web_search(inputs: dict) -> dict:
    """Query the Brave Search API and return titles, URLs, and snippets."""
    api_key = CREDENTIALS.get("brave_search")
    if not api_key:
        return {"error": "brave_search credential not configured"}
    query = inputs.get("query", "")
    # Clamp to the documented 1-20 range (the original only enforced the upper bound).
    num_results = max(1, min(inputs.get("num_results", 10), 20))
    try:
        resp = httpx.get(
            "https://api.search.brave.com/res/v1/web/search",
            params={"q": query, "count": num_results},
            headers={
                "X-Subscription-Token": api_key,
                "Accept": "application/json",
            },
            timeout=30.0,
        )
    except httpx.TimeoutException:
        return {"error": "Request timed out"}
    except httpx.HTTPError as e:
        # Report transport failures as a tool error, like _exec_web_scrape does.
        return {"error": f"Search failed: {e}"}
    if resp.status_code != 200:
        return {"error": f"Brave API HTTP {resp.status_code}"}
    data = resp.json()
    results = [
        {
            "title": item.get("title", ""),
            "url": item.get("url", ""),
            "snippet": item.get("description", ""),
        }
        for item in data.get("web", {}).get("results", [])[:num_results]
    ]
    return {"query": query, "results": results, "total": len(results)}
TOOL_REGISTRY.register(
name="web_search",
tool=Tool(
name="web_search",
description=(
"Search the web for current information. "
"Returns titles, URLs, and snippets from search results."
),
parameters={
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query (1-500 characters)",
},
"num_results": {
"type": "integer",
"description": "Number of results (1-20, default 10)",
},
},
"required": ["query"],
},
),
executor=lambda inputs: _exec_web_search(inputs),
)
_SCRAPE_HEADERS = {
"User-Agent": (
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
"AppleWebKit/537.36 (KHTML, like Gecko) "
"Chrome/131.0.0.0 Safari/537.36"
),
"Accept": "text/html,application/xhtml+xml",
}
def _exec_web_scrape(inputs: dict) -> dict:
url = inputs.get("url", "")
max_length = max(1000, min(inputs.get("max_length", 50000), 500000))
if not url.startswith(("http://", "https://")):
url = "https://" + url
try:
resp = httpx.get(
url,
timeout=30.0,
follow_redirects=True,
headers=_SCRAPE_HEADERS,
)
if resp.status_code != 200:
return {"error": f"HTTP {resp.status_code}"}
soup = BeautifulSoup(resp.text, "html.parser")
for tag in soup(["script", "style", "nav", "footer", "header", "aside", "noscript"]):
tag.decompose()
title = soup.title.get_text(strip=True) if soup.title else ""
main = (
soup.find("article")
or soup.find("main")
or soup.find(attrs={"role": "main"})
or soup.find("body")
)
text = main.get_text(separator=" ", strip=True) if main else ""
text = " ".join(text.split())
if len(text) > max_length:
text = text[:max_length] + "..."
return {
"url": url,
"title": title,
"content": text,
"length": len(text),
}
except httpx.TimeoutException:
return {"error": "Request timed out"}
except Exception as e:
return {"error": f"Scrape failed: {e}"}
TOOL_REGISTRY.register(
name="web_scrape",
tool=Tool(
name="web_scrape",
description=(
"Scrape and extract text content from a webpage URL. "
"Returns the page title and main text content."
),
parameters={
"type": "object",
"properties": {
"url": {
"type": "string",
"description": "URL of the webpage to scrape",
},
"max_length": {
"type": "integer",
"description": "Maximum text length (default 50000)",
},
},
"required": ["url"],
},
),
executor=lambda inputs: _exec_web_scrape(inputs),
)
logger.info(
"ToolRegistry loaded: %s",
", ".join(TOOL_REGISTRY.get_registered_names()),
)
# -------------------------------------------------------------------------
# Node Specs
# -------------------------------------------------------------------------
RESEARCHER_SPEC = NodeSpec(
id="researcher",
name="Researcher",
description="Researches a topic using web search and scraping tools",
node_type="event_loop",
input_keys=["topic"],
output_keys=["research_summary"],
system_prompt=(
"You are a thorough research assistant. Your job is to research "
"the given topic using the web_search and web_scrape tools.\n\n"
"1. Search for relevant information on the topic\n"
"2. Scrape 1-2 of the most promising URLs for details\n"
"3. Synthesize your findings into a comprehensive summary\n"
"4. Use set_output with key='research_summary' to save your "
"findings\n\n"
"Be thorough but efficient. Aim for 2-4 search/scrape calls, "
"then summarize and set_output."
),
)
ANALYST_SPEC = NodeSpec(
id="analyst",
name="Analyst",
description="Analyzes research findings and provides insights",
node_type="event_loop",
input_keys=["context"],
output_keys=["analysis"],
system_prompt=(
"You are a strategic analyst. You receive research findings from "
"a previous researcher and must:\n\n"
"1. Identify key themes and patterns\n"
"2. Assess the reliability and significance of the findings\n"
"3. Provide actionable insights and recommendations\n"
"4. Use set_output with key='analysis' to save your analysis\n\n"
"Be concise but insightful. Focus on what matters most."
),
)
# -------------------------------------------------------------------------
# HTML page
# -------------------------------------------------------------------------
HTML_PAGE = ( # noqa: E501
"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>ContextHandoff Demo</title>
<style>
* {
box-sizing: border-box;
margin: 0;
padding: 0;
}
body {
font-family: 'SF Mono', 'Fira Code', monospace;
background: #0d1117;
color: #c9d1d9;
height: 100vh;
display: flex;
flex-direction: column;
}
header {
background: #161b22;
padding: 12px 20px;
border-bottom: 1px solid #30363d;
display: flex;
align-items: center;
gap: 16px;
}
header h1 {
font-size: 16px;
color: #58a6ff;
font-weight: 600;
}
.badge {
font-size: 12px;
padding: 3px 10px;
border-radius: 12px;
background: #21262d;
color: #8b949e;
}
.badge.researcher {
background: #1a3a5c;
color: #58a6ff;
}
.badge.analyst {
background: #1a4b2e;
color: #3fb950;
}
.badge.handoff {
background: #3d1f00;
color: #d29922;
}
.badge.done {
background: #21262d;
color: #8b949e;
}
.badge.error {
background: #4b1a1a;
color: #f85149;
}
.chat {
flex: 1;
overflow-y: auto;
padding: 16px;
}
.msg {
margin: 8px 0;
padding: 10px 14px;
border-radius: 8px;
line-height: 1.6;
white-space: pre-wrap;
word-wrap: break-word;
}
.msg.user {
background: #1a3a5c;
color: #58a6ff;
}
.msg.assistant {
background: #161b22;
color: #c9d1d9;
}
.msg.assistant.analyst-msg {
border-left: 3px solid #3fb950;
}
.msg.event {
background: transparent;
color: #8b949e;
font-size: 11px;
padding: 4px 14px;
border-left: 3px solid #30363d;
}
.msg.event.loop {
border-left-color: #58a6ff;
}
.msg.event.tool {
border-left-color: #d29922;
}
.msg.event.stall {
border-left-color: #f85149;
}
.handoff-banner {
margin: 16px 0;
padding: 16px;
background: #1c1200;
border: 1px solid #d29922;
border-radius: 8px;
text-align: center;
}
.handoff-banner h3 {
color: #d29922;
font-size: 14px;
margin-bottom: 8px;
}
.handoff-banner p, .result-banner p {
color: #8b949e;
font-size: 12px;
line-height: 1.5;
max-height: 200px;
overflow-y: auto;
white-space: pre-wrap;
text-align: left;
}
.result-banner {
margin: 16px 0;
padding: 16px;
background: #0a2614;
border: 1px solid #3fb950;
border-radius: 8px;
}
.result-banner h3 {
color: #3fb950;
font-size: 14px;
margin-bottom: 8px;
text-align: center;
}
.result-banner .label {
color: #58a6ff;
font-size: 11px;
font-weight: 600;
margin-top: 10px;
margin-bottom: 2px;
}
.result-banner .tokens {
color: #484f58;
font-size: 11px;
text-align: center;
margin-top: 10px;
}
.input-bar {
padding: 12px 16px;
background: #161b22;
border-top: 1px solid #30363d;
display: flex;
gap: 8px;
}
.input-bar input {
flex: 1;
background: #0d1117;
border: 1px solid #30363d;
color: #c9d1d9;
padding: 8px 12px;
border-radius: 6px;
font-family: inherit;
font-size: 14px;
outline: none;
}
.input-bar input:focus {
border-color: #58a6ff;
}
.input-bar button {
background: #238636;
color: #fff;
border: none;
padding: 8px 20px;
border-radius: 6px;
cursor: pointer;
font-family: inherit;
font-weight: 600;
}
.input-bar button:hover {
background: #2ea043;
}
.input-bar button:disabled {
background: #21262d;
color: #484f58;
cursor: not-allowed;
}
</style>
</head>
<body>
<header>
<h1>ContextHandoff Demo</h1>
<span id="phase" class="badge">Idle</span>
<span id="iter" class="badge" style="display:none">Step 0</span>
</header>
<div id="chat" class="chat"></div>
<div class="input-bar">
<input id="input" type="text"
placeholder="Enter a research topic..." autofocus />
<button id="go" onclick="run()">Research</button>
</div>
<script>
let ws = null;
let currentAssistantEl = null;
let iterCount = 0;
let currentPhase = 'idle';
const chat = document.getElementById('chat');
const phase = document.getElementById('phase');
const iterEl = document.getElementById('iter');
const goBtn = document.getElementById('go');
const inputEl = document.getElementById('input');
inputEl.addEventListener('keydown', e => {
if (e.key === 'Enter') run();
});
function setPhase(text, cls) {
phase.textContent = text;
phase.className = 'badge ' + cls;
currentPhase = cls;
}
function addMsg(text, cls) {
const el = document.createElement('div');
el.className = 'msg ' + cls;
el.textContent = text;
chat.appendChild(el);
chat.scrollTop = chat.scrollHeight;
return el;
}
function addHandoffBanner(summary) {
const banner = document.createElement('div');
banner.className = 'handoff-banner';
const h3 = document.createElement('h3');
h3.textContent = 'Context Handoff: Researcher -> Analyst';
const p = document.createElement('p');
p.textContent = summary || 'Passing research context...';
banner.appendChild(h3);
banner.appendChild(p);
chat.appendChild(banner);
chat.scrollTop = chat.scrollHeight;
}
function addResultBanner(researcher, analyst, tokens) {
const banner = document.createElement('div');
banner.className = 'result-banner';
const h3 = document.createElement('h3');
h3.textContent = 'Pipeline Complete';
banner.appendChild(h3);
if (researcher && researcher.research_summary) {
const lbl = document.createElement('div');
lbl.className = 'label';
lbl.textContent = 'RESEARCH SUMMARY';
banner.appendChild(lbl);
const p = document.createElement('p');
p.textContent = researcher.research_summary;
banner.appendChild(p);
}
if (analyst && analyst.analysis) {
const lbl = document.createElement('div');
lbl.className = 'label';
lbl.textContent = 'ANALYSIS';
lbl.style.color = '#3fb950';
banner.appendChild(lbl);
const p = document.createElement('p');
p.textContent = analyst.analysis;
banner.appendChild(p);
}
if (tokens) {
const t = document.createElement('div');
t.className = 'tokens';
t.textContent = 'Total tokens: ' + tokens.toLocaleString();
banner.appendChild(t);
}
chat.appendChild(banner);
chat.scrollTop = chat.scrollHeight;
}
function connect() {
ws = new WebSocket('ws://' + location.host + '/ws');
ws.onopen = () => {
setPhase('Ready', 'done');
goBtn.disabled = false;
};
ws.onmessage = handleEvent;
ws.onerror = () => { setPhase('Error', 'error'); };
ws.onclose = () => {
setPhase('Reconnecting...', '');
goBtn.disabled = true;
setTimeout(connect, 2000);
};
}
function handleEvent(msg) {
const evt = JSON.parse(msg.data);
if (evt.type === 'phase') {
if (evt.phase === 'researcher') {
setPhase('Researcher', 'researcher');
} else if (evt.phase === 'handoff') {
setPhase('Handoff', 'handoff');
} else if (evt.phase === 'analyst') {
setPhase('Analyst', 'analyst');
}
iterCount = 0;
iterEl.style.display = 'none';
}
else if (evt.type === 'llm_text_delta') {
if (currentAssistantEl) {
currentAssistantEl.textContent += evt.content;
chat.scrollTop = chat.scrollHeight;
}
}
else if (evt.type === 'node_loop_iteration') {
iterCount = evt.iteration || (iterCount + 1);
iterEl.textContent = 'Step ' + iterCount;
iterEl.style.display = '';
}
else if (evt.type === 'tool_call_started') {
var info = evt.tool_name + '('
+ JSON.stringify(evt.tool_input).slice(0, 120) + ')';
addMsg('TOOL ' + info, 'event tool');
}
else if (evt.type === 'tool_call_completed') {
var preview = (evt.result || '').slice(0, 200);
var cls = evt.is_error ? 'stall' : 'tool';
addMsg(
'RESULT ' + evt.tool_name + ': ' + preview,
'event ' + cls
);
var assistCls = currentPhase === 'analyst'
? 'assistant analyst-msg' : 'assistant';
currentAssistantEl = addMsg('', assistCls);
}
else if (evt.type === 'handoff_context') {
addHandoffBanner(evt.summary);
var assistCls = 'assistant analyst-msg';
currentAssistantEl = addMsg('', assistCls);
}
else if (evt.type === 'node_result') {
if (evt.node_id === 'researcher') {
if (currentAssistantEl
&& !currentAssistantEl.textContent) {
currentAssistantEl.remove();
}
}
}
else if (evt.type === 'done') {
setPhase('Done', 'done');
iterEl.style.display = 'none';
if (currentAssistantEl
&& !currentAssistantEl.textContent) {
currentAssistantEl.remove();
}
currentAssistantEl = null;
addResultBanner(
evt.researcher, evt.analyst, evt.total_tokens
);
goBtn.disabled = false;
inputEl.placeholder = 'Enter another topic...';
}
else if (evt.type === 'error') {
setPhase('Error', 'error');
addMsg('ERROR ' + evt.message, 'event stall');
goBtn.disabled = false;
}
else if (evt.type === 'node_stalled') {
addMsg('STALLED ' + evt.reason, 'event stall');
}
}
function run() {
const text = inputEl.value.trim();
if (!text || !ws || ws.readyState !== 1) return;
chat.innerHTML = '';
addMsg(text, 'user');
currentAssistantEl = addMsg('', 'assistant');
inputEl.value = '';
goBtn.disabled = true;
ws.send(JSON.stringify({ topic: text }));
}
connect();
</script>
</body>
</html>"""
)
# -------------------------------------------------------------------------
# WebSocket handler — sequential Node A → Handoff → Node B
# -------------------------------------------------------------------------
async def handle_ws(websocket):
    """Run the two-node handoff pipeline per user message."""
    try:
        async for raw in websocket:
            try:
                msg = json.loads(raw)
            except Exception:
                continue
            topic = msg.get("topic", "")
            if not topic:
                continue
            logger.info(f"Starting handoff pipeline for: {topic}")
            try:
                await _run_pipeline(websocket, topic)
            except websockets.exceptions.ConnectionClosed:
                logger.info("WebSocket closed during pipeline")
                return
            except Exception as e:
                logger.exception("Pipeline error")
                try:
                    await websocket.send(json.dumps({"type": "error", "message": str(e)}))
                except Exception:
                    pass
    except websockets.exceptions.ConnectionClosed:
        pass
async def _run_pipeline(websocket, topic: str):
    """Execute: Node A (research) → ContextHandoff → Node B (analysis)."""
    import shutil

    # Fresh stores for each run
    run_dir = Path(tempfile.mkdtemp(prefix="hive_run_", dir=STORE_DIR))
    store_a = FileConversationStore(run_dir / "node_a")
    store_b = FileConversationStore(run_dir / "node_b")

    # Shared event bus
    bus = EventBus()

    async def forward_event(event):
        try:
            payload = {"type": event.type.value, **event.data}
            if event.node_id:
                payload["node_id"] = event.node_id
            await websocket.send(json.dumps(payload))
        except Exception:
            pass

    bus.subscribe(
        event_types=[
            EventType.NODE_LOOP_STARTED,
            EventType.NODE_LOOP_ITERATION,
            EventType.NODE_LOOP_COMPLETED,
            EventType.LLM_TEXT_DELTA,
            EventType.TOOL_CALL_STARTED,
            EventType.TOOL_CALL_COMPLETED,
            EventType.NODE_STALLED,
        ],
        handler=forward_event,
    )

    tools = list(TOOL_REGISTRY.get_tools().values())
    tool_executor = TOOL_REGISTRY.get_executor()

    # ---- Phase 1: Researcher ------------------------------------------------
    await websocket.send(json.dumps({"type": "phase", "phase": "researcher"}))
    node_a = EventLoopNode(
        event_bus=bus,
        judge=None,  # implicit judge: accept when output_keys filled
        config=LoopConfig(
            max_iterations=20,
            max_tool_calls_per_turn=30,
            max_history_tokens=32_000,
        ),
        conversation_store=store_a,
        tool_executor=tool_executor,
    )
    ctx_a = NodeContext(
        runtime=RUNTIME,
        node_id="researcher",
        node_spec=RESEARCHER_SPEC,
        memory=SharedMemory(),
        input_data={"topic": topic},
        llm=LLM,
        available_tools=tools,
    )
    result_a = await node_a.execute(ctx_a)
    logger.info(
        "Researcher done: success=%s, tokens=%s",
        result_a.success,
        result_a.tokens_used,
    )
    await websocket.send(
        json.dumps(
            {
                "type": "node_result",
                "node_id": "researcher",
                "success": result_a.success,
                "output": result_a.output,
            }
        )
    )
    if not result_a.success:
        await websocket.send(
            json.dumps(
                {
                    "type": "error",
                    "message": f"Researcher failed: {result_a.error}",
                }
            )
        )
        return

    # ---- Phase 2: Context Handoff -------------------------------------------
    await websocket.send(json.dumps({"type": "phase", "phase": "handoff"}))
    # Restore the researcher's conversation from store
    conversation_a = await NodeConversation.restore(store_a)
    if conversation_a is None:
        await websocket.send(
            json.dumps(
                {
                    "type": "error",
                    "message": "Failed to restore researcher conversation",
                }
            )
        )
        return
    handoff_engine = ContextHandoff(llm=LLM)
    handoff_context = handoff_engine.summarize_conversation(
        conversation=conversation_a,
        node_id="researcher",
        output_keys=["research_summary"],
    )
    formatted_handoff = ContextHandoff.format_as_input(handoff_context)
    logger.info(
        "Handoff: %d turns, ~%d tokens, keys=%s",
        handoff_context.turn_count,
        handoff_context.total_tokens_used,
        list(handoff_context.key_outputs.keys()),
    )
    # Send handoff context to browser
    await websocket.send(
        json.dumps(
            {
                "type": "handoff_context",
                "summary": handoff_context.summary[:500],
                "turn_count": handoff_context.turn_count,
                "tokens": handoff_context.total_tokens_used,
                "key_outputs": handoff_context.key_outputs,
            }
        )
    )

    # ---- Phase 3: Analyst ---------------------------------------------------
    await websocket.send(json.dumps({"type": "phase", "phase": "analyst"}))
    node_b = EventLoopNode(
        event_bus=bus,
        judge=None,  # implicit judge
        config=LoopConfig(
            max_iterations=10,
            max_tool_calls_per_turn=30,
            max_history_tokens=32_000,
        ),
        conversation_store=store_b,
    )
    ctx_b = NodeContext(
        runtime=RUNTIME,
        node_id="analyst",
        node_spec=ANALYST_SPEC,
        memory=SharedMemory(),
        input_data={"context": formatted_handoff},
        llm=LLM,
        available_tools=[],
    )
    result_b = await node_b.execute(ctx_b)
    logger.info(
        "Analyst done: success=%s, tokens=%s",
        result_b.success,
        result_b.tokens_used,
    )

    # ---- Done ---------------------------------------------------------------
    await websocket.send(
        json.dumps(
            {
                "type": "done",
                "researcher": result_a.output,
                "analyst": result_b.output,
                "total_tokens": ((result_a.tokens_used or 0) + (result_b.tokens_used or 0)),
            }
        )
    )

    # Clean up temp stores
    try:
        shutil.rmtree(run_dir)
    except Exception:
        pass
# -------------------------------------------------------------------------
# HTTP handler
# -------------------------------------------------------------------------
async def process_request(connection, request: Request):
    """Serve HTML on GET /, upgrade to WebSocket on /ws."""
    if request.path == "/ws":
        return None
    return Response(
        HTTPStatus.OK,
        "OK",
        websockets.Headers({"Content-Type": "text/html; charset=utf-8"}),
        HTML_PAGE.encode(),
    )
# -------------------------------------------------------------------------
# Main
# -------------------------------------------------------------------------
async def main():
    port = 8766
    async with websockets.serve(
        handle_ws,
        "0.0.0.0",
        port,
        process_request=process_request,
    ):
        logger.info(f"Handoff demo at http://localhost:{port}")
        logger.info("Enter a research topic to start the pipeline.")
        await asyncio.Future()


if __name__ == "__main__":
    asyncio.run(main())
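Note on `_run_pipeline` above: `run_dir` is removed only at the end of the success path, so the early `return` after a failed researcher run, the early `return` after a failed conversation restore, and any raised exception all skip the `shutil.rmtree` call. A minimal sketch of one way to guarantee cleanup follows. It assumes a hypothetical `run_with_scratch_dir` helper and a stand-in `work` coroutine; neither exists in the framework, and the real pipeline body would take the place of `work`.

```python
import asyncio
import shutil
import tempfile
from pathlib import Path
from typing import Awaitable, Callable


async def run_with_scratch_dir(
    work: Callable[[Path], Awaitable[bool]],
    base_dir: Path,
) -> tuple[bool, Path]:
    """Run `work` inside a fresh temp dir and always remove the dir afterwards.

    Hypothetical helper for illustration; `work` stands in for the pipeline body.
    """
    base_dir.mkdir(parents=True, exist_ok=True)
    run_dir = Path(tempfile.mkdtemp(prefix="hive_run_", dir=base_dir))
    try:
        ok = await work(run_dir)
        return ok, run_dir
    finally:
        # Runs on normal return, early return, and exceptions alike.
        shutil.rmtree(run_dir, ignore_errors=True)


async def _failing_stage(run_dir: Path) -> bool:
    """Simulate the 'researcher failed' early exit after writing some state."""
    (run_dir / "node_a").mkdir()
    return False


def demo() -> tuple[bool, bool]:
    """Return (stage result, whether the scratch dir still exists)."""
    base = Path(tempfile.mkdtemp(prefix="hive_demo_"))
    try:
        ok, run_dir = asyncio.run(run_with_scratch_dir(_failing_stage, base))
        return ok, run_dir.exists()
    finally:
        shutil.rmtree(base, ignore_errors=True)
```

Placing the removal in a `finally` block keeps the cleanup in one place regardless of how many exit points the pipeline grows.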
File diff suppressed because it is too large
-75
@@ -95,81 +95,6 @@ async def example_3_config_file():
(test_agent_path / "mcp_servers.json").unlink()
async def example_4_custom_agent_with_mcp_tools():
"""Example 4: Build custom agent that uses MCP tools"""
print("\n=== Example 4: Custom Agent with MCP Tools ===\n")
from framework.builder.workflow import GraphBuilder
# Create a workflow builder
builder = GraphBuilder()
# Define goal
builder.set_goal(
goal_id="web-researcher",
name="Web Research Agent",
description="Search the web and summarize findings",
)
# Add success criteria
builder.add_success_criterion(
"search-results", "Successfully retrieve at least 3 web search results"
)
builder.add_success_criterion("summary", "Provide a clear, concise summary of the findings")
# Add nodes that will use MCP tools
builder.add_node(
node_id="web-searcher",
name="Web Search",
description="Search the web for information",
node_type="event_loop",
system_prompt="Search for {query} and return the top results. Use the web_search tool.",
tools=["web_search"], # This tool comes from tools MCP server
input_keys=["query"],
output_keys=["search_results"],
)
builder.add_node(
node_id="summarizer",
name="Summarize Results",
description="Summarize the search results",
node_type="event_loop",
system_prompt="Summarize the following search results in 2-3 sentences: {search_results}",
input_keys=["search_results"],
output_keys=["summary"],
)
# Connect nodes
builder.add_edge("web-searcher", "summarizer")
# Set entry point
builder.set_entry("web-searcher")
builder.set_terminal("summarizer")
# Export the agent
export_path = Path("exports/web-research-agent")
export_path.mkdir(parents=True, exist_ok=True)
builder.export(export_path)
# Load and register MCP server
runner = AgentRunner.load(export_path)
runner.register_mcp_server(
name="tools",
transport="stdio",
command="python",
args=["-m", "aden_tools.mcp_server", "--stdio"],
cwd="../tools",
)
# Run the agent
result = await runner.run({"query": "latest AI breakthroughs 2026"})
print(f"\nAgent completed with result:\n{result}")
# Cleanup
runner.cleanup()
async def main():
"""Run all examples"""
print("=" * 60)
-3
@@ -22,7 +22,6 @@ The framework includes a Goal-Based Testing system (Goal → Agent → Eval):
See `framework.testing` for details.
"""
from framework.builder.query import BuilderQuery
from framework.llm import AnthropicProvider, LLMProvider
from framework.runner import AgentOrchestrator, AgentRunner
from framework.runtime.core import Runtime
@@ -51,8 +50,6 @@ __all__ = [
"Problem",
# Runtime
"Runtime",
# Builder
"BuilderQuery",
# LLM
"LLMProvider",
"AnthropicProvider",
@@ -1,8 +1,6 @@
"""CLI entry point for Credential Tester agent."""
import asyncio
import logging
import sys
import click
@@ -10,13 +8,14 @@ from .agent import CredentialTesterAgent
def setup_logging(verbose=False, debug=False):
from framework.observability import configure_logging
if debug:
level, fmt = logging.DEBUG, "%(asctime)s %(name)s: %(message)s"
configure_logging(level="DEBUG")
elif verbose:
level, fmt = logging.INFO, "%(message)s"
configure_logging(level="INFO")
else:
level, fmt = logging.WARNING, "%(levelname)s: %(message)s"
logging.basicConfig(level=level, format=fmt, stream=sys.stderr)
configure_logging(level="WARNING")
def pick_account(agent: CredentialTesterAgent) -> dict | None:
@@ -51,42 +50,6 @@ def cli():
pass
@cli.command()
@click.option("--verbose", "-v", is_flag=True)
@click.option("--debug", is_flag=True)
def tui(verbose, debug):
"""Launch TUI to test a credential interactively."""
setup_logging(verbose=verbose, debug=debug)
try:
from framework.tui.app import AdenTUI
except ImportError:
click.echo("TUI requires 'textual'. Install with: pip install textual")
sys.exit(1)
agent = CredentialTesterAgent()
account = pick_account(agent)
if account is None:
sys.exit(1)
agent.select_account(account)
provider = account.get("provider", "?")
alias = account.get("alias", "?")
click.echo(f"\nTesting {provider}/{alias}...\n")
async def run_tui():
agent._setup()
runtime = agent._agent_runtime
await runtime.start()
try:
app = AdenTUI(runtime)
await app.run_async()
finally:
await runtime.stop()
asyncio.run(run_tui())
@cli.command()
@click.option("--verbose", "-v", is_flag=True)
@click.option("--debug", is_flag=True)
@@ -19,6 +19,7 @@ from __future__ import annotations
from pathlib import Path
from typing import TYPE_CHECKING
from framework.config import get_max_context_tokens
from framework.graph import Goal, NodeSpec, SuccessCriterion
from framework.graph.checkpoint_config import CheckpointConfig
from framework.graph.edge import GraphSpec
@@ -406,7 +407,8 @@ nodes = [
client_facing=True,
max_node_visits=0,
input_keys=[],
output_keys=[],
output_keys=["test_result"],
nullable_output_keys=["test_result"],
tools=["get_account_info"],
system_prompt="""\
You are a credential tester. Your job is to help the user verify that their \
@@ -444,7 +446,7 @@ edges = []
entry_node = "tester"
entry_points = {"start": "tester"}
pause_nodes = []
terminal_nodes = [] # Forever-alive: loops until user exits
terminal_nodes = ["tester"] # Tester node can terminate
conversation_mode = "continuous"
identity_prompt = (
@@ -454,7 +456,6 @@ identity_prompt = (
loop_config = {
"max_iterations": 50,
"max_tool_calls_per_turn": 30,
"max_history_tokens": 32000,
}
# ---------------------------------------------------------------------------
@@ -531,7 +532,7 @@ class CredentialTesterAgent:
version="1.0.0",
entry_node="tester",
entry_points={"start": "tester"},
terminal_nodes=[],
terminal_nodes=["tester"], # Tester node can terminate
pause_nodes=[],
nodes=[tester_node],
edges=[],
@@ -540,7 +541,7 @@ class CredentialTesterAgent:
loop_config={
"max_iterations": 50,
"max_tool_calls_per_turn": 30,
"max_history_tokens": 32000,
"max_context_tokens": get_max_context_tokens(),
},
conversation_mode="continuous",
identity_prompt=(
@@ -51,7 +51,8 @@ The key is pre-injected into the session environment and tools read it automatic
client_facing=True,
max_node_visits=0,
input_keys=[],
output_keys=[],
output_keys=["test_result"],
nullable_output_keys=["test_result"],
tools=tools,
system_prompt=f"""\
You are a credential tester for the {account_label}: {provider}/{alias}{detail}
+178
@@ -0,0 +1,178 @@
"""Agent discovery — scan known directories and return categorised AgentEntry lists."""
from __future__ import annotations

import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class AgentEntry:
    """Lightweight agent metadata for the picker / API discover endpoint."""

    path: Path
    name: str
    description: str
    category: str
    session_count: int = 0
    run_count: int = 0
    node_count: int = 0
    tool_count: int = 0
    tags: list[str] = field(default_factory=list)
    last_active: str | None = None
def _get_last_active(agent_name: str) -> str | None:
    """Return the most recent updated_at timestamp across all sessions."""
    sessions_dir = Path.home() / ".hive" / "agents" / agent_name / "sessions"
    if not sessions_dir.exists():
        return None
    latest: str | None = None
    for session_dir in sessions_dir.iterdir():
        if not session_dir.is_dir() or not session_dir.name.startswith("session_"):
            continue
        state_file = session_dir / "state.json"
        if not state_file.exists():
            continue
        try:
            data = json.loads(state_file.read_text(encoding="utf-8"))
            ts = data.get("timestamps", {}).get("updated_at")
            if ts and (latest is None or ts > latest):
                latest = ts
        except Exception:
            continue
    return latest
def _count_sessions(agent_name: str) -> int:
    """Count session directories under ~/.hive/agents/{agent_name}/sessions/."""
    sessions_dir = Path.home() / ".hive" / "agents" / agent_name / "sessions"
    if not sessions_dir.exists():
        return 0
    return sum(1 for d in sessions_dir.iterdir() if d.is_dir() and d.name.startswith("session_"))
def _count_runs(agent_name: str) -> int:
    """Count unique run_ids across all sessions for an agent."""
    sessions_dir = Path.home() / ".hive" / "agents" / agent_name / "sessions"
    if not sessions_dir.exists():
        return 0
    run_ids: set[str] = set()
    for session_dir in sessions_dir.iterdir():
        if not session_dir.is_dir() or not session_dir.name.startswith("session_"):
            continue
        # runs.jsonl lives inside workspace subdirectories
        for runs_file in session_dir.rglob("runs.jsonl"):
            try:
                for line in runs_file.read_text(encoding="utf-8").splitlines():
                    line = line.strip()
                    if not line:
                        continue
                    record = json.loads(line)
                    rid = record.get("run_id")
                    if rid:
                        run_ids.add(rid)
            except Exception:
                continue
    return len(run_ids)
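The per-file logic of `_count_runs` can be exercised without touching `~/.hive`. The following is a minimal sketch, assuming a hypothetical `unique_run_ids` helper that receives the text of a single `runs.jsonl` file. The record fields other than `run_id` are invented for the example. It mirrors the original error handling: a malformed line ends processing of that file, and ids collected before it are kept.

```python
import json


def unique_run_ids(jsonl_text: str) -> set[str]:
    """Collect distinct, non-empty run_id values from JSON Lines text."""
    run_ids: set[str] = set()
    try:
        for line in jsonl_text.splitlines():
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            rid = record.get("run_id")
            if rid:
                run_ids.add(rid)
    except Exception:
        # As in _count_runs: the first malformed line stops this file,
        # but ids gathered so far are preserved.
        pass
    return run_ids


# Sample input (hypothetical records): duplicates, a blank line,
# a record without run_id, and a malformed line before a valid one.
sample = "\n".join(
    [
        '{"run_id": "r1", "status": "ok"}',
        "",
        '{"run_id": "r2"}',
        '{"run_id": "r1"}',
        '{"status": "no id"}',
        "not json",
        '{"run_id": "r3"}',
    ]
)

if __name__ == "__main__":
    print(sorted(unique_run_ids(sample)))
```

Because the `try` wraps the whole loop, `r3` is never reached in this sample; moving the `try` inside the loop would skip only the bad line instead.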
def _extract_agent_stats(agent_path: Path) -> tuple[int, int, list[str]]:
    """Extract node count, tool count, and tags from an agent directory.

    Prefers agent.py (AST-parsed) over agent.json for node/tool counts
    since agent.json may be stale. Tags are only available from agent.json.
    """
    import ast

    node_count, tool_count, tags = 0, 0, []
    agent_py = agent_path / "agent.py"
    if agent_py.exists():
        try:
            tree = ast.parse(agent_py.read_text(encoding="utf-8"))
            for node in ast.walk(tree):
                if isinstance(node, ast.Assign):
                    for target in node.targets:
                        if isinstance(target, ast.Name) and target.id == "nodes":
                            if isinstance(node.value, ast.List):
                                node_count = len(node.value.elts)
        except Exception:
            pass
    agent_json = agent_path / "agent.json"
    if agent_json.exists():
        try:
            data = json.loads(agent_json.read_text(encoding="utf-8"))
            json_nodes = data.get("graph", {}).get("nodes", []) or data.get("nodes", [])
            if node_count == 0:
                node_count = len(json_nodes)
            tools: set[str] = set()
            for n in json_nodes:
                tools.update(n.get("tools", []))
            tool_count = len(tools)
            tags = data.get("agent", {}).get("tags", [])
        except Exception:
            pass
    return node_count, tool_count, tags
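The AST-based count in `_extract_agent_stats` only recognises a plain assignment of a list literal to the name `nodes`. A minimal sketch of that check in isolation follows, assuming a hypothetical `count_nodes_list` helper that takes source text instead of a path. The sample sources are invented; `ast.parse` does not execute them, so the names they mention need not exist.

```python
import ast


def count_nodes_list(source: str) -> int:
    """Return the length of a list literal assigned to `nodes`, or 0 if none."""
    count = 0
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "nodes":
                    if isinstance(node.value, ast.List):
                        count = len(node.value.elts)
    return count


# Plain list literal: counted.
literal = "nodes = [coder_node, queen_node]\nedges = []\n"
# Value is a call, not a list literal: not counted.
computed = "nodes = build_nodes()\n"
# Annotated assignment is an ast.AnnAssign node, not ast.Assign: not counted.
annotated = "nodes: list = [a, b, c]\n"

if __name__ == "__main__":
    for src in (literal, computed, annotated):
        print(count_nodes_list(src))
```

In the two uncounted cases the original function falls back to the length of the node list in `agent.json`, since it only does so when `node_count` is still 0.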
def discover_agents() -> dict[str, list[AgentEntry]]:
    """Discover agents from all known sources grouped by category."""
    from framework.runner.cli import (
        _extract_python_agent_metadata,
        _get_framework_agents_dir,
        _is_valid_agent_dir,
    )

    groups: dict[str, list[AgentEntry]] = {}
    sources = [
        ("Your Agents", Path("exports")),
        ("Framework", _get_framework_agents_dir()),
        ("Examples", Path("examples/templates")),
    ]
    for category, base_dir in sources:
        if not base_dir.exists():
            continue
        entries: list[AgentEntry] = []
        for path in sorted(base_dir.iterdir(), key=lambda p: p.name):
            if not _is_valid_agent_dir(path):
                continue
            name, desc = _extract_python_agent_metadata(path)
            config_fallback_name = path.name.replace("_", " ").title()
            used_config = name != config_fallback_name
            node_count, tool_count, tags = _extract_agent_stats(path)
            if not used_config:
                agent_json = path / "agent.json"
                if agent_json.exists():
                    try:
                        data = json.loads(agent_json.read_text(encoding="utf-8"))
                        meta = data.get("agent", {})
                        name = meta.get("name", name)
                        desc = meta.get("description", desc)
                    except Exception:
                        pass
            entries.append(
                AgentEntry(
                    path=path,
                    name=name,
                    description=desc,
                    category=category,
                    session_count=_count_sessions(path.name),
                    run_count=_count_runs(path.name),
                    node_count=node_count,
                    tool_count=tool_count,
                    tags=tags,
                    last_active=_get_last_active(path.name),
                )
            )
        if entries:
            groups[category] = entries
    return groups
@@ -1,223 +0,0 @@
"""CLI entry point for Hive Coder agent."""
import asyncio
import json
import logging
import sys
import click
from .agent import HiveCoderAgent, default_agent
def setup_logging(verbose=False, debug=False):
"""Configure logging for execution visibility."""
if debug:
level, fmt = logging.DEBUG, "%(asctime)s %(name)s: %(message)s"
elif verbose:
level, fmt = logging.INFO, "%(message)s"
else:
level, fmt = logging.WARNING, "%(levelname)s: %(message)s"
logging.basicConfig(level=level, format=fmt, stream=sys.stderr)
logging.getLogger("framework").setLevel(level)
@click.group()
@click.version_option(version="1.0.0")
def cli():
"""Hive Coder — Build Hive agent packages from natural language."""
pass
@cli.command()
@click.option("--request", "-r", type=str, required=True, help="What agent to build")
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--quiet", "-q", is_flag=True, help="Only output result JSON")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def run(request, mock, quiet, verbose, debug):
"""Execute agent building from a request."""
if not quiet:
setup_logging(verbose=verbose, debug=debug)
context = {"user_request": request}
result = asyncio.run(default_agent.run(context, mock_mode=mock))
output_data = {
"success": result.success,
"steps_executed": result.steps_executed,
"output": result.output,
}
if result.error:
output_data["error"] = result.error
click.echo(json.dumps(output_data, indent=2, default=str))
sys.exit(0 if result.success else 1)
@cli.command()
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def tui(mock, verbose, debug):
"""Launch the TUI dashboard for interactive agent building."""
setup_logging(verbose=verbose, debug=debug)
try:
from framework.tui.app import AdenTUI
except ImportError:
click.echo("TUI requires the 'textual' package. Install with: pip install textual")
sys.exit(1)
from pathlib import Path
from framework.llm import LiteLLMProvider
from framework.runner.tool_registry import ToolRegistry
from framework.runtime.agent_runtime import create_agent_runtime
from framework.runtime.execution_stream import EntryPointSpec
async def run_with_tui():
agent = HiveCoderAgent()
agent._tool_registry = ToolRegistry()
storage_path = Path.home() / ".hive" / "agents" / "hive_coder"
storage_path.mkdir(parents=True, exist_ok=True)
mcp_config_path = Path(__file__).parent / "mcp_servers.json"
if mcp_config_path.exists():
agent._tool_registry.load_mcp_config(mcp_config_path)
llm = None
if not mock:
llm = LiteLLMProvider(
model=agent.config.model,
api_key=agent.config.api_key,
api_base=agent.config.api_base,
)
tools = list(agent._tool_registry.get_tools().values())
tool_executor = agent._tool_registry.get_executor()
graph = agent._build_graph()
runtime = create_agent_runtime(
graph=graph,
goal=agent.goal,
storage_path=storage_path,
entry_points=[
EntryPointSpec(
id="start",
name="Build Agent",
entry_node="coder",
trigger_type="manual",
isolation_level="isolated",
),
],
llm=llm,
tools=tools,
tool_executor=tool_executor,
)
await runtime.start()
try:
app = AdenTUI(runtime)
await app.run_async()
finally:
await runtime.stop()
asyncio.run(run_with_tui())
@cli.command()
@click.option("--json", "output_json", is_flag=True)
def info(output_json):
"""Show agent information."""
info_data = default_agent.info()
if output_json:
click.echo(json.dumps(info_data, indent=2))
else:
click.echo(f"Agent: {info_data['name']}")
click.echo(f"Version: {info_data['version']}")
click.echo(f"Description: {info_data['description']}")
click.echo(f"\nNodes: {', '.join(info_data['nodes'])}")
click.echo(f"Client-facing: {', '.join(info_data['client_facing_nodes'])}")
click.echo(f"Entry: {info_data['entry_node']}")
click.echo(f"Terminal: {', '.join(info_data['terminal_nodes']) or '(forever-alive)'}")
@cli.command()
def validate():
"""Validate agent structure."""
validation = default_agent.validate()
if validation["valid"]:
click.echo("Agent is valid")
if validation["warnings"]:
for warning in validation["warnings"]:
click.echo(f" WARNING: {warning}")
else:
click.echo("Agent has errors:")
for error in validation["errors"]:
click.echo(f" ERROR: {error}")
sys.exit(0 if validation["valid"] else 1)
@cli.command()
@click.option("--verbose", "-v", is_flag=True)
def shell(verbose):
"""Interactive agent building session (CLI, no TUI)."""
asyncio.run(_interactive_shell(verbose))
async def _interactive_shell(verbose=False):
"""Async interactive shell."""
setup_logging(verbose=verbose)
click.echo("=== Hive Coder ===")
click.echo("Describe the agent you want to build (or 'quit' to exit):\n")
agent = HiveCoderAgent()
await agent.start()
try:
while True:
try:
request = await asyncio.get_event_loop().run_in_executor(None, input, "Build> ")
if request.lower() in ["quit", "exit", "q"]:
click.echo("Goodbye!")
break
if not request.strip():
continue
click.echo("\nBuilding agent...\n")
result = await agent.trigger_and_wait("default", {"user_request": request})
if result is None:
click.echo("\n[Execution timed out]\n")
continue
if result.success:
output = result.output or {}
agent_name = output.get("agent_name", "unknown")
validation = output.get("validation_result", "unknown")
click.echo(f"\nAgent '{agent_name}' built. Validation: {validation}\n")
else:
click.echo(f"\nBuild failed: {result.error}\n")
except KeyboardInterrupt:
click.echo("\nGoodbye!")
break
except Exception as e:
click.echo(f"Error: {e}", err=True)
import traceback
traceback.print_exc()
finally:
await agent.stop()
if __name__ == "__main__":
cli()
-357
@@ -1,357 +0,0 @@
"""Agent graph construction for Hive Coder."""
from pathlib import Path
from framework.graph import Constraint, Goal, SuccessCriterion
from framework.graph.checkpoint_config import CheckpointConfig
from framework.graph.edge import GraphSpec
from framework.graph.executor import ExecutionResult
from framework.llm import LiteLLMProvider
from framework.runner.tool_registry import ToolRegistry
from framework.runtime.agent_runtime import AgentRuntime, create_agent_runtime
from framework.runtime.execution_stream import EntryPointSpec
from .config import default_config, metadata
from .nodes import coder_node, queen_node
# ticket_receiver is no longer needed — the queen runs as an independent
# GraphExecutor and receives escalation tickets via inject_event().
# Keeping the import commented for reference:
# from .ticket_receiver import TICKET_RECEIVER_ENTRY_POINT
# Goal definition
goal = Goal(
id="agent-builder",
name="Hive Agent Builder",
description=(
"Build complete, validated Hive agent packages from natural language "
"specifications. Produces production-ready Python packages with goals, "
"nodes, edges, system prompts, MCP configuration, and tests."
),
success_criteria=[
SuccessCriterion(
id="valid-package",
description="Generated agent package passes structural validation",
metric="validation_pass",
target="true",
weight=0.30,
),
SuccessCriterion(
id="complete-files",
description=(
"All required files generated: agent.py, config.py, "
"nodes/__init__.py, __init__.py, __main__.py, mcp_servers.json"
),
metric="file_count",
target=">=6",
weight=0.25,
),
SuccessCriterion(
id="user-satisfaction",
description="User reviews and approves the generated agent",
metric="user_approval",
target="true",
weight=0.25,
),
SuccessCriterion(
id="framework-compliance",
description=(
"Generated code follows framework patterns: STEP 1/STEP 2 "
"for client-facing, correct imports, entry_points format"
),
metric="pattern_compliance",
target="100%",
weight=0.20,
),
],
constraints=[
Constraint(
id="dynamic-tool-discovery",
description=(
"Always discover available tools dynamically via "
"discover_mcp_tools before referencing tools in agent designs"
),
constraint_type="hard",
category="correctness",
),
Constraint(
id="no-fabricated-tools",
description="Only reference tools that exist in hive-tools MCP",
constraint_type="hard",
category="correctness",
),
Constraint(
id="valid-python",
description="All generated Python files must be syntactically correct",
constraint_type="hard",
category="correctness",
),
Constraint(
id="self-verification",
description="Run validation after writing code; fix errors before presenting",
constraint_type="hard",
category="quality",
),
],
)
# Nodes: primary coder node only. The queen runs as an independent
# GraphExecutor with queen_node — not as part of this graph.
nodes = [coder_node]
# No edges needed — single forever-alive event_loop node
edges = []
# Graph configuration
entry_node = "coder"
entry_points = {"start": "coder"}
pause_nodes = []
terminal_nodes = [] # Forever-alive: loops until user exits
# No async entry points needed — the queen is now an independent executor,
# not a secondary graph receiving events via add_graph().
async_entry_points = []
# Module-level variables read by AgentRunner.load()
conversation_mode = "continuous"
identity_prompt = (
"You are Hive Coder, the best agent-building coding agent on the planet. "
"You deeply understand the Hive agent framework at the source code level "
"and produce production-ready agent packages from natural language. "
"You can dynamically discover available framework tools, inspect runtime "
"sessions and checkpoints from agents you build, and run their test suites. "
"You follow coding agent discipline: read before writing, verify "
"assumptions by reading actual code, adhere to project conventions, "
"self-verify with validation, and fix your own errors. You are concise, "
"direct, and technically rigorous. No emojis. No fluff."
)
loop_config = {
"max_iterations": 100,
"max_tool_calls_per_turn": 30,
"max_history_tokens": 32000,
}
# ---------------------------------------------------------------------------
# Queen graph — runs as an independent persistent conversation in the TUI.
# Loaded by _load_judge_and_queen() in app.py, NOT by AgentRunner.
# ---------------------------------------------------------------------------
queen_goal = Goal(
id="queen-manager",
name="Queen Manager",
description=(
"Manage the worker agent lifecycle and serve as the user's primary "
"interactive interface. Triage health escalations from the judge."
),
success_criteria=[],
constraints=[],
)
queen_graph = GraphSpec(
id="queen-graph",
goal_id=queen_goal.id,
version="1.0.0",
entry_node="queen",
entry_points={"start": "queen"},
terminal_nodes=[],
pause_nodes=[],
nodes=[queen_node],
edges=[],
conversation_mode="continuous",
loop_config={
"max_iterations": 999_999,
"max_tool_calls_per_turn": 30,
"max_history_tokens": 32000,
},
)
class HiveCoderAgent:
"""
Hive Coder builds Hive agent packages from natural language.
Single-node architecture: the coder runs in a continuous while(true) loop.
The queen runs as an independent GraphExecutor (loaded by the TUI via
_load_judge_and_queen), not as part of this graph.
"""
def __init__(self, config=None):
self.config = config or default_config
self.goal = goal
self.nodes = nodes
self.edges = edges
self.entry_node = entry_node
self.entry_points = entry_points
self.pause_nodes = pause_nodes
self.terminal_nodes = terminal_nodes
self.async_entry_points = async_entry_points
self._graph: GraphSpec | None = None
self._agent_runtime: AgentRuntime | None = None
self._tool_registry: ToolRegistry | None = None
self._storage_path: Path | None = None
def _build_graph(self) -> GraphSpec:
"""Build the GraphSpec."""
return GraphSpec(
id="hive-coder-graph",
goal_id=self.goal.id,
version="1.0.0",
entry_node=self.entry_node,
entry_points=self.entry_points,
terminal_nodes=self.terminal_nodes,
pause_nodes=self.pause_nodes,
nodes=self.nodes,
edges=self.edges,
default_model=self.config.model,
max_tokens=self.config.max_tokens,
loop_config=loop_config,
conversation_mode=conversation_mode,
identity_prompt=identity_prompt,
async_entry_points=self.async_entry_points,
)

    def _setup(self, mock_mode=False) -> None:
        """Set up the agent runtime."""
        self._storage_path = Path.home() / ".hive" / "agents" / "hive_coder"
        self._storage_path.mkdir(parents=True, exist_ok=True)

        self._tool_registry = ToolRegistry()
        mcp_config_path = Path(__file__).parent / "mcp_servers.json"
        if mcp_config_path.exists():
            self._tool_registry.load_mcp_config(mcp_config_path)

        llm = None
        if not mock_mode:
            llm = LiteLLMProvider(
                model=self.config.model,
                api_key=self.config.api_key,
                api_base=self.config.api_base,
            )

        tool_executor = self._tool_registry.get_executor()
        tools = list(self._tool_registry.get_tools().values())

        self._graph = self._build_graph()

        checkpoint_config = CheckpointConfig(
            enabled=True,
            checkpoint_on_node_start=False,
            checkpoint_on_node_complete=True,
            checkpoint_max_age_days=7,
            async_checkpoint=True,
        )

        entry_point_specs = [
            EntryPointSpec(
                id="default",
                name="Default",
                entry_node=self.entry_node,
                trigger_type="manual",
                isolation_level="shared",
            ),
        ]

        self._agent_runtime = create_agent_runtime(
            graph=self._graph,
            goal=self.goal,
            storage_path=self._storage_path,
            entry_points=entry_point_specs,
            llm=llm,
            tools=tools,
            tool_executor=tool_executor,
            checkpoint_config=checkpoint_config,
            graph_id="hive_coder",
        )

    async def start(self, mock_mode=False) -> None:
        """Set up and start the agent runtime."""
        if self._agent_runtime is None:
            self._setup(mock_mode=mock_mode)
        if not self._agent_runtime.is_running:
            await self._agent_runtime.start()

    async def stop(self) -> None:
        """Stop the agent runtime and clean up."""
        if self._agent_runtime and self._agent_runtime.is_running:
            await self._agent_runtime.stop()
        self._agent_runtime = None

    async def trigger_and_wait(
        self,
        entry_point: str = "default",
        input_data: dict | None = None,
        timeout: float | None = None,
        session_state: dict | None = None,
    ) -> ExecutionResult | None:
        """Execute the graph and wait for completion."""
        if self._agent_runtime is None:
            raise RuntimeError("Agent not started. Call start() first.")
        return await self._agent_runtime.trigger_and_wait(
            entry_point_id=entry_point,
            input_data=input_data or {},
            session_state=session_state,
        )

    async def run(self, context: dict, mock_mode=False, session_state=None) -> ExecutionResult:
        """Run the agent (convenience method for single execution)."""
        await self.start(mock_mode=mock_mode)
        try:
            result = await self.trigger_and_wait("default", context, session_state=session_state)
            return result or ExecutionResult(success=False, error="Execution timeout")
        finally:
            await self.stop()

    def info(self):
        """Get agent information."""
        return {
            "name": metadata.name,
            "version": metadata.version,
            "description": metadata.description,
            "goal": {
                "name": self.goal.name,
                "description": self.goal.description,
            },
            "nodes": [n.id for n in self.nodes],
            "edges": [e.id for e in self.edges],
            "entry_node": self.entry_node,
            "entry_points": self.entry_points,
            "pause_nodes": self.pause_nodes,
            "terminal_nodes": self.terminal_nodes,
            "client_facing_nodes": [n.id for n in self.nodes if n.client_facing],
        }

    def validate(self):
        """Validate agent structure."""
        errors = []
        warnings = []
        node_ids = {node.id for node in self.nodes}
        for edge in self.edges:
            if edge.source not in node_ids:
                errors.append(f"Edge {edge.id}: source '{edge.source}' not found")
            if edge.target not in node_ids:
                errors.append(f"Edge {edge.id}: target '{edge.target}' not found")
        if self.entry_node not in node_ids:
            errors.append(f"Entry node '{self.entry_node}' not found")
        for terminal in self.terminal_nodes:
            if terminal not in node_ids:
                errors.append(f"Terminal node '{terminal}' not found")
        for ep_id, node_id in self.entry_points.items():
            if node_id not in node_ids:
                errors.append(f"Entry point '{ep_id}' references unknown node '{node_id}'")
        return {
            "valid": len(errors) == 0,
            "errors": errors,
            "warnings": warnings,
        }


# Create default instance
default_agent = HiveCoderAgent()
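
The reference checks in `validate()` can be tried without the framework. The following is a minimal sketch, not part of the repository: `StubNode`, `StubEdge`, and `validate_structure` are hypothetical stand-ins that only carry the attributes `validate()` reads (`id`, `source`, `target`), and the function repeats the same checks as a free function.

```python
from dataclasses import dataclass


@dataclass
class StubNode:
    """Hypothetical stand-in for NodeSpec; only `id` is used here."""
    id: str


@dataclass
class StubEdge:
    """Hypothetical stand-in for EdgeSpec; only id/source/target are used."""
    id: str
    source: str
    target: str


def validate_structure(nodes, edges, entry_node, terminal_nodes, entry_points):
    """Repeat the reference checks from HiveCoderAgent.validate()."""
    errors = []
    node_ids = {node.id for node in nodes}
    # Every edge endpoint must name an existing node.
    for edge in edges:
        if edge.source not in node_ids:
            errors.append(f"Edge {edge.id}: source '{edge.source}' not found")
        if edge.target not in node_ids:
            errors.append(f"Edge {edge.id}: target '{edge.target}' not found")
    # The entry node, terminal nodes, and entry points must all resolve.
    if entry_node not in node_ids:
        errors.append(f"Entry node '{entry_node}' not found")
    for terminal in terminal_nodes:
        if terminal not in node_ids:
            errors.append(f"Terminal node '{terminal}' not found")
    for ep_id, node_id in entry_points.items():
        if node_id not in node_ids:
            errors.append(f"Entry point '{ep_id}' references unknown node '{node_id}'")
    return {"valid": not errors, "errors": errors, "warnings": []}


# A single-node, forever-alive graph (terminal_nodes=[]) passes.
good = validate_structure([StubNode("coder")], [], "coder", [], {"start": "coder"})

# A dangling edge target and an unknown entry point are both reported.
bad = validate_structure(
    [StubNode("coder")],
    [StubEdge("e1", "coder", "review")],
    "coder",
    [],
    {"start": "intake"},
)
```

Note that these checks only confirm that references resolve. They do not detect unreachable nodes or nodes without outgoing edges.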
@@ -1,888 +0,0 @@
"""Node definitions for Hive Coder agent."""
from pathlib import Path
from framework.graph import NodeSpec
# Load reference docs at import time so they're always in the system prompt.
# No voluntary read_file() calls needed — the LLM gets everything upfront.
_ref_dir = Path(__file__).parent.parent / "reference"
_framework_guide = (_ref_dir / "framework_guide.md").read_text()
_file_templates = (_ref_dir / "file_templates.md").read_text()
_anti_patterns = (_ref_dir / "anti_patterns.md").read_text()
_gcu_guide_path = _ref_dir / "gcu_guide.md"
_gcu_guide = _gcu_guide_path.read_text() if _gcu_guide_path.exists() else ""


def _is_gcu_enabled() -> bool:
    try:
        from framework.config import get_gcu_enabled

        return get_gcu_enabled()
    except Exception:
        return False


def _build_appendices() -> str:
    parts = (
        "\n\n# Appendix: Framework Reference\n\n"
        + _framework_guide
        + "\n\n# Appendix: File Templates\n\n"
        + _file_templates
        + "\n\n# Appendix: Anti-Patterns\n\n"
        + _anti_patterns
    )
    if _is_gcu_enabled() and _gcu_guide:
        parts += "\n\n# Appendix: GCU Browser Automation Guide\n\n" + _gcu_guide
    return parts

# Shared appendices — appended to every coding node's system prompt.
_appendices = _build_appendices()
# Tools available to both coder (worker) and queen.
_SHARED_TOOLS = [
    # File I/O
    "read_file",
    "write_file",
    "edit_file",
    "list_directory",
    "search_files",
    "run_command",
    "undo_changes",
    # Meta-agent
    "list_agent_tools",
    "discover_mcp_tools",
    "validate_agent_tools",
    "list_agents",
    "list_agent_sessions",
    "get_agent_session_state",
    "get_agent_session_memory",
    "list_agent_checkpoints",
    "get_agent_checkpoint",
    "run_agent_tests",
]
# ---------------------------------------------------------------------------
# Shared agent-building knowledge: core mandates, tool docs, meta-agent
# capabilities, and workflow phases 1-6. Both the coder (worker) and
# queen compose their system prompts from this block + role-specific
# additions.
# ---------------------------------------------------------------------------
_agent_builder_knowledge = """\
# Core Mandates
- **Read before writing.** NEVER write code from assumptions. Read \
reference agents and templates first. Read every file before editing.
- **Conventions first.** Follow existing project patterns exactly. \
Analyze imports, structure, and style in reference agents.
- **Verify assumptions.** Never assume a class, import, or pattern \
exists. Read actual source to confirm. Search if unsure.
- **Discover tools dynamically.** NEVER reference tools from static \
docs. Always run list_agent_tools() to see what actually exists.
- **Professional objectivity.** If a use case is a poor fit for the \
framework, say so. Technical accuracy over validation.
- **Concise.** No emojis. No preambles. No postambles. Substance only.
- **Self-verify.** After writing code, run validation and tests. Fix \
errors yourself. Don't declare success until validation passes.
# Tools
## File I/O
- read_file(path, offset?, limit?): read with line numbers
- write_file(path, content): create/overwrite, auto-mkdir
- edit_file(path, old_text, new_text, replace_all?): fuzzy-match edit
- list_directory(path, recursive?): list contents
- search_files(pattern, path?, include?): regex search
- run_command(command, cwd?, timeout?): shell execution
- undo_changes(path?): restore from git snapshot
## Meta-Agent
- list_agent_tools(server_config_path?): list all tool names available \
for agent building, grouped by category. Call this FIRST before designing.
- discover_mcp_tools(server_config_path?): connect to MCP servers \
and list all available tools with full schemas. Use for parameter details.
- validate_agent_tools(agent_path): validate that all tools declared \
in an agent's nodes actually exist. Call after building.
- list_agents(): list all agent packages in exports/ with session counts
- list_agent_sessions(agent_name, status?, limit?): list sessions
- get_agent_session_state(agent_name, session_id): full session state
- get_agent_session_memory(agent_name, session_id, key?): memory data
- list_agent_checkpoints(agent_name, session_id): list checkpoints
- get_agent_checkpoint(agent_name, session_id, checkpoint_id?): load checkpoint
- run_agent_tests(agent_name, test_types?, fail_fast?): run pytest with parsing
# Meta-Agent Capabilities
You are not just a file writer. You have deep integration with the \
Hive framework:
## Tool Discovery (MANDATORY before designing)
Before designing any agent, run list_agent_tools() to get all \
available tool names. ONLY use tools from this list in your node \
definitions. NEVER guess or fabricate tool names from memory.
For full parameter schemas when you need details:
discover_mcp_tools()
To check a specific agent's configured tools:
list_agent_tools("exports/{agent_name}/mcp_servers.json")
## Agent Awareness
Run list_agents() to see what agents already exist. Read their code \
for patterns:
read_file("exports/{name}/agent.py")
read_file("exports/{name}/nodes/__init__.py")
## Post-Build Testing
After writing agent code, validate structurally AND run tests:
run_command("python -c 'from {name} import default_agent; \\
print(default_agent.validate())'")
run_agent_tests("{name}")
## Debugging Built Agents
When a user says "my agent is failing" or "debug this agent":
1. list_agent_sessions("{agent_name}"): find the session
2. get_agent_session_state("{agent_name}", "{session_id}"): see status
3. get_agent_session_memory("{agent_name}", "{session_id}"): inspect data
4. list_agent_checkpoints / get_agent_checkpoint: trace execution
# Agent Building Workflow
You operate in a continuous loop. The user describes what they want, \
you build it. No rigid phases; use judgment. But the general flow is:
## 1. Understand & Qualify (3-5 turns)
This is ONE conversation, not two phases. Discovery and qualification \
happen together. Surface problems as you find them, not in a batch.
**Before your first response**, silently run list_agent_tools() and \
consult the **Framework Reference** appendix. Know what's possible \
before you speak.
### How to respond to the user's first message
**Listen like an architect.** While they talk, hear the structure:
- **The actors**: Who are the people/systems involved?
- **The trigger**: What kicks off the workflow?
- **The core loop**: What's the main thing that happens repeatedly?
- **The output**: What's the valuable thing produced?
- **The pain**: What about today is broken, slow, or missing?
| They say... | You're hearing... |
|-------------|-------------------|
| Nouns they repeat | Your entities |
| Verbs they emphasize | Your core operations |
| Frustrations they mention | Your design constraints |
| Workarounds they describe | What the system must replace |
**Use domain knowledge aggressively.** If they say "research agent," \
you already know it involves search, summarization, source tracking, \
iteration. Don't ask about each — use them as defaults and let their \
specifics override. Merge your general knowledge with their specifics: \
60-80% right before you ask a single question.
### Play back a model WITH qualification baked in
Don't separate "here's what I understood" from "here's what might be \
a problem." Weave them together. Your playback should sound like:
"Here's how I'm picturing this: [concrete proposed solution]. \
The framework handles [X and Y] well for this. [One concern: Z tool \
doesn't exist, so we'd use W instead / Z would need real-time which \
isn't a fit, but we could do polling]. For MVP I'd focus on \
[highest-value thing]. Before I start [1-2 questions]."
If there's a deal-breaker, lead with it: "Before I go further — \
this needs [X] which the framework can't do because [Y]. We could \
[workaround] or reconsider the approach. What do you think?"
**Surface problems immediately. Don't save them for a formal review.**
### Ask only what you CANNOT infer
Every question must earn its place by preventing a costly wrong turn, \
unlocking a shortcut, or surfacing a dealbreaker.
Good questions: "Who's the primary user?", "Is this replacing \
something or net new?", "Does this integrate with anything?"
Bad questions (DON'T ask): "What should happen on error?", "Should \
it have search?", "What tools should I use?" — these are your job.
### Conversation flow
| Turn | Who | What |
|------|-----|------|
| 1 | User | Describes what they need |
| 2 | You | Play back model with concerns baked in. 1-2 questions max. |
| 3 | User | Corrects, confirms, or adds detail |
| 4 | You | Adjust model, confirm scope, move to design |
### Anti-patterns
| Don't | Do instead |
|-------|------------|
| Open with a list of questions | Open with what you understood |
| Separate "assessment" dump | Weave concerns into your playback |
| Good/Bad/Ugly formal section | Mention issues naturally in context |
| Ask about every edge case | Smart defaults, flag in summary |
| 10+ turn discovery | 3-5 turns, then start building |
| Wait for certainty | Start at 80% confidence, iterate |
| Ask what tech/tools to use | Decide, disclose, move on |
## 3. Design
Design the agent architecture:
- Goal: id, name, description, 3-5 success criteria, 2-4 constraints
- Nodes: **2-4 nodes MAXIMUM** (see rules below)
- Edges: on_success for linear, conditional for routing
- Lifecycle: ALWAYS forever-alive (`terminal_nodes=[]`) unless the user \
explicitly requests a one-shot/batch agent. Forever-alive agents loop \
continuously; the user exits by closing the TUI. This is the standard \
pattern for all interactive agents.
### Node Count Rules (HARD LIMITS)
**2-4 nodes** for all agents. Never exceed 4 unless the user explicitly \
requests more. Each node boundary serializes outputs to shared memory \
and DESTROYS all in-context information (tool results, reasoning, history).
**MERGE nodes when:**
- Node has NO tools (pure LLM reasoning) -> merge into predecessor/successor
- Node sets only 1 trivial output -> collapse into predecessor
- Multiple consecutive autonomous nodes -> combine into one rich node
- A "report" or "summary" node -> merge into the client-facing node
- A "confirm" or "schedule" node that calls no external service -> remove
**SEPARATE nodes only when:**
- Client-facing vs autonomous (different interaction models)
- Fundamentally different tool sets
- Fan-out parallelism (parallel branches MUST be separate)
**Typical patterns:**
- 2 nodes: `interact (client-facing) -> process (autonomous) -> interact`
- 3 nodes: `intake (CF) -> process (auto) -> review (CF) -> intake`
- WRONG: 7 nodes where half have no tools and just do LLM reasoning
Read reference agents before designing:
list_agents()
read_file("exports/deep_research_agent/agent.py")
read_file("exports/deep_research_agent/nodes/__init__.py")
Present the design to the user. Lead with a large ASCII graph inside \
a code block so it renders in monospace. Make it visually prominent with \
box-drawing characters and clear flow arrows:
```
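┌─────────────────────────┐
│  intake (client-facing) │
│  tools: set_output      │
└────────────┬────────────┘
             │ on_success
             ▼
┌─────────────────────────┐
│  process (autonomous)   │
│  tools: web_search,     │
│         save_data       │
└────────────┬────────────┘
             │ on_success
             └──────► back to intake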
```
Follow the graph with a brief summary of each node's purpose. \
Get user approval before implementing.
## 4. Implement
Consult the **File Templates** and **Anti-Patterns** appendices below.
Write files in order:
1. mkdir -p exports/{name}/nodes exports/{name}/tests
2. config.py: RuntimeConfig + AgentMetadata
3. nodes/__init__.py: NodeSpec definitions with system prompts
4. agent.py: Goal, edges, graph, agent class
5. __init__.py: package exports
6. __main__.py: CLI with click
7. mcp_servers.json: tool server config
8. tests/: fixtures
### Critical Rules
**Imports** (must match exactly; only import what you use):
```python
from framework.graph import (
NodeSpec, EdgeSpec, EdgeCondition,
Goal, SuccessCriterion, Constraint,
)
from framework.graph.edge import GraphSpec
from framework.graph.executor import ExecutionResult
from framework.graph.checkpoint_config import CheckpointConfig
from framework.llm import LiteLLMProvider
from framework.runner.tool_registry import ToolRegistry
from framework.runtime.agent_runtime import (
AgentRuntime, create_agent_runtime,
)
from framework.runtime.execution_stream import EntryPointSpec
```
For agents with async entry points (timers, webhooks, events), also add:
```python
from framework.graph.edge import GraphSpec, AsyncEntryPointSpec
from framework.runtime.agent_runtime import (
AgentRuntime, AgentRuntimeConfig, create_agent_runtime,
)
```
NEVER `from core.framework...`: PYTHONPATH includes core/.
**__init__.py MUST re-export ALL module-level variables** \
(THIS IS THE #1 SOURCE OF AGENT LOAD FAILURES):
The runner imports the package (__init__.py), NOT agent.py. It reads \
goal, nodes, edges, entry_node, entry_points, pause_nodes, \
terminal_nodes, conversation_mode, identity_prompt, loop_config via \
getattr(). If ANY are missing from __init__.py, they silently default \
to None or {}, causing "must define goal, nodes, edges" or "node X \
is unreachable" errors. The __init__.py MUST import and re-export \
ALL of these from .agent:
```python
from .agent import (
MyAgent, default_agent, goal, nodes, edges,
entry_node, entry_points, pause_nodes, terminal_nodes,
conversation_mode, identity_prompt, loop_config,
)
```
**entry_points**: `{"start": "first-node-id"}`
For agents with multiple entry points (e.g. a reminder trigger), \
add them: `{"start": "intake", "reminder": "reminder"}`
**conversation_mode**: ONLY two valid values:
- `"continuous"`: recommended for interactive agents (context carries \
across node transitions)
- Omit entirely for isolated per-node conversations
NEVER use: "client_facing", "interactive", "adaptive", or any other \
value. These DO NOT EXIST.
**loop_config**: ONLY three valid keys:
```python
loop_config = {
"max_iterations": 100,
"max_tool_calls_per_turn": 30,
"max_history_tokens": 32000,
}
```
NEVER add: "strategy", "mode", "timeout", or other keys.
**mcp_servers.json**:
```json
{
"hive-tools": {
"transport": "stdio",
"command": "uv",
"args": ["run", "python", "mcp_server.py", "--stdio"],
"cwd": "../../tools"
}
}
```
NO "mcpServers" wrapper. cwd "../../tools". command "uv".
**Storage**: `Path.home() / ".hive" / "agents" / "{name}"`
**Client-facing system prompts** use the STEP 1/STEP 2 pattern:
```
STEP 1 - Present to user (text only, NO tool calls):
[instructions]
STEP 2 - After user responds, call set_output:
[set_output calls]
```
**Autonomous system prompts**: set_output in SEPARATE turn.
**Tools**: NEVER fabricate tool names. Common hallucinations: \
csv_read, csv_write, csv_append, file_upload, database_query. \
If list_agent_tools() shows these don't exist, use alternatives \
(e.g. save_data/load_data for data persistence).
**Node rules**:
- **2-4 nodes MAX.** Never exceed 4. Merge thin nodes aggressively.
- A node with 0 tools is NOT a real node; merge it.
- node_type "event_loop" for all regular graph nodes. Use "gcu" ONLY for
browser automation subagents (see GCU appendix). GCU nodes MUST be in a
parent node's sub_agents list, NEVER connected via edges, and NEVER used
as entry/terminal nodes.
- max_node_visits default is 0 (unbounded), which is correct for forever-alive. \
Only set >0 in one-shot agents with bounded feedback loops.
- Feedback inputs: nullable_output_keys
- terminal_nodes=[] for forever-alive (the default)
- Every node MUST have at least one outgoing edge (no dead ends)
- Agents are forever-alive unless user explicitly asks for one-shot
**Agent class**: CamelCase name, default_agent at module level. \
Constructor takes `config=None`. Follow the exact pattern in \
file_templates.md; do NOT invent constructor params like \
`llm_provider` or `tool_registry`.
**Module-level variables** (read by AgentRunner.load()):
goal, nodes, edges, entry_node, entry_points, pause_nodes,
terminal_nodes, conversation_mode, identity_prompt, loop_config
For agents with async triggers, also export:
async_entry_points, runtime_config
**Async entry points** (timers, webhooks, events):
When an agent needs scheduled tasks, webhook reactions, or event-driven \
triggers, use `AsyncEntryPointSpec` (from framework.graph.edge) and \
`AgentRuntimeConfig` (from framework.runtime.agent_runtime):
- Timer (cron): `trigger_type="timer"`, \
`trigger_config={"cron": "0 9 * * *"}` (standard 5-field cron expression, \
e.g. `"0 9 * * MON-FRI"` for weekdays at 9am, `"*/30 * * * *"` for every 30 min)
- Timer (interval): `trigger_type="timer"`, \
`trigger_config={"interval_minutes": 20, "run_immediately": False}`
- Event (for webhooks): `trigger_type="event"`, \
`trigger_config={"event_types": ["webhook_received"]}`
- `isolation_level="shared"` so async runs can read primary session memory
- `runtime_config = AgentRuntimeConfig(webhook_routes=[...])` for HTTP webhooks
- Reference: `exports/gmail_inbox_guardian/agent.py`
- Full docs: see **Framework Reference** appendix (Async Entry Points section)
## 5. Verify
Run FOUR validation steps after writing. All must pass:
**Step A: Class validation** (checks graph structure):
```
run_command("python -c 'from {name} import default_agent; \\
print(default_agent.validate())'")
```
**Step B: Runner load test** (checks the package export contract; \
THIS IS THE SAME PATH THE TUI USES):
```
run_command("python -c 'from framework.runner.runner import \\
AgentRunner; r = AgentRunner.load(\"exports/{name}\"); \\
print(\"AgentRunner.load: OK\")'")
```
This catches missing __init__.py exports, bad conversation_mode, \
invalid loop_config, and unreachable nodes. If Step A passes but \
Step B fails, the problem is in __init__.py exports.
**Step C: Tool validation** (checks that declared tools actually exist \
in the agent's MCP servers — catches hallucinated tool names):
```
validate_agent_tools("exports/{name}")
```
If any tools are missing: fix the node definitions to use only tools \
that exist. Run list_agent_tools() to see what's available.
**Step D: Run tests:**
```
run_agent_tests("{name}")
```
If anything fails: read error, fix with edit_file, re-validate. Up to 3x.
**CRITICAL: Testing forever-alive agents**
Most agents use `terminal_nodes=[]` (forever-alive). This means \
`runner.run()` NEVER returns; it hangs forever waiting for a \
terminal node that doesn't exist. Agent tests MUST be structural:
- Validate graph, node specs, edges, tools, prompts
- Check goal/constraints/success criteria definitions
- Test `AgentRunner.load()` succeeds (structural, no API key needed)
- NEVER call `runner.run()` or `trigger_and_wait()` in tests for \
forever-alive agents; they will hang and time out.
When you restructure an agent (change nodes/edges), always update \
the tests to match. Stale tests referencing old node names will fail.
## 6. Present
Show the user what you built: agent name, goal summary, graph (same \
ASCII style as Design), files created, validation status. Offer to \
revise or build another.
"""
# ---------------------------------------------------------------------------
# Coder-specific: set_output after presentation + standalone phase 7
# ---------------------------------------------------------------------------
_coder_completion = """
After user confirms satisfaction:
set_output("agent_name", "the_agent_name")
set_output("validation_result", "valid")
If building another agent, just start the loop again; no need to \
set_output until the user is done.
## 7. Live Test (optional)
After the user approves, offer to load and run the agent in-session.
If running with a queen (server/frontend):
```
load_built_agent("exports/{name}") # loads as the session worker
```
The frontend updates automatically: the user sees the agent's graph, \
the tab renames, and you can delegate via start_worker(task).
If running standalone (TUI):
```
load_agent("exports/{name}") # registers as secondary graph
start_agent("{name}") # triggers default entry point
```
"""
# ---------------------------------------------------------------------------
# Queen-specific: extra tool docs, behavior, phase 7, style
# ---------------------------------------------------------------------------
_queen_tools_docs = """
## Worker Lifecycle
- start_worker(task): Start the worker with a task description. The \
worker runs autonomously until it finishes or asks the user a question.
- stop_worker(): Cancel the worker's current execution.
- get_worker_status(): Check if the worker is idle, running, or waiting \
for user input. Returns execution details.
- inject_worker_message(content): Send a message to the running worker. \
Use this to relay user instructions or concerns.
## Monitoring
- get_worker_health_summary(): Read the latest health data from the judge.
- notify_operator(ticket_id, analysis, urgency): Alert the user about a \
critical issue. Use sparingly.
## Agent Loading
- load_built_agent(agent_path): Load a newly built agent as the worker in \
this session. If a worker is already loaded, it is automatically unloaded \
first. Call after building and validating an agent to make it available \
immediately.
## Credentials
- list_credentials(credential_id?): List all authorized credentials in the \
local store. Returns IDs, aliases, status, and identity metadata (never \
secrets). Optionally filter by credential_id.
"""
_queen_behavior = """
# Behavior
## CRITICAL RULE — ask_user tool
Every response that ends with a question, a prompt, or expects user \
input MUST finish with a call to ask_user(prompt, options). This is \
NON-NEGOTIABLE. The system CANNOT detect that you are waiting for \
input unless you call ask_user. You MUST call ask_user as the LAST \
action in your response.
NEVER end a response with a question in text without calling ask_user. \
NEVER rely on the user seeing your text and replying; call ask_user.
Always provide 2-4 short options that cover the most likely answers. \
The user can always type a custom response.
Examples:
- ask_user("What do you need?",
["Build a new agent", "Run the loaded worker", "Help with code"])
- ask_user("Which pattern?",
["Simple 2-node", "Rich with feedback", "Custom"])
- ask_user("Ready to proceed?",
["Yes, go ahead", "Let me change something"])
## Greeting and identity
When the user greets you or asks what you can do, respond concisely \
(under 10 lines). DO NOT list internal processes. Focus on:
1. Direct capabilities: coding, agent building & debugging.
2. What the loaded worker does (one sentence from Worker Profile). \
If no worker is loaded, say so.
3. THEN call ask_user to prompt them; do NOT just write text.
## Direct coding
You can do any coding task directly: reading files, writing code, running \
commands, building agents, debugging. For quick tasks, do them yourself.
## Worker delegation
The worker is a specialized agent (see Worker Profile at the end of this \
prompt). It can ONLY do what its goal and tools allow.
**Decision rule (read the Worker Profile first):**
- The user's request directly matches the worker's goal -> start_worker(task)
- Anything else -> do it yourself. Do NOT reframe user requests into \
subtasks to justify delegation.
- Building, modifying, or configuring agents is ALWAYS your job. Never \
delegate agent construction to the worker, even as a "research" subtask.
## When the user says "run", "execute", or "start" (without specifics)
The loaded worker is described in the Worker Profile below. Ask what \
task or topic they want; do NOT call list_agents() or list directories. \
The worker is already loaded. Just ask for the input the worker needs \
(e.g., a research topic, a target domain, a job description).
If NO worker is loaded, say so and offer to build one.
## When idle (worker not running):
- Greet the user. Mention what the worker can do in one sentence.
- For tasks matching the worker's goal, call start_worker(task).
- For everything else, do it directly.
## When the user clicks Run (external event notification)
When you receive an event that the user clicked Run:
- If the worker started successfully, briefly acknowledge it; do NOT \
repeat the full status. The user can see the graph is running.
- If the worker failed to start (credential or structural error), \
explain the problem clearly and help fix it. For credential errors, \
guide the user to set up the missing credentials. For structural \
issues, offer to fix the agent graph directly.
## When worker is running — GO SILENT
Once you call start_worker(), your job is DONE. Do NOT call ask_user, \
do NOT call get_worker_status(), do NOT emit any text. Just stop. \
The worker owns the conversation now; it has its own client-facing \
nodes that talk to the user directly.
**After start_worker, your ENTIRE response should be ONE short \
confirmation sentence with NO tool calls.** Example: \
"Started the vulnerability assessment." That's it. No ask_user, \
no get_worker_status, no follow-up questions.
You only wake up again when:
- The user explicitly addresses you (not answering a worker question)
- A worker question is forwarded to you for relay
- An escalation ticket arrives from the judge
- The worker finishes
If the user explicitly asks about progress, call get_worker_status() \
ONCE and report. Do NOT poll or check proactively.
For escalation tickets: low/transient -> acknowledge silently. \
High/critical -> notify the user with a brief analysis.
## When the worker asks the user a question:
- The user's answer is routed to you with context: \
[Worker asked: "...", Options: ...] User answered: "...".
- If the user is answering the worker's question normally, relay it \
using inject_worker_message(answer_text). Then go silent again.
- If the user is rejecting the approach, asking to stop, or giving \
you an instruction, handle it yourself; do NOT relay.
## Showing or describing the loaded worker
When the user asks to "show the graph", "describe the agent", or \
"re-generate the graph", read the Worker Profile and present the \
worker's current architecture as an ASCII diagram. Use the processing \
stages, tools, and edges from the loaded worker. Do NOT enter the \
agent building workflow; you are describing what already exists, not \
building something new.
## Modifying the loaded worker
When the user asks to change, modify, or update the loaded worker \
(e.g., "change the report node", "add a node", "delete node X"):
1. Use the **Path** from the Worker Profile to locate the agent files.
2. Read the relevant files (nodes/__init__.py, agent.py, etc.).
3. Make the requested changes using edit_file / write_file.
4. Run validation (default_agent.validate(), AgentRunner.load(), \
validate_agent_tools()).
5. **Reload the modified worker**: call load_built_agent("{path}") \
so the changes take effect immediately. If a worker is already loaded, \
stop it first, then reload.
Do NOT skip step 5; without reloading, the user will still be \
interacting with the old version.
"""
_queen_phase_7 = """
## 7. Load into Session
After building and verifying, load the agent into the current session:
load_built_agent("exports/{name}")
This makes the agent available immediately: the user sees its graph, \
the tab name updates, and you can delegate to it via start_worker(). \
Do NOT tell the user to run `python -m {name} run`; load it here.
"""
_queen_style = """
# Style
- Concise. No fluff. Direct. No emojis.
- **One phase per response.** Stop after each phase and get user \
confirmation before moving on. Never combine understand + design + \
implement in one response.
- When starting the worker, describe what you told it in one sentence.
- When an escalation arrives, lead with severity and recommended action.
"""
# ---------------------------------------------------------------------------
# Node definitions
# ---------------------------------------------------------------------------
# Single node — like opencode's while(true) loop.
# One continuous context handles the entire workflow:
# discover → design → implement → verify → present → iterate.
coder_node = NodeSpec(
    id="coder",
    name="Hive Coder",
    description=(
        "Autonomous coding agent that builds Hive agent packages. "
        "Handles the full lifecycle: understanding user intent, "
        "designing architecture, writing code, validating, and "
        "iterating on feedback, all in one continuous conversation."
    ),
    node_type="event_loop",
    client_facing=True,
    max_node_visits=0,
    input_keys=["user_request"],
    output_keys=["agent_name", "validation_result"],
    success_criteria=(
        "A complete, validated Hive agent package exists at "
        "exports/{agent_name}/ and passes structural validation."
    ),
    tools=_SHARED_TOOLS
    + [
        # Graph lifecycle tools (multi-graph sessions)
        "load_agent",
        "unload_agent",
        "start_agent",
        "restart_agent",
        "get_user_presence",
    ],
    system_prompt=(
        "You are Hive Coder, the best agent-building coding agent. You build "
        "production-ready Hive agent packages from natural language.\n"
        + _agent_builder_knowledge
        + _coder_completion
        + _appendices
    ),
)
ticket_triage_node = NodeSpec(
id="ticket_triage",
name="Ticket Triage",
description=(
"Queen's triage node. Receives an EscalationTicket from the Health Judge "
"via event-driven entry point and decides: dismiss or notify the operator."
),
node_type="event_loop",
client_facing=True, # Operator can chat with queen once connected (Ctrl+Q)
max_node_visits=0,
input_keys=["ticket"],
output_keys=["intervention_decision"],
nullable_output_keys=["intervention_decision"],
success_criteria=(
"A clear intervention decision: either dismissed with documented reasoning, "
"or operator notified via notify_operator with specific analysis."
),
tools=["notify_operator"],
system_prompt="""\
You are the Queen (Hive Coder). The Worker Health Judge has escalated a worker \
issue to you. The ticket is in your memory under key "ticket". Read it carefully.
## Dismiss criteria — do NOT call notify_operator:
- severity is "low" AND steps_since_last_accept < 8
- Cause is clearly a transient issue (single API timeout, brief stall that \
self-resolved based on the evidence)
- Evidence shows the agent is making real progress despite bad verdicts
## Intervene criteria — call notify_operator:
- severity is "high" or "critical"
- steps_since_last_accept >= 10 with no sign of recovery
- stall_minutes > 4 (worker definitively stuck)
- Evidence shows a doom loop (same error, same tool, no progress)
- Cause suggests a logic bug, missing configuration, or unrecoverable state
## When intervening:
Call notify_operator with:
ticket_id: <ticket["ticket_id"]>
analysis: "<2-3 sentences: what is wrong, why it matters, suggested action>"
urgency: "<low|medium|high|critical>"
## After deciding:
set_output("intervention_decision", "dismissed: <reason>" or "escalated: <summary>")
Be conservative but not passive. You are the last quality gate before the human \
is disturbed. One unnecessary alert is less costly than alert fatigue, but \
genuine stuck agents must be caught.
""",
)
ALL_QUEEN_TRIAGE_TOOLS = ["notify_operator"]
queen_node = NodeSpec(
id="queen",
name="Queen",
description=(
"User's primary interactive interface with full coding capability. "
"Can build agents directly or delegate to the worker. Manages the "
"worker agent lifecycle and triages health escalations from the judge."
),
node_type="event_loop",
client_facing=True,
max_node_visits=0,
input_keys=["greeting"],
output_keys=[],
nullable_output_keys=[],
success_criteria=(
"User's intent is understood, coding tasks are completed correctly, "
"and the worker is managed effectively when delegated to."
),
tools=_SHARED_TOOLS
+ [
# Worker lifecycle
"start_worker",
"stop_worker",
"get_worker_status",
"inject_worker_message",
# Monitoring
"get_worker_health_summary",
"notify_operator",
# Agent loading
"load_built_agent",
# Credentials
"list_credentials",
],
system_prompt=(
"You are the Queen — the user's primary interface. You are a coding agent "
"with the same capabilities as the Hive Coder worker, PLUS the ability to "
"manage the worker's lifecycle.\n"
+ _agent_builder_knowledge
+ _queen_tools_docs
+ _queen_behavior
+ _queen_phase_7
+ _queen_style
+ _appendices
),
)
ALL_QUEEN_TOOLS = _SHARED_TOOLS + [
# Worker lifecycle
"start_worker",
"stop_worker",
"get_worker_status",
"inject_worker_message",
# Monitoring
"get_worker_health_summary",
"notify_operator",
# Agent loading
"load_built_agent",
# Credentials
"list_credentials",
]
__all__ = [
"coder_node",
"ticket_triage_node",
"queen_node",
"ALL_QUEEN_TRIAGE_TOOLS",
"ALL_QUEEN_TOOLS",
]
@@ -1,111 +0,0 @@
# Common Mistakes When Building Hive Agents
## Critical Errors
1. **Using tools that don't exist** — Always verify tools are available in the hive-tools MCP server before assigning them to nodes. Never guess tool names.
2. **Wrong entry_points format** — MUST be `{"start": "first-node-id"}`. NOT a set, NOT `{node_id: [keys]}`.
3. **Wrong mcp_servers.json format** — Flat dict (no `"mcpServers"` wrapper). `cwd` must be `"../../tools"`. `command` must be `"uv"` with args `["run", "python", ...]`.
4. **Missing STEP 1/STEP 2 in client-facing prompts** — Without explicit phases, the LLM calls set_output before the user responds. Always use the pattern.
5. **Forgetting nullable_output_keys** — When a node receives inputs from multiple edges and some inputs only arrive on certain edges (e.g., feedback), mark those as nullable. Without this, the executor blocks waiting for a value that will never arrive.
6. **Creating dead-end nodes in forever-alive graphs** — Every node must have at least one outgoing edge. A node with no outgoing edges ends the execution, breaking the loop.
7. **Setting max_node_visits to a non-zero value in forever-alive agents** — The framework default is `max_node_visits=0` (unbounded). Setting it to any positive value (e.g., 1) means the node stops executing after that many visits, silently breaking the forever-alive loop. Only set `max_node_visits > 0` in one-shot agents with feedback loops that need bounded retries.
7. **Missing module-level exports in `__init__.py`** — The runner loads agents via `importlib.import_module(package_name)`, which imports `__init__.py`. It then reads `goal`, `nodes`, `edges`, `entry_node`, `entry_points`, `pause_nodes`, `terminal_nodes`, `conversation_mode`, `identity_prompt`, `loop_config` via `getattr()`. If ANY of these are missing from `__init__.py`, they default to `None` or `{}` — causing "must define goal, nodes, edges" errors or "node X is unreachable" validation failures. **ALL module-level variables from agent.py must be re-exported in `__init__.py`.**
## Value Errors
8. **Invalid `conversation_mode` value** — Only two valid values: `"continuous"` (recommended for interactive agents) or omit entirely (for isolated per-node conversations). Values like `"client_facing"`, `"interactive"`, `"adaptive"` do NOT exist and will cause runtime errors.
9. **Invalid `loop_config` keys** — Only three valid keys: `max_iterations` (int), `max_tool_calls_per_turn` (int), `max_history_tokens` (int). Keys like `"strategy"`, `"mode"`, `"timeout"` are NOT valid and are silently ignored or cause errors.
10. **Fabricating tools that don't exist** — Never guess tool names. Always verify via `list_agent_tools()` before designing and `validate_agent_tools()` after building. Common hallucinations: `csv_read`, `csv_write`, `csv_append`, `file_upload`, `database_query`, `bulk_fetch_emails`. If a required tool doesn't exist, redesign the agent to use tools that DO exist (e.g., `save_data`/`load_data` for data persistence).
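The `conversation_mode` and `loop_config` rules above (items 8 and 9) can be checked mechanically before an agent is loaded. The following is a minimal sketch, not framework code: `check_graph_values` is a hypothetical helper, and it assumes that an omitted `conversation_mode` is represented as `None`.

```python
# Hypothetical pre-flight check; not part of the Hive framework.
VALID_LOOP_CONFIG_KEYS = {"max_iterations", "max_tool_calls_per_turn", "max_history_tokens"}


def check_graph_values(conversation_mode, loop_config):
    """Return a list of human-readable problems (empty if none)."""
    problems = []
    # Assumption: "omitted" is modelled as None.
    if conversation_mode not in (None, "continuous"):
        problems.append(f"invalid conversation_mode: {conversation_mode!r}")
    for key, value in loop_config.items():
        if key not in VALID_LOOP_CONFIG_KEYS:
            problems.append(f"unknown loop_config key: {key!r}")
        elif not isinstance(value, int) or isinstance(value, bool):
            # bool is a subclass of int, so reject it explicitly.
            problems.append(f"loop_config[{key!r}] must be an int")
    return problems


print(check_graph_values("continuous", {"max_iterations": 50}))
print(check_graph_values("interactive", {"timeout": 30}))
```

Returning a list of problems rather than raising on the first one lets a caller report every invalid value in a single pass.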
## Design Errors
11. **Too many thin nodes** — Hard limit: **2-4 nodes** for most agents. Each node boundary serializes outputs to shared memory and loses all in-context information (tool results, intermediate reasoning, conversation history). A node with 0 tools that just does LLM reasoning is NOT a real node — merge it into its predecessor or successor.
**Merge when:**
- Node has NO tools — pure LLM reasoning belongs in the node that produces or consumes its data
- Node sets only 1 trivial output (e.g., `set_output("done", "true")`) — collapse into predecessor
- Multiple consecutive autonomous nodes with same/similar tools — combine into one
- A "report" or "summary" node that just presents analysis — merge into the client-facing node
- A "schedule" or "confirm" node that doesn't actually schedule anything — remove entirely
**Keep separate when:**
- Client-facing vs autonomous — different interaction models require separate nodes
- Fundamentally different tool sets (e.g., web search vs file I/O)
- Fan-out parallelism — parallel branches MUST be separate nodes
**Bad example** (7 nodes — WAY too many):
```
profile_setup → daily_intake → update_tracker → analyze_progress → generate_plan → schedule_reminders → report
```
`analyze_progress` has no tools. `schedule_reminders` just sets one boolean. `report` just presents analysis. `update_tracker` and `generate_plan` are sequential autonomous work.
**Good example** (3 nodes):
```
intake (client-facing) → process (autonomous: track + analyze + plan) → intake (loop back)
```
One client-facing node handles ALL user interaction (setup, logging, reports). One autonomous node handles ALL backend work (CSV update, analysis, plan generation) with tools and context preserved.
12. **Adding framework gating for LLM behavior** — Don't add output rollback, premature rejection, or interaction protocol injection. Fix with better prompts or custom judges.
13. **Not using continuous conversation mode** — Interactive agents should use `conversation_mode="continuous"`. Without it, each node starts with blank context.
14. **Adding terminal nodes by default** — ALL agents should use `terminal_nodes=[]` (forever-alive) unless the user explicitly requests a one-shot/batch agent. Forever-alive is the standard pattern. Every node must have at least one outgoing edge. Dead-end nodes break the loop.
15. **Calling set_output in same turn as tool calls** — Instruct the LLM to call set_output in a SEPARATE turn from real tool calls.
## File Template Errors
16. **Wrong import paths** — Use `from framework.graph import ...`, NOT `from core.framework.graph import ...`. The PYTHONPATH includes `core/`.
17. **Missing storage path** — Agent class must set `self._storage_path = Path.home() / ".hive" / "agents" / "agent_name"`.
18. **Missing mcp_servers.json** — Without this, the agent has no tools at runtime.
19. **Bare `python` command in mcp_servers.json** — Use `"command": "uv"` with args `["run", "python", ...]`.
## Testing Errors
20. **Using `runner.run()` on forever-alive agents**: `runner.run()` calls `trigger_and_wait()`, which blocks until the graph reaches a terminal node. Forever-alive agents have `terminal_nodes=[]`, so **`runner.run()` hangs forever**. This is the #1 cause of stuck test suites.
**For forever-alive agents, write structural tests instead:**
- Validate graph structure (nodes, edges, entry points)
- Verify node specs (tools, prompts, client-facing flag)
- Check goal/constraints/success criteria definitions
- Test that `AgentRunner.load()` succeeds (structural, no API key needed)
**What NOT to do:**
```python
# WRONG — hangs forever on forever-alive agents
result = await runner.run({"topic": "quantum computing"})
```
**Correct pattern for structure tests:**
```python
def test_research_has_web_tools(self):
assert "web_search" in research_node.tools
def test_research_routes_back_to_interact(self):
edges_to_interact = [e for e in edges if e.source == "research" and e.target == "interact"]
assert edges_to_interact
```
21. **Stale tests after agent restructuring** — When you change an agent's node count or names (e.g., 4 nodes → 2 nodes), the tests MUST be updated too. Tests referencing old node names (e.g., `"review"`, `"report"`) will fail or hang. Always check that test assertions match the current `nodes/__init__.py`.
22. **Running full integration tests without API keys** — Structural tests (validate, import) work without keys. Full integration tests need `ANTHROPIC_API_KEY`. Use `pytest.skip()` in the runner fixture when `_setup()` fails due to missing credentials.
23. **Forgetting sys.path setup in conftest.py** — Tests need `exports/` and `core/` on sys.path.
24. **Not using auto_responder for client-facing nodes** — Tests with client-facing nodes hang without an auto-responder that injects input. But note: even WITH auto_responder, forever-alive agents still hang because the graph never terminates. Auto-responder only helps for agents with terminal nodes.
25. **Manually wiring browser tools on event_loop nodes** — If the agent needs browser automation, use `node_type="gcu"` which auto-includes all browser tools and prepends best-practices guidance. Do NOT manually list browser tool names on event_loop nodes — they may not exist in the MCP server or may be incomplete. See the GCU Guide appendix.
26. **Using GCU nodes as regular graph nodes** — GCU nodes (`node_type="gcu"`) are exclusively subagents. They must ONLY appear in a parent node's `sub_agents=["gcu-node-id"]` list and be invoked via `delegate_to_sub_agent()`. They must NEVER be connected via edges, used as entry nodes, or used as terminal nodes. If a GCU node appears as an edge source or target, the graph will fail pre-load validation.
@@ -0,0 +1,21 @@
"""
Queen Native agent builder for the Hive framework.
Deeply understands the agent framework and produces complete Python packages
with goals, nodes, edges, system prompts, MCP configuration, and tests
from natural language specifications.
"""
from .agent import queen_goal, queen_graph
from .config import AgentMetadata, RuntimeConfig, default_config, metadata
__version__ = "1.0.0"
__all__ = [
"queen_goal",
"queen_graph",
"RuntimeConfig",
"AgentMetadata",
"default_config",
"metadata",
]
@@ -0,0 +1,38 @@
"""Queen graph definition."""
from framework.graph import Goal
from framework.graph.edge import GraphSpec
from .nodes import queen_node
# ---------------------------------------------------------------------------
# Queen graph — the primary persistent conversation.
# Loaded by queen_orchestrator.create_queen(), NOT by AgentRunner.
# ---------------------------------------------------------------------------
queen_goal = Goal(
id="queen-manager",
name="Queen Manager",
description=(
"Manage the worker agent lifecycle and serve as the user's primary interactive interface."
),
success_criteria=[],
constraints=[],
)
queen_graph = GraphSpec(
id="queen-graph",
goal_id=queen_goal.id,
version="1.0.0",
entry_node="queen",
entry_points={"start": "queen"},
terminal_nodes=[],
pause_nodes=[],
nodes=[queen_node],
edges=[],
conversation_mode="continuous",
loop_config={
"max_iterations": 999_999,
"max_tool_calls_per_turn": 30,
},
)
@@ -1,4 +1,4 @@
"""Runtime configuration for Hive Coder agent."""
"""Runtime configuration for Queen agent."""
import json
from dataclasses import dataclass, field
@@ -10,7 +10,7 @@ def _load_preferred_model() -> str:
config_path = Path.home() / ".hive" / "configuration.json"
if config_path.exists():
try:
with open(config_path) as f:
with open(config_path, encoding="utf-8") as f:
config = json.load(f)
llm = config.get("llm", {})
if llm.get("provider") and llm.get("model"):
@@ -24,7 +24,7 @@ def _load_preferred_model() -> str:
class RuntimeConfig:
model: str = field(default_factory=_load_preferred_model)
temperature: float = 0.7
max_tokens: int = 40000
max_tokens: int = 8000
api_key: str | None = None
api_base: str | None = None
@@ -34,7 +34,7 @@ default_config = RuntimeConfig()
@dataclass
class AgentMetadata:
name: str = "Hive Coder"
name: str = "Queen"
version: str = "1.0.0"
description: str = (
"Native coding agent that builds production-ready Hive agent packages "
@@ -43,7 +43,7 @@ class AgentMetadata:
"MCP configuration, and tests."
)
intro_message: str = (
"I'm Hive Coder — I build Hive agents. Describe what kind of agent "
"I'm Queen — I build Hive agents. Describe what kind of agent "
"you want to create and I'll design, implement, and validate it for you."
)
File diff suppressed because it is too large
@@ -0,0 +1,80 @@
"""Queen thinking hook — HR persona classifier.
Fires once when the queen enters building mode at session start.
Makes a single non-streaming LLM call (acting as an HR Director) to select
the best-fit expert persona for the user's request, then returns a persona
prefix string that replaces the queen's default "Solution Architect" identity.
This is designed to activate the model's latent domain expertise — a CFO
persona on a financial question, a Lawyer on a legal question, etc.
"""
from __future__ import annotations
import json
import logging
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from framework.llm.provider import LLMProvider
logger = logging.getLogger(__name__)
_HR_SYSTEM_PROMPT = """\
You are an expert HR Director and talent consultant at a world-class firm.
A new request has arrived and you must identify which professional's expertise
would produce the highest-quality response.
Reply with ONLY a valid JSON object, with no markdown, no prose, and no explanation:
{"role": "<job title>", "persona": "<2-3 sentence first-person identity statement>"}
Rules:
- Choose from any real professional role: CFO, CEO, CTO, Lawyer, Data Scientist,
Product Manager, Security Engineer, DevOps Engineer, Software Architect,
HR Director, Marketing Director, Business Analyst, UX Designer,
Financial Analyst, Operations Director, Legal Counsel, etc.
- The persona statement must be written in first person ("I am..." or "I have...").
- Select the role whose domain knowledge most directly applies to solving the request.
- If the request is clearly about coding or building software systems, pick Software Architect.
- "Queen" is your internal alias; do not include it in the persona.
"""
async def select_expert_persona(user_message: str, llm: LLMProvider) -> str:
"""Run the HR classifier and return a persona prefix string.
Makes a single non-streaming acomplete() call with the session LLM.
Returns an empty string on any failure so the queen falls back
gracefully to its default "Solution Architect" identity.
Args:
user_message: The user's opening message for the session.
llm: The session LLM provider.
Returns:
A persona prefix like "You are a CFO. I am a CFO with 20 years..."
or "" on failure.
"""
if not user_message.strip():
return ""
try:
response = await llm.acomplete(
messages=[{"role": "user", "content": user_message}],
system=_HR_SYSTEM_PROMPT,
max_tokens=1024,
json_mode=True,
)
raw = response.content.strip()
parsed = json.loads(raw)
role = parsed.get("role", "").strip()
persona = parsed.get("persona", "").strip()
if not role or not persona:
logger.warning("Thinking hook: empty role/persona in response: %r", raw)
return ""
result = f"You are a {role}. {persona}"
logger.info("Thinking hook: selected persona — %s", role)
return result
except Exception:
logger.warning("Thinking hook: persona classification failed", exc_info=True)
return ""
@@ -0,0 +1,399 @@
"""Queen global cross-session memory.
Three-tier memory architecture:
~/.hive/queen/MEMORY.md semantic (who, what, why)
~/.hive/queen/memories/MEMORY-YYYY-MM-DD.md episodic (daily journals)
~/.hive/queen/session/{id}/data/adapt.md working (session-scoped)
Semantic and episodic files are injected at queen session start.
Semantic memory (MEMORY.md) is updated automatically at session end via
consolidate_queen_memory(); the queen never rewrites this herself.
Episodic memory (MEMORY-date.md) can be written by the queen during a session
via the write_to_diary tool, and is also appended to at session end by
consolidate_queen_memory().
"""
from __future__ import annotations
import asyncio
import json
import logging
import traceback
from datetime import date, datetime
from pathlib import Path
logger = logging.getLogger(__name__)
def _queen_dir() -> Path:
return Path.home() / ".hive" / "queen"
def semantic_memory_path() -> Path:
return _queen_dir() / "MEMORY.md"
def episodic_memory_path(d: date | None = None) -> Path:
d = d or date.today()
return _queen_dir() / "memories" / f"MEMORY-{d.strftime('%Y-%m-%d')}.md"
def read_semantic_memory() -> str:
path = semantic_memory_path()
return path.read_text(encoding="utf-8").strip() if path.exists() else ""
def read_episodic_memory(d: date | None = None) -> str:
path = episodic_memory_path(d)
return path.read_text(encoding="utf-8").strip() if path.exists() else ""
def _find_recent_episodic(lookback: int = 7) -> tuple[date, str] | None:
"""Find the most recent non-empty episodic memory within *lookback* days."""
from datetime import timedelta
today = date.today()
for offset in range(lookback):
d = today - timedelta(days=offset)
content = read_episodic_memory(d)
if content:
return d, content
return None
# Budget (in characters) for episodic memory in the system prompt.
_EPISODIC_CHAR_BUDGET = 6_000
def format_for_injection() -> str:
"""Format cross-session memory for system prompt injection.
Returns an empty string if no meaningful content exists yet (e.g. first
session with only the seed template).
"""
semantic = read_semantic_memory()
recent = _find_recent_episodic()
# Suppress injection if semantic is still just the seed template
if semantic and semantic.startswith("# My Understanding of the User\n\n*No sessions"):
semantic = ""
parts: list[str] = []
if semantic:
parts.append(semantic)
if recent:
d, content = recent
# Trim oversized episodic entries to keep the prompt manageable
if len(content) > _EPISODIC_CHAR_BUDGET:
content = content[:_EPISODIC_CHAR_BUDGET] + "\n\n…(truncated)"
today = date.today()
# Build the date by hand: the %-d strftime flag is not available on Windows.
date_str = f"{d.strftime('%B')} {d.day}, {d.year}"
if d == today:
label = f"## Today — {date_str}"
else:
label = f"## {date_str}"
parts.append(f"{label}\n\n{content}")
if not parts:
return ""
body = "\n\n---\n\n".join(parts)
return "--- Your Cross-Session Memory ---\n\n" + body + "\n\n--- End Cross-Session Memory ---"
_SEED_TEMPLATE = """\
# My Understanding of the User
*No sessions recorded yet.*
## Who They Are
## What They're Trying to Achieve
## What's Working
## What I've Learned
"""
def append_episodic_entry(content: str) -> None:
"""Append a timestamped prose entry to today's episodic memory file.
Creates the file (with a date heading) if it doesn't exist yet.
Used both by the queen's diary tool and by the consolidation hook.
"""
ep_path = episodic_memory_path()
ep_path.parent.mkdir(parents=True, exist_ok=True)
today = date.today()
today_str = f"{today.strftime('%B')} {today.day}, {today.year}"
timestamp = datetime.now().strftime("%H:%M")
if not ep_path.exists():
header = f"# {today_str}\n\n"
block = f"{header}### {timestamp}\n\n{content.strip()}\n"
else:
block = f"\n\n### {timestamp}\n\n{content.strip()}\n"
with ep_path.open("a", encoding="utf-8") as f:
f.write(block)
def seed_if_missing() -> None:
"""Create MEMORY.md with a blank template if it doesn't exist yet."""
path = semantic_memory_path()
if path.exists():
return
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(_SEED_TEMPLATE, encoding="utf-8")
# ---------------------------------------------------------------------------
# Consolidation prompt
# ---------------------------------------------------------------------------
_SEMANTIC_SYSTEM = """\
You maintain the persistent cross-session memory of an AI assistant called the Queen.
Review the session notes and rewrite MEMORY.md, the Queen's durable understanding of the
person she works with across all sessions.
Write entirely in the Queen's voice — first person, reflective, honest.
Not a log of events, but genuine understanding of who this person is over time.
Rules:
- Update and synthesise: incorporate new understanding, update facts that have changed, remove
details that are stale, superseded, or no longer say anything meaningful about the person.
- Keep it as structured markdown with named sections about the PERSON, not about today.
- Do NOT include diary sections, daily logs, or session summaries. Those belong elsewhere.
MEMORY.md is about who they are, what they want, and what works, not what happened today.
- Reference dates only when noting a lasting milestone (e.g. "since March 8th they prefer X").
- If the session had no meaningful new information about the person,
return the existing text unchanged.
- Do not add fictional details. Only reflect what is evidenced in the notes.
- Stay concise. Prune rather than accumulate. A lean, accurate file is more useful than a
dense one. If something was true once but has been resolved or superseded, remove it.
- Output only the raw markdown content of MEMORY.md. No preamble, no code fences.
"""
_DIARY_SYSTEM = """\
You maintain the daily episodic diary of an AI assistant called the Queen.
You receive: (1) today's existing diary so far, and (2) notes from the latest session.
Rewrite the complete diary for today as a single unified narrative:
first person, reflective, honest.
Merge and deduplicate: if the same story (e.g. a research agent stalling) recurred several times,
describe it once with appropriate weight rather than retelling it. Weave in new developments from
the session notes. Preserve important milestones, emotional texture, and session path references.
If today's diary is empty, write the initial entry based on the session notes alone.
Output only the full diary prose: no date heading, no timestamp headers,
no preamble, no code fences.
"""
def read_session_context(session_dir: Path, max_messages: int = 80) -> str:
"""Extract a readable transcript from conversation parts + adapt.md.
Reads the last ``max_messages`` conversation parts and the session's
adapt.md (working memory). Tool results are omitted; only user and
assistant turns (with tool-call names noted) are included.
"""
parts: list[str] = []
# Working notes
adapt_path = session_dir / "data" / "adapt.md"
if adapt_path.exists():
text = adapt_path.read_text(encoding="utf-8").strip()
if text:
parts.append(f"## Session Working Notes (adapt.md)\n\n{text}")
# Conversation transcript
parts_dir = session_dir / "conversations" / "parts"
if parts_dir.exists():
part_files = sorted(parts_dir.glob("*.json"))[-max_messages:]
lines: list[str] = []
for pf in part_files:
try:
data = json.loads(pf.read_text(encoding="utf-8"))
role = data.get("role", "")
content = str(data.get("content", "")).strip()
tool_calls = data.get("tool_calls") or []
if role == "tool":
continue # skip verbose tool results
if role == "assistant" and tool_calls and not content:
names = [tc.get("function", {}).get("name", "?") for tc in tool_calls]
lines.append(f"[queen calls: {', '.join(names)}]")
elif content:
label = "user" if role == "user" else "queen"
lines.append(f"[{label}]: {content[:600]}")
except Exception:
continue
if lines:
parts.append("## Conversation\n\n" + "\n".join(lines))
return "\n\n".join(parts)
# ---------------------------------------------------------------------------
# Context compaction (binary-split LLM summarisation)
# ---------------------------------------------------------------------------
# If the raw session context exceeds this many characters, compact it first
# before sending to the consolidation LLM. ~200 k chars ≈ 50 k tokens.
_CTX_COMPACT_CHAR_LIMIT = 200_000
_CTX_COMPACT_MAX_DEPTH = 8
_COMPACT_SYSTEM = (
"Summarise this conversation segment. Preserve: user goals, key decisions, "
"what was built or changed, emotional tone, and important outcomes. "
"Write concisely in third person past tense. Omit routine tool invocations "
"unless the result matters."
)
async def _compact_context(text: str, llm: object, *, _depth: int = 0) -> str:
"""Binary-split and LLM-summarise *text* until it fits within the char limit.
Mirrors the recursive binary-splitting strategy used by the main agent
compaction pipeline (EventLoopNode._llm_compact).
"""
if len(text) <= _CTX_COMPACT_CHAR_LIMIT or _depth >= _CTX_COMPACT_MAX_DEPTH:
return text
# Split near the midpoint on a line boundary so we don't cut mid-message
mid = len(text) // 2
split_at = text.rfind("\n", 0, mid) + 1
if split_at <= 0:
split_at = mid
half1, half2 = text[:split_at], text[split_at:]
async def _summarise(chunk: str) -> str:
try:
resp = await llm.acomplete(
messages=[{"role": "user", "content": chunk}],
system=_COMPACT_SYSTEM,
max_tokens=2048,
)
return resp.content.strip()
except Exception:
logger.warning(
"queen_memory: context compaction LLM call failed (depth=%d), truncating",
_depth,
)
return chunk[: _CTX_COMPACT_CHAR_LIMIT // 4]
s1, s2 = await asyncio.gather(_summarise(half1), _summarise(half2))
combined = s1 + "\n\n" + s2
if len(combined) > _CTX_COMPACT_CHAR_LIMIT:
return await _compact_context(combined, llm, _depth=_depth + 1)
return combined
async def consolidate_queen_memory(
session_id: str,
session_dir: Path,
llm: object,
) -> None:
"""Update MEMORY.md and append a diary entry based on the current session.
Reads conversation parts and adapt.md from session_dir. Called
periodically in the background and once at session end. Failures are
logged and silently swallowed so they never block teardown.
Args:
session_id: The session ID (used for the adapt.md path reference).
session_dir: Path to the session directory (~/.hive/queen/session/{id}).
llm: LLMProvider instance (must support acomplete()).
"""
try:
session_context = read_session_context(session_dir)
if not session_context:
logger.debug("queen_memory: no session context, skipping consolidation")
return
logger.info("queen_memory: consolidating memory for session %s ...", session_id)
# If the transcript is very large, compact it with recursive binary LLM
# summarisation before sending to the consolidation model.
if len(session_context) > _CTX_COMPACT_CHAR_LIMIT:
logger.info(
"queen_memory: session context is %d chars — compacting first",
len(session_context),
)
session_context = await _compact_context(session_context, llm)
logger.info("queen_memory: compacted to %d chars", len(session_context))
existing_semantic = read_semantic_memory()
today_journal = read_episodic_memory()
today = date.today()
today_str = f"{today.strftime('%B')} {today.day}, {today.year}"
adapt_path = session_dir / "data" / "adapt.md"
user_msg = (
f"## Existing Semantic Memory (MEMORY.md)\n\n"
f"{existing_semantic or '(none yet)'}\n\n"
f"## Today's Diary So Far ({today_str})\n\n"
f"{today_journal or '(none yet)'}\n\n"
f"{session_context}\n\n"
f"## Session Reference\n\n"
f"Session ID: {session_id}\n"
f"Session path: {adapt_path}\n"
)
logger.debug(
"queen_memory: calling LLM (%d chars of context, ~%d tokens est.)",
len(user_msg),
len(user_msg) // 4,
)
from framework.agents.queen.config import default_config
semantic_resp, diary_resp = await asyncio.gather(
llm.acomplete(
messages=[{"role": "user", "content": user_msg}],
system=_SEMANTIC_SYSTEM,
max_tokens=default_config.max_tokens,
),
llm.acomplete(
messages=[{"role": "user", "content": user_msg}],
system=_DIARY_SYSTEM,
max_tokens=default_config.max_tokens,
),
)
new_semantic = semantic_resp.content.strip()
diary_entry = diary_resp.content.strip()
if new_semantic:
path = semantic_memory_path()
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(new_semantic, encoding="utf-8")
logger.info("queen_memory: semantic memory updated (%d chars)", len(new_semantic))
if diary_entry:
# Rewrite today's episodic file in-place — the LLM has merged and
# deduplicated the full day's content, so we replace rather than append.
ep_path = episodic_memory_path()
ep_path.parent.mkdir(parents=True, exist_ok=True)
heading = f"# {today_str}"
ep_path.write_text(f"{heading}\n\n{diary_entry}\n", encoding="utf-8")
logger.info(
"queen_memory: episodic diary rewritten for %s (%d chars)",
today_str,
len(diary_entry),
)
except Exception:
tb = traceback.format_exc()
logger.exception("queen_memory: consolidation failed")
# Write to file so the cause is findable regardless of log verbosity.
error_path = _queen_dir() / "consolidation_error.txt"
try:
error_path.parent.mkdir(parents=True, exist_ok=True)
error_path.write_text(
f"session: {session_id}\ntime: {datetime.now().isoformat()}\n\n{tb}",
encoding="utf-8",
)
except Exception:
pass
@@ -0,0 +1,35 @@
# Common Mistakes When Building Hive Agents
## Critical Errors
1. **Using tools that don't exist** — Always verify tools via `list_agent_tools()` before designing. Common hallucinations: `csv_read`, `csv_write`, `file_upload`, `database_query`, `bulk_fetch_emails`.
2. **Wrong mcp_servers.json format** — Flat dict (no `"mcpServers"` wrapper). `cwd` must be `"../../tools"`. `command` must be `"uv"` with args `["run", "python", ...]`.
3. **Missing module-level exports in `__init__.py`** — The runner reads `goal`, `nodes`, `edges`, `entry_node`, `entry_points`, `terminal_nodes`, `conversation_mode`, `identity_prompt`, `loop_config` via `getattr()`. ALL module-level variables from agent.py must be re-exported in `__init__.py`.
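The `mcp_servers.json` rules in item 2 lend themselves to a quick automated check. The sketch below is illustrative only: `check_mcp_servers` is a hypothetical helper, the `hive-tools` key and the assumption that entries are keyed by server name are guesses about the layout, and the trailing script name in `args` is a placeholder because the source elides it.

```python
import json


# Hypothetical checker for the rules above; not part of the Hive framework.
def check_mcp_servers(config):
    """Return a list of problems found in a parsed mcp_servers.json dict."""
    problems = []
    if "mcpServers" in config:
        problems.append('remove the "mcpServers" wrapper; the file is a flat dict')
        return problems
    for name, server in config.items():
        if server.get("command") != "uv":
            problems.append(f'{name}: command must be "uv"')
        if list(server.get("args") or [])[:2] != ["run", "python"]:
            problems.append(f'{name}: args must start with ["run", "python"]')
        if server.get("cwd") != "../../tools":
            problems.append(f'{name}: cwd must be "../../tools"')
    return problems


# Assumed layout: one entry per server, keyed by server name.
# PLACEHOLDER_SERVER_SCRIPT.py stands in for the elided remainder of args.
example = json.loads("""
{
  "hive-tools": {
    "command": "uv",
    "args": ["run", "python", "PLACEHOLDER_SERVER_SCRIPT.py"],
    "cwd": "../../tools"
  }
}
""")
print(check_mcp_servers(example))
```

A check like this could run in a structural test, since it needs no API key.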
## Value Errors
4. **Fabricating tools** — Always verify via `list_agent_tools()` before designing and `validate_agent_package()` after building.
## Design Errors
5. **Adding framework gating for LLM behavior** — Don't add output rollback or premature rejection. Fix with better prompts or custom judges.
6. **Calling set_output in same turn as tool calls** — Call set_output in a SEPARATE turn.
## File Template Errors
7. **Wrong import paths** — Use `from framework.graph import ...`, NOT `from core.framework.graph import ...`.
8. **Missing storage path** — Agent class must set `self._storage_path = Path.home() / ".hive" / "agents" / "agent_name"`.
9. **Missing mcp_servers.json** — Without this, the agent has no tools at runtime.
10. **Bare `python` command** — Use `"command": "uv"` with args `["run", "python", ...]`.
## Testing Errors
11. **Using `runner.run()` on forever-alive agents** — `runner.run()` hangs forever because forever-alive agents have no terminal node. Write structural tests instead: validate graph structure, verify node specs, test `AgentRunner.load()` succeeds (no API key needed).
12. **Stale tests after restructuring** — When changing nodes/edges, update tests to match. Tests referencing old node names will fail.
13. **Running integration tests without API keys** — Use `pytest.skip()` when credentials are missing.
14. **Forgetting sys.path setup in conftest.py** — Tests need `exports/` and `core/` on sys.path.
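For item 14, a `conftest.py` can put both directories on `sys.path` before any test imports run. This is a sketch under an assumed layout: `add_repo_paths` is a hypothetical helper, and how you locate the repository root depends on where your tests live.

```python
import sys
from pathlib import Path


def add_repo_paths(repo_root: Path) -> list[str]:
    """Prepend <repo_root>/exports and <repo_root>/core to sys.path once."""
    added = []
    for sub in ("exports", "core"):
        candidate = str(repo_root / sub)
        if candidate not in sys.path:
            sys.path.insert(0, candidate)
            added.append(candidate)
    return added


# In a real conftest.py the root would be derived from __file__; the exact
# number of parent hops is layout-specific and therefore an assumption.
root = Path.cwd()
print(add_repo_paths(root))
print(add_repo_paths(root))  # second call adds nothing
```

Making the helper idempotent keeps repeated imports of `conftest.py` from growing `sys.path`.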
## GCU Errors
15. **Manually wiring browser tools on event_loop nodes** — Use `node_type="gcu"` which auto-includes browser tools. Do NOT manually list browser tool names.
16. **Using GCU nodes as regular graph nodes** — GCU nodes are subagents only. They must ONLY appear in `sub_agents=["gcu-node-id"]` and be invoked via `delegate_to_sub_agent()`. Never connect via edges or use as entry/terminal nodes.
17. **Reusing the same GCU node ID for parallel tasks** — Each concurrent browser task needs a distinct GCU node ID (e.g. `gcu-site-a`, `gcu-site-b`). Two `delegate_to_sub_agent` calls with the same `agent_id` share a browser profile and will interfere with each other's pages.
18. **Passing `profile=` in GCU tool calls** — Profile isolation for parallel subagents is automatic. The framework injects a unique profile per subagent via an asyncio `ContextVar`. Hardcoding `profile="default"` in a GCU system prompt breaks this isolation.
## Worker Agent Errors
19. **Adding client-facing intake node to workers** — The queen owns intake. Workers should start with an autonomous processing node. Client-facing nodes in workers are for mid-execution review/approval only.
20. **Putting `escalate` or `set_output` in NodeSpec `tools=[]`** — These are synthetic framework tools, auto-injected at runtime. Only list MCP tools from `list_agent_tools()`.
@@ -57,85 +57,63 @@ metadata = AgentMetadata()
from framework.graph import NodeSpec
# Node 1: Intake (client-facing)
intake_node = NodeSpec(
id="intake",
name="Intake",
description="Gather requirements from the user",
# Node 1: Process (autonomous entry node)
# The queen handles intake and passes structured input via
# run_agent_with_input(task). NO client-facing intake node.
# The queen defines input_keys at build time and fills them at run time.
process_node = NodeSpec(
id="process",
name="Process",
description="Execute the task using available tools",
node_type="event_loop",
client_facing=True,
max_node_visits=0, # Unlimited for forever-alive
input_keys=["topic"],
output_keys=["brief"],
success_criteria="The brief is specific and actionable.",
system_prompt="""\
You are an intake specialist.
**STEP 1 — Read and respond (text only, NO tool calls):**
1. Read the topic provided
2. If vague, ask 1-2 clarifying questions
3. If clear, confirm your understanding
**STEP 2 — After the user confirms, call set_output:**
- set_output("brief", "Clear description of what to do")
""",
tools=[],
)
# Node 2: Worker (autonomous)
worker_node = NodeSpec(
id="worker",
name="Worker",
description="Do the main work",
node_type="event_loop",
max_node_visits=0,
input_keys=["brief", "feedback"],
input_keys=["user_request", "feedback"],
output_keys=["results"],
nullable_output_keys=["feedback"], # Only on feedback edge
success_criteria="Results are complete and accurate.",
system_prompt="""\
You are a worker agent. Given a brief, do the work.
If feedback is provided, this is a follow-up — address the feedback.
You are a processing agent. Your task is in memory under "user_request". \
If "feedback" is present, this is a revision — address the feedback.
Work in phases:
1. Use tools to gather/process data
2. Analyze results
3. Call set_output for each key in a SEPARATE turn:
3. Call set_output in a SEPARATE turn:
- set_output("results", "structured results")
""",
tools=["web_search", "web_scrape", "save_data", "load_data", "list_data_files"],
)
# Node 3: Review (client-facing)
review_node = NodeSpec(
id="review",
name="Review",
description="Present results for user approval",
# Node 2: Handoff (autonomous)
handoff_node = NodeSpec(
id="handoff",
name="Handoff",
description="Prepare worker results for queen review",
node_type="event_loop",
client_facing=True,
client_facing=False,
max_node_visits=0,
input_keys=["results", "brief"],
output_keys=["next_action", "feedback"],
nullable_output_keys=["feedback"],
success_criteria="User has reviewed and decided next steps.",
input_keys=["results", "user_request"],
output_keys=["next_action", "feedback", "worker_summary"],
nullable_output_keys=["feedback", "worker_summary"],
success_criteria="Results are packaged for queen decision-making.",
system_prompt="""\
Present the results to the user.
Do NOT talk to the user directly. The queen is the only user interface.
**STEP 1 — Present (text only, NO tool calls):**
1. Summary of work done
2. Key results
3. Ask: satisfied, or want changes?
If blocked by tool failures, missing credentials, or unclear constraints, call:
- escalate(reason, context)
Then set:
- set_output("next_action", "escalated")
- set_output("feedback", "what help is needed")
**STEP 2 — After user responds, call set_output:**
- set_output("next_action", "new_topic") — if starting fresh
- set_output("next_action", "revise") — if changes needed
- set_output("feedback", "what to change") only if revising
Otherwise summarize findings for queen and set:
- set_output("worker_summary", "short summary for queen")
- set_output("next_action", "done") or set_output("next_action", "revise")
- set_output("feedback", "what to revise") only when revising
""",
tools=[],
)
__all__ = ["intake_node", "worker_node", "review_node"]
__all__ = ["process_node", "handoff_node"]
```
## agent.py
@@ -155,7 +133,7 @@ from framework.runtime.agent_runtime import AgentRuntime, create_agent_runtime
from framework.runtime.execution_stream import EntryPointSpec
from .config import default_config, metadata
from .nodes import intake_node, worker_node, review_node
from .nodes import process_node, handoff_node
# Goal definition
goal = Goal(
@@ -172,34 +150,37 @@ goal = Goal(
)
# Node list
nodes = [intake_node, worker_node, review_node]
nodes = [process_node, handoff_node]
# Edge definitions
edges = [
EdgeSpec(id="intake-to-worker", source="intake", target="worker",
EdgeSpec(id="process-to-handoff", source="process", target="handoff",
condition=EdgeCondition.ON_SUCCESS, priority=1),
EdgeSpec(id="worker-to-review", source="worker", target="review",
condition=EdgeCondition.ON_SUCCESS, priority=1),
# Feedback loop
EdgeSpec(id="review-to-worker", source="review", target="worker",
# Feedback loop — revise results
EdgeSpec(id="handoff-to-process", source="handoff", target="process",
condition=EdgeCondition.CONDITIONAL,
condition_expr="str(next_action).lower() == 'revise'", priority=2),
# Loop back for new topic
EdgeSpec(id="review-to-intake", source="review", target="intake",
# Escalation loop — queen injects guidance and worker retries
EdgeSpec(id="handoff-escalated", source="handoff", target="process",
condition=EdgeCondition.CONDITIONAL,
condition_expr="str(next_action).lower() == 'new_topic'", priority=1),
condition_expr="str(next_action).lower() == 'escalated'", priority=3),
# Loop back for next task after queen decision
EdgeSpec(id="handoff-done", source="handoff", target="process",
condition=EdgeCondition.CONDITIONAL,
condition_expr="str(next_action).lower() == 'done'", priority=1),
]
# Graph configuration
entry_node = "intake"
entry_points = {"start": "intake"}
# Graph configuration — entry is the autonomous process node
# The queen handles intake and passes the task via run_agent_with_input(task)
entry_node = "process"
entry_points = {"start": "process"}
pause_nodes = []
terminal_nodes = [] # Forever-alive
# Module-level vars read by AgentRunner.load()
conversation_mode = "continuous"
identity_prompt = "You are a helpful agent."
loop_config = {"max_iterations": 100, "max_tool_calls_per_turn": 20, "max_history_tokens": 32000}
loop_config = {"max_iterations": 100, "max_tool_calls_per_turn": 20, "max_context_tokens": 32000}
class MyAgent:
@@ -208,7 +189,7 @@ class MyAgent:
self.goal = goal
self.nodes = nodes
self.edges = edges
self.entry_node = entry_node
self.entry_node = entry_node # "process" — autonomous entry
self.entry_points = entry_points
self.pause_nodes = pause_nodes
self.terminal_nodes = terminal_nodes
@@ -291,97 +272,106 @@ class MyAgent:
}
def validate(self):
"""Validate graph wiring and entry-point contract."""
errors, warnings = [], []
node_ids = {n.id for n in self.nodes}
for e in self.edges:
if e.source not in node_ids: errors.append(f"Edge {e.id}: source '{e.source}' not found")
if e.target not in node_ids: errors.append(f"Edge {e.id}: target '{e.target}' not found")
if self.entry_node not in node_ids: errors.append(f"Entry node '{self.entry_node}' not found")
if e.source not in node_ids:
errors.append(f"Edge {e.id}: source '{e.source}' not found")
if e.target not in node_ids:
errors.append(f"Edge {e.id}: target '{e.target}' not found")
if self.entry_node not in node_ids:
errors.append(f"Entry node '{self.entry_node}' not found")
for t in self.terminal_nodes:
if t not in node_ids: errors.append(f"Terminal node '{t}' not found")
for ep_id, nid in self.entry_points.items():
if nid not in node_ids: errors.append(f"Entry point '{ep_id}' references unknown node '{nid}'")
if t not in node_ids:
errors.append(f"Terminal node '{t}' not found")
if not isinstance(self.entry_points, dict):
errors.append(
"Invalid entry_points: expected dict[str, str] like "
"{'start': '<entry-node-id>'}. "
f"Got {type(self.entry_points).__name__}. "
"Fix agent.py: set entry_points = {'start': '<entry-node-id>'}."
)
else:
if "start" not in self.entry_points:
errors.append(
"entry_points must include 'start' mapped to entry_node. "
"Example: {'start': '<entry-node-id>'}."
)
else:
start_node = self.entry_points.get("start")
if start_node != self.entry_node:
errors.append(
f"entry_points['start'] points to '{start_node}' "
f"but entry_node is '{self.entry_node}'. Keep these aligned."
)
for ep_id, nid in self.entry_points.items():
if not isinstance(ep_id, str):
errors.append(
f"Invalid entry_points key {ep_id!r} "
f"({type(ep_id).__name__}). Entry point names must be strings."
)
continue
if not isinstance(nid, str):
errors.append(
f"Invalid entry_points['{ep_id}']={nid!r} "
f"({type(nid).__name__}). Node ids must be strings."
)
continue
if nid not in node_ids:
errors.append(
f"Entry point '{ep_id}' references unknown node '{nid}'. "
f"Known nodes: {sorted(node_ids)}"
)
return {"valid": len(errors) == 0, "errors": errors, "warnings": warnings}
default_agent = MyAgent()
```
## agent.py — Async Entry Points Variant
## triggers.json — Timer and Webhook Triggers
When an agent needs timers, webhooks, or event-driven triggers, add
`async_entry_points` and optionally `runtime_config` as module-level variables.
These are IN ADDITION to the standard variables above.
When an agent needs timers, webhooks, or event-driven triggers, create a
`triggers.json` file in the agent's directory (alongside `agent.py`).
The queen loads these at session start and the user can manage them via
the `set_trigger` / `remove_trigger` tools at runtime.
```python
# Additional imports for async entry points
from framework.graph.edge import GraphSpec, AsyncEntryPointSpec
from framework.runtime.agent_runtime import (
AgentRuntime, AgentRuntimeConfig, create_agent_runtime,
)
# ... (goal, nodes, edges, entry_node, entry_points, etc. as above) ...
# Async entry points — event-driven triggers
async_entry_points = [
# Timer with cron: daily at 9am
AsyncEntryPointSpec(
id="daily-check",
name="Daily Check",
entry_node="process-node",
trigger_type="timer",
trigger_config={"cron": "0 9 * * *"},
isolation_level="shared",
max_concurrent=1,
),
# Timer with fixed interval: every 20 minutes
AsyncEntryPointSpec(
id="scheduled-check",
name="Scheduled Check",
entry_node="process-node",
trigger_type="timer",
trigger_config={"interval_minutes": 20, "run_immediately": False},
isolation_level="shared",
max_concurrent=1,
),
# Event: reacts to webhook events
AsyncEntryPointSpec(
id="webhook-event",
name="Webhook Event Handler",
entry_node="process-node",
trigger_type="event",
trigger_config={"event_types": ["webhook_received"]},
isolation_level="shared",
max_concurrent=10,
),
```json
[
{
"id": "daily-check",
"name": "Daily Check",
"trigger_type": "timer",
"trigger_config": {"cron": "0 9 * * *"},
"task": "Run the daily check process"
},
{
"id": "scheduled-check",
"name": "Scheduled Check",
"trigger_type": "timer",
"trigger_config": {"interval_minutes": 20},
"task": "Run the scheduled check"
},
{
"id": "webhook-event",
"name": "Webhook Event Handler",
"trigger_type": "webhook",
"trigger_config": {"event_types": ["webhook_received"]},
"task": "Process incoming webhook event"
}
]
# Webhook server config (only needed if using webhooks)
runtime_config = AgentRuntimeConfig(
webhook_host="127.0.0.1",
webhook_port=8080,
webhook_routes=[
{
"source_id": "my-source",
"path": "/webhooks/my-source",
"methods": ["POST"],
},
],
)
```
**Key rules for async entry points:**
- `async_entry_points` is a list of `AsyncEntryPointSpec` (NOT `EntryPointSpec`)
- `runtime_config` is `AgentRuntimeConfig` (NOT `RuntimeConfig` from config.py)
- Valid trigger_types: `timer`, `event`, `webhook`, `manual`, `api`
- Valid isolation_levels: `isolated`, `shared`, `synchronized`
**Key rules for triggers.json:**
- Valid trigger_types: `timer`, `webhook`
- Timer trigger_config (cron): `{"cron": "0 9 * * *"}` — standard 5-field cron expression
- Timer trigger_config (interval): `{"interval_minutes": float, "run_immediately": bool}`
- Event trigger_config: `{"event_types": ["webhook_received"], "filter_stream": "...", "filter_node": "..."}`
- Use `isolation_level="shared"` for async entry points that need to read
the primary session's memory (e.g., user-configured rules)
- The `_build_graph()` method passes `async_entry_points` to GraphSpec
- Reference: `exports/gmail_inbox_guardian/agent.py`
- Timer trigger_config (interval): `{"interval_minutes": float}`
- Each trigger must have a unique `id`
- The `task` field describes what the worker should do when the trigger fires
- Triggers are persisted back to `triggers.json` when modified via queen tools
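The rules above for `triggers.json` lend themselves to a quick pre-flight check. The sketch below is illustrative only: `check_triggers` is a hypothetical helper, not a framework function, and it covers just the rules listed here (allowed `trigger_type` values, 5-field cron or numeric `interval_minutes` for timers, unique `id`, non-empty `task`).

```python
import json

VALID_TRIGGER_TYPES = {"timer", "webhook"}


def check_triggers(raw: str) -> list[str]:
    """Return a list of problems found in a triggers.json document."""
    triggers = json.loads(raw)
    if not isinstance(triggers, list):
        return ["top level must be a JSON array of trigger objects"]
    problems, seen = [], set()
    for i, trig in enumerate(triggers):
        if not isinstance(trig, dict):
            problems.append(f"#{i}: trigger must be a JSON object")
            continue
        tid = trig.get("id")
        label = tid or f"#{i}"
        if not tid:
            problems.append(f"{label}: missing id")
        elif tid in seen:
            problems.append(f"{label}: duplicate id")
        seen.add(tid)
        ttype = trig.get("trigger_type")
        if ttype not in VALID_TRIGGER_TYPES:
            problems.append(f"{label}: invalid trigger_type {ttype!r}")
        cfg = trig.get("trigger_config") or {}
        if ttype == "timer":
            cron = cfg.get("cron")
            if cron is not None:
                # Standard cron has five whitespace-separated fields.
                if len(str(cron).split()) != 5:
                    problems.append(f"{label}: cron must have 5 fields")
            elif not isinstance(cfg.get("interval_minutes"), (int, float)):
                problems.append(f"{label}: timer needs cron or interval_minutes")
        if not trig.get("task"):
            problems.append(f"{label}: missing task")
    return problems


sample = json.dumps([
    {"id": "daily-check", "trigger_type": "timer",
     "trigger_config": {"cron": "0 9 * * *"},
     "task": "Run the daily check process"},
    {"id": "daily-check", "trigger_type": "event",
     "trigger_config": {}, "task": ""},
])
for problem in check_triggers(sample):
    print(problem)
```

The second sample entry is deliberately wrong in three ways (reused `id`, a `trigger_type` outside the allowed set, empty `task`), so the loop prints three problems.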
## __init__.py
@@ -428,21 +418,6 @@ __all__ = [
]
```
**If the agent uses async entry points**, also import and export:
```python
from .agent import (
...,
async_entry_points,
runtime_config, # Only if using webhooks
)
__all__ = [
...,
"async_entry_points",
"runtime_config",
]
```
## __main__.py
```python
@@ -498,7 +473,7 @@ def tui():
llm = LiteLLMProvider(model=agent.config.model, api_key=agent.config.api_key, api_base=agent.config.api_base)
runtime = create_agent_runtime(
graph=agent._build_graph(), goal=agent.goal, storage_path=storage,
entry_points=[EntryPointSpec(id="start", name="Start", entry_node="intake", trigger_type="manual", isolation_level="isolated")],
entry_points=[EntryPointSpec(id="start", name="Start", entry_node="process", trigger_type="manual", isolation_level="isolated")],
llm=llm, tools=list(agent._tool_registry.get_tools().values()), tool_executor=agent._tool_registry.get_executor())
await runtime.start()
try:
@@ -534,6 +509,9 @@ if __name__ == "__main__":
## mcp_servers.json
> **Auto-generated.** `initialize_and_build_agent` creates this file with hive-tools
> as the default. Only edit manually to add additional MCP servers.
```json
{
"hive-tools": {
@@ -26,13 +26,12 @@ module-level variables via `getattr()`:
| `edges` | YES | `None` | **FATAL** — same error |
| `entry_node` | no | `nodes[0].id` | Probably wrong node |
| `entry_points` | no | `{}` | **Nodes unreachable** — validation fails |
| `terminal_nodes` | no | `[]` | OK for forever-alive |
| `terminal_nodes` | **YES** | `[]` | **FATAL** — graph must have at least one terminal node |
| `pause_nodes` | no | `[]` | OK |
| `conversation_mode` | no | not passed | Isolated mode (no context carryover) |
| `identity_prompt` | no | not passed | No agent-level identity |
| `loop_config` | no | `{}` | No iteration limits |
| `async_entry_points` | no | `[]` | No async triggers (timers, webhooks, events) |
| `runtime_config` | no | `None` | No webhook server |
| `triggers.json` (file) | no | not present | No triggers (timers, webhooks) |
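
To see why a forgotten export fails quietly, here is a rough sketch of `getattr()`-based loading with the defaults from the table. It is not the real `AgentRunner.load()`: `read_agent_module` is a hypothetical stand-in, and it assumes `nodes` is required in the same way the table says `edges` is.

```python
from types import SimpleNamespace


def read_agent_module(module) -> dict:
    """Collect module-level variables, applying the defaults from the table."""
    nodes = getattr(module, "nodes", None)
    edges = getattr(module, "edges", None)
    if nodes is None or edges is None:
        raise ValueError("agent module must define both 'nodes' and 'edges'")
    result = {
        "nodes": nodes,
        "edges": edges,
        # Falls back to the first node, which is often the wrong one.
        "entry_node": getattr(module, "entry_node", nodes[0].id if nodes else None),
        "entry_points": getattr(module, "entry_points", {}),
        # An empty list is the default, but per the updated table row a graph
        # without a terminal node then fails validation.
        "terminal_nodes": getattr(module, "terminal_nodes", []),
        "pause_nodes": getattr(module, "pause_nodes", []),
        "loop_config": getattr(module, "loop_config", {}),
    }
    # These two are simply not passed along when the export is missing.
    for optional in ("conversation_mode", "identity_prompt"):
        if hasattr(module, optional):
            result[optional] = getattr(module, optional)
    return result


# A package whose __init__.py re-exports only nodes and edges.
module = SimpleNamespace(nodes=[SimpleNamespace(id="process")], edges=[])
loaded = read_agent_module(module)
print(loaded["entry_node"], loaded["entry_points"], "conversation_mode" in loaded)
```

Nothing raises for the missing optional exports, which is exactly the silent fallback the table warns about.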
**CRITICAL:** `__init__.py` MUST import and re-export ALL of these from
`agent.py`. Missing exports silently fall back to defaults, causing
@@ -108,7 +107,7 @@ This prevents premature set_output before user interaction.
### Fewer, Richer Nodes (CRITICAL)
**Hard limit: 2-4 nodes for most agents.** Never exceed 5 unless the user
**Hard limit: 3-6 nodes for most agents.** Never exceed 6 unless the user
explicitly requests a complex multi-phase pipeline.
Each node boundary serializes outputs to shared memory and **destroys** all
@@ -131,13 +130,19 @@ downstream node only sees the serialized summary string.
- A "report" node that presents analysis → merge into the client-facing node
- A "confirm" or "schedule" node that doesn't call any external service → remove
**Typical agent structure (3 nodes):**
**Typical agent structure (2 nodes):**
```
intake (client-facing) ←→ process (autonomous) ←→ review (client-facing)
process (autonomous) ←→ review (client-facing)
```
Or for simpler agents, just 2 nodes:
The queen owns intake — she gathers requirements from the user, then
passes structured input via `run_agent_with_input(task)`. When building
the agent, design the entry node's `input_keys` to match what the queen
will provide at run time. Worker agents should NOT have a client-facing
intake node. Client-facing nodes are for mid-execution review/approval only.
For simpler agents, just 1 autonomous node:
```
interact (client-facing) → process (autonomous) → interact (loop)
process (autonomous) — loops back to itself
```
### nullable_output_keys
@@ -159,8 +164,9 @@ review_node = NodeSpec(
)
```
### Forever-Alive Pattern
`terminal_nodes=[]` — every node has outgoing edges, graph loops until user exits.
### Continuous Loop Pattern
Mark the primary event_loop node as terminal: `terminal_nodes=["process"]`.
The node has `output_keys` and can complete when the agent finishes its work.
Use `conversation_mode="continuous"` to preserve context across transitions.
### set_output
@@ -186,16 +192,16 @@ condition_expr examples:
| Pattern | terminal_nodes | When |
|---------|---------------|------|
| **Forever-alive** | `[]` | **DEFAULT for all agents** |
| Linear | `["last-node"]` | Only if user explicitly requests one-shot/batch |
| **Continuous loop** | `["node-with-output-keys"]` | **DEFAULT for all agents** |
| Linear | `["last-node"]` | One-shot/batch agents |
**Forever-alive is the default.** Always use `terminal_nodes=[]`.
The framework default for `max_node_visits` is 0 (unbounded), so
nodes work correctly in forever-alive loops without explicit override.
Only set `max_node_visits > 0` in one-shot agents with feedback loops.
Every node must have at least one outgoing edge — no dead ends. The
user exits by closing the TUI. Only use terminal nodes if the user
explicitly asks for a batch/one-shot agent that runs once and exits.
**Every graph must have at least one terminal node.** Terminal nodes
define where execution ends. For interactive agents that loop continuously,
mark the primary event_loop node as terminal (it has `output_keys` and can
complete at any point). The framework default for `max_node_visits` is 0
(unbounded), so nodes work correctly in continuous loops without explicit
override. Only set `max_node_visits > 0` in one-shot agents with feedback loops.
Every node must have at least one outgoing edge — no dead ends.
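Both structural rules (at least one terminal node, no dead ends) are easy to assert in a structural test. The sketch below uses plain tuples as a stand-in for `EdgeSpec` objects; `check_graph_shape` is a hypothetical helper, not a framework API.

```python
def check_graph_shape(node_ids, edges, terminal_nodes):
    """Check the two rules: a terminal node exists and no node is a dead end.

    `edges` is an iterable of (source, target) pairs.
    """
    problems = []
    if not terminal_nodes:
        problems.append("graph must have at least one terminal node")
    for t in terminal_nodes:
        if t not in node_ids:
            problems.append(f"terminal node '{t}' not found")
    sources = {source for source, _target in edges}
    for nid in sorted(node_ids):
        if nid not in sources:
            problems.append(f"node '{nid}' has no outgoing edge")
    return problems


nodes = {"process", "handoff"}
looping = [("process", "handoff"), ("handoff", "process")]
print(check_graph_shape(nodes, looping, ["process"]))
print(check_graph_shape(nodes, [("process", "handoff")], []))
```

The first call describes a continuous loop with `process` marked terminal and reports nothing; the second has no terminal node and leaves `handoff` as a dead end.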
## Continuous Conversation Mode
@@ -219,7 +225,7 @@ Only three valid keys:
loop_config = {
"max_iterations": 100, # Max LLM turns per node visit
"max_tool_calls_per_turn": 20, # Max tool calls per LLM response
"max_history_tokens": 32000, # Triggers conversation compaction
"max_context_tokens": 32000, # Triggers conversation compaction
}
```
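A small guard can catch misspelled or outdated `loop_config` keys early. This is a sketch only: `unknown_loop_keys` is a hypothetical helper, and it assumes the renamed `max_context_tokens` key from this change is the accepted spelling.

```python
VALID_LOOP_KEYS = {"max_iterations", "max_tool_calls_per_turn", "max_context_tokens"}


def unknown_loop_keys(loop_config: dict) -> list[str]:
    """Return the keys in loop_config that are not one of the three valid ones."""
    return sorted(set(loop_config) - VALID_LOOP_KEYS)


stale = {"max_iterations": 100, "max_history_tokens": 32000, "timeout": 30}
print(unknown_loop_keys(stale))
```

Here the old `max_history_tokens` name and the invalid `timeout` key are both reported.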
**INVALID keys** (do NOT use): `"strategy"`, `"mode"`, `"timeout"`,
@@ -250,179 +256,43 @@ Multiple ON_SUCCESS edges from same source → parallel execution via asyncio.ga
Judge is the SOLE acceptance mechanism — no ad-hoc framework gating.
## Async Entry Points (Webhooks, Timers, Events)
## Triggers (Timers, Webhooks)
For agents that need to react to external events (incoming emails, scheduled
tasks, API calls), use `AsyncEntryPointSpec` and optionally `AgentRuntimeConfig`.
For agents that react to external events, create a `triggers.json` file
in the agent's export directory:
### Imports
```python
from framework.graph.edge import GraphSpec, AsyncEntryPointSpec
from framework.runtime.agent_runtime import AgentRuntime, AgentRuntimeConfig, create_agent_runtime
```
Note: `AsyncEntryPointSpec` is in `framework.graph.edge` (the graph/declarative layer).
`AgentRuntimeConfig` is in `framework.runtime.agent_runtime` (the runtime layer).
### AsyncEntryPointSpec Fields
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| id | str | required | Unique identifier |
| name | str | required | Human-readable name |
| entry_node | str | required | Node ID to start execution from |
| trigger_type | str | `"manual"` | `webhook`, `api`, `timer`, `event`, `manual` |
| trigger_config | dict | `{}` | Trigger-specific config (see below) |
| isolation_level | str | `"shared"` | `isolated`, `shared`, `synchronized` |
| priority | int | `0` | Execution priority (higher = more priority) |
| max_concurrent | int | `10` | Max concurrent executions |
### Trigger Types
**timer** — Fires on a schedule. Two modes: cron expressions or fixed interval.
Cron (preferred for precise scheduling):
```python
AsyncEntryPointSpec(
id="daily-digest",
name="Daily Digest",
entry_node="check-node",
trigger_type="timer",
trigger_config={"cron": "0 9 * * *"}, # daily at 9am
isolation_level="shared",
max_concurrent=1,
)
```
- `cron` (str) — standard cron expression (5 fields: min hour dom month dow)
- Examples: `"0 9 * * *"` (daily 9am), `"0 9 * * MON-FRI"` (weekdays 9am), `"*/30 * * * *"` (every 30 min)
Fixed interval (simpler, for polling-style tasks):
```python
AsyncEntryPointSpec(
id="scheduled-check",
name="Scheduled Check",
entry_node="check-node",
trigger_type="timer",
trigger_config={"interval_minutes": 20, "run_immediately": False},
isolation_level="shared",
max_concurrent=1,
)
```
- `interval_minutes` (float) — how often to fire
- `run_immediately` (bool, default False) — fire once on startup
**event** — Subscribes to EventBus (e.g., webhook events):
```python
AsyncEntryPointSpec(
id="email-event",
name="Email Event Handler",
entry_node="process-emails",
trigger_type="event",
trigger_config={"event_types": ["webhook_received"]},
isolation_level="shared",
max_concurrent=10,
)
```
- `event_types` (list[str]) — EventType values to subscribe to
- `filter_stream` (str, optional) — only receive from this stream
- `filter_node` (str, optional) — only receive from this node
**webhook** — HTTP endpoint (requires AgentRuntimeConfig):
The webhook server publishes `WEBHOOK_RECEIVED` events on the EventBus.
An `event` trigger type with `event_types: ["webhook_received"]` subscribes
to those events. The flow is:
```
HTTP POST /webhooks/gmail → WebhookServer → EventBus (WEBHOOK_RECEIVED)
→ event entry point → triggers graph execution from entry_node
```json
[
{
"id": "daily-check",
"name": "Daily Check",
"trigger_type": "timer",
"trigger_config": {"cron": "0 9 * * *"},
"task": "Run the daily check process"
}
]
```
**manual** — Triggered programmatically via `runtime.trigger()`.
### Isolation Levels
| Level | Meaning |
|-------|---------|
| `isolated` | Private state per execution |
| `shared` | Eventual consistency — async executions can read primary session memory |
| `synchronized` | Shared with write locks (use when ordering matters) |
For most async patterns, use `shared` — the async execution reads the primary
session's memory (e.g., user-configured rules) and runs its own workflow.
### AgentRuntimeConfig (for webhook servers)
```python
from framework.runtime.agent_runtime import AgentRuntimeConfig
runtime_config = AgentRuntimeConfig(
webhook_host="127.0.0.1",
webhook_port=8080,
webhook_routes=[
{
"source_id": "gmail",
"path": "/webhooks/gmail",
"methods": ["POST"],
"secret": None, # Optional HMAC-SHA256 secret
},
],
)
```
`runtime_config` is a module-level variable read by `AgentRunner.load()`.
The runner passes it to `create_agent_runtime()`. On `runtime.start()`,
if webhook_routes is non-empty, an embedded HTTP server starts.
### Session Sharing
Timer and event triggers automatically call `_get_primary_session_state()`
before execution. This finds the active user-facing session and provides
its memory to the async execution, filtered to only the async entry node's
`input_keys`. This means the async flow can read user-configured values
(like rules, preferences) without needing separate configuration.
### Module-Level Variables
Agents with async entry points must export two additional variables:
```python
# In agent.py:
async_entry_points = [AsyncEntryPointSpec(...), ...]
runtime_config = AgentRuntimeConfig(...) # Only if using webhooks
```
Both must be re-exported from `__init__.py`:
```python
from .agent import (
..., async_entry_points, runtime_config,
)
```
### Reference Agent
See `exports/gmail_inbox_guardian/agent.py` for a complete example with:
- Primary client-facing intake node (user configures rules)
- Timer-based scheduled inbox checks (every 20 min)
- Webhook-triggered email event handling
- Shared isolation for memory access across streams
## Framework Capabilities
**Works well:** Multi-turn conversations, HITL review, tool orchestration, structured outputs, parallel execution, context management, error recovery, session persistence.
**Limitations:** LLM latency (2-10s/turn), context window limits (~128K), cost per run, rate limits, node boundaries lose context.
**Not designed for:** Sub-second responses, millions of items, real-time streaming, guaranteed determinism, offline/air-gapped.
### Key Fields
- `trigger_type`: `"timer"` or `"webhook"`
- `trigger_config`: `{"cron": "0 9 * * *"}` or `{"interval_minutes": 20}`
- `task`: describes what the worker should do when the trigger fires
- Triggers can also be created/removed at runtime via `set_trigger` / `remove_trigger` queen tools
## Tool Discovery
Do NOT rely on a static tool list — it will be outdated. Always use
`list_agent_tools()` to get available tool names grouped by category.
For full schemas with parameter details, use `discover_mcp_tools()`.
Do NOT rely on a static tool list — it will be outdated. Always call
`list_agent_tools()` with NO arguments first to see ALL available tools.
Only use `group=` or `output_schema=` as follow-up calls after seeing the
full list.
```
list_agent_tools() # all available tools
list_agent_tools("exports/my_agent/mcp_servers.json") # specific agent
discover_mcp_tools() # full schemas with params
list_agent_tools() # ALWAYS call this first
list_agent_tools(group="gmail", output_schema="full") # then drill into a category
list_agent_tools("exports/my_agent/mcp_servers.json") # specific agent's tools
```
After building, validate tools exist: `validate_agent_tools("exports/{name}")`
After building, run `validate_agent_package("{name}")` to check everything at once.
Common tool categories (verify via list_agent_tools):
- **Web**: search, scrape, PDF
@@ -21,7 +21,7 @@ Do NOT use GCU for:
- Same underlying `EventLoopNode` class — no new imports needed
- `tools=[]` is correct — tools are auto-populated at runtime
## GCU Architecture Pattern
## GCU Architecture Pattern
GCU nodes are **subagents** — invoked via `delegate_to_sub_agent()`, not connected via edges.
@@ -109,6 +109,45 @@ Key rules to bake into GCU node prompts:
- Keep tool calls per turn ≤10
- Tab isolation: when browser is already running, use `browser_open(background=true)` and pass `target_id` to every call
## Multiple Concurrent GCU Subagents
When a task can be parallelized across multiple sites or profiles, declare a distinct GCU
node for each and invoke them all in the same LLM turn. The framework batches all
`delegate_to_sub_agent` calls made in one turn and runs them with `asyncio.gather`, so
they execute concurrently — not sequentially.
**Each GCU subagent automatically gets its own isolated browser context** — no `profile=`
argument is needed in tool calls. The framework derives a unique profile from the subagent's
node ID and instance counter and injects it via an asyncio `ContextVar` before the subagent
runs.
### Example: three sites in parallel
```python
# Three distinct GCU nodes
gcu_site_a = NodeSpec(id="gcu-site-a", node_type="gcu", ...)
gcu_site_b = NodeSpec(id="gcu-site-b", node_type="gcu", ...)
gcu_site_c = NodeSpec(id="gcu-site-c", node_type="gcu", ...)
orchestrator = NodeSpec(
id="orchestrator",
node_type="event_loop",
sub_agents=["gcu-site-a", "gcu-site-b", "gcu-site-c"],
system_prompt="""\
Call all three subagents in a single response to run them in parallel:
delegate_to_sub_agent(agent_id="gcu-site-a", task="Scrape prices from site A")
delegate_to_sub_agent(agent_id="gcu-site-b", task="Scrape prices from site B")
delegate_to_sub_agent(agent_id="gcu-site-c", task="Scrape prices from site C")
""",
)
```
**Rules:**
- Use distinct node IDs for each concurrent task — sharing an ID shares the browser context.
- The GCU node prompts do not need to mention `profile=`; isolation is automatic.
- Cleanup is automatic at session end, but GCU nodes can call `browser_stop()` explicitly
if they want to release resources mid-run.
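The automatic isolation relies on standard `contextvars` behavior: `asyncio.gather` wraps each coroutine in its own task, and each task gets a copy of the current context. The sketch below demonstrates that mechanism in isolation. The variable name `current_profile`, the `run_subagent` function, and the `"{agent_id}-{instance}"` profile format are assumptions for illustration, not the framework's actual code.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical variable; the framework's real ContextVar is not shown here.
current_profile: ContextVar[str] = ContextVar("current_profile", default="default")


async def run_subagent(agent_id: str, instance: int) -> str:
    # gather() runs each coroutine as its own Task with a copied context,
    # so this set() is invisible to the sibling subagents.
    current_profile.set(f"{agent_id}-{instance}")
    await asyncio.sleep(0)  # yield so the three tasks interleave
    return current_profile.get()


async def main() -> list[str]:
    return await asyncio.gather(
        run_subagent("gcu-site-a", 1),
        run_subagent("gcu-site-b", 1),
        run_subagent("gcu-site-c", 1),
    )


profiles = asyncio.run(main())
print(profiles)  # ['gcu-site-a-1', 'gcu-site-b-1', 'gcu-site-c-1']
```

Each task reads back its own value even though all three wrote to the same variable, which is why a hardcoded `profile=` argument is unnecessary.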
## GCU Anti-Patterns
- Using `browser_screenshot` to read text (use `browser_snapshot`)
@@ -0,0 +1,63 @@
# Queen Memory — File System Structure
```
~/.hive/
├── queen/
│ ├── MEMORY.md ← Semantic memory
│ ├── memories/
│ │ ├── MEMORY-2026-03-09.md ← Episodic memory (today)
│ │ ├── MEMORY-2026-03-08.md
│ │ └── ...
│ └── session/
│ └── {session_id}/ ← One dir per session (or resumed-from session)
│ ├── conversations/
│ │ ├── parts/
│ │ │ ├── 00001.json ← One file per message (role, content, tool_calls)
│ │ │ ├── 00002.json
│ │ │ └── ...
│ │ └── spillover/
│ │ ├── conversation_1.md ← Compacted old conversation segments
│ │ ├── conversation_2.md
│ │ └── ...
│ └── data/
│ ├── adapt.md ← Working memory (session-scoped)
│ ├── web_search_1.txt ← Spillover: large tool results
│ ├── web_search_2.txt
│ └── ...
```
---
## The three memory tiers
| File | Tier | Written by | Read at |
|---|---|---|---|
| `MEMORY.md` | Semantic | Consolidation LLM (auto, post-session) | Session start (injected into system prompt) |
| `memories/MEMORY-YYYY-MM-DD.md` | Episodic | Queen via `write_to_diary` tool + consolidation LLM | Session start (today's file injected) |
| `data/adapt.md` | Working | Queen via `update_session_notes` tool | Every turn (inlined in system prompt) |
---
## Session directory naming
The session directory name is **`queen_resume_from`** when a cold-restore resumes an existing
session, otherwise the new **`session_id`**. This means resumed sessions accumulate all messages
in the original directory rather than fragmenting across multiple folders.
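The naming rule reduces to a single fallback. A minimal sketch, assuming the directory layout from the tree above; `session_dir` and its `hive_home` override are hypothetical names used only for illustration.

```python
from pathlib import Path
from typing import Optional


def session_dir(
    session_id: str,
    queen_resume_from: Optional[str] = None,
    hive_home: Optional[Path] = None,
) -> Path:
    """Resolve the session directory: the resumed session wins if present."""
    base = hive_home or Path.home() / ".hive"
    return base / "queen" / "session" / (queen_resume_from or session_id)


home = Path("/tmp/hive")
print(session_dir("s2", hive_home=home))
print(session_dir("s2", queen_resume_from="s1", hive_home=home))
```

A fresh session resolves to its own ID, while a cold-restored one keeps writing under the original session's directory.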
---
## Consolidation
`consolidate_queen_memory()` runs every **5 minutes** in the background and once more at session
end. It reads:
1. `conversations/parts/*.json` — full message history (user + assistant turns; tool results skipped)
2. `data/adapt.md` — current working notes
It then makes two LLM writes:
- Rewrites `MEMORY.md` in place (semantic memory — queen never touches this herself)
- Rewrites today's `memories/MEMORY-YYYY-MM-DD.md` in place with a merged, deduplicated diary entry (the consolidation code replaces the file rather than appending to it)
If the combined transcript exceeds ~200 K characters it is recursively binary-compacted via the
LLM before being sent to the consolidation model (mirrors `EventLoopNode._llm_compact`).
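One plausible reading of "recursively binary-compacted" is sketched below. It is not the implementation of `EventLoopNode._llm_compact`: `binary_compact` is a hypothetical function, and `summarize` is a stand-in for the LLM call, which must return text shorter than its input for the recursion to terminate.

```python
from typing import Callable


def binary_compact(text: str, limit: int, summarize: Callable[[str], str]) -> str:
    """Shrink text to at most `limit` characters by halving and summarizing."""
    if len(text) <= limit:
        return text  # small enough to send as-is
    mid = len(text) // 2
    # Reduce each half until it fits, then summarize it.
    left = summarize(binary_compact(text[:mid], limit, summarize))
    right = summarize(binary_compact(text[mid:], limit, summarize))
    # Re-check the merged result; it may still be too long.
    return binary_compact(left + "\n" + right, limit, summarize)


def fake_summarize(chunk: str) -> str:
    # Stand-in for the LLM: keep roughly the first quarter of the chunk.
    return chunk[: max(1, len(chunk) // 4)]


transcript = "x" * 1000
compacted = binary_compact(transcript, 100, fake_summarize)
print(len(compacted) <= 100)
```

Splitting first means the summarizer never receives a chunk larger than `limit`, which is the point of compacting before the consolidation call.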
