Compare commits

...

536 Commits

Author SHA1 Message Date
Timothy e3c71f77de chore: fix ruff format 2026-02-02 17:37:37 -08:00
Timothy b09824faec chore: fix lint 2026-02-02 17:36:02 -08:00
Lakshitaa Chellaramani 0715fc5498 Merge branch 'main' into feature/github-tool 2026-01-31 23:31:22 +05:30
lakshitaa f9fddd6663 fix(github-tool): Address PR feedback - security and integration fixes
Addresses all blockers and suggestions from code review:

**Blockers fixed:**
1. Register tools in tools/__init__.py - Added import, registration call,
   and all 13 tool names to the return list
2. Add credential spec - Created GitHub entry in credentials/integrations.py
   with env_var, tools list, help URL, and health check config
3. Move tests to correct location - Relocated from
   tools/src/.../github_tool/tests/ to tools/tests/tools/test_github_tool.py
4. Removed .claude/settings.local.json from PR

**Security improvements:**
1. URL parameter sanitization - Added _sanitize_path_param() to reject
   path traversal attempts (/ or ..) in owner, repo, branch, username params
2. Error message sanitization - Added _sanitize_error_message() to prevent
   token leaks from httpx.RequestError exceptions

All 38 tests passing.
2026-01-31 23:26:33 +05:30
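
A minimal sketch of the two sanitization helpers described above. The function names come from the commit message; the error type, token pattern, and exact checks are assumptions:

    import re

    class ToolError(ValueError):
        """Hypothetical error type for rejected input."""

    def _sanitize_path_param(value: str, name: str = "param") -> str:
        # Reject path traversal attempts before the value is interpolated
        # into a GitHub API URL path (owner, repo, branch, username).
        if not value or "/" in value or "\\" in value or ".." in value:
            raise ToolError(f"Invalid {name}: {value!r}")
        return value

    def _sanitize_error_message(message: str) -> str:
        # Redact anything that looks like a GitHub token so httpx request
        # errors can be surfaced without leaking credentials.
        return re.sub(r"\bgh[pous]_[A-Za-z0-9]+", "[REDACTED]", message)
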
Muzzaiyyan Hussain 58b60b84fd fix: make agent builder exports atomic (#2605)
* fix: make agent builder exports atomic
2026-01-31 17:59:31 +08:00
Hundao 86aef3319f fix(ci): apply ruff format to csv_tool.py (#2910) 2026-01-31 17:50:15 +08:00
RichardTang-Aden 0015b3d43d Merge pull request #1873 from DhruvPokhriyal/bugfix/csv_read-negative-offset
fix: validate non-negative limit and offset in csv_read function
2026-01-30 20:12:39 -08:00
RichardTang-Aden 9c4d44c057 Merge pull request #2205 from mishrapravin114/fix/1390-pdf-read-max-pages
pdf_read: surface truncation when exceeding max_pages
2026-01-30 20:12:01 -08:00
RichardTang-Aden 800c7fbe11 Merge pull request #2316 from NicklausFW/1843-csv-write-fail-without-parent-dir
fix(csv): handle csv_write with no parent directory
2026-01-30 20:11:47 -08:00
Timothy @aden 291ba24229 Merge pull request #2832 from adenhq/feat/node-conversation-class-WP-6-
nodeConversation Class
2026-01-30 19:01:30 -08:00
RichardTang-Aden ffa4096390 Merge pull request #2601 from Hundao/feat/email-tool
[Integration] feat(tools): add email service tool with Resend provider
2026-01-30 16:32:16 -08:00
bryan f2b6fc6948 linter updates 2026-01-30 16:18:48 -08:00
bryan acff8a0ece nodeConversation Class 2026-01-30 16:16:34 -08:00
Richard Tang 347c222f78 fix: quickstart compatibility 2026-01-30 16:07:05 -08:00
lakshitaa bfb660275e feat(tools): Add GitHub tool for repository and issue management
Implements comprehensive GitHub REST API v3 integration with 15 MCP tools
for managing repositories, issues, pull requests, code search, and branches.

Features:
- Repository management (list, get, search repos)
- Issue operations (create, update, close, list issues)
- Pull request management (create, list, get PRs)
- Code search across GitHub
- Branch operations (list, get branch info)

Technical details:
- 15 MCP tools organized in 5 categories
- 38 comprehensive tests with mocking (all passing)
- Full credential store support (env var + CredentialStoreAdapter)
- Proper error handling (timeout, network, API errors)
- Follows HubSpot/Slack tool patterns exactly

Files:
- tools/src/aden_tools/tools/github_tool/github_tool.py (757 lines)
- tools/src/aden_tools/tools/github_tool/tests/test_github_tool.py (628 lines)
- tools/src/aden_tools/tools/github_tool/README.md (646 lines)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-31 04:32:46 +05:30
Timothy f58619e378 Merge branch 'main' into feat/email-tool 2026-01-30 14:00:54 -08:00
RichardTang-Aden 472cfe1437 Merge pull request #2815 from RichardTang-Aden/main
Docs: improving Q&A and hive features
2026-01-30 13:57:46 -08:00
Richard Tang 8b7efe27c1 docs: updated hive descriptions 2026-01-30 13:57:16 -08:00
Timothy eb00c10d9b Merge remote-tracking branch 'origin/main' into feat/email-tool 2026-01-30 13:56:15 -08:00
Richard Tang 71249f4f88 docs: updated q&a and why aden 2026-01-30 13:54:37 -08:00
Timothy 0beeda3eec fix: email tool to credential store 2026-01-30 13:54:01 -08:00
lakshitaa d6ae48bc58 Merge upstream/main 2026-01-31 03:19:12 +05:30
Timothy @aden dc4a40468b Merge pull request #2808 from TimothyZhang7/feature/credential-manager-aden-provider
Feature/credential manager aden provider
2026-01-30 13:32:45 -08:00
Timothy 7fa2295d30 fix: ruff format issue 2026-01-30 13:27:29 -08:00
Timothy 756f013ecd fix: mcp test case 2026-01-30 13:24:23 -08:00
Richard Tang a963d49306 docs: remove duplicated run agent command 2026-01-30 13:23:02 -08:00
Timothy 4b00852bdf Merge remote-tracking branch 'origin/main' into feature/credential-manager-aden-provider 2026-01-30 13:18:11 -08:00
RichardTang-Aden b9b1731dc1 Merge pull request #2807 from RichardTang-Aden/main
Docs: Update instruction for tools/integration contribution
2026-01-30 13:06:13 -08:00
Richard Tang 34791e6bbd docs: update issue links 2026-01-30 13:04:54 -08:00
Richard Tang d1ebdfc92f docs: tools contribution guide 2026-01-30 12:59:56 -08:00
austin931114 33040b7978 Merge pull request #1316 from Shivraj12/fix/tool-registry-invalid-json
fix(tool_registry): handle invalid JSON returned by tools
2026-01-30 21:43:32 +01:00
austin931114 3b6b6c48a5 Merge pull request #919 from Siddharth2624/chore/validation-error-message
docs: clarify illustrative output sanitization example
2026-01-30 21:32:39 +01:00
Timothy c3fddd3c8c fix: deprecate credential manager 2026-01-30 12:28:27 -08:00
Richard Tang 41e5558715 docs: update readme 2026-01-30 12:24:16 -08:00
austin931114 58969085bf Merge pull request #1816 from NicklausFW/1277-execution-quality-tracking
fix(executor): add execution quality tracking to expose retry metrics
2026-01-30 21:15:23 +01:00
austin931114 f45ad2d543 Merge pull request #1656 from hrshmakwana/fix/setup-creates-exports
fix(micro-fix): setup script now creates missing exports directory (#1645)
2026-01-30 21:01:08 +01:00
hundao 0030d6b499 feat(tools): add cc/bcc support to email tool
Add optional cc and bcc parameters to send_email and
send_budget_alert_email. Empty strings and whitespace-only values are
filtered out via _normalize_recipients to prevent invalid payloads.
2026-01-30 18:35:42 +08:00
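
A sketch of the recipient normalization; the helper name is from the commit, the exact behavior shown is an assumption:

    def _normalize_recipients(recipients: list[str] | None) -> list[str]:
        # Drop empty and whitespace-only entries so cc/bcc never produce
        # an invalid provider payload.
        if not recipients:
            return []
        return [r.strip() for r in recipients if r and r.strip()]
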
hundao 5f019f44ca feat(tools): add email service tool with Resend provider
Integrate a mail service to enable email notifications for budget alerts.
Closes #7.

New tools:
- send_email: general-purpose email sending with multi-provider support
- send_budget_alert_email: formatted budget alert notifications with
  severity levels (INFO/WARNING/CRITICAL/EXCEEDED)

Architecture:
- Multi-provider pattern (matching web_search_tool), Resend as primary
- from_email resolved via explicit param or EMAIL_FROM env var
- Credential integration via CredentialManager with env var fallback

Also fixes: web_scrape_tool test mock missing content-type header
2026-01-30 18:35:42 +08:00
Hundao 0d602f92a3 fix(ci): add missing content-type header in scrape test mock and format mcp_client (#2612) 2026-01-30 18:33:37 +08:00
RichardTang-Aden b10d617166 Merge pull request #638 from Sourabsb/fix/run-async-is-running-check
fix: add is_running() and is_closed() checks to _run_async() to prevent deadlock
2026-01-29 21:17:58 -08:00
RichardTang-Aden 348c646bab Merge pull request #534 from Sourabsb/fix/mcp-client-resource-leak
fix: properly close MCP session and STDIO context managers in disconnect()
2026-01-29 21:06:30 -08:00
RichardTang-Aden a8243e6746 Merge pull request #274 from dithzz/fix/cmd-list-keyerror-steps
fix(cli): fix KeyError 'steps' in cmd_list function
2026-01-29 20:45:20 -08:00
RichardTang-Aden 9368828f94 Merge pull request #189 from RussellLuo/fix-mcp-servers-parsing
fix(skills): load MCP servers correctly
2026-01-29 18:27:24 -08:00
RichardTang-Aden 51e9a3ecdf Merge branch 'main' into fix-mcp-servers-parsing 2026-01-29 18:26:49 -08:00
Timothy 2f03605980 fix: change to production api endpoint 2026-01-29 17:53:47 -08:00
RichardTang-Aden 74e754b4e1 Merge pull request #2496 from RichardTang-Aden/main
Docs: Update Roadmap and Mermaid chart
2026-01-29 17:49:35 -08:00
RichardTang-Aden f332e40000 Merge pull request #2486 from adenhq/chore-docs-update
Docs: updating documentation
2026-01-29 17:49:22 -08:00
Richard Tang d6064147e4 chore: update mermaid chart type 2026-01-29 17:44:24 -08:00
bryan 1fb5005bf5 removing .env.example from tools 2026-01-29 17:43:24 -08:00
bryan 57fbb0479b remove env example 2026-01-29 17:37:32 -08:00
RichardTang-Aden 26154cc648 Merge pull request #2212 from MuzzaiyyanHussain/docs/i18n-hindi
docs(i18n): add Hindi (हिंदी) README translation
2026-01-29 17:23:34 -08:00
Richard Tang e207cee4ff feat: update mermaid chart 2026-01-29 17:07:54 -08:00
Richard Tang e7a2d957f5 chore: update roadmap to reflect recent direction calibration 2026-01-29 16:48:31 -08:00
bryan 7e5f02eebe updating documentation 2026-01-29 16:12:16 -08:00
Timothy 248716c093 feat: credential store auto sync 2026-01-29 14:37:20 -08:00
Muzzaiyyan Hussain 37a3fce27d Translated the last line to Hindi 2026-01-30 01:41:13 +05:30
Muzzaiyyan Hussain 7976c1dac7 linked the translated Hindi version hi.md into the main readme 2026-01-30 00:30:26 +05:30
Aden HQ da2bac1b48 Merge pull request #2414 from RichardTang-Aden/main
fix: litellm missing from tools dependencies; quickstart.sh only vali…
2026-01-29 10:30:23 -08:00
Richard Tang 4096eba564 fix: litellm missing from tools dependencies; quickstart.sh only validates tools venv 2026-01-29 10:20:19 -08:00
RichardTang-Aden 3f3a23e4b2 Merge pull request #2314 from Sourabsb/fix/remove-debug-print-statements-v2
micro-fix: remove debug print statements that leak API key
2026-01-29 07:56:45 -08:00
RichardTang-Aden 934e3145b8 Merge pull request #2348 from JVSCHANDRADITHYA/main
(Micro-Fix) CLI crash when exports/ directory is missing
2026-01-29 07:56:18 -08:00
RichardTang-Aden 6155ccbf4d Merge pull request #2037 from shivamhwp/conductorchicago
Feat(quick-start): Address PR #716 review feedback: MCP config, Python version, venv docs
2026-01-29 07:45:36 -08:00
Chandradithya Janaswami 6cadc81be8 Merge branch 'adenhq:main' into main 2026-01-29 21:12:00 +05:30
RichardTang-Aden 412521edb0 Merge branch 'main' into conductorchicago 2026-01-29 07:40:53 -08:00
Nicklaus Wibowo ec3be40ddd fix(csv): handle csv_write with no parent directory
Guard against empty parent_dir when path has no directory component (e.g., 'data.csv'). Prevents FileNotFoundError from os.makedirs(''). Adds test coverage for root-level file writes.
2026-01-29 20:44:27 +07:00
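
Roughly what the guard amounts to (a sketch, not the PR's exact code):

    import os

    def _ensure_parent_dir(path: str) -> None:
        # os.path.dirname("data.csv") == "" and os.makedirs("") raises
        # FileNotFoundError, so only create a directory when there is one.
        parent_dir = os.path.dirname(path)
        if parent_dir:
            os.makedirs(parent_dir, exist_ok=True)
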
Sourabsb fd00471189 fix: remove debug print statements that leak API key to stdout 2026-01-29 18:57:33 +05:30
Muzzaiyyan Hussain 65c3fcf76d docs(i18n): add Hindi (हिंदी) README translation 2026-01-29 14:23:40 +05:30
mishrapravin114 83f77af2ab pdf_read: surface truncation when exceeding max_pages 2026-01-29 13:38:16 +05:30
suhanijindal 2fe83187d6 fix: add Python 3.13 classifier to tools/pyproject.toml (#1780)
Co-authored-by: United IT Services <uniteditservices@Uniteds-MacBook-Air.local>
2026-01-29 15:57:19 +08:00
Ayush Pandey e65052c237 fix(core): explicitly set utf-8 encoding for storage and testing backends (#641) 2026-01-29 14:07:37 +08:00
Mrunal Nirajkumar Shah 38bc7c12ae fix(setup): Fixes python and pip version detection and mismatch (#1190)
* Fixed Python and pip version mismatch with robust code #476

- Ensured python version is found across all available python interpreters including python3, python, and py -3, and made it robust for easy interpreter add-on.
- Ensured that pip is found for the respective python interpreter.
- Generalized some variables like PYTHON_VERSION for flexibility.
- Added a split of PYTHON_VERSION into major and minor versions to make the code more robust.
- Added clear documentation throughout the code.

Related to issue #476

* fix(setup): Code fixes raised during review by @Hundao

- PYTHON_CMD initialized to no value (blank). Fixes the bug
- PYTHON_VERSION used to generalize is changed to REQUIRED_PYTHON_VERSION due to name collision
- quotes added to "${POSSIBLE_PYTHONS[@]}" so py -3 can work.

Pending:
eval related issues pending.

* fix(setup): Code fixes raised during review by @Hundao

- eval removed altogether.
- py -3 is replaced with py in POSSIBLE_PYTHONS, and will be replaced to py -3 after the interpreter selection.

* fix(setup): Code fixes raised during review by @bryanadenhq

- Implemented Array and refactored entire code. PYTHON_CMD is changed at all places in the entire code.
- Redundant code is removed, design changed a bit for user understanding. (See Screenshots)
- Using 2>&1 as standard. Fixed the mismatch in code style.
2026-01-29 14:06:54 +08:00
Sourabsb 758c5157b8 Merge upstream/main and resolve conflicts 2026-01-29 10:17:44 +05:30
Sourabsb ce6b47c0d4 fix: resolve all lint issues in mcp_client.py 2026-01-29 10:10:11 +05:30
Shivam Sharma 22c95b62ce quickstart: auto-install uv and pick Python >=3.11 2026-01-29 08:49:02 +05:30
Shivam Sharma 9684311176 Merge upstream/main 2026-01-29 08:48:43 +05:30
Timothy aa0fff8ac5 fix: use credential store by default 2026-01-28 18:51:20 -08:00
RichardTang-Aden a1229d8e98 Merge pull request #1945 from Anshu-bhatt/feature/add-python-version-file
micro-fix: add .python-version for automatic Python version detection
2026-01-28 18:08:51 -08:00
RichardTang-Aden ad1b10db63 Merge pull request #1602 from magiawala/fix/workflowbuilder-import-error
[micro-fix] Correct WorkflowBuilder import to GraphBuilder in MCP example
2026-01-28 18:04:09 -08:00
RichardTang-Aden 96308637d6 Merge pull request #1745 from Aman030304/fix/runner-logging
Refactor: Replace print with logging in AgentRunner
2026-01-28 18:00:28 -08:00
Timothy e8a4cc908c Merge branch 'feature/hubspot-integration' into feature/credential-manager-aden-provider 2026-01-28 17:51:37 -08:00
Timothy 3c8ac436bd fix: onboarding experience 2026-01-28 17:49:13 -08:00
Shivam Sharma 4d341611a4 Merge upstream/main and resolve setup-python.sh conflict
Resolved conflict in scripts/setup-python.sh by keeping upstream's
improved formatting with color codes and ${PYTHON_CMD} variable.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 06:35:48 +05:30
Bryan @ Aden ef94bfe1fb Merge pull request #2042 from adenhq/fix/ruff-tests
(micro-fix): fix lint
2026-01-28 16:44:11 -08:00
bryan a58b52f420 fix lint 2026-01-28 16:39:40 -08:00
Bryan @ Aden 7852990073 Merge pull request #519 from vakrahul/perf/heuristic-json-repair
style: fix line length violation in output_cleaner.py
2026-01-28 16:30:01 -08:00
Bryan @ Aden 14c9478080 Merge pull request #2040 from adenhq/fix/ruff-tests
(micro-fix): ruff fix
2026-01-28 16:21:58 -08:00
bryan c5ebd91651 ruff fix 2026-01-28 16:19:49 -08:00
Bryan @ Aden 088f3cc817 Merge pull request #1444 from tjsasakifln/feat/1334-root-cli-entry-point
feat(cli): add root hive CLI entry point to eliminate PYTHONPATH
2026-01-28 16:18:14 -08:00
Bryan @ Aden 50087bb24c Merge pull request #1366 from tjsasakifln/fix/1332-rewrite-configuration-docs
docs(configuration): rewrite configuration.md to reflect actual Python framework architecture
2026-01-28 16:17:14 -08:00
Timothy @aden ca06465305 Merge branch 'main' into feature/hubspot-integration 2026-01-28 16:06:33 -08:00
Timothy ea719d5441 Merge branch 'main' into feature/credential-manager-aden-provider 2026-01-28 16:03:25 -08:00
Timothy 2627b6e69c fix: aden client 2026-01-28 16:03:10 -08:00
RichardTang-Aden c869e1955a Merge pull request #1934 from mishrapravin114/fix/auto-close-circular-duplicate
Fix/auto close circular duplicate
2026-01-28 15:44:45 -08:00
RichardTang-Aden 8293f75152 Merge pull request #295 from Invens/fix/setup-python-detect-311
Fix/setup python detect 311
2026-01-28 15:35:08 -08:00
Shivam Sharma 3ccf4bc383 Address PR review feedback
- Restore MCP server configurations in .mcp.json with updated paths
  for separate virtual environments (core/.venv and tools/.venv)
- Align Python version: change .python-version from 3.13 to 3.11
  to match pyproject.toml target version
- Remove AGENTS.md as suggested (quickstart is sufficient)
- Document cross-package imports and separate venv architecture
  in ENVIRONMENT_SETUP.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 04:26:12 +05:30
Shivam Sharma e71d850b79 Merge remote-tracking branch 'origin/main' into conductorchicago
Resolved conflict in tools/pyproject.toml by keeping the expanded format
with sql dependency from main.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 04:19:16 +05:30
Timothy @aden 774911b46c Merge pull request #2012 from TimothyZhang7/feature/credential-manager-aden-provider
chore: aden provider factory method
2026-01-28 14:22:45 -08:00
Timothy 480ade22ce chore: aden provider factory method 2026-01-28 14:04:08 -08:00
RichardTang-Aden bd31323876 Merge pull request #1999 from Jai-Harrish/docs/remove-stray-orchestration-text
docs: remove stray 'orchestration' text from project structure
2026-01-28 13:54:33 -08:00
RichardTang-Aden 2f3b8b27b8 Merge pull request #1973 from ryanbijoy/fix/docs-link
Docs: Fixed the .claude/ url to open the right file
2026-01-28 13:54:08 -08:00
RichardTang-Aden d39abf4312 Merge pull request #1909 from ayushigithub12/fix/csv-read-total-row1850
Fix/csv read total row1850
2026-01-28 13:52:07 -08:00
RichardTang-Aden ec7058414f Merge pull request #1957 from ryanbijoy/fix/readme-links
Docs: Fixed the link to the correct URL - docs/architecture/README.md
2026-01-28 13:50:28 -08:00
RichardTang-Aden 8dc63771ca Merge pull request #2003 from adenhq/chore-add-micro-fix-requirements
chore: add-micro-fix-requirements
2026-01-28 13:39:02 -08:00
Richard Tang 434f1d7298 chore: add-micro-fix-requirements 2026-01-28 13:33:29 -08:00
ryanbijoy ee0ae20d06 Merge branch 'main' into fix/readme-links 2026-01-29 02:49:22 +05:30
RichardTang-Aden a7e16c84a5 Merge pull request #1839 from SH-Nihil-Mukkesh-25/micro-fix/roadmap-header
micro-fix(docs): add markdown header to ROADMAP
2026-01-28 13:16:39 -08:00
Jai Harrish A eaa54d9d4a docs: remove stray 'orchestration' text from project structure 2026-01-28 21:14:15 +00:00
RichardTang-Aden 2c4d034536 Merge pull request #1896 from aarav-shukla07/chore/remove-honeycomb-references
chore: remove references to archived honeycomb frontend
2026-01-28 12:43:13 -08:00
RichardTang-Aden a43b7c9403 Merge pull request #1981 from adenhq/docs-i18n-readme
chore: re-organize readmes
2026-01-28 12:32:07 -08:00
Richard Tang 752979da01 chore: re-organize readmes 2026-01-28 12:29:27 -08:00
Timothy @aden c4be938b7f Merge pull request #1960 from TimothyZhang7/feature/credential-manager-aden-provider
feature: aden sync provider for credential store
2026-01-28 12:27:51 -08:00
Timothy 3a308ba67e fix: load aden provider api key from default env var 2026-01-28 12:23:35 -08:00
ryanbijoy cadf401f23 micro-fix: Fixed the .claude/ url to open the right file 2026-01-29 01:40:50 +05:30
ryanbijoy 24dd41410a micro-fix: Fixed the link to the correct URL - docs/architecture/README.md 2026-01-29 01:25:37 +05:30
Timothy 2abf43ed21 feature: aden sync provider for credential store 2026-01-28 11:54:14 -08:00
Anshu-bhatt 2e5ed77909 micro-fix: touch .python-version to reopen PR 2026-01-29 00:47:20 +05:30
Anshu-bhatt 0ae0bfda83 chore(dx): add .python-version for automatic Python version detection 2026-01-29 00:36:09 +05:30
mishrapravin114 22007e7aa9 chore: remove cross-verify doc from PR 2026-01-29 00:26:39 +05:30
mishrapravin114 05dde7414f fix(workflow): prevent circular duplicate closure in auto-close script
- Skip closing an issue as duplicate of another that is already closed
  (avoids circular closure when bot and human close in opposite order).
- Skip when duplicate target is self (same issue number).
- Extract testable helpers: isDupeComment, isDupeCommentOldEnough,
  authorDisagreedWithDupe, getLastDupeComment, decideAutoClose.
- Add 23 unit tests (Bun) and run them in CI before auto-close step.
- Add scripts/AUTO_CLOSE_DUPLICATES_CROSS_VERIFY.md for impact summary.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-01-29 00:22:23 +05:30
ayushi sharma 721cfb1ac8 Fixed incorrect csv_read total_rows for CSV files 2026-01-29 00:03:39 +05:30
ayushi sharma 5973168a8c Fixed incorrect csv_read total_rows for CSV files 2026-01-28 23:58:26 +05:30
Tiago Sasaki 56ed24a092 feat(cli): add root hive CLI entry point to eliminate PYTHONPATH requirement
Fixes #1334
2026-01-28 15:22:25 -03:00
Tiago Sasaki ca031f3ee1 docs(configuration): rewrite to reflect actual Python framework architecture
Fixes #1332
2026-01-28 15:17:59 -03:00
Bryan @ Aden c9f3de1af6 Merge pull request #1752 from ebrahimzaher/docs/add-alpine-support
docs: add setup instructions for Alpine Linux users
2026-01-28 09:29:12 -08:00
Bryan @ Aden d8d4b9399e Merge pull request #1871 from adenhq/fix/ruff-tests
(micro-fix): fixing linter
2026-01-28 09:28:44 -08:00
bryan 30bf1da424 fixing linter 2026-01-28 09:26:27 -08:00
Bryan @ Aden 6712fa9a8a Merge pull request #1855 from lachhmansingh16/fix/hive-list-crash
fix(cli): micro-fix to prevent hive list crash
2026-01-28 09:21:15 -08:00
Bryan @ Aden 2306b13fdc Merge pull request #1817 from Vikasverma9515/fix/tools-dependency-tests
fix(tools): handle optional duckdb dependency and update credential tests
2026-01-28 09:21:02 -08:00
Timothy @aden 14907a7c6e Merge pull request #1836 from kunnaaalll/chore/harden-triage-bot
chore(ci): harden triage bot against low-quality AI spam
2026-01-28 09:17:20 -08:00
DhruvPokhriyal 967cbf814b fix: validate non-negative limit and offset in csv_read function 2026-01-28 22:46:56 +05:30
Vikas Verma 0dfec38b4b fix: remove duplicate import in test_csv_tool.py 2026-01-28 22:39:18 +05:30
Lachhman Singh 9ad4702c08 style: fix indentation alignment 2026-01-28 22:07:01 +05:00
Kunal Parmar ec89bf3622 chore(ci): harden triage bot against low-quality AI spam 2026-01-28 22:32:40 +05:30
Lachhman Singh ab7c924b9a style: fix indentation and improve status message 2026-01-28 21:56:46 +05:00
Timothy @aden 0c2a2f31f6 Merge pull request #1858 from TimothyZhang7/main
fix: auto closing bot
2026-01-28 08:53:06 -08:00
RichardTang-Aden 2b52ed6397 Merge pull request #1807 from mgaldon17/feature/plan-failed-dependency-resolution
feat(plan): Implemented a resolution for the failed dependency
2026-01-28 08:51:59 -08:00
Timothy 1b2befaae9 fix: auto closing bot 2026-01-28 08:50:57 -08:00
Lachhman Singh bca56f8ff6 fix(cli): prevent crash when exports dir missing 2026-01-28 21:35:40 +05:00
Timothy @aden e9f7f75c34 Merge pull request #1844 from TimothyZhang7/feature/cursor-support
fix: usage guide
2026-01-28 08:18:48 -08:00
Timothy 69cd9ab9f5 fix: usage guide
2026-01-28 08:16:05 -08:00
Timothy @aden fa1bba3320 Merge pull request #1837 from TimothyZhang7/feature/cursor-support
feature: cursor-aligned agent skills
2026-01-28 08:02:06 -08:00
Nihil 9b23668136 micro-fix(docs): add markdown header to ROADMAP 2026-01-28 21:27:35 +05:30
Timothy bf347d5e78 feature: cursor-aligned agent skills 2026-01-28 07:56:11 -08:00
RichardTang-Aden 3cfc88c4d6 Merge pull request #1720 from ryanbijoy/fix/gitingore-issue
micro-fix/gitingore issue
2026-01-28 07:36:57 -08:00
Bryan @ Aden 031b20574c Merge pull request #1509 from brilliantkid87/test/storage-module-coverage
Test/storage module coverage
2026-01-28 07:36:39 -08:00
RichardTang-Aden f37448e602 Merge pull request #584 from AbdulTaufeeq01/fix/web-scrape-relative-urls
fix(web-scrape): convert relative URLs to absolute URLs using urljoin
2026-01-28 07:36:16 -08:00
Vikas Verma ab5b1a254f Merge branch 'main' into fix/tools-dependency-tests 2026-01-28 20:53:33 +05:30
Nicklaus Wibowo 5d8996fe54 fix(executor): add execution quality tracking to ExecutionResult
Track retries, failed nodes, and execution quality (clean/degraded/failed) to expose retry metrics in ExecutionResult. This allows dashboards and monitoring to distinguish between clean success and degraded success with retries.
2026-01-28 21:26:45 +07:00
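
A sketch of the idea with hypothetical field names; the real ExecutionResult fields may differ:

    from dataclasses import dataclass, field

    @dataclass
    class ExecutionResult:
        success: bool
        total_retries: int = 0
        failed_nodes: list[str] = field(default_factory=list)

        @property
        def quality(self) -> str:
            # "clean" = no retries, "degraded" = succeeded after retries,
            # "failed" = run failed or some node never recovered.
            if not self.success or self.failed_nodes:
                return "failed"
            return "degraded" if self.total_retries else "clean"
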
Manuel Galdon 6b30c2e8e7 feat(plan): Implemented a resolution for the failed dependency 2026-01-28 15:17:15 +01:00
brilliantkid87 1298d4b379 Merge branch 'test/storage-module-coverage' of https://github.com/brilliantkid87/hive into test/storage-module-coverage 2026-01-28 21:13:24 +07:00
Brilliantkid ac3aaa9348 fix: linter error 2026-01-28 21:13:17 +07:00
austin931114 bedc0eadf3 Merge pull request #1773 from dekalouis/fix/llm-decide-edge-condition-loader
fix(runner): add missing llm_decide edge condition mapping
2026-01-28 14:46:39 +01:00
dekalouis fe352ea54e fix(runner): add missing llm_decide edge condition mapping
The condition_map in _load_from_dict was missing the llm_decide
mapping, causing goal-aware routing to break for exported agents.

When agents with LLM_DECIDE edges were exported and re-imported,
the edge conditions were silently defaulted to ON_SUCCESS instead
of preserving the LLM_DECIDE routing logic.

This fix ensures that agents exported with goal-aware routing edges
maintain their correct behavior after re-import.
2026-01-28 20:42:42 +07:00
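
The gist of the fix: the loader's condition map must include the LLM_DECIDE value, otherwise re-imported edges silently fall back to ON_SUCCESS. The names below are illustrative, not the framework's exact identifiers:

    from enum import Enum

    class EdgeCondition(Enum):
        ON_SUCCESS = "on_success"
        ON_FAILURE = "on_failure"
        LLM_DECIDE = "llm_decide"

    # Mapping used when loading an exported agent definition back into a graph.
    condition_map = {
        "on_success": EdgeCondition.ON_SUCCESS,
        "on_failure": EdgeCondition.ON_FAILURE,
        "llm_decide": EdgeCondition.LLM_DECIDE,  # the previously missing entry
    }

    def load_condition(raw: str) -> EdgeCondition:
        # Without the "llm_decide" key this defaulted to ON_SUCCESS.
        return condition_map.get(raw, EdgeCondition.ON_SUCCESS)
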
austin931114 7c990dd90a Merge pull request #1679 from MuzzaiyyanHussain/test-graph-executor-coverage
test: add pytest coverage for core graph executor success and failure paths
2026-01-28 14:24:03 +01:00
austin931114 f93111c319 Merge pull request #1766 from sashankthapa/docs/fix-claude-agent-skills-and-example
docs: fix Claude agent skills structure and workflow examples
2026-01-28 13:06:06 +01:00
kozuedoingregression d4b2c82d54 docs: fix Claude agent skills structure and workflow examples 2026-01-28 17:15:56 +05:30
Aman 169827636f Refactor: Replace print with logging in AgentRunner 2026-01-28 16:22:01 +05:30
ryanbijoy b6ef35fe55 micro-fix: Removed the NULL boxes and Renamed it to 2026-01-28 15:19:52 +05:30
ryanbijoy 6fb84b6889 gitignore changes 2026-01-28 15:06:19 +05:30
Muzzaiyyan Hussain 6e94402a8d Merge branch 'main' into test-graph-executor-coverage 2026-01-28 13:45:32 +05:30
Muzzaiyyan Hussain d68b822687 chore: apply automated lint fixes 2026-01-28 13:03:51 +05:30
Harsh Makwana 64299e959a fix: setup script now creates missing exports directory 2026-01-28 12:41:20 +05:30
Aarav Shukla d14d23b010 chore: remove references to archived honeycomb frontend 2026-01-28 12:26:45 +05:30
JVSCHANDRADITHYA 30f1c700ce Changes made to _select_agent function to lazy create exports directory 2026-01-28 06:27:50 +00:00
Abdul Taufeeq M ccae478347 Merge branch 'main' into fix/web-scrape-relative-urls 2026-01-28 11:06:27 +05:30
Abdul Taufeeq M 3a2639f565 fix: web scrape tool improvements with content-type validation and max_length simplification
- Add Content-Type validation to skip non-HTML content
- Simplify max_length validation using max() and min()
- Improve title extraction with cleaner code
2026-01-28 10:52:10 +05:30
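
A sketch of the two changes; the length bounds shown here are assumed, not the tool's actual limits:

    import httpx

    MIN_LENGTH, MAX_LENGTH = 100, 50_000  # assumed bounds

    def clamp_max_length(max_length: int) -> int:
        # max()/min() replaces a chain of if/elif range checks.
        return max(MIN_LENGTH, min(max_length, MAX_LENGTH))

    def is_html(response: httpx.Response) -> bool:
        # Skip non-HTML responses (PDFs, images, JSON) before parsing.
        return "text/html" in response.headers.get("content-type", "")
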
Patrick e241ec3341 Merge pull request #179 from PatrickChen928/main
feat: add nullable_output_keys, fix: #178
2026-01-28 13:16:39 +08:00
Timothy bc6f70933b feat: hubspot integration and advanced scraper 2026-01-27 20:50:17 -08:00
Devanshu Magiawala bc070c3e39 fix: correct WorkflowBuilder import to GraphBuilder in MCP example
The MCP integration example referenced WorkflowBuilder which doesn't exist.
Changed to GraphBuilder which is the correct class name.
Fixes import error when running: python core/examples/mcp_integration_example.py
2026-01-27 20:46:00 -08:00
Bryan @ Aden f30f42a4d3 Merge pull request #1577 from adenhq/fix/ruff-tests1
fixed ruff format --check
2026-01-27 20:12:03 -08:00
bryan e4c95c7a91 fixed ruff format --check 2026-01-27 20:09:31 -08:00
Bryan @ Aden bfb1a81b7a Merge pull request #944 from adionit7/docs/remove-docker-compose-references
docs: remove outdated Docker Compose references
2026-01-27 19:45:52 -08:00
Brilliantkid 257e36615a fix: linter error 2026-01-28 10:39:37 +07:00
Bryan @ Aden 2a049df099 Merge pull request #1540 from jaffarkeikei/fix/list-dir-isdir-check
fix(list_dir): add isdir check before listing
2026-01-27 19:17:25 -08:00
vakrahul 2194301260 Merge branch 'main' into perf/heuristic-json-repair 2026-01-28 08:00:38 +05:30
Bryan @ Aden 095dd05b17 Merge pull request #1173 from JohnnyWalker010/fix/json-validation-error-handling
Fix/json validation error handling
2026-01-27 18:15:04 -08:00
Aden HQ 6d03934452 Merge pull request #1535 from RichardTang-Aden/main
docs: chore for calling claude skills
2026-01-27 17:22:22 -08:00
RichardTang-Aden 5051f44543 Merge branch 'adenhq:main' into main 2026-01-27 17:14:10 -08:00
Richard Tang 9d98f9f678 docs: update the claude code skill instruction 2026-01-27 17:13:38 -08:00
Timothy @aden 9e0c24cd3a Merge pull request #1532 from TimothyZhang7/main
chore: fix lint issues
2026-01-27 17:01:05 -08:00
Timothy b66eec1e66 chore: fix lint issues 2026-01-27 16:58:06 -08:00
Timothy @aden aca66d60ed Merge pull request #1530 from adenhq/staging
Staging
2026-01-27 16:52:10 -08:00
RichardTang-Aden 8316e7c0e9 Merge pull request #1523 from adenhq/chore--ruff-fix
micro-fix: fix ruff and excluding Docs
2026-01-27 16:41:44 -08:00
Emmanuel Nwanguma 3bbecad044 config: add .gitattributes for cross-platform line ending consistency (#951)
* config: add .gitattributes for cross-platform line ending consistency

- Add comprehensive .gitattributes to normalize line endings
- Ensure shell scripts always use LF (required for Unix execution)
- Mark binary files explicitly to prevent corruption
- Eliminate CRLF warnings for Windows contributors
- Follow cross-platform best practices

This fixes persistent 'LF will be replaced by CRLF' warnings that
confuse Windows contributors during normal git operations.

Fixes #950

* fix: add trailing newline at end of file

Per review feedback from @Hundao
2026-01-28 08:41:11 +08:00
RichardTang-Aden a8eb7127aa Merge branch 'main' into chore--ruff-fix 2026-01-27 16:39:53 -08:00
Richard Tang ba2889faf8 chore: allow excluding doc PRs 2026-01-27 16:26:21 -08:00
Richard Tang 1e6c5b8e11 fix: CI issues 2026-01-27 16:26:21 -08:00
Richard Tang 1199c02bfd feat: allow micro fixes to be passed as a PR 2026-01-27 16:26:21 -08:00
Bryan @ Aden 688451b2a9 Merge pull request #1521 from adenhq/feat--allow-Micro-fixes-to-excluded
feat: allow micro fixes to be passed as a PR
2026-01-27 16:13:33 -08:00
Richard Tang 9ef3628209 feat: allow micro fixes to be passed as a PR 2026-01-27 16:08:42 -08:00
Richard Tang 8695f3fea0 chore: fix ruff 2026-01-27 16:01:52 -08:00
brilliantkid87 88b094b5de Merge branch 'test/storage-module-coverage' of https://github.com/brilliantkid87/hive into test/storage-module-coverage 2026-01-28 04:34:54 +07:00
brilliantkid87 8b3b0c51f5 test(core): add test coverage for storage module Fixes #902 2026-01-28 04:34:33 +07:00
brilliantkid87 322ff7c470 git commit -m "test(core): add test coverage for storage module
Fixes #902"
2026-01-28 04:30:44 +07:00
Timothy @aden ad968a0b54 Merge pull request #1458 from TimothyZhang7/release/v_0_3_0
DX Improvements: Linting, Formatting & Pre-Commit Hooks
2026-01-27 11:04:48 -08:00
Timothy 5d79a7078c fix: precommit hooks for different pyproject 2026-01-27 10:50:11 -08:00
Timothy e4f451e3f5 fix: lint issues with new enforcement 2026-01-27 10:45:49 -08:00
Timothy d8496c47f0 fix: linter 2026-01-27 10:19:23 -08:00
Timothy @aden 9c28284331 Merge pull request #1428 from TimothyZhang7/feature/parallel-fanout
feat: parallel execution framework
2026-01-27 10:17:07 -08:00
Timothy 075e9179c1 fix: retry logic broken by merge conflict 2026-01-27 10:11:54 -08:00
Timothy e61bdfc417 test(arch): fanout/fanin 2026-01-27 10:07:58 -08:00
Timothy @aden f6c5c5cadb Merge branch 'main' into feature/parallel-fanout 2026-01-27 10:04:54 -08:00
jaffar 8923011304 fix(list_dir): add isdir check before listing 2026-01-27 12:00:56 -05:00
Aman e6900647f8 ci: add windows runner to test workflow 2026-01-27 22:06:59 +05:30
Timothy @aden c441494c2f Merge pull request #1368 from adenhq/main
sync main to staging
2026-01-27 08:34:48 -08:00
Timothy @aden e1bea18357 Merge pull request #1113 from TanujaNair03/refactor/llm-judge-agnostic
refactor: provider-agnostic LLMJudge with auto-detection for OpenAI (#1103)
2026-01-27 08:31:50 -08:00
Timothy @aden 197f4f984a Merge pull request #1353 from Tahir-yamin/fix/concurrent-storage-file-locks-leak
fix(memory): patch ConcurrentStorage leak (WeakValueDictionary)
2026-01-27 08:23:05 -08:00
Tahir yamin 0381a5c87b Merge branch 'adenhq:main' into fix/concurrent-storage-file-locks-leak 2026-01-27 20:36:19 +05:00
Tahir Yamin 112b1baf2e fix(memory): patch ConcurrentStorage leak with WeakValueDictionary (Isolated Logic) 2026-01-27 20:28:22 +05:00
Shivraj12 c61c958964 fix(tool_registry): handle invalid JSON returned by tools 2026-01-27 20:23:36 +05:30
vrijmetse a59d6ac6db refactor(tools): add multi-provider support to web_search tool (#795)
* feat(tools): add Google Custom Search as alternative to Brave Search

Adds google_search tool using Google Custom Search API as an alternative
to the existing web_search tool (Brave Search).

Changes:
- Add google_search_tool with full implementation
- Register Google credentials (GOOGLE_API_KEY, GOOGLE_CSE_ID)
- Register tool in tools/__init__.py
- Add README with setup instructions

Closes #793

* test(tools): add unit tests for google_search tool

Adds 7 tests mirroring web_search_tool test patterns:
- Missing API key error handling
- Missing CSE ID error handling
- Empty query validation
- Long query validation
- num_results clamping
- Default parameters
- Custom language/country parameters

All tests pass.

* refactor(tools): add multi-provider support to web_search tool

BREAKING CHANGE: None - backward compatible. Brave remains default.

- Add Google Custom Search as alternative provider in web_search
- Add 'provider' parameter: 'auto' (default), 'google', 'brave'
- Auto mode tries Brave first for backward compatibility
- Remove separate google_search_tool (consolidated into web_search)
- Update tests to cover multi-provider functionality (13 tests)
- Update README documentation

Users with BRAVE_SEARCH_API_KEY: No changes needed
Users with GOOGLE_API_KEY + GOOGLE_CSE_ID: Can use provider='google'
Users with both: Brave preferred by default, use provider='google' to force

Closes #793

* feat(tools): fixed readme

---------

Co-authored-by: Mustafa Abdat <abdamus@hilti.com>
2026-01-27 22:46:41 +08:00
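
A sketch of the provider selection described in the commit; the env var names and the Brave-first ordering are taken from the message, the rest is assumed:

    import os

    def resolve_provider(provider: str = "auto") -> str:
        # Explicit choice wins; "auto" keeps Brave first for backward
        # compatibility, then falls back to Google Custom Search.
        if provider in ("brave", "google"):
            return provider
        if os.getenv("BRAVE_SEARCH_API_KEY"):
            return "brave"
        if os.getenv("GOOGLE_API_KEY") and os.getenv("GOOGLE_CSE_ID"):
            return "google"
        raise RuntimeError("No web search credentials configured")
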
Vikas Verma 37b9be3ff6 fix(tools): handle optional duckdb dependency and update credential tests 2026-01-27 20:00:45 +05:30
Hundao 9d39c09e27 Merge pull request #973 from AryanyAI/refactor/logging-mcp-scripts
refactor(mcp): replace print() with logging in setup scripts
2026-01-27 20:56:40 +08:00
root ff38962ff2 fix: remove duplicate content 2026-01-27 12:30:38 +00:00
root 121f33687a docs: add setup instructions for Alpine Linux users 2026-01-27 12:11:59 +00:00
Tanuu 598cc8b078 refactor: provider-agnostic LLMJudge with ruff styling fixes (#1103) 2026-01-27 14:24:57 +05:30
Tanuu 3605f3705b refactor: make LLMJudge provider-agnostic with OpenAI support (#1103) 2026-01-27 14:16:34 +05:30
AryanyAI 407816ddbf style: fix ruff quote style violations (Q000)
- Change single quotes to double quotes in logging formatters
- Fixes: setup_mcp.py, verify_mcp.py formatter strings
- Addresses Q000 linter errors from PR review
2026-01-27 13:54:20 +05:30
Hundao 6acdb65c1c Merge pull request #948 from TanujaNair03/refactor/provider-agnostic-prompts
Refactor/provider agnostic prompts
2026-01-27 14:18:59 +08:00
Hundao a4b0c66564 Merge pull request #558 from Hundao/feature/csv-tools
feat(tools): add CSV tools with DuckDB SQL support
2026-01-27 14:02:06 +08:00
Timothy @aden d1e6101a0f Merge pull request #1007 from TimothyZhang7/feature/credential-manager-stor
Feature/credential manager store
2026-01-26 21:30:58 -08:00
Timothy 330fbb19ac feature(credentials): credential store arch 2026-01-26 20:16:43 -08:00
Abdul Taufeeq M 8cc431ee52 fix: correct link validation to use absolute_href instead of href 2026-01-27 09:29:58 +05:30
Timothy 39831cf4b1 feat: parallel execution framework 2026-01-26 19:25:25 -08:00
Bryan @ Aden bc8cdfd6da Merge pull request #941 from vakrahul/fix/graph-retry-backoff
Fix/graph retry backoff
2026-01-26 19:20:35 -08:00
Tanuu 500876d65e style: add required trailing newline to prompts.py 2026-01-27 07:35:54 +05:30
Tanuu e59bb2d83f style: fix linting issues (whitespace and newline) 2026-01-27 07:29:48 +05:30
vakrahul 03910d531f Merge branch 'main' into fix/graph-retry-backoff 2026-01-27 07:28:22 +05:30
vakrahul a122345f9c fix(graph): restore node.max_retries and fix type check per review 2026-01-27 07:26:40 +05:30
Bryan @ Aden 6d025c808a Merge pull request #946 from not-anas-ali/fix/callable-type-annotations
fix(types): correct type annotation from lowercase 'callable' to 'Callable'
2026-01-26 17:52:00 -08:00
Bryan @ Aden 8525aec49c Merge pull request #934 from adionit7/fix/validate-exports-skip-when-empty
ci: make Validate Agent Exports skip clearly when exports/ is missing or empty
2026-01-26 17:48:44 -08:00
Tanuja Nair b0435a188f Merge branch 'adenhq:main' into refactor/provider-agnostic-prompts 2026-01-27 07:07:01 +05:30
Bryan @ Aden 3eb964eff2 Merge pull request #933 from adionit7/docs/fix-execute-command-tool-name-readme
docs(tools): fix tool name in README table (execute_command → execute_command_tool)
2026-01-26 17:36:24 -08:00
Bryan @ Aden ed88129b00 Merge pull request #927 from saboor2632/fix/worker-node-json-logging
fix(graph): add logging for JSON parsing failures in worker_node
2026-01-26 17:36:13 -08:00
vakrahul e1d8624483 Merge branch 'main' into perf/heuristic-json-repair 2026-01-27 07:03:07 +05:30
vakrahul 68264b54d9 style: fix linting issues in output_cleaner.py 2026-01-27 07:02:43 +05:30
adionit7 fc36a5e607 docs: remove outdated Docker Compose references
The repository does not include docker-compose files, but multiple docs
claimed "Docker Compose deployment out of the box." This was left over
from a previous release.

Changes:
- README.md: Update FAQ to describe Python package deployment
- README.ko.md: Same update for Korean translation
- docs/configuration.md: Remove "Docker Compose Integration" section
  and docker compose commands
- docs/quizzes: Update tasks that referenced docker-compose.yml
- .github/CODEOWNERS: Remove docker-compose*.yml entry
- scripts/setup.sh: Remove docker-compose.override.yml copy step

Fixes #923
2026-01-27 06:58:18 +05:30
vakrahul 1631d01dd2 merge: resolve conflicts in executor.pyx 2026-01-27 06:52:07 +05:30
Tanuu e846ad6ea7 refactor: implement provider-agnostic logic for test templates
Centralized _get_api_key in prompts.py to support OpenAI, Cerebras, and Groq via environment variables while maintaining Anthropic support through CredentialManager.
2026-01-27 06:39:55 +05:30
adionit7 e57cad7159 ci: make Validate Agent Exports skip clearly when exports/ is missing or empty
Previously, when exports/ was missing or empty, the bash glob
`exports/*/` would not match anything and the loop would silently
do nothing. The job would pass without actually validating anything,
which was misleading.

Now the job:
- Explicitly checks if exports/ directory exists
- Uses nullglob to handle empty directories properly
- Logs clear messages when skipping validation
- Reports the number of agents validated when successful

Fixes #887
2026-01-27 05:59:43 +05:30
adionit7 0cf9e39f6f docs(tools): fix tool name in README table (execute_command → execute_command_tool)
The "Available Tools" table listed `execute_command` but the actual
registered name is `execute_command_tool`. This aligns the docs with
the runtime name in __init__.py and the tool's own README.

Fixes #901
2026-01-27 05:58:59 +05:30
saboor2632 852332483a fix(graph): add logging for JSON parsing failures in worker_node 2026-01-27 05:10:34 +05:00
not-anas-ali 2b8604610c fix(types): correct type annotation from lowercase 'callable' to 'Callable'
Fixes #922
2026-01-27 05:06:27 +05:00
Yevhen Omelianenko b07aff1be3 Merge branch 'adenhq:main' into fix/json-validation-error-handling 2026-01-27 01:25:12 +02:00
yevhen_omelianenko f3df70e8fe fix: add consistent JSON validation error handling in agent_builder_server.py
Wrap json.loads() calls in try-catch blocks for add_node() and update_node()
  functions to match the error handling pattern used elsewhere in the file.

  Fixes #907
2026-01-27 01:13:42 +02:00
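
The pattern in question, roughly; the response shape returned by agent_builder_server.py is an assumption:

    import json

    def parse_node_payload(raw: str) -> dict:
        # Wrap json.loads() so malformed input becomes a structured error
        # instead of an unhandled exception.
        try:
            return {"ok": True, "data": json.loads(raw)}
        except json.JSONDecodeError as exc:
            return {"ok": False, "error": f"Invalid JSON: {exc}"}
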
Bryan @ Aden 9230ac6c20 Merge pull request #871 from pradyten/feat/llm-judge-configurable-provider
feat(testing): add configurable LLM provider to LLMJudge
2026-01-26 14:53:12 -08:00
Bryan @ Aden 5cf25c6f10 Merge pull request #906 from adenhq/fix/ruff-tests
fixed linter
2026-01-26 14:49:45 -08:00
bryan d064c98998 fixed linter 2026-01-26 14:47:56 -08:00
Bryan @ Aden 25fabd8068 Merge pull request #576 from savankansagara1/fix/mock-mode-llm-provider
Fix: Add MockLLMProvider to enable mock mode execution
2026-01-26 14:41:13 -08:00
Bryan @ Aden 396e5c35a6 Merge pull request #528 from gaurav-code098/fix/web-scrape-content-type
fix(tools): validate Content-Type in web_scrape tool (Closes #487)
2026-01-26 14:34:37 -08:00
RichardTang-Aden 0a8c30c3da Merge pull request #788 from SoulSniper-V2/feat/add-deepseek-docs
docs(llm): add DeepSeek models support documentation and examples
2026-01-26 14:33:51 -08:00
Aden HQ 798f3cfd36 Merge pull request #349 from Himanshu-ABES/feat/pydantic-llm-validation
feat(validation): add Pydantic model validation for LLM outputs
2026-01-26 14:14:12 -08:00
pradyumn tendulkar 69ad0be5ff Merge branch 'main' into feat/llm-judge-configurable-provider 2026-01-26 17:06:30 -05:00
Himanshu Chauhan 60f2e674ec feat(validation): add Pydantic model validation for LLM outputs
- Add output_model field to NodeSpec for specifying Pydantic model
- Add max_validation_retries field (default: 2) for retry configuration
- Add validation_errors field to NodeResult for error tracking
- Implement validate_with_pydantic() in OutputValidator
- Implement format_validation_feedback() for LLM retry prompts
- Auto-generate JSON schema from Pydantic model for response_format
- Add retry loop that feeds validation errors back to LLM
- Add 28 comprehensive tests covering all new functionality
2026-01-26 14:06:29 -08:00
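
A condensed sketch of the validation/retry idea, assuming Pydantic v2; helper names follow the commit message, and the NodeSpec/OutputValidator plumbing is omitted:

    from pydantic import BaseModel, ValidationError

    def validate_with_pydantic(output: dict, model: type[BaseModel]):
        # Returns (instance, []) on success or (None, errors) on failure.
        try:
            return model.model_validate(output), []
        except ValidationError as exc:
            return None, [str(e) for e in exc.errors()]

    def format_validation_feedback(errors: list[str]) -> str:
        # Fed back to the LLM on the next attempt (up to max_validation_retries).
        return "Your previous output failed validation:\n" + "\n".join(f"- {e}" for e in errors)
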
Siddharth Varshney 6bb256e277 docs: clarify illustrative examples in validation section 2026-01-26 21:47:00 +00:00
Bryan @ Aden 81ad85db5e Merge pull request #876 from adenhq/fix/ruff-tests
Fix/ruff tests
2026-01-26 13:41:15 -08:00
Timothy @aden ed25ef7562 Merge pull request #762 from vishalharkal15/fix/concurrent-storage-race-condition
Fix race condition in ConcurrentStorage.stop() causing data loss
2026-01-26 13:38:17 -08:00
bryan d9c696aa22 fixed all linter errors 2026-01-26 13:37:25 -08:00
bryan 22358a2d83 Merge branch 'main' into fix/ruff-tests 2026-01-26 13:37:12 -08:00
Timothy @aden 39a2a34380 Merge pull request #874 from TimothyZhang7/main
fix: git actions
2026-01-26 13:36:50 -08:00
Timothy @aden 07077dbb52 Merge branch 'adenhq:main' into main 2026-01-26 13:35:33 -08:00
Timothy e1346ae557 fix: include actual status check in pr requirements 2026-01-26 13:34:54 -08:00
Timothy 4f3d34d01e fix: consolidate dedupe and triage 2026-01-26 13:29:33 -08:00
pradyten 8516eba7c5 feat(testing): add configurable LLM provider to LLMJudge
Allow LLMJudge to accept any LLMProvider instance instead of being
hardcoded to use Anthropic. This aligns with the framework's pluggable
LLM design and enables users to:

- Use the same LLM provider across their agent and tests
- Run tests with cheaper or local models
- Avoid requiring an Anthropic API key for testing

Backward compatible: existing code using LLMJudge() without arguments
continues to work by falling back to Anthropic.

Closes #477

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 16:27:08 -05:00
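
Roughly the backward-compatible constructor change; LLMProvider and AnthropicProvider here are stand-ins for the framework's actual classes:

    from typing import Protocol

    class LLMProvider(Protocol):
        def complete(self, prompt: str) -> str: ...

    class AnthropicProvider:
        # Placeholder for the default Anthropic-backed provider.
        def complete(self, prompt: str) -> str:
            raise NotImplementedError("requires an Anthropic API key")

    class LLMJudge:
        def __init__(self, provider: LLMProvider | None = None):
            # Any provider can be injected; calling LLMJudge() with no
            # arguments still falls back to Anthropic, so existing code works.
            self.provider = provider or AnthropicProvider()
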
Timothy @aden 63010d45b2 Merge pull request #861 from TimothyZhang7/main
fix: explain the PR requirements
2026-01-26 12:43:30 -08:00
Timothy @aden 59db8f99d7 Merge branch 'adenhq:main' into main 2026-01-26 12:42:22 -08:00
Timothy 236e8e8638 fix: explain the pr requirement 2026-01-26 12:41:16 -08:00
Timothy @aden 3279686342 Merge pull request #858 from TimothyZhang7/main
fix: PR requirements enforcement
2026-01-26 12:32:11 -08:00
Timothy @aden b6a77ffd7e Merge branch 'adenhq:main' into main 2026-01-26 12:31:25 -08:00
Timothy e0544a57f9 fix: pr requirements 2026-01-26 12:30:12 -08:00
AryanRevolutionizingWorld 82c32e8d9f refactor(mcp): replace print() with logging in setup scripts
Replace direct print() statements with Python's logging module in MCP
setup and verification scripts for better configurability and
production readiness.

Changes:
- setup_mcp.py: Convert 30+ print() calls to structured logging
- verify_mcp.py: Convert 40+ print() calls to structured logging
- mcp_server.py: Convert 4 print() calls to structured logging
- Preserve colored CLI output using logging formatters
- Maintain all functional behavior (refactor only)

Benefits:
- Configurable log levels (debug/info/warning/error)
- Better observability in production environments
- Cleaner programmatic usage (no stdout pollution)
- Professional logging practices

Fixes #833
2026-01-27 01:57:16 +05:30
Bryan @ Aden a180d78d0c Merge pull request #782 from ayush123-bit/docs/windows-environment-clarification
docs: clarify Windows environment expectations in setup guides
2026-01-26 12:22:43 -08:00
Bryan @ Aden 9be036aa37 Merge pull request #602 from Kira714/fix/callable-type-annotation-599
fix(llm): correct type annotation from lowercase `callable` to `Callable`
2026-01-26 12:22:34 -08:00
Bryan @ Aden 8c39dad22d Merge pull request #605 from Kira714/fix/session-state-dict-validation-590
fix(executor): add type validation for session state memory
2026-01-26 12:22:21 -08:00
Bryan @ Aden 0a7aa62c45 Merge pull request #608 from Kira714/fix/agent-runtime-keyerror
fix(runtime): use safe dictionary access in trigger_and_wait()
2026-01-26 12:22:12 -08:00
Bryan @ Aden cbd34db278 Merge pull request #665 from Ranxin2023/main
Make MCP tool registration idempotent to avoid conflicts with agent-generated tools
2026-01-26 12:22:02 -08:00
Bryan @ Aden 414d86f2f0 Merge pull request #672 from subham-panja/docs/fix-architecture-link
docs(readme): fix broken architecture documentation link
2026-01-26 12:21:34 -08:00
Timothy @aden 4852d7f63b Merge pull request #842 from TimothyZhang7/main
fix: PR requirements backfill
2026-01-26 11:55:55 -08:00
Timothy @aden 1165858a58 Merge branch 'adenhq:main' into main 2026-01-26 11:54:51 -08:00
Timothy 4575540d69 fix: pr requirements 2026-01-26 11:54:10 -08:00
Timothy @aden 051aa4f065 Merge pull request #840 from TimothyZhang7/main
fix: backfill pr requirements
2026-01-26 11:49:06 -08:00
Timothy 6834dcfcb7 fix: backfill pr requirements 2026-01-26 11:47:04 -08:00
Timothy @aden 95c481ae52 Merge pull request #824 from TimothyZhang7/main
Fix: PR requirements
2026-01-26 11:17:09 -08:00
Timothy @aden 5c266d6920 Merge branch 'adenhq:main' into main 2026-01-26 11:16:04 -08:00
Timothy 7fe21d91f2 fix: pr requirements 2026-01-26 11:15:29 -08:00
Timothy @aden 751715bffe Merge pull request #822 from TimothyZhang7/main
PR Requirements Workflow
2026-01-26 11:12:31 -08:00
Timothy @aden a6bda9628c Merge branch 'adenhq:main' into main 2026-01-26 10:57:47 -08:00
Timothy ac646603c9 chore: enforce pr requirement 2026-01-26 10:54:50 -08:00
Timothy @aden 551e648be7 Merge pull request #810 from TimothyZhang7/main
GitHub Actions to auto dedupe issues
2026-01-26 10:42:13 -08:00
Timothy @aden 2f852a7eba Merge branch 'adenhq:main' into main 2026-01-26 10:41:16 -08:00
Timothy 7d462ff976 feat(actions): auto dedupe workflow 2026-01-26 10:38:01 -08:00
Timothy d1cfef5d8a fix: issue dedupe action 2026-01-26 10:06:37 -08:00
Bryan @ Aden f3c9c591bf Merge pull request #610 from Kira714/fix/semaphore-private-access
fix(stream): avoid private Semaphore._value attribute access
2026-01-26 10:05:32 -08:00
Bryan @ Aden 0bbe2d5889 Merge pull request #444 from fermano/fix/executionstream-oom
fix(runtime): execution stream memory leak
2026-01-26 10:02:23 -08:00
Timothy @aden aa341317f5 Merge pull request #791 from TimothyZhang7/main
chore(actions): automated bot
2026-01-26 09:45:45 -08:00
Timothy 6ae38b66ba chore(actions): automated bot 2026-01-26 09:43:25 -08:00
Arush Wadhawan 40e39d29f8 docs(llm): add DeepSeek models support documentation and examples
Signed-off-by: Arush Wadhawan <warush23+github@gmail.com>
2026-01-26 12:24:51 -05:00
ayush123-bit 6d7d472792 docs: clarify Windows environment expectations and WSL recommendation 2026-01-26 22:31:20 +05:30
Vishal dae63214d5 Fix race condition in ConcurrentStorage.stop() causing data loss
Fixes #755

Problem:
The stop() method had a critical race condition where _flush_pending() and
_batch_task competed for queue items, causing:
- Data loss during shutdown
- Queue items processed twice or lost
- Batch writer cancelled mid-write

Root Cause:
The method called _flush_pending() while _batch_task was still running.
Both operations drained the same queue simultaneously, leading to conflicts.

Solution:
Reordered shutdown sequence to:
1. Cancel batch task first
2. Wait for task completion (handles CancelledError with final flush)
3. Then flush any remaining items

This eliminates queue competition because:
- _batch_writer() flushes its current batch when cancelled
- After cancellation completes, _flush_pending() safely processes remaining items
- No race condition, no data loss

Changes:
- Moved batch task cancellation before _flush_pending()
- Ensures clean shutdown sequence
- Prevents queue drain conflicts

Testing:
- All 209 tests pass
- No duplicate flushes
- Clean shutdown guaranteed

Impact:
- Prevents data loss during graceful shutdown
- Eliminates race condition between flush operations
- Ensures all writes complete before stop returns
2026-01-26 21:38:59 +05:30
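
The reordered shutdown, reduced to its essentials; attribute names follow the commit message, everything else is a sketch:

    import asyncio

    class ConcurrentStorage:
        def __init__(self) -> None:
            self._queue: asyncio.Queue = asyncio.Queue()
            self._batch_task: asyncio.Task | None = None

        async def stop(self) -> None:
            # 1. Cancel the batch writer first so it stops draining the queue.
            if self._batch_task is not None:
                self._batch_task.cancel()
                try:
                    await self._batch_task  # writer flushes its current batch on cancel
                except asyncio.CancelledError:
                    pass
            # 2. Only then flush what remains -- nothing else is reading the queue.
            await self._flush_pending()

        async def _flush_pending(self) -> None:
            while not self._queue.empty():
                self._queue.get_nowait()  # placeholder for the real batched write
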
bryan 46bdedcabb ruff check fix 2026-01-26 07:32:03 -08:00
Patrick 5fbaae5d8d Merge branch 'adenhq:main' into main 2026-01-26 20:04:51 +08:00
Abdul Taufeeq M c9bc2b287e security: prevent path traversal attacks in FileStorage
Add comprehensive input validation to _validate_key() method that blocks:
- Empty keys
- Path separators (/ and \)
- Parent directory references (..)
- Absolute paths
- Null bytes
- Dangerous shell characters

Apply validation to all index operations: _get_index(), _add_to_index(), _remove_from_index()
Add 21 comprehensive test cases covering valid keys and all attack scenarios

Fixes: CWE-22 Path Traversal vulnerability (CVSS 7.5-9.1 Critical)

Tests: 21/21 passing
2026-01-26 17:31:43 +05:30
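
A compact version of the validation described above; the exact set of rejected shell characters is an assumption beyond what the commit lists:

    import os

    def _validate_key(key: str) -> str:
        # Reject anything that could escape the storage directory.
        if not key:
            raise ValueError("key must not be empty")
        if "/" in key or "\\" in key or ".." in key:
            raise ValueError(f"key contains path components: {key!r}")
        if os.path.isabs(key) or "\x00" in key:
            raise ValueError(f"unsafe key: {key!r}")
        if any(ch in key for ch in ";|&$`"):
            raise ValueError(f"key contains shell metacharacters: {key!r}")
        return key
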
subhampanja28 5b46132c81 docs(readme): fix broken architecture documentation link 2026-01-26 17:24:22 +05:30
RanxinLi 7e65ab0b36 Revert local Claude settings 2026-01-26 03:28:03 -08:00
RanxinLi 8a86787b64 Merge branch 'main' of https://github.com/Ranxin2023/hive 2026-01-26 03:24:01 -08:00
RanxinLi b2acfb5447 change the file tool register bug 2026-01-26 03:23:49 -08:00
Sourabsb 10ea23be34 fix: improve cleanup race handling, thread join warning, and CancelledError strategy
- Treat run_coroutine_threadsafe race (RuntimeError) as expected: mark cleanup_attempted and log debug
- Mark cleanup_attempted on timeout/errors to avoid misleading fallback
- Add warning when loop thread fails to terminate within join timeout
- Make CancelledError best-effort (log, no re-raise) for session and stdio cleanup
2026-01-26 16:38:58 +05:30
Sourabsb 37a0324c05 fix: increase thread join timeout and clarify redundant None assignments
Changes based on Copilot AI review (2 issues):

1. Thread join timeout was shorter than cleanup timeout (Issue #1):
   - Changed _THREAD_JOIN_TIMEOUT from 5 to 12 seconds
   - Must be >= cleanup timeout (10s) plus buffer for loop.stop()
   - Prevents thread abandonment during active cleanup

2. Added detailed comment for redundant None assignments (Issue #2):
   - Explained why we set _session/_stdio_context to None even if
     _cleanup_stdio_async() already did it
   - Documents the safety cases: timeout, failure, skip, cancellation
   - Makes code intent clear for future maintainers
2026-01-26 16:31:22 +05:30
Sourabsb 837ef2da59 fix: address Copilot AI review - timeouts and CancelledError handling
Changes based on Copilot AI review (3 issues):

1. Increased thread join timeout (Issue #1):
   - Changed from 2 to 5 seconds
   - Made proportional to cleanup timeout
   - Defined as class constant _THREAD_JOIN_TIMEOUT

2. Handle asyncio.CancelledError explicitly (Issue #2):
   - Added separate except clause for CancelledError
   - Logs specific warning for cancelled cleanup
   - Re-raises CancelledError as per asyncio best practices
   - Added for both session and stdio_context cleanup

3. Increased cleanup timeout to match connection timeout (Issue #3):
   - Changed from 5 to 10 seconds (matches _connect_stdio timeout)
   - Defined as class constant _CLEANUP_TIMEOUT
   - Prevents incomplete cleanup with slow MCP servers
2026-01-26 16:23:44 +05:30
Muzzaiyyan Hussain e0bc265bb2 test: add pytest coverage for core graph executor success and failure 2026-01-26 16:18:32 +05:30
Sourabsb a39afbea23 fix: separate TimeoutError handling for better error reporting
Per Copilot AI review: distinguish timeout scenarios from actual
cleanup failures by catching TimeoutError separately. This helps
with debugging by providing clearer error messages.
2026-01-26 16:15:57 +05:30
Sourabsb 7375b26925 fix: address all Copilot AI review comments
Changes based on Copilot AI review (5 issues):

1. Simplified _cleanup_stdio_async():
   - Used try/finally pattern for cleaner reference clearing
   - References cleared in finally block (always executed)

2. Removed deprecated asyncio.get_event_loop():
   - Removed complex temp loop pattern entirely
   - Simplified fallback to just log warning and clear refs

3. Simplified fallback path (Issue #4):
   - When loop exists but not running, resources are in undefined state
   - Complex event loop manipulation removed
   - Just log warning and proceed with reference clearing
   - OS will reclaim resources on process exit

4. Handled race condition (Issue #5):
   - Added comment documenting the inherent race condition
   - Added try/except around loop.call_soon_threadsafe()
   - Track cleanup_attempted flag for proper fallback handling

5. Added explanatory comments:
   - Documented why redundant None assignments exist (safety)
   - Explained race condition handling approach

Note: Test coverage suggestion (#3) acknowledged but deferred
to separate PR to keep this fix focused.
2026-01-26 16:09:36 +05:30
Sourabsb 3626051b1a fix: address Copilot AI review suggestions for disconnect cleanup
Changes based on Copilot AI review:

1. Fixed fallback path using temp event loop pattern:
   - asyncio.run() may fail if there's already an event loop in current thread
   - Now uses new_event_loop() + set_event_loop() + run_until_complete() pattern
   - Preserves and restores original loop if one existed

2. Set references to None immediately after __aexit__:
   - self._session = None after closing session
   - self._stdio_context = None after closing context
   - Prevents window where closed objects are still referenced
   - Also clears on error to prevent reuse of broken objects

3. Added documentation for critical cleanup order:
   - Session must close BEFORE stdio_context
   - Session depends on streams provided by stdio_context
   - Mirrors initialization order in _connect_stdio()
   - Added warning comment to prevent future breakage
2026-01-26 15:59:59 +05:30
Sourabsb fbcdaf7c6d fix: add is_running() and is_closed() checks to _run_async() to prevent deadlock
When self._loop exists but is not running or is closed (e.g., crashed,
stopped externally, or closed), the code now falls through to the
standard approach that properly handles both sync and async contexts.

Key changes:
- Added is_running() AND is_closed() checks before using run_coroutine_threadsafe()
- Removed separate else branch with asyncio.run() that didn't handle async context
- Now falls through to standard approach which:
  - Detects if already in async context (get_running_loop)
  - Uses separate thread with new event loop if in async context
  - Uses asyncio.run() only when no event loop is running

Edge cases covered:
1. self._loop is None (sync context) -> uses asyncio.run()
2. self._loop is None (async context) -> uses thread with new loop
3. self._loop running normally -> uses run_coroutine_threadsafe()
4. self._loop stopped (sync context) -> falls through, uses asyncio.run()
5. self._loop stopped (async context) -> falls through, uses thread
6. self._loop closed (sync context) -> falls through, uses asyncio.run()
7. self._loop closed (async context) -> falls through, uses thread

Fixes #625
2026-01-26 15:35:53 +05:30
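
The core of the check with the fallback paths stripped out; a sketch, not the file's actual code:

    import asyncio

    def run_async(coro, loop: asyncio.AbstractEventLoop | None):
        # Only reuse the client's dedicated loop when it is genuinely usable:
        # a stopped or closed loop makes run_coroutine_threadsafe() hang forever.
        if loop is not None and loop.is_running() and not loop.is_closed():
            return asyncio.run_coroutine_threadsafe(coro, loop).result()
        # Fall through to asyncio.run(); the real fix additionally uses a
        # helper thread with a fresh loop when already inside a running loop.
        return asyncio.run(coro)
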
Kira714 6934b331d4 fix(stream): avoid private Semaphore._value attribute access
Calculate available_slots from running execution count instead of
accessing the private _value attribute of asyncio.Semaphore.

Private attributes may change between Python versions and are not
part of the public API.

Fixes #609

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 16:37:49 +08:00
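A sketch of the approach, assuming the stream manager already tracks running executions (class and attribute names are illustrative):

```python
import asyncio


class ExecutionPool:
    """Sketch: derive capacity from our own bookkeeping, not Semaphore._value."""

    def __init__(self, max_concurrent: int) -> None:
        self._max_concurrent = max_concurrent
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self._running: set[str] = set()

    @property
    def available_slots(self) -> int:
        # Computed from the running execution count -- no private attributes.
        return self._max_concurrent - len(self._running)

    async def run(self, execution_id: str, coro) -> None:
        async with self._semaphore:
            self._running.add(execution_id)
            try:
                await coro
            finally:
                self._running.discard(execution_id)
```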
Kira714 734fe1e4d7 fix(runtime): use safe dictionary access in trigger_and_wait()
Replace direct dictionary access with .get() and explicit ValueError
to prevent KeyError when entry_point_id is not found in _streams dict.

Fixes #589

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 16:36:55 +08:00
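Roughly, the guarded lookup looks like this (a sketch; the surrounding method is simplified):

```python
def trigger_and_wait(self, entry_point_id: str, payload: dict):
    """Sketch: fail with a clear error instead of a bare KeyError."""
    stream = self._streams.get(entry_point_id)
    if stream is None:
        raise ValueError(
            f"Unknown entry point: {entry_point_id!r}. "
            f"Known entry points: {sorted(self._streams)}"
        )
    return stream.trigger_and_wait(payload)
```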
Kira714 d900f38f64 fix(executor): add type validation for session state memory
Fixes #590

Previously, the code assumed `session_state["memory"]` was always a dict
when the key existed. If it was `None` or another non-dict type, this
would raise a TypeError during iteration.

Now we validate the type before iterating and log a warning if the
memory data is not a dict, preventing runtime crashes when resuming
from malformed session states.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 16:28:56 +08:00
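A minimal sketch of that validation, assuming the executor restores memory into a shared dict (function and parameter names are illustrative):

```python
import logging

logger = logging.getLogger(__name__)


def restore_memory(session_state: dict, shared_memory: dict) -> None:
    """Sketch: only iterate session memory when it is actually a dict."""
    memory = session_state.get("memory")
    if not isinstance(memory, dict):
        if memory is not None:
            logger.warning(
                "Ignoring malformed session memory of type %s",
                type(memory).__name__,
            )
        return
    for key, value in memory.items():
        shared_memory[key] = value
```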
Kira714 1c78174aaf fix(llm): correct type annotation from lowercase callable to Callable
Fixes #599

In Python, `callable` is a builtin function used to check whether something
is callable, NOT a type annotation. For type hints, we need `Callable` from
the typing module.

Changed:
- `tool_executor: callable` → `tool_executor: Callable[[ToolUse], ToolResult]`

Files updated:
- core/framework/llm/provider.py
- core/framework/llm/anthropic.py
- core/framework/llm/litellm.py

This fixes mypy/pyright type checking errors like:
"Variable annotation syntax is for types; callable is a function"

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 16:27:54 +08:00
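Roughly, the change looks like the following (the import path is an assumption based on the files listed above):

```python
from typing import Callable

# Assumed import path; the commit lists core/framework/llm/provider.py.
from framework.llm.provider import ToolResult, ToolUse

# Before (rejected by mypy/pyright -- `callable` is a builtin, not a type):
#     tool_executor: callable

# After: the annotation spells out the executor's signature.
def complete_with_tools(tool_executor: Callable[[ToolUse], ToolResult]) -> None:
    ...
```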
Abdul Taufeeq M b897e5bdf2 test(web-scrape): add comprehensive tests for URL link conversion
Add 7 test methods to TestWebScrapeToolLinkConversion class to validate
the new URL conversion feature:

- test_relative_links_converted_to_absolute: ../page and page.html -> absolute
- test_root_relative_links_converted: /about -> absolute
- test_absolute_links_unchanged: https://external.com remains unchanged
- test_links_after_redirects: Uses final URL, not requested URL
- test_fragment_links_preserved: #section1 anchors work correctly
- test_query_parameters_preserved: ?id=123&sort=date retained
- test_empty_href_skipped: Empty text links filtered out

All tests use unittest.mock for HTTP response mocking to avoid live network calls.
Tests comprehensively validate the urljoin() implementation that converts all
relative URLs to absolute URLs based on the final response URL.
2026-01-26 13:28:12 +05:30
Abdul Taufeeq M 09dd990273 fix(web-scrape): convert relative URLs to absolute URLs using urljoin
- Add urljoin import from urllib.parse
- Convert all extracted links to absolute URLs based on page base_url
- Use response.url as base_url to handle redirects correctly
- Fixes issue where relative links like '../page' were unusable
2026-01-26 13:09:34 +05:30
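A minimal sketch of that conversion, assuming an httpx response and BeautifulSoup for extraction (the tool's actual parsing code may differ):

```python
from urllib.parse import urljoin

import httpx
from bs4 import BeautifulSoup


def extract_links(response: httpx.Response) -> list[str]:
    """Sketch: resolve every extracted href against the final response URL."""
    soup = BeautifulSoup(response.text, "html.parser")
    base_url = str(response.url)  # final URL after redirects
    links = []
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"].strip()
        if not href:
            continue  # skip empty hrefs
        # '../page', '/about', and 'page.html' all become absolute;
        # already-absolute URLs pass through unchanged.
        links.append(urljoin(base_url, href))
    return links
```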
savan patel af3b8b1b80 Fix: Add MockLLMProvider to enable mock mode execution
- Created MockLLMProvider class that generates placeholder JSON responses
- Updated AgentRunner._setup() to use MockLLMProvider when mock_mode=True
- Added MockLLMProvider to llm module exports
- Fixes issue where agents failed with 'LLM not available' in mock mode

The MockLLMProvider extracts expected output keys from system prompts
and generates mock JSON responses for structural validation without
making real LLM API calls. This enables:
- Testing agent structure without API keys
- Fast iteration on agent graphs
- CI/CD testing without credentials
- Zero-cost structural validation

Tested with simple agent - all nodes execute successfully in mock mode.
2026-01-26 12:34:14 +05:30
Sourabsb fc539a5d7b fix: add fallback cleanup when event loop is not running
Added else branch to handle edge case where loop exists but is not running. Uses asyncio.run() as fallback to ensure cleanup happens even if the loop was stopped externally or due to an error.
2026-01-26 12:04:43 +05:30
hundao d558bf4f60 feat(tools): add CSV tools with DuckDB SQL support
Add comprehensive CSV manipulation tools:
- csv_read: Read CSV with pagination (limit/offset)
- csv_write: Create new CSV files
- csv_append: Append rows to existing CSV
- csv_info: Get CSV metadata (columns, row count, file size)
- csv_sql: Query CSV using SQL (powered by DuckDB)

Features:
- Session sandbox security (workspace_id, agent_id, session_id)
- DuckDB as optional dependency for SQL queries
- Security: Only SELECT queries allowed, dangerous keywords blocked
- Full Unicode support
- 45 tests covering all tools

Install SQL support: pip install tools[sql]
2026-01-26 14:23:18 +08:00
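A rough sketch of the SELECT-only guard; the exact blocked-keyword list is an assumption:

```python
import re

BLOCKED_KEYWORDS = re.compile(
    r"\b(insert|update|delete|drop|alter|create|attach|copy|pragma)\b",
    re.IGNORECASE,
)


def validate_sql(query: str) -> None:
    """Sketch: allow read-only SELECT queries and reject everything else."""
    stripped = query.strip().rstrip(";")
    if not stripped.lower().startswith("select"):
        raise ValueError("Only SELECT queries are allowed")
    if BLOCKED_KEYWORDS.search(stripped):
        raise ValueError("Query contains a blocked keyword")

    # Example use with DuckDB (optional dependency), once validated:
    # import duckdb
    # duckdb.sql("SELECT * FROM read_csv_auto('data.csv') LIMIT 10").fetchall()
```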
Sourabsb 99efbe03bb fix: properly close MCP session and STDIO context managers in disconnect()
Added _cleanup_stdio_async() method to properly call __aexit__() on session and stdio_context before stopping the event loop.

This prevents resource leaks, zombie processes, and unclosed file handles.
2026-01-26 11:35:28 +05:30
gaurav 5168ed3cd4 fix(tools): validate Content-Type in web_scrape tool (Closes #487) 2026-01-26 11:15:52 +05:30
vakrahul f614ee7f15 style: fix line length violation in output_cleaner.py 2026-01-26 10:50:42 +05:30
Timothy @aden 02330653ee Merge pull request #489 from TimothyZhang7/main
docs: architecture readme
2026-01-25 19:37:36 -08:00
Timothy ae37d9816e docs: architecture readme 2026-01-25 19:36:03 -08:00
RichardTang-Aden 7351675795 Merge pull request #222 from Chrishabh2002/feat/manual-agent-codefirst
Add minimal code-first agent example and isolate core dependencies
2026-01-25 19:21:22 -08:00
RichardTang-Aden fa5d5057f4 Merge pull request #447 from pradyten/fix/hallucination-detection-full-string-check
fix(graph): check entire string for code indicators in hallucination detection
2026-01-25 19:08:19 -08:00
Bryan @ Aden 854a867597 Merge pull request #293 from yumosx/graph
feat(file_system_toolkits): add encoding and max_size params to view_file
2026-01-25 18:37:42 -08:00
RichardTang-Aden 35ef467dbe Merge pull request #361 from Koushith/fix/docs-hardcoded-path-and-venv
fix(docs): remove hardcoded path and add venv troubleshooting
2026-01-25 18:19:02 -08:00
yumosx 89dbc638e1 test(file_system): add tests for file viewing edge cases
Add tests for file viewing functionality with max_size truncation, negative max_size, custom encoding, and invalid encoding scenarios to ensure proper error handling and behavior.
2026-01-26 10:14:18 +08:00
Timothy @aden 4eac1b9e97 Merge pull request #475 from aiSynergy37/fix/validate-api-key-warning
Fix validate() fallback to warn on model-specific API keys
2026-01-25 18:09:17 -08:00
mithileshk 80f938a7af Fix validate warning for model-specific API keys 2026-01-26 07:34:13 +05:30
Richard T 2f7cf3bc57 chore: remove the outdated architecture documentation 2026-01-25 17:56:18 -08:00
vakrahul 1a7ed9c962 style: fix F821 undefined name and E501 line length errors 2026-01-26 07:25:23 +05:30
Bryan @ Aden 7004fffc08 Merge pull request #402 from Shamanth-8/fix/rce-safe-eval
Unsanitized expression evaluation in EdgeSpec (RCE Vulnerability)
2026-01-25 17:53:15 -08:00
vakrahul 06535192e6 verifying 2026-01-26 07:22:04 +05:30
vakrahul 5923147a71 chore(graph): fix lint issues in retry backoff logging 2026-01-26 07:19:59 +05:30
Timothy @aden acaa89f584 Merge pull request #434 from nihalmorshed/documentation/fix-tool-name-references
docs(README): update tool names and descriptions in README inside "tools"
2026-01-25 17:47:36 -08:00
Timothy @aden e6af1f64ac Merge pull request #427 from guillermop2002/fix/remove-hardcoded-anthropic-provider
fix(llm): use LiteLLMProvider instead of hardcoded AnthropicProvider
2026-01-25 17:47:05 -08:00
Richard T 53aebd5cea docs: add issue assignment for contributors 2026-01-25 17:38:23 -08:00
Shivam Sharma d64020e024 removing the custom rule. 2026-01-26 07:05:43 +05:30
Shivam Sharma 975a002796 1. fixing quickstart.sh to use uv.
2. giving core and tools separate venv.

yeah, that's all.
2026-01-26 06:53:52 +05:30
Shivam Sharma 6e6b83848f uv sync 2026-01-26 06:53:52 +05:30
Shivam Sharma 3fb255c906 added agents.md 2026-01-26 06:53:52 +05:30
Shivam Sharma cd51d663fb make it fast using uv (package manager), ruff (linter) and ty (type
checker).

1. added an agents.md file for better AI assistance.
2. replaced pip with uv and added ty type checker.
2026-01-26 06:53:52 +05:30
Bryan @ Aden 28b0b6206b Merge pull request #470 from adenhq/chore-fix-python-tests
chore: fixed python tests
2026-01-25 17:19:57 -08:00
bryan 9859dc65e0 chore: fixed python tests 2026-01-25 17:19:21 -08:00
Bryan @ Aden 5c2288fbf5 Merge pull request #397 from Tahir-yamin/fix/respect-node-max-retries
fix(graph): Respect node_spec.max_retries configuration
2026-01-25 17:07:51 -08:00
yumosx 1b47d1cad4 Merge remote-tracking branch 'upstream/main' into graph 2026-01-26 09:06:20 +08:00
Bryan @ Aden 126bbf17c3 Merge pull request #228 from himanshu748/fix/remove-duplicate-web-search-registration
fix: remove duplicate web_search tool registration
2026-01-25 17:04:07 -08:00
Timothy 995ab8faaf fix: allow triage for all issues 2026-01-25 16:58:30 -08:00
RichardTang-Aden 9d1b1ab9d4 Merge pull request #187 from RussellLuo/improve-runtime-config
feat(skills): add support for setting `api_key` and `api_base` in RuntimeConfig
2026-01-25 16:45:21 -08:00
Bryan @ Aden 7e630b9416 Merge pull request #259 from charan2456/fix/docs-exports-clarification
docs: clarify that exports/ is user-generated, not included in repo
2026-01-25 16:27:15 -08:00
Timothy 14faca3933 fix: remove oidc token permission check 2026-01-25 16:22:19 -08:00
Timothy e8c9cc65dc chore: use GITHUB_TOKEN in action 2026-01-25 16:19:16 -08:00
RichardTang-Aden f0deedb1f8 Merge pull request #174 from AysunItai/fix/anthropic-provider-response-format
fix: align AnthropicProvider.complete with LLMProvider (response_format)
2026-01-25 16:01:16 -08:00
Bryan @ Aden 70693f4824 Merge pull request #231 from himanshu748/docs/update-skills-directory-structure
docs: update skills directory structure to match actual output
2026-01-25 15:57:27 -08:00
RichardTang-Aden 3ee380d98f Merge pull request #166 from LunaStev/translate/korean
Translate Korean
2026-01-25 15:54:14 -08:00
Timothy @aden b9b0c2c844 Merge pull request #451 from adenhq/add-claude-github-actions-1769383167894
Add claude GitHub actions 1769383167894
2026-01-25 15:41:07 -08:00
bryan c53acfdf77 set model 2026-01-25 15:39:31 -08:00
bryan 08beffea33 added claude issue triage workflow 2026-01-25 15:31:10 -08:00
Bryan @ Aden 7ed5006a70 "Claude Code Review workflow" 2026-01-25 15:19:29 -08:00
Bryan @ Aden e009de1c9a "Claude PR Assistant workflow" 2026-01-25 15:19:28 -08:00
Pradyumn Tendulkar df7b950e6f fix(graph): check entire string for code indicators in hallucination detection
Previously, the hallucination detection in SharedMemory.write() and
OutputValidator.validate_no_hallucination() only checked the first 500
characters for code indicators. This allowed hallucinated code to bypass
detection by prefixing with innocuous text.

Changes:
- Add _contains_code_indicators() method to SharedMemory and OutputValidator
- Check entire string for strings under 10KB
- Use strategic sampling (start, 25%, 50%, 75%, end) for longer strings
- Expand code indicators to include JavaScript, SQL, and HTML/script patterns
- Add comprehensive test suite with 19 test cases

Fixes #443

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 18:06:09 -05:00
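A sketch of the check described above; the indicator strings and sampling window size are illustrative assumptions:

```python
CODE_INDICATORS = ("def ", "import ", "SELECT ", "<script", "function(")

FULL_SCAN_LIMIT = 10_000  # strings under 10KB are scanned in full


def _contains_code_indicators(text: str) -> bool:
    """Sketch of the full-string / sampled check for code-like content."""
    if len(text) <= FULL_SCAN_LIMIT:
        samples = [text]
    else:
        # Strategic sampling: start, 25%, 50%, 75%, and end of the string.
        window = 2_000
        points = [0, len(text) // 4, len(text) // 2,
                  3 * len(text) // 4, len(text) - window]
        samples = [text[p : p + window] for p in points]
    return any(ind in sample for sample in samples for ind in CODE_INDICATORS)
```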
Fernando Mano 7f3bc811b0 fix(runtime): execution stream memory leak -- adjust gitignore 2026-01-25 19:42:47 -03:00
guillermop2002 f0c9d4e87f fix(llm): use LiteLLMProvider instead of hardcoded AnthropicProvider
Fixes #213
2026-01-25 22:19:29 +01:00
Nihal Morshed 57781c520e docs(README): update tool names and descriptions in README inside "tools" 2026-01-26 03:17:28 +06:00
Nihal Morshed 05b18fb312 fix(tools): remove duplicate registration of web search tool 2026-01-26 03:06:50 +06:00
Fernando Mano 829783749c fix(runtime): execution stream memory leak 2026-01-25 17:21:05 -03:00
Shamanth-8 48b38e5d95 Fix: use the safe evaluator for unsanitized expression evaluation 2026-01-25 23:56:01 +05:30
Tahir Yamin 1527a05336 fix(graph): Respect node_spec.max_retries configuration
- Remove hardcoded max_retries_per_node = 3
- Use node_spec.max_retries for all retry logic
- Add comprehensive test suite (6 test cases)
- Allows per-node retry configuration as intended

Fixes #363
2026-01-25 23:06:26 +05:00
vakrahul 491e6585a4 fix(graph): implement exponential backoff for node retries 2026-01-25 23:09:09 +05:30
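A minimal sketch of exponential backoff with jitter for node retries (the base delay and jitter range are assumptions):

```python
import asyncio
import logging
import random

logger = logging.getLogger(__name__)


async def run_with_backoff(execute, max_retries: int, base_delay: float = 1.0):
    """Sketch: retry a node with exponentially growing delays plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return await execute()
        except Exception as exc:
            if attempt == max_retries:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            logger.warning(
                "Node attempt %d/%d failed (%s); retrying in %.1fs",
                attempt + 1, max_retries, exc, delay,
            )
            await asyncio.sleep(delay)
```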
koushith 8333ba6ec2 fix(docs): remove hardcoded path and add venv troubleshooting
- Replace hardcoded /home/timothy/oss/hive/ with generic instruction
- Add troubleshooting section for PEP 668 externally-managed-environment error
- Document virtual environment setup for Python 3.12+ on macOS/WSL/Linux

Fixes #322
Fixes #355
2026-01-25 22:22:45 +05:30
yumosx a5fcb89991 feat(file_system_toolkits): add encoding and max_size params to view_file
Add support for custom file encoding and size limits when viewing files. The max_size parameter prevents loading excessively large files by truncating content and adding a warning message when the limit is exceeded. Also includes validation for negative max_size values and checks if path is a file.
2026-01-25 21:53:51 +08:00
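A sketch of the described behaviour, assuming an error-string return convention (the tool's real signature and error format may differ):

```python
from pathlib import Path


def view_file(path: str, encoding: str = "utf-8", max_size: int = 100_000) -> str:
    """Sketch: read a file with a custom encoding and a size cap."""
    if max_size < 0:
        return "Error: max_size must be non-negative"
    file_path = Path(path)
    if not file_path.is_file():
        return f"Error: {path} is not a file"
    try:
        content = file_path.read_text(encoding=encoding)
    except (LookupError, UnicodeDecodeError) as exc:
        return f"Error reading file with encoding {encoding!r}: {exc}"
    if len(content) > max_size:
        # Truncate and warn instead of loading arbitrarily large output.
        content = content[:max_size] + "\n[... truncated: file exceeds max_size ...]"
    return content
```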
kali 3fd8f9f97a fix: enhance Python detection and error handling in setup script 2026-01-25 19:11:52 +05:30
kali 2180a60c21 Fix setup-python.sh to prefer python3.12/3.11 and support PYTHON override 2026-01-25 18:56:27 +05:30
Adith L S f64820a13e fix(cli): fix KeyError 'steps' in cmd_list function
The cmd_list function stored node count as 'nodes' but tried to
access it as 'steps', causing a KeyError when listing agents.

Changed agent['steps'] to agent['nodes'] to match the dict key.
2026-01-25 18:25:03 +05:30
Kotapati Venkata Sai Charan 073be1f870 docs: clarify that exports/ is user-generated, not included in repo
Fixes #202

- Update docs/getting-started.md to explain exports/ is created by users

- Remove references to non-existent support_ticket_agent example

- Update DEVELOPER.md with correct agent creation instructions
2026-01-25 18:10:06 +05:30
himanshu748 86686fc8f9 docs: update skills directory structure to match actual output
- Update .claude/skills/ structure in getting-started.md
- Reflect actual skills generated by quickstart.sh:
  - agent-workflow/
  - building-agents-construction/
  - building-agents-core/
  - building-agents-patterns/
  - testing-agent/

Fixes #177
2026-01-25 07:10:46 -05:00
himanshu748 8fe51a8aa9 fix: remove duplicate web_search tool registration
- Remove redundant register_web_search(mcp) call on line 54
- Keep single registration with credentials parameter
- Tool implementation handles both credential sources internally
- Added clarifying comment explaining the credential handling

Fixes #172
2026-01-25 07:05:13 -05:00
Chrishabh2002 715df547bb chore: remove generated agent logs and ignore them 2026-01-25 17:23:50 +05:30
Chrishabh2002 c454870ac8 add code-first agent example and isolate core dependencies 2026-01-25 17:21:58 +05:30
RussellLuo 68766fd131 fix(skills): load MCP servers correctly
Closes #188.
2026-01-25 17:34:34 +08:00
RussellLuo ce39cb7dde feat(skills): add support for setting api_key and api_base
Closes #186.
2026-01-25 16:05:13 +08:00
patrick e1663793c7 feat: add nullable_output_keys 2026-01-24 22:43:42 +08:00
Aysun Itai e2f387965e fix: align AnthropicProvider.complete with LLMProvider (response_format)
Update AnthropicProvider.complete to accept response_format and forward it to LiteLLMProvider.
Added unit test in test_litellm_provider.py to verify parameter forwarding.
2026-01-24 11:59:53 +02:00
LunaStev e75253f16a add missed 2026-01-24 15:05:26 +09:00
LunaStev 7d416f5421 translate korean 2026-01-24 15:00:38 +09:00
Timothy @aden cdbcac68b8 Merge pull request #165 from adenhq/staging
staging to main
2026-01-23 18:40:12 -08:00
Timothy @aden d52b6e8e56 Merge pull request #164 from TimothyZhang7/feature/multi-entrypoint-arch
Feature/multi entrypoint arch
2026-01-23 18:39:37 -08:00
Timothy 510975619d fix: register mcp tools properly, load parent env 2026-01-23 18:32:04 -08:00
Timothy 49724b6da0 Merge branch 'staging' into feature/multi-entrypoint-arch 2026-01-23 17:05:33 -08:00
Richard T c84e9c96f5 feat: clean up tool testing 2026-01-23 17:00:53 -08:00
Timothy @aden 31b252c018 Merge pull request #159 from bryanadenhq/fix-json-output
Chore:Small bug fixes with json output
2026-01-23 16:59:41 -08:00
Timothy dd2254989f fix: adjust tool credential check 2026-01-23 16:56:44 -08:00
Timothy 7aa56b905c feat: framework guardrails 2026-01-23 16:31:46 -08:00
Timothy 9f4948edbe fix: agent building skills 2026-01-23 15:28:51 -08:00
RichardTang-Aden cfba965c52 Merge pull request #126 from ashrotd/add-mcp-server-tests
Test_mcp_server added with smoke tests
2026-01-23 15:10:03 -08:00
Timothy 2765c9fe93 feat: concurrent framework entrypoints 2026-01-23 15:02:55 -08:00
RichardTang-Aden bffaab6ac0 Merge pull request #107 from uttam-salamander/test/add-unit-tests-security-plan-example
test: add unit tests for security, plan, and example_tool modules
2026-01-23 14:50:27 -08:00
bryan 8f223ee564 Merge branch 'staging' into fix-json-output 2026-01-23 14:48:42 -08:00
Richard T 482a4933d5 feat: Add Ruff configuration and update .gitignore
- Add Ruff linter configuration to core/pyproject.toml
- Add uv.lock to .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 14:43:03 -08:00
bryan b0e870d1db updated output to clean json, update set goal, changed llm to llm_generate 2026-01-23 14:27:45 -08:00
RichardTang-Aden 93f0181ff5 Merge pull request #155 from iconicompany/main
feat: Add .venv to .gitignore and improve script error handling
2026-01-23 13:51:52 -08:00
Viacheslav Borisov 4b33f2a237 feat: Add .venv to .gitignore and improve script error handling
Adds the `.venv` directory to the `.gitignore` file to prevent accidental commits.

Also, enhances the `scripts/setup-python.sh` script to include error handling for the `pip install` command, providing a more informative message if the upgrade fails.
2026-01-24 01:14:08 +04:00
Timothy @aden 17bfbf9732 Merge pull request #152 from TimothyZhang7/feature/spec
Feature/spec
2026-01-23 12:09:10 -08:00
Timothy da0c0acdcf Merge branch 'staging' into feature/spec 2026-01-23 12:04:09 -08:00
Timothy @aden ea4c56108b Merge pull request #149 from bryanadenhq/fix-remove-llm-from-mcp
Fix remove llm from mcp
2026-01-23 12:00:45 -08:00
Timothy @aden 73ba72ee52 Merge pull request #151 from adenhq/main
Align staging branch
2026-01-23 11:51:32 -08:00
Timothy @aden b21c29b56a Merge pull request #150 from bryanadenhq/chore-fix-warnings
Chore fix warnings
2026-01-23 11:49:58 -08:00
bryan f83bfdf50c fixed pytest warnings 2026-01-23 11:45:02 -08:00
bryan f67e0cc4ae cli and documentation updates 2026-01-23 11:31:10 -08:00
bryan 8d4f107f63 removed all llm dependencies from mcp server 2026-01-23 11:15:24 -08:00
Timothy @aden e434579258 Merge pull request #147 from TimothyZhang7/fix/python-version
chore: requires python3.11
2026-01-23 11:13:16 -08:00
Timothy f494c80051 chore: requires python3.11 2026-01-23 11:12:03 -08:00
Timothy @aden 6cc11590cd Merge pull request #70 from HarshaKilaru/fix/grep-search-error-handling
fix(grep_search): improve grep_search error granularity and regex validation
2026-01-23 09:55:09 -08:00
Timothy @aden 9619cf903b Merge pull request #138 from adenhq/staging
Staging to main
2026-01-23 09:42:24 -08:00
Timothy @aden 8504ad7c8c Merge pull request #141 from TimothyZhang7/staging
chore: lint issues
2026-01-23 09:39:39 -08:00
Timothy 447d25d7cc chore: lint issues 2026-01-23 09:35:55 -08:00
Sriharsha Kilaru 10b9db2771 Merge branch 'main' into fix/grep-search-error-handling 2026-01-23 12:20:08 -05:00
Sriharsha Kilaru 5176b6a459 refactor: move grep_search to tools path to align with main 2026-01-23 11:59:35 -05:00
Sriharsha Kilaru b23e1edea8 chore: force GitHub merge conflict re-evaluation in grep_search 2026-01-23 11:39:54 -05:00
Sriharsha Kilaru 460ffa0260 chore: trigger merge conflict re-evaluation 2026-01-23 11:34:13 -05:00
Sriharsha Kilaru 7cab63f28d chore: manual cleanup of grep_search 2026-01-23 11:27:37 -05:00
Sriharsha Kilaru db4b79a32b fix: finalize grep_search logic and resolve merge conflict 2026-01-23 11:13:01 -05:00
Sriharsha Kilaru d669fe132e Merge branch 'main' into fix/grep-search-error-handling 2026-01-23 11:01:52 -05:00
Timothy @aden ea0b47ce05 Merge pull request #131 from bryanadenhq/chore--update-quickstart
update to quickstart
2026-01-23 07:54:22 -08:00
RichardTang-Aden c94a94cbe0 Merge pull request #65 from Samkit02/feature/robots-txt-compliance 2026-01-22 20:03:20 -08:00
Timothy 7c6c3a8cc2 feat: node I/O cleaner 2026-01-22 19:59:29 -08:00
Samkit Shah 5e4d2331d5 feature(web-scrape): add robots.txt compliance
- Add respect_robots_txt parameter (default: True)
- Implement _get_robots_parser() with caching
- Implement _is_allowed_by_robots() check
- Return clear error when blocked by robots.txt
Fixes #23
2026-01-22 21:58:32 -06:00
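A minimal sketch using the standard library's `urllib.robotparser` with a simple per-origin cache (function names mirror those mentioned above; the real implementation may differ):

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

_robots_cache: dict[str, RobotFileParser] = {}


def _get_robots_parser(url: str) -> RobotFileParser:
    """Fetch and cache robots.txt for the URL's origin (network errors not handled here)."""
    parts = urlparse(url)
    origin = f"{parts.scheme}://{parts.netloc}"
    if origin not in _robots_cache:
        parser = RobotFileParser()
        parser.set_url(f"{origin}/robots.txt")
        parser.read()
        _robots_cache[origin] = parser
    return _robots_cache[origin]


def _is_allowed_by_robots(url: str, user_agent: str = "*") -> bool:
    """Return True when robots.txt permits fetching the URL."""
    return _get_robots_parser(url).can_fetch(user_agent, url)
```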
Timothy @aden ffff7d0758 Merge pull request #68 from yumosx/main
test: add test cases for run.py
2026-01-22 19:23:18 -08:00
bryan 8051505800 update to quickstart 2026-01-22 18:59:25 -08:00
yumosx 6f4c3b117d Merge remote-tracking branch 'upstream/main' 2026-01-23 10:35:13 +08:00
yumosx 012bf5d987 fix(test_run): cast duration to int in assertion 2026-01-23 10:34:24 +08:00
Timothy 5930a3c95d chore: llm provider note 2026-01-22 16:15:52 -08:00
Timothy @aden a79f9f82b0 Merge pull request #128 from bryanadenhq/fix/testing
testing updates
2026-01-22 16:12:38 -08:00
bryan d439fc06c7 testing updates 2026-01-22 16:08:22 -08:00
RichardTang-Aden 111c38c943 Merge pull request #63 from bryanadenhq/fix/testing
update testing phase for new mcp tools
2026-01-22 14:29:11 -08:00
bryan 4cab6ec387 Merge branch 'staging' into fix/testing 2026-01-22 14:27:37 -08:00
Timothy @aden e0019fe59d Merge pull request #127 from adenhq/main
chore: sync main back to staging
2026-01-22 13:56:02 -08:00
bryan 75b37a4fbd fixes to merge 2026-01-22 13:49:50 -08:00
bryan 1f39b50dc0 Merge branch 'staging' into fix/testing 2026-01-22 13:49:39 -08:00
dhakalrabin cb80d89b72 Test_mcp_server added 2026-01-22 16:35:11 -05:00
bryan d05d4aabd7 updated testing tools to use full code 2026-01-22 13:12:53 -08:00
Timothy @aden 0a4cb748be Merge pull request #125 from TimothyZhang7/chore/fix-git-actions
fix: fully tested ruff lint
2026-01-22 12:25:19 -08:00
Timothy e3ae1c30da fix: fully tested ruff lint 2026-01-22 12:23:54 -08:00
Timothy @aden 50e6c40941 Merge pull request #123 from TimothyZhang7/chore/fix-git-actions
chore: python lint
2026-01-22 11:57:57 -08:00
Timothy 56fb8e27f4 chore: python lint 2026-01-22 11:56:28 -08:00
Timothy @aden c6a46294b6 Merge pull request #119 from TimothyZhang7/chore/fix-git-actions
chore: fix git actions
2026-01-22 11:38:35 -08:00
Timothy d2fa847cfb chore: fix git actions 2026-01-22 11:37:54 -08:00
Timothy @aden c575b2c53d Merge pull request #115 from TimothyZhang7/main
Update GitHub Actions for Python Agent Framework
2026-01-22 11:07:24 -08:00
Timothy @aden 3a322b9c32 Merge branch 'adenhq:main' into main 2026-01-22 11:05:36 -08:00
Timothy f5e887939c chore: fix git actions 2026-01-22 11:04:26 -08:00
Timothy @aden 062eb0f148 Merge pull request #114 from adenhq/staging
DOCS: Archive honeycomb and hive, update documentation for agent framework
2026-01-22 10:53:23 -08:00
Timothy @aden e176cb980f Merge pull request #113 from TimothyZhang7/staging
chore: update documents of all languages
2026-01-22 10:52:28 -08:00
Timothy 95b92c1ee2 chore: update documents of all languages 2026-01-22 10:50:43 -08:00
Timothy @aden 6aaa78c8d3 Merge pull request #112 from adenhq/staging
Staging
2026-01-22 10:07:24 -08:00
Timothy @aden c5b2b7a1f5 Merge pull request #111 from TimothyZhang7/staging
Staging
2026-01-22 10:06:46 -08:00
Timothy 47f83651ff chore: fix documentations 2026-01-22 10:04:26 -08:00
Timothy 560ff6ad34 fix: environment setup 2026-01-22 09:49:09 -08:00
Timothy 24a8f04e0a Merge branch 'fix/local-staging-backup' into staging 2026-01-22 08:26:27 -08:00
bryan 8bcec7da14 Merge branch 'staging' into fix/testing 2026-01-22 08:16:19 -08:00
Timothy 16328cbe8f refactor: aden-tools to tools 2026-01-22 08:13:06 -08:00
Timothy 9d5f36b61e refactor: temporarily archive tracing tools 2026-01-22 08:10:53 -08:00
Timothy fd36692ab0 Merge branch 'main' into staging 2026-01-22 08:02:41 -08:00
RichardTang-Aden 0cfc8eca08 Merge pull request #96 from uttam-salamander/feature/orchestrator-litellm-provider
refactor(orchestrator): use LiteLLMProvider for multi-provider support
2026-01-22 07:55:35 -08:00
Timothy @aden 57218e08d0 Merge pull request #106 from TimothyZhang7/feature/split-skills
Feature/split skills
2026-01-22 07:55:23 -08:00
Timothy @aden ffce338459 Merge pull request #67 from Samkit02/fix/docker-dev-hot-reload
fix(docker): enable dev build with production target alias
2026-01-22 07:54:17 -08:00
Uttam Kumar fc2bfc67cd test(example-tool): add unit tests for example_tool
Add 17 tests covering:
- Valid input: basic message, uppercase, repeat options
- Input validation: empty message, max length, repeat range
- Edge cases: unicode, special characters, whitespace

Closes #59
2026-01-22 08:52:13 -07:00
Uttam Kumar c02eba403a test(plan): add unit tests for Plan enums and dataclasses
Add 41 tests covering:
- Enum values: ActionType, StepStatus, ApprovalDecision, JudgmentAction, ExecutionStatus
- PlanStep.is_ready() with various dependency scenarios
- Plan.from_json() parsing and error handling
- Plan methods: get_step, get_ready_steps, is_complete, to_feedback_context
- Serialization round-trip tests

Closes #58
2026-01-22 08:52:07 -07:00
Uttam Kumar cb1cac00bf test(security): add unit tests for get_secure_path()
Add 19 tests covering:
- Happy path: session directory creation, path resolution, nested paths
- Security: path traversal attacks, symlink detection patterns
- Error handling: missing IDs, None values, empty paths

Closes #57
2026-01-22 08:51:58 -07:00
RichardTang-Aden 3a02411d1e Merge pull request #84 from AkaashThawani/fix/broken-link-docs
fix(docs): fix broken link and update file name
2026-01-22 07:42:50 -08:00
Timothy @aden c4948b6e2e Merge pull request #66 from vincentjiang777/chore/readme
new languages on readme
2026-01-22 07:41:36 -08:00
Uttam Kumar 5c11d743cd refactor(orchestrator): use LiteLLMProvider for multi-provider support
Replace AnthropicProvider with LiteLLMProvider in AgentOrchestrator to
enable support for multiple LLM providers (OpenAI, Anthropic, Gemini, etc).

- LiteLLM auto-detects provider from model name
- LiteLLM auto-detects appropriate API key from environment
- Removes restrictive ANTHROPIC_API_KEY check
- Matches pattern used in AgentRunner

Closes #47
2026-01-22 04:02:21 -07:00
Akaash Thawani d0b094424d fix(docs): fix broken link and update file name 2026-01-22 02:05:59 -08:00
Sriharsha Kilaru 4cb0ca673d fix(tools): improve grep_search error handling and regex validation
Aligned implementation with README documentation by adding specific exception handling for FileNotFoundError and PermissionError.
2026-01-22 02:36:01 -05:00
yumosx 946cf91038 test: remove unused imports and docstrings in test_run.py 2026-01-22 13:30:59 +08:00
Vincent Jiang 11ed2398dc new languages on readme 2026-01-21 21:29:04 -08:00
yumosx 4bffe17402 Merge remote-tracking branch 'origin1/main' 2026-01-22 13:26:59 +08:00
yumosx d9a58dcfe6 test: add test cases for run module 2026-01-22 13:25:00 +08:00
Samkit Shah 7fbbe63955 fix(docker): enable dev build with production target alias
The Dockerfile.dev files lacked the 'production' stage alias that
docker-compose.yml expects, causing build failures. Added 'AS production'
to enable proper dev builds with hot reload.
Fixes #26
2026-01-21 23:09:47 -06:00
Timothy @aden e4cdbff58c Merge pull request #64 from RichardTang-Aden/fix-file-tools-fix
refactor: Remove file read and write tools and update the workspace d…
2026-01-21 20:28:44 -08:00
Timothy @aden a6e40fbc8c Merge pull request #34 from RichardTang-Aden/feat-incorporate-file-system-tools
Feat incorporate file system tools (updated tool description)
2026-01-21 20:28:11 -08:00
Timothy 2356bdb3e4 fix: deprecate building-agents 2026-01-21 20:17:41 -08:00
Timothy f44c1314f9 feat: re-organize skills 2026-01-21 20:17:07 -08:00
Timothy 17fcd3f774 fix: mcp server path 2026-01-21 19:39:30 -08:00
Richard T 406ad7924c refactor: Remove file read and write tools and update the workspace directory to use the user's home directory. 2026-01-21 19:19:48 -08:00
bryan 937cbfffb6 update to gitignore 2026-01-21 19:02:29 -08:00
Timothy @aden 1c9bbd7b02 Merge pull request #52 from RichardTang-Aden/fix-dependency-issue-by-vitest
fix: fix dependency issues
2026-01-21 18:54:00 -08:00
Timothy @aden c3127ecc6a Merge pull request #53 from bryanadenhq/feat/credential-manager
updates to skills to use credentials and check tools existing
2026-01-21 17:40:43 -08:00
bryan bfa5305cac updates to skills to use credentials and check tools existing 2026-01-21 17:11:59 -08:00
Richard T 1c0eb2db61 chore: Add .dockerignore files to exclude development artifacts and include @types/node dev dependency. 2026-01-21 17:03:53 -08:00
Richard T 8263835fce fix: update formatDate label parameter type from any to ReactNode in chart components. 2026-01-21 16:49:13 -08:00
Richard T dfded8f625 fix: Updated CostTrendChart.tsx and TokenUsageChart.tsx to handle tooltip labels safely, resolving the build failures. 2026-01-21 16:46:12 -08:00
Richard T 989801dd77 fix: fix dependency issues 2026-01-21 16:40:30 -08:00
Timothy @aden 9688236f88 Merge pull request #48 from bryanadenhq/feat/credential-manager
Feat/credential manager
2026-01-21 14:46:21 -08:00
bryan 888569416b Merge branch 'staging' into feat/credential-manager 2026-01-21 14:33:28 -08:00
Timothy @aden e915e79704 Merge pull request #49 from vincentjiang777/chore/readme
new readme
2026-01-21 14:28:48 -08:00
Timothy @aden 6d939777e9 Merge pull request #50 from TimothyZhang7/feature/construction-phase
Feature/construction phase
2026-01-21 14:23:21 -08:00
Vincent Jiang 76563a3274 updated email contact@adenhq.com 2026-01-21 14:21:33 -08:00
Vincent Jiang cd2e64bcc8 changes to readme, job posts, styling 2026-01-21 14:19:11 -08:00
Timothy aa67f028ac fix: upgrade testing agent skill 2026-01-21 14:15:53 -08:00
bryan fba2751f73 documentation for credentials manager 2026-01-21 13:55:52 -08:00
bryan d1e3daa532 credentials 2026-01-21 13:52:00 -08:00
Vincent Jiang 2d3a02d6a5 Merge branch 'chore/readme' of https://github.com/vincentjiang777/hive into chore/readme 2026-01-21 13:48:02 -08:00
Vincent Jiang 6d516efe93 new changes 2026-01-21 13:47:21 -08:00
Timothy eb29dd7fff feat: align skills with direct code generation approach 2026-01-21 13:23:18 -08:00
RichardTang-Aden 2e0f3117ef Merge pull request #39 from mohamedawnallah/integrate-litellm
core: integrate `LiteLLM` provider
2026-01-21 13:08:52 -08:00
RichardTang-Aden 137d5c2d45 Merge pull request #38 from RED-ROSE515/feat/vitest
feat(honeycomb): Set up Vitest testing infrastructure
2026-01-21 11:12:29 -08:00
RichardTang-Aden d3ccb6dde0 Merge pull request #40 from gupta-piyush19/chore/adds-jsdoc-comments
chore: adds JSDoc comments to services
2026-01-21 11:05:24 -08:00
Red Rose 1e23057849 fix: update package lock file 2026-01-21 10:59:56 -08:00
Red Rose af0d5b1f11 fix: add lockfile 2026-01-21 10:55:33 -08:00
Red Rose a7ea8a53bc feat: set up vitest testing infrastructure 2026-01-21 10:53:27 -08:00
RichardTang-Aden 63206ba28e Merge pull request #42 from uttam-salamander/feature/jest-test-infrastructure
feat: Set up Jest test infrastructure
2026-01-21 10:49:51 -08:00
RichardTang-Aden b4adcb0ee2 Merge pull request #43 from SimbaChasumba1/fix/user-controller-final
Improve error type safety in user controller
2026-01-21 10:15:48 -08:00
Timothy @aden ae349e133c Merge pull request #44 from TimothyZhang7/feature/persistent-sessions
Feature/persistent sessions
2026-01-21 09:10:45 -08:00
Timothy fc0f7767b4 fix: export_graph with proper entrypoints 2026-01-21 08:20:35 -08:00
Timothy 26d0ab4419 feat(cli): basic session support, mcp integration issues 2026-01-21 07:45:41 -08:00
SimbaChasumba1 977ab30def refactor(user-controller): improve error type safety using unknown 2026-01-21 17:10:25 +02:00
Uttam Kumar 169cf970cf feat: add Jest test infrastructure
- Add jest.config.js with TypeScript support (ts-jest)
- Add tsconfig.test.json for test compilation
- Add supertest for HTTP endpoint testing
- Create test utilities for database mocking (PostgreSQL, MongoDB)
- Add example health endpoint test as template

Closes #13
2026-01-21 02:12:26 -07:00
Piyush Gupta 1f950706ff chore: adds jsdoc comments 2026-01-21 11:58:56 +05:30
Mohamed Awnallah 56a84d9991 README.md: update docs 2026-01-21 07:25:43 +02:00
Mohamed Awnallah a50bdcfc72 core: make AnthropicProvider backward compatible with litellm integration 2026-01-21 07:24:55 +02:00
Mohamed Awnallah 86d11bbf39 core: introduce litellm provider 2026-01-21 07:24:35 +02:00
Timothy @aden 967318da3e Merge pull request #36 from mohamedawnallah/clarify-magic-numbers
docs: clarify magic numbers
2026-01-20 19:32:20 -08:00
Mohamed Awnallah 93a173bd94 docs: add descriptive comments to hardcoded magic numbers
- userApi.ts: Extract DEFAULT_API_TOKEN_TTL_SECONDS constant (157680000 = 5 years)
- agentControlApi.ts: Extract DEFAULT_LOGS_LIMIT (500) and DEFAULT_AGGREGATED_LOGS_LIMIT (100) constants
- useAgentStatus.ts: Add comment explaining RECONNECT_DELAY_MS purpose

Closes #8
2026-01-21 05:19:32 +02:00
Timothy ebafd90b9f feat: persistent session data 2026-01-20 18:36:00 -08:00
Timothy @aden 0870930c1e Merge pull request #33 from bryanadenhq/feat/test-phase
initial test phase
2026-01-20 17:17:36 -08:00
Timothy @aden 948888721c Merge pull request #35 from RichardTang-Aden/feat-incorporate-file-system-tools
Feat incorporate file system tools
2026-01-20 17:03:27 -08:00
Richard T d936ebd898 Merge branch 'feat-incorporate-file-system-tools' of https://github.com/RichardTang-Aden/hive into feat-incorporate-file-system-tools 2026-01-20 16:59:02 -08:00
Richard T 3fee7b328f feat: add more detailed file tools description 2026-01-20 16:55:41 -08:00
bryan 6aa5634363 Merge branch 'main' into feat/test-phase 2026-01-20 16:36:27 -08:00
bryan e2945b6c99 initial test phase 2026-01-20 16:28:21 -08:00
503 changed files with 61772 additions and 35740 deletions
+15
@@ -0,0 +1,15 @@
{
"hooks": {
"PostToolUse": [
{
"matcher": "Edit|Write|NotebookEdit",
"hooks": [
{
"type": "command",
"command": "ruff check --fix \"$CLAUDE_FILE_PATH\" 2>/dev/null; ruff format \"$CLAUDE_FILE_PATH\" 2>/dev/null; true"
}
]
}
]
}
}
+46
@@ -0,0 +1,46 @@
{
"permissions": {
"allow": [
"Bash(npm install:*)",
"Bash(npm test:*)",
"Skill(building-agents-construction)",
"Skill(building-agents-construction:*)",
"Bash(PYTHONPATH=core:exports pytest:*)",
"mcp__agent-builder__create_session",
"mcp__agent-builder__get_session_status",
"mcp__agent-builder__set_goal",
"mcp__agent-builder__list_mcp_servers",
"mcp__agent-builder__test_node",
"mcp__agent-builder__add_node",
"mcp__agent-builder__add_edge",
"mcp__agent-builder__validate_graph",
"Bash(ruff check:*)",
"Bash(PYTHONPATH=core:exports python:*)",
"mcp__agent-builder__list_tests",
"mcp__agent-builder__generate_constraint_tests",
"Bash(python -m agent:*)",
"Bash(python agent.py:*)",
"Bash(python -c:*)",
"Bash(done)",
"Bash(xargs cat:*)",
"mcp__agent-builder__list_mcp_tools",
"mcp__agent-builder__add_mcp_server",
"Bash(gh issue list:*)",
"WebFetch(domain:github.com)",
"Bash(pip install:*)",
"Bash(python -m pytest:*)",
"Bash(git checkout:*)",
"Bash(git add:*)",
"Bash(git commit -m \"$\\(cat <<''EOF''\nfeat\\(tools\\): Add Excel tool for spreadsheet operations\n\nAdds a new Excel tool for reading and manipulating .xlsx/.xlsm files:\n- excel_read: Read Excel files with pagination and sheet selection\n- excel_write: Create new Excel files with data\n- excel_append: Append rows to existing files\n- excel_info: Get metadata about Excel files \\(sheets, columns, row counts\\)\n- excel_sheet_list: List all sheets in a workbook\n\nIncludes comprehensive test coverage \\(37 tests\\) and documentation.\n\nReferences #2805\n\nCo-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>\nEOF\n\\)\")",
"Bash(git push:*)",
"Bash(git pull:*)",
"Bash(git stash:*)",
"Bash(git merge:*)"
]
},
"enableAllProjectMcpServers": true,
"enabledMcpjsonServers": [
"agent-builder",
"tools"
]
}
+463
@@ -0,0 +1,463 @@
---
name: agent-workflow
description: Complete workflow for building, implementing, and testing goal-driven agents. Orchestrates building-agents-* and testing-agent skills. Use when starting a new agent project, unsure which skill to use, or need end-to-end guidance.
license: Apache-2.0
metadata:
  author: hive
  version: "2.0"
  type: workflow-orchestrator
  orchestrates:
    - building-agents-core
    - building-agents-construction
    - building-agents-patterns
    - testing-agent
    - setup-credentials
---
# Agent Development Workflow
Complete Standard Operating Procedure (SOP) for building production-ready goal-driven agents.
## Overview
This workflow orchestrates specialized skills to take you from initial concept to production-ready agent:
1. **Understand Concepts** → `/building-agents-core` (optional)
2. **Build Structure** → `/building-agents-construction`
3. **Optimize Design** → `/building-agents-patterns` (optional)
4. **Setup Credentials** → `/setup-credentials` (if agent uses tools requiring API keys)
5. **Test & Validate** → `/testing-agent`
## When to Use This Workflow
Use this meta-skill when:
- Starting a new agent from scratch
- Unclear which skill to use first
- Need end-to-end guidance for agent development
- Want consistent, repeatable agent builds
**Skip this workflow** if:
- You only need to test an existing agent → use `/testing-agent` directly
- You know exactly which phase you're in → use specific skill directly
## Quick Decision Tree
```
"Need to understand agent concepts" → building-agents-core
"Build a new agent" → building-agents-construction
"Optimize my agent design" → building-agents-patterns
"Set up API keys for my agent" → setup-credentials
"Test my agent" → testing-agent
"Not sure what I need" → Read phases below, then decide
"Agent has structure but needs implementation" → See agent directory STATUS.md
```
## Phase 0: Understand Concepts (Optional)
**Duration**: 5-10 minutes
**Skill**: `/building-agents-core`
**Input**: Questions about agent architecture
### When to Use
- First time building an agent
- Need to understand node types, edges, goals
- Want to validate tool availability
- Learning about pause/resume architecture
### What This Phase Provides
- Architecture overview (Python packages, not JSON)
- Core concepts (Goal, Node, Edge, Pause/Resume)
- Tool discovery and validation procedures
- Workflow overview
**Skip this phase** if you already understand agent fundamentals.
## Phase 1: Build Agent Structure
**Duration**: 15-30 minutes
**Skill**: `/building-agents-construction`
**Input**: User requirements ("Build an agent that...")
### What This Phase Does
Creates the complete agent architecture:
- Package structure (`exports/agent_name/`)
- Goal with success criteria and constraints
- Workflow graph (nodes and edges)
- Node specifications
- CLI interface
- Documentation
### Process
1. **Create package** - Directory structure with skeleton files
2. **Define goal** - Success criteria and constraints written to agent.py
3. **Design nodes** - Each node approved and written incrementally
4. **Connect edges** - Workflow graph with conditional routing
5. **Finalize** - Agent class, exports, and documentation
### Outputs
- ✅ `exports/agent_name/` package created
- ✅ Goal defined in agent.py
- ✅ 3-5 success criteria defined
- ✅ 1-5 constraints defined
- ✅ 5-10 nodes specified in nodes/__init__.py
- ✅ 8-15 edges connecting workflow
- ✅ Validated structure (passes `python -m agent_name validate`)
- ✅ README.md with usage instructions
- ✅ CLI commands (info, validate, run, shell)
### Success Criteria
You're ready for Phase 2 when:
- Agent structure validates without errors
- All nodes and edges are defined
- CLI commands work (info, validate)
- You see: "Agent complete: exports/agent_name/"
### Common Outputs
The building-agents-construction skill produces:
```
exports/agent_name/
├── __init__.py (package exports)
├── __main__.py (CLI interface)
├── agent.py (goal, graph, agent class)
├── nodes/__init__.py (node specifications)
├── config.py (configuration)
├── implementations.py (may be created for Python functions)
└── README.md (documentation)
```
### Next Steps
**If structure complete and validated:**
→ Check `exports/agent_name/STATUS.md` or `IMPLEMENTATION_GUIDE.md`
→ These files explain implementation options
→ You may need to add Python functions or MCP tools (not covered by current skills)
**If want to optimize design:**
→ Proceed to Phase 1.5 (building-agents-patterns)
**If ready to test:**
→ Proceed to Phase 2
## Phase 1.5: Optimize Design (Optional)
**Duration**: 10-15 minutes
**Skill**: `/building-agents-patterns`
**Input**: Completed agent structure
### When to Use
- Want to add pause/resume functionality
- Need error handling patterns
- Want to optimize performance
- Need examples of complex routing
- Want best practices guidance
### What This Phase Provides
- Practical examples and patterns
- Pause/resume architecture
- Error handling strategies
- Anti-patterns to avoid
- Performance optimization techniques
**Skip this phase** if your agent design is straightforward.
## Phase 2: Test & Validate
**Duration**: 20-40 minutes
**Skill**: `/testing-agent`
**Input**: Working agent from Phase 1
### What This Phase Does
Creates comprehensive test suite:
- Constraint tests (verify hard requirements)
- Success criteria tests (measure goal achievement)
- Edge case tests (handle failures gracefully)
- Integration tests (end-to-end workflows)
### Process
1. **Analyze agent** - Read goal, constraints, success criteria
2. **Generate tests** - Create pytest files in `exports/agent_name/tests/`
3. **User approval** - Review and approve each test
4. **Run evaluation** - Execute tests and collect results
5. **Debug failures** - Identify and fix issues
6. **Iterate** - Repeat until all tests pass
### Outputs
- ✅ Test files in `exports/agent_name/tests/`
- ✅ Test report with pass/fail metrics
- ✅ Coverage of all success criteria
- ✅ Coverage of all constraints
- ✅ Edge case handling verified
### Success Criteria
You're done when:
- All tests pass
- All success criteria validated
- All constraints verified
- Agent handles edge cases
- Test coverage is comprehensive
### Next Steps
**Agent ready for:**
- Production deployment
- Integration into larger systems
- Documentation and handoff
- Continuous monitoring
## Phase Transitions
### From Phase 1 to Phase 2
**Trigger signals:**
- "Agent complete: exports/..."
- Structure validation passes
- README indicates implementation complete
**Before proceeding:**
- Verify agent can be imported: `from exports.agent_name import default_agent`
- Check if implementation is needed (see STATUS.md or IMPLEMENTATION_GUIDE.md)
- Confirm agent executes without import errors
### Skipping Phases
**When to skip Phase 1:**
- Agent structure already exists
- Only need to add tests
- Modifying existing agent
**When to skip Phase 2:**
- Prototyping or exploring
- Agent not production-bound
- Manual testing sufficient
## Common Patterns
### Pattern 1: Complete New Build (Simple)
```
User: "Build an agent that monitors files"
→ Use /building-agents-construction
→ Agent structure created
→ Use /testing-agent
→ Tests created and passing
→ Done: Production-ready agent
```
### Pattern 1b: Complete New Build (With Learning)
```
User: "Build an agent (first time)"
→ Use /building-agents-core (understand concepts)
→ Use /building-agents-construction (build structure)
→ Use /building-agents-patterns (optimize design)
→ Use /testing-agent (validate)
→ Done: Production-ready agent
```
### Pattern 2: Test Existing Agent
```
User: "Test my agent at exports/my_agent"
→ Skip Phase 1
→ Use /testing-agent directly
→ Tests created
→ Done: Validated agent
```
### Pattern 3: Iterative Development
```
User: "Build an agent"
→ Use /building-agents-construction (Phase 1)
→ Implementation needed (see STATUS.md)
→ [User implements functions]
→ Use /testing-agent (Phase 2)
→ Tests reveal bugs
→ [Fix bugs manually]
→ Re-run tests
→ Done: Working agent
```
### Pattern 4: Complex Agent with Patterns
```
User: "Build an agent with multi-turn conversations"
→ Use /building-agents-core (learn pause/resume)
→ Use /building-agents-construction (build structure)
→ Use /building-agents-patterns (implement pause/resume pattern)
→ Use /testing-agent (validate conversation flows)
→ Done: Complex conversational agent
```
## Skill Dependencies
```
agent-workflow (meta-skill)
├── building-agents-core (foundational)
│ ├── Architecture concepts
│ ├── Node/Edge/Goal definitions
│ ├── Tool discovery procedures
│ └── Workflow overview
├── building-agents-construction (procedural)
│ ├── Creates package structure
│ ├── Defines goal
│ ├── Adds nodes incrementally
│ ├── Connects edges
│ ├── Finalizes agent class
│ └── Requires: building-agents-core
├── building-agents-patterns (reference)
│ ├── Best practices
│ ├── Pause/resume patterns
│ ├── Error handling
│ ├── Anti-patterns
│ └── Performance optimization
└── testing-agent
├── Reads agent goal
├── Generates tests
├── Runs evaluation
└── Reports results
```
## Troubleshooting
### "Agent structure won't validate"
- Check node IDs match between nodes/__init__.py and agent.py
- Verify all edges reference valid node IDs
- Ensure entry_node exists in nodes list
- Run: `PYTHONPATH=core:exports python -m agent_name validate`
### "Agent has structure but won't run"
- Check for STATUS.md or IMPLEMENTATION_GUIDE.md in agent directory
- Implementation may be needed (Python functions or MCP tools)
- This is expected - building-agents-construction creates structure, not implementation
- See implementation guide for completion options
### "Tests are failing"
- Review test output for specific failures
- Check agent goal and success criteria
- Verify constraints are met
- Use `/testing-agent` to debug and iterate
- Fix agent code and re-run tests
### "Not sure which phase I'm in"
Run these checks:
```bash
# Check if agent structure exists
ls exports/my_agent/agent.py
# Check if it validates
PYTHONPATH=core:exports python -m my_agent validate
# Check if tests exist
ls exports/my_agent/tests/
# If structure exists and validates → Phase 2 (testing)
# If structure doesn't exist → Phase 1 (building)
# If tests exist but failing → Debug phase
```
## Best Practices
### For Phase 1 (Building)
1. **Start with clear requirements** - Know what the agent should do
2. **Define success criteria early** - Measurable goals drive design
3. **Keep nodes focused** - One responsibility per node
4. **Use descriptive names** - Node IDs should explain purpose
5. **Validate incrementally** - Check structure after each major addition
### For Phase 2 (Testing)
1. **Test constraints first** - Hard requirements must pass
2. **Mock external dependencies** - Use mock mode for LLMs/APIs
3. **Cover edge cases** - Test failures, not just success paths
4. **Iterate quickly** - Fix one test at a time
5. **Document test patterns** - Future tests follow same structure
### General Workflow
1. **Use version control** - Git commit after each phase
2. **Document decisions** - Update README with changes
3. **Keep iterations small** - Build → Test → Fix → Repeat
4. **Preserve working states** - Tag successful iterations
5. **Learn from failures** - Failed tests reveal design issues
## Exit Criteria
You're done with the workflow when:
✅ Agent structure validates
✅ All tests pass
✅ Success criteria met
✅ Constraints verified
✅ Documentation complete
✅ Agent ready for deployment
## Additional Resources
- **building-agents-core**: See `.claude/skills/building-agents-core/SKILL.md`
- **building-agents-construction**: See `.claude/skills/building-agents-construction/SKILL.md`
- **building-agents-patterns**: See `.claude/skills/building-agents-patterns/SKILL.md`
- **testing-agent**: See `.claude/skills/testing-agent/SKILL.md`
- **Agent framework docs**: See `core/README.md`
- **Example agents**: See `exports/` directory
## Summary
This workflow provides a proven path from concept to production-ready agent:
1. **Learn** with `/building-agents-core` → Understand fundamentals (optional)
2. **Build** with `/building-agents-construction` → Get validated structure
3. **Optimize** with `/building-agents-patterns` → Apply best practices (optional)
4. **Test** with `/testing-agent` → Get verified functionality
The workflow is **flexible** - skip phases as needed, iterate freely, and adapt to your specific requirements. The goal is **production-ready agents** built with **consistent, repeatable processes**.
## Skill Selection Guide
**Choose building-agents-core when:**
- First time building agents
- Need to understand architecture
- Validating tool availability
- Learning about node types and edges
**Choose building-agents-construction when:**
- Actually building an agent
- Have clear requirements
- Ready to write code
- Want step-by-step guidance
**Choose building-agents-patterns when:**
- Agent structure complete
- Need advanced patterns
- Implementing pause/resume
- Optimizing performance
- Want best practices
**Choose testing-agent when:**
- Agent structure complete
- Ready to validate functionality
- Need comprehensive test coverage
- Debugging agent behavior
@@ -0,0 +1,199 @@
# Example: File Monitor Agent
This example shows the complete agent-workflow in action for building a file monitoring agent.
## Initial Request
```
User: "Build an agent that monitors ~/Downloads and copies new files to ~/Documents"
```
## Phase 1: Building (20 minutes)
### Step 1: Create Structure
Agent invokes `/building-agents-construction` skill and:
1. Creates `exports/file_monitor_agent/` package
2. Writes skeleton files (__init__.py, __main__.py, agent.py, etc.)
**Output**: Package structure visible immediately
### Step 2: Define Goal
```python
goal = Goal(
id="file-monitor-copy",
name="Automated File Monitor & Copy",
success_criteria=[
# 100% detection rate
# 100% copy success
# 100% conflict resolution
# >99% uptime
],
constraints=[
# Preserve originals
# Handle errors gracefully
# Track state
# Respect permissions
]
)
```
**Output**: Goal written to agent.py
### Step 3: Design Nodes
7 nodes approved and written incrementally:
1. `initialize-state` - Set up tracking
2. `list-downloads` - Scan directory
3. `identify-new-files` - Find new files
4. `check-for-new-files` - Router
5. `copy-files` - Copy with conflict resolution
6. `update-state` - Mark as processed
7. `wait-interval` - Sleep between cycles
**Output**: All nodes in nodes/__init__.py
### Step 4: Connect Edges
8 edges connecting the workflow loop:
```
initialize → list → identify → check
                       ↓          ↓
                     copy        wait
                       ↓          ↑
                    update        ↓
                       ↓          ↓
                     wait → list (loop)
```
**Output**: Edges written to agent.py
### Step 5: Finalize
```bash
$ PYTHONPATH=core:exports python -m file_monitor_agent validate
✓ Agent is valid
$ PYTHONPATH=core:exports python -m file_monitor_agent info
Agent: File Monitor & Copy Agent
Nodes: 7
Edges: 8
```
**Phase 1 Complete**: Structure validated ✅
### Status After Phase 1
```
exports/file_monitor_agent/
├── __init__.py ✅ (exports)
├── __main__.py ✅ (CLI)
├── agent.py ✅ (goal, graph, agent class)
├── nodes/__init__.py ✅ (7 nodes)
├── config.py ✅ (configuration)
├── implementations.py ✅ (Python functions)
├── README.md ✅ (documentation)
├── IMPLEMENTATION_GUIDE.md ✅ (next steps)
└── STATUS.md ✅ (current state)
```
**Note**: Implementation gap exists - data flow needs connection (covered in STATUS.md)
## Phase 2: Testing (25 minutes)
### Step 1: Analyze Agent
Agent invokes `/testing-agent` skill and:
1. Reads goal from `exports/file_monitor_agent/agent.py`
2. Identifies 4 success criteria to test
3. Identifies 4 constraints to verify
4. Plans test coverage
### Step 2: Generate Tests
Creates test files:
```
exports/file_monitor_agent/tests/
├── conftest.py (fixtures)
├── test_constraints.py (4 constraint tests)
├── test_success_criteria.py (4 success tests)
└── test_edge_cases.py (error handling)
```
Tests approved incrementally by user.
### Step 3: Run Tests
```bash
$ PYTHONPATH=core:exports pytest exports/file_monitor_agent/tests/
test_constraints.py::test_preserves_originals PASSED
test_constraints.py::test_handles_errors PASSED
test_constraints.py::test_tracks_state PASSED
test_constraints.py::test_respects_permissions PASSED
test_success_criteria.py::test_detects_all_files PASSED
test_success_criteria.py::test_copies_all_files PASSED
test_success_criteria.py::test_resolves_conflicts PASSED
test_success_criteria.py::test_continuous_run PASSED
test_edge_cases.py::test_empty_directory PASSED
test_edge_cases.py::test_permission_denied PASSED
test_edge_cases.py::test_disk_full PASSED
test_edge_cases.py::test_large_files PASSED
========================== 12 passed in 3.42s ==========================
```
**Phase 2 Complete**: All tests pass ✅
## Final Output
**Production-Ready Agent:**
```bash
# Run the agent
./RUN_AGENT.sh
# Or manually
PYTHONPATH=core:exports:tools/src python -m file_monitor_agent run
```
**Capabilities:**
- Monitors ~/Downloads continuously
- Copies new files to ~/Documents
- Resolves conflicts with timestamps
- Handles errors gracefully
- Tracks processed files
- Runs as background service
**Total Time**: ~45 minutes from concept to production
## Key Learnings
1. **Incremental building** - Files written immediately, visible throughout
2. **Validation early** - Structure validated before moving to implementation
3. **Test-driven** - Tests reveal real behavior
4. **Documentation included** - README, STATUS, and guides auto-generated
5. **Repeatable process** - Same workflow for any agent type
## Variations
**For simpler agents:**
- Fewer nodes (3-5 instead of 7)
- Simpler workflow (linear instead of looping)
- Faster build time (10-15 minutes)
**For complex agents:**
- More nodes (10-15+)
- Multiple subgraphs
- Pause/resume points for human-in-the-loop
- Longer build time (45-60 minutes)
The workflow scales to your needs!
-1
@@ -1 +0,0 @@
../../core/.claude/skills/building-agents
@@ -0,0 +1,361 @@
---
name: building-agents-construction
description: Step-by-step guide for building goal-driven agents. Creates package structure, defines goals, adds nodes, connects edges, and finalizes agent class. Use when actively building an agent.
license: Apache-2.0
metadata:
  author: hive
  version: "2.0"
  type: procedural
  part_of: building-agents
  requires: building-agents-core
---
# Agent Construction - EXECUTE THESE STEPS
**THIS IS AN EXECUTABLE WORKFLOW. DO NOT DISPLAY THIS FILE. EXECUTE THE STEPS BELOW.**
When this skill is loaded, IMMEDIATELY begin executing Step 1. Do not explain what you will do - just do it.
---
## STEP 1: Initialize Build Environment
**EXECUTE THESE TOOL CALLS NOW:**
1. Register the hive-tools MCP server:
```
mcp__agent-builder__add_mcp_server(
name="hive-tools",
transport="stdio",
command="python",
args='["mcp_server.py", "--stdio"]',
cwd="tools",
description="Hive tools MCP server"
)
```
2. Create a build session (replace AGENT_NAME with the user's requested agent name in snake_case):
```
mcp__agent-builder__create_session(name="AGENT_NAME")
```
3. Discover available tools:
```
mcp__agent-builder__list_mcp_tools()
```
4. Create the package directory:
```
mkdir -p exports/AGENT_NAME/nodes
```
**AFTER completing these calls**, tell the user:
> ✅ Build environment initialized
>
> - Session created
> - Available tools: [list the tools from step 3]
>
> Proceeding to define the agent goal...
**THEN immediately proceed to STEP 2.**
---
## STEP 2: Define and Approve Goal
**PROPOSE a goal to the user.** Based on what they asked for, propose:
- Goal ID (kebab-case)
- Goal name
- Goal description
- 3-5 success criteria (each with: id, description, metric, target, weight)
- 2-4 constraints (each with: id, description, constraint_type, category)
**FORMAT your proposal as a clear summary, then ask for approval:**
> **Proposed Goal: [Name]**
>
> [Description]
>
> **Success Criteria:**
>
> 1. [criterion 1]
> 2. [criterion 2]
> ...
>
> **Constraints:**
>
> 1. [constraint 1]
> 2. [constraint 2]
> ...
**THEN call AskUserQuestion:**
```
AskUserQuestion(questions=[{
"question": "Do you approve this goal definition?",
"header": "Goal",
"options": [
{"label": "Approve", "description": "Goal looks good, proceed"},
{"label": "Modify", "description": "I want to change something"}
],
"multiSelect": false
}])
```
**WAIT for user response.**
- If **Approve**: Call `mcp__agent-builder__set_goal(...)` with the goal details (a sketch follows below), then proceed to STEP 3
- If **Modify**: Ask what they want to change, update proposal, ask again
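A rough sketch of the `set_goal` call (the parameter names here are an assumption; they simply mirror the JSON-string convention the other `mcp__agent-builder__*` calls use, so check the tool's actual schema before calling):
```
mcp__agent-builder__set_goal(
    goal_id="file-backup",
    name="File Backup Agent",
    description="Monitor a folder and safely copy new files",
    success_criteria='[{"id": "copy-accuracy", "description": "Every new file is copied", "metric": "copy_rate", "target": "100%", "weight": 0.4}]',
    constraints='[{"id": "no-overwrite", "description": "Never overwrite existing files", "constraint_type": "hard", "category": "safety"}]'
)
```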
---
## STEP 3: Design Node Workflow
**BEFORE designing nodes**, review the available tools from Step 1. Nodes can ONLY use tools that exist.
**DESIGN the workflow** as a series of nodes. For each node, determine:
- node_id (kebab-case)
- name
- description
- node_type: `"llm_generate"` (no tools) or `"llm_tool_use"` (uses tools)
- input_keys (what data this node receives)
- output_keys (what data this node produces)
- tools (ONLY tools that exist - empty list for llm_generate)
- system_prompt
**PRESENT the workflow to the user:**
> **Proposed Workflow: [N] nodes**
>
> 1. **[node-id]** - [description]
>
> - Type: [llm_generate/llm_tool_use]
> - Input: [keys]
> - Output: [keys]
> - Tools: [tools or "none"]
>
> 2. **[node-id]** - [description]
> ...
>
> **Flow:** node1 → node2 → node3 → ...
**THEN call AskUserQuestion:**
```
AskUserQuestion(questions=[{
"question": "Do you approve this workflow design?",
"header": "Workflow",
"options": [
{"label": "Approve", "description": "Workflow looks good, proceed to build nodes"},
{"label": "Modify", "description": "I want to change the workflow"}
],
"multiSelect": false
}])
```
**WAIT for user response.**
- If **Approve**: Proceed to STEP 4
- If **Modify**: Ask what they want to change, update design, ask again
---
## STEP 4: Build Nodes One by One
**FOR EACH node in the approved workflow:**
1. **Call** `mcp__agent-builder__add_node(...)` with the node details (a full example call is sketched after this list)
- input_keys and output_keys must be JSON strings: `'["key1", "key2"]'`
- tools must be a JSON string: `'["tool1"]'` or `'[]'`
2. **Call** `mcp__agent-builder__test_node(...)` to validate:
```
mcp__agent-builder__test_node(
node_id="the-node-id",
test_input='{"key": "test value"}',
mock_llm_response='{"output_key": "test output"}'
)
```
3. **Check result:**
- If valid: Tell user "✅ Node [id] validated" and continue to next node
- If invalid: Show errors, fix the node, re-validate
4. **Show progress** after each node:
```
mcp__agent-builder__get_session_status()
```
> ✅ Node [X] of [Y] complete: [node-id]
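As a concrete sketch, an `add_node` call for a first node might look like this (parameter names follow the node fields listed in STEP 3; verify them against the tool's schema):
```
mcp__agent-builder__add_node(
    node_id="parse-query",
    name="Parse Query",
    description="Analyze the topic and generate search queries",
    node_type="llm_generate",
    input_keys='["topic"]',
    output_keys='["search_queries"]',
    tools='[]',
    system_prompt="CRITICAL: Return ONLY raw JSON with a search_queries array."
)
```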
**AFTER all nodes are added and validated**, proceed to STEP 5.
---
## STEP 5: Connect Edges
**DETERMINE the edges** based on the workflow flow. For each connection:
- edge_id (kebab-case)
- source (node that outputs)
- target (node that receives)
- condition: `"on_success"`, `"always"`, `"on_failure"`, or `"conditional"`
- condition_expr (Python expression, only if conditional)
- priority (integer, lower = higher priority)
**FOR EACH edge, call:**
```
mcp__agent-builder__add_edge(
edge_id="source-to-target",
source="source-node-id",
target="target-node-id",
condition="on_success",
condition_expr="",
priority=1
)
```
**AFTER all edges are added, validate the graph:**
```
mcp__agent-builder__validate_graph()
```
- If valid: Tell user "✅ Graph structure validated" and proceed to STEP 6
- If invalid: Show errors, fix edges, re-validate
---
## STEP 6: Generate Agent Package
**EXPORT the graph data:**
```
mcp__agent-builder__export_graph()
```
This returns JSON containing the goal, nodes, edges, and MCP server configurations.
**THEN write the Python package files** using the exported data. Create these files in `exports/AGENT_NAME/`:
1. `config.py` - Runtime configuration with model settings
2. `nodes/__init__.py` - All NodeSpec definitions
3. `agent.py` - Goal, edges, graph config, and agent class
4. `__init__.py` - Package exports
5. `__main__.py` - CLI interface
6. `mcp_servers.json` - MCP server configurations
7. `README.md` - Usage documentation
**IMPORTANT entry_points format:**
- MUST be: `{"start": "first-node-id"}`
- NOT: `{"first-node-id": ["input_keys"]}` (WRONG)
- NOT: `{"first-node-id"}` (WRONG - this is a set)
**Use the example agent** at `.claude/skills/building-agents-construction/examples/online_research_agent/` as a template for file structure and patterns.
**AFTER writing all files, tell the user:**
> ✅ Agent package created: `exports/AGENT_NAME/`
>
> **Files generated:**
>
> - `__init__.py` - Package exports
> - `agent.py` - Goal, nodes, edges, agent class
> - `config.py` - Runtime configuration
> - `__main__.py` - CLI interface
> - `nodes/__init__.py` - Node definitions
> - `mcp_servers.json` - MCP server config
> - `README.md` - Usage documentation
>
> **Test your agent:**
>
> ```bash
> cd /home/timothy/oss/hive
> PYTHONPATH=core:exports python -m AGENT_NAME validate
> PYTHONPATH=core:exports python -m AGENT_NAME info
> ```
---
## STEP 7: Verify and Test
**RUN validation:**
```bash
cd /home/timothy/oss/hive && PYTHONPATH=core:exports python -m AGENT_NAME validate
```
- If valid: Agent is complete!
- If errors: Fix the issues and re-run
**SHOW final session summary:**
```
mcp__agent-builder__get_session_status()
```
**TELL the user the agent is ready** and suggest next steps:
- Run with mock mode to test without API calls (see the example below)
- Use `/testing-agent` skill for comprehensive testing
- Use `/setup-credentials` if the agent needs API keys
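For the mock run, the exact flags depend on the CLI you generated; for the research-agent example used as the template, it would look roughly like:
```bash
cd /home/timothy/oss/hive
PYTHONPATH=core:exports python -m AGENT_NAME run --topic "test topic" --mock
```
(`--topic` is specific to that example; substitute your agent's own input options.)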
---
## REFERENCE: Node Types
| Type | tools param | Use when |
| -------------- | ---------------------- | ---------------------------------------------- |
| `llm_generate` | `'[]'` | Pure reasoning, JSON output, no external calls |
| `llm_tool_use` | `'["tool1", "tool2"]'` | Needs to call MCP tools |
---
## REFERENCE: Edge Conditions
| Condition | When edge is followed |
| ------------- | ------------------------------------- |
| `on_success` | Source node completed successfully |
| `on_failure` | Source node failed |
| `always` | Always, regardless of success/failure |
| `conditional` | When condition_expr evaluates to True |
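For example, a conditional edge uses the same `add_edge` call from STEP 5, with a Python expression over the workflow data (the expression here is illustrative):
```
mcp__agent-builder__add_edge(
    edge_id="check-to-process",
    source="check-new-files",
    target="process-files",
    condition="conditional",
    condition_expr="new_files_count > 0",
    priority=1
)
```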
---
## REFERENCE: System Prompt Best Practice
For nodes with JSON output, include this in the system_prompt:
```
CRITICAL: Return ONLY raw JSON. NO markdown, NO code blocks.
Just the JSON object starting with { and ending with }.
Return this exact structure:
{
"key1": "...",
"key2": "..."
}
```
---
## COMMON MISTAKES TO AVOID
1. **Using tools that don't exist** - Always check `mcp__agent-builder__list_mcp_tools()` first
2. **Wrong entry_points format** - Must be `{"start": "node-id"}`, NOT a set or list
3. **Skipping validation** - Always validate nodes and graph before proceeding
4. **Not waiting for approval** - Always ask user before major steps
5. **Displaying this file** - Execute the steps, don't show documentation
@@ -0,0 +1,80 @@
# Online Research Agent
Deep-dive research agent that searches 10+ sources and produces comprehensive narrative reports with citations.
## Features
- Generates multiple search queries from a topic
- Searches and fetches 15+ web sources
- Evaluates and ranks sources by relevance
- Synthesizes findings into themes
- Writes narrative report with numbered citations
- Quality checks for uncited claims
- Saves report to local markdown file
## Usage
### CLI
```bash
# Show agent info
python -m online_research_agent info
# Validate structure
python -m online_research_agent validate
# Run research on a topic
python -m online_research_agent run --topic "impact of AI on healthcare"
# Interactive shell
python -m online_research_agent shell
```
### Python API
```python
from online_research_agent import default_agent
# Simple usage
result = await default_agent.run({"topic": "climate change solutions"})
# Check output
if result.success:
print(f"Report saved to: {result.output['file_path']}")
print(result.output['final_report'])
```
## Workflow
```
parse-query → search-sources → fetch-content → evaluate-sources
                                                      ↓
      write-report ← synthesize-findings ←────────────┘
           ↓
      quality-check → save-report
```
## Output
Reports are saved to `./research_reports/` as markdown files with:
1. Executive Summary
2. Introduction
3. Key Findings (by theme)
4. Analysis
5. Conclusion
6. References
## Requirements
- Python 3.11+
- LLM provider API key (Groq, Cerebras, etc.)
- Internet access for web search/fetch
## Configuration
Edit `config.py` to change:
- `model`: LLM model (default: the preferred model from `~/.hive/configuration.json`, falling back to `anthropic/claude-sonnet-4-20250514`)
- `temperature`: Generation temperature (default: 0.7)
- `max_tokens`: Max tokens per response (default: 8192)
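For instance, the same settings can be overridden programmatically instead of editing the file (fields as defined in `config.py`):
```python
from online_research_agent import OnlineResearchAgent, RuntimeConfig

# Tighter, shorter generations than the defaults
agent = OnlineResearchAgent(config=RuntimeConfig(temperature=0.3, max_tokens=4096))
```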
@@ -0,0 +1,23 @@
"""
Online Research Agent - Deep-dive research with narrative reports.
Research any topic by searching multiple sources, synthesizing information,
and producing a well-structured narrative report with citations.
"""
from .agent import OnlineResearchAgent, default_agent, goal, nodes, edges
from .config import RuntimeConfig, AgentMetadata, default_config, metadata
__version__ = "1.0.0"
__all__ = [
"OnlineResearchAgent",
"default_agent",
"goal",
"nodes",
"edges",
"RuntimeConfig",
"AgentMetadata",
"default_config",
"metadata",
]
@@ -0,0 +1,158 @@
"""
CLI entry point for Online Research Agent.
Uses AgentRuntime for multi-entrypoint support with HITL pause/resume.
"""
import asyncio
import json
import logging
import sys
import click
from .agent import default_agent, OnlineResearchAgent
def setup_logging(verbose=False, debug=False):
"""Configure logging for execution visibility."""
if debug:
level, fmt = logging.DEBUG, "%(asctime)s %(name)s: %(message)s"
elif verbose:
level, fmt = logging.INFO, "%(message)s"
else:
level, fmt = logging.WARNING, "%(levelname)s: %(message)s"
logging.basicConfig(level=level, format=fmt, stream=sys.stderr)
logging.getLogger("framework").setLevel(level)
@click.group()
@click.version_option(version="1.0.0")
def cli():
"""Online Research Agent - Deep-dive research with narrative reports."""
pass
@cli.command()
@click.option("--topic", "-t", type=str, required=True, help="Research topic")
@click.option("--mock", is_flag=True, help="Run in mock mode")
@click.option("--quiet", "-q", is_flag=True, help="Only output result JSON")
@click.option("--verbose", "-v", is_flag=True, help="Show execution details")
@click.option("--debug", is_flag=True, help="Show debug logging")
def run(topic, mock, quiet, verbose, debug):
"""Execute research on a topic."""
if not quiet:
setup_logging(verbose=verbose, debug=debug)
context = {"topic": topic}
result = asyncio.run(default_agent.run(context, mock_mode=mock))
output_data = {
"success": result.success,
"steps_executed": result.steps_executed,
"output": result.output,
}
if result.error:
output_data["error"] = result.error
click.echo(json.dumps(output_data, indent=2, default=str))
sys.exit(0 if result.success else 1)
@cli.command()
@click.option("--json", "output_json", is_flag=True)
def info(output_json):
"""Show agent information."""
info_data = default_agent.info()
if output_json:
click.echo(json.dumps(info_data, indent=2))
else:
click.echo(f"Agent: {info_data['name']}")
click.echo(f"Version: {info_data['version']}")
click.echo(f"Description: {info_data['description']}")
click.echo(f"\nNodes: {', '.join(info_data['nodes'])}")
click.echo(f"Entry: {info_data['entry_node']}")
click.echo(f"Terminal: {', '.join(info_data['terminal_nodes'])}")
@cli.command()
def validate():
"""Validate agent structure."""
validation = default_agent.validate()
if validation["valid"]:
click.echo("Agent is valid")
else:
click.echo("Agent has errors:")
for error in validation["errors"]:
click.echo(f" ERROR: {error}")
sys.exit(0 if validation["valid"] else 1)
@cli.command()
@click.option("--verbose", "-v", is_flag=True)
def shell(verbose):
"""Interactive research session."""
asyncio.run(_interactive_shell(verbose))
async def _interactive_shell(verbose=False):
"""Async interactive shell."""
setup_logging(verbose=verbose)
click.echo("=== Online Research Agent ===")
click.echo("Enter a topic to research (or 'quit' to exit):\n")
agent = OnlineResearchAgent()
await agent.start()
try:
while True:
try:
topic = await asyncio.get_event_loop().run_in_executor(
None, input, "Topic> "
)
if topic.lower() in ["quit", "exit", "q"]:
click.echo("Goodbye!")
break
if not topic.strip():
continue
click.echo("\nResearching... (this may take a few minutes)\n")
result = await agent.trigger_and_wait("start", {"topic": topic})
if result is None:
click.echo("\n[Execution timed out]\n")
continue
if result.success:
output = result.output
if "file_path" in output:
click.echo(f"\nReport saved to: {output['file_path']}\n")
if "final_report" in output:
click.echo("\n--- Report Preview ---\n")
preview = (
output["final_report"][:500] + "..."
if len(output.get("final_report", "")) > 500
else output.get("final_report", "")
)
click.echo(preview)
click.echo("\n")
else:
click.echo(f"\nResearch failed: {result.error}\n")
except KeyboardInterrupt:
click.echo("\nGoodbye!")
break
except Exception as e:
click.echo(f"Error: {e}", err=True)
import traceback
traceback.print_exc()
finally:
await agent.stop()
if __name__ == "__main__":
cli()
@@ -0,0 +1,429 @@
"""Agent graph construction for Online Research Agent."""
from framework.graph import EdgeSpec, EdgeCondition, Goal, SuccessCriterion, Constraint
from framework.graph.edge import GraphSpec
from framework.graph.executor import ExecutionResult
from framework.runtime.agent_runtime import AgentRuntime, create_agent_runtime
from framework.runtime.execution_stream import EntryPointSpec
from framework.llm import LiteLLMProvider
from framework.runner.tool_registry import ToolRegistry
from .config import default_config, metadata
from .nodes import (
parse_query_node,
search_sources_node,
fetch_content_node,
evaluate_sources_node,
synthesize_findings_node,
write_report_node,
quality_check_node,
save_report_node,
)
# Goal definition
goal = Goal(
id="comprehensive-online-research",
name="Comprehensive Online Research",
description="Research any topic by searching multiple sources, synthesizing information, and producing a well-structured narrative report with citations.",
success_criteria=[
SuccessCriterion(
id="source-coverage",
description="Query 10+ diverse sources",
metric="source_count",
target=">=10",
weight=0.20,
),
SuccessCriterion(
id="relevance",
description="All sources directly address the query",
metric="relevance_score",
target="90%",
weight=0.25,
),
SuccessCriterion(
id="synthesis",
description="Synthesize findings into coherent narrative",
metric="coherence_score",
target="85%",
weight=0.25,
),
SuccessCriterion(
id="citations",
description="Include citations for all claims",
metric="citation_coverage",
target="100%",
weight=0.15,
),
SuccessCriterion(
id="actionable",
description="Report answers the user's question",
metric="answer_completeness",
target="90%",
weight=0.15,
),
],
constraints=[
Constraint(
id="no-hallucination",
description="Only include information found in sources",
constraint_type="quality",
category="accuracy",
),
Constraint(
id="source-attribution",
description="Every factual claim must cite its source",
constraint_type="quality",
category="accuracy",
),
Constraint(
id="recency-preference",
description="Prefer recent sources when relevant",
constraint_type="quality",
category="relevance",
),
Constraint(
id="no-paywalled",
description="Avoid sources that require payment to access",
constraint_type="functional",
category="accessibility",
),
],
)
# Node list
nodes = [
parse_query_node,
search_sources_node,
fetch_content_node,
evaluate_sources_node,
synthesize_findings_node,
write_report_node,
quality_check_node,
save_report_node,
]
# Edge definitions
edges = [
EdgeSpec(
id="parse-to-search",
source="parse-query",
target="search-sources",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
EdgeSpec(
id="search-to-fetch",
source="search-sources",
target="fetch-content",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
EdgeSpec(
id="fetch-to-evaluate",
source="fetch-content",
target="evaluate-sources",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
EdgeSpec(
id="evaluate-to-synthesize",
source="evaluate-sources",
target="synthesize-findings",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
EdgeSpec(
id="synthesize-to-write",
source="synthesize-findings",
target="write-report",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
EdgeSpec(
id="write-to-quality",
source="write-report",
target="quality-check",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
EdgeSpec(
id="quality-to-save",
source="quality-check",
target="save-report",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
),
]
# Graph configuration
entry_node = "parse-query"
entry_points = {"start": "parse-query"}
pause_nodes = []
terminal_nodes = ["save-report"]
class OnlineResearchAgent:
"""
Online Research Agent - Deep-dive research with narrative reports.
Uses AgentRuntime for multi-entrypoint support with HITL pause/resume.
"""
def __init__(self, config=None):
self.config = config or default_config
self.goal = goal
self.nodes = nodes
self.edges = edges
self.entry_node = entry_node
self.entry_points = entry_points
self.pause_nodes = pause_nodes
self.terminal_nodes = terminal_nodes
self._runtime: AgentRuntime | None = None
self._graph: GraphSpec | None = None
def _build_entry_point_specs(self) -> list[EntryPointSpec]:
"""Convert entry_points dict to EntryPointSpec list."""
specs = []
for ep_id, node_id in self.entry_points.items():
if ep_id == "start":
trigger_type = "manual"
name = "Start"
elif "_resume" in ep_id:
trigger_type = "resume"
name = f"Resume from {ep_id.replace('_resume', '')}"
else:
trigger_type = "manual"
name = ep_id.replace("-", " ").title()
specs.append(
EntryPointSpec(
id=ep_id,
name=name,
entry_node=node_id,
trigger_type=trigger_type,
isolation_level="shared",
)
)
return specs
def _create_runtime(self, mock_mode=False) -> AgentRuntime:
"""Create AgentRuntime instance."""
import json
from pathlib import Path
# Persistent storage in ~/.hive for telemetry and run history
storage_path = Path.home() / ".hive" / "online_research_agent"
storage_path.mkdir(parents=True, exist_ok=True)
tool_registry = ToolRegistry()
# Load MCP servers (always load, needed for tool validation)
agent_dir = Path(__file__).parent
mcp_config_path = agent_dir / "mcp_servers.json"
if mcp_config_path.exists():
with open(mcp_config_path) as f:
mcp_servers = json.load(f)
for server_config in mcp_servers.get("servers", []):
# Resolve relative cwd paths
cwd = server_config.get("cwd")
if cwd and not Path(cwd).is_absolute():
server_config["cwd"] = str(agent_dir / cwd)
tool_registry.register_mcp_server(server_config)
llm = None
if not mock_mode:
# LiteLLMProvider uses environment variables for API keys
llm = LiteLLMProvider(
model=self.config.model,
api_key=self.config.api_key,
api_base=self.config.api_base,
)
self._graph = GraphSpec(
id="online-research-agent-graph",
goal_id=self.goal.id,
version="1.0.0",
entry_node=self.entry_node,
entry_points=self.entry_points,
terminal_nodes=self.terminal_nodes,
pause_nodes=self.pause_nodes,
nodes=self.nodes,
edges=self.edges,
default_model=self.config.model,
max_tokens=self.config.max_tokens,
)
# Create AgentRuntime with all entry points
self._runtime = create_agent_runtime(
graph=self._graph,
goal=self.goal,
storage_path=storage_path,
entry_points=self._build_entry_point_specs(),
llm=llm,
tools=list(tool_registry.get_tools().values()),
tool_executor=tool_registry.get_executor(),
)
return self._runtime
async def start(self, mock_mode=False) -> None:
"""Start the agent runtime."""
if self._runtime is None:
self._create_runtime(mock_mode=mock_mode)
await self._runtime.start()
async def stop(self) -> None:
"""Stop the agent runtime."""
if self._runtime is not None:
await self._runtime.stop()
async def trigger(
self,
entry_point: str,
input_data: dict,
correlation_id: str | None = None,
session_state: dict | None = None,
) -> str:
"""
Trigger execution at a specific entry point (non-blocking).
Args:
entry_point: Entry point ID (e.g., "start", "pause-node_resume")
input_data: Input data for the execution
correlation_id: Optional ID to correlate related executions
session_state: Optional session state to resume from (with paused_at, memory)
Returns:
Execution ID for tracking
"""
if self._runtime is None or not self._runtime.is_running:
raise RuntimeError("Agent runtime not started. Call start() first.")
return await self._runtime.trigger(
entry_point, input_data, correlation_id, session_state=session_state
)
async def trigger_and_wait(
self,
entry_point: str,
input_data: dict,
timeout: float | None = None,
session_state: dict | None = None,
) -> ExecutionResult | None:
"""
Trigger execution and wait for completion.
Args:
entry_point: Entry point ID
input_data: Input data for the execution
timeout: Maximum time to wait (seconds)
session_state: Optional session state to resume from (with paused_at, memory)
Returns:
ExecutionResult or None if timeout
"""
if self._runtime is None or not self._runtime.is_running:
raise RuntimeError("Agent runtime not started. Call start() first.")
return await self._runtime.trigger_and_wait(
entry_point, input_data, timeout, session_state=session_state
)
async def run(
self, context: dict, mock_mode=False, session_state=None
) -> ExecutionResult:
"""
Run the agent (convenience method for simple single execution).
For more control, use start() + trigger_and_wait() + stop().
"""
await self.start(mock_mode=mock_mode)
try:
# Determine entry point based on session_state
if session_state and "paused_at" in session_state:
paused_node = session_state["paused_at"]
resume_key = f"{paused_node}_resume"
if resume_key in self.entry_points:
entry_point = resume_key
else:
entry_point = "start"
else:
entry_point = "start"
result = await self.trigger_and_wait(
entry_point, context, session_state=session_state
)
return result or ExecutionResult(success=False, error="Execution timeout")
finally:
await self.stop()
async def get_goal_progress(self) -> dict:
"""Get goal progress across all executions."""
if self._runtime is None:
raise RuntimeError("Agent runtime not started")
return await self._runtime.get_goal_progress()
def get_stats(self) -> dict:
"""Get runtime statistics."""
if self._runtime is None:
return {"running": False}
return self._runtime.get_stats()
def info(self):
"""Get agent information."""
return {
"name": metadata.name,
"version": metadata.version,
"description": metadata.description,
"goal": {
"name": self.goal.name,
"description": self.goal.description,
},
"nodes": [n.id for n in self.nodes],
"edges": [e.id for e in self.edges],
"entry_node": self.entry_node,
"entry_points": self.entry_points,
"pause_nodes": self.pause_nodes,
"terminal_nodes": self.terminal_nodes,
"multi_entrypoint": True,
}
def validate(self):
"""Validate agent structure."""
errors = []
warnings = []
node_ids = {node.id for node in self.nodes}
for edge in self.edges:
if edge.source not in node_ids:
errors.append(f"Edge {edge.id}: source '{edge.source}' not found")
if edge.target not in node_ids:
errors.append(f"Edge {edge.id}: target '{edge.target}' not found")
if self.entry_node not in node_ids:
errors.append(f"Entry node '{self.entry_node}' not found")
for terminal in self.terminal_nodes:
if terminal not in node_ids:
errors.append(f"Terminal node '{terminal}' not found")
for pause in self.pause_nodes:
if pause not in node_ids:
errors.append(f"Pause node '{pause}' not found")
# Validate entry points
for ep_id, node_id in self.entry_points.items():
if node_id not in node_ids:
errors.append(
f"Entry point '{ep_id}' references unknown node '{node_id}'"
)
return {
"valid": len(errors) == 0,
"errors": errors,
"warnings": warnings,
}
# Create default instance
default_agent = OnlineResearchAgent()
@@ -0,0 +1,43 @@
"""Runtime configuration."""
import json
from dataclasses import dataclass, field
from pathlib import Path
def _load_preferred_model() -> str:
"""Load preferred model from ~/.hive/configuration.json."""
config_path = Path.home() / ".hive" / "configuration.json"
if config_path.exists():
try:
with open(config_path) as f:
config = json.load(f)
llm = config.get("llm", {})
if llm.get("provider") and llm.get("model"):
return f"{llm['provider']}/{llm['model']}"
except Exception:
pass
return "anthropic/claude-sonnet-4-20250514"
@dataclass
class RuntimeConfig:
model: str = field(default_factory=_load_preferred_model)
temperature: float = 0.7
max_tokens: int = 8192
api_key: str | None = None
api_base: str | None = None
default_config = RuntimeConfig()
# Agent metadata
@dataclass
class AgentMetadata:
name: str = "Online Research Agent"
version: str = "1.0.0"
description: str = "Research any topic by searching multiple sources, synthesizing information, and producing a well-structured narrative report with citations."
metadata = AgentMetadata()
@@ -0,0 +1,9 @@
{
"hive-tools": {
"transport": "stdio",
"command": "python",
"args": ["mcp_server.py", "--stdio"],
"cwd": "../../tools",
"description": "Hive tools MCP server providing web_search, web_scrape, and write_to_file"
}
}
@@ -0,0 +1,396 @@
"""Node definitions for Online Research Agent."""
from framework.graph import NodeSpec
# Node 1: Parse Query
parse_query_node = NodeSpec(
id="parse-query",
name="Parse Query",
description="Analyze the research topic and generate 3-5 diverse search queries to cover different aspects",
node_type="llm_generate",
input_keys=["topic"],
output_keys=["search_queries", "research_focus", "key_aspects"],
output_schema={
"research_focus": {
"type": "string",
"required": True,
"description": "Brief statement of what we're researching",
},
"key_aspects": {
"type": "array",
"required": True,
"description": "List of 3-5 key aspects to investigate",
},
"search_queries": {
"type": "array",
"required": True,
"description": "List of 3-5 search queries",
},
},
system_prompt="""\
You are a research query strategist. Given a research topic, analyze it and generate search queries.
Your task:
1. Understand the core research question
2. Identify 3-5 key aspects to investigate
3. Generate 3-5 diverse search queries that will find comprehensive information
CRITICAL: Return ONLY raw JSON. NO markdown, NO code blocks.
Return this JSON structure:
{
"research_focus": "Brief statement of what we're researching",
"key_aspects": ["aspect1", "aspect2", "aspect3"],
"search_queries": [
"query 1 - broad overview",
"query 2 - specific angle",
"query 3 - recent developments",
"query 4 - expert opinions",
"query 5 - data/statistics"
]
}
""",
tools=[],
max_retries=3,
)
# Node 2: Search Sources
search_sources_node = NodeSpec(
id="search-sources",
name="Search Sources",
description="Execute web searches using the generated queries to find 15+ source URLs",
node_type="llm_tool_use",
input_keys=["search_queries", "research_focus"],
output_keys=["source_urls", "search_results_summary"],
output_schema={
"source_urls": {
"type": "array",
"required": True,
"description": "List of source URLs found",
},
"search_results_summary": {
"type": "string",
"required": True,
"description": "Brief summary of what was found",
},
},
system_prompt="""\
You are a research assistant executing web searches. Use the web_search tool to find sources.
Your task:
1. Execute each search query using web_search tool
2. Collect URLs from search results
3. Aim for 15+ diverse sources
After searching, return JSON with found sources:
{
"source_urls": ["url1", "url2", ...],
"search_results_summary": "Brief summary of what was found"
}
""",
tools=["web_search"],
max_retries=3,
)
# Node 3: Fetch Content
fetch_content_node = NodeSpec(
id="fetch-content",
name="Fetch Content",
description="Fetch and extract content from the discovered source URLs",
node_type="llm_tool_use",
input_keys=["source_urls", "research_focus"],
output_keys=["fetched_sources", "fetch_errors"],
output_schema={
"fetched_sources": {
"type": "array",
"required": True,
"description": "List of fetched source objects with url, title, content",
},
"fetch_errors": {
"type": "array",
"required": True,
"description": "List of URLs that failed to fetch",
},
},
system_prompt="""\
You are a content fetcher. Use web_scrape tool to retrieve content from URLs.
Your task:
1. Fetch content from each source URL using web_scrape tool
2. Extract the main content relevant to the research focus
3. Track any URLs that failed to fetch
After fetching, return JSON:
{
"fetched_sources": [
{"url": "...", "title": "...", "content": "extracted text..."},
...
],
"fetch_errors": ["url that failed", ...]
}
""",
tools=["web_scrape"],
max_retries=3,
)
# Node 4: Evaluate Sources
evaluate_sources_node = NodeSpec(
id="evaluate-sources",
name="Evaluate Sources",
description="Score sources for relevance and quality, filter to top 10",
node_type="llm_generate",
input_keys=["fetched_sources", "research_focus", "key_aspects"],
output_keys=["ranked_sources", "source_analysis"],
output_schema={
"ranked_sources": {
"type": "array",
"required": True,
"description": "List of ranked sources with scores",
},
"source_analysis": {
"type": "string",
"required": True,
"description": "Overview of source quality and coverage",
},
},
system_prompt="""\
You are a source evaluator. Assess each source for quality and relevance.
Scoring criteria:
- Relevance to research focus (1-10)
- Source credibility (1-10)
- Information depth (1-10)
- Recency if relevant (1-10)
Your task:
1. Score each source
2. Rank by combined score
3. Select top 10 sources
4. Note what each source uniquely contributes
Return JSON:
{
"ranked_sources": [
{"url": "...", "title": "...", "content": "...", "score": 8.5, "unique_value": "..."},
...
],
"source_analysis": "Overview of source quality and coverage"
}
""",
tools=[],
max_retries=3,
)
# Node 5: Synthesize Findings
synthesize_findings_node = NodeSpec(
id="synthesize-findings",
name="Synthesize Findings",
description="Extract key facts from sources and identify common themes",
node_type="llm_generate",
input_keys=["ranked_sources", "research_focus", "key_aspects"],
output_keys=["key_findings", "themes", "source_citations"],
output_schema={
"key_findings": {
"type": "array",
"required": True,
"description": "List of key findings with sources and confidence",
},
"themes": {
"type": "array",
"required": True,
"description": "List of themes with descriptions and supporting sources",
},
"source_citations": {
"type": "object",
"required": True,
"description": "Map of facts to supporting URLs",
},
},
system_prompt="""\
You are a research synthesizer. Analyze multiple sources to extract insights.
Your task:
1. Identify key facts from each source
2. Find common themes across sources
3. Note contradictions or debates
4. Build a citation map (fact -> source URL)
Return JSON:
{
"key_findings": [
{"finding": "...", "sources": ["url1", "url2"], "confidence": "high/medium/low"},
...
],
"themes": [
{"theme": "...", "description": "...", "supporting_sources": ["url1", ...]},
...
],
"source_citations": {
"fact or claim": ["supporting url1", "url2"],
...
}
}
""",
tools=[],
max_retries=3,
)
# Node 6: Write Report
write_report_node = NodeSpec(
id="write-report",
name="Write Report",
description="Generate a narrative report with proper citations",
node_type="llm_generate",
input_keys=[
"key_findings",
"themes",
"source_citations",
"research_focus",
"ranked_sources",
],
output_keys=["report_content", "references"],
output_schema={
"report_content": {
"type": "string",
"required": True,
"description": "Full markdown report text with citations",
},
"references": {
"type": "array",
"required": True,
"description": "List of reference objects with number, url, title",
},
},
system_prompt="""\
You are a research report writer. Create a well-structured narrative report.
Report structure:
1. Executive Summary (2-3 paragraphs)
2. Introduction (context and scope)
3. Key Findings (organized by theme)
4. Analysis (synthesis and implications)
5. Conclusion
6. References (numbered list of all sources)
Citation format: Use numbered citations like [1], [2] that correspond to the References section.
IMPORTANT:
- Every factual claim MUST have a citation
- Write in clear, professional prose
- Be objective and balanced
- Highlight areas of consensus and debate
Return JSON:
{
"report_content": "Full markdown report text with citations...",
"references": [
{"number": 1, "url": "...", "title": "..."},
...
]
}
""",
tools=[],
max_retries=3,
)
# Node 7: Quality Check
quality_check_node = NodeSpec(
id="quality-check",
name="Quality Check",
description="Verify all claims have citations and report is coherent",
node_type="llm_generate",
input_keys=["report_content", "references", "source_citations"],
output_keys=["quality_score", "issues", "final_report"],
output_schema={
"quality_score": {
"type": "number",
"required": True,
"description": "Quality score 0-1",
},
"issues": {
"type": "array",
"required": True,
"description": "List of issues found and fixed",
},
"final_report": {
"type": "string",
"required": True,
"description": "Corrected full report",
},
},
system_prompt="""\
You are a quality assurance reviewer. Check the research report for issues.
Check for:
1. Uncited claims (factual statements without [n] citation)
2. Broken citations (references to non-existent numbers)
3. Coherence (logical flow between sections)
4. Completeness (all key aspects covered)
5. Accuracy (claims match source content)
If issues found, fix them in the final report.
Return JSON:
{
"quality_score": 0.95,
"issues": [
{"type": "uncited_claim", "location": "paragraph 3", "fixed": true},
...
],
"final_report": "Corrected full report with all issues fixed..."
}
""",
tools=[],
max_retries=3,
)
# Node 8: Save Report
save_report_node = NodeSpec(
id="save-report",
name="Save Report",
description="Write the final report to a local markdown file",
node_type="llm_tool_use",
input_keys=["final_report", "references", "research_focus"],
output_keys=["file_path", "save_status"],
output_schema={
"file_path": {
"type": "string",
"required": True,
"description": "Path where report was saved",
},
"save_status": {
"type": "string",
"required": True,
"description": "Status of save operation",
},
},
system_prompt="""\
You are a file manager. Save the research report to disk.
Your task:
1. Generate a filename from the research focus (slugified, with date)
2. Use the write_to_file tool to save the report as markdown
3. Save to the ./research_reports/ directory
Filename format: research_YYYY-MM-DD_topic-slug.md
Return JSON:
{
"file_path": "research_reports/research_2026-01-23_topic-name.md",
"save_status": "success"
}
""",
tools=["write_to_file"],
max_retries=3,
)
__all__ = [
"parse_query_node",
"search_sources_node",
"fetch_content_node",
"evaluate_sources_node",
"synthesize_findings_node",
"write_report_node",
"quality_check_node",
"save_report_node",
]
@@ -0,0 +1,303 @@
---
name: building-agents-core
description: Core concepts for goal-driven agents - architecture, node types, tool discovery, and workflow overview. Use when starting agent development or need to understand agent fundamentals.
license: Apache-2.0
metadata:
author: hive
version: "1.0"
type: foundational
part_of: building-agents
---
# Building Agents - Core Concepts
Foundational knowledge for building goal-driven agents as Python packages.
## Architecture: Python Services (Not JSON Configs)
Agents are built as Python packages:
```
exports/my_agent/
├── __init__.py # Package exports
├── __main__.py # CLI (run, info, validate, shell)
├── agent.py # Graph construction (goal, edges, agent class)
├── nodes/__init__.py # Node definitions (NodeSpec)
├── config.py # Runtime config
└── README.md # Documentation
```
**Key Principle: Agent is visible and editable during build**
- ✅ Files created immediately as components are approved
- ✅ User can watch files grow in their editor
- ✅ No session state - just direct file writes
- ✅ No "export" step - agent is ready when build completes
## Core Concepts
### Goal
Success criteria and constraints (written to agent.py)
```python
goal = Goal(
id="research-goal",
name="Technical Research Agent",
description="Research technical topics thoroughly",
success_criteria=[
SuccessCriterion(
id="completeness",
description="Cover all aspects of topic",
metric="coverage_score",
target=">=0.9",
weight=0.4,
),
# 3-5 success criteria total
],
constraints=[
Constraint(
id="accuracy",
description="All information must be verified",
constraint_type="hard",
category="quality",
),
# 1-5 constraints total
],
)
```
### Node
Unit of work (written to nodes/__init__.py)
**Node Types:**
- `llm_generate` - Text generation, parsing
- `llm_tool_use` - Actions requiring tools
- `router` - Conditional branching
- `function` - Deterministic operations
```python
search_node = NodeSpec(
id="search-web",
name="Search Web",
description="Search for information online",
node_type="llm_tool_use",
input_keys=["query"],
output_keys=["search_results"],
system_prompt="Search the web for: {query}",
tools=["web_search"],
max_retries=3,
)
```
### Edge
Connection between nodes (written to agent.py)
**Edge Conditions:**
- `on_success` - Proceed if node succeeds
- `on_failure` - Handle errors
- `always` - Always proceed
- `conditional` - Based on expression
```python
EdgeSpec(
id="search-to-analyze",
source="search-web",
target="analyze-results",
condition=EdgeCondition.ON_SUCCESS,
priority=1,
)
```
### Pause/Resume
Multi-turn conversations
- **Pause nodes** - Stop execution, wait for user input
- **Resume entry points** - Continue from pause with user's response
```python
# Example pause/resume configuration
pause_nodes = ["request-clarification"]
entry_points = {
"start": "analyze-request",
"request-clarification_resume": "process-clarification"
}
```
## Tool Discovery & Validation
**CRITICAL:** Before adding a node with tools, you MUST verify the tools exist.
Tools are provided by MCP servers. Never assume a tool exists - always discover dynamically.
### Step 1: Register MCP Server (if not already done)
```python
mcp__agent-builder__add_mcp_server(
name="tools",
transport="stdio",
command="python",
args='["mcp_server.py", "--stdio"]',
cwd="../tools"
)
```
### Step 2: Discover Available Tools
```python
# List all tools from all registered servers
mcp__agent-builder__list_mcp_tools()
# Or list tools from a specific server
mcp__agent-builder__list_mcp_tools(server_name="tools")
```
This returns available tools with their descriptions and parameters:
```json
{
"success": true,
"tools_by_server": {
"tools": [
{
"name": "web_search",
"description": "Search the web...",
"parameters": ["query"]
},
{
"name": "web_scrape",
"description": "Scrape a URL...",
"parameters": ["url"]
}
]
},
"total_tools": 14
}
```
### Step 3: Validate Before Adding Nodes
Before writing a node with `tools=[...]`:
1. Call `list_mcp_tools()` to get available tools
2. Check each tool in your node exists in the response
3. If a tool doesn't exist:
- **DO NOT proceed** with the node
- Inform the user: "The tool 'X' is not available. Available tools are: ..."
- Ask if they want to use an alternative or proceed without the tool
### Tool Validation Anti-Patterns
**Never assume a tool exists** - always call `list_mcp_tools()` first
**Never write a node with unverified tools** - validate before writing
**Never silently drop tools** - if a tool doesn't exist, inform the user
**Never guess tool names** - use exact names from discovery response
### Example Validation Flow
```python
# 1. User requests: "Add a node that searches the web"
# 2. Discover available tools
tools_response = mcp__agent-builder__list_mcp_tools()
# 3. Check if web_search exists
available = [t["name"] for tools in tools_response["tools_by_server"].values() for t in tools]
if "web_search" not in available:
# Inform user and ask how to proceed
print("'web_search' not available. Available tools:", available)
else:
# Proceed with node creation
# ...
```
## Workflow Overview: Incremental File Construction
```
1. CREATE PACKAGE → mkdir + write skeletons
2. DEFINE GOAL → Write to agent.py + config.py
3. FOR EACH NODE:
- Propose design
- User approves
- Write to nodes/__init__.py IMMEDIATELY ← FILE WRITTEN
- (Optional) Validate with test_node ← MCP VALIDATION
- User can open file and see it
4. CONNECT EDGES → Update agent.py ← FILE WRITTEN
- (Optional) Validate with validate_graph ← MCP VALIDATION
5. FINALIZE → Write agent class to agent.py ← FILE WRITTEN
6. DONE - Agent ready at exports/my_agent/
```
**Files written immediately. MCP tools optional for validation/testing bookkeeping.**
### The Key Difference
**OLD (Bad):**
```
MCP add_node → Session State → MCP add_node → Session State → ...
MCP export_graph
Files appear
```
**NEW (Good):**
```
Write node to file → (Optional: MCP test_node) → Write node to file → ...
        ↓                                                ↓
   File visible                                     File visible
   immediately                                      immediately
```
**Bottom line:** Use Write/Edit for construction, MCP for validation if needed.
## When to Use This Skill
Use building-agents-core when:
- Starting a new agent project and need to understand fundamentals
- Need to understand agent architecture before building
- Want to validate tool availability before proceeding
- Learning about node types, edges, and graph execution
**Next Steps:**
- Ready to build? → Use `building-agents-construction` skill
- Need patterns and examples? → Use `building-agents-patterns` skill
## MCP Tools for Validation
After writing files, optionally use MCP tools for validation:
**test_node** - Validate node configuration with mock inputs
```python
mcp__agent-builder__test_node(
node_id="search-web",
test_input='{"query": "test query"}',
mock_llm_response='{"results": "mock output"}'
)
```
**validate_graph** - Check graph structure
```python
mcp__agent-builder__validate_graph()
# Returns: unreachable nodes, missing connections, etc.
```
**create_session** - Track session state for bookkeeping
```python
mcp__agent-builder__create_session(session_name="my-build")
```
**Key Point:** Files are written FIRST. MCP tools are for validation only.
## Related Skills
- **building-agents-construction** - Step-by-step building process
- **building-agents-patterns** - Best practices and examples
- **agent-workflow** - Complete workflow orchestrator
- **testing-agent** - Test and validate completed agents
@@ -0,0 +1,497 @@
---
name: building-agents-patterns
description: Best practices, patterns, and examples for building goal-driven agents. Includes pause/resume architecture, hybrid workflows, anti-patterns, and handoff to testing. Use when optimizing agent design.
license: Apache-2.0
metadata:
author: hive
version: "1.0"
type: reference
part_of: building-agents
---
# Building Agents - Patterns & Best Practices
Design patterns, examples, and best practices for building robust goal-driven agents.
**Prerequisites:** Complete agent structure using `building-agents-construction`.
## Practical Example: Hybrid Workflow
How to build a node using both direct file writes and optional MCP validation:
```python
# 1. WRITE TO FILE FIRST (Primary - makes it visible)
node_code = '''
search_node = NodeSpec(
id="search-web",
node_type="llm_tool_use",
input_keys=["query"],
output_keys=["search_results"],
system_prompt="Search the web for: {query}",
tools=["web_search"],
)
'''
Edit(
file_path="exports/research_agent/nodes/__init__.py",
old_string="# Nodes will be added here",
new_string=node_code
)
print("✅ Added search_node to nodes/__init__.py")
print("📁 Open exports/research_agent/nodes/__init__.py to see it!")
# 2. OPTIONALLY VALIDATE WITH MCP (Secondary - bookkeeping)
validation = mcp__agent-builder__test_node(
node_id="search-web",
test_input='{"query": "python tutorials"}',
mock_llm_response='{"search_results": [...mock results...]}'
)
print(f"✓ Validation: {validation['success']}")
```
**User experience:**
- Immediately sees node in their editor (from step 1)
- Gets validation feedback (from step 2)
- Can edit the file directly if needed
This combines visibility (files) with validation (MCP tools).
## Pause/Resume Architecture
For agents needing multi-turn conversations with user interaction:
### Basic Pause/Resume Flow
```python
# Define pause nodes - execution stops at these nodes
pause_nodes = ["request-clarification", "await-approval"]
# Define entry points - where to resume from each pause
entry_points = {
"start": "analyze-request", # Initial entry
"request-clarification_resume": "process-clarification", # Resume from clarification
"await-approval_resume": "execute-action", # Resume from approval
}
```
### Example: Multi-Turn Research Agent
```python
# Nodes
nodes = [
NodeSpec(id="analyze-request", ...),
NodeSpec(id="request-clarification", ...), # PAUSE NODE
NodeSpec(id="process-clarification", ...),
NodeSpec(id="generate-results", ...),
NodeSpec(id="await-approval", ...), # PAUSE NODE
NodeSpec(id="execute-action", ...),
]
# Edges with resume flows
edges = [
EdgeSpec(
id="analyze-to-clarify",
source="analyze-request",
target="request-clarification",
condition=EdgeCondition.CONDITIONAL,
condition_expr="needs_clarification == true",
),
# When resumed, goes to process-clarification
EdgeSpec(
id="clarify-to-process",
source="request-clarification",
target="process-clarification",
condition=EdgeCondition.ALWAYS,
),
EdgeSpec(
id="results-to-approval",
source="generate-results",
target="await-approval",
condition=EdgeCondition.ALWAYS,
),
# When resumed, goes to execute-action
EdgeSpec(
id="approval-to-execute",
source="await-approval",
target="execute-action",
condition=EdgeCondition.ALWAYS,
),
]
# Configuration
pause_nodes = ["request-clarification", "await-approval"]
entry_points = {
"start": "analyze-request",
"request-clarification_resume": "process-clarification",
"await-approval_resume": "execute-action",
}
```
### Running Pause/Resume Agents
```python
# Initial run - will pause at first pause node
result1 = await agent.run(
context={"query": "research topic"},
session_state=None
)
# Check if paused
if result1.paused_at:
print(f"Paused at: {result1.paused_at}")
# Resume with user input
result2 = await agent.run(
context={"user_response": "clarification details"},
session_state=result1.session_state # Pass previous state
)
```
## Anti-Patterns
### What NOT to Do
**Don't rely on `export_graph`** - Write files immediately, not at end
```python
# BAD: Building in session state, exporting at end
mcp__agent-builder__add_node(...)
mcp__agent-builder__add_node(...)
mcp__agent-builder__export_graph() # Files appear only now
# GOOD: Writing files immediately
Write(file_path="...", content=node_code) # File visible now
Write(file_path="...", content=node_code) # File visible now
```
**Don't hide code in session** - Write to files as components approved
```python
# BAD: Accumulating changes invisibly
session.add_component(component1)
session.add_component(component2)
# User can't see anything yet
# GOOD: Incremental visibility
Edit(file_path="...", ...) # User sees change 1
Edit(file_path="...", ...) # User sees change 2
```
**Don't wait to write files** - Agent visible from first step
```python
# BAD: Building everything before writing
design_all_nodes()
design_all_edges()
write_everything_at_once()
# GOOD: Write as you go
write_package_structure() # Visible
write_goal() # Visible
write_node_1() # Visible
write_node_2() # Visible
```
**Don't batch everything** - Write incrementally
```python
# BAD: Batching all nodes
nodes = [design_node_1(), design_node_2(), ...]
write_all_nodes(nodes)
# GOOD: One at a time with user feedback
write_node_1() # User approves
write_node_2() # User approves
write_node_3() # User approves
```
### MCP Tools - Correct Usage
**MCP tools OK for:**
- ✅ `test_node` - Validate node configuration with mock inputs
- ✅ `validate_graph` - Check graph structure
- ✅ `create_session` - Track session state for bookkeeping
- ✅ Other validation tools
**Just don't:** Use MCP as the primary construction method or rely on export_graph
## Best Practices
### 1. Show Progress After Each Write
```python
# After writing a node
print("✅ Added analyze_request_node to nodes/__init__.py")
print("📊 Progress: 1/6 nodes added")
print("📁 Open exports/my_agent/nodes/__init__.py to see it!")
```
### 2. Let User Open Files During Build
```python
# Encourage file inspection
print("✅ Goal written to agent.py")
print("")
print("💡 Tip: Open exports/my_agent/agent.py in your editor to see the goal!")
```
### 3. Write Incrementally - One Component at a Time
```python
# Good flow
write_package_structure()
show_user("Package created")
write_goal()
show_user("Goal written")
for node in nodes:
get_approval(node)
write_node(node)
show_user(f"Node {node.id} written")
```
### 4. Test As You Build
```python
# After adding several nodes
print("💡 You can test current state with:")
print(" PYTHONPATH=core:exports python -m my_agent validate")
print(" PYTHONPATH=core:exports python -m my_agent info")
```
### 5. Keep User Informed
```python
# Clear status updates
print("🔨 Creating package structure...")
print("✅ Package created: exports/my_agent/")
print("")
print("📝 Next: Define agent goal")
```
## Continuous Monitoring Agents
For agents that run continuously without terminal nodes:
```python
# No terminal nodes - loops forever
terminal_nodes = []
# Workflow loops back to start
edges = [
EdgeSpec(id="monitor-to-check", source="monitor", target="check-condition"),
EdgeSpec(id="check-to-wait", source="check-condition", target="wait"),
EdgeSpec(id="wait-to-monitor", source="wait", target="monitor"), # Loop
]
# Entry node only
entry_node = "monitor"
entry_points = {"start": "monitor"}
pause_nodes = []
```
**Example: File Monitor**
```python
nodes = [
NodeSpec(id="list-files", ...),
NodeSpec(id="check-new-files", node_type="router", ...),
NodeSpec(id="process-files", ...),
NodeSpec(id="wait-interval", node_type="function", ...),
]
edges = [
EdgeSpec(id="list-to-check", source="list-files", target="check-new-files"),
EdgeSpec(
id="check-to-process",
source="check-new-files",
target="process-files",
condition=EdgeCondition.CONDITIONAL,
condition_expr="new_files_count > 0",
),
EdgeSpec(
id="check-to-wait",
source="check-new-files",
target="wait-interval",
condition=EdgeCondition.CONDITIONAL,
condition_expr="new_files_count == 0",
),
EdgeSpec(id="process-to-wait", source="process-files", target="wait-interval"),
EdgeSpec(id="wait-to-list", source="wait-interval", target="list-files"), # Loop back
]
terminal_nodes = [] # No terminal - runs forever
```
## Complex Routing Patterns
### Multi-Condition Router
```python
router_node = NodeSpec(
id="decision-router",
node_type="router",
input_keys=["analysis_result"],
output_keys=["decision"],
system_prompt="""
Based on the analysis result, decide the next action:
- If confidence > 0.9: route to "execute"
- If 0.5 <= confidence <= 0.9: route to "review"
- If confidence < 0.5: route to "clarify"
Return: {"decision": "execute|review|clarify"}
""",
)
# Edges for each route
edges = [
EdgeSpec(
id="router-to-execute",
source="decision-router",
target="execute-action",
condition=EdgeCondition.CONDITIONAL,
condition_expr="decision == 'execute'",
priority=1,
),
EdgeSpec(
id="router-to-review",
source="decision-router",
target="human-review",
condition=EdgeCondition.CONDITIONAL,
condition_expr="decision == 'review'",
priority=2,
),
EdgeSpec(
id="router-to-clarify",
source="decision-router",
target="request-clarification",
condition=EdgeCondition.CONDITIONAL,
condition_expr="decision == 'clarify'",
priority=3,
),
]
```
## Error Handling Patterns
### Graceful Failure with Fallback
```python
# Primary node with error handling
nodes = [
NodeSpec(id="api-call", max_retries=3, ...),
NodeSpec(id="fallback-cache", ...),
NodeSpec(id="report-error", ...),
]
edges = [
# Success path
EdgeSpec(
id="api-success",
source="api-call",
target="process-results",
condition=EdgeCondition.ON_SUCCESS,
),
# Fallback on failure
EdgeSpec(
id="api-to-fallback",
source="api-call",
target="fallback-cache",
condition=EdgeCondition.ON_FAILURE,
priority=1,
),
# Report if fallback also fails
EdgeSpec(
id="fallback-to-error",
source="fallback-cache",
target="report-error",
condition=EdgeCondition.ON_FAILURE,
priority=1,
),
]
```
## Performance Optimization
### Parallel Node Execution
```python
# Use multiple edges from same source for parallel execution
edges = [
EdgeSpec(
id="start-to-search1",
source="start",
target="search-source-1",
condition=EdgeCondition.ALWAYS,
),
EdgeSpec(
id="start-to-search2",
source="start",
target="search-source-2",
condition=EdgeCondition.ALWAYS,
),
EdgeSpec(
id="start-to-search3",
source="start",
target="search-source-3",
condition=EdgeCondition.ALWAYS,
),
# Converge results
EdgeSpec(
id="search1-to-merge",
source="search-source-1",
target="merge-results",
),
EdgeSpec(
id="search2-to-merge",
source="search-source-2",
target="merge-results",
),
EdgeSpec(
id="search3-to-merge",
source="search-source-3",
target="merge-results",
),
]
```
## Handoff to Testing
When agent is complete, transition to testing phase:
```python
print("""
✅ Agent complete: exports/my_agent/
Next steps:
1. Switch to testing-agent skill
2. Generate and approve tests
3. Run evaluation
4. Debug any failures
Command: "Test the agent at exports/my_agent/"
""")
```
### Pre-Testing Checklist
Before handing off to testing-agent:
- [ ] Agent structure validates: `python -m agent_name validate`
- [ ] All nodes defined in nodes/__init__.py
- [ ] All edges connect valid nodes
- [ ] Entry node specified
- [ ] Agent can be imported: `from exports.agent_name import default_agent`
- [ ] README.md with usage instructions
- [ ] CLI commands work (info, validate)
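A quick way to run the mechanical checks from the repo root (paths follow the earlier examples; `my_agent` is a placeholder):
```bash
cd /home/timothy/oss/hive
PYTHONPATH=core:exports python -m my_agent validate
PYTHONPATH=core:exports python -m my_agent info
PYTHONPATH=core:exports python -c "from my_agent import default_agent; print(default_agent.validate())"
```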
## Related Skills
- **building-agents-core** - Fundamental concepts
- **building-agents-construction** - Step-by-step building
- **testing-agent** - Test and validate agents
- **agent-workflow** - Complete workflow orchestrator
---
**Remember: Agent is actively constructed, visible the whole time. No hidden state. No surprise exports. Just transparent, incremental file building.**
@@ -0,0 +1,572 @@
---
name: setup-credentials
description: Set up and install credentials for an agent. Detects missing credentials from agent config, collects them from the user, and stores them securely in the encrypted credential store at ~/.hive/credentials.
license: Apache-2.0
metadata:
author: hive
version: "2.1"
type: utility
---
# Setup Credentials
Interactive credential setup for agents with multiple authentication options. Detects what's missing, offers auth method choices, validates with health checks, and stores credentials securely.
## When to Use
- Before running or testing an agent for the first time
- When `AgentRunner.run()` fails with "missing required credentials"
- When a user asks to configure credentials for an agent
- After building a new agent that uses tools requiring API keys
## Workflow
### Step 1: Identify the Agent
Determine which agent needs credentials. The user will either:
- Name the agent directly (e.g., "set up credentials for hubspot-agent")
- Have an agent directory open (check `exports/` for agent dirs)
- Be working on an agent in the current session
Locate the agent's directory under `exports/{agent_name}/`.
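For example, a quick way to list the agent packages available locally:
```bash
ls exports/
```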
### Step 2: Detect Required Credentials
Read the agent's configuration to determine which tools and node types it uses:
```python
from core.framework.runner import AgentRunner
runner = AgentRunner.load("exports/{agent_name}")
validation = runner.validate()
# validation.missing_credentials contains env var names
# validation.warnings contains detailed messages with help URLs
```
Alternatively, check the credential store directly:
```python
from core.framework.credentials import CredentialStore
# Use encrypted storage (default: ~/.hive/credentials)
store = CredentialStore.with_encrypted_storage()
# Check what's available
available = store.list_credentials()
print(f"Available credentials: {available}")
# Check if specific credential exists
if store.is_available("hubspot"):
print("HubSpot credential found")
else:
print("HubSpot credential missing")
```
To see all known credential specs (for help URLs and setup instructions):
```python
from aden_tools.credentials import CREDENTIAL_SPECS
for name, spec in CREDENTIAL_SPECS.items():
print(f"{name}: env_var={spec.env_var}, aden={spec.aden_supported}")
```
### Step 3: Present Auth Options for Each Missing Credential
For each missing credential, check what authentication methods are available:
```python
from aden_tools.credentials import CREDENTIAL_SPECS
spec = CREDENTIAL_SPECS.get("hubspot")
if spec:
# Determine available auth options
auth_options = []
if spec.aden_supported:
auth_options.append("aden")
if spec.direct_api_key_supported:
auth_options.append("direct")
auth_options.append("custom") # Always available
# Get setup info
setup_info = {
"env_var": spec.env_var,
"description": spec.description,
"help_url": spec.help_url,
"api_key_instructions": spec.api_key_instructions,
}
```
Present the available options using AskUserQuestion:
```
Choose how to configure HUBSPOT_ACCESS_TOKEN:
1) Aden Authorization Server (Recommended)
Secure OAuth2 flow via integration.adenhq.com
- Quick setup with automatic token refresh
- No need to manage API keys manually
2) Direct API Key
Enter your own API key manually
- Requires creating a HubSpot Private App
- Full control over scopes and permissions
3) Custom Credential Store (Advanced)
Programmatic configuration for CI/CD
- For automated deployments
- Requires manual API calls
```
### Step 4: Execute Auth Flow Based on User Choice
#### Option 1: Aden Authorization Server
This is the recommended flow for supported integrations (HubSpot, etc.).
**How Aden OAuth Works:**
The ADEN_API_KEY represents a user who has already completed OAuth authorization on Aden's platform. When users sign up and connect integrations on Aden, those OAuth tokens are stored server-side. Having an ADEN_API_KEY means:
1. User has an Aden account
2. User has already authorized integrations (HubSpot, etc.) via OAuth on Aden
3. We just need to sync those credentials down to the local credential store
**4.1a. Check for ADEN_API_KEY**
```python
import os
aden_key = os.environ.get("ADEN_API_KEY")
```
If not set, guide the user to get one from Aden (this is where they complete OAuth):
```python
from aden_tools.credentials import open_browser, get_aden_setup_url
# Open browser to Aden - user will sign up and connect integrations there
url = get_aden_setup_url() # https://integration.adenhq.com/setup
success, msg = open_browser(url)
print("Please sign in to Aden and connect your integrations (HubSpot, etc.).")
print("Once done, copy your API key and return here.")
```
Ask the user to provide the ADEN_API_KEY they received.
**4.1b. Save ADEN_API_KEY to Shell Config**
With user approval, persist ADEN_API_KEY to their shell config:
```python
from aden_tools.credentials import (
detect_shell,
add_env_var_to_shell_config,
get_shell_source_command,
)
shell_type = detect_shell() # 'bash', 'zsh', or 'unknown'
# Ask user for approval before modifying shell config
# If approved:
success, config_path = add_env_var_to_shell_config(
"ADEN_API_KEY",
user_provided_key,
comment="Aden authorization server API key"
)
if success:
source_cmd = get_shell_source_command()
print(f"Saved to {config_path}")
print(f"Run: {source_cmd}")
```
Also save to `~/.hive/configuration.json` for the framework:
```python
import json
from pathlib import Path
config_path = Path.home() / ".hive" / "configuration.json"
config = json.loads(config_path.read_text()) if config_path.exists() else {}
config["aden"] = {
"api_key_configured": True,
"api_url": "https://api.adenhq.com"
}
config_path.parent.mkdir(parents=True, exist_ok=True)
config_path.write_text(json.dumps(config, indent=2))
```
**4.1c. Sync Credentials from Aden Server**
Since the user has already authorized integrations on Aden, use the one-liner factory method:
```python
from core.framework.credentials import CredentialStore
# This single call handles everything:
# - Creates encrypted local storage at ~/.hive/credentials
# - Configures Aden client from ADEN_API_KEY env var
# - Syncs all credentials from Aden server automatically
store = CredentialStore.with_aden_sync(
base_url="https://api.adenhq.com",
auto_sync=True, # Syncs on creation
)
# Check what was synced
synced = store.list_credentials()
print(f"Synced credentials: {synced}")
# If the required credential wasn't synced, the user hasn't authorized it on Aden yet
if "hubspot" not in synced:
print("HubSpot not found in your Aden account.")
print("Please visit https://integration.adenhq.com to connect HubSpot, then try again.")
```
For more control over the sync process:
```python
from core.framework.credentials import CredentialStore
from core.framework.credentials.aden import (
AdenCredentialClient,
AdenClientConfig,
AdenSyncProvider,
)
# Create client (API key loaded from ADEN_API_KEY env var)
client = AdenCredentialClient(AdenClientConfig(
base_url="https://api.adenhq.com",
))
# Create provider and store
provider = AdenSyncProvider(client=client)
store = CredentialStore.with_encrypted_storage()
# Manual sync
synced_count = provider.sync_all(store)
print(f"Synced {synced_count} credentials from Aden")
```
**4.1d. Run Health Check**
```python
from aden_tools.credentials import check_credential_health
# Get the token from the store
cred = store.get_credential("hubspot")
token = cred.keys["access_token"].value.get_secret_value()
result = check_credential_health("hubspot", token)
if result.valid:
print("HubSpot credentials validated successfully!")
else:
print(f"Validation failed: {result.message}")
# Offer to retry the OAuth flow
```
#### Option 2: Direct API Key
For users who prefer manual API key management.
**4.2a. Show Setup Instructions**
```python
from aden_tools.credentials import CREDENTIAL_SPECS
spec = CREDENTIAL_SPECS.get("hubspot")
if spec and spec.api_key_instructions:
print(spec.api_key_instructions)
# Output:
# To get a HubSpot Private App token:
# 1. Go to HubSpot Settings > Integrations > Private Apps
# 2. Click "Create a private app"
# 3. Name your app (e.g., "Hive Agent")
# ...
if spec and spec.help_url:
print(f"More info: {spec.help_url}")
```
**4.2b. Collect API Key from User**
Use AskUserQuestion to securely collect the API key:
```
Please provide your HubSpot access token:
(This will be stored securely in ~/.hive/credentials)
```
**4.2c. Run Health Check Before Storing**
```python
from aden_tools.credentials import check_credential_health
result = check_credential_health("hubspot", user_provided_token)
if not result.valid:
print(f"Warning: {result.message}")
# Ask user if they want to:
# 1. Try a different token
# 2. Continue anyway (not recommended)
```
**4.2d. Store in Encrypted Credential Store**
```python
from core.framework.credentials import CredentialStore, CredentialObject, CredentialKey
from pydantic import SecretStr
store = CredentialStore.with_encrypted_storage()
cred = CredentialObject(
id="hubspot",
name="HubSpot Access Token",
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr(user_provided_token),
)
},
)
store.save_credential(cred)
```
**4.2e. Export to Current Session**
```bash
export HUBSPOT_ACCESS_TOKEN="the-value"
```
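If the rest of this setup continues in the same Python session (for example, before re-running validation in Step 6), the variable can also be set for the current process. A small sketch using `os.environ`; `user_provided_token` is the value collected in step 4.2b.
```python
import os

# Make the credential visible to the current Python process as well,
# so a follow-up AgentRunner.load(...).validate() sees it without a new shell.
os.environ["HUBSPOT_ACCESS_TOKEN"] = user_provided_token
```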
#### Option 3: Custom Credential Store (Advanced)
For programmatic/CI/CD setups.
**4.3a. Show Documentation**
```
For advanced credential management, you can use the CredentialStore API directly:
from core.framework.credentials import CredentialStore, CredentialObject, CredentialKey
from pydantic import SecretStr
store = CredentialStore.with_encrypted_storage()
cred = CredentialObject(
id="hubspot",
name="HubSpot Access Token",
keys={"access_token": CredentialKey(name="access_token", value=SecretStr("..."))}
)
store.save_credential(cred)
For CI/CD environments:
- Set HIVE_CREDENTIAL_KEY for encryption
- Pre-populate ~/.hive/credentials programmatically
- Or use environment variables directly (HUBSPOT_ACCESS_TOKEN)
Documentation: See core/framework/credentials/README.md
```
### Step 5: Record Configuration Method
Track which auth method was used for each credential in `~/.hive/configuration.json`:
```python
import json
from pathlib import Path
from datetime import datetime
config_path = Path.home() / ".hive" / "configuration.json"
config = json.loads(config_path.read_text()) if config_path.exists() else {}
if "credential_methods" not in config:
config["credential_methods"] = {}
config["credential_methods"]["hubspot"] = {
"method": "aden", # or "direct" or "custom"
"configured_at": datetime.now().isoformat(),
}
config_path.write_text(json.dumps(config, indent=2))
```
### Step 6: Verify All Credentials
Run validation again to confirm everything is set:
```python
runner = AgentRunner.load("exports/{agent_name}")
validation = runner.validate()
assert not validation.missing_credentials, "Still missing credentials!"
```
Report the result to the user.
## Health Check Reference
Health checks validate credentials by making lightweight API calls:
| Credential | Endpoint | What It Checks |
| -------------- | --------------------------------------- | --------------------------------- |
| `hubspot` | `GET /crm/v3/objects/contacts?limit=1` | Bearer token validity, CRM scopes |
| `brave_search` | `GET /res/v1/web/search?q=test&count=1` | API key validity |
```python
from aden_tools.credentials import check_credential_health, HealthCheckResult
result: HealthCheckResult = check_credential_health("hubspot", token_value)
# result.valid: bool
# result.message: str
# result.details: dict (status_code, rate_limited, etc.)
```
## Encryption Key (HIVE_CREDENTIAL_KEY)
The encrypted credential store requires `HIVE_CREDENTIAL_KEY` to encrypt/decrypt credentials.
- If the user doesn't have one, `EncryptedFileStorage` will auto-generate one and log it
- The user MUST persist this key (e.g., in `~/.bashrc` or a secrets manager)
- Without this key, stored credentials cannot be decrypted
- This is the ONLY secret that should live in `~/.bashrc` or environment config
If `HIVE_CREDENTIAL_KEY` is not set:
1. Let the store generate one
2. Tell the user to save it: `export HIVE_CREDENTIAL_KEY="{generated_key}"`
3. Recommend adding it to `~/.bashrc` or their shell profile
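To hand the user a key proactively instead of waiting for the auto-generated one to appear in logs, a key can also be generated up front. This sketch assumes a Fernet-compatible key, since the storage layer uses Fernet; confirm the exact format expected by `EncryptedFileStorage` before relying on it.
```python
import os

from cryptography.fernet import Fernet

if not os.environ.get("HIVE_CREDENTIAL_KEY"):
    # Assumption: EncryptedFileStorage accepts a Fernet-style key.
    generated_key = Fernet.generate_key().decode()
    print("No HIVE_CREDENTIAL_KEY set. Generated one for you:")
    print(f'  export HIVE_CREDENTIAL_KEY="{generated_key}"')
    print("Add this line to your shell profile and keep it safe;")
    print("without it, stored credentials cannot be decrypted.")
```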
## Security Rules
- **NEVER** log, print, or echo credential values in tool output
- **NEVER** store credentials in plaintext files, git-tracked files, or agent configs
- **NEVER** hardcode credentials in source code
- **ALWAYS** use `SecretStr` from Pydantic when handling credential values in Python (see the sketch after this list)
- **ALWAYS** use the encrypted credential store (`~/.hive/credentials`) for persistence
- **ALWAYS** run health checks before storing credentials (when possible)
- **ALWAYS** verify credentials were stored by re-running validation, not by reading them back
- When modifying `~/.bashrc` or `~/.zshrc`, confirm with the user first
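As a concrete illustration of the `SecretStr` rule: printing or interpolating a wrapped value masks it, and the raw secret is only exposed through an explicit `get_secret_value()` call at the point of use.
```python
from pydantic import SecretStr

token = SecretStr("sk-example-not-a-real-token")

print(token)                    # prints ********** (value is masked)
print(f"Using token: {token}")  # masked in f-strings and logs as well
raw = token.get_secret_value()  # unwrap only where needed, e.g. an Authorization header
```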
## Credential Sources Reference
All credential specs are defined in `tools/src/aden_tools/credentials/`:
| File | Category | Credentials | Aden Supported |
| ----------------- | ------------- | --------------------------------------------- | -------------- |
| `llm.py` | LLM Providers | `anthropic` | No |
| `search.py` | Search Tools | `brave_search`, `google_search`, `google_cse` | No |
| `integrations.py` | Integrations | `hubspot` | Yes |
**Note:** Additional LLM providers (Cerebras, Groq, OpenAI) are handled by LiteLLM via environment
variables (`CEREBRAS_API_KEY`, `GROQ_API_KEY`, `OPENAI_API_KEY`) but are not yet in CREDENTIAL_SPECS.
Add them to `llm.py` as needed.
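If you do add one, the entry presumably mirrors the fields already exposed on existing specs (`env_var`, `description`, `help_url`, `api_key_instructions`, `aden_supported`, `direct_api_key_supported`). The sketch below is hypothetical: copy the actual class name and constructor from the existing `anthropic` entry in `llm.py` rather than from this example.
```python
# Hypothetical sketch - mirror the existing "anthropic" entry in llm.py.
# Field names are taken from spec attributes used elsewhere in this document.
CREDENTIAL_SPECS["openai"] = CredentialSpec(  # class name is an assumption
    env_var="OPENAI_API_KEY",
    description="OpenAI API key for LLM calls via LiteLLM",
    help_url="https://platform.openai.com/api-keys",
    api_key_instructions="Create a key under API keys in the OpenAI dashboard",
    aden_supported=False,
    direct_api_key_supported=True,
)
```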
To check what's registered:
```python
from aden_tools.credentials import CREDENTIAL_SPECS
for name, spec in CREDENTIAL_SPECS.items():
print(f"{name}: aden={spec.aden_supported}, direct={spec.direct_api_key_supported}")
```
## Migration: CredentialManager → CredentialStore
**CredentialManager is deprecated.** Use CredentialStore instead.
| Old (Deprecated) | New (Recommended) |
| ----------------------------------------- | -------------------------------------------------------------------- |
| `CredentialManager()` | `CredentialStore.with_encrypted_storage()` |
| `creds.get("hubspot")` | `store.get("hubspot")` or `store.get_key("hubspot", "access_token")` |
| `creds.validate_for_tools(tools)` | Use `store.is_available(cred_id)` per credential |
| `creds.get_auth_options("hubspot")` | Check `CREDENTIAL_SPECS["hubspot"].aden_supported` |
| `creds.get_setup_instructions("hubspot")` | Access `CREDENTIAL_SPECS["hubspot"]` directly |
**Why migrate?**
- **CredentialStore** supports encrypted storage, multi-key credentials, template resolution, and automatic token refresh
- **CredentialManager** only reads from environment variables and .env files (no encryption, no refresh)
- **CredentialStoreAdapter** exists for backward compatibility during migration
```python
# Old way (deprecated)
from aden_tools.credentials import CredentialManager
creds = CredentialManager()
token = creds.get("hubspot")
# New way (recommended)
from core.framework.credentials import CredentialStore
store = CredentialStore.with_encrypted_storage()
token = store.get("hubspot")
# With Aden sync (recommended for OAuth integrations)
store = CredentialStore.with_aden_sync()
token = store.get_key("hubspot", "access_token")
```
## Example Session
```
User: /setup-credentials for my hubspot-agent
Agent: Let me check what credentials your hubspot-agent needs.
[Runs validation, finds ANTHROPIC_API_KEY and HUBSPOT_ACCESS_TOKEN missing]
Agent: Your hubspot-agent requires 2 credentials:
1. ANTHROPIC_API_KEY - Anthropic API key for LLM calls
2. HUBSPOT_ACCESS_TOKEN - HubSpot access token
Let's set these up one at a time.
--- Setting up ANTHROPIC_API_KEY ---
This credential only supports direct API key entry.
[Shows instructions for getting Anthropic API key]
[AskUserQuestion: "Please provide your Anthropic API key:"]
[User provides key]
[Stores in credential store, exports to env]
--- Setting up HUBSPOT_ACCESS_TOKEN ---
[AskUserQuestion with 3 options]
Choose how to configure HUBSPOT_ACCESS_TOKEN:
1) Aden Authorization Server (Recommended)
2) Direct API Key
3) Custom Credential Store (Advanced)
User: 1 (Aden)
Agent: Great! Let me check if you have an Aden API key configured...
[Checks for ADEN_API_KEY, not found]
[Opens browser to integration.adenhq.com/setup]
Agent: I've opened Aden's setup page. Please:
1. Sign in or create an account
2. Connect your HubSpot account (OAuth happens on Aden's site)
3. Copy the API key shown after setup
[AskUserQuestion: "Please provide your Aden API key:"]
[User provides key]
Agent: Would you like me to save this to your shell config (~/.zshrc) for future sessions?
[User approves]
[Saves to ~/.zshrc and ~/.hive/configuration.json]
Agent: Let me sync your credentials from Aden...
[Syncs credentials from Aden server - OAuth already done on Aden's side]
[Runs health check]
Agent: HubSpot credentials validated successfully!
All credentials are now configured:
- ANTHROPIC_API_KEY: Stored in encrypted credential store
- HUBSPOT_ACCESS_TOKEN: Synced from Aden (OAuth completed on Aden)
- Validation passed - your agent is ready to run!
```
File diff suppressed because it is too large
@@ -0,0 +1,351 @@
# Example: Testing a YouTube Research Agent
This example walks through testing a YouTube research agent that finds relevant videos based on a topic.
## Prerequisites
- Agent built with building-agents skill at `exports/youtube-research/`
- Goal defined with success criteria and constraints
## Step 1: Load the Goal
First, load the goal that was defined during the Goal stage:
```json
{
"id": "youtube-research",
"name": "YouTube Research Agent",
"description": "Find relevant YouTube videos on a given topic",
"success_criteria": [
{
"id": "find_videos",
"description": "Find 3-5 relevant videos",
"metric": "video_count",
"target": "3-5",
"weight": 1.0
},
{
"id": "relevance",
"description": "Videos must be relevant to the topic",
"metric": "relevance_score",
"target": ">0.8",
"weight": 0.8
}
],
"constraints": [
{
"id": "api_limits",
"description": "Must not exceed YouTube API rate limits",
"constraint_type": "hard",
"category": "technical"
},
{
"id": "content_safety",
"description": "Must filter out inappropriate content",
"constraint_type": "hard",
"category": "safety"
}
]
}
```
## Step 2: Get Constraint Test Guidelines
During the Goal stage (or early Eval), get test guidelines for constraints:
```python
result = generate_constraint_tests(
goal_id="youtube-research",
goal_json='<goal JSON above>',
agent_path="exports/youtube-research"
)
```
**The result contains guidelines (not generated tests):**
- `output_file`: Where to write tests
- `file_header`: Imports and fixtures to use
- `test_template`: Format for test functions
- `constraints_formatted`: The constraints to test
- `test_guidelines`: Rules for writing tests
## Step 3: Write Constraint Tests
Using the guidelines, write tests directly with the Write tool:
```python
# Write constraint tests using the provided file_header and guidelines
Write(
file_path="exports/youtube-research/tests/test_constraints.py",
content='''
"""Constraint tests for youtube-research agent."""
import os
import pytest
from exports.youtube_research import default_agent
pytestmark = pytest.mark.skipif(
not os.environ.get("ANTHROPIC_API_KEY") and not os.environ.get("MOCK_MODE"),
reason="API key required for real testing."
)
@pytest.mark.asyncio
async def test_constraint_api_limits_respected():
"""Verify API rate limits are not exceeded."""
import time
mock_mode = bool(os.environ.get("MOCK_MODE"))
for i in range(10):
result = await default_agent.run({"topic": f"test_{i}"}, mock_mode=mock_mode)
time.sleep(0.1)
# Should complete without rate limit errors
assert "rate limit" not in str(result).lower()
@pytest.mark.asyncio
async def test_constraint_content_safety_filter():
"""Verify inappropriate content is filtered."""
mock_mode = bool(os.environ.get("MOCK_MODE"))
result = await default_agent.run({"topic": "general topic"}, mock_mode=mock_mode)
for video in result.videos:
assert video.safe_for_work is True
assert video.age_restricted is False
'''
)
```
## Step 4: Get Success Criteria Test Guidelines
After the agent is built, get success criteria test guidelines:
```python
result = generate_success_tests(
goal_id="youtube-research",
goal_json='<goal JSON>',
node_names="search_node,filter_node,rank_node,format_node",
tool_names="youtube_search,video_details,channel_info",
agent_path="exports/youtube-research"
)
```
## Step 5: Write Success Criteria Tests
Using the guidelines, write success criteria tests:
```python
Write(
file_path="exports/youtube-research/tests/test_success_criteria.py",
content='''
"""Success criteria tests for youtube-research agent."""
import os
import pytest
from exports.youtube_research import default_agent
pytestmark = pytest.mark.skipif(
not os.environ.get("ANTHROPIC_API_KEY") and not os.environ.get("MOCK_MODE"),
reason="API key required for real testing."
)
@pytest.mark.asyncio
async def test_find_videos_happy_path():
"""Test finding videos for a common topic."""
mock_mode = bool(os.environ.get("MOCK_MODE"))
result = await default_agent.run({"topic": "machine learning"}, mock_mode=mock_mode)
assert result.success
assert 3 <= len(result.videos) <= 5
assert all(v.title for v in result.videos)
assert all(v.video_id for v in result.videos)
@pytest.mark.asyncio
async def test_find_videos_minimum_boundary():
"""Test at minimum threshold (3 videos)."""
mock_mode = bool(os.environ.get("MOCK_MODE"))
result = await default_agent.run({"topic": "niche topic xyz"}, mock_mode=mock_mode)
assert len(result.videos) >= 3
@pytest.mark.asyncio
async def test_relevance_score_threshold():
"""Test relevance scoring meets threshold."""
mock_mode = bool(os.environ.get("MOCK_MODE"))
result = await default_agent.run({"topic": "python programming"}, mock_mode=mock_mode)
for video in result.videos:
assert video.relevance_score > 0.8
@pytest.mark.asyncio
async def test_find_videos_no_results_graceful():
"""Test graceful handling of no results."""
mock_mode = bool(os.environ.get("MOCK_MODE"))
result = await default_agent.run({"topic": "xyznonexistent123"}, mock_mode=mock_mode)
# Should not crash, return empty or message
assert result.videos == [] or result.message
'''
)
```
## Step 6: Run All Tests
Execute all tests:
```python
result = run_tests(
goal_id="youtube-research",
agent_path="exports/youtube-research",
test_types='["all"]',
parallel=4
)
```
**Results:**
```json
{
"goal_id": "youtube-research",
"overall_passed": false,
"summary": {
"total": 6,
"passed": 5,
"failed": 1,
"pass_rate": "83.3%"
},
"duration_ms": 4521,
"results": [
{"test_id": "test_constraint_api_001", "passed": true, "duration_ms": 1234},
{"test_id": "test_constraint_content_001", "passed": true, "duration_ms": 456},
{"test_id": "test_success_001", "passed": true, "duration_ms": 789},
{"test_id": "test_success_002", "passed": true, "duration_ms": 654},
{"test_id": "test_success_003", "passed": true, "duration_ms": 543},
{"test_id": "test_success_004", "passed": false, "duration_ms": 845,
"error_category": "IMPLEMENTATION_ERROR",
"error_message": "TypeError: 'NoneType' object has no attribute 'videos'"}
]
}
```
## Step 7: Debug the Failed Test
```python
result = debug_test(
goal_id="youtube-research",
test_name="test_find_videos_no_results_graceful",
agent_path="exports/youtube-research"
)
```
**Debug Output:**
```json
{
"test_id": "test_success_004",
"test_name": "test_find_videos_no_results_graceful",
"input": {"topic": "xyznonexistent123"},
"expected": "Empty list or message",
"actual": {"error": "TypeError: 'NoneType' object has no attribute 'videos'"},
"passed": false,
"error_message": "TypeError: 'NoneType' object has no attribute 'videos'",
"error_category": "IMPLEMENTATION_ERROR",
"stack_trace": "Traceback (most recent call last):\n File \"filter_node.py\", line 42\n for video in result.videos:\nTypeError: 'NoneType' object has no attribute 'videos'",
"logs": [
{"timestamp": "2026-01-20T10:00:01", "node": "search_node", "level": "INFO", "msg": "Searching for: xyznonexistent123"},
{"timestamp": "2026-01-20T10:00:02", "node": "search_node", "level": "WARNING", "msg": "No results found"},
{"timestamp": "2026-01-20T10:00:02", "node": "filter_node", "level": "ERROR", "msg": "NoneType error"}
],
"runtime_data": {
"execution_path": ["start", "search_node", "filter_node"],
"node_outputs": {
"search_node": null
}
},
"suggested_fix": "Add null check in filter_node before accessing .videos attribute",
"iteration_guidance": {
"stage": "Agent",
"action": "Fix the code in nodes/edges",
"restart_required": false,
"description": "The goal is correct, but filter_node doesn't handle null results from search_node."
}
}
```
## Step 8: Iterate Based on Category
Since this is an **IMPLEMENTATION_ERROR**, we:
1. **Don't restart** the Goal → Agent → Eval flow
2. **Fix the agent** using building-agents skill:
- Modify `filter_node` to handle null results
3. **Re-run Eval** (tests only)
### Fix in building-agents:
```python
# Update the filter_node to handle null
add_node(
node_id="filter_node",
name="Filter Node",
description="Filter and rank videos",
node_type="function",
input_keys=["search_results"],
output_keys=["filtered_videos"],
system_prompt="""
Filter videos by relevance.
IMPORTANT: Handle case where search_results is None or empty.
Return empty list if no results.
"""
)
```
### Re-export and re-test:
```python
# Re-export the fixed agent
export_graph(path="exports/youtube-research")
# Re-run tests
result = run_tests(
goal_id="youtube-research",
agent_path="exports/youtube-research",
test_types='["all"]'
)
```
**Updated Results:**
```json
{
"goal_id": "youtube-research",
"overall_passed": true,
"summary": {
"total": 6,
"passed": 6,
"failed": 0,
"pass_rate": "100.0%"
}
}
```
## Summary
1. **Got guidelines** for constraint tests during Goal stage
2. **Wrote** constraint tests using Write tool
3. **Got guidelines** for success criteria tests during Eval stage
4. **Wrote** success criteria tests using Write tool
5. **Ran** tests in parallel
6. **Debugged** the one failure
7. **Categorized** as IMPLEMENTATION_ERROR
8. **Fixed** the agent (not the goal)
9. **Re-ran** Eval only (didn't restart full flow)
10. **Passed** all tests
The agent is now validated and ready for production use.
+145
View File
@@ -0,0 +1,145 @@
# Triage Issue Skill
Analyze a GitHub issue, verify claims against the codebase, and close invalid issues with a technical response.
## Trigger
User provides a GitHub issue URL or number, e.g.:
- `/triage-issue 1970`
- `/triage-issue https://github.com/adenhq/hive/issues/1970`
## Workflow
### Step 1: Fetch Issue Details
```bash
gh issue view <number> --repo adenhq/hive --json title,body,state,labels,author
```
Extract:
- Title
- Body (the claim/bug report)
- Current state
- Labels
- Author
If the issue is already closed, inform the user and stop.
### Step 2: Analyze the Claim
Read the issue body and identify:
1. **The core claim** - What is the user asserting?
2. **Technical specifics** - File paths, function names, code snippets mentioned
3. **Expected behavior** - What do they think should happen?
4. **Severity claimed** - Security issue? Bug? Feature request?
### Step 3: Investigate the Codebase
For each technical claim:
1. Find the referenced code using Grep/Glob/Read
2. Understand the actual implementation
3. Check if the claim accurately describes the behavior
4. Look for related tests, documentation, or design decisions
### Step 4: Evaluate Validity
Categorize the issue as one of:
| Category | Action |
|----------|--------|
| **Valid Bug** | Do NOT close. Inform user this is a real issue. |
| **Valid Feature Request** | Do NOT close. Suggest labeling appropriately. |
| **Misunderstanding** | Prepare technical explanation for why behavior is correct. |
| **Fundamentally Flawed** | Prepare critique explaining the technical impossibility or design rationale. |
| **Duplicate** | Find the original issue and prepare duplicate notice. |
| **Incomplete** | Prepare request for more information. |
### Step 5: Draft Response
For issues to be closed, draft a response that:
1. **Acknowledges the concern** - Don't be dismissive
2. **Explains the actual behavior** - With code references
3. **Provides technical rationale** - Why it works this way
4. **References industry standards** - If applicable
5. **Offers alternatives** - If there's a better approach for the user
Use this template:
```markdown
## Analysis
[Brief summary of what was investigated]
## Technical Details
[Explanation with code references]
## Why This Is Working As Designed
[Rationale]
## Recommendation
[What the user should do instead, if applicable]
---
*This issue was reviewed and closed by the maintainers.*
```
### Step 6: User Review
Present the draft to the user with:
```
## Issue #<number>: <title>
**Claim:** <summary of claim>
**Finding:** <valid/invalid/misunderstanding/etc>
**Draft Response:**
<the markdown response>
---
Do you want me to post this comment and close the issue?
```
Use AskUserQuestion with options:
- "Post and close" - Post comment, close issue
- "Edit response" - Let user modify the response
- "Skip" - Don't take action
### Step 7: Execute Action
If user approves:
```bash
# Post comment
gh issue comment <number> --repo adenhq/hive --body "<response>"
# Close issue
gh issue close <number> --repo adenhq/hive --reason "not planned"
```
Report success with a link to the issue.
## Important Guidelines
1. **Never close valid issues** - If there's any merit to the claim, don't close it
2. **Be respectful** - The reporter took time to file the issue
3. **Be technical** - Provide code references and evidence
4. **Be educational** - Help them understand, don't just dismiss
5. **Check twice** - Make sure you understand the code before declaring something invalid
6. **Consider edge cases** - Maybe their environment reveals a real issue
## Example Critiques
### Security Misunderstanding
> "The claim that secrets are exposed in plaintext misunderstands the encryption architecture. While `SecretStr` is used for logging protection, actual encryption is provided by Fernet (AES-128-CBC) at the storage layer. The code path is: serialize → encrypt → write. Only encrypted bytes touch disk."
### Impossible Request
> "The requested feature would require [X] which violates [fundamental constraint]. This is not a limitation of our implementation but a fundamental property of [technology/protocol]."
### Already Handled
> "This scenario is already handled by [code reference]. The reporter may be using an older version or misconfigured environment."
+20
View File
@@ -0,0 +1,20 @@
{
"mcpServers": {
"agent-builder": {
"command": "python",
"args": ["-m", "framework.mcp.agent_builder_server"],
"cwd": "core",
"env": {
"PYTHONPATH": "../tools/src"
}
},
"tools": {
"command": "python",
"args": ["mcp_server.py", "--stdio"],
"cwd": "tools",
"env": {
"PYTHONPATH": "src"
}
}
}
}
+1
View File
@@ -0,0 +1 @@
../../.claude/skills/agent-workflow
+1
View File
@@ -0,0 +1 @@
../../.claude/skills/building-agents-construction
+1
View File
@@ -0,0 +1 @@
../../.claude/skills/building-agents-core
+1
View File
@@ -0,0 +1 @@
../../.claude/skills/building-agents-patterns
+1
View File
@@ -0,0 +1 @@
../../.claude/skills/testing-agent
+18
View File
@@ -0,0 +1,18 @@
This project uses ruff for Python linting and formatting.
Rules:
- Line length: 100 characters
- Python target: 3.11+
- Use double quotes for strings
- Sort imports with isort (ruff I rules): stdlib, third-party, first-party (framework), local
- Combine as-imports
- Use type hints on all function signatures
- Use `from __future__ import annotations` for modern type syntax
- Raise exceptions with `from` in except blocks (B904)
- No unused imports (F401), no unused variables (F841)
- Prefer list/dict/set comprehensions over map/filter (C4)
Run `make lint` to auto-fix, `make check` to verify without modifying files.
Run `make format` to apply ruff formatting.
The ruff config lives in core/pyproject.toml under [tool.ruff].
+3
View File
@@ -11,6 +11,9 @@ indent_size = 2
insert_final_newline = true
trim_trailing_whitespace = true
[*.py]
indent_size = 4
[*.md]
trim_trailing_whitespace = false
+124
View File
@@ -0,0 +1,124 @@
# Normalize line endings for all text files
* text=auto
# Source code
*.py text diff=python
*.js text
*.ts text
*.jsx text
*.tsx text
*.json text
*.yaml text
*.yml text
*.toml text
*.ini text
*.cfg text
# Shell scripts (must use LF)
*.sh text eol=lf
quickstart.sh text eol=lf
# PowerShell scripts (Windows-friendly)
*.ps1 text eol=lf
*.psm1 text eol=lf
# Windows batch files (must use CRLF)
*.bat text eol=crlf
*.cmd text eol=crlf
# Documentation
*.md text
*.txt text
*.rst text
*.tex text
# Configuration files
.gitignore text
.gitattributes text
.editorconfig text
Dockerfile text
docker-compose.yml text
requirements*.txt text
pyproject.toml text
setup.py text
setup.cfg text
MANIFEST.in text
LICENSE text
README* text
CHANGELOG* text
CONTRIBUTING* text
CODE_OF_CONDUCT* text
# Web files
*.html text
*.css text
*.scss text
*.sass text
# Data files
*.xml text
*.csv text
*.sql text
# Graphics (binary)
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.ico binary
*.svg binary
*.eps binary
*.bmp binary
*.tif binary
*.tiff binary
# Archives (binary)
*.zip binary
*.tar binary
*.gz binary
*.bz2 binary
*.7z binary
*.rar binary
# Python compiled (binary)
*.pyc binary
*.pyo binary
*.pyd binary
*.whl binary
*.egg binary
# System libraries (binary)
*.so binary
*.dll binary
*.dylib binary
*.lib binary
*.a binary
# Documents (binary)
*.pdf binary
*.doc binary
*.docx binary
*.ppt binary
*.pptx binary
*.xls binary
*.xlsx binary
# Fonts (binary)
*.ttf binary
*.otf binary
*.woff binary
*.woff2 binary
*.eot binary
# Audio/Video (binary)
*.mp3 binary
*.mp4 binary
*.wav binary
*.avi binary
*.mov binary
*.flv binary
# Database files (binary)
*.db binary
*.sqlite binary
*.sqlite3 binary
-1
View File
@@ -8,7 +8,6 @@
/hive/ @adenhq/maintainers
# Infrastructure
/docker-compose*.yml @adenhq/maintainers
/.github/ @adenhq/maintainers
# Documentation
+3 -4
View File
@@ -29,13 +29,12 @@ If applicable, add screenshots to help explain your problem.
## Environment
- OS: [e.g., Ubuntu 22.04, macOS 14]
- Docker version: [e.g., 24.0.0]
- Node version: [e.g., 20.10.0]
- Browser (if applicable): [e.g., Chrome 120]
- Python version: [e.g., 3.11.0]
- Docker version (if applicable): [e.g., 24.0.0]
## Configuration
Relevant parts of your `config.yaml` (remove any sensitive data):
Relevant parts of your agent configuration or environment setup (remove any sensitive data):
```yaml
# paste here
+2 -2
View File
@@ -24,8 +24,8 @@ Fixes #(issue number)
Describe the tests you ran to verify your changes:
- [ ] Unit tests pass (`npm run test`)
- [ ] Lint passes (`npm run lint`)
- [ ] Unit tests pass (`cd core && pytest tests/`)
- [ ] Lint passes (`cd core && ruff check .`)
- [ ] Manual testing performed
## Checklist
@@ -0,0 +1,34 @@
name: Auto-close duplicate issues
description: Auto-closes issues that are duplicates of existing issues
on:
schedule:
- cron: "0 */6 * * *"
workflow_dispatch:
jobs:
auto-close-duplicates:
runs-on: ubuntu-latest
timeout-minutes: 10
permissions:
contents: read
issues: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup Bun
uses: oven-sh/setup-bun@v2
with:
bun-version: latest
- name: Run auto-close-duplicates tests
run: bun test scripts/auto-close-duplicates
- name: Auto-close duplicate issues
run: bun run scripts/auto-close-duplicates.ts
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_REPOSITORY_OWNER: ${{ github.repository_owner }}
GITHUB_REPOSITORY_NAME: ${{ github.event.repository.name }}
STATSIG_API_KEY: ${{ secrets.STATSIG_API_KEY }}
+71 -49
View File
@@ -12,84 +12,106 @@ concurrency:
jobs:
lint:
name: Lint
name: Lint Python
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
- name: Setup Python
uses: actions/setup-python@v5
with:
node-version: '20'
cache: 'npm'
python-version: '3.11'
cache: 'pip'
- name: Install dependencies
run: npm ci
run: |
cd core
pip install -e .
pip install -r requirements-dev.txt
- name: Run linter
run: npm run lint
- name: Ruff lint
run: |
ruff check core/
ruff check tools/
- name: Ruff format
run: |
ruff format --check core/
ruff format --check tools/
test:
name: Test
runs-on: ubuntu-latest
name: Test Python Framework
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-latest, windows-latest]
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
- name: Setup Python
uses: actions/setup-python@v5
with:
node-version: '20'
cache: 'npm'
python-version: '3.11'
cache: 'pip'
- name: Install dependencies
run: npm ci
run: |
cd core
pip install -e .
pip install -r requirements-dev.txt
- name: Run tests
run: npm run test
run: |
cd core
pytest tests/ -v
build:
name: Build
validate:
name: Validate Agent Exports
runs-on: ubuntu-latest
needs: [lint, test]
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
- name: Setup Python
uses: actions/setup-python@v5
with:
node-version: '20'
cache: 'npm'
python-version: '3.11'
cache: 'pip'
- name: Install dependencies
run: npm ci
run: |
cd core
pip install -e .
pip install -r requirements-dev.txt
- name: Build packages
run: npm run build
- name: Validate exported agents
run: |
# Check that agent exports have valid structure
if [ ! -d "exports" ]; then
echo "No exports/ directory found, skipping validation"
exit 0
fi
docker:
name: Docker Build
runs-on: ubuntu-latest
needs: [lint, test]
steps:
- uses: actions/checkout@v4
shopt -s nullglob
agent_dirs=(exports/*/)
shopt -u nullglob
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
if [ ${#agent_dirs[@]} -eq 0 ]; then
echo "No agent directories in exports/, skipping validation"
exit 0
fi
- name: Build frontend image
uses: docker/build-push-action@v5
with:
context: ./honeycomb
push: false
tags: honeycomb-frontend:test
cache-from: type=gha
cache-to: type=gha,mode=max
validated=0
for agent_dir in "${agent_dirs[@]}"; do
if [ -f "$agent_dir/agent.json" ]; then
echo "Validating $agent_dir"
python -c "import json; json.load(open('$agent_dir/agent.json'))"
validated=$((validated + 1))
fi
done
- name: Build backend image
uses: docker/build-push-action@v5
with:
context: ./hive
push: false
tags: honeycomb-backend:test
cache-from: type=gha
cache-to: type=gha,mode=max
if [ "$validated" -eq 0 ]; then
echo "No agent.json files found in exports/, skipping validation"
else
echo "Validated $validated agent(s)"
fi
+97
View File
@@ -0,0 +1,97 @@
name: Issue Triage
on:
issues:
types: [opened]
jobs:
triage:
runs-on: ubuntu-latest
timeout-minutes: 10
permissions:
contents: read
issues: write
id-token: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 1
- name: Triage and check for duplicates
uses: anthropics/claude-code-action@v1
with:
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
github_token: ${{ secrets.GITHUB_TOKEN }}
allowed_non_write_users: "*"
prompt: |
Analyze this new issue and perform triage tasks.
Issue: #${{ github.event.issue.number }}
Repository: ${{ github.repository }}
## Your Tasks:
### 1. Get issue details
Use mcp__github__get_issue to get the full details of issue #${{ github.event.issue.number }}
### 2. Check for duplicates
Search for similar existing issues using mcp__github__search_issues with relevant keywords from the issue title and body.
Criteria for duplicates:
- Same bug or error being reported
- Same feature request (even if worded differently)
- Same question being asked
- Issues describing the same root problem
If you find a duplicate:
- Add a comment using EXACTLY this format (required for auto-close to work):
"Found a possible duplicate of #<issue_number>: <brief explanation of why it's a duplicate>"
- Do NOT apply the "duplicate" label yet (the auto-close script will add it after 12 hours if no objections)
- Suggest the user react with a thumbs-down if they disagree
### 3. Check for Low-Quality / AI Spam
Analyze the issue quality. We are receiving many low-effort, AI-generated spam issues.
Flag the issue as INVALID if it matches these criteria:
- **Vague/Generic**: Title is "Fix bug" or "Error" without specific context.
- **Hallucinated**: Refers to files or features that do not exist in this repo.
- **Template Filler**: Body contains "Insert description here" or unrelated gibberish.
- **Low Effort**: No reproduction steps, no logs, only 1-2 sentences.
If identified as spam/low-quality:
- Add the "invalid" label.
- Add a comment:
"This issue has been automatically flagged as low-quality or potentially AI-generated spam. It lacks specific details (logs, reproduction steps, file references) required for us to help. Please open a new issue following the template exactly if this is a legitimate request."
- Do NOT proceed to other steps.
### 4. Check for invalid issues (General)
If the issue is not spam but still lacks information:
- Add the "invalid" label
- Comment asking for clarification
### 5. Categorize with labels (if NOT a duplicate or spam)
Apply appropriate labels based on the issue content. Use ONLY these labels:
- bug: Something isn't working
- enhancement: New feature or request
- question: Further information is requested
- documentation: Improvements or additions to documentation
- good first issue: Good for newcomers (if issue is well-defined and small scope)
- help wanted: Extra attention is needed (if issue needs community input)
- backlog: Tracked for the future, but not currently planned or prioritized
You may apply multiple labels if appropriate (e.g., "bug" and "help wanted").
## Tools Available:
- mcp__github__get_issue: Get issue details
- mcp__github__search_issues: Search for similar issues
- mcp__github__list_issues: List recent issues if needed
- mcp__github__add_issue_comment: Add a comment
- mcp__github__update_issue: Add labels
- mcp__github__get_issue_comments: Get existing comments
Be thorough but efficient. Focus on accurate categorization and finding true duplicates.
claude_args: |
--model claude-haiku-4-5-20251001
--allowedTools "mcp__github__get_issue,mcp__github__search_issues,mcp__github__list_issues,mcp__github__add_issue_comment,mcp__github__update_issue,mcp__github__get_issue_comments"
+204
View File
@@ -0,0 +1,204 @@
name: PR Check Command
on:
issue_comment:
types: [created]
jobs:
check-pr:
# Only run on PR comments that start with /check
if: github.event.issue.pull_request && startsWith(github.event.comment.body, '/check')
runs-on: ubuntu-latest
permissions:
pull-requests: write
issues: write
checks: write
statuses: write
steps:
- name: Check PR requirements
uses: actions/github-script@v7
with:
script: |
const prNumber = context.payload.issue.number;
console.log(`Triggered by /check comment on PR #${prNumber}`);
// Fetch PR data
const { data: pr } = await github.rest.pulls.get({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
});
const prBody = pr.body || '';
const prTitle = pr.title || '';
const prAuthor = pr.user.login;
const headSha = pr.head.sha;
// Create a check run in progress
const { data: checkRun } = await github.rest.checks.create({
owner: context.repo.owner,
repo: context.repo.repo,
name: 'check-requirements',
head_sha: headSha,
status: 'in_progress',
started_at: new Date().toISOString(),
});
// Extract issue numbers
const issuePattern = /(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)?\s*#(\d+)/gi;
const allText = `${prTitle} ${prBody}`;
const matches = [...allText.matchAll(issuePattern)];
const issueNumbers = [...new Set(matches.map(m => parseInt(m[1], 10)))];
console.log(`PR #${prNumber}:`);
console.log(` Author: ${prAuthor}`);
console.log(` Found issue references: ${issueNumbers.length > 0 ? issueNumbers.join(', ') : 'none'}`);
if (issueNumbers.length === 0) {
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**Missing:** No linked issue found.
**To fix:**
1. Create or find an existing issue for this work
2. Assign yourself to the issue
3. Re-open this PR and add \`Fixes #123\` in the description
**Why is this required?** See #472 for details.`;
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
// Update check run to failure
await github.rest.checks.update({
owner: context.repo.owner,
repo: context.repo.repo,
check_run_id: checkRun.id,
status: 'completed',
conclusion: 'failure',
completed_at: new Date().toISOString(),
output: {
title: 'Missing linked issue',
summary: 'PR must reference an issue (e.g., `Fixes #123`)',
},
});
core.setFailed('PR must reference an issue');
return;
}
// Check if PR author is assigned to any linked issue
let issueWithAuthorAssigned = null;
let issuesWithoutAuthor = [];
for (const issueNum of issueNumbers) {
try {
const { data: issue } = await github.rest.issues.get({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: issueNum,
});
const assigneeLogins = (issue.assignees || []).map(a => a.login);
if (assigneeLogins.includes(prAuthor)) {
issueWithAuthorAssigned = issueNum;
console.log(` Issue #${issueNum} has PR author ${prAuthor} as assignee`);
break;
} else {
issuesWithoutAuthor.push({
number: issueNum,
assignees: assigneeLogins
});
console.log(` Issue #${issueNum} assignees: ${assigneeLogins.length > 0 ? assigneeLogins.join(', ') : 'none'}`);
}
} catch (error) {
console.log(` Issue #${issueNum} not found`);
}
}
if (!issueWithAuthorAssigned) {
const issueList = issuesWithoutAuthor.map(i =>
`#${i.number} (assignees: ${i.assignees.length > 0 ? i.assignees.join(', ') : 'none'})`
).join(', ');
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**PR Author:** @${prAuthor}
**Found issues:** ${issueList}
**Problem:** The PR author must be assigned to the linked issue.
**To fix:**
1. Assign yourself (@${prAuthor}) to one of the linked issues
2. Re-open this PR
**Why is this required?** See #472 for details.`;
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
// Update check run to failure
await github.rest.checks.update({
owner: context.repo.owner,
repo: context.repo.repo,
check_run_id: checkRun.id,
status: 'completed',
conclusion: 'failure',
completed_at: new Date().toISOString(),
output: {
title: 'PR author not assigned to issue',
summary: `PR author @${prAuthor} must be assigned to one of the linked issues: ${issueList}`,
},
});
core.setFailed('PR author must be assigned to the linked issue');
} else {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: `✅ PR requirements met! Issue #${issueWithAuthorAssigned} has @${prAuthor} as assignee.`,
});
// Update check run to success
await github.rest.checks.update({
owner: context.repo.owner,
repo: context.repo.repo,
check_run_id: checkRun.id,
status: 'completed',
conclusion: 'success',
completed_at: new Date().toISOString(),
output: {
title: 'Requirements met',
summary: `Issue #${issueWithAuthorAssigned} has @${prAuthor} as assignee.`,
},
});
console.log(`PR requirements met!`);
}
@@ -0,0 +1,138 @@
name: PR Requirements Backfill
on:
workflow_dispatch:
jobs:
check-all-open-prs:
runs-on: ubuntu-latest
permissions:
pull-requests: write
issues: write
steps:
- name: Check all open PRs
uses: actions/github-script@v7
with:
script: |
const { data: pullRequests } = await github.rest.pulls.list({
owner: context.repo.owner,
repo: context.repo.repo,
state: 'open',
per_page: 100,
});
console.log(`Found ${pullRequests.length} open PRs`);
for (const pr of pullRequests) {
const prNumber = pr.number;
const prBody = pr.body || '';
const prTitle = pr.title || '';
const prAuthor = pr.user.login;
console.log(`\nChecking PR #${prNumber}: ${prTitle}`);
// Extract issue numbers from body and title
const issuePattern = /(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)?\s*#(\d+)/gi;
const allText = `${prTitle} ${prBody}`;
const matches = [...allText.matchAll(issuePattern)];
const issueNumbers = [...new Set(matches.map(m => parseInt(m[1], 10)))];
console.log(` Found issue references: ${issueNumbers.length > 0 ? issueNumbers.join(', ') : 'none'}`);
if (issueNumbers.length === 0) {
console.log(` ❌ No linked issue - closing PR`);
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**Missing:** No linked issue found.
**To fix:**
1. Create or find an existing issue for this work
2. Assign yourself to the issue
3. Re-open this PR and add \`Fixes #123\` in the description`;
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
continue;
}
// Check if any linked issue has the PR author as assignee
let issueWithAuthorAssigned = null;
let issuesWithoutAuthor = [];
for (const issueNum of issueNumbers) {
try {
const { data: issue } = await github.rest.issues.get({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: issueNum,
});
const assigneeLogins = (issue.assignees || []).map(a => a.login);
if (assigneeLogins.includes(prAuthor)) {
issueWithAuthorAssigned = issueNum;
break;
} else {
issuesWithoutAuthor.push({
number: issueNum,
assignees: assigneeLogins
});
}
} catch (error) {
console.log(` Issue #${issueNum} not found or inaccessible`);
}
}
if (!issueWithAuthorAssigned) {
const issueList = issuesWithoutAuthor.map(i =>
`#${i.number} (assignees: ${i.assignees.length > 0 ? i.assignees.join(', ') : 'none'})`
).join(', ');
console.log(` ❌ PR author not assigned to any linked issue - closing PR`);
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**PR Author:** @${prAuthor}
**Found issues:** ${issueList}
**Problem:** The PR author must be assigned to the linked issue.
**To fix:**
1. Assign yourself (@${prAuthor}) to one of the linked issues
2. Re-open this PR`;
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
} else {
console.log(` ✅ PR requirements met! Issue #${issueWithAuthorAssigned} has ${prAuthor} as assignee.`);
}
}
console.log('\nBackfill complete!');
+189
View File
@@ -0,0 +1,189 @@
name: PR Requirements Check
on:
pull_request_target:
types: [opened, reopened, edited, synchronize]
jobs:
check-requirements:
runs-on: ubuntu-latest
permissions:
pull-requests: write
issues: write
steps:
- name: Check PR has linked issue with assignee
uses: actions/github-script@v7
with:
script: |
const pr = context.payload.pull_request;
const prNumber = pr.number;
const prBody = pr.body || '';
const prTitle = pr.title || '';
const prLabels = (pr.labels || []).map(l => l.name);
// Allow micro-fix and documentation PRs without a linked issue
const isMicroFix = prLabels.includes('micro-fix') || /micro-fix/i.test(prTitle);
const isDocumentation = prLabels.includes('documentation') || /\bdocs?\b/i.test(prTitle);
if (isMicroFix || isDocumentation) {
const reason = isMicroFix ? 'micro-fix' : 'documentation';
console.log(`PR #${prNumber} is a ${reason}, skipping issue requirement.`);
return;
}
// Extract issue numbers from body and title
// Matches: fixes #123, closes #123, resolves #123, or plain #123
const issuePattern = /(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)?\s*#(\d+)/gi;
const allText = `${prTitle} ${prBody}`;
const matches = [...allText.matchAll(issuePattern)];
const issueNumbers = [...new Set(matches.map(m => parseInt(m[1], 10)))];
console.log(`PR #${prNumber}:`);
console.log(` Found issue references: ${issueNumbers.length > 0 ? issueNumbers.join(', ') : 'none'}`);
if (issueNumbers.length === 0) {
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**Missing:** No linked issue found.
**To fix:**
1. Create or find an existing issue for this work
2. Assign yourself to the issue
3. Re-open this PR and add \`Fixes #123\` in the description
**Exception:** To bypass this requirement, you can:
- Add the \`micro-fix\` label or include \`micro-fix\` in your PR title for trivial fixes
- Add the \`documentation\` label or include \`doc\`/\`docs\` in your PR title for documentation changes
**Micro-fix requirements** (must meet ALL):
| Qualifies | Disqualifies |
|-----------|--------------|
| < 20 lines changed | Any functional bug fix |
| Typos & Documentation & Linting | Refactoring for "clean code" |
| No logic/API/DB changes | New features (even tiny ones) |
**Why is this required?** See #472 for details.`;
const comments = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
});
const botComment = comments.data.find(
(c) => c.user.type === 'Bot' && c.body.includes('PR Closed - Requirements Not Met')
);
if (!botComment) {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
}
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
core.setFailed('PR must reference an issue');
return;
}
// Check if any linked issue has the PR author as assignee
const prAuthor = pr.user.login;
let issueWithAuthorAssigned = null;
let issuesWithoutAuthor = [];
for (const issueNum of issueNumbers) {
try {
const { data: issue } = await github.rest.issues.get({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: issueNum,
});
const assigneeLogins = (issue.assignees || []).map(a => a.login);
if (assigneeLogins.includes(prAuthor)) {
issueWithAuthorAssigned = issueNum;
console.log(` Issue #${issueNum} has PR author ${prAuthor} as assignee`);
break;
} else {
issuesWithoutAuthor.push({
number: issueNum,
assignees: assigneeLogins
});
console.log(` Issue #${issueNum} assignees: ${assigneeLogins.length > 0 ? assigneeLogins.join(', ') : 'none'} (PR author: ${prAuthor})`);
}
} catch (error) {
console.log(` Issue #${issueNum} not found or inaccessible`);
}
}
if (!issueWithAuthorAssigned) {
const issueList = issuesWithoutAuthor.map(i =>
`#${i.number} (assignees: ${i.assignees.length > 0 ? i.assignees.join(', ') : 'none'})`
).join(', ');
const message = `## PR Closed - Requirements Not Met
This PR has been automatically closed because it doesn't meet the requirements.
**PR Author:** @${prAuthor}
**Found issues:** ${issueList}
**Problem:** The PR author must be assigned to the linked issue.
**To fix:**
1. Assign yourself (@${prAuthor}) to one of the linked issues
2. Re-open this PR
**Exception:** To bypass this requirement, you can:
- Add the \`micro-fix\` label or include \`micro-fix\` in your PR title for trivial fixes
- Add the \`documentation\` label or include \`doc\`/\`docs\` in your PR title for documentation changes
**Micro-fix requirements** (must meet ALL):
| Qualifies | Disqualifies |
|-----------|--------------|
| < 20 lines changed | Any functional bug fix |
| Typos & Documentation & Linting | Refactoring for "clean code" |
| No logic/API/DB changes | New features (even tiny ones) |
**Why is this required?** See #472 for details.`;
const comments = await github.rest.issues.listComments({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
});
const botComment = comments.data.find(
(c) => c.user.type === 'Bot' && c.body.includes('PR Closed - Requirements Not Met')
);
if (!botComment) {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: prNumber,
body: message,
});
}
await github.rest.pulls.update({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: prNumber,
state: 'closed',
});
core.setFailed('PR author must be assigned to the linked issue');
} else {
console.log(`PR requirements met! Issue #${issueWithAuthorAssigned} has ${prAuthor} as assignee.`);
}
+11 -57
View File
@@ -7,7 +7,6 @@ on:
permissions:
contents: write
packages: write
jobs:
release:
@@ -18,20 +17,22 @@ jobs:
with:
fetch-depth: 0
- name: Setup Node.js
uses: actions/setup-node@v4
- name: Setup Python
uses: actions/setup-python@v5
with:
node-version: '20'
cache: 'npm'
python-version: '3.11'
cache: 'pip'
- name: Install dependencies
run: npm ci
- name: Build packages
run: npm run build
run: |
cd core
pip install -e .
pip install -r requirements-dev.txt
- name: Run tests
run: npm run test
run: |
cd core
pytest tests/ -v
- name: Generate changelog
id: changelog
@@ -46,50 +47,3 @@ jobs:
generate_release_notes: true
draft: false
prerelease: ${{ contains(github.ref, '-') }}
docker-publish:
name: Publish Docker Images
runs-on: ubuntu-latest
needs: release
steps:
- uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to GitHub Container Registry
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
images: |
ghcr.io/${{ github.repository }}/frontend
ghcr.io/${{ github.repository }}/backend
tags: |
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=semver,pattern={{major}}
- name: Build and push frontend
uses: docker/build-push-action@v5
with:
context: ./honeycomb
push: true
tags: ghcr.io/${{ github.repository }}/frontend:${{ github.ref_name }}
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Build and push backend
uses: docker/build-push-action@v5
with:
context: ./hive
push: true
tags: ghcr.io/${{ github.repository }}/backend:${{ github.ref_name }}
cache-from: type=gha
cache-to: type=gha,mode=max
+10 -4
View File
@@ -9,12 +9,10 @@ workdir/
.next/
out/
# Environment files (generated from config.yaml)
# Environment files
.env
.env.local
.env.*.local
honeycomb/.env
hive/.env
# User configuration (copied from .example)
config.yaml
@@ -56,6 +54,10 @@ __pycache__/
*.egg-info/
.eggs/
*.egg
uv.lock
# Generated runtime data
core/data/
# Misc
*.local
@@ -63,4 +65,8 @@ __pycache__/
tmp/
temp/
exports/*
exports/*
.agent-builder-sessions/*
.venv
+13 -2
View File
@@ -1,9 +1,20 @@
{
"mcpServers": {
"agent-builder": {
"command": "python",
"command": ".venv/bin/python",
"args": ["-m", "framework.mcp.agent_builder_server"],
"cwd": "/home/timothy/oss/hive/core"
"cwd": "core",
"env": {
"PYTHONPATH": "../tools/src"
}
},
"tools": {
"command": ".venv/bin/python",
"args": ["mcp_server.py", "--stdio"],
"cwd": "tools",
"env": {
"PYTHONPATH": "src:../core"
}
}
}
}
+18
View File
@@ -0,0 +1,18 @@
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.8.6
hooks:
- id: ruff
name: ruff lint (core)
args: [--fix]
files: ^core/
- id: ruff
name: ruff lint (tools)
args: [--fix]
files: ^tools/
- id: ruff-format
name: ruff format (core)
files: ^core/
- id: ruff-format
name: ruff format (tools)
files: ^tools/
+1
View File
@@ -0,0 +1 @@
3.11
+7
View File
@@ -0,0 +1,7 @@
{
"recommendations": [
"charliermarsh.ruff",
"editorconfig.editorconfig",
"ms-python.python"
]
}
+2 -1
View File
@@ -25,8 +25,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Removed
- N/A
### Fixed
- N/A
- tools: Fixed web_scrape tool attempting to parse non-HTML content (PDF, JSON) as HTML (#487)
### Security
- N/A
+78 -31
View File
@@ -1,37 +1,63 @@
# Contributing to Hive
# Contributing to Aden Agent Framework
Thank you for your interest in contributing to Hive! This document provides guidelines and information for contributors.
Thank you for your interest in contributing to the Aden Agent Framework! This document provides guidelines and information for contributors. We're especially looking for help building tools, integrations ([check #2805](https://github.com/adenhq/hive/issues/2805)), and example agents for the framework. If you're interested in extending its functionality, this is the perfect place to start.
## Code of Conduct
By participating in this project, you agree to abide by our [Code of Conduct](CODE_OF_CONDUCT.md).
## Issue Assignment Policy
To prevent duplicate work and respect contributors' time, we require issue assignment before submitting PRs.
### How to Claim an Issue
1. **Find an Issue:** Browse existing issues or create a new one
2. **Claim It:** Leave a comment (e.g., *"I'd like to work on this!"*)
3. **Wait for Assignment:** A maintainer will assign you within 24 hours. Issues with reproducible steps or proposals are prioritized.
4. **Submit Your PR:** Once assigned, you're ready to contribute
> **Note:** PRs for unassigned issues may be delayed or closed if someone else was already assigned.
### Exceptions (No Assignment Needed)
You may submit PRs without prior assignment for:
- **Documentation:** Fixing typos or clarifying instructions — add the `documentation` label or include `doc`/`docs` in your PR title to bypass the linked issue requirement
- **Micro-fixes:** Add the `micro-fix` label or include `micro-fix` in your PR title to bypass the linked issue requirement. Micro-fixes must meet **all** qualification criteria:
| Qualifies | Disqualifies |
|-----------|--------------|
| < 20 lines changed | Any functional bug fix |
| Typos & Documentation & Linting | Refactoring for "clean code" |
| No logic/API/DB changes | New features (even tiny ones) |
## Getting Started
1. Fork the repository
2. Clone your fork: `git clone https://github.com/YOUR_USERNAME/hive.git`
3. Create a feature branch: `git checkout -b feature/your-feature-name`
4. Make your changes
5. Run tests: `npm run test`
5. Run checks and tests:
```bash
make check # Lint and format checks (ruff check + ruff format --check on core/ and tools/)
make test # Core tests (cd core && pytest tests/ -v)
```
6. Commit your changes following our commit conventions
7. Push to your fork and submit a Pull Request
## Development Setup
```bash
# Install dependencies
npm install
# Copy configuration
cp config.yaml.example config.yaml
# Generate environment files
npm run setup
# Start development environment
docker compose up
# Install Python packages and verify setup
./quickstart.sh
```
> **Windows Users:**
> If you are on native Windows, it is recommended to use **WSL (Windows Subsystem for Linux)**.
> Alternatively, make sure to run PowerShell or Git Bash with Python 3.11+ installed, and disable "App Execution Aliases" in Windows settings.
> **Tip:** Installing Claude Code skills is optional for running existing agents, but required if you plan to **build new agents**.
## Commit Convention
We follow [Conventional Commits](https://www.conventionalcommits.org/):
@@ -62,11 +88,12 @@ docs(readme): update installation instructions
## Pull Request Process
1. Update documentation if needed
2. Add tests for new functionality
3. Ensure all tests pass
4. Update the CHANGELOG.md if applicable
5. Request review from maintainers
1. **Get assigned to the issue first** (see [Issue Assignment Policy](#issue-assignment-policy))
2. Update documentation if needed
3. Add tests for new functionality
4. Ensure `make check` and `make test` pass
5. Update the CHANGELOG.md if applicable
6. Request review from maintainers
### PR Title Format
@@ -77,32 +104,52 @@ feat(component): add new feature description
## Project Structure
- `honeycomb/` - React frontend application
- `hive/` - Node.js backend API
- `core/` - Core framework (agent runtime, graph executor, protocols)
- `tools/` - MCP Tools Package (tools for agent capabilities)
- `exports/` - Agent packages and examples
- `docs/` - Documentation
- `scripts/` - Build and utility scripts
- `.claude/` - Claude Code skills for building/testing agents
## Code Style
- Use TypeScript for all new code
- Follow existing code patterns
- Use Python 3.11+ for all new code
- Follow PEP 8 style guide
- Add type hints to function signatures
- Write docstrings for classes and public functions
- Use meaningful variable and function names
- Add comments for complex logic
- Keep functions focused and small
## Testing
```bash
# Run all tests
npm run test
> **Note:** When testing agents in `exports/`, always set PYTHONPATH:
>
> ```bash
> PYTHONPATH=core:exports python -m agent_name test
> ```
# Run tests for a specific package
npm run test --workspace=honeycomb
npm run test --workspace=hive
```bash
# Run lint and format checks (mirrors CI lint job)
make check
# Run core framework tests (mirrors CI test job)
make test
# Or run tests directly
cd core && pytest tests/ -v
# Run tests for a specific agent
PYTHONPATH=core:exports python -m agent_name test
```
> **CI also validates** that all exported agent JSON files (`exports/*/agent.json`) are well-formed JSON. Ensure your agent exports are valid before submitting.
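A quick local equivalent of that check (a sketch using only the standard library):

```bash
# Validate that every exported agent spec parses as JSON
for f in exports/*/agent.json; do
  python -m json.tool "$f" > /dev/null && echo "OK: $f"
done
```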
## Contributor License Agreement
By submitting a Pull Request, you agree that your contributions will be licensed under the Aden Agent Framework license.
## Questions?
Feel free to open an issue for questions or join our [Discord community](https://discord.com/invite/MXE49hrKDk).
Thank you for contributing!
Thank you for contributing!
+421 -928
View File
File diff suppressed because it is too large
+495
View File
@@ -0,0 +1,495 @@
# Agent Development Environment Setup
Complete setup guide for building and running goal-driven agents with the Aden Agent Framework.
## Quick Setup
```bash
# Run the automated setup script
./quickstart.sh
```
> **Note for Windows Users:**
> Running the setup script on native Windows shells (PowerShell / Git Bash) may sometimes fail due to Python App Execution Aliases.
> It is **strongly recommended to use WSL (Windows Subsystem for Linux)** for a smoother setup experience.
This will:
- Check Python version (requires 3.11+)
- Install the core framework package (`framework`)
- Install the tools package (`aden_tools`)
- Fix package compatibility issues (openai + litellm)
- Verify all installations
## Alpine Linux Setup
If you are using Alpine Linux (e.g., inside a Docker container), you must install system dependencies and use a virtual environment before running the setup script:
1. Install System Dependencies:
```bash
apk update
apk add bash git python3 py3-pip nodejs npm curl build-base python3-dev linux-headers libffi-dev
```
2. Set up Virtual Environment (Required for Python 3.12+):
```
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip setuptools wheel
```
3. Run the Quickstart Script:
```
./quickstart.sh
```
## Manual Setup (Alternative)
If you prefer to set up manually or the script fails:
### 1. Install Core Framework
```bash
cd core
pip install -e .
```
### 2. Install Tools Package
```bash
cd tools
pip install -e .
```
### 3. Upgrade OpenAI Package
```bash
# litellm requires openai >= 1.0.0
pip install --upgrade "openai>=1.0.0"
```
### 4. Verify Installation
```bash
python -c "import framework; print('✓ framework OK')"
python -c "import aden_tools; print('✓ aden_tools OK')"
python -c "import litellm; print('✓ litellm OK')"
```
> **Windows Tip:**
> On Windows, if the verification commands fail, ensure you are running them in **WSL** or after **disabling Python App Execution Aliases** in Windows Settings → Apps → App Execution Aliases.
## Requirements
### Python Version
- **Minimum:** Python 3.11
- **Recommended:** Python 3.11 or 3.12
- **Tested on:** Python 3.11, 3.12, 3.13
### System Requirements
- pip (latest version)
- 2GB+ RAM
- Internet connection (for LLM API calls)
- For Windows users: WSL 2 is recommended for full compatibility.
### API Keys (Optional)
For running agents with real LLMs:
```bash
export ANTHROPIC_API_KEY="your-key-here"
```
## Running Agents
All agent commands must be run from the project root with `PYTHONPATH` set:
```bash
# From /hive/ directory
PYTHONPATH=core:exports python -m agent_name COMMAND
```
### Example Commands
After building an agent via `/building-agents-construction`, use these commands:
```bash
# Validate agent structure
PYTHONPATH=core:exports python -m your_agent_name validate
# Show agent information
PYTHONPATH=core:exports python -m your_agent_name info
# Run agent with input
PYTHONPATH=core:exports python -m your_agent_name run --input '{
"task": "Your input here"
}'
# Run in mock mode (no LLM calls)
PYTHONPATH=core:exports python -m your_agent_name run --mock --input '{...}'
```
## Building New Agents and Run Flow
Build and run an agent using Claude Code CLI with the agent building skills:
### 1. Install Claude Skills (One-time)
```bash
./quickstart.sh
```
This verifies agent-related Claude Code skills are available:
- `/building-agents-construction` - Step-by-step build guide
- `/building-agents-core` - Fundamental concepts
- `/building-agents-patterns` - Best practices
- `/testing-agent` - Test and validate agents
- `/agent-workflow` - Complete workflow
### 2. Build an Agent
```
claude> /building-agents-construction
```
Follow the prompts to:
1. Define your agent's goal
2. Design the workflow nodes
3. Connect nodes with edges
4. Generate the agent package under `exports/`
This step creates the initial agent structure required for further development.
### 3. Define Agent Logic
```
claude> /building-agents-core
```
Follow the prompts to:
1. Understand the agent architecture and file structure
2. Define the agent's goal, success criteria, and constraints
3. Learn node types (LLM, tool-use, router, function)
4. Discover and validate available tools before use
This step establishes the core concepts and rules needed before building an agent.
### 4. Apply Agent Patterns
```
claude> /building-agents-patterns
```
Follow the prompts to:
1. Apply best-practice agent design patterns
2. Add pause/resume flows for multi-turn interactions
3. Improve robustness with routing, fallbacks, and retries
4. Avoid common anti-patterns during agent construction
This step helps optimize agent design before final testing.
### 5. Test Your Agent
```
claude> /testing-agent
```
Follow the prompts to:
1. Generate test guidelines for constraints and success criteria
2. Write agent tests directly under `exports/{agent}/tests/`
3. Run goal-based evaluation tests
4. Debug failing tests and iterate on agent improvements
This step verifies that the agent meets its goals before production use.
### 6. Agent Development Workflow (End-to-End)
```
claude> /agent-workflow
```
Follow the guided flow to:
1. Understand core agent concepts (optional)
2. Build the agent structure step by step
3. Apply best-practice design patterns (optional)
4. Test and validate the agent against its goals
This workflow orchestrates all agent-building skills to take you from idea → production-ready agent.
## Troubleshooting
### "externally-managed-environment" error (PEP 668)
**Cause:** Python 3.12+ on macOS/Homebrew, WSL, or some Linux distros prevents system-wide pip installs.
**Solution:** Create and use a virtual environment:
```bash
# Create virtual environment
python3 -m venv .venv
# Activate it
source .venv/bin/activate # macOS/Linux
# .venv\Scripts\activate # Windows
# Then run setup
./quickstart.sh
```
Always activate the venv before running agents:
```bash
source .venv/bin/activate
PYTHONPATH=core:exports python -m your_agent_name demo
```
### "ModuleNotFoundError: No module named 'framework'"
**Solution:** Install the core package:
```bash
cd core && pip install -e .
```
### "ModuleNotFoundError: No module named 'aden_tools'"
**Solution:** Install the tools package:
```bash
cd tools && pip install -e .
```
Or run the setup script:
```bash
./quickstart.sh
```
### "ModuleNotFoundError: No module named 'openai.\_models'"
**Cause:** Outdated `openai` package (0.27.x) incompatible with `litellm`
**Solution:** Upgrade openai:
```bash
pip install --upgrade "openai>=1.0.0"
```
### "No module named 'your_agent_name'"
**Cause:** Not running from project root, missing PYTHONPATH, or agent not yet created
**Solution:** Ensure you're in the project root directory, have built an agent, and use:
```bash
PYTHONPATH=core:exports python -m your_agent_name validate
```
### Agent imports fail with "broken installation"
**Symptom:** `pip list` shows packages pointing to non-existent directories
**Solution:** Reinstall packages properly:
```bash
# Remove broken installations
pip uninstall -y framework tools
# Reinstall correctly
./quickstart.sh
```
## Package Structure
The Hive framework consists of three Python packages:
```
hive/
├── core/ # Core framework (runtime, graph executor, LLM providers)
│ ├── framework/
│ ├── .venv/ # Created by quickstart.sh
│ └── pyproject.toml
├── tools/ # Tools and MCP servers
│ ├── src/
│ │ └── aden_tools/ # Actual package location
│ ├── .venv/ # Created by quickstart.sh
│ └── pyproject.toml
└── exports/ # Agent packages (user-created, gitignored)
└── your_agent_name/ # Created via /building-agents-construction
```
## Separate Virtual Environments
The project uses **separate virtual environments** for `core` and `tools` packages to:
- Isolate dependencies and avoid conflicts
- Allow independent development and testing of each package
- Enable MCP servers to run with their specific dependencies
### How It Works
When you run `./quickstart.sh` or `uv sync` in each directory:
1. **core/.venv/** - Contains the `framework` package and its dependencies (anthropic, litellm, mcp, etc.)
2. **tools/.venv/** - Contains the `aden_tools` package and its dependencies (beautifulsoup4, pandas, etc.)
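For example, to work on one package in isolation you can activate its own environment (a sketch; paths assume the layout created by `quickstart.sh` and the default `tests/` directory in each package):

```bash
# Work on the core framework inside its own venv
cd core
source .venv/bin/activate
pytest tests/ -v
deactivate
cd ..

# Work on the tools package inside its own venv
cd tools
source .venv/bin/activate
pytest tests/ -v
deactivate
cd ..
```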
### Cross-Package Imports
The `core` and `tools` packages are **intentionally independent**:
- **No cross-imports**: `framework` does not import `aden_tools` directly, and vice versa
- **Communication via MCP**: Tools are exposed to agents through MCP servers, not direct Python imports
- **Runtime integration**: The agent runner loads tools via the MCP protocol at runtime
If you need to use both packages in a single script (e.g., for testing), you have two options:
```bash
# Option 1: Install both in a shared environment
python -m venv .venv
source .venv/bin/activate
pip install -e core/ -e tools/
# Option 2: Use PYTHONPATH (for quick testing)
PYTHONPATH=core:tools/src python your_script.py
```
### MCP Server Configuration
The `.mcp.json` at project root configures MCP servers to use their respective virtual environments:
```json
{
"mcpServers": {
"agent-builder": {
"command": "core/.venv/bin/python",
"args": ["-m", "framework.mcp.agent_builder_server"]
},
"tools": {
"command": "tools/.venv/bin/python",
"args": ["-m", "aden_tools.mcp_server", "--stdio"]
}
}
}
```
This ensures each MCP server runs with its correct dependencies.
### Why PYTHONPATH is Required
The packages are installed in **editable mode** (`pip install -e`), which means:
- `framework` and `aden_tools` are globally importable (no PYTHONPATH needed)
- `exports` is NOT installed as a package (PYTHONPATH required)
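A quick way to see the difference (a sketch; `your_agent_name` is whatever agent you have exported):

```bash
# Installed packages import from anywhere
python -c "import framework, aden_tools; print('installed packages OK')"

# Agents under exports/ are plain directories, so the interpreter needs PYTHONPATH
PYTHONPATH=core:exports python -m your_agent_name info
```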
This design allows agents in `exports/` to be:
- Developed independently
- Version controlled separately
- Deployed as standalone packages
## Development Workflow
### 1. Setup (Once)
```bash
./quickstart.sh
```
### 2. Build Agent (Claude Code)
```
claude> /building-agents-construction
Enter goal: "Build an agent that processes customer support tickets"
```
### 3. Validate Agent
```bash
PYTHONPATH=core:exports python -m your_agent_name validate
```
### 4. Test Agent
```
claude> /testing-agent
```
### 5. Run Agent
```bash
PYTHONPATH=core:exports python -m your_agent_name run --input '{...}'
```
## IDE Setup
### VSCode
Add to `.vscode/settings.json`:
```json
{
"python.analysis.extraPaths": [
"${workspaceFolder}/core",
"${workspaceFolder}/exports"
],
"python.autoComplete.extraPaths": [
"${workspaceFolder}/core",
"${workspaceFolder}/exports"
]
}
```
### PyCharm
1. Open Project Settings → Project Structure
2. Mark `core` as Sources Root
3. Mark `exports` as Sources Root
## Environment Variables
### Required for LLM Operations
```bash
export ANTHROPIC_API_KEY="sk-ant-..."
```
### Optional Configuration
```bash
# Credentials storage location (default: ~/.aden/credentials)
export ADEN_CREDENTIALS_PATH="/custom/path"
# Agent storage location (default: /tmp)
export AGENT_STORAGE_PATH="/custom/storage"
```
## Additional Resources
- **Framework Documentation:** [core/README.md](core/README.md)
- **Tools Documentation:** [tools/README.md](tools/README.md)
- **Example Agents:** [exports/](exports/)
- **Agent Building Guide:** [.claude/skills/building-agents-construction/SKILL.md](.claude/skills/building-agents-construction/SKILL.md)
- **Testing Guide:** [.claude/skills/testing-agent/SKILL.md](.claude/skills/testing-agent/SKILL.md)
## Contributing
When contributing agent packages:
1. Place agents in `exports/agent_name/`
2. Follow the standard agent structure (see existing agents)
3. Include README.md with usage instructions
4. Add tests if using `/testing-agent`
5. Document required environment variables
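A typical package shape consistent with these guidelines (file names other than `agent.json` are illustrative; `__main__.py` is implied by the `python -m your_agent_name` commands):

```
exports/your_agent_name/
├── __main__.py     # CLI entry point used by `python -m your_agent_name ...`
├── agent.json      # exported GraphSpec (validated as JSON by CI)
├── README.md       # usage instructions and required environment variables
└── tests/          # goal-based tests written via /testing-agent
```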
## Support
- **Issues:** https://github.com/adenhq/hive/issues
- **Discord:** https://discord.com/invite/MXE49hrKDk
- **Documentation:** https://docs.adenhq.com/
+26
View File
@@ -0,0 +1,26 @@
.PHONY: lint format check test install-hooks help
help: ## Show this help
@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | \
awk 'BEGIN {FS = ":.*?## "}; {printf " \033[36m%-15s\033[0m %s\n", $$1, $$2}'
lint: ## Run ruff linter (with auto-fix)
cd core && ruff check --fix .
cd tools && ruff check --fix .
format: ## Run ruff formatter
cd core && ruff format .
cd tools && ruff format .
check: ## Run all checks without modifying files (CI-safe)
cd core && ruff check .
cd tools && ruff check .
cd core && ruff format --check .
cd tools && ruff format --check .
test: ## Run all tests
cd core && python -m pytest tests/ -v
install-hooks: ## Install pre-commit hooks
pip install pre-commit
pre-commit install
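These targets mirror the commands referenced in CONTRIBUTING.md; typical usage:

```bash
make help           # list available targets
make check          # lint + format checks without modifying files (CI-safe)
make test           # run core framework tests
make install-hooks  # install pre-commit and register the hooks
```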
+51
View File
@@ -0,0 +1,51 @@
## Summary
- **Added HubSpot integration** — new HubSpot MCP tool with search, get, create, and update operations for contacts, companies, and deals. Includes OAuth2 provider for HubSpot credentials and credential store adapter for the tools layer.
- **Replaced web_scrape tool with Playwright + stealth** — swapped httpx/BeautifulSoup for a headless Chromium browser using `playwright` (async API) and `playwright-stealth`, enabling JS-rendered page scraping and bot detection evasion
- **Added empty response retry logic** — LLM provider now detects empty responses (e.g. Gemini returning 200 with no content on rate limit) and retries with exponential backoff, preventing hallucinated output from the cleanup LLM
- **Added context-aware input compaction** — LLM nodes now estimate input token count before calling the model and progressively truncate the largest values if they exceed the context window budget
- **Increased rate limit retries to 10** with verbose `[retry]` and `[compaction]` logging that includes model name, finish reason, and attempt count
- **Updated setup scripts** — `scripts/setup-python.sh` now installs Playwright Chromium browser automatically for web scraping support
- **Interactive quickstart onboarding** — `quickstart.sh` rewritten as a bee-themed interactive wizard that detects existing API keys (including Claude Code subscription), lets the user pick ONE default LLM provider, and saves configuration to `~/.hive/configuration.json`
- **Fixed lint errors** across `hubspot_tool.py` (line length) and `agent_builder_server.py` (unused variable)
## Changed files
### HubSpot Integration
- `tools/src/aden_tools/tools/hubspot_tool/` — New MCP tool: contacts, companies, and deals CRUD
- `tools/src/aden_tools/tools/__init__.py` — Registered HubSpot tools
- `tools/src/aden_tools/credentials/integrations.py` — HubSpot credential integration
- `tools/src/aden_tools/credentials/__init__.py` — Updated credential exports
- `core/framework/credentials/oauth2/hubspot_provider.py` — HubSpot OAuth2 provider
- `core/framework/credentials/oauth2/__init__.py` — Registered HubSpot OAuth2 provider
- `core/framework/runner/runner.py` — Updated runner for credential support
### Web Scrape Rewrite
- `tools/src/aden_tools/tools/web_scrape_tool/web_scrape_tool.py` — Playwright async rewrite
- `tools/src/aden_tools/tools/web_scrape_tool/README.md` — Updated docs
- `tools/pyproject.toml` — Added `playwright`, `playwright-stealth` deps
- `tools/Dockerfile` — Added `playwright install chromium --with-deps`
- `scripts/setup-python.sh` — Added Playwright Chromium browser install step
### LLM Reliability
- `core/framework/llm/litellm.py` — Empty response retry + max retries 10 + verbose logging
- `core/framework/graph/node.py` — Input compaction via `_compact_inputs()`, `_estimate_tokens()`, `_get_context_limit()`
### Quickstart & Setup
- `quickstart.sh` — Interactive bee-themed onboarding wizard with single provider selection
- `~/.hive/configuration.json` — New user config file for default LLM provider/model
### Fixes
- `core/framework/mcp/agent_builder_server.py` — Removed unused variable
- `tools/src/aden_tools/tools/hubspot_tool/hubspot_tool.py` — Fixed E501 line length violations
## Test plan
- [ ] Run `make lint` — passes clean
- [ ] Run `./quickstart.sh` and verify interactive flow works, config saved to `~/.hive/configuration.json`
- [ ] Run `./scripts/setup-python.sh` and verify Playwright Chromium installs
- [ ] Run `pytest tests/tools/test_web_scrape_tool.py -v`
- [ ] Run agent against a JS-heavy site and verify `web_scrape` returns rendered content
- [ ] Set `HUBSPOT_ACCESS_TOKEN` and verify HubSpot tool CRUD operations work
- [ ] Trigger rate limit and verify `[retry]` logs appear with correct attempt counts
- [ ] Run agent with large inputs and verify `[compaction]` logs show truncation
🤖 Generated with [Claude Code](https://claude.com/claude-code)
+247 -180
View File
@@ -2,6 +2,17 @@
<img width="100%" alt="Hive Banner" src="https://storage.googleapis.com/aden-prod-assets/website/aden-title-card.png" />
</p>
<p align="center">
<a href="README.md">English</a> |
<a href="docs/i18n/zh-CN.md">简体中文</a> |
<a href="docs/i18n/es.md">Español</a> |
<a href="docs/i18n/hi.md">हिन्दी</a> |
<a href="docs/i18n/pt.md">Português</a> |
<a href="docs/i18n/ja.md">日本語</a> |
<a href="docs/i18n/ru.md">Русский</a> |
<a href="docs/i18n/ko.md">한국어</a>
</p>
[![Apache 2.0 License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/adenhq/hive/blob/main/LICENSE)
[![Y Combinator](https://img.shields.io/badge/Y%20Combinator-Aden-orange)](https://www.ycombinator.com/companies/aden)
[![Docker Pulls](https://img.shields.io/docker/pulls/adenhq/hive?logo=Docker&labelColor=%23528bff)](https://hub.docker.com/u/adenhq)
@@ -29,6 +40,20 @@ Build reliable, self-improving AI agents without hardcoding workflows. Define yo
Visit [adenhq.com](https://adenhq.com) for complete documentation, examples, and guides.
## What is Aden
<p align="center">
<img width="100%" alt="Aden Architecture" src="docs/assets/aden-architecture-diagram.jpg" />
</p>
Aden is a platform for building, deploying, operating, and adapting AI agents:
- **Build** - A Coding Agent generates specialized Worker Agents (Sales, Marketing, Ops) from natural language goals
- **Deploy** - Headless deployment with CI/CD integration and full API lifecycle management
- **Operate** - Real-time monitoring, observability, and runtime guardrails keep agents reliable
- **Adapt** - Continuous evaluation, supervision, and adaptation ensure agents improve over time
- **Infra** - Shared memory, LLM integrations, tools, and skills power every agent
## Quick Links
- **[Documentation](https://docs.adenhq.com/)** - Complete guides and API reference
@@ -41,8 +66,8 @@ Visit [adenhq.com](https://adenhq.com) for complete documentation, examples, and
### Prerequisites
- [Docker](https://docs.docker.com/get-docker/) (v20.10+)
- [Docker Compose](https://docs.docker.com/compose/install/) (v2.0+)
- [Python 3.11+](https://www.python.org/downloads/) for agent development
- Claude Code or Cursor for utilizing agent skills
### Installation
@@ -51,24 +76,43 @@ Visit [adenhq.com](https://adenhq.com) for complete documentation, examples, and
git clone https://github.com/adenhq/hive.git
cd hive
# Copy and configure
cp config.yaml.example config.yaml
# Run setup and start services
npm run setup
docker compose up
# Run quickstart setup
./quickstart.sh
```
**Access the application:**
This sets up:
- **framework** - Core agent runtime and graph executor (in `core/.venv`)
- **aden_tools** - MCP tools for agent capabilities (in `tools/.venv`)
- All required Python dependencies
- Dashboard: http://localhost:3000
- API: http://localhost:4000
- Health: http://localhost:4000/health
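A quick sanity check after setup (the same import checks used in ENVIRONMENT_SETUP.md):

```bash
python -c "import framework; print('framework OK')"
python -c "import aden_tools; print('aden_tools OK')"
```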
### Build Your First Agent
```bash
# Build an agent using Claude Code
claude> /building-agents-construction
# Test your agent
claude> /testing-agent
# Run your agent
PYTHONPATH=core:exports python -m your_agent_name run --input '{...}'
```
**[📖 Complete Setup Guide](ENVIRONMENT_SETUP.md)** - Detailed instructions for agent development
### Cursor IDE Support
Skills are also available in Cursor. To enable:
1. Open Command Palette (`Cmd+Shift+P` / `Ctrl+Shift+P`)
2. Run `MCP: Enable` to enable MCP servers
3. Restart Cursor to load the MCP servers from `.cursor/mcp.json`
4. Type `/` in Agent chat and search for skills (e.g., `/building-agents-construction`)
## Features
- **Goal-Driven Development** - Define objectives in natural language; the coding agent generates the agent graph and connection code to achieve them
- **Self-Adapting Agents** - Framework captures failures, updates objectives and updates the agent graph
- **Adaptiveness** - Framework captures failures, calibrates according to the objectives, and evolves the agent graph
- **Dynamic Node Connections** - No predefined edges; connection code is generated by any capable LLM based on your goals
- **SDK-Wrapped Nodes** - Every node gets shared memory, local RLM memory, monitoring, tools, and LLM access out of the box
- **Human-in-the-Loop** - Intervention nodes that pause execution for human input with configurable timeouts and escalation
@@ -78,63 +122,50 @@ docker compose up
## Why Aden
Traditional agent frameworks require you to manually design workflows, define agent interactions, and handle failures reactively. Aden flips this paradigm: **you describe outcomes, and the system builds itself**.
Hive focuses on generating agents that run real business processes rather than generic agents. Instead of requiring you to manually design workflows, define agent interactions, and handle failures reactively, Hive flips the paradigm: **you describe outcomes, and the system builds itself**—delivering an outcome-driven, adaptive experience with an easy-to-use set of tools and integrations.
```mermaid
flowchart LR
subgraph BUILD["🏗️ BUILD"]
GOAL["Define Goal<br/>+ Success Criteria"] --> NODES["Add Nodes<br/>LLM/Router/Function"]
NODES --> EDGES["Connect Edges<br/>on_success/failure/conditional"]
EDGES --> TEST["Test & Validate"] --> APPROVE["Approve & Export"]
end
GOAL["Define Goal"] --> GEN["Auto-Generate Graph"]
GEN --> EXEC["Execute Agents"]
EXEC --> MON["Monitor & Observe"]
MON --> CHECK{{"Pass?"}}
CHECK -- "Yes" --> DONE["Deliver Result"]
CHECK -- "No" --> EVOLVE["Evolve Graph"]
EVOLVE --> EXEC
subgraph EXPORT["📦 EXPORT"]
direction TB
JSON["agent.json<br/>(GraphSpec)"]
TOOLS["tools.py<br/>(Functions)"]
MCP["mcp_servers.json<br/>(Integrations)"]
end
GOAL -.- V1["Natural Language"]
GEN -.- V2["Instant Architecture"]
EXEC -.- V3["Easy Integrations"]
MON -.- V4["Full visibility"]
EVOLVE -.- V5["Adaptability"]
DONE -.- V6["Reliable outcomes"]
subgraph RUN["🚀 RUNTIME"]
LOAD["AgentRunner<br/>Load + Parse"] --> SETUP["Setup Runtime<br/>+ ToolRegistry"]
SETUP --> EXEC["GraphExecutor<br/>Execute Nodes"]
subgraph DECISION["Decision Recording"]
DEC1["runtime.decide()<br/>intent → options → choice"]
DEC2["runtime.record_outcome()<br/>success, result, metrics"]
end
end
subgraph INFRA["⚙️ INFRASTRUCTURE"]
CTX["NodeContext<br/>memory • llm • tools"]
STORE[("FileStorage<br/>Runs & Decisions")]
end
APPROVE --> EXPORT
EXPORT --> LOAD
EXEC --> DECISION
EXEC --> CTX
DECISION --> STORE
STORE -.->|"Analyze & Improve"| NODES
style BUILD fill:#ffbe42,stroke:#cc5d00,stroke-width:3px,color:#333
style EXPORT fill:#fff59d,stroke:#ed8c00,stroke-width:2px,color:#333
style RUN fill:#ffb100,stroke:#cc5d00,stroke-width:3px,color:#333
style DECISION fill:#ffcc80,stroke:#ed8c00,stroke-width:2px,color:#333
style INFRA fill:#e8763d,stroke:#cc5d00,stroke-width:3px,color:#fff
style STORE fill:#ed8c00,stroke:#cc5d00,stroke-width:2px,color:#fff
style GOAL fill:#ffbe42,stroke:#cc5d00,stroke-width:2px,color:#333
style GEN fill:#ffb100,stroke:#cc5d00,stroke-width:2px,color:#333
style EXEC fill:#ff9800,stroke:#cc5d00,stroke-width:2px,color:#fff
style MON fill:#ff9800,stroke:#cc5d00,stroke-width:2px,color:#fff
style CHECK fill:#fff59d,stroke:#ed8c00,stroke-width:2px,color:#333
style DONE fill:#4caf50,stroke:#2e7d32,stroke-width:2px,color:#fff
style EVOLVE fill:#e8763d,stroke:#cc5d00,stroke-width:2px,color:#fff
style V1 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
style V2 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
style V3 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
style V4 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
style V5 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
style V6 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
```
### The Aden Advantage
| Traditional Frameworks | Aden |
|------------------------|------|
| Hardcode agent workflows | Describe goals in natural language |
| Manual graph definition | Auto-generated agent graphs |
| Reactive error handling | Proactive self-evolution |
| Static tool configurations | Dynamic SDK-wrapped nodes |
| Separate monitoring setup | Built-in real-time observability |
| DIY budget management | Integrated cost controls & degradation |
| Traditional Frameworks | Aden |
| -------------------------- | -------------------------------------- |
| Hardcode agent workflows | Describe goals in natural language |
| Manual graph definition | Auto-generated agent graphs |
| Reactive error handling | Outcome-evaluation and adaptiveness |
| Static tool configurations | Dynamic SDK-wrapped nodes |
| Separate monitoring setup | Built-in real-time observability |
| DIY budget management | Integrated cost controls & degradation |
### How It Works
@@ -142,113 +173,151 @@ flowchart LR
2. **Coding Agent Generates** → Creates the agent graph, connection code, and test cases
3. **Workers Execute** → SDK-wrapped nodes run with full observability and tool access
4. **Control Plane Monitors** → Real-time metrics, budget enforcement, policy management
5. **Self-Improve** → On failure, the system evolves the graph and redeploys automatically
5. **Adaptiveness** → On failure, the system evolves the graph and redeploys automatically
## How Aden Compares
## Run pre-built Agents (Coming Soon)
Aden takes a fundamentally different approach to agent development. While most frameworks require you to hardcode workflows or manually define agent graphs, Aden uses a **coding agent to generate your entire agent system** from natural language goals. When agents fail, the framework doesn't just log errors—it **automatically evolves the agent graph** and redeploys.
### Run a sample agent
Aden Hive provides a list of featured agents that you can use and build on top of.
### Comparison Table
### Run an agent shared by others
Put the agent in `exports/` and run `PYTHONPATH=core:exports python -m your_agent_name run --input '{...}'`
| Framework | Category | Approach | Aden Difference |
|-----------|----------|----------|-----------------|
| **LangChain, LlamaIndex, Haystack** | Component Libraries | Predefined components for RAG/LLM apps; manual connection logic | Generates entire graph and connection code upfront |
| **CrewAI, AutoGen, Swarm** | Multi-Agent Orchestration | Role-based agents with predefined collaboration patterns | Dynamically creates agents/connections; adapts on failure |
| **PydanticAI, Mastra, Agno** | Type-Safe Frameworks | Structured outputs and validation for known workflows | Evolving workflows; structure emerges through iteration |
| **Agent Zero, Letta** | Personal AI Assistants | Memory and learning; OS-as-tool or stateful memory focus | Production multi-agent systems with self-healing |
| **CAMEL** | Research Framework | Emergent behavior in large-scale simulations (up to 1M agents) | Production-oriented with reliable execution and recovery |
| **TEN Framework, Genkit** | Infrastructure Frameworks | Real-time multimodal (TEN) or full-stack AI (Genkit) | Higher abstraction—generates and evolves agent logic |
| **GPT Engineer, Motia** | Code Generation | Code from specs (GPT Engineer) or "Step" primitive (Motia) | Self-adapting graphs with automatic failure recovery |
| **Trading Agents** | Domain-Specific | Hardcoded trading firm roles on LangGraph | Domain-agnostic; generates structures for any use case |
### When to Choose Aden
Choose Aden when you need:
- Agents that **self-improve from failures** without manual intervention
- **Goal-driven development** where you describe outcomes, not workflows
- **Production reliability** with automatic recovery and redeployment
- **Rapid iteration** on agent architectures without rewriting code
- **Full observability** with real-time monitoring and human oversight
Choose other frameworks when you need:
- **Type-safe, predictable workflows** (PydanticAI, Mastra)
- **RAG and document processing** (LlamaIndex, Haystack)
- **Research on agent emergence** (CAMEL)
- **Real-time voice/multimodal** (TEN Framework)
- **Simple component chaining** (LangChain, Swarm)
## Project Structure
```
hive/
├── honeycomb/ # Frontend Dashboard
├── hive/ # Backend API Server
├── aden-tools/ # MCP Tools Package - 19 tools for agent capabilities
├── docs/ # Documentation and guides
├── scripts/ # Build and utility scripts
├── config.yaml.example # Configuration template
├── docker-compose.yml # Container orchestration
├── DEVELOPER.md # Developer guide
├── CONTRIBUTING.md # Contribution guidelines
└── ROADMAP.md # Product roadmap
```
## Development
### Local Development with Hot Reload
For building and running goal-driven agents with the framework:
```bash
# Copy development overrides
cp docker-compose.override.yml.example docker-compose.override.yml
# One-time setup
./quickstart.sh
# Start with hot reload enabled
docker compose up
# This sets up:
# - framework package (core runtime)
# - aden_tools package (MCP tools)
# - All Python dependencies
# Build new agents using Claude Code skills
claude> /building-agents-construction
# Test agents
claude> /testing-agent
# Run agents
PYTHONPATH=core:exports python -m agent_name run --input '{...}'
```
### Running Without Docker
```bash
# Install dependencies
npm install
# Generate environment files
npm run generate:env
# Start frontend (in honeycomb/)
cd honeycomb && npm run dev
# Start backend (in hive/)
cd hive && npm run dev
```
See [ENVIRONMENT_SETUP.md](ENVIRONMENT_SETUP.md) for complete setup instructions.
## Documentation
- **[Developer Guide](DEVELOPER.md)** - Comprehensive guide for developers
- [Getting Started](docs/getting-started.md) - Quick setup instructions
- [Configuration Guide](docs/configuration.md) - All configuration options
- [Architecture Overview](docs/architecture.md) - System design and structure
- [Architecture Overview](docs/architecture/README.md) - System design and structure
## Roadmap
Aden Agent Framework aims to help developers build outcome oriented, self-adaptive agents. Please find our roadmap here
[ROADMAP.md](ROADMAP.md)
Aden Hive Agent Framework aims to help developers build outcome-oriented, self-adaptive agents. See [ROADMAP.md](ROADMAP.md) for details.
```mermaid
timeline
title Aden Agent Framework Roadmap
section Foundation
Architecture : Node-Based Architecture : Python SDK : LLM Integration (OpenAI, Anthropic, Google) : Communication Protocol
Coding Agent : Goal Creation Session : Worker Agent Creation : MCP Tools Integration
Worker Agent : Human-in-the-Loop : Callback Handlers : Intervention Points : Streaming Interface
Tools : File Use : Memory (STM/LTM) : Web Search : Web Scraper : Audit Trail
Core : Eval System : Pydantic Validation : Docker Deployment : Documentation : Sample Agents
section Expansion
Intelligence : Guardrails : Streaming Mode : Semantic Search
Platform : JavaScript SDK : Custom Tool Integrator : Credential Store
Deployment : Self-Hosted : Cloud Services : CI/CD Pipeline
Templates : Sales Agent : Marketing Agent : Analytics Agent : Training Agent : Smart Form Agent
flowchart TD
subgraph Foundation
direction LR
subgraph arch["Architecture"]
a1["Node-Based Architecture"]:::done
a2["Python SDK"]:::done
a3["LLM Integration"]:::done
a4["Communication Protocol"]:::done
end
subgraph ca["Coding Agent"]
b1["Goal Creation Session"]:::done
b2["Worker Agent Creation"]
b3["MCP Tools"]:::done
end
subgraph wa["Worker Agent"]
c1["Human-in-the-Loop"]:::done
c2["Callback Handlers"]:::done
c3["Intervention Points"]:::done
c4["Streaming Interface"]
end
subgraph cred["Credentials"]
d1["Setup Process"]:::done
d2["Pluggable Sources"]:::done
d3["Enterprise Secrets"]
d4["Integration Tools"]:::done
end
subgraph tools["Tools"]
e1["File Use"]:::done
e2["Memory STM/LTM"]:::done
e3["Web Search/Scraper"]:::done
e4["CSV/PDF"]:::done
e5["Excel/Email"]
end
subgraph core["Core"]
f1["Eval System"]
f2["Pydantic Validation"]:::done
f3["Documentation"]:::done
f4["Adaptiveness"]
f5["Sample Agents"]
end
end
subgraph Expansion
direction LR
subgraph intel["Intelligence"]
g1["Guardrails"]
g2["Streaming Mode"]
g3["Image Generation"]
g4["Semantic Search"]
end
subgraph mem["Memory Iteration"]
h1["Message Model & Sessions"]
h2["Storage Migration"]
h3["Context Building"]
h4["Proactive Compaction"]
h5["Token Tracking"]
end
subgraph evt["Event System"]
i1["Event Bus for Nodes"]
end
subgraph cas["Coding Agent Support"]
j1["Claude Code"]
j2["Cursor"]
j3["Opencode"]
j4["Antigravity"]
end
subgraph plat["Platform"]
k1["JavaScript/TypeScript SDK"]
k2["Custom Tool Integrator"]
k3["Windows Support"]
end
subgraph dep["Deployment"]
l1["Self-Hosted"]
l2["Cloud Services"]
l3["CI/CD Pipeline"]
end
subgraph tmpl["Templates"]
m1["Sales Agent"]
m2["Marketing Agent"]
m3["Analytics Agent"]
m4["Training Agent"]
m5["Smart Form Agent"]
end
end
classDef done fill:#9e9e9e,color:#fff,stroke:#757575
```
## Contributing
We welcome contributions from the community! We're especially looking for help building tools, integrations, and example agents for the framework ([check #2805](https://github.com/adenhq/hive/issues/2805)). If you're interested in extending its functionality, this is the perfect place to start. Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
**Important:** Please get assigned to an issue before submitting a PR. Comment on an issue to claim it, and a maintainer will assign you. Issues with reproducible steps and proposals are prioritized. This helps prevent duplicate work.
1. Find or create an issue and get assigned
2. Fork the repository
3. Create your feature branch (`git checkout -b feature/amazing-feature`)
4. Commit your changes (`git commit -m 'Add amazing feature'`)
5. Push to the branch (`git push origin feature/amazing-feature`)
6. Open a Pull Request
## Community & Support
@@ -258,16 +327,6 @@ We use [Discord](https://discord.com/invite/MXE49hrKDk) for support, feature req
- Twitter/X - [@adenhq](https://x.com/aden_hq)
- LinkedIn - [Company Page](https://www.linkedin.com/company/teamaden/)
## Contributing
We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## Join Our Team
**We're hiring!** Join us in engineering, research, and go-to-market roles.
@@ -284,57 +343,57 @@ This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENS
## Frequently Asked Questions (FAQ)
**Q: Does Aden depend on LangChain or other agent frameworks?**
**Q: Does Hive depend on LangChain or other agent frameworks?**
No. Aden is built from the ground up with no dependencies on LangChain, CrewAI, or other agent frameworks. The framework is designed to be lean and flexible, generating agent graphs dynamically rather than relying on predefined components.
No. Hive is built from the ground up with no dependencies on LangChain, CrewAI, or other agent frameworks. The framework is designed to be lean and flexible, generating agent graphs dynamically rather than relying on predefined components.
**Q: What LLM providers does Aden support?**
**Q: What LLM providers does Hive support?**
Aden supports OpenAI (GPT-4, GPT-4o), Anthropic (Claude models), and Google Gemini out of the box. The architecture is provider-agnostic through SDK abstraction, with LiteLLM integration on the roadmap for expanded model support.
Hive supports 100+ LLM providers through LiteLLM integration, including OpenAI (GPT-4, GPT-4o), Anthropic (Claude models), Google Gemini, DeepSeek, Mistral, Groq, and many more. Simply set the appropriate API key environment variable and specify the model name.
**Q: Can I use Aden with local AI models like Ollama?**
**Q: Can I use Hive with local AI models like Ollama?**
Local model support through LiteLLM integration is on our roadmap. The SDK's provider-agnostic design means adding local model support will be straightforward once implemented.
Yes! Hive supports local models through LiteLLM. Simply use the model name format `ollama/model-name` (e.g., `ollama/llama3`, `ollama/mistral`) and ensure Ollama is running locally.
**Q: What makes Aden different from other agent frameworks?**
**Q: What makes Hive different from other agent frameworks?**
Aden generates your entire agent system from natural language goals using a coding agent—you don't hardcode workflows or manually define graphs. When agents fail, the framework automatically captures failure data, evolves the agent graph, and redeploys. This self-improving loop is unique to Aden.
Hive generates your entire agent system from natural language goals using a coding agent—you don't hardcode workflows or manually define graphs. When agents fail, the framework automatically captures failure data, evolves the agent graph, and redeploys. This self-improving loop is unique to Aden.
**Q: Is Aden open-source?**
**Q: Is Hive open-source?**
Yes, Aden is fully open-source under the Apache License 2.0. We actively encourage community contributions and collaboration.
Yes, Hive is fully open-source under the Apache License 2.0. We actively encourage community contributions and collaboration.
**Q: Does Aden collect data from users?**
**Q: Does Hive collect data from users?**
Aden collects telemetry data for monitoring and observability purposes, including token usage, latency metrics, and cost tracking. Content capture (prompts and responses) is configurable and stored with team-scoped data isolation. All data stays within your infrastructure when self-hosted.
Hive collects telemetry data for monitoring and observability purposes, including token usage, latency metrics, and cost tracking. Content capture (prompts and responses) is configurable and stored with team-scoped data isolation. All data stays within your infrastructure when self-hosted.
**Q: What deployment options does Aden support?**
**Q: What deployment options does Hive support?**
Aden supports Docker Compose deployment out of the box, with both production and development configurations. Self-hosted deployments work on any infrastructure supporting Docker. Cloud deployment options and Kubernetes-ready configurations are on the roadmap.
Hive supports self-hosted deployments via Python packages. See the [Environment Setup Guide](ENVIRONMENT_SETUP.md) for installation instructions. Cloud deployment options and Kubernetes-ready configurations are on the roadmap.
**Q: Can Aden handle complex, production-scale use cases?**
**Q: Can Hive handle complex, production-scale use cases?**
Yes. Aden is explicitly designed for production environments with features like automatic failure recovery, real-time observability, cost controls, and horizontal scaling support. The framework handles both simple automations and complex multi-agent workflows.
Yes. Hive is explicitly designed for production environments with features like automatic failure recovery, real-time observability, cost controls, and horizontal scaling support. The framework handles both simple automations and complex multi-agent workflows.
**Q: Does Aden support human-in-the-loop workflows?**
**Q: Does Hive support human-in-the-loop workflows?**
Yes, Aden fully supports human-in-the-loop workflows through intervention nodes that pause execution for human input. These include configurable timeouts and escalation policies, allowing seamless collaboration between human experts and AI agents.
Yes, Hive fully supports human-in-the-loop workflows through intervention nodes that pause execution for human input. These include configurable timeouts and escalation policies, allowing seamless collaboration between human experts and AI agents.
**Q: What monitoring and debugging tools does Aden provide?**
**Q: What monitoring and debugging tools does Hive provide?**
Aden includes comprehensive observability features: real-time WebSocket streaming for live agent execution monitoring, TimescaleDB-powered analytics for cost and performance metrics, health check endpoints for Kubernetes integration, and 19 MCP tools for budget management, agent status, and policy control.
Hive includes comprehensive observability features: real-time WebSocket streaming for live agent execution monitoring, TimescaleDB-powered analytics for cost and performance metrics, health check endpoints for Kubernetes integration, and MCP tools for agent execution, including file operations, web search, data processing, and more.
**Q: What programming languages does Aden support?**
**Q: What programming languages does Hive support?**
Aden provides SDKs for both Python and JavaScript/TypeScript. The Python SDK includes integration templates for LangGraph, LangFlow, and LiveKit. The backend is Node.js/TypeScript, and the frontend is React/TypeScript.
The Hive framework is built in Python. A JavaScript/TypeScript SDK is on the roadmap.
**Q: Can Aden agents interact with external tools and APIs?**
Yes. Aden's SDK-wrapped nodes provide built-in tool access, and the framework supports flexible tool ecosystems. Agents can integrate with external APIs, databases, and services through the node architecture.
**Q: How does cost control work in Aden?**
**Q: How does cost control work in Hive?**
Aden provides granular budget controls including spending limits, throttles, and automatic model degradation policies. You can set budgets at the team, agent, or workflow level, with real-time cost tracking and alerts.
Hive provides granular budget controls including spending limits, throttles, and automatic model degradation policies. You can set budgets at the team, agent, or workflow level, with real-time cost tracking and alerts.
**Q: Where can I find examples and documentation?**
@@ -344,6 +403,14 @@ Visit [docs.adenhq.com](https://docs.adenhq.com/) for complete guides, API refer
Contributions are welcome! Fork the repository, create your feature branch, implement your changes, and submit a pull request. See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
**Q: When will my team start seeing results from Aden's adaptive agents?**
Aden's adaptation loop begins working from the first execution. When an agent fails, the framework captures the failure data, helping developers evolve the agent graph through the coding agent. How quickly this translates to measurable results depends on the complexity of your use case, the quality of your goal definitions, and the volume of executions generating feedback.
**Q: How does Hive compare to other agent frameworks?**
Hive focuses on generating agents that run real business processes, rather than generic agents. This vision emphasizes outcome-driven design, adaptability, and an easy-to-use set of tools and integrations.
**Q: Does Aden offer enterprise support?**
For enterprise inquiries, contact the Aden team through [adenhq.com](https://adenhq.com) or join our [Discord community](https://discord.com/invite/MXE49hrKDk) for support and discussions.
+191 -42
View File
@@ -1,21 +1,94 @@
Product Roadmap
# Product Roadmap
Aden Agent Framework aims to help developers build outcome-oriented, self-adaptive agents. Please find our roadmap below.
```mermaid
timeline
title Aden Agent Framework Roadmap
section Foundation
Architecture : Node-Based Architecture : Python SDK : LLM Integration (OpenAI, Anthropic, Google) : Communication Protocol
Coding Agent : Goal Creation Session : Worker Agent Creation : MCP Tools Integration
Worker Agent : Human-in-the-Loop : Callback Handlers : Intervention Points : Streaming Interface
Tools : File Use : Memory (STM/LTM) : Web Search : Web Scraper : Audit Trail
Core : Eval System : Pydantic Validation : Docker Deployment : Documentation : Sample Agents
section Expansion
Intelligence : Guardrails : Streaming Mode : Semantic Search
Platform : JavaScript SDK : Custom Tool Integrator : Credential Store
Deployment : Self-Hosted : Cloud Services : CI/CD Pipeline
Templates : Sales Agent : Marketing Agent : Analytics Agent : Training Agent : Smart Form Agent
flowchart TD
subgraph Foundation
direction LR
subgraph arch["Architecture"]
a1["Node-Based Architecture"]:::done
a2["Python SDK"]:::done
a3["LLM Integration"]:::done
a4["Communication Protocol"]:::done
end
subgraph ca["Coding Agent"]
b1["Goal Creation Session"]:::done
b2["Worker Agent Creation"]
b3["MCP Tools"]:::done
end
subgraph wa["Worker Agent"]
c1["Human-in-the-Loop"]:::done
c2["Callback Handlers"]:::done
c3["Intervention Points"]:::done
c4["Streaming Interface"]
end
subgraph cred["Credentials"]
d1["Setup Process"]:::done
d2["Pluggable Sources"]:::done
d3["Enterprise Secrets"]
d4["Integration Tools"]:::done
end
subgraph tools["Tools"]
e1["File Use"]:::done
e2["Memory STM/LTM"]:::done
e3["Web Search/Scraper"]:::done
e4["CSV/PDF"]:::done
e5["Excel/Email"]
end
subgraph core["Core"]
f1["Eval System"]
f2["Pydantic Validation"]:::done
f3["Documentation"]:::done
f4["Adaptiveness"]
f5["Sample Agents"]
end
end
subgraph Expansion
direction LR
subgraph intel["Intelligence"]
g1["Guardrails"]
g2["Streaming Mode"]
g3["Image Generation"]
g4["Semantic Search"]
end
subgraph mem["Memory Iteration"]
h1["Message Model & Sessions"]
h2["Storage Migration"]
h3["Context Building"]
h4["Proactive Compaction"]
h5["Token Tracking"]
end
subgraph evt["Event System"]
i1["Event Bus for Nodes"]
end
subgraph cas["Coding Agent Support"]
j1["Claude Code"]
j2["Cursor"]
j3["Opencode"]
j4["Antigravity"]
end
subgraph plat["Platform"]
k1["JavaScript/TypeScript SDK"]
k2["Custom Tool Integrator"]
k3["Windows Support"]
end
subgraph dep["Deployment"]
l1["Self-Hosted"]
l2["Cloud Services"]
l3["CI/CD Pipeline"]
end
subgraph tmpl["Templates"]
m1["Sales Agent"]
m2["Marketing Agent"]
m3["Analytics Agent"]
m4["Training Agent"]
m5["Smart Form Agent"]
end
end
classDef done fill:#9e9e9e,color:#fff,stroke:#757575
```
---
@@ -26,19 +99,19 @@ timeline
- [ ] **Node-Based Architecture (Agent as a node)**
- [x] Object schema definition
- [x] Node wrapper SDK
- [ ] Shared memory access
- [x] Shared memory access
- [ ] Default monitoring hooks
- [ ] Tool access layer
- [x] Tool access layer
- [x] LLM integration layer (Natively supports all mainstream LLMs through LiteLLM)
- [x] Anthropic
- [x] OpenAI
- [x] Google
- [ ] **Communication protocol between nodes**
- [ ] **[Coding Agent] Goal Creation Session** (separate from coding session)
- [ ] Instruction back and forth
- [x] **Communication protocol between nodes**
- [x] **[Coding Agent] Goal Creation Session** (separate from coding session)
- [x] Instruction back and forth
- [x] Goal Object schema definition
- [ ] Being able to generate the test cases
- [ ] Test case validation for worker agent (Outcome driven)
- [x] Being able to generate the test cases
- [x] Test case validation for worker agent (Outcome driven)
- [ ] **[Coding Agent] Worker Agent Creation**
- [x] Coding Agent tools
- [ ] Use Template Agent as a start
@@ -46,21 +119,62 @@ timeline
- [ ] **[Worker Agent] Human-in-the-Loop**
- [x] Worker Agents request with questions and options
- [x] Callback Handler System to receive events throughout execution
- [ ] Tool-Based Intervention Points (tool to pause execution and request human input)
- [x] Tool-Based Intervention Points (tool to pause execution and request human input)
- [x] Multiple entrypoint for different event source (e.g. Human input, webhook)
- [ ] Streaming Interface for Real-time Monitoring
- [ ] Request State Management
- [x] Request State Management
### Credential Management
- [x] **Credentials Setup Process**
- [x] Install Credential MCP
- [x] **Pluggable Credential Sources**
- [x] **Abstraction & Local Sources**
- [x] Introduce `CredentialSource` base class
- [x] Refactor existing logic into `EnvVarSource`
- [x] Implementation of Source Priority Chain mechanism
- [ ] Foundation unit tests
- [ ] **Enterprise Secret Managers**
- [x] `VaultSource` (HashiCorp Vault)
- [ ] `AWSSecretsSource` (AWS Secrets Manager)
- [ ] `AzureKeyVaultSource` (Azure Key Vault)
- [ ] Management of optional provider dependencies
- [ ] **Advanced Features**
- [x] Credential expiration and auto-refresh
- [ ] Audit logging for compliance/tracking
- [ ] Per-environment configuration support
- [ ] **Documentation & DX**
- [ ] Comprehensive source documentation
- [ ] Example configurations for all providers
- [x] **Integration as tools coverage**
- [x] Gsuite Tools
- [x] Social Media
- [ ] Twitter(X)
- [x] Github
- [ ] Instagram
- [ ] SAAS
- [ ] Hubspot
- [ ] Slack
- [ ] Teams
- [ ] Zoom
- [ ] Stripe
- [ ] Salesforce
> [!IMPORTANT]
> **Community Contribution Wanted**: We appreciate help from the community to expand the "Integration as tools" capability. Open an issue for the integration you would like to see supported in Hive!
### Essential Tools
- [x] **File Use Tool Kit**
- [ ] **Memory Tools**
- [x] **Memory Tools**
- [x] STM Layer Tool (state-based short-term memory)
- [x] LTM Layer Tool (RLM - long-term memory)
- [ ] **Infrastructure Tools**
- [x] Runtime Log Tool (logs for coding agent)
- [ ] Audit Trail Tool (decision timeline generation)
- [ ] Web Search
- [ ] Web Scraper
- [x] Web Search
- [x] Web Scraper
- [x] CSV tools
- [x] PDF tools
- [ ] Excel tools
- [ ] Email Tools
- [ ] Recipe for "Add your own tools"
### Memory & File System
@@ -75,20 +189,25 @@ timeline
- [ ] User-driven log analysis (OSS approach)
### Data Validation
- [ ] Natively Support data validation of LLMs output with Pydantic
- [x] Natively Support data validation of LLMs output with Pydantic
### Developer Experience
- [ ] **Debugging mode**
- [ ] **Documentation**
- [ ] Quick start guide
- [ ] Goal creation guide
- [ ] Agent creation guide
- [ ] GitHub Page setup
- [ ] README with examples
- [ ] Contributing guidelines
- [ ] **Distribution**
- [ ] PyPI package
- [ ] Docker image on Docker Hub
- [ ] **MVP Features**
- [ ] Debugging mode
- [ ] CLI tools for memory management
- [ ] CLI tools for credential management
- [ ] **MVP Resources & Documentation**
- [x] Quick start guide
- [x] Goal creation guide
- [x] Agent creation guide
- [x] GitHub Page setup
- [x] README with examples
- [x] Contributing guidelines
- [ ] Introduction Video
### Adaptiveness
- [ ] Runtime data feedback loop
- [ ] Instant Developer Feedback for improvement
### Sample Agents
- [ ] Knowledge Agent
@@ -106,9 +225,35 @@ timeline
### Agent Capability
- [ ] Streaming mode support
- [ ] Image Generation support
- [ ] Take end user input Image and flatfile understand capability
### Cross-Platform
- [ ] JavaScript / TypeScript Version SDK
### Event-loop For Nodes (Opencode-style)
- [ ] **Event bus**
### Memory System Iteration
- [ ] **Message Model & Session Management**
- [ ] Introduce `Message` class with structured content types
- [ ] Implement `Session` classes for conversation state
- [ ] **Storage Migration**
- [ ] Implement granular per-message file persistence (`/message/[agentID]/...`)
- [ ] Migrate from monolithic run storage
- [ ] **Context Building & Conversation Loop**
- [ ] Implement `Message.stream(sessionID)`
- [ ] Update `LLMNode.execute()` for full context building
- [ ] Implement `Message.toModelMessages()` conversion
- [ ] **Proactive Compaction**
- [ ] Implement proactive overflow detection
- [ ] Develop backward-scanning pruning strategy (e.g., clearing old tool outputs)
- [ ] **Enhanced Token Tracking**
- [ ] Extend `LLMResponse` to track reasoning and cache tokens
- [ ] Integrate granular token metrics into compaction logic
### Coding Agent Support
- [ ] Claude Code
- [ ] Cursor
- [ ] Opencode
- [ ] Antigravity
### File System Enhancement
- [ ] Semantic Search integration
@@ -126,7 +271,7 @@ timeline
- [ ] Docker container standardization
- [ ] Headless backend execution
- [ ] Exposed API for frontend attachment
- [ ] Local monitoring & observability (from hive repo)
- [ ] Local monitoring & observability
- [ ] Basic lifecycle APIs (Start, Stop, Pause, Resume)
### Deployment (Cloud)
@@ -148,3 +293,7 @@ timeline
- [ ] Analytics Agent
- [ ] Training Agent
- [ ] Smart Entry / Form Agent (self-evolution emphasis)
### Cross-Platform
- [ ] JavaScript / TypeScript Version SDK
- [ ] Better Windows support
-103
@@ -1,103 +0,0 @@
# Aden Tools
Tool library for the Aden agent framework. Provides a collection of tools that AI agents can use to interact with external systems, process data, and perform actions via the Model Context Protocol (MCP).
## Installation
```bash
pip install -e aden-tools
```
For development:
```bash
pip install -e "aden-tools[dev]"
```
## Quick Start
### As an MCP Server
```python
from fastmcp import FastMCP
from aden_tools.tools import register_all_tools
mcp = FastMCP("aden-tools")
register_all_tools(mcp)
mcp.run()
```
Or run directly:
```bash
python mcp_server.py
```
## Available Tools
| Tool | Description |
|------|-------------|
| `example_tool` | Template tool demonstrating the pattern |
| `file_read` | Read contents of local files |
| `file_write` | Write content to local files |
| `web_search` | Search the web using Brave Search API |
| `web_scrape` | Scrape and extract content from webpages |
| `pdf_read` | Read and extract text from PDF files |
## Project Structure
```
aden-tools/
├── src/aden_tools/
│ ├── __init__.py # Main exports
│ ├── utils/ # Utility functions
│ └── tools/ # Tool implementations
│ ├── example_tool/
│ ├── file_read_tool/
│ ├── file_write_tool/
│ ├── web_search_tool/
│ ├── web_scrape_tool/
│ └── pdf_read_tool/
├── tests/ # Test suite
├── mcp_server.py # MCP server entry point
├── README.md
├── BUILDING_TOOLS.md # Tool development guide
└── pyproject.toml
```
## Creating Custom Tools
Tools use FastMCP's native decorator pattern:
```python
from fastmcp import FastMCP
def register_tools(mcp: FastMCP) -> None:
@mcp.tool()
def my_tool(query: str, limit: int = 10) -> dict:
"""
Search for items matching the query.
Args:
query: The search query
limit: Max results to return
Returns:
Dict with results or error
"""
try:
results = do_search(query, limit)
return {"results": results, "total": len(results)}
except Exception as e:
return {"error": str(e)}
```
See [BUILDING_TOOLS.md](BUILDING_TOOLS.md) for the full guide.
## Documentation
- [Building Tools Guide](BUILDING_TOOLS.md) - How to create new tools
- Individual tool READMEs in `src/aden_tools/tools/*/README.md`
## License
This project is licensed under the Apache License 2.0 - see the [LICENSE](../LICENSE) file for details.
-79
@@ -1,79 +0,0 @@
#!/usr/bin/env python3
"""
Aden Tools MCP Server
Exposes all aden-tools via Model Context Protocol using FastMCP.
Usage:
# Run with HTTP transport (default, for Docker)
python mcp_server.py
# Run with custom port
python mcp_server.py --port 8001
# Run with STDIO transport (for local testing)
python mcp_server.py --stdio
Environment Variables:
MCP_PORT - Server port (default: 4001)
BRAVE_SEARCH_API_KEY - Required for web_search tool
"""
import argparse
import os
from fastmcp import FastMCP
from starlette.requests import Request
from starlette.responses import PlainTextResponse
mcp = FastMCP("aden-tools")
# Register all tools with the MCP server
from aden_tools.tools import register_all_tools
tools = register_all_tools(mcp)
print(f"[MCP] Registered {len(tools)} tools: {tools}")
@mcp.custom_route("/health", methods=["GET"])
async def health_check(request: Request) -> PlainTextResponse:
"""Health check endpoint for container orchestration."""
return PlainTextResponse("OK")
@mcp.custom_route("/", methods=["GET"])
async def index(request: Request) -> PlainTextResponse:
"""Landing page for browser visits."""
return PlainTextResponse("Welcome to the Hive MCP Server")
def main() -> None:
"""Entry point for the MCP server."""
parser = argparse.ArgumentParser(description="Aden Tools MCP Server")
parser.add_argument(
"--port",
type=int,
default=int(os.getenv("MCP_PORT", "4001")),
help="HTTP server port (default: 4001)",
)
parser.add_argument(
"--host",
default="0.0.0.0",
help="HTTP server host (default: 0.0.0.0)",
)
parser.add_argument(
"--stdio",
action="store_true",
help="Use STDIO transport instead of HTTP",
)
args = parser.parse_args()
if args.stdio:
print("[MCP] Starting with STDIO transport")
mcp.run(transport="stdio")
else:
print(f"[MCP] Starting HTTP server on {args.host}:{args.port}")
mcp.run(transport="http", host=args.host, port=args.port)
if __name__ == "__main__":
main()
-60
@@ -1,60 +0,0 @@
[project]
name = "aden-tools"
version = "0.1.0"
description = "Tools library for the Aden agent framework"
readme = "README.md"
requires-python = ">=3.10"
license = { text = "Apache-2.0" }
authors = [
{ name = "Aden", email = "team@aden.ai" }
]
keywords = ["ai", "agents", "tools", "llm"]
classifiers = [
"Development Status :: 3 - Alpha",
"Intended Audience :: Developers",
"License :: OSI Approved :: Apache Software License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
]
dependencies = [
"pydantic>=2.0.0",
"httpx>=0.27.0",
"beautifulsoup4>=4.12.0",
"pypdf>=4.0.0",
"pandas>=2.0.0",
"jsonpath-ng>=1.6.0",
"fastmcp>=2.0.0",
"diff-match-patch>=20230430",
]
[project.optional-dependencies]
dev = [
"pytest>=7.0.0",
"pytest-asyncio>=0.21.0",
]
sandbox = [
"RestrictedPython>=7.0",
]
ocr = [
"pytesseract>=0.3.10",
"pillow>=10.0.0",
]
all = [
"RestrictedPython>=7.0",
"pytesseract>=0.3.10",
"pillow>=10.0.0",
]
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[tool.hatch.build.targets.wheel]
packages = ["src/aden_tools"]
[tool.pytest.ini_options]
testpaths = ["tests"]
asyncio_mode = "auto"
-30
@@ -1,30 +0,0 @@
"""
Aden Tools - Tool library for the Aden agent framework.
Tools provide capabilities that AI agents can use to interact with
external systems, process data, and perform actions.
Usage:
from fastmcp import FastMCP
from aden_tools.tools import register_all_tools
mcp = FastMCP("my-server")
register_all_tools(mcp)
"""
__version__ = "0.1.0"
# Utilities
from .utils import get_env_var
# MCP registration
from .tools import register_all_tools
__all__ = [
# Version
"__version__",
# Utilities
"get_env_var",
# MCP registration
"register_all_tools",
]
@@ -1,79 +0,0 @@
"""
Aden Tools - Tool implementations for FastMCP.
Usage:
from fastmcp import FastMCP
from aden_tools.tools import register_all_tools
mcp = FastMCP("my-server")
register_all_tools(mcp)
"""
from typing import List
from fastmcp import FastMCP
# Import register_tools from each tool module
from .example_tool import register_tools as register_example
from .file_read_tool import register_tools as register_file_read
from .file_write_tool import register_tools as register_file_write
from .web_search_tool import register_tools as register_web_search
from .web_scrape_tool import register_tools as register_web_scrape
from .pdf_read_tool import register_tools as register_pdf_read
# Import file system toolkits
from .file_system_toolkits.view_file import register_tools as register_view_file
from .file_system_toolkits.write_to_file import register_tools as register_write_to_file
from .file_system_toolkits.list_dir import register_tools as register_list_dir
from .file_system_toolkits.replace_file_content import register_tools as register_replace_file_content
from .file_system_toolkits.apply_diff import register_tools as register_apply_diff
from .file_system_toolkits.apply_patch import register_tools as register_apply_patch
from .file_system_toolkits.grep_search import register_tools as register_grep_search
from .file_system_toolkits.execute_command_tool import register_tools as register_execute_command
def register_all_tools(mcp: FastMCP) -> List[str]:
"""
Register all aden-tools with a FastMCP server.
Args:
mcp: FastMCP server instance
Returns:
List of registered tool names
"""
register_example(mcp)
register_file_read(mcp)
register_file_write(mcp)
register_web_search(mcp)
register_web_scrape(mcp)
register_pdf_read(mcp)
# Register file system toolkits
register_view_file(mcp)
register_write_to_file(mcp)
register_list_dir(mcp)
register_replace_file_content(mcp)
register_apply_diff(mcp)
register_apply_patch(mcp)
register_grep_search(mcp)
register_execute_command(mcp)
return [
"example_tool",
"file_read",
"file_write",
"web_search",
"web_scrape",
"pdf_read",
"view_file",
"write_to_file",
"list_dir",
"replace_file_content",
"apply_diff",
"apply_patch",
"grep_search",
"execute_command_tool",
]
__all__ = ["register_all_tools"]
@@ -1,28 +0,0 @@
# File Read Tool
Read contents of local files with encoding support.
## Description
Use for reading configs, data files, source code, logs, or any text file. Returns file content along with path, name, size, and encoding metadata.
## Arguments
| Argument | Type | Required | Default | Description |
|----------|------|----------|---------|-------------|
| `file_path` | str | Yes | - | Path to the file to read (absolute or relative) |
| `encoding` | str | No | `utf-8` | File encoding (utf-8, latin-1, etc.) |
| `max_size` | int | No | `10000000` | Maximum file size to read in bytes (default 10MB) |
## Environment Variables
This tool does not require any environment variables.
## Error Handling
Returns error dicts for common issues:
- `File not found: <path>` - File does not exist
- `Not a file: <path>` - Path points to a directory
- `File too large: <size> bytes (max: <max_size>)` - File exceeds max_size limit
- `Failed to decode file with encoding '<encoding>'` - Wrong encoding specified
- `Permission denied: <path>` - No read access to file
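## Example
A minimal sketch of calling the tool once it is registered. The `.fn` access mirrors the test fixtures elsewhere in this repo and is for local experimentation only; the file path is illustrative.
```python
from fastmcp import FastMCP
from aden_tools.tools.file_read_tool import register_tools

mcp = FastMCP("demo")
register_tools(mcp)

# Agents normally call the tool through the MCP server; this is a direct local call.
file_read = mcp._tool_manager._tools["file_read"].fn
result = file_read(file_path="notes/example.txt", encoding="utf-8")
print(result.get("content", result.get("error")))
```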
@@ -1,4 +0,0 @@
"""File Read Tool - Read contents of local files."""
from .file_read_tool import register_tools
__all__ = ["register_tools"]
@@ -1,75 +0,0 @@
"""
File Read Tool - Read contents of local files.
Supports reading text files with various encodings.
Returns file content along with metadata.
"""
from __future__ import annotations
from pathlib import Path
from fastmcp import FastMCP
def register_tools(mcp: FastMCP) -> None:
"""Register file read tools with the MCP server."""
@mcp.tool()
def file_read(
file_path: str,
encoding: str = "utf-8",
max_size: int = 10_000_000,
) -> dict:
"""
Read the contents of a local file.
Use for reading configs, data files, source code, logs, or any text file.
Returns file content along with path, name, size, and encoding.
Args:
file_path: Path to the file to read (absolute or relative)
encoding: File encoding (utf-8, latin-1, etc.)
max_size: Maximum file size to read in bytes (default 10MB)
Returns:
Dict with file content and metadata, or error dict
"""
try:
path = Path(file_path).resolve()
# Check if file exists
if not path.exists():
return {"error": f"File not found: {file_path}"}
# Check if it's a file (not directory)
if not path.is_file():
return {"error": f"Not a file: {file_path}"}
# Check file size
file_size = path.stat().st_size
if max_size > 0 and file_size > max_size:
return {
"error": f"File too large: {file_size} bytes (max: {max_size})",
"file_size": file_size,
}
# Read the file
content = path.read_text(encoding=encoding)
return {
"path": str(path),
"name": path.name,
"content": content,
"size": len(content),
"encoding": encoding,
}
except UnicodeDecodeError as e:
return {
"error": f"Failed to decode file with encoding '{encoding}': {str(e)}",
"suggestion": "Try a different encoding like 'latin-1' or 'cp1252'",
}
except PermissionError:
return {"error": f"Permission denied: {file_path}"}
except Exception as e:
return {"error": f"Failed to read file: {str(e)}"}
@@ -1,40 +0,0 @@
import os
from mcp.server.fastmcp import FastMCP
from ..security import get_secure_path
def register_tools(mcp: FastMCP) -> None:
"""Register file view tools with the MCP server."""
@mcp.tool()
def view_file(path: str, workspace_id: str, agent_id: str, session_id: str) -> dict:
"""
Read the content of a file within the session sandbox.
Use this when you need to view the contents of an existing file.
Args:
path: The path to the file (relative to session root)
workspace_id: The ID of the workspace
agent_id: The ID of the agent
session_id: The ID of the current session
Returns:
Dict with file content and metadata, or error dict
"""
try:
secure_path = get_secure_path(path, workspace_id, agent_id, session_id)
if not os.path.exists(secure_path):
return {"error": f"File not found at {path}"}
with open(secure_path, "r", encoding="utf-8") as f:
content = f.read()
return {
"success": True,
"path": path,
"content": content,
"size_bytes": len(content.encode("utf-8")),
"lines": len(content.splitlines())
}
except Exception as e:
return {"error": f"Failed to read file: {str(e)}"}
@@ -1,29 +0,0 @@
# File Write Tool
Write content to local files with encoding support.
## Description
Can create new files or overwrite/append to existing ones. Use for saving data, creating configs, writing reports, or exporting results. Optionally creates parent directories if they don't exist.
## Arguments
| Argument | Type | Required | Default | Description |
|----------|------|----------|---------|-------------|
| `file_path` | str | Yes | - | Path to the file to write (absolute or relative) |
| `content` | str | Yes | - | Content to write to the file |
| `encoding` | str | No | `utf-8` | File encoding (utf-8, latin-1, etc.) |
| `mode` | str | No | `write` | Write mode - 'write' (overwrite) or 'append' |
| `create_dirs` | bool | No | `True` | Create parent directories if they don't exist |
## Environment Variables
This tool does not require any environment variables.
## Error Handling
Returns error dicts for common issues:
- `Parent directory does not exist: <path>` - Parent dir missing and create_dirs=False
- `Invalid mode: <mode>. Use 'write' or 'append'.` - Invalid mode specified
- `Permission denied: <path>` - No write access to file/directory
- `OS error writing file: <error>` - Filesystem error
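## Example
A rough usage sketch following the same registration pattern as the other tools in this library; the log path is illustrative.
```python
from fastmcp import FastMCP
from aden_tools.tools.file_write_tool import register_tools

mcp = FastMCP("demo")
register_tools(mcp)

# Direct local call for illustration; agents normally go through the MCP server.
file_write = mcp._tool_manager._tools["file_write"].fn
result = file_write(
    file_path="logs/run.log",
    content="agent run finished\n",
    mode="append",
    create_dirs=True,
)
print(result.get("bytes_written", result.get("error")))
```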
@@ -1,4 +0,0 @@
"""File Write Tool - Create or update local files."""
from .file_write_tool import register_tools
__all__ = ["register_tools"]
@@ -1,83 +0,0 @@
"""
File Write Tool - Create or update local files.
Supports writing text files with various encodings.
Can create directories if they don't exist.
"""
from __future__ import annotations
from pathlib import Path
from fastmcp import FastMCP
def register_tools(mcp: FastMCP) -> None:
"""Register file write tools with the MCP server."""
@mcp.tool()
def file_write(
file_path: str,
content: str,
encoding: str = "utf-8",
mode: str = "write",
create_dirs: bool = True,
) -> dict:
"""
Write content to a local file.
Can create new files or overwrite/append to existing ones.
Use for saving data, creating configs, writing reports, or exporting results.
Args:
file_path: Path to the file to write (absolute or relative)
content: Content to write to the file
encoding: File encoding (utf-8, latin-1, etc.)
mode: Write mode - 'write' (overwrite) or 'append'
create_dirs: Create parent directories if they don't exist
Returns:
Dict with write result or error dict
"""
try:
path = Path(file_path).resolve()
# Create parent directories if requested
if create_dirs:
path.parent.mkdir(parents=True, exist_ok=True)
elif not path.parent.exists():
return {"error": f"Parent directory does not exist: {path.parent}"}
# Determine write mode
if mode == "append":
write_mode = "a"
elif mode == "write":
write_mode = "w"
else:
return {"error": f"Invalid mode: {mode}. Use 'write' or 'append'."}
# Check if we're overwriting
existed = path.exists()
previous_size = path.stat().st_size if existed else 0
# Write the file
with open(path, write_mode, encoding=encoding) as f:
f.write(content)
new_size = path.stat().st_size
return {
"path": str(path),
"name": path.name,
"bytes_written": len(content.encode(encoding)),
"total_size": new_size,
"mode": mode,
"created": not existed,
"previous_size": previous_size if existed else None,
}
except PermissionError:
return {"error": f"Permission denied: {file_path}"}
except OSError as e:
return {"error": f"OS error writing file: {str(e)}"}
except Exception as e:
return {"error": f"Failed to write file: {str(e)}"}
@@ -1,134 +0,0 @@
"""
Web Scrape Tool - Extract content from web pages.
Uses httpx for requests and BeautifulSoup for HTML parsing.
Returns clean text content from web pages.
"""
from __future__ import annotations
from typing import Any, List
import httpx
from bs4 import BeautifulSoup
from fastmcp import FastMCP
def register_tools(mcp: FastMCP) -> None:
"""Register web scrape tools with the MCP server."""
@mcp.tool()
def web_scrape(
url: str,
selector: str | None = None,
include_links: bool = False,
max_length: int = 50000,
) -> dict:
"""
Scrape and extract text content from a webpage.
Use when you need to read the content of a specific URL,
extract data from a website, or read articles/documentation.
Args:
url: URL of the webpage to scrape
selector: CSS selector to target specific content (e.g., 'article', '.main-content')
include_links: Include extracted links in the response
max_length: Maximum length of extracted text (1000-500000)
Returns:
Dict with scraped content (url, title, description, content, length) or error dict
"""
try:
# Validate URL
if not url.startswith(("http://", "https://")):
url = "https://" + url
# Validate max_length
if max_length < 1000:
max_length = 1000
elif max_length > 500000:
max_length = 500000
# Make request
response = httpx.get(
url,
headers={
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language": "en-US,en;q=0.5",
},
follow_redirects=True,
timeout=30.0,
)
if response.status_code != 200:
return {"error": f"HTTP {response.status_code}: Failed to fetch URL"}
# Parse HTML
soup = BeautifulSoup(response.text, "html.parser")
# Remove noise elements
for tag in soup(["script", "style", "nav", "footer", "header", "aside", "noscript", "iframe"]):
tag.decompose()
# Get title and description
title = ""
title_tag = soup.find("title")
if title_tag:
title = title_tag.get_text(strip=True)
description = ""
meta_desc = soup.find("meta", attrs={"name": "description"})
if meta_desc:
description = meta_desc.get("content", "")
# Target content
if selector:
content_elem = soup.select_one(selector)
if not content_elem:
return {"error": f"No elements found matching selector: {selector}"}
text = content_elem.get_text(separator=" ", strip=True)
else:
# Auto-detect main content
main_content = (
soup.find("article")
or soup.find("main")
or soup.find(attrs={"role": "main"})
or soup.find(class_=["content", "post", "entry", "article-body"])
or soup.find("body")
)
text = main_content.get_text(separator=" ", strip=True) if main_content else ""
# Clean up whitespace
text = " ".join(text.split())
# Truncate if needed
if len(text) > max_length:
text = text[:max_length] + "..."
result: dict[str, Any] = {
"url": str(response.url),
"title": title,
"description": description,
"content": text,
"length": len(text),
}
# Extract links if requested
if include_links:
links: List[dict[str, str]] = []
for a in soup.find_all("a", href=True)[:50]:
href = a["href"]
link_text = a.get_text(strip=True)
if link_text and href:
links.append({"text": link_text, "href": href})
result["links"] = links
return result
except httpx.TimeoutException:
return {"error": "Request timed out"}
except httpx.RequestError as e:
return {"error": f"Network error: {str(e)}"}
except Exception as e:
return {"error": f"Scraping failed: {str(e)}"}
@@ -1,31 +0,0 @@
# Web Search Tool
Search the web using the Brave Search API.
## Description
Returns titles, URLs, and snippets for search results. Use when you need current information, want to research a topic, or need to find websites.
## Arguments
| Argument | Type | Required | Default | Description |
|----------|------|----------|---------|-------------|
| `query` | str | Yes | - | The search query (1-500 chars) |
| `num_results` | int | No | `10` | Number of results to return (1-20) |
| `country` | str | No | `us` | Country code for localized results (us, uk, de, etc.) |
## Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| `BRAVE_SEARCH_API_KEY` | Yes | API key from [Brave Search API](https://brave.com/search/api/) |
## Error Handling
Returns error dicts for common issues:
- `BRAVE_SEARCH_API_KEY environment variable not set` - Missing API key
- `Query must be 1-500 characters` - Empty or too long query
- `Invalid API key` - API key rejected (HTTP 401)
- `Rate limit exceeded. Try again later.` - Too many requests (HTTP 429)
- `Search request timed out` - Request exceeded 30s timeout
- `Network error: <error>` - Connection or DNS issues
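## Example
A minimal sketch, assuming a valid Brave Search API key is available in the environment; the query is illustrative.
```python
import os
from fastmcp import FastMCP
from aden_tools.tools.web_search_tool import register_tools

os.environ.setdefault("BRAVE_SEARCH_API_KEY", "<your-api-key>")  # placeholder value

mcp = FastMCP("demo")
register_tools(mcp)

# Direct local call for illustration; agents normally go through the MCP server.
web_search = mcp._tool_manager._tools["web_search"].fn
result = web_search(query="agent frameworks", num_results=5, country="us")
for item in result.get("results", []):
    print(f"{item['title']} - {item['url']}")
```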
@@ -1,100 +0,0 @@
"""
Web Search Tool - Search the web using Brave Search API.
Requires BRAVE_SEARCH_API_KEY environment variable.
Returns search results with titles, URLs, and snippets.
"""
from __future__ import annotations
import os
import httpx
from fastmcp import FastMCP
def register_tools(mcp: FastMCP) -> None:
"""Register web search tools with the MCP server."""
@mcp.tool()
def web_search(
query: str,
num_results: int = 10,
country: str = "us",
) -> dict:
"""
Search the web for information using Brave Search API.
Returns titles, URLs, and snippets. Use when you need current
information, research, or to find websites.
Requires BRAVE_SEARCH_API_KEY environment variable.
Args:
query: The search query (1-500 chars)
num_results: Number of results to return (1-20)
country: Country code for localized results (us, uk, de, etc.)
Returns:
Dict with search results or error dict
"""
api_key = os.getenv("BRAVE_SEARCH_API_KEY")
if not api_key:
return {
"error": "BRAVE_SEARCH_API_KEY environment variable not set",
"help": "Get an API key at https://brave.com/search/api/",
}
# Validate inputs
if not query or len(query) > 500:
return {"error": "Query must be 1-500 characters"}
if num_results < 1 or num_results > 20:
num_results = max(1, min(20, num_results))
try:
# Make request to Brave Search API
response = httpx.get(
"https://api.search.brave.com/res/v1/web/search",
params={
"q": query,
"count": num_results,
"country": country,
},
headers={
"X-Subscription-Token": api_key,
"Accept": "application/json",
},
timeout=30.0,
)
if response.status_code == 401:
return {"error": "Invalid API key"}
elif response.status_code == 429:
return {"error": "Rate limit exceeded. Try again later."}
elif response.status_code != 200:
return {"error": f"API request failed: HTTP {response.status_code}"}
data = response.json()
# Extract results
results = []
web_results = data.get("web", {}).get("results", [])
for item in web_results[:num_results]:
results.append({
"title": item.get("title", ""),
"url": item.get("url", ""),
"snippet": item.get("description", ""),
})
return {
"query": query,
"results": results,
"total": len(results),
}
except httpx.TimeoutException:
return {"error": "Search request timed out"}
except httpx.RequestError as e:
return {"error": f"Network error: {str(e)}"}
except Exception as e:
return {"error": f"Search failed: {str(e)}"}
@@ -1,96 +0,0 @@
"""Tests for file_read tool (FastMCP)."""
import pytest
from pathlib import Path
from fastmcp import FastMCP
from aden_tools.tools.file_read_tool import register_tools
@pytest.fixture
def file_read_fn(mcp: FastMCP):
"""Register and return the file_read tool function."""
register_tools(mcp)
# Access the registered tool's function directly
return mcp._tool_manager._tools["file_read"].fn
class TestFileReadTool:
"""Tests for file_read tool."""
def test_read_existing_file(self, file_read_fn, sample_text_file: Path):
"""Reading an existing file returns content and metadata."""
result = file_read_fn(file_path=str(sample_text_file))
assert "error" not in result
assert result["content"] == "Hello, World!\nLine 2\nLine 3"
assert result["name"] == "test.txt"
assert result["encoding"] == "utf-8"
assert "size" in result
def test_read_file_not_found(self, file_read_fn, tmp_path: Path):
"""Reading a non-existent file returns an error dict."""
missing_file = tmp_path / "does_not_exist.txt"
result = file_read_fn(file_path=str(missing_file))
assert "error" in result
assert "not found" in result["error"].lower()
def test_read_directory_returns_error(self, file_read_fn, tmp_path: Path):
"""Reading a directory (not a file) returns an error."""
result = file_read_fn(file_path=str(tmp_path))
assert "error" in result
assert "not a file" in result["error"].lower()
def test_read_file_too_large(self, file_read_fn, tmp_path: Path):
"""Reading a file exceeding max_size returns an error."""
large_file = tmp_path / "large.txt"
large_file.write_text("x" * 1000)
result = file_read_fn(file_path=str(large_file), max_size=100)
assert "error" in result
assert "too large" in result["error"].lower()
assert "file_size" in result
def test_read_with_no_size_limit(self, file_read_fn, tmp_path: Path):
"""Reading with max_size=0 allows any file size."""
large_file = tmp_path / "large.txt"
content = "x" * 100_000
large_file.write_text(content)
# max_size=0 means no limit in the implementation
result = file_read_fn(file_path=str(large_file), max_size=0)
assert "error" not in result
assert result["content"] == content
def test_read_with_different_encoding(self, file_read_fn, tmp_path: Path):
"""Reading with a specific encoding works."""
latin_file = tmp_path / "latin.txt"
# Write bytes directly with latin-1 encoding
latin_file.write_bytes("café".encode("latin-1"))
result = file_read_fn(file_path=str(latin_file), encoding="latin-1")
assert "error" not in result
assert result["content"] == "café"
assert result["encoding"] == "latin-1"
def test_read_with_wrong_encoding_returns_error(self, file_read_fn, tmp_path: Path):
"""Reading with wrong encoding returns helpful error."""
# Create a file with bytes that aren't valid UTF-8
binary_file = tmp_path / "binary.txt"
binary_file.write_bytes(b"\xff\xfe")
result = file_read_fn(file_path=str(binary_file), encoding="utf-8")
assert "error" in result
assert "suggestion" in result
def test_returns_absolute_path(self, file_read_fn, sample_text_file: Path):
"""Result includes the absolute path."""
result = file_read_fn(file_path=str(sample_text_file))
assert result["path"] == str(sample_text_file.resolve())
@@ -1,99 +0,0 @@
"""Tests for file_write tool (FastMCP)."""
import pytest
from pathlib import Path
from fastmcp import FastMCP
from aden_tools.tools.file_write_tool import register_tools
@pytest.fixture
def file_write_fn(mcp: FastMCP):
"""Register and return the file_write tool function."""
register_tools(mcp)
return mcp._tool_manager._tools["file_write"].fn
class TestFileWriteTool:
"""Tests for file_write tool."""
def test_write_creates_new_file(self, file_write_fn, tmp_path: Path):
"""Writing to a new file creates it with content."""
new_file = tmp_path / "new.txt"
result = file_write_fn(file_path=str(new_file), content="Hello, World!")
assert "error" not in result
assert result["created"] is True
assert result["name"] == "new.txt"
assert new_file.read_text() == "Hello, World!"
def test_write_overwrites_existing(self, file_write_fn, tmp_path: Path):
"""Writing to existing file overwrites by default."""
existing = tmp_path / "existing.txt"
existing.write_text("old content")
result = file_write_fn(file_path=str(existing), content="new content")
assert "error" not in result
assert result["created"] is False
assert result["previous_size"] is not None
assert existing.read_text() == "new content"
def test_write_appends_to_existing(self, file_write_fn, tmp_path: Path):
"""Writing with mode='append' adds to existing content."""
existing = tmp_path / "existing.txt"
existing.write_text("line1\n")
result = file_write_fn(file_path=str(existing), content="line2\n", mode="append")
assert "error" not in result
assert result["mode"] == "append"
assert existing.read_text() == "line1\nline2\n"
def test_write_creates_parent_dirs(self, file_write_fn, tmp_path: Path):
"""Writing with create_dirs=True creates missing directories."""
deep_path = tmp_path / "nested" / "dirs" / "file.txt"
result = file_write_fn(file_path=str(deep_path), content="content", create_dirs=True)
assert "error" not in result
assert deep_path.exists()
assert deep_path.read_text() == "content"
def test_write_fails_without_parent_dir(self, file_write_fn, tmp_path: Path):
"""Writing with create_dirs=False fails if parent doesn't exist."""
missing_dir = tmp_path / "missing" / "file.txt"
result = file_write_fn(file_path=str(missing_dir), content="content", create_dirs=False)
assert "error" in result
assert "parent directory" in result["error"].lower()
def test_write_invalid_mode(self, file_write_fn, tmp_path: Path):
"""Writing with invalid mode returns error."""
result = file_write_fn(
file_path=str(tmp_path / "test.txt"),
content="content",
mode="invalid"
)
assert "error" in result
assert "invalid mode" in result["error"].lower()
def test_write_returns_bytes_written(self, file_write_fn, tmp_path: Path):
"""Result includes accurate bytes_written count."""
content = "Hello, World!"
result = file_write_fn(file_path=str(tmp_path / "test.txt"), content=content)
assert result["bytes_written"] == len(content.encode("utf-8"))
def test_write_with_encoding(self, file_write_fn, tmp_path: Path):
"""Writing with specific encoding works."""
file_path = tmp_path / "latin.txt"
result = file_write_fn(file_path=str(file_path), content="café", encoding="latin-1")
assert "error" not in result
# Verify it was written with latin-1 encoding
assert file_path.read_bytes() == "café".encode("latin-1")
@@ -1,52 +0,0 @@
"""Tests for web_scrape tool (FastMCP)."""
import pytest
from fastmcp import FastMCP
from aden_tools.tools.web_scrape_tool import register_tools
@pytest.fixture
def web_scrape_fn(mcp: FastMCP):
"""Register and return the web_scrape tool function."""
register_tools(mcp)
return mcp._tool_manager._tools["web_scrape"].fn
class TestWebScrapeTool:
"""Tests for web_scrape tool."""
def test_url_auto_prefixed_with_https(self, web_scrape_fn):
"""URLs without scheme get https:// prefix."""
# This will fail to connect, but we can verify the behavior
result = web_scrape_fn(url="example.com")
# Should either succeed or have a network error (not a validation error)
assert isinstance(result, dict)
def test_max_length_clamped_low(self, web_scrape_fn):
"""max_length below 1000 is clamped to 1000."""
# Test with a very low max_length - implementation clamps to 1000
result = web_scrape_fn(url="https://example.com", max_length=500)
# Should not error due to invalid max_length
assert isinstance(result, dict)
def test_max_length_clamped_high(self, web_scrape_fn):
"""max_length above 500000 is clamped to 500000."""
# Test with a very high max_length - implementation clamps to 500000
result = web_scrape_fn(url="https://example.com", max_length=600000)
# Should not error due to invalid max_length
assert isinstance(result, dict)
def test_valid_max_length_accepted(self, web_scrape_fn):
"""Valid max_length values are accepted."""
result = web_scrape_fn(url="https://example.com", max_length=10000)
assert isinstance(result, dict)
def test_include_links_option(self, web_scrape_fn):
"""include_links parameter is accepted."""
result = web_scrape_fn(url="https://example.com", include_links=True)
assert isinstance(result, dict)
def test_selector_option(self, web_scrape_fn):
"""selector parameter is accepted."""
result = web_scrape_fn(url="https://example.com", selector=".content")
assert isinstance(result, dict)
@@ -1,57 +0,0 @@
"""Tests for web_search tool (FastMCP)."""
import pytest
from fastmcp import FastMCP
from aden_tools.tools.web_search_tool import register_tools
@pytest.fixture
def web_search_fn(mcp: FastMCP):
"""Register and return the web_search tool function."""
register_tools(mcp)
return mcp._tool_manager._tools["web_search"].fn
class TestWebSearchTool:
"""Tests for web_search tool."""
def test_search_missing_api_key(self, web_search_fn, monkeypatch):
"""Search without API key returns helpful error."""
monkeypatch.delenv("BRAVE_SEARCH_API_KEY", raising=False)
result = web_search_fn(query="test query")
assert "error" in result
assert "BRAVE_SEARCH_API_KEY" in result["error"]
assert "help" in result
def test_empty_query_returns_error(self, web_search_fn, monkeypatch):
"""Empty query returns error."""
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "test-key")
result = web_search_fn(query="")
assert "error" in result
assert "1-500" in result["error"].lower() or "character" in result["error"].lower()
def test_long_query_returns_error(self, web_search_fn, monkeypatch):
"""Query exceeding 500 chars returns error."""
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "test-key")
result = web_search_fn(query="x" * 501)
assert "error" in result
def test_num_results_clamped_to_valid_range(self, web_search_fn, monkeypatch):
"""num_results outside 1-20 is clamped (not error)."""
monkeypatch.setenv("BRAVE_SEARCH_API_KEY", "test-key")
# Test that the function handles out-of-range values gracefully
# The implementation clamps values, so we just verify it doesn't crash
# (actual API call would fail with invalid key, but that's expected)
result = web_search_fn(query="test", num_results=0)
# Should either clamp or error - both are acceptable
assert isinstance(result, dict)
result = web_search_fn(query="test", num_results=100)
assert isinstance(result, dict)
-118
@@ -1,118 +0,0 @@
# Hive Configuration
# ======================
# Copy this file to config.yaml and customize for your environment.
# Run `npm run setup` to generate .env files from this configuration.
#
# For detailed documentation, see: docs/configuration.md
# -----------------------------------------------------------------------------
# Application Settings
# -----------------------------------------------------------------------------
app:
# Application name (displayed in UI and logs)
name: Hive
# Environment: development, production, or test
environment: development
# Log level: debug, info, warn, error
log_level: info
# -----------------------------------------------------------------------------
# Server Configuration
# -----------------------------------------------------------------------------
server:
# Frontend settings
frontend:
# Port for the frontend application
port: 3000
# Backend (Hive) settings
backend:
# Port for the backend API
port: 4000
# Host to bind to (0.0.0.0 for all interfaces)
host: 0.0.0.0
# -----------------------------------------------------------------------------
# TimescaleDB Configuration (Time-series metrics storage)
# -----------------------------------------------------------------------------
timescaledb:
# Connection URL for TimescaleDB
# Format: postgresql://user:password@host:port/database
url: postgresql://postgres:postgres@localhost:5432/aden_tsdb
# External port mapping (for docker-compose)
port: 5432
# -----------------------------------------------------------------------------
# MongoDB Configuration (Policies, pricing, control config)
# -----------------------------------------------------------------------------
mongodb:
# Connection URL for MongoDB
url: mongodb://localhost:27017
# Database name for main data
database: aden
# Database name for ERP data
erp_database: erp
# External port mapping (for docker-compose)
port: 27017
# -----------------------------------------------------------------------------
# Redis Configuration (Caching and Socket.IO)
# -----------------------------------------------------------------------------
redis:
# Connection URL for Redis
url: redis://localhost:6379
# External port mapping (for docker-compose)
port: 6379
# -----------------------------------------------------------------------------
# Authentication & Security
# -----------------------------------------------------------------------------
auth:
# JWT secret key - CHANGE THIS IN PRODUCTION!
# Generate with: openssl rand -base64 32
jwt_secret: change-this-to-a-secure-random-string-min-32-chars
# JWT token expiration (e.g., 1h, 7d, 30d)
jwt_expires_in: 7d
# Passphrase for additional encryption - CHANGE THIS IN PRODUCTION!
passphrase: change-this-to-a-secure-passphrase
# -----------------------------------------------------------------------------
# NPM Configuration
# -----------------------------------------------------------------------------
npm:
# NPM token for private package access (if needed)
token: ""
# -----------------------------------------------------------------------------
# CORS Configuration
# -----------------------------------------------------------------------------
cors:
# Allowed origin for CORS requests
# In production, set this to your frontend URL
origin: http://localhost:3000
# -----------------------------------------------------------------------------
# Feature Flags
# -----------------------------------------------------------------------------
features:
# Enable user registration
registration: true
# Enable API rate limiting
rate_limiting: false
# Enable request logging
request_logging: true
# Enable MCP (Model Context Protocol) server
mcp_server: true
@@ -1,919 +0,0 @@
---
name: building-agents
description: Build goal-driven agents with nodes, edges, and validation. Use when asked to create an agent, design a workflow, or build automation that requires multiple steps with LLM reasoning.
---
# Building Agents
Build goal-driven agents that use LLM reasoning to accomplish tasks.
## Quick Start
1. Define the goal (what success looks like)
2. Add nodes (units of work)
3. Connect with edges (flow between nodes)
4. Validate and test
## Core Concepts
**Goal**: The source of truth. Defines success criteria and constraints.
**Node**: A unit of work. Types:
- `llm_generate` - Text generation, parsing
- `llm_tool_use` - Actions requiring tools
- `router` - Conditional branching
- `function` - Deterministic operations
**Edge**: Connection between nodes with conditions:
- `on_success` - Proceed if node succeeds
- `on_failure` - Handle errors
- `always` - Always proceed
- `conditional` - Based on expression
**Session Architecture**: Agents are stateful services that:
- Maintain execution state across invocations
- Pause at HITL nodes and resume with new input
- Accept inputs through multiple entry points
- Persist state until explicitly cleared
## Workflow (HITL Required)
**CRITICAL**: Each step requires human approval before proceeding.
**CRITICAL**: Run tests during approval so humans can see actual behavior.
**CRITICAL**: Use structured questions (AskUserQuestion) with fallback to text mode.
### Approval Strategy
**Always try structured questions first**, with graceful fallback:
1. **Attempt**: Call AskUserQuestion with clickable options
2. **Catch**: If tool fails/rejected, fall back to text prompt
3. **Parse**: Accept text input like "approve", "reject", "pause"
This ensures the workflow works in all environments (VSCode extension, CLI, web).
**Practical Example**:
```python
# 1. Call MCP tool to create goal
result = set_goal(
goal_id="text-parser",
name="Text Parser",
description="Parse text into JSON",
success_criteria='[...]',
constraints='[...]'
)
# 2. Parse result
import json
data = json.loads(result)
# 3. MCP tool returns approval_required=True with approval_question
# Claude sees this and calls AskUserQuestion
# 4. Present component
print(f"**GOAL: {data['goal']['name']}**")
print(f"Validation: ✅ PASS")
# 5. Call AskUserQuestion with the approval_question data
answer = AskUserQuestion(
questions=[{
"question": data["approval_question"]["question"],
"header": data["approval_question"]["header"],
"options": data["approval_question"]["options"],
"multiSelect": False
}]
)
# If widget supported → User sees clickable buttons:
#   Do you approve this goal?
#     Approve (Recommended)
#     Reject & Modify
#     Pause & Review
# If widget NOT supported → Falls back to text:
#   Do you approve this goal definition?
#   Options: approve | reject | pause
#   > approve    <- user types this
```
### Build Loop
```
For each component (goal, node, edge):
1. PROPOSE → Show the component to the human
2. VALIDATE → Run validation, show errors/warnings
3. TEST → Run the component with sample inputs to show behavior
4. ASK APPROVAL → Use AskUserQuestion with clickable options (NOT free text)
5. Only proceed after approval
```
**CRITICAL**: Step 4 MUST use AskUserQuestion tool with structured options. Never ask "Do you approve?" as free text.
### Checklist (ask approval at each ✓)
**NOTE**: Every "ASK APPROVAL" means use AskUserQuestion with clickable options.
```
Agent Build Progress:
- [ ] Define goal with success criteria → ASK APPROVAL (clickable: Approve/Reject/Pause) ✓
- [ ] Define goal constraints → ASK APPROVAL (clickable: Approve/Reject/Pause) ✓
- [ ] Add entry node → TEST NODE → ASK APPROVAL (clickable: Approve/Reject/Pause) ✓
- [ ] Add each processing node → TEST NODE → ASK APPROVAL (clickable: Approve/Reject/Pause) ✓
- [ ] Add pause nodes (if HITL needed) → TEST NODE → ASK APPROVAL (clickable: Approve/Reject/Pause) ✓
- [ ] Add resume entry points (for pause nodes) → ASK APPROVAL (clickable: Approve/Reject/Pause) ✓
- [ ] Add terminal node(s) → TEST NODE → ASK APPROVAL (clickable: Approve/Reject/Pause) ✓
- [ ] Connect nodes with edges → ASK APPROVAL (clickable: Approve/Reject/Pause) ✓
- [ ] Configure entry_points and pause_nodes → ASK APPROVAL (clickable: Approve/Reject/Pause) ✓
- [ ] Validate full graph → TEST GRAPH → SHOW RESULTS
- [ ] Final approval → ASK APPROVAL (clickable: Approve & Export/Reject/Pause) ✓
- [ ] Export to exports/{agent-name}/
```
### Testing During Approval
**For each node**, use `test_node` with sample inputs:
```
test_node(
node_id="my-node",
test_input='{"key": "sample value"}',
)
```
Show the human:
- What inputs the node will read
- What the LLM prompt looks like
- What tools are available
- What outputs will be written
**Before final approval**, use `test_graph` to simulate full execution:
```
test_graph(
test_input='{"initial": "data"}',
dry_run=true,
)
```
Show the human:
- The complete execution path
- Each node that will execute
- The data flow between nodes
### Approval Format
After each component, **TRY to use AskUserQuestion with structured options** (fallback to text if unavailable):
**CRITICAL**: Attempt structured questions first, fall back to text mode gracefully if the environment doesn't support it.
```python
# Try structured approval first
try:
response = AskUserQuestion(
questions=[{
"question": "Do you approve this [goal/node/edge]?",
"header": "Approve",
"options": [
{
"label": "✓ Approve (Recommended)",
"description": "Component looks good, proceed to next step"
},
{
"label": "✗ Reject & Modify",
"description": "Need to make changes before proceeding"
},
{
"label": "⏸ Pause & Review",
"description": "I need more time to review this"
}
],
"multiSelect": false
}]
)
except:
# Fallback to text mode if widget not supported
# Ask: "Do you approve? Type: approve | reject | pause"
pass
```
**Before asking for approval**, present the component details:
```
**[COMPONENT TYPE]: [NAME]**
[Show details of what was created]
Validation: [PASS/FAIL]
- Errors: [list]
- Warnings: [list]
Test Results:
[Show test_node or test_graph output]
```
**Then ask for approval** using structured questions (or text fallback).
**DO NOT proceed without explicit human approval.**
### Approval Helper Pattern
**IMPORTANT**: MCP tools now return `approval_required: true` flag with approval questions.
After calling any MCP tool (`set_goal`, `add_node`, `add_edge`), check the response:
```python
# Call MCP tool
result = set_goal(...)
result_data = json.loads(result)
# Check if approval is required
if result_data.get("approval_required"):
approval_q = result_data["approval_question"]
# Present component details first
print(f"**{approval_q['component_type'].upper()}: {approval_q['component_name']}**")
print(f"\nValidation: {'✅ PASS' if result_data['valid'] else '❌ FAIL'}")
if result_data.get('errors'):
print(f"Errors: {result_data['errors']}")
if result_data.get('warnings'):
print(f"Warnings: {result_data['warnings']}")
# Try structured question first
try:
answer = AskUserQuestion(
questions=[{
"question": approval_q["question"],
"header": approval_q["header"],
"options": approval_q["options"],
"multiSelect": False
}]
)
# Parse answer - look for "Approve" in the response
response_text = str(answer.values())
if "Approve" in response_text and "Reject" not in response_text:
# Approved - continue
pass
elif "Reject" in response_text:
# Rejected - ask what to modify
print("What would you like to modify?")
# Handle modifications...
else:
# Paused - stop here
print("Build paused. Resume when ready.")
return
except:
# Fallback: text mode
print(f"\n{approval_q['question']}")
print("Options: approve | reject | pause")
user_input = input().strip().lower()
if user_input != "approve":
if user_input == "reject":
print("What would you like to modify?")
else:
print("Build paused.")
return
```
Use this pattern after EVERY MCP tool call that creates/modifies components.
### Clarification Questions (Use Structured Options)
When you need to clarify requirements during the build, **TRY AskUserQuestion with options (fallback to text)**:
**For Node Type Selection**:
```python
try:
answer = AskUserQuestion(
questions=[{
"question": "What type of node should this be?",
"header": "Node Type",
"options": [
{
"label": "llm_generate",
"description": "Text generation, parsing, analysis"
},
{
"label": "llm_tool_use",
"description": "Actions requiring tools (API calls, data fetching)"
},
{
"label": "router",
"description": "Conditional branching based on output"
},
{
"label": "function",
"description": "Deterministic operations without LLM"
}
],
"multiSelect": false
}]
)
node_type = answer["Node Type"]
except:
# Fallback to text
print("→ Node type? Options: llm_generate | llm_tool_use | router | function")
node_type = input().strip()
```
**For Edge Conditions**:
```python
AskUserQuestion(
questions=[{
"question": "When should this edge be traversed?",
"header": "Edge Condition",
"options": [
{
"label": "on_success (Recommended)",
"description": "Proceed only if node succeeds"
},
{
"label": "on_failure",
"description": "Proceed only if node fails (error handling)"
},
{
"label": "always",
"description": "Always proceed regardless of result"
},
{
"label": "conditional",
"description": "Custom expression-based condition"
}
],
"multiSelect": false
}]
)
```
**For Multi-Field Input** (e.g., collecting input/output keys):
```python
AskUserQuestion(
questions=[{
"question": "What keys should this node read from memory?",
"header": "Input Keys",
"options": [
{
"label": "objective",
"description": "User's main objective/request"
},
{
"label": "context",
"description": "Additional context data"
},
{
"label": "previous_result",
"description": "Output from previous node"
},
{
"label": "Custom keys",
"description": "I'll specify custom keys in the text field"
}
],
"multiSelect": true # Allow selecting multiple
}]
)
```
**For Yes/No Decisions**:
```python
AskUserQuestion(
questions=[{
"question": "Should this agent support pause/resume for HITL conversations?",
"header": "HITL Support",
"options": [
{
"label": "Yes",
"description": "Agent will pause for user input and resume later"
},
{
"label": "No",
"description": "Agent runs end-to-end without pausing"
}
],
"multiSelect": false
}]
)
```
**General Rules**:
- If there are 2-4 common options → Use structured questions with fallback
- For truly open-ended input (system prompts, descriptions) → Text input only
- **Always wrap AskUserQuestion in try/except** to handle environments without widget support (a reusable helper sketch follows this list)
- Fallback format: Simple text prompt listing the options
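A small helper keeps these rules from being re-implemented at every question. This is only a sketch, assuming `AskUserQuestion` and its return shape behave as described above:
```python
def ask_with_fallback(question: str, header: str, options: list[dict]) -> str:
    """Try the structured widget first, then fall back to a plain text prompt."""
    try:
        answer = AskUserQuestion(  # assumed available in this environment
            questions=[{
                "question": question,
                "header": header,
                "options": options,
                "multiSelect": False,
            }]
        )
        return str(answer.values())
    except Exception:
        labels = " | ".join(opt["label"] for opt in options)
        print(f"{question}\nOptions: {labels}")
        return input().strip()
```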
## Defining Goals
Goals must be measurable. Include:
```python
Goal(
id="my-agent",
name="My Agent",
description="One sentence describing what it does",
success_criteria=[
SuccessCriterion(
id="primary",
description="What must be true for success",
metric="how to measure",
target="threshold",
weight=1.0,
),
],
constraints=[
Constraint(
id="safety",
description="What the agent must NOT do",
constraint_type="hard", # hard = must not violate
category="safety",
),
],
)
```
**Good goals**: Specific, measurable, constrained
**Bad goals**: Vague, unmeasurable, no boundaries
## Integrating External Tools (MCP Servers)
Before adding nodes, you can register MCP servers to make their tools available to your agent.
### Using aden-tools in the Hive Monorepo
The hive monorepo includes `aden-tools` which provides web search, web scraping, and file operations.
**Step 1: Register the MCP Server**
After creating your session, register aden-tools:
```python
# Using MCP tools
add_mcp_server(
name="aden-tools",
transport="stdio",
command="python",
args='["mcp_server.py", "--stdio"]',
cwd="../aden-tools" # Relative to core/ directory
)
```
**Expected response:**
```json
{
"success": true,
"server": {
"name": "aden-tools",
"transport": "stdio",
"command": "python",
"args": ["-m", "aden_tools.server"],
"cwd": "../aden-tools"
},
"tools_discovered": 6,
"tools": [
"web_search",
"web_scrape",
"file_read",
"file_write",
"pdf_read",
"example_tool"
],
"note": "MCP server 'aden-tools' registered with 6 tools..."
}
```
**Step 2: List Available Tools** (optional verification)
```python
list_mcp_tools(server_name="aden-tools")
```
This shows detailed information about each tool including parameters.
**Step 3: Use Tools in Your Nodes**
Now you can reference these tools in `llm_tool_use` nodes:
```python
add_node(
node_id="web_searcher",
name="Web Searcher",
description="Search the web for information",
node_type="llm_tool_use",
input_keys='["query"]',
output_keys='["search_results"]',
tools='["web_search"]', # ← Tool from aden-tools
system_prompt="Search for {query} using web_search tool"
)
```
**Step 4: Export Creates mcp_servers.json**
When you export your agent with `export_graph()`, the MCP server configuration is automatically saved:
```
exports/my-agent/
├── agent.json # Agent specification
├── README.md # Documentation
└── mcp_servers.json # ← MCP configuration (auto-generated)
```
The `mcp_servers.json` file ensures the agent can access aden-tools when run later.
### Available aden-tools
| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `web_search` | Search the web using Brave Search API | `query`, `num_results`, `country` |
| `web_scrape` | Extract text content from a webpage | `url`, `selector`, `include_links` |
| `file_read` | Read file contents | `file_path`, `encoding` |
| `file_write` | Write content to files | `file_path`, `content`, `mode` |
| `pdf_read` | Extract text from PDF files | `path` |
### MCP Server Management
List registered servers:
```python
list_mcp_servers()
```
Remove a server:
```python
remove_mcp_server(name="aden-tools")
```
### Best Practices
1. **Register early**: Call `add_mcp_server` right after `create_session` and before defining nodes
2. **Verify tools**: Use `list_mcp_tools` to see available tools and their parameters
3. **Minimal tools**: Only include tools a node actually needs in its `tools` list
4. **Test nodes**: Use `test_node` to verify tool access works before building the full graph
### Example: Research Agent with aden-tools
```python
# 1. Create session
create_session(name="research-agent")
# 2. Register aden-tools
add_mcp_server(
name="aden-tools",
transport="stdio",
command="python",
args='["mcp_server.py", "--stdio"]',
cwd="../aden-tools"
)
# 3. Verify tools
list_mcp_tools(server_name="aden-tools")
# 4. Define goal
set_goal(
goal_id="research",
name="Research Agent",
description="Gather and synthesize information",
success_criteria='[...]',
constraints='[...]'
)
# 5. Add node that uses web_search
add_node(
node_id="searcher",
name="Information Searcher",
node_type="llm_tool_use",
input_keys='["topic"]',
output_keys='["search_results"]',
tools='["web_search"]', # From aden-tools
system_prompt="Search for information about {topic}"
)
# 6. Continue building...
```
## Adding Nodes
Each node does one thing:
```python
NodeSpec(
id="processor",
name="Processor",
description="What this node does",
node_type="llm_tool_use",
input_keys=["input_data"], # What it reads
output_keys=["result"], # What it writes
tools=["tool_a", "tool_b"], # Available tools
system_prompt="Instructions for the LLM",
)
```
**Node design rules**:
- Single responsibility
- Explicit input/output keys
- Minimal tools (only what's needed)
- Specific system prompts
## Connecting Edges
Edges define flow:
```python
EdgeSpec(
id="process-to-format",
source="processor",
target="formatter",
condition=EdgeCondition.ON_SUCCESS,
)
```
**Edge rules**:
- Every node (except terminal) needs outgoing edges
- Handle failure paths explicitly
- Use priority when multiple edges could match
## Pause/Resume Architecture (HITL Conversations)
For agents that need multi-turn conversations with users:
### Graph Configuration
```python
GraphSpec(
entry_node="start-node",
entry_points={
"start": "analyze-input", # Initial entry
"request-clarification_resume": "process-clarification", # Resume after pause
},
pause_nodes=["request-clarification"], # Nodes that pause execution
terminal_nodes=["output-result"],
)
```
### Pause Node Pattern
**Pause nodes** generate output (e.g., questions) then pause execution:
```python
# Node 1: Detect if clarification needed (entry node)
NodeSpec(
id="analyze-input",
node_type="llm_generate",
input_keys=["objective"],
output_keys=["objective", "needs_clarification", "questions"],
)
# Node 2: Ask questions (PAUSE NODE)
NodeSpec(
id="request-clarification",
node_type="llm_generate",
input_keys=["objective", "questions"],
output_keys=["questions_to_ask"], # Returns questions to user
)
# Node 3: Process user's answers (RESUME ENTRY POINT)
NodeSpec(
id="process-clarification",
node_type="llm_generate",
input_keys=["objective", "questions_to_ask", "input"], # input = user's answers
output_keys=["enriched_objective", "ready"],
)
```
### Execution Flow
**First invocation** (fresh start):
```
User: "Travel to LA"
→ Entry: analyze-input
→ Executes: analyze-input (needs_clarification=true)
→ Executes: request-clarification (pause node)
⏸ PAUSES - saves state
```
**Second invocation** (resume):
```
User: "from SF, March 15-20"
→ Entry: process-clarification (resume point)
→ Executes: process-clarification (merges answers)
→ Continues: identify-stakeholders → ...
```
### Key Rules
1. **Pause nodes are NOT terminal** - They execute fully, save state, then pause
2. **Entry points** - Each pause node needs a `{pause_node}_resume` entry point
3. **Resume node** - Takes user's follow-up input in the `input` key
4. **State restoration** - All memory from pause is restored automatically
## Validation Checks
Before running, validate (a sketch of invoking the check follows this list):
- [ ] Entry node exists (no incoming edges)
- [ ] Terminal nodes exist (no outgoing edges)
- [ ] All nodes reachable from entry
- [ ] No orphan nodes
- [ ] All edge sources/targets exist
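A sketch of driving this checklist with the `validate_graph` tool. The `valid`/`errors`/`warnings` fields mirror the other MCP tool responses shown above; treat the exact schema as an assumption.
```python
import json

report = json.loads(validate_graph())  # MCP tool, invoked like the other examples here
if not report.get("valid"):
    for err in report.get("errors", []):
        print("error:", err)
for warn in report.get("warnings", []):
    print("warning:", warn)
```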
## Example: Calculator Agent
See [examples/calculator.md](examples/calculator.md) for a complete example.
## Example: Sales Agent
See [examples/sales-agent.md](examples/sales-agent.md) for a multi-node agent with tools.
## Common Patterns
**Linear pipeline**: A → B → C → D (each node feeds the next)
**Router pattern**: A → Router → [B or C or D] based on condition
**Error handling**: Add `on_failure` edges to error handler nodes (see the sketch at the end of this section)
**Parallel paths**: Multiple edges from same source (use priority)
**HITL Conversation** (multi-turn with user):
```
analyze → needs_clarification? → YES → request-clarification (PAUSE)
↓ NO ↓
process [User provides answers]
process-clarification (RESUME) → continue
```
- Pause node generates questions and pauses
- User provides answers in next invocation
- Resume node merges answers and continues
- State persists across pauses automatically
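For the error-handling pattern above, a minimal sketch. The `error-handler` node is hypothetical, and `EdgeCondition.ON_FAILURE` is assumed to mirror the `ON_SUCCESS` constant used earlier:
```python
EdgeSpec(
    id="processor-to-error-handler",
    source="processor",
    target="error-handler",  # hypothetical node that logs or recovers from failures
    condition=EdgeCondition.ON_FAILURE,
)
```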
## Anti-Patterns
**Too many nodes**: If a node does one tiny thing, combine with others
**Vague prompts**: "Process the data" → "Extract the customer name and email from the JSON"
**Missing error paths**: Always handle what happens when nodes fail
**Circular dependencies**: Nodes shouldn't loop back without exit conditions
**Terminal pause nodes**: ❌ Don't make pause nodes terminal - they need edges to resume nodes
**Missing resume entry points**: ❌ Each pause node needs a `{pause_node}_resume` entry point
**Restarting instead of resuming**: ❌ Don't route back to entry node - use resume entry points
## Tools Reference
### Building Tools
| Tool | Purpose |
|------|---------|
| `create_session` | Start a new agent building session |
| `set_goal` | Define the goal with success criteria and constraints |
| `add_node` | Add a node to the graph |
| `add_edge` | Connect two nodes with an edge |
| `validate_graph` | Check the graph for errors |
| `export_graph` | Export the completed agent |
| `get_session_status` | View current build progress |
### Testing Tools (for HITL approval)
| Tool | Purpose |
|------|---------|
| `test_node` | Run a single node with sample inputs to show behavior |
| `test_graph` | Simulate full graph execution to show the complete flow |
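As a rough illustration, builder tools are invoked with the usual MCP request shape; the argument names below are assumptions based on the node and edge fields described earlier, not a documented schema:
```json
{
  "name": "add_edge",
  "arguments": {
    "source": "processor",
    "target": "formatter",
    "condition": "on_success"
  }
}
```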
## Using the Exported Agent
After `export_graph`, you get JSON containing both the **plan** and the **goal**.
### 1. Save the Export to Proper Location
**CRITICAL**: Each agent MUST be saved to its own folder under `exports/`:
```
exports/
├── outbound-sales-agent/
│   ├── agent.json          # The export_graph() output
│   └── tools.py            # Tool implementations (optional)
├── lead-qualifier/
│   ├── agent.json
│   └── tools.py
└── customer-support/
    ├── agent.json
    └── tools.py
```
Save the complete output from `export_graph()`:
```python
import os
# Create agent folder
agent_name = "outbound-sales-agent" # Use the session name
os.makedirs(f"exports/{agent_name}", exist_ok=True)
# Save the export
with open(f"exports/{agent_name}/agent.json", "w") as f:
    f.write(export_graph_output)
```
### 2. Running the Agent (CLI)
Use the built-in agent runner CLI:
```bash
# Show agent info
python -m core info exports/outbound-sales-agent
# Validate the agent
python -m core validate exports/outbound-sales-agent
# Run with JSON input
python -m core run exports/outbound-sales-agent --input '{"lead_id": "123"}'
# Interactive shell (best for conversational agents)
python -m core shell exports/outbound-sales-agent
# Run in mock mode (no real LLM calls)
python -m core run exports/outbound-sales-agent --input '{"lead_id": "123"}' --mock
# List all agents
python -m core list exports/
```
**Interactive Shell** (for agents with pause/resume):
```bash
$ python -m core shell exports/task-planner
>>> Travel to LA this month
⏸ Agent paused at: request-clarification
Questions: ["What's your departure city?", "What dates?"]
>>> from San Francisco, March 15-20
🔄 Resuming from paused state
✓ Execution complete!
# Use /reset to clear conversation state
>>> /reset
✓ Conversation state and agent session cleared
```
### 3. Running the Agent (Python API)
Use `AgentRunner` for programmatic access:
```python
import asyncio
from framework.runner import AgentRunner
async def main():
    # Load and run
    runner = AgentRunner.load("exports/outbound-sales-agent")
    result = await runner.run({"lead_id": "123"})
    if result.status.value == "completed":
        print("Success!", result.results)
    else:
        print("Needs attention:", result.feedback)

asyncio.run(main())
```
With context manager:
```python
async with AgentRunner.load("exports/outbound-sales-agent") as runner:
    result = await runner.run({"lead_id": "123"})
```
### 4. Providing Tools
Create `tools.py` in the agent folder:
```python
"""Tools for my-agent."""
import json
from framework.llm.provider import Tool, ToolUse, ToolResult
# Define tools
TOOLS = {
"my_tool": Tool(
name="my_tool",
description="What it does",
parameters={"type": "object", "properties": {"param": {"type": "string"}}},
),
}
# Implement executor
def tool_executor(tool_use: ToolUse) -> ToolResult:
    if tool_use.name == "my_tool":
        result = do_something(tool_use.input["param"])
        return ToolResult(
            tool_use_id=tool_use.id,
            content=json.dumps(result),
            is_error=False,
        )
```
Or register tools programmatically:
```python
runner = AgentRunner.load("exports/my-agent")
runner.register_tool("my_tool", my_tool_function)
result = await runner.run(context)
```
For complete API details, see [reference/api.md](reference/api.md).
@@ -1,161 +0,0 @@
# Example: Calculator Agent
A simple agent that evaluates mathematical expressions.
## Goal
```python
from framework.graph import Goal, SuccessCriterion, Constraint
goal = Goal(
id="calculator",
name="Calculator",
description="Evaluate mathematical expressions accurately",
success_criteria=[
SuccessCriterion(
id="correct-result",
description="Mathematical result is correct",
metric="output_equals_expected",
target="exact_match",
weight=1.0,
),
],
constraints=[
Constraint(
id="no-crash",
description="Invalid operations return 'Error', not exceptions",
constraint_type="hard",
category="safety",
check="no_exception",
),
],
)
```
## Nodes
```python
from framework.graph import NodeSpec
nodes = [
NodeSpec(
id="calculator",
name="Calculator",
description="Evaluate the mathematical expression",
node_type="llm_tool_use",
input_keys=["expression"],
output_keys=["result"],
tools=["calculate"],
system_prompt="Calculate the expression using the calculate tool. Return only the numeric result.",
),
NodeSpec(
id="formatter",
name="Formatter",
description="Format the result for display",
node_type="llm_generate",
input_keys=["result"],
output_keys=["formatted"],
system_prompt="Format the number for display. Output only the formatted result.",
),
]
```
## Edges
```python
from framework.graph import EdgeSpec, EdgeCondition
edges = [
EdgeSpec(
id="calc-to-format",
source="calculator",
target="formatter",
condition=EdgeCondition.ON_SUCCESS,
),
]
```
## Graph
```python
from framework.graph.edge import GraphSpec
graph = GraphSpec(
id="calculator-graph",
goal_id=goal.id,
entry_node="calculator",
terminal_nodes=["formatter"],
nodes=nodes,
edges=edges,
)
```
## Tool Definition
```python
import json

from framework.llm.provider import Tool, ToolResult
tools = [
Tool(
name="calculate",
description="Evaluate a mathematical expression",
parameters={
"type": "object",
"properties": {
"expression": {"type": "string", "description": "Math expression to evaluate"}
},
"required": ["expression"],
},
),
]
def tool_executor(ctx, tool_use):
    if tool_use.name == "calculate":
        expr = tool_use.input["expression"]
        try:
            # Safe evaluation (in production, use a proper math parser)
            result = eval(expr.replace('×', '*').replace('÷', '/'))
            return ToolResult(tool_use.id, json.dumps({"result": result}), False)
        except Exception:
            return ToolResult(tool_use.id, json.dumps({"error": "Error"}), True)
    return ToolResult(tool_use.id, json.dumps({"error": "Unknown tool"}), True)
```
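If you want to avoid `eval` entirely, a small AST-based evaluator is one option (a minimal sketch, not part of the original example; it only supports basic arithmetic):
```python
import ast
import operator

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expression: str) -> float:
    """Evaluate a basic arithmetic expression without eval()."""
    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression}")
    return _eval(ast.parse(expression, mode="eval").body)
```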
## Running
```python
from core import Runtime
from framework.llm import AnthropicProvider
from framework.graph import GraphExecutor
async def run():
    runtime = Runtime("/tmp/calculator")
    llm = AnthropicProvider()
    executor = GraphExecutor(
        runtime=runtime,
        llm=llm,
        tools=tools,
        tool_executor=tool_executor,
    )
    result = await executor.execute(
        graph=graph,
        goal=goal,
        input_data={"expression": "2 + 3 * 4"},
    )
    print(f"Result: {result.output}")
```
## Architecture
```
┌────────────┐   on_success   ┌───────────┐
│ Calculator │ ─────────────► │ Formatter │
│ (tool_use) │                │ (generate)│
└────────────┘                └───────────┘
      │                             │
  calculate                      formats
  tool call                      output
```
@@ -1,207 +0,0 @@
# Example: Sales Opportunity Agent
A multi-node agent that analyzes sales opportunities and recommends actions.
## Goal
```python
goal = Goal(
id="sales-opportunity",
name="Sales Opportunity Automation",
description="Analyze opportunities, qualify leads, recommend next actions",
success_criteria=[
SuccessCriterion(
id="accurate-qualification",
description="Correctly qualify leads as hot/warm/cold",
metric="qualification_accuracy",
target=">0.85",
weight=0.4,
),
SuccessCriterion(
id="actionable-recommendations",
description="Provide specific next steps",
metric="recommendation_specificity",
target="always_specific",
weight=0.3,
),
],
constraints=[
Constraint(
id="no-false-promises",
description="Never suggest outcomes without data support",
constraint_type="hard",
category="safety",
),
Constraint(
id="privacy",
description="Handle data in compliance with privacy regulations",
constraint_type="hard",
category="safety",
),
],
)
```
## Nodes
### 1. Lead Analyzer (Entry)
```python
NodeSpec(
id="lead-analyzer",
name="Lead Analyzer",
description="Extract engagement signals from opportunity data",
node_type="llm_generate",
input_keys=["opportunity"],
output_keys=["signals", "company_profile", "engagement_summary"],
system_prompt="""Analyze the opportunity and extract:
1. Engagement signals (response times, meeting attendance)
2. Company profile (size, industry, fit)
3. Deal signals (budget, timeline, decision-maker)
Output JSON with: signals, company_profile, engagement_summary""",
)
```
### 2. Opportunity Scorer
```python
NodeSpec(
id="opportunity-scorer",
name="Opportunity Scorer",
description="Score opportunity based on signals",
node_type="llm_tool_use",
input_keys=["signals", "company_profile", "engagement_summary"],
output_keys=["score", "qualification", "score_breakdown"],
tools=["historical_lookup"],
system_prompt="""Score this opportunity 0-100:
- Engagement (30%)
- Company fit (25%)
- Deal signals (25%)
- Historical similarity (20%)
Qualify as:
- HOT (80-100): High intent, active engagement
- WARM (50-79): Some interest, needs nurturing
- COLD (0-49): Low engagement or poor fit
Use historical_lookup to find similar deals.""",
)
```
### 3. Action Recommender
```python
NodeSpec(
id="action-recommender",
name="Action Recommender",
description="Generate specific next steps",
node_type="llm_tool_use",
input_keys=["score", "qualification", "engagement_summary", "opportunity"],
output_keys=["recommended_actions", "reasoning", "priority"],
tools=["calendar_availability", "email_templates"],
system_prompt="""Recommend actions based on qualification:
HOT: Check calendar, schedule meeting, send proposal
WARM: Send nurturing content, plan discovery call
COLD: Re-engagement campaign or deprioritize
Output JSON with: recommended_actions, reasoning, priority""",
)
```
### 4. Output Formatter (Terminal)
```python
NodeSpec(
id="output-formatter",
name="Output Formatter",
description="Format final analysis",
node_type="llm_generate",
input_keys=["qualification", "score", "recommended_actions", "reasoning"],
output_keys=["result"],
system_prompt="""Format into clean report:
- qualification
- score
- recommended_actions
- reasoning
- one-sentence summary""",
)
```
## Edges
```python
edges = [
EdgeSpec(id="analyze-to-score", source="lead-analyzer", target="opportunity-scorer", condition=EdgeCondition.ON_SUCCESS),
EdgeSpec(id="score-to-recommend", source="opportunity-scorer", target="action-recommender", condition=EdgeCondition.ON_SUCCESS),
EdgeSpec(id="recommend-to-format", source="action-recommender", target="output-formatter", condition=EdgeCondition.ON_SUCCESS),
]
```
## Architecture
```
Lead Analyzer (generate)
        │
        ▼
Opportunity Scorer (tool_use) ──── historical_lookup
        │
        ▼
Action Recommender (tool_use) ──── calendar_availability, email_templates
        │
        ▼
Output Formatter (generate)
```
## Tools
```python
tools = [
Tool(
name="historical_lookup",
description="Find similar past opportunities",
parameters={
"type": "object",
"properties": {
"company_size": {"type": "string"},
"industry": {"type": "string"},
},
},
),
Tool(
name="calendar_availability",
description="Check calendar for meeting slots",
parameters={
"type": "object",
"properties": {
"timeframe": {"type": "string"},
},
},
),
Tool(
name="email_templates",
description="Get email templates for sales scenarios",
parameters={
"type": "object",
"properties": {
"template_type": {"type": "string"},
},
},
),
]
```
## Test Cases
```python
# Hot lead test
{"opportunity": {"engagement": "high", "budget_confirmed": True, "decision_maker": True}}
# Expected: qualification = "HOT", priority = "high"
# Cold lead test
{"opportunity": {"engagement": "low", "budget_confirmed": False, "last_contact": "3 months ago"}}
# Expected: qualification = "COLD", priority = "low"
# Warm lead test
{"opportunity": {"engagement": "medium", "budget_confirmed": False, "decision_maker": True}}
# Expected: qualification = "WARM", priority = "medium"
```
@@ -1,174 +0,0 @@
# API Reference
## Goal
```python
Goal(
id: str, # Unique identifier
name: str, # Human-readable name
description: str, # What the agent does
success_criteria: list[SuccessCriterion], # Measurable success metrics
constraints: list[Constraint], # Boundaries and rules
required_capabilities: list[str], # e.g., ["llm", "tools"]
input_schema: dict, # Expected input format
output_schema: dict, # Expected output format
)
```
## SuccessCriterion
```python
SuccessCriterion(
id: str, # Unique identifier
description: str, # What must be true
metric: str, # How to measure (e.g., "accuracy", "output_equals")
target: str, # Threshold (e.g., ">0.9", "exact_match")
weight: float, # Importance (0.0-1.0)
)
```
## Constraint
```python
Constraint(
id: str, # Unique identifier
description: str, # What the agent must NOT do
constraint_type: str, # "hard" (must not violate) or "soft" (prefer not to)
category: str, # "safety", "time", "cost", "scope", "quality"
check: str, # How to verify compliance
)
```
## NodeSpec
```python
NodeSpec(
id: str, # Unique identifier
name: str, # Human-readable name
description: str, # What this node does
node_type: str, # "llm_generate", "llm_tool_use", "router", "function"
input_keys: list[str], # Keys to read from shared memory
output_keys: list[str], # Keys to write to shared memory
system_prompt: str | None, # Instructions for LLM (required for llm_*)
tools: list[str], # Available tools (for llm_tool_use)
routes: dict[str, str], # Route map (for router)
function: str | None, # Function name (for function)
max_retries: int, # Default 3
)
```
### Node Types
| Type | Description | Requires |
|------|-------------|----------|
| `llm_generate` | Text generation, parsing | `system_prompt` |
| `llm_tool_use` | Actions with tools | `system_prompt`, `tools` |
| `router` | Conditional branching | `routes` |
| `function` | Deterministic code | `function` |
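For example, a deterministic `function` node declares its function by name and is bound to a Python callable at runtime via `GraphExecutor.register_function` (the names below are illustrative):
```python
NodeSpec(
    id="normalizer",
    name="Normalizer",
    description="Lowercase and trim the raw input",
    node_type="function",
    function="normalize_text",  # bound via executor.register_function(...)
    input_keys=["raw_text"],
    output_keys=["clean_text"],
)
```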
## EdgeSpec
```python
EdgeSpec(
id: str, # Unique identifier
source: str, # Source node ID
target: str, # Target node ID
condition: EdgeCondition, # When to traverse
condition_expr: str | None, # Expression for CONDITIONAL
input_mapping: dict[str, str], # Data mapping between nodes
priority: int, # Higher = checked first
)
```
### EdgeCondition
| Value | When |
|-------|------|
| `ALWAYS` | After source completes (success or failure) |
| `ON_SUCCESS` | Only if source succeeds |
| `ON_FAILURE` | Only if source fails |
| `CONDITIONAL` | Based on `condition_expr` |
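A `CONDITIONAL` edge pairs the condition with a `condition_expr`; the expression syntax is not documented here, so the expression below is an assumption for illustration:
```python
EdgeSpec(
    id="score-to-fast-track",
    source="opportunity-scorer",
    target="fast-track",
    condition=EdgeCondition.CONDITIONAL,
    condition_expr="qualification == 'HOT'",  # assumed expression syntax
    priority=10,
)
```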
## GraphSpec
```python
GraphSpec(
id: str, # Unique identifier
goal_id: str, # Associated goal
entry_node: str, # Starting node
terminal_nodes: list[str], # Ending nodes
nodes: list[NodeSpec], # All nodes
edges: list[EdgeSpec], # All edges
memory_keys: list[str], # All shared memory keys
default_model: str, # Default LLM model
max_steps: int, # Max execution steps
)
```
## GraphExecutor
```python
executor = GraphExecutor(
runtime: Runtime, # Decision logging
llm: LLMProvider, # LLM for nodes
tools: list[Tool], # Available tools
tool_executor: Callable, # Function to execute tools
)
result = await executor.execute(
graph: GraphSpec,
goal: Goal,
input_data: dict,
)
```
### ExecutionResult
```python
ExecutionResult(
success: bool, # Did execution succeed?
output: dict, # Final output from shared memory
error: str | None, # Error message if failed
steps_executed: int, # Number of steps taken
total_tokens: int, # LLM tokens used
total_latency_ms: int, # Total execution time
path: list[str], # Node IDs traversed
)
```
## Tool Definition
```python
Tool(
name: str, # Tool identifier
description: str, # What the tool does
parameters: dict, # JSON Schema for parameters
)
```
## ToolResult
```python
ToolResult(
tool_use_id: str, # ID from tool call
content: str, # Result (usually JSON string)
is_error: bool, # True if tool failed
)
```
## Imports
```python
# Core
from framework.graph import Goal, SuccessCriterion, Constraint
from framework.graph import NodeSpec, EdgeSpec, EdgeCondition
from framework.graph.edge import GraphSpec
from framework.graph import GraphExecutor
# LLM
from framework.llm import AnthropicProvider
from framework.llm.provider import Tool, ToolResult
# Runtime
from core import Runtime
```
+3 -3
View File
@@ -3,12 +3,12 @@
"agent-builder": {
"command": "python",
"args": ["-m", "framework.mcp.agent_builder_server"],
"cwd": "/home/timothy/oss/hive/core"
"cwd": "core"
},
"aden-tools": {
"tools": {
"command": "python",
"args": ["-m", "aden_tools.mcp_server", "--stdio"],
"cwd": "/home/timothy/oss/hive/aden-tools"
"cwd": "tools"
}
}
}
+99 -20
View File
@@ -6,7 +6,7 @@ This guide explains how to use the new MCP integration tools in the agent builde
The agent builder now supports registering external MCP servers as tool sources. This allows you to:
1. Register MCP servers (like aden-tools) during agent building
1. Register MCP servers (like tools) during agent building
2. Discover available tools from those servers
3. Use those tools in your agent nodes
4. Automatically generate `mcp_servers.json` configuration on export
@@ -18,6 +18,7 @@ The agent builder now supports registering external MCP servers as tool sources.
Register an MCP server as a tool source for your agent.
**Parameters:**
- `name` (string, required): Unique name for the MCP server
- `transport` (string, required): Transport type - "stdio" or "http"
- `command` (string): Command to run (for stdio transport)
@@ -29,21 +30,23 @@ Register an MCP server as a tool source for your agent.
- `description` (string): Description of the MCP server
**Example - STDIO:**
```json
{
"name": "add_mcp_server",
"arguments": {
"name": "aden-tools",
"name": "tools",
"transport": "stdio",
"command": "python",
"args": "[\"mcp_server.py\", \"--stdio\"]",
"cwd": "../aden-tools",
"cwd": "../tools",
"description": "Aden tools for web search and file operations"
}
}
```
**Example - HTTP:**
```json
{
"name": "add_mcp_server",
@@ -57,15 +60,16 @@ Register an MCP server as a tool source for your agent.
```
**Response:**
```json
{
"success": true,
"server": {
"name": "aden-tools",
"name": "tools",
"transport": "stdio",
"command": "python",
"args": ["mcp_server.py", "--stdio"],
"cwd": "../aden-tools",
"cwd": "../tools",
"description": "Aden tools..."
},
"tools_discovered": 6,
@@ -78,7 +82,7 @@ Register an MCP server as a tool source for your agent.
"example_tool"
],
"total_mcp_servers": 1,
"note": "MCP server 'aden-tools' registered with 6 tools. These tools can now be used in llm_tool_use nodes."
"note": "MCP server 'tools' registered with 6 tools. These tools can now be used in llm_tool_use nodes."
}
```
@@ -89,15 +93,16 @@ List all registered MCP servers.
**Parameters:** None
**Response:**
```json
{
"mcp_servers": [
{
"name": "aden-tools",
"name": "tools",
"transport": "stdio",
"command": "python",
"args": ["mcp_server.py", "--stdio"],
"cwd": "../aden-tools",
"cwd": "../tools",
"description": "Aden tools..."
}
],
@@ -110,24 +115,27 @@ List all registered MCP servers.
List tools available from registered MCP servers.
**Parameters:**
- `server_name` (string, optional): Name of specific server to list tools from. If omitted, lists tools from all servers.
**Example:**
```json
{
"name": "list_mcp_tools",
"arguments": {
"server_name": "aden-tools"
"server_name": "tools"
}
}
```
**Response:**
```json
{
"success": true,
"tools_by_server": {
"aden-tools": [
"tools": [
{
"name": "web_search",
"description": "Search the web for information using Brave Search API...",
@@ -150,23 +158,26 @@ List tools available from registered MCP servers.
Remove a registered MCP server.
**Parameters:**
- `name` (string, required): Name of the MCP server to remove
**Example:**
```json
{
"name": "remove_mcp_server",
"arguments": {
"name": "aden-tools"
"name": "tools"
}
}
```
**Response:**
```json
{
"success": true,
"removed": "aden-tools",
"removed": "tools",
"remaining_servers": 0
}
```
@@ -176,6 +187,7 @@ Remove a registered MCP server.
Here's a complete workflow for building an agent with MCP tools:
### 1. Create Session
```json
{
"name": "create_session",
@@ -186,30 +198,33 @@ Here's a complete workflow for building an agent with MCP tools:
```
### 2. Register MCP Server
```json
{
"name": "add_mcp_server",
"arguments": {
"name": "aden-tools",
"name": "tools",
"transport": "stdio",
"command": "python",
"args": "[\"mcp_server.py\", \"--stdio\"]",
"cwd": "../aden-tools"
"cwd": "../tools"
}
}
```
### 3. List Available Tools
```json
{
"name": "list_mcp_tools",
"arguments": {
"server_name": "aden-tools"
"server_name": "tools"
}
}
```
### 4. Set Goal
```json
{
"name": "set_goal",
@@ -223,6 +238,7 @@ Here's a complete workflow for building an agent with MCP tools:
```
### 5. Add Node with MCP Tool
```json
{
"name": "add_node",
@@ -239,9 +255,10 @@ Here's a complete workflow for building an agent with MCP tools:
}
```
Note: `web_search` is now available because we registered the aden-tools MCP server!
Note: `web_search` is now available because we registered the tools MCP server!
### 6. Export Agent
```json
{
"name": "export_graph",
@@ -250,6 +267,7 @@ Note: `web_search` is now available because we registered the aden-tools MCP ser
```
The export will create:
- `exports/web-research-agent/agent.json` - Agent specification
- `exports/web-research-agent/README.md` - Documentation
- `exports/web-research-agent/mcp_servers.json` - **MCP server configuration**
@@ -262,11 +280,11 @@ When you export an agent with registered MCP servers, an `mcp_servers.json` file
{
"servers": [
{
"name": "aden-tools",
"name": "tools",
"transport": "stdio",
"command": "python",
"args": ["mcp_server.py", "--stdio"],
"cwd": "../aden-tools",
"cwd": "../tools",
"description": "Aden tools for web search and file operations"
}
]
@@ -288,7 +306,7 @@ runner = AgentRunner.load("exports/web-research-agent")
# Run with input
result = await runner.run({"query": "latest AI breakthroughs"})
# The web_search tool from aden-tools is automatically available!
# The web_search tool from tools is automatically available!
```
## Benefits
@@ -301,14 +319,17 @@ result = await runner.run({"query": "latest AI breakthroughs"})
## Common MCP Servers
### aden-tools
### tools
Provides:
- `web_search` - Brave Search API integration
- `web_scrape` - Web page content extraction
- `file_read` / `file_write` - File operations
- `pdf_read` - PDF text extraction
### Custom MCP Servers
You can register any MCP server that follows the Model Context Protocol specification.
## Troubleshooting
@@ -332,3 +353,61 @@ You can register any MCP server that follows the Model Context Protocol specific
- Verify you registered at least one MCP server
- Check `get_session_status` to see `mcp_servers_count > 0`
- Re-export the agent after registering servers
## Credential Validation
When adding nodes with tools that require API keys (like `web_search`), the agent builder automatically validates that the required credentials are available.
### How It Works
When you call `add_node` or `update_node` with a `tools` parameter, the agent builder:
1. Checks which tools require credentials (e.g., `web_search` requires `BRAVE_SEARCH_API_KEY`)
2. Validates those credentials are set in the environment or `.env` file
3. Returns an error if any credentials are missing
### Missing Credentials Error
If credentials are missing, you'll receive a response like:
```json
{
"valid": false,
"errors": ["Missing credentials for tools: ['BRAVE_SEARCH_API_KEY']"],
"missing_credentials": [
{
"credential": "brave_search",
"env_var": "BRAVE_SEARCH_API_KEY",
"tools_affected": ["web_search"],
"help_url": "https://brave.com/search/api/",
"description": "API key for Brave Search"
}
],
"action_required": "Add the credentials to your .env file and retry",
"example": "Add to .env:\nBRAVE_SEARCH_API_KEY=your_key_here",
"message": "Cannot add node: missing API credentials. Add them to .env and retry this command."
}
```
### Fixing Credential Errors
1. Get the required API key from the URL in `help_url`
2. Add it to your environment:
```bash
# Option 1: Export directly
export BRAVE_SEARCH_API_KEY=your-key-here
# Option 2: Add to tools/.env
echo "BRAVE_SEARCH_API_KEY=your-key-here" >> tools/.env
```
3. Retry the `add_node` command
### Required Credentials by Tool
| Tool | Credential | Get Key |
| ------------ | ---------------------- | ----------------------------------------------------- |
| `web_search` | `BRAVE_SEARCH_API_KEY` | [brave.com/search/api](https://brave.com/search/api/) |
Note: The MCP server itself requires `ANTHROPIC_API_KEY` at startup for LLM operations.
+19 -16
View File
@@ -21,13 +21,13 @@ from framework.runner.runner import AgentRunner
# Load your agent
runner = AgentRunner.load("exports/my-agent")
# Register aden-tools MCP server
# Register tools MCP server
runner.register_mcp_server(
name="aden-tools",
name="tools",
transport="stdio",
command="python",
args=["-m", "aden_tools.mcp_server", "--stdio"],
cwd="/path/to/aden-tools"
cwd="/path/to/tools"
)
# Tools are now available to your agent
@@ -42,11 +42,11 @@ Create `mcp_servers.json` in your agent folder:
{
"servers": [
{
"name": "aden-tools",
"name": "tools",
"transport": "stdio",
"command": "python",
"args": ["-m", "aden_tools.mcp_server", "--stdio"],
"cwd": "../aden-tools"
"cwd": "../tools"
}
]
}
@@ -78,6 +78,7 @@ runner.register_mcp_server(
```
**Configuration:**
- `command`: Executable to run (e.g., "python", "node")
- `args`: List of command-line arguments
- `cwd`: Working directory for the process
@@ -99,6 +100,7 @@ runner.register_mcp_server(
```
**Configuration:**
- `url`: Base URL of the MCP server
- `headers`: HTTP headers to include (optional)
@@ -119,7 +121,7 @@ builder.add_node(
name="Web Researcher",
node_type="llm_tool_use",
system_prompt="Research the topic using web_search",
tools=["web_search"], # Tool from aden-tools MCP server
tools=["web_search"], # Tool from tools MCP server
input_keys=["topic"],
output_keys=["findings"]
)
@@ -145,9 +147,9 @@ Tools from MCP servers can be referenced in your agent.json just like built-in t
}
```
## Available Tools from aden-tools
## Available Tools from tools
When you register the `aden-tools` MCP server, the following tools become available:
When you register the `tools` MCP server, the following tools become available:
- **web_search**: Search the web using Brave Search API
- **web_scrape**: Scrape content from a URL
@@ -163,11 +165,11 @@ Some MCP tools require environment variables. You can pass them in the configura
```python
runner.register_mcp_server(
name="aden-tools",
name="tools",
transport="stdio",
command="python",
args=["-m", "aden_tools.mcp_server", "--stdio"],
cwd="../aden-tools",
cwd="../tools",
env={
"BRAVE_SEARCH_API_KEY": os.environ["BRAVE_SEARCH_API_KEY"]
}
@@ -180,11 +182,11 @@ runner.register_mcp_server(
{
"servers": [
{
"name": "aden-tools",
"name": "tools",
"transport": "stdio",
"command": "python",
"args": ["-m", "aden_tools.mcp_server", "--stdio"],
"cwd": "../aden-tools",
"cwd": "../tools",
"env": {
"BRAVE_SEARCH_API_KEY": "${BRAVE_SEARCH_API_KEY}"
}
@@ -203,11 +205,11 @@ You can register multiple MCP servers to access different sets of tools:
{
"servers": [
{
"name": "aden-tools",
"name": "tools",
"transport": "stdio",
"command": "python",
"args": ["-m", "aden_tools.mcp_server", "--stdio"],
"cwd": "../aden-tools"
"cwd": "../tools"
},
{
"name": "database-tools",
@@ -243,6 +245,7 @@ runner.register_mcp_server(
### 2. Use HTTP for Production
HTTP transport is better for:
- Containerized deployments
- Shared tools across multiple agents
- Remote tool execution
@@ -330,11 +333,11 @@ async def main():
# Register MCP server
runner.register_mcp_server(
name="aden-tools",
name="tools",
transport="stdio",
command="python",
args=["-m", "aden_tools.mcp_server", "--stdio"],
cwd="../aden-tools",
cwd="../tools",
env={
"BRAVE_SEARCH_API_KEY": "your-api-key"
}
+53 -203
View File
@@ -64,7 +64,7 @@ To use the agent builder with Claude Desktop or other MCP clients, add this to y
"agent-builder": {
"command": "python",
"args": ["-m", "framework.mcp.agent_builder_server"],
"cwd": "/path/to/hive/core"
"cwd": "/path/to/goal-agent"
}
}
}
@@ -75,144 +75,48 @@ The MCP server provides tools for:
- Defining goals with success criteria
- Adding nodes (llm_generate, llm_tool_use, router, function)
- Connecting nodes with edges
- **Registering MCP servers as tool sources** ✨
- **Discovering tools from MCP servers** ✨
- Validating and exporting agent graphs
- Testing nodes and full agent graphs
When you register an MCP server during agent building, the tools from that server become available to your agent, and an `mcp_servers.json` configuration file is automatically created on export.
See [MCP_SERVER_GUIDE.md](MCP_SERVER_GUIDE.md) for agent builder instructions and [MCP_BUILDER_TOOLS_GUIDE.md](MCP_BUILDER_TOOLS_GUIDE.md) for MCP integration tools.
## MCP Tool Integration
The framework also supports **connecting to MCP servers as tool providers**, allowing your agents to use tools from external MCP servers (like aden-tools). This enables you to extend your agents with powerful external capabilities.
### Quick Example
```python
from framework.runner.runner import AgentRunner
# Load an agent
runner = AgentRunner.load("exports/task-planner")
# Register an MCP server with tools
runner.register_mcp_server(
name="aden-tools",
transport="stdio",
command="python",
args=["mcp_server.py", "--stdio"],
cwd="../aden-tools"
)
# Tools from the MCP server are now available to your agent
result = await runner.run({"query": "Search for AI news"})
```
### Auto-loading MCP Servers
Create `mcp_servers.json` in your agent folder:
```json
{
"servers": [
{
"name": "aden-tools",
"transport": "stdio",
"command": "python",
"args": ["mcp_server.py", "--stdio"],
"cwd": "../aden-tools"
}
]
}
```
MCP servers will be automatically loaded when you load the agent.
### Available Tools from aden-tools
When you register the aden-tools MCP server, these tools become available:
- `web_search` - Search the web using Brave Search API
- `web_scrape` - Extract content from web pages
- `file_read` - Read file contents
- `file_write` - Write content to files
- `pdf_read` - Extract text from PDF files
See [MCP_INTEGRATION_GUIDE.md](MCP_INTEGRATION_GUIDE.md) for detailed instructions on MCP tool integration.
## Quick Start
### Running Agents
### Calculator Agent
The framework comes with pre-built example agents in the `exports/` directory:
Run an LLM-powered calculator:
```bash
# List available agents
python -m framework list exports/
# Single calculation
python -m framework calculate "2 + 3 * 4"
# Show agent information
python -m framework info exports/task-planner
# Interactive mode
python -m framework interactive
# Run an agent
python -m framework run exports/task-planner --input '{"objective": "Build a web scraper"}'
# Interactive shell mode (with human-in-the-loop approval)
python -m framework shell exports/task-planner
# Analyze runs with Builder
python -m framework analyze calculator
```
### Available Commands
- `run` - Execute an exported agent with given input
- `info` - Display agent details (goal, nodes, edges, success criteria)
- `validate` - Check that an agent is valid and runnable
- `list` - List all exported agents in a directory
- `dispatch` - Route requests to multiple agents using the orchestrator
- `shell` - Start an interactive session with an agent
### Building Agents Programmatically
You can build agents using the MCP server (recommended) or programmatically:
### Using the Runtime
```python
from framework import Runtime
# Initialize runtime with storage path
runtime = Runtime("./storage")
runtime = Runtime("/path/to/storage")
# Start a run for a goal
run_id = runtime.start_run(
goal_id="data-processor",
goal_description="Process data with quality checks",
input_data={"dataset": "customers.csv"}
)
# Set the current node context
runtime.set_node("processor-node")
# Start a run
run_id = runtime.start_run("my_goal", "Description of what we're doing")
# Record a decision
decision_id = runtime.decide(
intent="Choose how to process the data",
options=[
{
"id": "fast",
"description": "Quick processing",
"action_type": "tool_call",
"pros": ["Fast"],
"cons": ["Less accurate"]
},
{
"id": "thorough",
"description": "Detailed processing",
"action_type": "tool_call",
"pros": ["Accurate"],
"cons": ["Slower"]
},
{"id": "fast", "description": "Quick processing", "pros": ["Fast"], "cons": ["Less accurate"]},
{"id": "thorough", "description": "Detailed processing", "pros": ["Accurate"], "cons": ["Slower"]},
],
chosen="thorough",
reasoning="Accuracy is more important for this task"
)
# Record the outcome of the decision
# Record the outcome
runtime.record_outcome(
decision_id=decision_id,
success=True,
@@ -221,13 +125,28 @@ runtime.record_outcome(
)
# End the run
runtime.end_run(
success=True,
narrative="Successfully processed all data",
output_data={"total_processed": 100}
)
runtime.end_run(success=True, narrative="Successfully processed all data")
```
### Testing Agents
The framework includes a goal-based testing framework for validating agent behavior.
Tests are generated using MCP tools (`generate_constraint_tests`, `generate_success_tests`) which return guidelines. Claude writes tests directly using the Write tool based on these guidelines.
```bash
# Run tests against an agent
python -m framework test-run <agent_path> --goal <goal_id> --parallel 4
# Debug failed tests
python -m framework test-debug <agent_path> <test_name>
# List tests for a goal
python -m framework test-list <goal_id>
```
For detailed testing workflows, see the [testing-agent skill](../.claude/skills/testing-agent/SKILL.md).
### Analyzing Agent Behavior with Builder
The BuilderQuery interface allows you to analyze agent runs and identify improvements:
@@ -235,119 +154,50 @@ The BuilderQuery interface allows you to analyze agent runs and identify improve
```python
from framework import BuilderQuery
# Initialize Builder query interface
query = BuilderQuery("./storage")
query = BuilderQuery("/path/to/storage")
# Find patterns across runs for a goal
patterns = query.find_patterns("data-processor")
if patterns:
print(f"Success rate: {patterns.success_rate:.1%}")
print(f"Runs analyzed: {patterns.run_count}")
# Find patterns across runs
patterns = query.find_patterns("my_goal")
print(f"Success rate: {patterns.success_rate:.1%}")
# Show problematic nodes
for node_id, failure_rate in patterns.problematic_nodes:
print(f"Node '{node_id}' has {failure_rate:.1%} failure rate")
# Analyze a failure
analysis = query.analyze_failure("run_123")
print(f"Root cause: {analysis.root_cause}")
print(f"Suggestions: {analysis.suggestions}")
# Analyze a specific failure
analysis = query.analyze_failure("run_20260119_143022_abc123")
if analysis:
print(f"Failure point: {analysis.failure_point}")
print(f"Root cause: {analysis.root_cause}")
print(f"\nSuggestions:")
for suggestion in analysis.suggestions:
print(f" - {suggestion}")
# Get improvement recommendations for a goal
suggestions = query.suggest_improvements("data-processor")
# Get improvement recommendations
suggestions = query.suggest_improvements("my_goal")
for s in suggestions:
print(f"[{s['priority']}] {s['recommendation']}")
print(f" Reason: {s['reason']}")
# Get performance metrics for a specific node
perf = query.get_node_performance("processor-node")
print(f"Node: {perf['node_id']}")
print(f"Success rate: {perf['success_rate']:.1%}")
print(f"Avg latency: {perf['avg_latency_ms']:.0f}ms")
```
## Architecture
The framework consists of several layers:
```
┌─────────────────┐
│ Human Engineer │ ← Supervision, approval via HITL
│ Human Engineer │ ← Supervision, approval
└────────┬────────┘
┌────────▼────────┐
│ Builder LLM │ ← Analyzes runs, suggests improvements (via MCP)
│ Builder LLM │ ← Analyzes runs, suggests improvements
│ (BuilderQuery) │
└────────┬────────┘
┌────────▼────────┐
│ Agent Graph │ ← Node-based execution flow
(AgentRunner) (llm_generate, llm_tool_use, router, function)
└────────┬────────┘
┌────────▼────────┐
│ Runtime │ ← Records decisions, outcomes, problems
│ (Decision DB) │
│ Agent LLM │ ← Executes tasks, records decisions
(Runtime)
└─────────────────┘
```
## Key Concepts
### Graph-Based Agents
Agents are defined as directed graphs with:
- **Nodes**: Execution steps (llm_generate, llm_tool_use, router, function)
- **Edges**: Control flow between nodes, including conditional routing
- **Goal**: What the agent is designed to accomplish with success criteria
- **Constraints**: Hard and soft limits on agent behavior
### Decision Recording
- **Decision**: The atomic unit of agent behavior. Captures intent, options, choice, and reasoning.
- **Outcome**: Result of executing a decision (success/failure, latency, tokens, state changes)
- **Run**: A complete execution trace with all decisions and outcomes
- **Problem**: Issues reported during execution with severity and suggested fixes
### Analysis & Improvement
- **Runtime**: Interface agents use to record their behavior during execution
- **BuilderQuery**: Interface for analyzing agent runs and identifying patterns
- **PatternAnalysis**: Cross-run analysis showing success rates, common failures, problematic nodes
- **FailureAnalysis**: Deep dive into why a specific run failed with suggestions
### Human-in-the-Loop (HITL)
- **Approval Callbacks**: Nodes can require human approval before execution
- **Interactive Shell**: Chat-like interface for running agents with approval prompts
- **Session State**: Agents can pause and resume based on user input
### Multi-Agent Orchestration
- **AgentOrchestrator**: Dispatch requests to multiple agents
- **Agent Discovery**: Automatically discover and register agents from a directory
- **Dispatch Strategy**: Route requests to the most appropriate agent(s)
## Example Agents
The `exports/` directory contains example agents you can run or use as templates:
- **task-planner**: Breaks down complex objectives into actionable tasks with dependencies
- **research-summary-agent**: Conducts research and generates summaries
- **outbound-sales-agent**: Handles outbound sales workflows
- **youtube-comments-research**: Analyzes YouTube comments for insights
Each agent includes:
- `agent.json`: Graph definition with nodes, edges, goal, and constraints
- `README.md`: Agent documentation
- `tools.py` (optional): Custom tool implementations
- **Run**: A complete execution with all decisions and outcomes.
- **Runtime**: Interface agents use to record their behavior.
- **BuilderQuery**: Interface Builder uses to analyze agent behavior.
## Requirements
- Python 3.11+
- pydantic >= 2.0
- anthropic >= 0.40.0 (for LLM-powered agents)
- mcp, fastmcp (optional, for MCP server)
+123
View File
@@ -0,0 +1,123 @@
"""
Minimal Manual Agent Example
----------------------------
This example demonstrates how to build and run an agent programmatically
without using the Claude Code CLI or external LLM APIs.
It uses 'function' nodes to define logic in pure Python, making it perfect
for understanding the core runtime loop:
Setup -> Graph definition -> Execution -> Result
Run with:
PYTHONPATH=core python core/examples/manual_agent.py
"""
import asyncio
from framework.graph import EdgeCondition, EdgeSpec, Goal, GraphSpec, NodeSpec
from framework.graph.executor import GraphExecutor
from framework.runtime.core import Runtime
# 1. Define Node Logic (Pure Python Functions)
def greet(name: str) -> str:
    """Generate a simple greeting."""
    return f"Hello, {name}!"


def uppercase(greeting: str) -> str:
    """Convert text to uppercase."""
    return greeting.upper()


async def main():
    print("🚀 Setting up Manual Agent...")

    # 2. Define the Goal
    # Every agent needs a goal with success criteria
    goal = Goal(
        id="greet-user",
        name="Greet User",
        description="Generate a friendly uppercase greeting",
        success_criteria=[
            {
                "id": "greeting_generated",
                "description": "Greeting produced",
                "metric": "custom",
                "target": "any",
            }
        ],
    )

    # 3. Define Nodes
    # Nodes describe steps in the process
    node1 = NodeSpec(
        id="greeter",
        name="Greeter",
        description="Generates a simple greeting",
        node_type="function",
        function="greet",  # Matches the registered function name
        input_keys=["name"],
        output_keys=["greeting"],
    )
    node2 = NodeSpec(
        id="uppercaser",
        name="Uppercaser",
        description="Converts greeting to uppercase",
        node_type="function",
        function="uppercase",
        input_keys=["greeting"],
        output_keys=["final_greeting"],
    )

    # 4. Define Edges
    # Edges define the flow between nodes
    edge1 = EdgeSpec(
        id="greet-to-upper",
        source="greeter",
        target="uppercaser",
        condition=EdgeCondition.ON_SUCCESS,
    )

    # 5. Create Graph
    # The graph works like a blueprint connecting nodes and edges
    graph = GraphSpec(
        id="greeting-agent",
        goal_id="greet-user",
        entry_node="greeter",
        terminal_nodes=["uppercaser"],
        nodes=[node1, node2],
        edges=[edge1],
    )

    # 6. Initialize Runtime & Executor
    # Runtime handles state/memory; Executor runs the graph
    from pathlib import Path

    runtime = Runtime(storage_path=Path("./agent_logs"))
    executor = GraphExecutor(runtime=runtime)

    # 7. Register Function Implementations
    # Connect string names in NodeSpecs to actual Python functions
    executor.register_function("greeter", greet)
    executor.register_function("uppercaser", uppercase)

    # 8. Execute Agent
    print("▶ Executing agent with input: name='Alice'...")
    result = await executor.execute(graph=graph, goal=goal, input_data={"name": "Alice"})

    # 9. Verify Results
    if result.success:
        print("\n✅ Success!")
        print(f"Path taken: {' -> '.join(result.path)}")
        print(f"Final output: {result.output.get('final_greeting')}")
    else:
        print(f"\n❌ Failed: {result.error}")


if __name__ == "__main__":
    # Optional: Enable logging to see internal decision flow
    # logging.basicConfig(level=logging.INFO)
    asyncio.run(main())
+22 -27
View File
@@ -21,25 +21,25 @@ async def example_1_programmatic_registration():
# Load an existing agent
runner = AgentRunner.load("exports/task-planner")
# Register aden-tools MCP server via STDIO
# Register tools MCP server via STDIO
num_tools = runner.register_mcp_server(
name="aden-tools",
name="tools",
transport="stdio",
command="python",
args=["-m", "aden_tools.mcp_server", "--stdio"],
cwd="../aden-tools",
cwd="../tools",
)
print(f"Registered {num_tools} tools from aden-tools MCP server")
print(f"Registered {num_tools} tools from tools MCP server")
# List all available tools
tools = runner._tool_registry.get_tools()
print(f"\nAvailable tools: {list(tools.keys())}")
# Run the agent with MCP tools available
result = await runner.run({
"objective": "Search for 'Claude AI' and summarize the top 3 results"
})
result = await runner.run(
{"objective": "Search for 'Claude AI' and summarize the top 3 results"}
)
print(f"\nAgent result: {result}")
@@ -51,14 +51,14 @@ async def example_2_http_transport():
"""Example 2: Connect to MCP server via HTTP"""
print("\n=== Example 2: HTTP MCP Server Connection ===\n")
# First, start the aden-tools MCP server in HTTP mode:
# cd aden-tools && python mcp_server.py --port 4001
# First, start the tools MCP server in HTTP mode:
# cd tools && python mcp_server.py --port 4001
runner = AgentRunner.load("exports/task-planner")
# Register aden-tools via HTTP
# Register tools via HTTP
num_tools = runner.register_mcp_server(
name="aden-tools-http",
name="tools-http",
transport="http",
url="http://localhost:4001",
)
@@ -78,10 +78,8 @@ async def example_3_config_file():
# Copy example config (in practice, you'd place this in your agent folder)
import shutil
shutil.copy(
"examples/mcp_servers.json",
test_agent_path / "mcp_servers.json"
)
shutil.copy("examples/mcp_servers.json", test_agent_path / "mcp_servers.json")
# Load agent - MCP servers will be auto-discovered
runner = AgentRunner.load(test_agent_path)
@@ -101,27 +99,23 @@ async def example_4_custom_agent_with_mcp_tools():
"""Example 4: Build custom agent that uses MCP tools"""
print("\n=== Example 4: Custom Agent with MCP Tools ===\n")
from framework.builder.workflow import WorkflowBuilder
from framework.builder.workflow import GraphBuilder
# Create a workflow builder
builder = WorkflowBuilder()
builder = GraphBuilder()
# Define goal
builder.set_goal(
goal_id="web-researcher",
name="Web Research Agent",
description="Search the web and summarize findings"
description="Search the web and summarize findings",
)
# Add success criteria
builder.add_success_criterion(
"search-results",
"Successfully retrieve at least 3 web search results"
)
builder.add_success_criterion(
"summary",
"Provide a clear, concise summary of the findings"
"search-results", "Successfully retrieve at least 3 web search results"
)
builder.add_success_criterion("summary", "Provide a clear, concise summary of the findings")
# Add nodes that will use MCP tools
builder.add_node(
@@ -130,7 +124,7 @@ async def example_4_custom_agent_with_mcp_tools():
description="Search the web for information",
node_type="llm_tool_use",
system_prompt="Search for {query} and return the top results. Use the web_search tool.",
tools=["web_search"], # This tool comes from aden-tools MCP server
tools=["web_search"], # This tool comes from tools MCP server
input_keys=["query"],
output_keys=["search_results"],
)
@@ -160,11 +154,11 @@ async def example_4_custom_agent_with_mcp_tools():
# Load and register MCP server
runner = AgentRunner.load(export_path)
runner.register_mcp_server(
name="aden-tools",
name="tools",
transport="stdio",
command="python",
args=["-m", "aden_tools.mcp_server", "--stdio"],
cwd="../aden-tools",
cwd="../tools",
)
# Run the agent
@@ -192,6 +186,7 @@ async def main():
except Exception as e:
print(f"\nError running example: {e}")
import traceback
traceback.print_exc()
+3 -3
View File
@@ -1,18 +1,18 @@
{
"servers": [
{
"name": "aden-tools",
"name": "tools",
"description": "Aden tools including web search, file operations, and PDF reading",
"transport": "stdio",
"command": "python",
"args": ["mcp_server.py", "--stdio"],
"cwd": "../aden-tools",
"cwd": "../tools",
"env": {
"BRAVE_SEARCH_API_KEY": "${BRAVE_SEARCH_API_KEY}"
}
},
{
"name": "aden-tools-http",
"name": "tools-http",
"description": "Aden tools via HTTP (for Docker deployments)",
"transport": "http",
"url": "http://localhost:4001",
+34 -5
View File
@@ -10,14 +10,35 @@ choice the agent makes is captured with:
- Whether that was good or bad (evaluated post-hoc)
This gives the Builder LLM the information it needs to improve agent behavior.
## Testing Framework
The framework includes a Goal-Based Testing system (Goal Agent Eval):
- Generate tests from Goal success_criteria and constraints
- Mandatory user approval before tests are stored
- Parallel test execution with error categorization
- Debug tools with fix suggestions
See `framework.testing` for details.
"""
from framework.schemas.decision import Decision, Option, Outcome, DecisionEvaluation
from framework.schemas.run import Run, RunSummary, Problem
from framework.runtime.core import Runtime
from framework.builder.query import BuilderQuery
from framework.llm import LLMProvider, AnthropicProvider
from framework.runner import AgentRunner, AgentOrchestrator
from framework.llm import AnthropicProvider, LLMProvider
from framework.runner import AgentOrchestrator, AgentRunner
from framework.runtime.core import Runtime
from framework.schemas.decision import Decision, DecisionEvaluation, Option, Outcome
from framework.schemas.run import Problem, Run, RunSummary
# Testing framework
from framework.testing import (
ApprovalStatus,
DebugTool,
ErrorCategory,
Test,
TestResult,
TestStorage,
TestSuiteResult,
)
__all__ = [
# Schemas
@@ -38,4 +59,12 @@ __all__ = [
# Runner
"AgentRunner",
"AgentOrchestrator",
# Testing
"Test",
"TestResult",
"TestSuiteResult",
"TestStorage",
"ApprovalStatus",
"ErrorCategory",
"DebugTool",
]
+1 -1
View File
@@ -1,4 +1,4 @@
"""Allow running as python -m framework"""
"""Allow running as ``python -m framework``, which powers the ``hive`` console entry point."""
from framework.cli import main
+3 -3
View File
@@ -2,12 +2,12 @@
from framework.builder.query import BuilderQuery
from framework.builder.workflow import (
GraphBuilder,
BuildSession,
BuildPhase,
ValidationResult,
BuildSession,
GraphBuilder,
TestCase,
TestResult,
ValidationResult,
)
__all__ = [
+51 -47
View File
@@ -8,12 +8,12 @@ This is designed around the questions I need to answer:
4. What should we change? (suggestions)
"""
from typing import Any
from collections import defaultdict
from pathlib import Path
from typing import Any
from framework.schemas.decision import Decision, DecisionType
from framework.schemas.run import Run, RunSummary, RunStatus
from framework.schemas.decision import Decision
from framework.schemas.run import Run, RunStatus, RunSummary
from framework.storage.backend import FileStorage
@@ -49,11 +49,11 @@ class FailureAnalysis:
def __str__(self) -> str:
lines = [
f"=== Failure Analysis for {self.run_id} ===",
f"",
"",
f"Failure Point: {self.failure_point}",
f"Root Cause: {self.root_cause}",
f"",
f"Decision Chain Leading to Failure:",
"",
"Decision Chain Leading to Failure:",
]
for i, dec in enumerate(self.decision_chain, 1):
lines.append(f" {i}. {dec}")
@@ -105,7 +105,7 @@ class PatternAnalysis:
def __str__(self) -> str:
lines = [
f"=== Pattern Analysis for Goal {self.goal_id} ===",
f"",
"",
f"Runs Analyzed: {self.run_count}",
f"Success Rate: {self.success_rate:.1%}",
]
@@ -196,10 +196,7 @@ class BuilderQuery:
break
# Extract problems
problems = [
f"[{p.severity}] {p.description}"
for p in run.problems
]
problems = [f"[{p.severity}] {p.description}" for p in run.problems]
# Generate suggestions based on the failure
suggestions = self._generate_suggestions(run, failed_decisions)
@@ -253,11 +250,7 @@ class BuilderQuery:
error = decision.outcome.error or "Unknown error"
failure_counts[error] += 1
common_failures = sorted(
failure_counts.items(),
key=lambda x: x[1],
reverse=True
)[:5]
common_failures = sorted(failure_counts.items(), key=lambda x: x[1], reverse=True)[:5]
# Find problematic nodes
node_stats: dict[str, dict[str, int]] = defaultdict(lambda: {"total": 0, "failed": 0})
@@ -328,34 +321,45 @@ class BuilderQuery:
# Suggestion: Fix problematic nodes
for node_id, failure_rate in patterns.problematic_nodes:
suggestions.append({
"type": "node_improvement",
"target": node_id,
"reason": f"Node has {failure_rate:.1%} failure rate",
"recommendation": f"Review and improve node '{node_id}' - high failure rate suggests prompt or tool issues",
"priority": "high" if failure_rate > 0.3 else "medium",
})
suggestions.append(
{
"type": "node_improvement",
"target": node_id,
"reason": f"Node has {failure_rate:.1%} failure rate",
"recommendation": (
f"Review and improve node '{node_id}' - "
"high failure rate suggests prompt or tool issues"
),
"priority": "high" if failure_rate > 0.3 else "medium",
}
)
# Suggestion: Address common failures
for failure, count in patterns.common_failures:
if count >= 2:
suggestions.append({
"type": "error_handling",
"target": failure,
"reason": f"Error occurred {count} times",
"recommendation": f"Add handling for: {failure}",
"priority": "high" if count >= 5 else "medium",
})
suggestions.append(
{
"type": "error_handling",
"target": failure,
"reason": f"Error occurred {count} times",
"recommendation": f"Add handling for: {failure}",
"priority": "high" if count >= 5 else "medium",
}
)
# Suggestion: Overall success rate
if patterns.success_rate < 0.8:
suggestions.append({
"type": "architecture",
"target": goal_id,
"reason": f"Goal success rate is only {patterns.success_rate:.1%}",
"recommendation": "Consider restructuring the agent graph or improving goal definition",
"priority": "high",
})
suggestions.append(
{
"type": "architecture",
"target": goal_id,
"reason": f"Goal success rate is only {patterns.success_rate:.1%}",
"recommendation": (
"Consider restructuring the agent graph or improving goal definition"
),
"priority": "high",
}
)
return suggestions
@@ -408,21 +412,22 @@ class BuilderQuery:
alternatives = [o for o in decision.options if o.id != decision.chosen_option_id]
if alternatives:
alt_desc = alternatives[0].description
chosen_desc = chosen.description if chosen else "unknown"
suggestions.append(
f"Consider alternative: '{alt_desc}' instead of '{chosen.description if chosen else 'unknown'}'"
f"Consider alternative: '{alt_desc}' instead of '{chosen_desc}'"
)
# Check for missing context
if not decision.input_context:
suggestions.append(
f"Decision '{decision.intent}' had no input context - ensure relevant data is passed"
f"Decision '{decision.intent}' had no input context - "
"ensure relevant data is passed"
)
# Check for constraint issues
if decision.active_constraints:
suggestions.append(
f"Review constraints: {', '.join(decision.active_constraints)} - may be too restrictive"
)
constraints = ", ".join(decision.active_constraints)
suggestions.append(f"Review constraints: {constraints} - may be too restrictive")
# Check for reported problems with suggestions
for problem in run.problems:
@@ -471,15 +476,14 @@ class BuilderQuery:
# Decision count difference
if len(run1.decisions) != len(run2.decisions):
differences.append(
f"Decision count: {len(run1.decisions)} vs {len(run2.decisions)}"
)
differences.append(f"Decision count: {len(run1.decisions)} vs {len(run2.decisions)}")
# Find first divergence point
for i, (d1, d2) in enumerate(zip(run1.decisions, run2.decisions)):
for i, (d1, d2) in enumerate(zip(run1.decisions, run2.decisions, strict=False)):
if d1.chosen_option_id != d2.chosen_option_id:
differences.append(
f"Diverged at decision {i}: chose '{d1.chosen_option_id}' vs '{d2.chosen_option_id}'"
f"Diverged at decision {i}: "
f"chose '{d1.chosen_option_id}' vs '{d2.chosen_option_id}'"
)
break
+93 -72
View File
@@ -13,34 +13,35 @@ Each step requires validation and human approval before proceeding.
You cannot skip steps or bypass validation.
"""
import json
from collections.abc import Callable
from datetime import datetime
from enum import Enum
from pathlib import Path
from datetime import datetime
from typing import Any, Callable
from dataclasses import dataclass, field
from typing import Any
from pydantic import BaseModel, Field
from framework.graph.goal import Goal, SuccessCriterion, Constraint
from framework.graph.edge import EdgeCondition, EdgeSpec, GraphSpec
from framework.graph.goal import Goal
from framework.graph.node import NodeSpec
from framework.graph.edge import EdgeSpec, EdgeCondition, GraphSpec
class BuildPhase(str, Enum):
"""Current phase of the build process."""
INIT = "init" # Just started
GOAL_DRAFT = "goal_draft" # Drafting goal
INIT = "init" # Just started
GOAL_DRAFT = "goal_draft" # Drafting goal
GOAL_APPROVED = "goal_approved" # Goal approved
ADDING_NODES = "adding_nodes" # Adding nodes
ADDING_EDGES = "adding_edges" # Adding edges
TESTING = "testing" # Running tests
APPROVED = "approved" # Fully approved
EXPORTED = "exported" # Exported to file
ADDING_NODES = "adding_nodes" # Adding nodes
ADDING_EDGES = "adding_edges" # Adding edges
TESTING = "testing" # Running tests
APPROVED = "approved" # Fully approved
EXPORTED = "exported" # Exported to file
class ValidationResult(BaseModel):
"""Result of a validation check."""
valid: bool
errors: list[str] = Field(default_factory=list)
warnings: list[str] = Field(default_factory=list)
@@ -49,6 +50,7 @@ class ValidationResult(BaseModel):
class TestCase(BaseModel):
"""A test case for validating agent behavior."""
id: str
description: str
input: dict[str, Any]
@@ -58,6 +60,7 @@ class TestCase(BaseModel):
class TestResult(BaseModel):
"""Result of running a test case."""
test_id: str
passed: bool
actual_output: Any = None
@@ -71,6 +74,7 @@ class BuildSession(BaseModel):
Saved after each approved step so you can resume later.
"""
id: str
name: str
phase: BuildPhase = BuildPhase.INIT
@@ -459,11 +463,14 @@ class GraphBuilder:
# Run the test
import asyncio
result = asyncio.run(executor.execute(
graph=graph,
goal=self.session.goal,
input_data=test.input,
))
result = asyncio.run(
executor.execute(
graph=graph,
goal=self.session.goal,
input_data=test.input,
)
)
# Check result
passed = result.success
@@ -517,12 +524,14 @@ class GraphBuilder:
if not self._pending_validation.valid:
return False
self.session.approvals.append({
"phase": self.session.phase.value,
"comment": comment,
"timestamp": datetime.now().isoformat(),
"validation": self._pending_validation.model_dump(),
})
self.session.approvals.append(
{
"phase": self.session.phase.value,
"comment": comment,
"timestamp": datetime.now().isoformat(),
"validation": self._pending_validation.model_dump(),
}
)
# Advance phase if appropriate
if self.session.phase == BuildPhase.GOAL_DRAFT:
@@ -556,11 +565,13 @@ class GraphBuilder:
return False
self.session.phase = BuildPhase.APPROVED
self.session.approvals.append({
"phase": "final",
"comment": comment,
"timestamp": datetime.now().isoformat(),
})
self.session.approvals.append(
{
"phase": "final",
"comment": comment,
"timestamp": datetime.now().isoformat(),
}
)
self._save_session()
return True
@@ -632,69 +643,75 @@ class GraphBuilder:
"""Generate Python code for the graph."""
lines = [
'"""',
f'Generated agent: {self.session.name}',
f'Generated at: {datetime.now().isoformat()}',
f"Generated agent: {self.session.name}",
f"Generated at: {datetime.now().isoformat()}",
'"""',
'',
'from framework.graph import (',
' Goal, SuccessCriterion, Constraint,',
' NodeSpec, EdgeSpec, EdgeCondition,',
')',
'from framework.graph.edge import GraphSpec',
'from framework.graph.goal import GoalStatus',
'',
'',
'# Goal',
"",
"from framework.graph import (",
" Goal, SuccessCriterion, Constraint,",
" NodeSpec, EdgeSpec, EdgeCondition,",
")",
"from framework.graph.edge import GraphSpec",
"from framework.graph.goal import GoalStatus",
"",
"",
"# Goal",
]
if self.session.goal:
goal_json = self.session.goal.model_dump_json(indent=4)
lines.append(f'GOAL = Goal.model_validate_json(\'\'\'')
lines.append("GOAL = Goal.model_validate_json('''")
lines.append(goal_json)
lines.append("''')")
else:
lines.append('GOAL = None')
lines.append("GOAL = None")
lines.extend([
'',
'',
'# Nodes',
'NODES = [',
])
lines.extend(
[
"",
"",
"# Nodes",
"NODES = [",
]
)
for node in self.session.nodes:
node_json = node.model_dump_json(indent=4)
lines.append(f' NodeSpec.model_validate_json(\'\'\'')
lines.append(" NodeSpec.model_validate_json('''")
lines.append(node_json)
lines.append(" '''),")
lines.extend([
']',
'',
'',
'# Edges',
'EDGES = [',
])
lines.extend(
[
"]",
"",
"",
"# Edges",
"EDGES = [",
]
)
for edge in self.session.edges:
edge_json = edge.model_dump_json(indent=4)
lines.append(f' EdgeSpec.model_validate_json(\'\'\'')
lines.append(" EdgeSpec.model_validate_json('''")
lines.append(edge_json)
lines.append(" '''),")
lines.extend([
']',
'',
'',
'# Graph',
])
lines.extend(
[
"]",
"",
"",
"# Graph",
]
)
graph_json = graph.model_dump_json(indent=4)
lines.append(f'GRAPH = GraphSpec.model_validate_json(\'\'\'')
lines.append("GRAPH = GraphSpec.model_validate_json('''")
lines.append(graph_json)
lines.append("''')")
return '\n'.join(lines)
return "\n".join(lines)
# =========================================================================
# SESSION MANAGEMENT
@@ -745,7 +762,9 @@ class GraphBuilder:
"tests": len(self.session.test_cases),
"tests_passed": sum(1 for t in self.session.test_results if t.passed),
"approvals": len(self.session.approvals),
"pending_validation": self._pending_validation.model_dump() if self._pending_validation else None,
"pending_validation": self._pending_validation.model_dump()
if self._pending_validation
else None,
}
def show(self) -> str:
@@ -757,11 +776,13 @@ class GraphBuilder:
]
if self.session.goal:
lines.extend([
f"Goal: {self.session.goal.name}",
f" {self.session.goal.description}",
"",
])
lines.extend(
[
f"Goal: {self.session.goal.name}",
f" {self.session.goal.description}",
"",
]
)
if self.session.nodes:
lines.append("Nodes:")
+54 -9
@@ -1,26 +1,65 @@
"""
Command-line interface for Goal Agent.
Command-line interface for Aden Hive.
Usage:
python -m core run exports/my-agent --input '{"key": "value"}'
python -m core info exports/my-agent
python -m core validate exports/my-agent
python -m core list exports/
python -m core dispatch exports/ --input '{"key": "value"}'
python -m core shell exports/my-agent
hive run exports/my-agent --input '{"key": "value"}'
hive info exports/my-agent
hive validate exports/my-agent
hive list exports/
hive dispatch exports/ --input '{"key": "value"}'
hive shell exports/my-agent
Testing commands:
hive test-run <agent_path> --goal <goal_id>
hive test-debug <goal_id> <test_id>
hive test-list <goal_id>
hive test-stats <goal_id>
"""
import argparse
import sys
from pathlib import Path
def _configure_paths():
"""Auto-configure sys.path so agents in exports/ are discoverable.
Resolves the project root by walking up from this file (framework/cli.py lives
inside core/framework/) or from CWD, then adds the exports/ directory to sys.path
if it exists. This eliminates the need for manual PYTHONPATH configuration.
"""
# Strategy 1: resolve relative to this file (works when installed via pip install -e core/)
framework_dir = Path(__file__).resolve().parent # core/framework/
core_dir = framework_dir.parent # core/
project_root = core_dir.parent # project root
# Strategy 2: if project_root doesn't look right, fall back to CWD
if not (project_root / "exports").is_dir() and not (project_root / "core").is_dir():
project_root = Path.cwd()
# Add exports/ to sys.path so agents are importable as top-level packages
exports_dir = project_root / "exports"
if exports_dir.is_dir():
exports_str = str(exports_dir)
if exports_str not in sys.path:
sys.path.insert(0, exports_str)
# Ensure core/ is also in sys.path (for non-editable-install scenarios)
core_str = str(project_root / "core")
if (project_root / "core").is_dir() and core_str not in sys.path:
sys.path.insert(0, core_str)
def main():
_configure_paths()
parser = argparse.ArgumentParser(
description="Goal Agent - Build and run goal-driven agents"
prog="hive",
description="Aden Hive - Build and run goal-driven agents",
)
parser.add_argument(
"--model",
default="claude-sonnet-4-20250514",
default="claude-haiku-4-5-20251001",
help="Anthropic model to use",
)
@@ -28,8 +67,14 @@ def main():
# Register runner commands (run, info, validate, list, dispatch, shell)
from framework.runner.cli import register_commands
register_commands(subparsers)
# Register testing commands (test-run, test-debug, test-list, test-stats)
from framework.testing.cli import register_testing_commands
register_testing_commands(subparsers)
args = parser.parse_args()
if hasattr(args, "func"):
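A minimal sketch of what the path auto-configuration buys you, assuming an exported agent package named my_agent lives under exports/ (the package name is illustrative, and _configure_paths is an internal helper, so treat this as a sketch rather than a supported API):
    # Sketch only: "my_agent" is a hypothetical package under exports/.
    import sys
    from framework.cli import _configure_paths

    _configure_paths()                              # adds <project_root>/exports to sys.path
    print([p for p in sys.path if p.endswith("exports")])
    import my_agent                                 # exported agents import as top-level packages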
+122
@@ -0,0 +1,122 @@
"""
Credential Store - Production-ready credential management for Hive.
This module provides secure credential storage with:
- Key-vault structure: Credentials as objects with multiple keys
- Template-based usage: {{cred.key}} patterns for injection
- Split-responsibility model: the store holds secret values, tools define how they are used
- Provider system: Extensible lifecycle management (refresh, validate)
- Multiple backends: Encrypted files, env vars, HashiCorp Vault
Quick Start:
from core.framework.credentials import CredentialStore, CredentialObject
# Create store with encrypted storage
store = CredentialStore.with_encrypted_storage() # defaults to ~/.hive/credentials
# Get a credential
api_key = store.get("brave_search")
# Resolve templates in headers
headers = store.resolve_headers({
"Authorization": "Bearer {{github_oauth.access_token}}"
})
# Save a new credential
store.save_credential(CredentialObject(
id="my_api",
keys={"api_key": CredentialKey(name="api_key", value=SecretStr("xxx"))}
))
For OAuth2 support:
from core.framework.credentials.oauth2 import BaseOAuth2Provider, OAuth2Config
For Aden server sync:
from core.framework.credentials.aden import (
AdenCredentialClient,
AdenClientConfig,
AdenSyncProvider,
)
For Vault integration:
from core.framework.credentials.vault import HashiCorpVaultStorage
"""
from .models import (
CredentialDecryptionError,
CredentialError,
CredentialKey,
CredentialKeyNotFoundError,
CredentialNotFoundError,
CredentialObject,
CredentialRefreshError,
CredentialType,
CredentialUsageSpec,
CredentialValidationError,
)
from .provider import (
BearerTokenProvider,
CredentialProvider,
StaticProvider,
)
from .storage import (
CompositeStorage,
CredentialStorage,
EncryptedFileStorage,
EnvVarStorage,
InMemoryStorage,
)
from .store import CredentialStore
from .template import TemplateResolver
# Aden sync components (lazy import to avoid httpx dependency when not needed)
# Usage: from core.framework.credentials.aden import AdenSyncProvider
# Or: from core.framework.credentials import AdenSyncProvider
try:
from .aden import (
AdenCachedStorage,
AdenClientConfig,
AdenCredentialClient,
AdenSyncProvider,
)
_ADEN_AVAILABLE = True
except ImportError:
_ADEN_AVAILABLE = False
__all__ = [
# Main store
"CredentialStore",
# Models
"CredentialObject",
"CredentialKey",
"CredentialType",
"CredentialUsageSpec",
# Providers
"CredentialProvider",
"StaticProvider",
"BearerTokenProvider",
# Storage backends
"CredentialStorage",
"EncryptedFileStorage",
"EnvVarStorage",
"InMemoryStorage",
"CompositeStorage",
# Template resolution
"TemplateResolver",
# Exceptions
"CredentialError",
"CredentialNotFoundError",
"CredentialKeyNotFoundError",
"CredentialRefreshError",
"CredentialValidationError",
"CredentialDecryptionError",
# Aden sync (optional - requires httpx)
"AdenSyncProvider",
"AdenCredentialClient",
"AdenClientConfig",
"AdenCachedStorage",
]
# Track Aden availability for runtime checks
ADEN_AVAILABLE = _ADEN_AVAILABLE
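A short sketch of gating optional Aden sync features on the availability flag exported above; the import path mirrors the test suite, which imports from framework.credentials:
    # Sketch: only wire up Aden sync when the optional dependencies are installed.
    from framework.credentials import ADEN_AVAILABLE

    if ADEN_AVAILABLE:
        from framework.credentials import AdenSyncProvider
    else:
        AdenSyncProvider = None  # httpx / Aden extras not installed; skip Aden sync wiring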
@@ -0,0 +1,76 @@
"""
Aden Credential Sync.
Components for synchronizing credentials with the Aden authentication server.
The Aden server handles OAuth2 authorization flows and maintains refresh tokens.
These components fetch and cache access tokens locally while delegating
lifecycle management to Aden.
Components:
- AdenCredentialClient: HTTP client for Aden API
- AdenSyncProvider: CredentialProvider that syncs with Aden
- AdenCachedStorage: Storage with local cache + Aden fallback
Quick Start:
from core.framework.credentials import CredentialStore
from core.framework.credentials.storage import EncryptedFileStorage
from core.framework.credentials.aden import (
AdenCredentialClient,
AdenClientConfig,
AdenSyncProvider,
)
# Configure (API key loaded from ADEN_API_KEY env var)
client = AdenCredentialClient(AdenClientConfig(
base_url=os.environ["ADEN_API_URL"],
))
provider = AdenSyncProvider(client=client)
store = CredentialStore(
storage=EncryptedFileStorage(),
providers=[provider],
auto_refresh=True,
)
# Initial sync
provider.sync_all(store)
# Use normally
token = store.get_key("hubspot", "access_token")
See docs/aden-credential-sync.md for detailed documentation.
"""
from .client import (
AdenAuthenticationError,
AdenClientConfig,
AdenClientError,
AdenCredentialClient,
AdenCredentialResponse,
AdenIntegrationInfo,
AdenNotFoundError,
AdenRateLimitError,
AdenRefreshError,
)
from .provider import AdenSyncProvider
from .storage import AdenCachedStorage
__all__ = [
# Client
"AdenCredentialClient",
"AdenClientConfig",
"AdenCredentialResponse",
"AdenIntegrationInfo",
# Client errors
"AdenClientError",
"AdenAuthenticationError",
"AdenNotFoundError",
"AdenRateLimitError",
"AdenRefreshError",
# Provider
"AdenSyncProvider",
# Storage
"AdenCachedStorage",
]
+466
@@ -0,0 +1,466 @@
"""
Aden Credential Client.
HTTP client for communicating with the Aden authentication server.
The Aden server handles OAuth2 authorization flows and token management.
This client fetches tokens and delegates refresh operations to Aden.
Usage:
# API key loaded from ADEN_API_KEY environment variable by default
client = AdenCredentialClient(AdenClientConfig(
base_url="https://api.adenhq.com",
))
# Or explicitly provide the API key
client = AdenCredentialClient(AdenClientConfig(
base_url="https://api.adenhq.com",
api_key="your-api-key",
))
# Fetch a credential
response = client.get_credential("hubspot")
if response:
print(f"Token expires at: {response.expires_at}")
# Request a refresh
refreshed = client.request_refresh("hubspot")
"""
from __future__ import annotations
import logging
import os
import time
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any
import httpx
logger = logging.getLogger(__name__)
class AdenClientError(Exception):
"""Base exception for Aden client errors."""
pass
class AdenAuthenticationError(AdenClientError):
"""Raised when API key is invalid or revoked."""
pass
class AdenNotFoundError(AdenClientError):
"""Raised when integration is not found."""
pass
class AdenRefreshError(AdenClientError):
"""Raised when token refresh fails."""
def __init__(
self,
message: str,
requires_reauthorization: bool = False,
reauthorization_url: str | None = None,
):
super().__init__(message)
self.requires_reauthorization = requires_reauthorization
self.reauthorization_url = reauthorization_url
class AdenRateLimitError(AdenClientError):
"""Raised when rate limited."""
def __init__(self, message: str, retry_after: int = 60):
super().__init__(message)
self.retry_after = retry_after
@dataclass
class AdenClientConfig:
"""Configuration for Aden API client."""
base_url: str
"""Base URL of the Aden server (e.g., 'https://api.adenhq.com')."""
api_key: str | None = None
"""Agent's API key for authenticating with Aden.
If not provided, loaded from ADEN_API_KEY environment variable."""
tenant_id: str | None = None
"""Optional tenant ID for multi-tenant deployments."""
timeout: float = 30.0
"""Request timeout in seconds."""
retry_attempts: int = 3
"""Number of retry attempts for transient failures."""
retry_delay: float = 1.0
"""Base delay between retries in seconds (exponential backoff)."""
def __post_init__(self) -> None:
"""Load API key from environment if not provided."""
if self.api_key is None:
self.api_key = os.environ.get("ADEN_API_KEY")
if not self.api_key:
raise ValueError(
"Aden API key not provided. Either pass api_key to AdenClientConfig "
"or set the ADEN_API_KEY environment variable."
)
@dataclass
class AdenCredentialResponse:
"""Response from Aden server containing credential data."""
integration_id: str
"""Unique identifier for the integration (e.g., 'hubspot')."""
integration_type: str
"""Type of integration (e.g., 'hubspot', 'github', 'slack')."""
access_token: str
"""The access token for API calls."""
token_type: str = "Bearer"
"""Token type (usually 'Bearer')."""
expires_at: datetime | None = None
"""When the access token expires (UTC)."""
scopes: list[str] = field(default_factory=list)
"""OAuth2 scopes granted to this token."""
metadata: dict[str, Any] = field(default_factory=dict)
"""Additional integration-specific metadata."""
@classmethod
def from_dict(
cls, data: dict[str, Any], integration_id: str | None = None
) -> AdenCredentialResponse:
"""Create from API response dictionary."""
expires_at = None
if data.get("expires_at"):
expires_at = datetime.fromisoformat(data["expires_at"].replace("Z", "+00:00"))
return cls(
integration_id=integration_id or data.get("alias", data.get("provider", "")),
integration_type=data.get("provider", ""),
access_token=data["access_token"],
token_type=data.get("token_type", "Bearer"),
expires_at=expires_at,
scopes=data.get("scopes", []),
metadata={"email": data.get("email")} if data.get("email") else {},
)
@dataclass
class AdenIntegrationInfo:
"""Information about an available integration."""
integration_id: str
integration_type: str
status: str # "active", "requires_reauth", "expired"
expires_at: datetime | None = None
@classmethod
def from_dict(cls, data: dict[str, Any]) -> AdenIntegrationInfo:
"""Create from API response dictionary."""
expires_at = None
if data.get("expires_at"):
expires_at = datetime.fromisoformat(data["expires_at"].replace("Z", "+00:00"))
return cls(
integration_id=data["integration_id"],
integration_type=data.get("provider", data["integration_id"]),
status=data.get("status", "unknown"),
expires_at=expires_at,
)
class AdenCredentialClient:
"""
HTTP client for Aden credential server.
Handles communication with the Aden authentication server,
including fetching credentials, requesting refreshes, and
reporting usage statistics.
The client automatically handles:
- Retries with exponential backoff for transient failures
- Proper error classification (auth, not found, rate limit, etc.)
- Request headers for authentication and tenant isolation
Usage:
# API key loaded from ADEN_API_KEY environment variable
config = AdenClientConfig(
base_url="https://api.adenhq.com",
)
client = AdenCredentialClient(config)
# Fetch a credential
cred = client.get_credential("hubspot")
if cred:
headers = {"Authorization": f"Bearer {cred.access_token}"}
# List all integrations
integrations = client.list_integrations()
for info in integrations:
print(f"{info.integration_id}: {info.status}")
# Clean up
client.close()
"""
def __init__(self, config: AdenClientConfig):
"""
Initialize the Aden client.
Args:
config: Client configuration including base URL and API key.
"""
self.config = config
self._client: httpx.Client | None = None
def _get_client(self) -> httpx.Client:
"""Get or create the HTTP client."""
if self._client is None:
headers = {
"Authorization": f"Bearer {self.config.api_key}",
"Content-Type": "application/json",
"User-Agent": "hive-credential-store/1.0",
}
if self.config.tenant_id:
headers["X-Tenant-ID"] = self.config.tenant_id
self._client = httpx.Client(
base_url=self.config.base_url,
timeout=self.config.timeout,
headers=headers,
)
return self._client
def _request_with_retry(
self,
method: str,
path: str,
**kwargs: Any,
) -> httpx.Response:
"""Make a request with retry logic."""
client = self._get_client()
last_error: Exception | None = None
for attempt in range(self.config.retry_attempts):
try:
response = client.request(method, path, **kwargs)
# Handle specific error codes
if response.status_code == 401:
raise AdenAuthenticationError("Agent API key is invalid or revoked")
if response.status_code == 404:
raise AdenNotFoundError(f"Integration not found: {path}")
if response.status_code == 429:
retry_after = int(response.headers.get("Retry-After", 60))
raise AdenRateLimitError(
"Rate limited by Aden server",
retry_after=retry_after,
)
if response.status_code == 400:
data = response.json()
if data.get("error") == "refresh_failed":
raise AdenRefreshError(
data.get("message", "Token refresh failed"),
requires_reauthorization=data.get("requires_reauthorization", False),
reauthorization_url=data.get("reauthorization_url"),
)
# Success or other error
response.raise_for_status()
return response
except (httpx.ConnectError, httpx.TimeoutException) as e:
last_error = e
if attempt < self.config.retry_attempts - 1:
delay = self.config.retry_delay * (2**attempt)
logger.warning(
f"Aden request failed (attempt {attempt + 1}), retrying in {delay}s: {e}"
)
time.sleep(delay)
else:
raise AdenClientError(f"Failed to connect to Aden server: {e}") from e
except (
AdenAuthenticationError,
AdenNotFoundError,
AdenRefreshError,
AdenRateLimitError,
):
# Don't retry these errors
raise
# Should not reach here, but just in case
raise AdenClientError(
f"Request failed after {self.config.retry_attempts} attempts"
) from last_error
def get_credential(self, integration_id: str) -> AdenCredentialResponse | None:
"""
Fetch the current credential for an integration.
The Aden server may refresh the token internally if it's expired
before returning it.
Args:
integration_id: The integration identifier (e.g., 'hubspot').
Returns:
Credential response with access token, or None if not found.
Raises:
AdenAuthenticationError: If API key is invalid.
AdenClientError: For connection failures.
"""
try:
response = self._request_with_retry("GET", f"/v1/credentials/{integration_id}")
data = response.json()
return AdenCredentialResponse.from_dict(data, integration_id=integration_id)
except AdenNotFoundError:
return None
def request_refresh(self, integration_id: str) -> AdenCredentialResponse:
"""
Request the Aden server to refresh the token.
Use this when the local store detects an expired or near-expiry token.
The Aden server handles the actual OAuth2 refresh token flow.
Args:
integration_id: The integration identifier.
Returns:
Credential response with new access token.
Raises:
AdenRefreshError: If refresh fails (may require re-authorization).
AdenNotFoundError: If integration not found.
AdenAuthenticationError: If API key is invalid.
AdenRateLimitError: If rate limited.
"""
response = self._request_with_retry("POST", f"/v1/credentials/{integration_id}/refresh")
data = response.json()
return AdenCredentialResponse.from_dict(data, integration_id=integration_id)
def list_integrations(self) -> list[AdenIntegrationInfo]:
"""
List all integrations available for this agent/tenant.
Returns:
List of integration info objects.
Raises:
AdenAuthenticationError: If API key is invalid.
AdenClientError: For connection failures.
"""
response = self._request_with_retry("GET", "/v1/credentials")
data = response.json()
return [AdenIntegrationInfo.from_dict(item) for item in data.get("integrations", [])]
def validate_token(self, integration_id: str) -> dict[str, Any]:
"""
Check if a token is still valid without fetching it.
Args:
integration_id: The integration identifier.
Returns:
Dict with 'valid' bool and optional 'expires_at', 'reason',
'requires_reauthorization', 'reauthorization_url'.
Raises:
AdenNotFoundError: If integration not found.
AdenAuthenticationError: If API key is invalid.
"""
response = self._request_with_retry("GET", f"/v1/credentials/{integration_id}/validate")
return response.json()
def report_usage(
self,
integration_id: str,
operation: str,
status: str = "success",
metadata: dict[str, Any] | None = None,
) -> None:
"""
Report credential usage statistics to Aden.
This is optional and used for analytics/billing.
Args:
integration_id: The integration identifier.
operation: Operation name (e.g., 'api_call').
status: Operation status ('success', 'error').
metadata: Additional operation metadata.
"""
try:
self._request_with_retry(
"POST",
f"/v1/credentials/{integration_id}/usage",
json={
"operation": operation,
"status": status,
"timestamp": datetime.utcnow().isoformat() + "Z",
"metadata": metadata or {},
},
)
except Exception as e:
# Usage reporting is best-effort, don't fail on errors
logger.warning(f"Failed to report usage for '{integration_id}': {e}")
def health_check(self) -> dict[str, Any]:
"""
Check Aden server health and connectivity.
Returns:
Dict with 'status', 'version', 'timestamp', and optionally 'error'.
"""
try:
client = self._get_client()
response = client.get("/health")
if response.status_code == 200:
data = response.json()
data["latency_ms"] = response.elapsed.total_seconds() * 1000
return data
return {
"status": "degraded",
"error": f"Unexpected status code: {response.status_code}",
}
except Exception as e:
return {
"status": "unhealthy",
"error": str(e),
}
def close(self) -> None:
"""Close the HTTP client and release resources."""
if self._client:
self._client.close()
self._client = None
def __enter__(self) -> AdenCredentialClient:
"""Context manager entry."""
return self
def __exit__(self, *args: Any) -> None:
"""Context manager exit."""
self.close()
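A sketch of the client used as a context manager, following the usage shown in the docstrings above; the base URL and integration id are placeholders, and the API key is read from ADEN_API_KEY by AdenClientConfig when api_key is omitted:
    # Sketch only: values are illustrative; close() runs automatically on context exit.
    from framework.credentials.aden import AdenClientConfig, AdenCredentialClient

    config = AdenClientConfig(base_url="https://api.adenhq.com")
    with AdenCredentialClient(config) as client:
        print(client.health_check())                # e.g. status, latency_ms, or error info
        cred = client.get_credential("hubspot")
        if cred is not None:
            headers = {"Authorization": f"Bearer {cred.access_token}"}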
+415
@@ -0,0 +1,415 @@
"""
Aden Sync Provider.
Provider that synchronizes credentials with the Aden authentication server.
The Aden server is the authoritative source for OAuth2 tokens - this provider
fetches and caches tokens locally while delegating refresh operations to Aden.
Usage:
from core.framework.credentials import CredentialStore
from core.framework.credentials.storage import EncryptedFileStorage
from core.framework.credentials.aden import (
AdenCredentialClient,
AdenClientConfig,
AdenSyncProvider,
)
# Configure client (API key loaded from ADEN_API_KEY env var)
client = AdenCredentialClient(AdenClientConfig(
base_url=os.environ["ADEN_API_URL"],
))
# Create provider
provider = AdenSyncProvider(client=client)
# Create store
store = CredentialStore(
storage=EncryptedFileStorage(),
providers=[provider],
auto_refresh=True,
)
# Initial sync from Aden
provider.sync_all(store)
# Use normally - auto-refreshes via Aden when needed
token = store.get_key("hubspot", "access_token")
"""
from __future__ import annotations
import logging
from datetime import UTC, datetime, timedelta
from typing import TYPE_CHECKING
from pydantic import SecretStr
from ..models import CredentialKey, CredentialObject, CredentialRefreshError, CredentialType
from ..provider import CredentialProvider
from .client import (
AdenClientError,
AdenCredentialClient,
AdenCredentialResponse,
AdenRefreshError,
)
if TYPE_CHECKING:
from ..store import CredentialStore
logger = logging.getLogger(__name__)
class AdenSyncProvider(CredentialProvider):
"""
Provider that synchronizes credentials with the Aden server.
The Aden server handles OAuth2 authorization flows and maintains
refresh tokens. This provider:
- Fetches access tokens from the Aden server
- Delegates token refresh to the Aden server
- Caches tokens locally in the credential store
- Optionally reports usage statistics back to Aden
Key benefits:
- Client secrets never leave the Aden server
- Refresh token security (stored only on Aden)
- Centralized audit logging
- Multi-tenant support
Usage:
client = AdenCredentialClient(AdenClientConfig(
base_url="https://api.adenhq.com",
api_key=os.environ["ADEN_API_KEY"],
))
provider = AdenSyncProvider(client=client)
store = CredentialStore(
storage=EncryptedFileStorage(),
providers=[provider],
auto_refresh=True,
)
"""
def __init__(
self,
client: AdenCredentialClient,
provider_id: str = "aden_sync",
refresh_buffer_minutes: int = 5,
report_usage: bool = False,
):
"""
Initialize the Aden sync provider.
Args:
client: Configured Aden API client.
provider_id: Unique identifier for this provider instance.
Useful for multi-tenant scenarios (e.g., 'aden_tenant_123').
refresh_buffer_minutes: Minutes before expiry to trigger refresh.
Default is 5 minutes.
report_usage: Whether to report usage statistics to Aden server.
"""
self._client = client
self._provider_id = provider_id
self._refresh_buffer = timedelta(minutes=refresh_buffer_minutes)
self._report_usage = report_usage
@property
def provider_id(self) -> str:
"""Unique identifier for this provider."""
return self._provider_id
@property
def supported_types(self) -> list[CredentialType]:
"""Credential types this provider can manage."""
return [CredentialType.OAUTH2, CredentialType.BEARER_TOKEN]
def can_handle(self, credential: CredentialObject) -> bool:
"""
Check if this provider can handle a credential.
Returns True if:
- Credential type is supported (OAUTH2 or BEARER_TOKEN)
- Credential's provider_id matches this provider, OR
- Credential has '_aden_managed' metadata flag
"""
if credential.credential_type not in self.supported_types:
return False
# Check if credential is explicitly linked to this provider
if credential.provider_id == self.provider_id:
return True
# Check for Aden-managed flag in metadata
aden_flag = credential.keys.get("_aden_managed")
if aden_flag and aden_flag.value.get_secret_value() == "true":
return True
return False
def refresh(self, credential: CredentialObject) -> CredentialObject:
"""
Refresh credential by requesting new token from Aden server.
The Aden server handles the actual OAuth2 refresh token flow.
This method simply fetches the result.
Args:
credential: The credential to refresh.
Returns:
Updated credential with new access token.
Raises:
CredentialRefreshError: If refresh fails.
"""
try:
# Request Aden to refresh the token
aden_response = self._client.request_refresh(credential.id)
# Update credential with new values
credential = self._update_credential_from_aden(credential, aden_response)
logger.info(f"Refreshed credential '{credential.id}' via Aden server")
# Report usage if enabled
if self._report_usage:
self._client.report_usage(
integration_id=credential.id,
operation="token_refresh",
status="success",
)
return credential
except AdenRefreshError as e:
logger.error(f"Aden refresh failed for '{credential.id}': {e}")
if e.requires_reauthorization:
raise CredentialRefreshError(
f"Integration '{credential.id}' requires re-authorization. "
f"Visit: {e.reauthorization_url or 'your Aden dashboard'}"
) from e
raise CredentialRefreshError(
f"Failed to refresh credential '{credential.id}': {e}"
) from e
except AdenClientError as e:
logger.error(f"Aden client error for '{credential.id}': {e}")
# Check if local token is still valid
access_key = credential.keys.get("access_token")
if access_key and access_key.expires_at:
if datetime.now(UTC) < access_key.expires_at:
logger.warning(f"Aden unavailable, using cached token for '{credential.id}'")
return credential
raise CredentialRefreshError(
f"Aden server unavailable and token expired for '{credential.id}'"
) from e
def validate(self, credential: CredentialObject) -> bool:
"""
Validate credential via Aden server introspection.
Args:
credential: The credential to validate.
Returns:
True if credential is valid.
"""
try:
result = self._client.validate_token(credential.id)
return result.get("valid", False)
except AdenClientError:
# Fall back to local validation
access_key = credential.keys.get("access_token")
if access_key is None:
return False
if access_key.expires_at is None:
# No expiration - assume valid
return True
return datetime.now(UTC) < access_key.expires_at
def should_refresh(self, credential: CredentialObject) -> bool:
"""
Check if credential should be refreshed.
Returns True if access_token is expired or within the refresh buffer.
Args:
credential: The credential to check.
Returns:
True if credential should be refreshed.
"""
access_key = credential.keys.get("access_token")
if access_key is None:
return False
if access_key.expires_at is None:
return False
# Refresh if within buffer of expiration
return datetime.now(UTC) >= (access_key.expires_at - self._refresh_buffer)
def fetch_from_aden(self, integration_id: str) -> CredentialObject | None:
"""
Fetch credential directly from Aden server.
Use this for initial population or when local cache is missing.
Args:
integration_id: The integration identifier (e.g., 'hubspot').
Returns:
CredentialObject if found, None otherwise.
Raises:
AdenClientError: For connection failures.
"""
aden_response = self._client.get_credential(integration_id)
if aden_response is None:
return None
return self._aden_response_to_credential(aden_response)
def sync_all(self, store: CredentialStore) -> int:
"""
Sync all credentials from Aden server to local store.
Fetches the list of available integrations from Aden and
populates the local credential store with current tokens.
Args:
store: The credential store to populate.
Returns:
Number of credentials synced.
"""
synced = 0
try:
integrations = self._client.list_integrations()
for info in integrations:
if info.status != "active":
logger.warning(
f"Skipping integration '{info.integration_id}': status={info.status}"
)
continue
try:
cred = self.fetch_from_aden(info.integration_id)
if cred:
store.save_credential(cred)
synced += 1
logger.info(f"Synced credential '{info.integration_id}' from Aden")
except Exception as e:
logger.warning(f"Failed to sync '{info.integration_id}': {e}")
except AdenClientError as e:
logger.error(f"Failed to list integrations from Aden: {e}")
return synced
def report_credential_usage(
self,
credential: CredentialObject,
operation: str,
status: str = "success",
metadata: dict | None = None,
) -> None:
"""
Report credential usage to Aden server.
Args:
credential: The credential that was used.
operation: Operation name (e.g., 'api_call').
status: Operation status ('success', 'error').
metadata: Additional metadata.
"""
if self._report_usage:
self._client.report_usage(
integration_id=credential.id,
operation=operation,
status=status,
metadata=metadata or {},
)
def _update_credential_from_aden(
self,
credential: CredentialObject,
aden_response: AdenCredentialResponse,
) -> CredentialObject:
"""Update credential object from Aden response."""
# Update access token
credential.keys["access_token"] = CredentialKey(
name="access_token",
value=SecretStr(aden_response.access_token),
expires_at=aden_response.expires_at,
)
# Update scopes if present
if aden_response.scopes:
credential.keys["scope"] = CredentialKey(
name="scope",
value=SecretStr(" ".join(aden_response.scopes)),
)
# Mark as Aden-managed
credential.keys["_aden_managed"] = CredentialKey(
name="_aden_managed",
value=SecretStr("true"),
)
# Store integration type
credential.keys["_integration_type"] = CredentialKey(
name="_integration_type",
value=SecretStr(aden_response.integration_type),
)
# Update timestamps
credential.last_refreshed = datetime.now(UTC)
credential.provider_id = self.provider_id
return credential
def _aden_response_to_credential(
self,
aden_response: AdenCredentialResponse,
) -> CredentialObject:
"""Convert Aden response to CredentialObject."""
keys: dict[str, CredentialKey] = {
"access_token": CredentialKey(
name="access_token",
value=SecretStr(aden_response.access_token),
expires_at=aden_response.expires_at,
),
"_aden_managed": CredentialKey(
name="_aden_managed",
value=SecretStr("true"),
),
"_integration_type": CredentialKey(
name="_integration_type",
value=SecretStr(aden_response.integration_type),
),
}
if aden_response.scopes:
keys["scope"] = CredentialKey(
name="scope",
value=SecretStr(" ".join(aden_response.scopes)),
)
return CredentialObject(
id=aden_response.integration_id,
credential_type=CredentialType.OAUTH2,
keys=keys,
provider_id=self.provider_id,
auto_refresh=True,
)
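A small sketch of the refresh-buffer decision in isolation, assuming `provider` is an AdenSyncProvider built as in the Usage block above with the default 5-minute buffer (the 3-minute expiry is illustrative):
    # Sketch: a token that expires inside the refresh buffer is flagged for refresh.
    from datetime import UTC, datetime, timedelta
    from pydantic import SecretStr
    from framework.credentials import CredentialKey, CredentialObject, CredentialType

    cred = CredentialObject(
        id="hubspot",
        credential_type=CredentialType.OAUTH2,
        keys={
            "access_token": CredentialKey(
                name="access_token",
                value=SecretStr("token"),
                expires_at=datetime.now(UTC) + timedelta(minutes=3),  # inside default buffer
            )
        },
    )
    print(provider.should_refresh(cred))  # True -> store will delegate refresh to Aden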
+307
@@ -0,0 +1,307 @@
"""
Aden Cached Storage.
Storage backend that combines local cache with Aden server fallback.
Provides offline resilience by caching credentials locally while
keeping them synchronized with the Aden server.
Usage:
from core.framework.credentials import CredentialStore
from core.framework.credentials.storage import EncryptedFileStorage
from core.framework.credentials.aden import (
AdenCredentialClient,
AdenClientConfig,
AdenSyncProvider,
AdenCachedStorage,
)
# Configure
client = AdenCredentialClient(AdenClientConfig(
base_url=os.environ["ADEN_API_URL"],
api_key=os.environ["ADEN_API_KEY"],
))
provider = AdenSyncProvider(client=client)
# Create cached storage
storage = AdenCachedStorage(
local_storage=EncryptedFileStorage(),
aden_provider=provider,
cache_ttl_seconds=300, # Re-check Aden every 5 minutes
)
# Create store
store = CredentialStore(
storage=storage,
providers=[provider],
auto_refresh=True,
)
# Credentials automatically fetched from Aden on first access
# Cached locally for 5 minutes
# Falls back to cache if Aden is unreachable
"""
from __future__ import annotations
import logging
from datetime import UTC, datetime, timedelta
from typing import TYPE_CHECKING
from ..storage import CredentialStorage
if TYPE_CHECKING:
from ..models import CredentialObject
from .provider import AdenSyncProvider
logger = logging.getLogger(__name__)
class AdenCachedStorage(CredentialStorage):
"""
Storage with local cache and Aden server fallback.
This storage provides:
- **Reads**: Try the local cache first, fall back to Aden if stale/missing
- **Writes**: Always write to local cache
- **Offline resilience**: Uses cached credentials when Aden is unreachable
The cache TTL determines how long to trust local credentials before
checking with the Aden server for updates. This balances:
- Performance (fewer network calls)
- Freshness (tokens stay current)
- Resilience (works during brief outages)
Usage:
storage = AdenCachedStorage(
local_storage=EncryptedFileStorage(),
aden_provider=provider,
cache_ttl_seconds=300, # 5 minutes
)
store = CredentialStore(
storage=storage,
providers=[provider],
)
# First access fetches from Aden
# Subsequent accesses use cache until TTL expires
token = store.get_key("hubspot", "access_token")
"""
def __init__(
self,
local_storage: CredentialStorage,
aden_provider: AdenSyncProvider,
cache_ttl_seconds: int = 300,
prefer_local: bool = True,
):
"""
Initialize Aden-cached storage.
Args:
local_storage: Local storage backend for caching (e.g., EncryptedFileStorage).
aden_provider: Provider for fetching from Aden server.
cache_ttl_seconds: How long to trust local cache before checking Aden.
Default is 300 seconds (5 minutes).
prefer_local: If True, use local cache when available and fresh.
If False, always check Aden first.
"""
self._local = local_storage
self._aden_provider = aden_provider
self._cache_ttl = timedelta(seconds=cache_ttl_seconds)
self._prefer_local = prefer_local
self._cache_timestamps: dict[str, datetime] = {}
def save(self, credential: CredentialObject) -> None:
"""
Save credential to local cache.
Args:
credential: The credential to save.
"""
self._local.save(credential)
self._cache_timestamps[credential.id] = datetime.now(UTC)
logger.debug(f"Cached credential '{credential.id}'")
def load(self, credential_id: str) -> CredentialObject | None:
"""
Load credential from cache, with Aden fallback.
The loading strategy depends on the `prefer_local` setting:
If prefer_local=True (default):
1. Check if local cache exists and is fresh (within TTL)
2. If fresh, return cached credential
3. If stale or missing, fetch from Aden
4. Update local cache with Aden response
5. If Aden fails, fall back to stale cache
If prefer_local=False:
1. Always try to fetch from Aden first
2. Update local cache with response
3. Fall back to local cache only if Aden fails
Args:
credential_id: The credential identifier.
Returns:
CredentialObject if found, None otherwise.
"""
local_cred = self._local.load(credential_id)
# If we prefer local and have a fresh cache, use it
if self._prefer_local and local_cred and self._is_cache_fresh(credential_id):
logger.debug(f"Using cached credential '{credential_id}'")
return local_cred
# Try to fetch from Aden
try:
aden_cred = self._aden_provider.fetch_from_aden(credential_id)
if aden_cred:
# Update local cache
self.save(aden_cred)
logger.debug(f"Fetched credential '{credential_id}' from Aden")
return aden_cred
except Exception as e:
logger.warning(f"Failed to fetch '{credential_id}' from Aden: {e}")
# Fall back to local cache if Aden fails
if local_cred:
logger.info(f"Using stale cached credential '{credential_id}'")
return local_cred
# Return local credential if it exists (may be None)
return local_cred
def delete(self, credential_id: str) -> bool:
"""
Delete credential from local cache.
Note: This does NOT delete the credential from the Aden server.
It only removes the local cache entry.
Args:
credential_id: The credential identifier.
Returns:
True if credential existed and was deleted.
"""
self._cache_timestamps.pop(credential_id, None)
return self._local.delete(credential_id)
def list_all(self) -> list[str]:
"""
List credentials from local cache.
Returns:
List of credential IDs in local cache.
"""
return self._local.list_all()
def exists(self, credential_id: str) -> bool:
"""
Check if credential exists in local cache.
Args:
credential_id: The credential identifier.
Returns:
True if credential exists locally.
"""
return self._local.exists(credential_id)
def _is_cache_fresh(self, credential_id: str) -> bool:
"""
Check if local cache is still fresh (within TTL).
Args:
credential_id: The credential identifier.
Returns:
True if cache is fresh, False if stale or not cached.
"""
cached_at = self._cache_timestamps.get(credential_id)
if cached_at is None:
return False
return datetime.now(UTC) - cached_at < self._cache_ttl
def invalidate_cache(self, credential_id: str) -> None:
"""
Invalidate cache for a specific credential.
The next load() call will fetch from Aden regardless of TTL.
Args:
credential_id: The credential identifier.
"""
self._cache_timestamps.pop(credential_id, None)
logger.debug(f"Invalidated cache for '{credential_id}'")
def invalidate_all(self) -> None:
"""Invalidate all cache entries."""
self._cache_timestamps.clear()
logger.debug("Invalidated all cache entries")
def sync_all_from_aden(self) -> int:
"""
Sync all credentials from Aden server to local cache.
Fetches the list of available integrations from Aden and
updates the local cache with current tokens.
Returns:
Number of credentials synced.
"""
synced = 0
try:
integrations = self._aden_provider._client.list_integrations()
for info in integrations:
if info.status != "active":
logger.warning(
f"Skipping integration '{info.integration_id}': status={info.status}"
)
continue
try:
cred = self._aden_provider.fetch_from_aden(info.integration_id)
if cred:
self.save(cred)
synced += 1
logger.info(f"Synced credential '{info.integration_id}' from Aden")
except Exception as e:
logger.warning(f"Failed to sync '{info.integration_id}': {e}")
except Exception as e:
logger.error(f"Failed to list integrations from Aden: {e}")
return synced
def get_cache_info(self) -> dict[str, dict]:
"""
Get cache status information for all credentials.
Returns:
Dict mapping credential_id to cache info (cached_at, is_fresh, ttl_remaining).
"""
now = datetime.now(UTC)
info = {}
for cred_id in self.list_all():
cached_at = self._cache_timestamps.get(cred_id)
if cached_at:
ttl_remaining = (cached_at + self._cache_ttl - now).total_seconds()
info[cred_id] = {
"cached_at": cached_at.isoformat(),
"is_fresh": ttl_remaining > 0,
"ttl_remaining_seconds": max(0, ttl_remaining),
}
else:
info[cred_id] = {
"cached_at": None,
"is_fresh": False,
"ttl_remaining_seconds": 0,
}
return info
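One way to force a fresh pull from Aden for a single integration, assuming `storage` is the AdenCachedStorage constructed as in the Usage block above:
    # Sketch: invalidate one cache entry so the next load() bypasses the TTL check.
    storage.invalidate_cache("hubspot")              # drop the local freshness timestamp
    cred = storage.load("hubspot")                   # re-fetches from Aden, falls back to cache on failure
    info = storage.get_cache_info().get("hubspot", {})
    print(info.get("is_fresh"), info.get("ttl_remaining_seconds"))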
@@ -0,0 +1 @@
"""Tests for Aden credential sync components."""
@@ -0,0 +1,670 @@
"""
Tests for Aden credential sync components.
Tests cover:
- AdenCredentialClient: HTTP client for Aden API
- AdenSyncProvider: Provider that syncs with Aden
- AdenCachedStorage: Storage with local cache + Aden fallback
"""
from datetime import UTC, datetime, timedelta
from unittest.mock import Mock
import pytest
from pydantic import SecretStr
from framework.credentials import (
CredentialKey,
CredentialObject,
CredentialStore,
CredentialType,
InMemoryStorage,
)
from framework.credentials.aden import (
AdenCachedStorage,
AdenClientConfig,
AdenClientError,
AdenCredentialClient,
AdenCredentialResponse,
AdenIntegrationInfo,
AdenRefreshError,
AdenSyncProvider,
)
# =============================================================================
# Fixtures
# =============================================================================
@pytest.fixture
def aden_config():
"""Create a test Aden client config."""
return AdenClientConfig(
base_url="https://api.test-aden.com",
api_key="test-api-key",
tenant_id="test-tenant",
timeout=5.0,
retry_attempts=2,
retry_delay=0.1,
)
@pytest.fixture
def mock_client(aden_config):
"""Create a mock Aden client."""
client = Mock(spec=AdenCredentialClient)
client.config = aden_config
return client
@pytest.fixture
def aden_response():
"""Create a sample Aden credential response."""
return AdenCredentialResponse(
integration_id="hubspot",
integration_type="hubspot",
access_token="test-access-token",
token_type="Bearer",
expires_at=datetime.now(UTC) + timedelta(hours=1),
scopes=["crm.objects.contacts.read", "crm.objects.contacts.write"],
metadata={"portal_id": "12345"},
)
@pytest.fixture
def provider(mock_client):
"""Create an AdenSyncProvider with mock client."""
return AdenSyncProvider(
client=mock_client,
provider_id="test_aden",
refresh_buffer_minutes=5,
report_usage=False,
)
@pytest.fixture
def local_storage():
"""Create an in-memory storage for testing."""
return InMemoryStorage()
@pytest.fixture
def cached_storage(local_storage, provider):
"""Create an AdenCachedStorage for testing."""
return AdenCachedStorage(
local_storage=local_storage,
aden_provider=provider,
cache_ttl_seconds=60,
prefer_local=True,
)
# =============================================================================
# AdenCredentialResponse Tests
# =============================================================================
class TestAdenCredentialResponse:
"""Tests for AdenCredentialResponse dataclass."""
def test_from_dict_basic(self):
"""Test creating response from dict."""
data = {
"integration_id": "github",
"integration_type": "github",
"access_token": "ghp_xxxxx",
}
response = AdenCredentialResponse.from_dict(data)
assert response.integration_id == "github"
assert response.integration_type == "github"
assert response.access_token == "ghp_xxxxx"
assert response.token_type == "Bearer"
assert response.expires_at is None
assert response.scopes == []
def test_from_dict_full(self):
"""Test creating response with all fields."""
data = {
"integration_id": "hubspot",
"integration_type": "hubspot",
"access_token": "token123",
"token_type": "Bearer",
"expires_at": "2026-01-28T15:30:00Z",
"scopes": ["read", "write"],
"metadata": {"key": "value"},
}
response = AdenCredentialResponse.from_dict(data)
assert response.integration_id == "hubspot"
assert response.access_token == "token123"
assert response.expires_at is not None
assert response.scopes == ["read", "write"]
assert response.metadata == {"key": "value"}
class TestAdenIntegrationInfo:
"""Tests for AdenIntegrationInfo dataclass."""
def test_from_dict(self):
"""Test creating integration info from dict."""
data = {
"integration_id": "slack",
"integration_type": "slack",
"status": "active",
"expires_at": "2026-02-01T00:00:00Z",
}
info = AdenIntegrationInfo.from_dict(data)
assert info.integration_id == "slack"
assert info.integration_type == "slack"
assert info.status == "active"
assert info.expires_at is not None
# =============================================================================
# AdenSyncProvider Tests
# =============================================================================
class TestAdenSyncProvider:
"""Tests for AdenSyncProvider."""
def test_provider_id(self, provider):
"""Test provider ID."""
assert provider.provider_id == "test_aden"
def test_supported_types(self, provider):
"""Test supported credential types."""
assert CredentialType.OAUTH2 in provider.supported_types
assert CredentialType.BEARER_TOKEN in provider.supported_types
def test_can_handle_oauth2(self, provider):
"""Test can_handle returns True for OAUTH2 credentials with matching provider_id."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={},
provider_id="test_aden",
)
assert provider.can_handle(cred) is True
def test_can_handle_aden_managed(self, provider):
"""Test can_handle returns True for Aden-managed credentials."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"_aden_managed": CredentialKey(
name="_aden_managed",
value=SecretStr("true"),
)
},
)
assert provider.can_handle(cred) is True
def test_can_handle_wrong_type(self, provider):
"""Test can_handle returns False for unsupported types."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.API_KEY,
keys={},
)
assert provider.can_handle(cred) is False
def test_refresh_success(self, provider, mock_client, aden_response):
"""Test successful credential refresh."""
mock_client.request_refresh.return_value = aden_response
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("old-token"),
)
},
provider_id="test_aden",
)
refreshed = provider.refresh(cred)
assert refreshed.keys["access_token"].value.get_secret_value() == "test-access-token"
assert refreshed.keys["_aden_managed"].value.get_secret_value() == "true"
assert refreshed.last_refreshed is not None
mock_client.request_refresh.assert_called_once_with("hubspot")
def test_refresh_requires_reauth(self, provider, mock_client):
"""Test refresh that requires re-authorization."""
mock_client.request_refresh.side_effect = AdenRefreshError(
"Token revoked",
requires_reauthorization=True,
reauthorization_url="https://aden.com/reauth",
)
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={},
)
from framework.credentials import CredentialRefreshError
with pytest.raises(CredentialRefreshError) as exc_info:
provider.refresh(cred)
assert "re-authorization" in str(exc_info.value).lower()
def test_refresh_aden_unavailable_cached_valid(self, provider, mock_client):
"""Test refresh falls back to cache when Aden is unavailable and token is valid."""
mock_client.request_refresh.side_effect = AdenClientError("Connection failed")
# Token expires in 1 hour - still valid
future = datetime.now(UTC) + timedelta(hours=1)
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("cached-token"),
expires_at=future,
)
},
)
# Should return the cached credential instead of failing
result = provider.refresh(cred)
assert result.keys["access_token"].value.get_secret_value() == "cached-token"
def test_should_refresh_expired(self, provider):
"""Test should_refresh returns True for expired token."""
past = datetime.now(UTC) - timedelta(hours=1)
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token"),
expires_at=past,
)
},
)
assert provider.should_refresh(cred) is True
def test_should_refresh_within_buffer(self, provider):
"""Test should_refresh returns True when within buffer."""
# Expires in 3 minutes (buffer is 5 minutes)
soon = datetime.now(UTC) + timedelta(minutes=3)
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token"),
expires_at=soon,
)
},
)
assert provider.should_refresh(cred) is True
def test_should_refresh_still_valid(self, provider):
"""Test should_refresh returns False for valid token."""
future = datetime.now(UTC) + timedelta(hours=1)
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token"),
expires_at=future,
)
},
)
assert provider.should_refresh(cred) is False
def test_fetch_from_aden(self, provider, mock_client, aden_response):
"""Test fetching credential from Aden."""
mock_client.get_credential.return_value = aden_response
cred = provider.fetch_from_aden("hubspot")
assert cred is not None
assert cred.id == "hubspot"
assert cred.keys["access_token"].value.get_secret_value() == "test-access-token"
assert cred.auto_refresh is True
def test_fetch_from_aden_not_found(self, provider, mock_client):
"""Test fetch returns None when not found."""
mock_client.get_credential.return_value = None
cred = provider.fetch_from_aden("nonexistent")
assert cred is None
def test_sync_all(self, provider, mock_client, aden_response):
"""Test syncing all credentials."""
mock_client.list_integrations.return_value = [
AdenIntegrationInfo(
integration_id="hubspot",
integration_type="hubspot",
status="active",
),
AdenIntegrationInfo(
integration_id="github",
integration_type="github",
status="requires_reauth", # Should be skipped
),
]
mock_client.get_credential.return_value = aden_response
store = CredentialStore(storage=InMemoryStorage())
synced = provider.sync_all(store)
assert synced == 1 # Only active one was synced
assert store.get_credential("hubspot") is not None
def test_validate_via_aden(self, provider, mock_client):
"""Test validation via Aden introspection."""
mock_client.validate_token.return_value = {"valid": True}
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={},
)
assert provider.validate(cred) is True
def test_validate_fallback_to_local(self, provider, mock_client):
"""Test validation falls back to local check when Aden fails."""
mock_client.validate_token.side_effect = AdenClientError("Failed")
future = datetime.now(UTC) + timedelta(hours=1)
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token"),
expires_at=future,
)
},
)
assert provider.validate(cred) is True
# =============================================================================
# AdenCachedStorage Tests
# =============================================================================
class TestAdenCachedStorage:
"""Tests for AdenCachedStorage."""
def test_save_updates_cache_timestamp(self, cached_storage):
"""Test save updates cache timestamp."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("token"),
)
},
)
cached_storage.save(cred)
assert "test" in cached_storage._cache_timestamps
assert cached_storage.exists("test")
def test_load_from_fresh_cache(self, cached_storage, local_storage):
"""Test load returns cached credential when fresh."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("cached-token"),
)
},
)
# Save to both local storage and update timestamp
local_storage.save(cred)
cached_storage._cache_timestamps["test"] = datetime.now(UTC)
loaded = cached_storage.load("test")
assert loaded is not None
assert loaded.keys["access_token"].value.get_secret_value() == "cached-token"
def test_load_from_aden_when_stale(
self, cached_storage, local_storage, provider, mock_client, aden_response
):
"""Test load fetches from Aden when cache is stale."""
# Create stale cached credential
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("stale-token"),
)
},
)
local_storage.save(cred)
# Set cache timestamp to be stale (2 minutes ago, TTL is 60 seconds)
cached_storage._cache_timestamps["hubspot"] = datetime.now(UTC) - timedelta(minutes=2)
# Mock Aden response
mock_client.get_credential.return_value = aden_response
loaded = cached_storage.load("hubspot")
assert loaded is not None
assert loaded.keys["access_token"].value.get_secret_value() == "test-access-token"
def test_load_falls_back_to_stale_when_aden_fails(
self, cached_storage, local_storage, provider, mock_client
):
"""Test load falls back to stale cache when Aden fails."""
# Create stale cached credential
cred = CredentialObject(
id="hubspot",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr("stale-token"),
)
},
)
local_storage.save(cred)
cached_storage._cache_timestamps["hubspot"] = datetime.now(UTC) - timedelta(minutes=2)
# Aden fails
mock_client.get_credential.side_effect = AdenClientError("Connection failed")
loaded = cached_storage.load("hubspot")
assert loaded is not None
assert loaded.keys["access_token"].value.get_secret_value() == "stale-token"
def test_delete_removes_cache_timestamp(self, cached_storage, local_storage):
"""Test delete removes cache timestamp."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={},
)
cached_storage.save(cred)
assert "test" in cached_storage._cache_timestamps
cached_storage.delete("test")
assert "test" not in cached_storage._cache_timestamps
assert not cached_storage.exists("test")
def test_invalidate_cache(self, cached_storage, local_storage):
"""Test invalidate_cache removes timestamp."""
cred = CredentialObject(
id="test",
credential_type=CredentialType.OAUTH2,
keys={},
)
cached_storage.save(cred)
cached_storage.invalidate_cache("test")
assert "test" not in cached_storage._cache_timestamps
# Credential still exists in local storage
assert local_storage.exists("test")
def test_invalidate_all(self, cached_storage):
"""Test invalidate_all clears all timestamps."""
for i in range(3):
cached_storage._cache_timestamps[f"test_{i}"] = datetime.now(UTC)
cached_storage.invalidate_all()
assert len(cached_storage._cache_timestamps) == 0
def test_is_cache_fresh(self, cached_storage):
"""Test _is_cache_fresh logic."""
# Fresh cache
cached_storage._cache_timestamps["fresh"] = datetime.now(UTC)
assert cached_storage._is_cache_fresh("fresh") is True
# Stale cache
cached_storage._cache_timestamps["stale"] = datetime.now(UTC) - timedelta(minutes=5)
assert cached_storage._is_cache_fresh("stale") is False
# No cache
assert cached_storage._is_cache_fresh("nonexistent") is False
def test_get_cache_info(self, cached_storage, local_storage):
"""Test get_cache_info returns status for all credentials."""
# Add some credentials
for name in ["fresh", "stale"]:
cred = CredentialObject(
id=name,
credential_type=CredentialType.OAUTH2,
keys={},
)
local_storage.save(cred)
cached_storage._cache_timestamps["fresh"] = datetime.now(UTC)
cached_storage._cache_timestamps["stale"] = datetime.now(UTC) - timedelta(minutes=5)
info = cached_storage.get_cache_info()
assert "fresh" in info
assert info["fresh"]["is_fresh"] is True
assert info["fresh"]["ttl_remaining_seconds"] > 0
assert "stale" in info
assert info["stale"]["is_fresh"] is False
assert info["stale"]["ttl_remaining_seconds"] == 0
# =============================================================================
# Integration Tests
# =============================================================================
class TestAdenIntegration:
"""Integration tests for Aden sync components."""
def test_full_workflow(self, mock_client, aden_response):
"""Test full workflow: sync, get, refresh."""
# Setup
mock_client.list_integrations.return_value = [
AdenIntegrationInfo(
integration_id="hubspot",
integration_type="hubspot",
status="active",
),
]
mock_client.get_credential.return_value = aden_response
mock_client.request_refresh.return_value = AdenCredentialResponse(
integration_id="hubspot",
integration_type="hubspot",
access_token="refreshed-token",
expires_at=datetime.now(UTC) + timedelta(hours=2),
scopes=[],
)
provider = AdenSyncProvider(client=mock_client)
storage = InMemoryStorage()
store = CredentialStore(
storage=storage,
providers=[provider],
auto_refresh=True,
)
# Initial sync
synced = provider.sync_all(store)
assert synced == 1
# Get credential
cred = store.get_credential("hubspot")
assert cred is not None
assert cred.keys["access_token"].value.get_secret_value() == "test-access-token"
# Simulate expiration
cred.keys["access_token"] = CredentialKey(
name="access_token",
value=SecretStr("test-access-token"),
expires_at=datetime.now(UTC) - timedelta(hours=1), # Expired
)
storage.save(cred)
# Refresh should be triggered
refreshed = provider.refresh(cred)
assert refreshed.keys["access_token"].value.get_secret_value() == "refreshed-token"
def test_cached_storage_with_store(self, mock_client, aden_response):
"""Test AdenCachedStorage with CredentialStore."""
mock_client.get_credential.return_value = aden_response
provider = AdenSyncProvider(client=mock_client)
local_storage = InMemoryStorage()
cached_storage = AdenCachedStorage(
local_storage=local_storage,
aden_provider=provider,
cache_ttl_seconds=300,
)
# First load fetches from Aden
cred = cached_storage.load("hubspot")
assert cred is not None
mock_client.get_credential.assert_called_once()
# Second load uses cache
mock_client.get_credential.reset_mock()
cred2 = cached_storage.load("hubspot")
assert cred2 is not None
mock_client.get_credential.assert_not_called()
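# In application code the cached storage slots in wherever a plain storage backend
# would; a minimal sketch mirroring the integration test above (the provider and
# the "hubspot" credential id are assumed to be configured elsewhere):
def _cached_storage_usage_sketch(provider: AdenSyncProvider) -> CredentialObject | None:
    cached = AdenCachedStorage(
        local_storage=InMemoryStorage(),
        aden_provider=provider,
        cache_ttl_seconds=300,
    )
    store = CredentialStore(storage=cached, providers=[provider], auto_refresh=True)
    # Fresh loads hit Aden once; subsequent reads are served from the local cache
    # until the TTL lapses or the cache is invalidated.
    return store.get_credential("hubspot")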
@@ -0,0 +1,293 @@
"""
Core data models for the credential store.
This module defines the key-vault structure where credentials are objects
containing one or more keys (e.g., api_key, access_token, refresh_token).
"""
from __future__ import annotations
from datetime import UTC, datetime
from enum import Enum
from typing import Any
from pydantic import BaseModel, Field, SecretStr
def _utc_now() -> datetime:
"""Get current UTC time as timezone-aware datetime."""
return datetime.now(UTC)
class CredentialType(str, Enum):
"""Types of credentials the store can manage."""
API_KEY = "api_key"
"""Simple API key (e.g., Brave Search, OpenAI)"""
OAUTH2 = "oauth2"
"""OAuth2 with refresh token support"""
BASIC_AUTH = "basic_auth"
"""Username/password pair"""
BEARER_TOKEN = "bearer_token"
"""JWT or bearer token without refresh"""
CUSTOM = "custom"
"""User-defined credential type"""
class CredentialKey(BaseModel):
"""
A single key within a credential object.
Example: 'api_key' within a 'brave_search' credential
Attributes:
name: Key name (e.g., 'api_key', 'access_token')
value: Secret value (SecretStr prevents accidental logging)
expires_at: Optional expiration time
metadata: Additional key-specific metadata
"""
name: str
value: SecretStr
expires_at: datetime | None = None
metadata: dict[str, Any] = Field(default_factory=dict)
model_config = {"extra": "allow"}
@property
def is_expired(self) -> bool:
"""Check if this key has expired."""
if self.expires_at is None:
return False
return datetime.now(UTC) >= self.expires_at
def get_secret_value(self) -> str:
"""Get the actual secret value (use sparingly)."""
return self.value.get_secret_value()
class CredentialObject(BaseModel):
"""
A credential object containing one or more keys.
This is the key-vault structure where each credential can have
multiple keys (e.g., access_token, refresh_token, expires_at).
Example:
CredentialObject(
id="github_oauth",
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(name="access_token", value=SecretStr("ghp_xxx")),
"refresh_token": CredentialKey(name="refresh_token", value=SecretStr("ghr_xxx")),
},
provider_id="oauth2"
)
Attributes:
id: Unique identifier (e.g., 'brave_search', 'github_oauth')
credential_type: Type of credential (API_KEY, OAUTH2, etc.)
keys: Dictionary of key name to CredentialKey
provider_id: ID of provider responsible for lifecycle management
auto_refresh: Whether to automatically refresh when expired
"""
id: str = Field(description="Unique identifier (e.g., 'brave_search', 'github_oauth')")
credential_type: CredentialType = CredentialType.API_KEY
keys: dict[str, CredentialKey] = Field(default_factory=dict)
# Lifecycle management
provider_id: str | None = Field(
default=None,
description="ID of provider responsible for lifecycle (e.g., 'oauth2', 'static')",
)
last_refreshed: datetime | None = None
auto_refresh: bool = False
# Usage tracking
last_used: datetime | None = None
use_count: int = 0
# Metadata
description: str = ""
tags: list[str] = Field(default_factory=list)
created_at: datetime = Field(default_factory=_utc_now)
updated_at: datetime = Field(default_factory=_utc_now)
model_config = {"extra": "allow"}
def get_key(self, key_name: str) -> str | None:
"""
Get a specific key's value.
Args:
key_name: Name of the key to retrieve
Returns:
The key's secret value, or None if not found
"""
key = self.keys.get(key_name)
if key is None:
return None
return key.get_secret_value()
def set_key(
self,
key_name: str,
value: str,
expires_at: datetime | None = None,
metadata: dict[str, Any] | None = None,
) -> None:
"""
Set or update a key.
Args:
key_name: Name of the key
value: Secret value
expires_at: Optional expiration time
metadata: Optional key-specific metadata
"""
self.keys[key_name] = CredentialKey(
name=key_name,
value=SecretStr(value),
expires_at=expires_at,
metadata=metadata or {},
)
self.updated_at = datetime.now(UTC)
def has_key(self, key_name: str) -> bool:
"""Check if a key exists."""
return key_name in self.keys
@property
def needs_refresh(self) -> bool:
"""Check if any key is expired or near expiration."""
for key in self.keys.values():
if key.is_expired:
return True
return False
@property
def is_valid(self) -> bool:
"""Check if credential has at least one non-expired key."""
if not self.keys:
return False
return not all(key.is_expired for key in self.keys.values())
def record_usage(self) -> None:
"""Record that this credential was used."""
self.last_used = datetime.now(UTC)
self.use_count += 1
def get_default_key(self) -> str | None:
"""
Get the default key value.
Priority: 'value' > 'api_key' > 'access_token' > first key
Returns:
The default key's value, or None if no keys exist
"""
for key_name in ["value", "api_key", "access_token"]:
if key_name in self.keys:
return self.get_key(key_name)
if self.keys:
first_key = next(iter(self.keys))
return self.get_key(first_key)
return None
class CredentialUsageSpec(BaseModel):
"""
Specification for how a tool uses credentials.
This implements the "bipartisan" model where the credential store
just stores values, and tools define how those values are used
in HTTP requests (headers, query params, body).
Example:
CredentialUsageSpec(
credential_id="brave_search",
required_keys=["api_key"],
headers={"X-Subscription-Token": "{{api_key}}"}
)
CredentialUsageSpec(
credential_id="github_oauth",
required_keys=["access_token"],
headers={"Authorization": "Bearer {{access_token}}"}
)
Attributes:
credential_id: ID of credential to use
required_keys: Keys that must be present
headers: Header templates with {{key}} placeholders
query_params: Query parameter templates
body_fields: Request body field templates
"""
credential_id: str = Field(description="ID of credential to use (e.g., 'brave_search')")
required_keys: list[str] = Field(default_factory=list, description="Keys that must be present")
# Injection templates (bipartisan model)
headers: dict[str, str] = Field(
default_factory=dict,
description="Header templates (e.g., {'Authorization': 'Bearer {{access_token}}'})",
)
query_params: dict[str, str] = Field(
default_factory=dict,
description="Query param templates (e.g., {'api_key': '{{api_key}}'})",
)
body_fields: dict[str, str] = Field(
default_factory=dict,
description="Request body field templates",
)
# Metadata
required: bool = True
description: str = ""
help_url: str = ""
model_config = {"extra": "allow"}
class CredentialError(Exception):
"""Base exception for credential-related errors."""
pass
class CredentialNotFoundError(CredentialError):
"""Raised when a referenced credential doesn't exist."""
pass
class CredentialKeyNotFoundError(CredentialError):
"""Raised when a referenced key doesn't exist in a credential."""
pass
class CredentialRefreshError(CredentialError):
"""Raised when credential refresh fails."""
pass
class CredentialValidationError(CredentialError):
"""Raised when credential validation fails."""
pass
class CredentialDecryptionError(CredentialError):
"""Raised when credential decryption fails."""
pass
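# A minimal usage sketch for the models above; the credential id and the secret
# value are illustrative placeholders, not real credentials.
def _models_usage_sketch() -> None:
    cred = CredentialObject(
        id="brave_search",
        credential_type=CredentialType.API_KEY,
    )
    cred.set_key("api_key", "example-secret")
    assert cred.has_key("api_key")
    assert cred.get_key("api_key") == "example-secret"
    # 'api_key' wins the default-key priority, so callers that only need a
    # single value can stay generic:
    assert cred.get_default_key() == "example-secret"
    assert cred.is_valid and not cred.needs_refresh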
@@ -0,0 +1,92 @@
"""
OAuth2 support for the credential store.
This module provides OAuth2 credential management with:
- Token types and configuration (OAuth2Token, OAuth2Config)
- Generic OAuth2 provider (BaseOAuth2Provider)
- Token lifecycle management (TokenLifecycleManager)
Quick Start:
from core.framework.credentials import CredentialStore
from core.framework.credentials.oauth2 import BaseOAuth2Provider, OAuth2Config
# Configure OAuth2 provider
provider = BaseOAuth2Provider(OAuth2Config(
token_url="https://oauth2.example.com/token",
client_id="your-client-id",
client_secret="your-client-secret",
default_scopes=["read", "write"],
))
# Create store with OAuth2 provider
store = CredentialStore.with_encrypted_storage(
providers=[provider] # defaults to ~/.hive/credentials
)
# Get token using client credentials
token = provider.client_credentials_grant()
# Save to store
from core.framework.credentials import CredentialObject, CredentialKey, CredentialType
from pydantic import SecretStr
    keys = {
        "access_token": CredentialKey(
            name="access_token",
            value=SecretStr(token.access_token),
            expires_at=token.expires_at,
        ),
    }
    if token.refresh_token:
        keys["refresh_token"] = CredentialKey(
            name="refresh_token",
            value=SecretStr(token.refresh_token),
        )

    store.save_credential(CredentialObject(
        id="my_api",
        credential_type=CredentialType.OAUTH2,
        keys=keys,
        provider_id="oauth2",
        auto_refresh=True,
    ))
For advanced lifecycle management:
from core.framework.credentials.oauth2 import TokenLifecycleManager
manager = TokenLifecycleManager(
provider=provider,
credential_id="my_api",
store=store,
)
# Get valid token (auto-refreshes if needed)
token = manager.sync_get_valid_token()
headers = manager.get_request_headers()
"""
from .base_provider import BaseOAuth2Provider
from .hubspot_provider import HubSpotOAuth2Provider
from .lifecycle import TokenLifecycleManager, TokenRefreshResult
from .provider import (
OAuth2Config,
OAuth2Error,
OAuth2Token,
RefreshTokenInvalidError,
TokenExpiredError,
TokenPlacement,
)
__all__ = [
# Types
"OAuth2Token",
"OAuth2Config",
"TokenPlacement",
# Providers
"BaseOAuth2Provider",
"HubSpotOAuth2Provider",
# Lifecycle
"TokenLifecycleManager",
"TokenRefreshResult",
# Errors
"OAuth2Error",
"TokenExpiredError",
"RefreshTokenInvalidError",
]
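# A sketch of a provider subclass built from the exports above, following the
# pattern described in BaseOAuth2Provider's docstring; GitHub's public OAuth2
# endpoints are shown, and the client values and scopes are illustrative.
class _ExampleGitHubOAuth2Provider(BaseOAuth2Provider):
    def __init__(self, client_id: str, client_secret: str):
        super().__init__(
            OAuth2Config(
                token_url="https://github.com/login/oauth/access_token",
                authorization_url="https://github.com/login/oauth/authorize",
                client_id=client_id,
                client_secret=client_secret,
                default_scopes=["repo", "user"],
            ),
            provider_id="github_oauth2",
        )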
@@ -0,0 +1,486 @@
"""
Base OAuth2 provider implementation.
This module provides a generic OAuth2 provider that works with standard
OAuth2 servers. OSS users can extend this class for custom providers.
"""
from __future__ import annotations
import logging
from datetime import UTC, datetime, timedelta
from typing import Any
from urllib.parse import urlencode
from ..models import CredentialObject, CredentialRefreshError, CredentialType
from ..provider import CredentialProvider
from .provider import (
OAuth2Config,
OAuth2Error,
OAuth2Token,
TokenPlacement,
)
logger = logging.getLogger(__name__)
class BaseOAuth2Provider(CredentialProvider):
"""
Generic OAuth2 provider implementation.
Works with standard OAuth2 servers (RFC 6749). Override methods for
provider-specific behavior.
Supported grant types:
- Client Credentials: For server-to-server authentication
- Refresh Token: For refreshing expired access tokens
- Authorization Code: For user-authorized access (requires callback handling)
OSS users can extend this class for custom providers:
class GitHubOAuth2Provider(BaseOAuth2Provider):
def __init__(self, client_id: str, client_secret: str):
super().__init__(OAuth2Config(
token_url="https://github.com/login/oauth/access_token",
authorization_url="https://github.com/login/oauth/authorize",
client_id=client_id,
client_secret=client_secret,
default_scopes=["repo", "user"],
))
def exchange_code(self, code: str, redirect_uri: str, **kwargs) -> OAuth2Token:
# GitHub returns data as form-encoded by default
# Override to handle this
...
Example usage:
provider = BaseOAuth2Provider(OAuth2Config(
token_url="https://oauth2.example.com/token",
client_id="my-client-id",
client_secret="my-client-secret",
))
# Get token using client credentials
token = provider.client_credentials_grant()
# Refresh an expired token
        new_token = provider.refresh_access_token(old_token.refresh_token)
"""
def __init__(self, config: OAuth2Config, provider_id: str = "oauth2"):
"""
Initialize the OAuth2 provider.
Args:
config: OAuth2 configuration
provider_id: Unique identifier for this provider instance
"""
self.config = config
self._provider_id = provider_id
self._client: Any | None = None
@property
def provider_id(self) -> str:
return self._provider_id
@property
def supported_types(self) -> list[CredentialType]:
return [CredentialType.OAUTH2, CredentialType.BEARER_TOKEN]
def _get_client(self) -> Any:
"""Get or create HTTP client."""
if self._client is None:
try:
import httpx
self._client = httpx.Client(timeout=self.config.request_timeout)
except ImportError as e:
raise ImportError(
"OAuth2 provider requires 'httpx'. Install with: pip install httpx"
) from e
return self._client
def _close_client(self) -> None:
"""Close the HTTP client."""
if self._client is not None:
self._client.close()
self._client = None
def __del__(self) -> None:
"""Cleanup HTTP client on deletion."""
self._close_client()
# --- Grant Types ---
def get_authorization_url(
self,
state: str,
redirect_uri: str,
scopes: list[str] | None = None,
**kwargs: Any,
) -> str:
"""
Generate authorization URL for user consent (Authorization Code flow).
Args:
state: Anti-CSRF state parameter (should be random and verified)
redirect_uri: Callback URL to receive the authorization code
scopes: Requested scopes (defaults to config.default_scopes)
**kwargs: Additional provider-specific parameters
Returns:
URL to redirect user for authorization
Raises:
ValueError: If authorization_url is not configured
"""
if not self.config.authorization_url:
raise ValueError("authorization_url not configured for this provider")
params = {
"client_id": self.config.client_id,
"redirect_uri": redirect_uri,
"response_type": "code",
"state": state,
"scope": " ".join(scopes or self.config.default_scopes),
**kwargs,
}
return f"{self.config.authorization_url}?{urlencode(params)}"
def exchange_code(
self,
code: str,
redirect_uri: str,
**kwargs: Any,
) -> OAuth2Token:
"""
Exchange authorization code for tokens (Authorization Code flow).
Args:
code: Authorization code from callback
redirect_uri: Same redirect_uri used in authorization request
**kwargs: Additional provider-specific parameters
Returns:
OAuth2Token with access_token and optional refresh_token
Raises:
OAuth2Error: If token exchange fails
"""
data = {
"grant_type": "authorization_code",
"client_id": self.config.client_id,
"client_secret": self.config.client_secret,
"code": code,
"redirect_uri": redirect_uri,
**self.config.extra_token_params,
**kwargs,
}
return self._token_request(data)
def client_credentials_grant(
self,
scopes: list[str] | None = None,
**kwargs: Any,
) -> OAuth2Token:
"""
Obtain token using client credentials (Client Credentials flow).
This is for server-to-server authentication where no user is involved.
Args:
scopes: Requested scopes (defaults to config.default_scopes)
**kwargs: Additional provider-specific parameters
Returns:
OAuth2Token (typically without refresh_token)
Raises:
OAuth2Error: If token request fails
"""
data = {
"grant_type": "client_credentials",
"client_id": self.config.client_id,
"client_secret": self.config.client_secret,
**self.config.extra_token_params,
**kwargs,
}
if scopes or self.config.default_scopes:
data["scope"] = " ".join(scopes or self.config.default_scopes)
return self._token_request(data)
def refresh_access_token(
self,
refresh_token: str,
scopes: list[str] | None = None,
**kwargs: Any,
) -> OAuth2Token:
"""
Refresh an expired access token (Refresh Token flow).
Args:
refresh_token: The refresh token
scopes: Scopes to request (defaults to original scopes)
**kwargs: Additional provider-specific parameters
Returns:
New OAuth2Token (may include new refresh_token)
Raises:
OAuth2Error: If refresh fails
RefreshTokenInvalidError: If refresh token is revoked/invalid
"""
data = {
"grant_type": "refresh_token",
"client_id": self.config.client_id,
"client_secret": self.config.client_secret,
"refresh_token": refresh_token,
**self.config.extra_token_params,
**kwargs,
}
if scopes:
data["scope"] = " ".join(scopes)
return self._token_request(data)
def revoke_token(
self,
token: str,
token_type_hint: str = "access_token",
) -> bool:
"""
Revoke a token (RFC 7009).
Args:
token: The token to revoke
token_type_hint: "access_token" or "refresh_token"
Returns:
True if revocation succeeded
"""
if not self.config.revocation_url:
logger.warning("revocation_url not configured, cannot revoke token")
return False
try:
client = self._get_client()
response = client.post(
self.config.revocation_url,
data={
"token": token,
"token_type_hint": token_type_hint,
"client_id": self.config.client_id,
"client_secret": self.config.client_secret,
},
headers={"Accept": "application/json", **self.config.extra_headers},
)
# RFC 7009: 200 indicates success (even if token was already invalid)
return response.status_code == 200
except Exception as e:
logger.error(f"Token revocation failed: {e}")
return False
# --- CredentialProvider Interface ---
def refresh(self, credential: CredentialObject) -> CredentialObject:
"""
Refresh a credential using its refresh token.
Implements CredentialProvider.refresh().
Args:
credential: The credential to refresh
Returns:
Updated credential with new access_token
Raises:
CredentialRefreshError: If refresh fails
"""
refresh_tok = credential.get_key("refresh_token")
if not refresh_tok:
raise CredentialRefreshError(f"Credential '{credential.id}' has no refresh_token")
try:
new_token = self.refresh_access_token(refresh_tok)
except OAuth2Error as e:
if e.error == "invalid_grant":
raise CredentialRefreshError(
f"Refresh token for '{credential.id}' is invalid or revoked. "
"Re-authorization required."
) from e
raise CredentialRefreshError(f"Failed to refresh '{credential.id}': {e}") from e
# Update credential
credential.set_key("access_token", new_token.access_token, expires_at=new_token.expires_at)
# Update refresh token if a new one was issued
if new_token.refresh_token and new_token.refresh_token != refresh_tok:
credential.set_key("refresh_token", new_token.refresh_token)
credential.last_refreshed = datetime.now(UTC)
logger.info(f"Refreshed OAuth2 credential '{credential.id}'")
return credential
def validate(self, credential: CredentialObject) -> bool:
"""
Validate that credential has a valid (non-expired) access_token.
Args:
credential: The credential to validate
Returns:
True if credential has valid access_token
"""
access_key = credential.keys.get("access_token")
if access_key is None:
return False
return not access_key.is_expired
def should_refresh(self, credential: CredentialObject) -> bool:
"""
Check if credential should be refreshed.
Returns True if access_token is expired or within 5 minutes of expiry.
"""
access_key = credential.keys.get("access_token")
if access_key is None:
return False
if access_key.expires_at is None:
return False
buffer = timedelta(minutes=5)
return datetime.now(UTC) >= (access_key.expires_at - buffer)
def revoke(self, credential: CredentialObject) -> bool:
"""
Revoke all tokens in a credential.
Args:
credential: The credential to revoke
Returns:
True if all revocations succeeded
"""
success = True
# Revoke access token
access_token = credential.get_key("access_token")
if access_token:
if not self.revoke_token(access_token, "access_token"):
success = False
# Revoke refresh token
refresh_token = credential.get_key("refresh_token")
if refresh_token:
if not self.revoke_token(refresh_token, "refresh_token"):
success = False
return success
# --- Token Request Helpers ---
def _token_request(self, data: dict[str, Any]) -> OAuth2Token:
"""
Make a token request to the OAuth2 server.
Args:
data: Form data for the token request
Returns:
OAuth2Token from the response
Raises:
OAuth2Error: If request fails or returns an error
"""
client = self._get_client()
headers = {
"Accept": "application/json",
"Content-Type": "application/x-www-form-urlencoded",
**self.config.extra_headers,
}
response = client.post(self.config.token_url, data=data, headers=headers)
# Parse response
content_type = response.headers.get("content-type", "")
if "application/json" in content_type:
response_data = response.json()
else:
# Some providers (like GitHub) may return form-encoded
response_data = self._parse_form_response(response.text)
# Check for error
if response.status_code != 200 or "error" in response_data:
error = response_data.get("error", "unknown_error")
description = response_data.get("error_description", response.text)
raise OAuth2Error(
error=error, description=description, status_code=response.status_code
)
return OAuth2Token.from_token_response(response_data)
def _parse_form_response(self, text: str) -> dict[str, str]:
"""Parse form-encoded response (some providers use this instead of JSON)."""
from urllib.parse import parse_qs
parsed = parse_qs(text)
return {k: v[0] if len(v) == 1 else v for k, v in parsed.items()}
# --- Token Formatting for Requests ---
def format_for_request(self, token: OAuth2Token) -> dict[str, Any]:
"""
Format token for use in HTTP requests (bipartisan model).
Args:
token: The OAuth2 token
Returns:
Dict with 'headers', 'params', or 'data' keys as appropriate
"""
placement = self.config.token_placement
if placement == TokenPlacement.HEADER_BEARER:
return {"headers": {"Authorization": f"{token.token_type} {token.access_token}"}}
elif placement == TokenPlacement.HEADER_CUSTOM:
header_name = self.config.custom_header_name or "X-Access-Token"
return {"headers": {header_name: token.access_token}}
elif placement == TokenPlacement.QUERY_PARAM:
return {"params": {self.config.query_param_name: token.access_token}}
elif placement == TokenPlacement.BODY_PARAM:
return {"data": {"access_token": token.access_token}}
return {}
def format_credential_for_request(self, credential: CredentialObject) -> dict[str, Any]:
"""
Format a credential for use in HTTP requests.
Args:
credential: The credential containing access_token
Returns:
Dict with 'headers', 'params', or 'data' keys as appropriate
"""
access_token = credential.get_key("access_token")
if not access_token:
return {}
token = OAuth2Token(
access_token=access_token,
            token_type=credential.get_key("token_type") or "Bearer",
)
return self.format_for_request(token)
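# A sketch of wiring the provider's output into an HTTP call; it assumes the
# config's token_placement resolves to a Bearer Authorization header, that
# httpx is installed, and that the API URL is a placeholder.
def _bearer_request_sketch(provider: BaseOAuth2Provider) -> None:
    import httpx

    try:
        token = provider.client_credentials_grant()
    except OAuth2Error as exc:
        # Surfaces the server's error code/description rather than a bare HTTP failure.
        logger.error(f"Token request rejected: {exc}")
        return
    request_kwargs = provider.format_for_request(token)
    # For HEADER_BEARER placement this is {"headers": {"Authorization": "Bearer <token>"}}.
    response = httpx.get("https://api.example.com/v1/me", **request_kwargs)
    response.raise_for_status()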
@@ -0,0 +1,112 @@
"""
HubSpot-specific OAuth2 provider.
Pre-configured for HubSpot's OAuth2 endpoints and CRM scopes.
Extends BaseOAuth2Provider for HubSpot-specific behavior.
Usage:
provider = HubSpotOAuth2Provider(
client_id="your-client-id",
client_secret="your-client-secret",
)
# Use with credential store
store = CredentialStore(
storage=EncryptedFileStorage(), # defaults to ~/.hive/credentials
providers=[provider],
)
See: https://developers.hubspot.com/docs/api/oauth-quickstart-guide
"""
from __future__ import annotations
import logging
from typing import Any
from ..models import CredentialObject, CredentialType
from .base_provider import BaseOAuth2Provider
from .provider import OAuth2Config
logger = logging.getLogger(__name__)
# HubSpot OAuth2 endpoints
HUBSPOT_TOKEN_URL = "https://api.hubapi.com/oauth/v1/token"
HUBSPOT_AUTHORIZATION_URL = "https://app.hubspot.com/oauth/authorize"
# Default CRM scopes for contacts, companies, and deals
HUBSPOT_DEFAULT_SCOPES = [
"crm.objects.contacts.read",
"crm.objects.contacts.write",
"crm.objects.companies.read",
"crm.objects.companies.write",
"crm.objects.deals.read",
"crm.objects.deals.write",
]
class HubSpotOAuth2Provider(BaseOAuth2Provider):
"""
HubSpot OAuth2 provider with pre-configured endpoints.
Handles HubSpot-specific OAuth2 behavior:
- Pre-configured token and authorization URLs
- Default CRM scopes for contacts, companies, and deals
- Token validation via HubSpot API
Example:
provider = HubSpotOAuth2Provider(
client_id="your-hubspot-client-id",
client_secret="your-hubspot-client-secret",
scopes=["crm.objects.contacts.read"], # Override default scopes
)
"""
def __init__(
self,
client_id: str,
client_secret: str,
scopes: list[str] | None = None,
):
config = OAuth2Config(
token_url=HUBSPOT_TOKEN_URL,
authorization_url=HUBSPOT_AUTHORIZATION_URL,
client_id=client_id,
client_secret=client_secret,
default_scopes=scopes or HUBSPOT_DEFAULT_SCOPES,
)
super().__init__(config, provider_id="hubspot_oauth2")
@property
def supported_types(self) -> list[CredentialType]:
return [CredentialType.OAUTH2]
def validate(self, credential: CredentialObject) -> bool:
"""
Validate HubSpot credential by making a lightweight API call.
Tests the access token against the contacts endpoint with limit=1.
"""
access_token = credential.get_key("access_token")
if not access_token:
return False
try:
client = self._get_client()
response = client.get(
"https://api.hubapi.com/crm/v3/objects/contacts",
headers={
"Authorization": f"Bearer {access_token}",
"Accept": "application/json",
},
params={"limit": "1"},
)
return response.status_code == 200
except Exception:
return False
def _parse_token_response(self, response_data: dict[str, Any]) -> Any:
"""Parse HubSpot token response."""
from .provider import OAuth2Token
return OAuth2Token.from_token_response(response_data)
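# A condensed sketch of the authorization-code handshake with this provider;
# the client values, state, and redirect URI are placeholders, and `code` would
# normally arrive on your OAuth callback endpoint.
def _hubspot_authorization_code_sketch(code: str) -> None:
    provider = HubSpotOAuth2Provider(
        client_id="example-client-id",
        client_secret="example-client-secret",
    )
    redirect_uri = "https://localhost:8000/oauth/callback"
    authorize_url = provider.get_authorization_url(
        state="random-anti-csrf-state",
        redirect_uri=redirect_uri,
    )
    logger.info(f"Send the user to: {authorize_url}")
    token = provider.exchange_code(code, redirect_uri=redirect_uri)
    logger.info(f"Access token expires at: {token.expires_at}")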
@@ -0,0 +1,363 @@
"""
Token lifecycle management for OAuth2 credentials.
This module provides the TokenLifecycleManager which coordinates
automatic token refresh with the credential store.
"""
from __future__ import annotations
import asyncio
import logging
from collections.abc import Callable
from dataclasses import dataclass
from datetime import UTC, datetime, timedelta
from typing import TYPE_CHECKING
from pydantic import SecretStr
from ..models import CredentialKey, CredentialObject, CredentialType
from .base_provider import BaseOAuth2Provider
from .provider import OAuth2Token
if TYPE_CHECKING:
from ..store import CredentialStore
logger = logging.getLogger(__name__)
@dataclass
class TokenRefreshResult:
"""Result of a token refresh operation."""
success: bool
token: OAuth2Token | None = None
error: str | None = None
needs_reauthorization: bool = False
class TokenLifecycleManager:
"""
Manages the complete lifecycle of OAuth2 tokens.
Responsibilities:
- Coordinate with CredentialStore for persistence
- Automatically refresh expired tokens
- Handle refresh failures gracefully
- Provide callbacks for monitoring
This class is useful when you need more control over token management
than the basic auto-refresh in CredentialStore provides.
Usage:
manager = TokenLifecycleManager(
provider=github_provider,
credential_id="github_oauth",
store=credential_store,
)
# Get valid token (auto-refreshes if needed)
token = await manager.get_valid_token()
# Use token
headers = provider.format_for_request(token)
Synchronous usage:
# For synchronous code, use sync_ methods
token = manager.sync_get_valid_token()
"""
def __init__(
self,
provider: BaseOAuth2Provider,
credential_id: str,
store: CredentialStore,
refresh_buffer_minutes: int = 5,
on_token_refreshed: Callable[[OAuth2Token], None] | None = None,
on_refresh_failed: Callable[[str], None] | None = None,
):
"""
Initialize the lifecycle manager.
Args:
provider: OAuth2 provider for token operations
credential_id: ID of the credential in the store
store: Credential store for persistence
refresh_buffer_minutes: Minutes before expiry to trigger refresh
on_token_refreshed: Callback when token is refreshed
on_refresh_failed: Callback when refresh fails
"""
self.provider = provider
self.credential_id = credential_id
self.store = store
self.refresh_buffer = timedelta(minutes=refresh_buffer_minutes)
self.on_token_refreshed = on_token_refreshed
self.on_refresh_failed = on_refresh_failed
# In-memory cache for performance
self._cached_token: OAuth2Token | None = None
self._cache_time: datetime | None = None
# --- Async Token Access ---
async def get_valid_token(self) -> OAuth2Token | None:
"""
Get a valid access token, refreshing if necessary.
This is the main entry point for async code.
Returns:
Valid OAuth2Token or None if unavailable
"""
# Check cache first
if self._cached_token and not self._needs_refresh(self._cached_token):
return self._cached_token
# Load from store
credential = self.store.get_credential(self.credential_id, refresh_if_needed=False)
if credential is None:
return None
# Convert to OAuth2Token
token = self._credential_to_token(credential)
if token is None:
return None
# Refresh if needed
if self._needs_refresh(token):
result = await self._async_refresh_token(credential)
if result.success and result.token:
token = result.token
elif result.needs_reauthorization:
logger.warning(f"Token for {self.credential_id} needs reauthorization")
return None
else:
# Use existing token if still technically valid
if token.is_expired:
return None
logger.warning(f"Refresh failed for {self.credential_id}, using existing token")
self._cached_token = token
self._cache_time = datetime.now(UTC)
return token
async def acquire_token_client_credentials(
self,
scopes: list[str] | None = None,
) -> OAuth2Token:
"""
Acquire a new token using client credentials flow.
For service-to-service authentication.
Args:
scopes: Scopes to request
Returns:
New OAuth2Token
"""
# Run in executor to avoid blocking
        loop = asyncio.get_running_loop()
token = await loop.run_in_executor(
None, lambda: self.provider.client_credentials_grant(scopes=scopes)
)
self._save_token_to_store(token)
self._cached_token = token
return token
async def revoke(self) -> bool:
"""
Revoke tokens and clear from store.
Returns:
True if revocation succeeded
"""
        credential = self.store.get_credential(self.credential_id, refresh_if_needed=False)
        revoked = True
        if credential:
            revoked = self.provider.revoke(credential)
        self.store.delete_credential(self.credential_id)
        self._cached_token = None
        return revoked
# --- Synchronous Token Access ---
def sync_get_valid_token(self) -> OAuth2Token | None:
"""
Synchronous version of get_valid_token().
For use in synchronous code.
"""
# Check cache
if self._cached_token and not self._needs_refresh(self._cached_token):
return self._cached_token
# Load from store
credential = self.store.get_credential(self.credential_id, refresh_if_needed=False)
if credential is None:
return None
token = self._credential_to_token(credential)
if token is None:
return None
# Refresh if needed
if self._needs_refresh(token):
result = self._sync_refresh_token(credential)
if result.success and result.token:
token = result.token
elif result.needs_reauthorization:
logger.warning(f"Token for {self.credential_id} needs reauthorization")
return None
else:
if token.is_expired:
return None
self._cached_token = token
self._cache_time = datetime.now(UTC)
return token
def sync_acquire_token_client_credentials(
self,
scopes: list[str] | None = None,
) -> OAuth2Token:
"""Synchronous version of acquire_token_client_credentials()."""
token = self.provider.client_credentials_grant(scopes=scopes)
self._save_token_to_store(token)
self._cached_token = token
return token
# --- Helper Methods ---
def _needs_refresh(self, token: OAuth2Token) -> bool:
"""Check if token needs refresh."""
if token.expires_at is None:
return False
return datetime.now(UTC) >= (token.expires_at - self.refresh_buffer)
def _credential_to_token(self, credential: CredentialObject) -> OAuth2Token | None:
"""Convert credential to OAuth2Token."""
access_token = credential.get_key("access_token")
if not access_token:
return None
expires_at = None
access_key = credential.keys.get("access_token")
if access_key:
expires_at = access_key.expires_at
return OAuth2Token(
access_token=access_token,
token_type="Bearer",
expires_at=expires_at,
refresh_token=credential.get_key("refresh_token"),
scope=credential.get_key("scope"),
)
def _save_token_to_store(self, token: OAuth2Token) -> None:
"""Save token to credential store."""
credential = CredentialObject(
id=self.credential_id,
credential_type=CredentialType.OAUTH2,
keys={
"access_token": CredentialKey(
name="access_token",
value=SecretStr(token.access_token),
expires_at=token.expires_at,
),
},
provider_id=self.provider.provider_id,
auto_refresh=True,
)
if token.refresh_token:
credential.keys["refresh_token"] = CredentialKey(
name="refresh_token",
value=SecretStr(token.refresh_token),
)
if token.scope:
credential.keys["scope"] = CredentialKey(
name="scope",
value=SecretStr(token.scope),
)
self.store.save_credential(credential)
async def _async_refresh_token(self, credential: CredentialObject) -> TokenRefreshResult:
"""Async wrapper for token refresh."""
        loop = asyncio.get_running_loop()
return await loop.run_in_executor(None, lambda: self._sync_refresh_token(credential))
def _sync_refresh_token(self, credential: CredentialObject) -> TokenRefreshResult:
"""Synchronously refresh token."""
refresh_token = credential.get_key("refresh_token")
if not refresh_token:
return TokenRefreshResult(
success=False,
error="No refresh token available",
needs_reauthorization=True,
)
try:
new_token = self.provider.refresh_access_token(refresh_token)
# Save to store
self._save_token_to_store(new_token)
# Notify callback
if self.on_token_refreshed:
self.on_token_refreshed(new_token)
logger.info(f"Token refreshed for {self.credential_id}")
return TokenRefreshResult(success=True, token=new_token)
except Exception as e:
error_msg = str(e)
# Check for refresh token revocation
if "invalid_grant" in error_msg.lower():
return TokenRefreshResult(
success=False,
error=error_msg,
needs_reauthorization=True,
)
if self.on_refresh_failed:
self.on_refresh_failed(error_msg)
logger.error(f"Token refresh failed for {self.credential_id}: {e}")
return TokenRefreshResult(success=False, error=error_msg)
def invalidate_cache(self) -> None:
"""Clear cached token."""
self._cached_token = None
self._cache_time = None
# --- Convenience Methods ---
def get_request_headers(self) -> dict[str, str]:
"""
Get headers for HTTP request with current token.
Returns empty dict if no valid token.
"""
token = self.sync_get_valid_token()
if token is None:
return {}
result = self.provider.format_for_request(token)
return result.get("headers", {})
def get_request_kwargs(self) -> dict:
"""
Get kwargs for HTTP request (headers, params, etc.).
Returns empty dict if no valid token.
"""
token = self.sync_get_valid_token()
if token is None:
return {}
return self.provider.format_for_request(token)
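# A sketch of driving the lifecycle manager from synchronous code; the provider,
# store, and credential id are assumed to be configured elsewhere, and the API
# URL is a placeholder.
def _lifecycle_manager_sketch(provider: BaseOAuth2Provider, store: CredentialStore) -> None:
    import httpx

    manager = TokenLifecycleManager(
        provider=provider,
        credential_id="github_oauth",
        store=store,
        on_refresh_failed=lambda err: logger.warning(f"Refresh failed: {err}"),
    )
    headers = manager.get_request_headers()
    if not headers:
        logger.warning("No valid token available; re-authorization may be required")
        return
    httpx.get("https://api.example.com/v1/me", headers=headers).raise_for_status()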

Some files were not shown because too many files have changed in this diff.