Align POST /api/sessions behavior across queen-only and one-step worker creation so callers can rely on deterministic session IDs. Add a regression test covering the forwarded session_id contract.
Made-with: Cursor
Pass browser_wait text through Playwright's function argument channel so quoted and multiline strings do not break the generated wait expression. Add a regression test covering text that previously would have been interpolated unsafely.
Made-with: Cursor
Changed HOW_TO_CONTRIBUTE.md back to CONTRIBUTING.md to comply with
GitHub's standard for contributing guidelines files.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Renamed CONTRIBUTING.md to HOW_TO_CONTRIBUTE.md and significantly expanded
the documentation with detailed sections on development setup, OS support,
tooling requirements, performance metrics, and contribution workflows.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Validate that agent names passed to --agents do not contain path
separators. Previously, passing 'exports/my_agent' would result in
the doubled path 'exports/exports/my_agent' with a confusing error.
Now a clear error message is shown suggesting the correct usage.
Fixes#6208
Co-authored-by: nightcityblade <nightcityblade@gmail.com>
Replace individual recipe READMEs with a comprehensive collection of 100 real-world agent prompt examples across marketing, sales, operations, engineering, and finance. This provides users with a broader range of use case inspiration in a single, organized reference document.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Replace individual recipe READMEs with a comprehensive collection of 100 real-world agent prompt examples across marketing, sales, operations, engineering, and finance. This provides users with a broader range of use case inspiration in a single, organized reference document.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Validate that agent names passed to --agents do not contain path
separators. Previously, passing 'exports/my_agent' would result in
the doubled path 'exports/exports/my_agent' with a confusing error.
Now a clear error message is shown suggesting the correct usage.
Fixes#6208
Co-authored-by: nightcityblade <nightcityblade@gmail.com>
Replace bare except Exception: clauses with specific exception handling:
- delete_aden_api_key(): Catch FileNotFoundError, PermissionError at debug
level; log unexpected errors at WARNING with exc_info=True
- _read_credential_key_file(): Catch FileNotFoundError, PermissionError at
debug level; log unexpected errors at WARNING with exc_info=True
- _read_aden_from_encrypted_store(): Catch FileNotFoundError, PermissionError,
KeyError at debug level; log unexpected errors at WARNING with exc_info=True
This makes credential issues easier to diagnose by:
- Logging unexpected errors at WARNING level (visible in production)
- Including full stack traces with exc_info=True
- Keeping expected failures (file not found, permissions) at debug level
Fixes#5931
Tests were asserting the old CLIENT_OUTPUT_DELTA + CLIENT_INPUT_REQUESTED
pattern; the fix in 89ccd66f routes escalations through the queen via
ESCALATION_REQUESTED instead.
Track the errlog file handle opened on non-Windows systems and
properly close it during cleanup to prevent file descriptor leaks.
Changes:
- Add _errlog_handle instance variable to track the file handle
- Store handle reference when opening os.devnull
- Close handle in _cleanup_stdio_async() after other cleanup
- Clear reference in disconnect() for safety
Fixes#6002
The handle_pause endpoint referenced session.mode_state (lines 360-361),
which does not exist on the Session dataclass. This caused an
AttributeError every time the pause endpoint reached the phase transition
step, preventing the queen phase from transitioning to staging and
returning a 500 error to the frontend.
Changed to session.phase_state, consistent with handle_stop (line 412),
handle_run (line 75), and the Session dataclass definition
(session_manager.py line 44).
Turns that call report_to_parent were incorrectly treated as "truly
empty" because the flag was not propagated. Thread it through
_run_single_turn and include it in the empty-turn guard.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add pg_get_table_stats for row counts and size info,
pg_list_indexes for index details, and pg_get_foreign_keys
for relationship discovery with both outgoing and incoming FKs.
Add lusha_bulk_enrich_persons for batch enrichment,
lusha_get_technologies for company tech stack lookup, and
lusha_search_decision_makers for senior contact discovery.
Add s3_copy_object for copying within/between buckets,
s3_get_object_metadata for HEAD-based metadata retrieval, and
s3_generate_presigned_url for temporary access URL generation.
Add pushover_cancel_receipt for stopping emergency retries,
pushover_send_glance for widget data updates, and
pushover_get_limits for checking message usage.
Add news_latest for breaking news without query, news_by_source
for source-filtered articles, and news_by_topic for topic-based
discovery with automatic date ranges.
Add scholar_cited_by for finding papers citing a given paper,
scholar_search_profiles for author profile discovery, and
serpapi_google_search for structured Google web results.
Add exa_search_news, exa_search_papers, and exa_search_companies
convenience wrappers with pre-configured category filters and
automatic date/domain filtering.
- greenhouse_list_offers: GET /offers or /applications/{id}/offers
- greenhouse_add_candidate_note: POST /candidates/{id}/activity_feed/notes
- greenhouse_list_scorecards: GET /applications/{id}/scorecards
- Add _post helper for POST requests
- brevo_list_contacts: GET /contacts with pagination and modified_since filter
- brevo_delete_contact: DELETE /contacts/{email} to remove contacts
- brevo_list_email_campaigns: GET /emailCampaigns with status filter and stats
- quickbooks_list_invoices: query invoices with status/customer filters
- quickbooks_get_customer: GET /customer/{id} with address and contact info
- quickbooks_create_payment: POST /payment with optional invoice linking
- cloudinary_get_usage: GET /usage for storage, bandwidth, transformation limits
- cloudinary_rename_resource: POST /rename to change public_id
- cloudinary_add_tag: POST /tags to add tags to resources
- twitter_get_user_followers: GET /users/{id}/followers with profile details
- twitter_get_tweet_replies: search recent replies via conversation_id
- twitter_get_list_tweets: GET /lists/{id}/tweets with author expansion
- apollo_get_person_activities: GET /activities for contact activity history
- apollo_list_email_accounts: GET /email_accounts for connected sending accounts
- apollo_bulk_enrich_people: POST /people/bulk_match for batch enrichment (up to 10)
- calendly_cancel_event: POST /scheduled_events/{id}/cancellation
- calendly_list_webhooks: GET /webhook_subscriptions for org/user scope
- calendly_get_event_type: GET /event_types/{id} for meeting template details
- Add _post helper for POST requests
- pagerduty_list_oncalls: GET /oncalls with schedule/policy filters
- pagerduty_add_incident_note: POST /incidents/{id}/notes to add notes
- pagerduty_list_escalation_policies: GET /escalation_policies with search
- airtable_delete_records: DELETE records by comma-separated IDs (up to 10)
- airtable_search_records: search records using FIND formula for partial matching
- airtable_list_collaborators: list base collaborators via meta API
- Add _delete helper for DELETE requests
- reddit_get_subreddit_info: GET /r/{name}/about for subscriber count, description
- reddit_get_post_detail: GET /by_id/t3_{id} for full post details with flair, ratios
- reddit_get_user_posts: GET /user/{name}/submitted for user's post history
Add confluence_update_page, confluence_delete_page, and
confluence_get_page_children tools using Confluence REST API v2.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add intercom_close_conversation, intercom_create_contact, and
intercom_list_conversations to Intercom credential spec.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add close_conversation, create_contact, and list_conversations client
methods plus intercom_close_conversation, intercom_create_contact, and
intercom_list_conversations MCP tools using Intercom API v2.11.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add gitlab_update_issue, gitlab_get_merge_request, and
gitlab_create_merge_request_note to GitLab credential spec.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _put helper and gitlab_update_issue, gitlab_get_merge_request,
and gitlab_create_merge_request_note tools using GitLab REST API v4.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add slack_get_channel_info, slack_list_files, and slack_get_file_info
to Slack credential spec.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add get_channel_info, list_files, and get_file_info client methods
plus slack_get_channel_info, slack_list_files, and slack_get_file_info
MCP tools using Slack Web API.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add stripe_list_disputes, stripe_list_events, and
stripe_create_checkout_session to Stripe credential spec.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add list_disputes, list_events, and create_checkout_session client
methods plus stripe_list_disputes, stripe_list_events, and
stripe_create_checkout_session MCP tools using Stripe API.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add linear_cycles_list, linear_issue_comments_list, and
linear_issue_relation_create to Linear credential spec.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add list_cycles, list_issue_comments, and create_issue_relation client
methods plus linear_cycles_list, linear_issue_comments_list, and
linear_issue_relation_create MCP tools using Linear GraphQL API.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add zoom_update_meeting, zoom_list_meeting_participants, and
zoom_list_meeting_registrants to Zoom credential spec.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add zoom_update_meeting (PATCH), zoom_list_meeting_participants
(past meeting attendees), and zoom_list_meeting_registrants
using Zoom REST API v2.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add twilio_list_phone_numbers, twilio_list_calls, and
twilio_delete_message to both Twilio credential specs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add twilio_list_phone_numbers, twilio_list_calls, and
twilio_delete_message tools using Twilio REST API.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add shopify_update_product, shopify_get_customer, and
shopify_create_draft_order to both Shopify credential specs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add shopify_update_product, shopify_get_customer, and
shopify_create_draft_order tools using Shopify Admin REST API.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add zendesk_get_ticket_comments, zendesk_add_ticket_comment, and
zendesk_list_users to all three Zendesk credential specs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add zendesk_get_ticket_comments, zendesk_add_ticket_comment, and
zendesk_list_users tools using Zendesk Support API v2.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Register asana_update_task, asana_add_comment, and
asana_create_subtask in the Asana credential spec.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _put helper and three new Asana MCP tools:
- asana_update_task: modify name, notes, completion, due date, assignee
- asana_add_comment: post comment stories on tasks
- asana_create_subtask: create subtasks under existing tasks
API ref: https://developers.asana.com/docs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three new Trello MCP tools:
- trello_get_card: retrieve full card details with members/checklists/attachments
- trello_create_list: create new lists on boards
- trello_search_cards: full-text search across cards with board scoping
Update credential spec to include the new tool names.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three new GitHub MCP tools:
- github_list_commits: query commits with author/date/branch filters
- github_create_release: create tagged releases with notes and draft support
- github_list_workflow_runs: monitor CI/CD pipeline runs with status filters
Update credential spec to include the new tool names.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _GitHubClient methods for:
- list_commits: GET /repos/{owner}/{repo}/commits with sha/author/date filters
- create_release: POST /repos/{owner}/{repo}/releases with tag, notes, draft
- list_workflow_runs: GET /repos/{owner}/{repo}/actions/runs with filters
API ref: https://docs.github.com/en/rest
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three new Telegram MCP tools:
- telegram_get_chat_member_count: retrieve group/channel membership size
- telegram_send_video: send video files via URL or file_id
- telegram_set_chat_description: update group/channel descriptions
Update credential spec to include the new tool names.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _TelegramClient methods for:
- get_chat_member_count: getChatMemberCount API endpoint
- send_video: sendVideo with caption, parse_mode, duration support
- set_chat_description: setChatDescription for groups/channels
API ref: https://core.telegram.org/bots/api
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Register salesforce_delete_record, salesforce_search_records, and
salesforce_get_record_count in both Salesforce credential specs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three new Salesforce MCP tools:
- salesforce_delete_record: DELETE /sobjects/{type}/{id}
- salesforce_search_records: SOSL full-text search via /search/
- salesforce_get_record_count: efficient COUNT() query for any SObject
API ref: https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three new Discord MCP tools:
- discord_get_channel: retrieve channel metadata (name, topic, type)
- discord_create_reaction: add emoji reactions to messages
- discord_delete_message: remove messages from channels
Update credential spec to include the new tool names.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _DiscordClient methods for:
- get_channel: retrieve channel metadata via GET /channels/{id}
- create_reaction: add emoji reaction via PUT reactions endpoint
- delete_message: remove a message via DELETE messages endpoint
API ref: https://discord.com/developers/docs/resources
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Register notion_update_page, notion_archive_page, and
notion_append_blocks in the Notion credential spec.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three new Notion MCP tools:
- notion_update_page: modify page properties via PATCH /pages/{id}
- notion_archive_page: archive or restore pages
- notion_append_blocks: add paragraphs, headings, lists, todos, etc.
API ref: https://developers.notion.com/reference
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Register jira_update_issue, jira_list_transitions, and
jira_transition_issue in all three Jira credential specs
(domain, email, token).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three new Jira MCP tools:
- jira_update_issue: modify summary, description, priority, labels, assignee
- jira_list_transitions: discover available status transitions for an issue
- jira_transition_issue: move an issue to a new status with optional comment
API ref: https://developer.atlassian.com/cloud/jira/platform/rest/v3/
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add three new MCP tools:
- hubspot_delete_object: archive contacts, companies, or deals
- hubspot_list_associations: query links between CRM objects (v4 API)
- hubspot_create_association: link two CRM records together
Update credential spec to include the new tool names.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _HubSpotClient methods for:
- delete_object: archive a CRM object via DELETE /crm/v3/objects
- list_associations: query associations via GET /crm/v4/objects associations endpoint
- create_association: link two CRM objects via PUT /crm/v4/objects associations endpoint
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The google_sheets.py file defined GOOGLE_SHEETS_CREDENTIALS (an API-key
based credential for reading public sheets via GOOGLE_SHEETS_API_KEY) but
was never wired into the package:
- Never imported in credentials/__init__.py
- Never merged into CREDENTIAL_SPECS
- Never listed in __all__
- Tool never calls credentials.get('google_sheets_key') — uses 'google' (OAuth2)
- Tool names in the spec were stale and did not match actual function names
The 'google' credential in email.py already correctly covers all Google
Sheets tools via OAuth2. This file was dead code with no referencing
imports anywhere in the repository.
Closes#6077
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Resolves inconsistency between CLAUDE.md/AGENTS.md (TUI deprecated) and
docs/roadmap.md (TUI listed as completed feature).
- Strike through TUI items in 3 roadmap sections
- Add deprecation note to TUI-to-GUI upgrade section
- Reference AGENTS.md and hive open as replacement
Fixes#5941
Signed-off-by: Robert Hallers <robert@terplabs.ai>
- Change get_chat client method from httpx.get+params to httpx.post+json
to avoid URL-encoding issues with @username chat IDs
- Remove {"success": True} normalization from delete_message,
send_chat_action, pin_message, and unpin_message MCP tools;
return raw Telegram API response consistently
- Update corresponding test mocks and assertions to match
Add 8 new operations to the Telegram Bot tool, bringing it from 2 to 10
operations. This covers message lifecycle (edit, delete, forward), media
(send photo), chat info (get chat), UX (typing indicators), and pin
management — making the tool practical for agent workflows beyond
fire-and-forget messaging.
New operations:
- telegram_edit_message: edit previously sent messages
- telegram_delete_message: delete messages
- telegram_forward_message: forward between chats
- telegram_send_photo: send photos via URL or file_id
- telegram_send_chat_action: show typing/uploading indicators
- telegram_get_chat: retrieve chat metadata
- telegram_pin_message: pin important messages
- telegram_unpin_message: unpin stale messages
Also includes input validation for chat actions, credential spec updates,
central registry wiring, and 31 new tests (52 total).
Closes#4808
Use is_file() instead of exists() to reject directories, and check
for empty content before passing to json parser. Prevents raw
tracebacks on invalid agent.json inputs.
Fixes#5787
- Replace joined.replace("\n", "\r\n") with re.sub(r"(?<!\r)\n", "\r\n", joined to prevent \r\n in replace op new_content from becoming \r\r\n (fixed in both hashline_edit.py and file_ops.py)
- Track and report skipped large files in grep_search instead of silently skipping them
- Extract HASHLINE_MAX_FILE_BYTES constant to hashline.py as single source of truth, imported by view_file, grep_search, hashline_edit, and file_ops
- Add tests for CRLF replace op (both copies) and large file skip reporting
register_all_tools() now only loads verified (stable) tools by default.
Pass include_unverified=True to also load new/community integrations.
This prevents unverified tools from being loaded in production.
Also fixes duplicate register_brevo and register_pushover calls.
- Remove s3_tool (duplicate of aws_s3_tool), power_bi_tool (duplicate of
powerbi_tool), x_tool (duplicate of twitter_tool)
- Remove integrations/plaid (duplicate of plaid_tool), integrations/sap_s4hana
(duplicate of sap_tool), stray tools/mssql.py
- Add help key to credential error responses across 14 tool modules
- Fix health checker registry keys (calendly -> calendly_pat, lusha -> lusha_api_key)
- Add health_check_endpoint to calendly and lusha credential specs
- Fix Trello env var (TRELLO_TOKEN -> TRELLO_API_TOKEN) and remove duplicate
Trello specs from hubspot.py
- Add credential_group="aws" to AWS S3 and Redshift specs sharing env vars
- Update conftest UNREGISTERED_COMMUNITY_MODULES to only contain mssql_tool
Implements 5 tools via Tines REST API:
- tines_list_stories: List workflow stories with search/filter
- tines_get_story: Get story details including entry/exit agents
- tines_list_actions: List actions (agents) in stories
- tines_get_action: Get action details with sources/receivers
- tines_get_action_logs: Get action execution logs by level
Uses Bearer token auth with tenant domain.
Implements 4 tools via X API v2:
- twitter_search_tweets: Search recent tweets with query operators
- twitter_get_user: Get user profile by username
- twitter_get_user_tweets: Get user timeline
- twitter_get_tweet: Get tweet details by ID
Uses Bearer token auth (app-only, read access).
Implements 5 tools via QuickBooks Online API v3:
- quickbooks_query: Query entities with SQL-like syntax
- quickbooks_get_entity: Get entity by type and ID
- quickbooks_create_customer: Create customers
- quickbooks_create_invoice: Create invoices with line items
- quickbooks_get_company_info: Get company details
Uses OAuth 2.0 Bearer token auth. Supports sandbox mode.
Implements 5 tools via AWS S3 REST API:
- s3_list_buckets: List all buckets in the account
- s3_list_objects: List objects with prefix/delimiter filtering
- s3_get_object: Get object content and metadata
- s3_put_object: Upload text objects
- s3_delete_object: Delete objects
Uses AWS Signature V4 signing (no boto3 dependency).
Implements 5 tools via Calendly API v2:
- calendly_get_current_user: Get user URI and profile info
- calendly_list_event_types: List meeting templates
- calendly_list_scheduled_events: List booked meetings with date filters
- calendly_get_scheduled_event: Get event details by URI
- calendly_list_invitees: List invitees for an event
Uses Bearer token auth (Personal Access Token).
Implements 5 tools via PagerDuty REST API v2:
- pagerduty_list_incidents: List incidents with status/urgency/date filters
- pagerduty_get_incident: Get incident details by ID
- pagerduty_create_incident: Create incidents on a service
- pagerduty_update_incident: Acknowledge or resolve incidents
- pagerduty_list_services: List services with name search
Uses Token auth header, From header for write operations.
Implements 6 tools via Airtable Web API:
- airtable_list_records: List records with filters, sort, field selection
- airtable_get_record: Get a single record by ID
- airtable_create_records: Create up to 10 records per request
- airtable_update_records: Partial update up to 10 records per request
- airtable_list_bases: List accessible bases
- airtable_get_base_schema: Get table and field schema for a base
Uses Bearer token auth (Personal Access Token).
Implements 6 tools via MongoDB Atlas Data API:
- mongodb_find: Find documents with filters, projection, sort, limit
- mongodb_find_one: Find a single document
- mongodb_insert_one: Insert a document
- mongodb_update_one: Update a document with MongoDB operators
- mongodb_delete_one: Delete a document
- mongodb_aggregate: Run aggregation pipelines
Uses API key auth header. All endpoints are POST.
Implements 5 tools via Zendesk Support API v2:
- zendesk_list_tickets: List tickets with status/sort filters
- zendesk_get_ticket: Get ticket details by ID
- zendesk_create_ticket: Create tickets with priority/type/tags
- zendesk_update_ticket: Update ticket fields and add comments
- zendesk_search_tickets: Search tickets with Zendesk query syntax
Uses Basic auth (email/token:api_token).
6 tools: jira_search_issues, jira_get_issue, jira_create_issue,
jira_list_projects, jira_get_project, jira_add_comment. Uses Basic auth
with email + API token and Atlassian Document Format for text fields.
5 tools: cloudinary_upload, cloudinary_list_resources, cloudinary_get_resource,
cloudinary_delete_resource, cloudinary_search. Uses Basic auth with
API key/secret and supports image, video, and raw resource types.
- Implement 6 YouTube API tools (search videos, get video/channel details, list channel videos, get playlist items, search channels)
- Add YOUTUBE_API_KEY credential spec with help_url and description
- Register YouTube tool in tools/__init__.py
- Add comprehensive test coverage (18 tests) with mocking
- Add detailed README with setup instructions and examples
- Use httpx for HTTP requests to YouTube Data API v3
- Verified with real API integration testing
Implements #5603
The grace period logic for client-facing auto-blocks was placed after
_await_user_input(), which blocks forever since no inject_event is
scheduled for text-only turns. This caused test_text_after_user_input
_goes_to_judge to hang indefinitely, blocking CI framework tests.
Move the grace period check before the blocking call so that within
the grace window, auto-blocks with missing outputs skip blocking
entirely and continue to the next LLM turn for judge RETRY pressure.
Also adds an _auto_missing check: nodes with no missing outputs
(e.g. queen monitoring with output_keys=[]) should still block
as their text-only output is legitimate conversation.
Fixes#5633
* fix: enforce 0600 permissions on OAuth token files
Credential files were written with default umask permissions.
Use os.open with explicit 0o600 mode to ensure token files
are always owner-read/write only, regardless of umask.
Fixes#5530
* style: fix line too long in checkpoint_store.py
- Expose page parameter on search_people and search_companies
(client + MCP tool) enabling access beyond the first 50 results
- Add guard requiring at least one filter on both search endpoints
to prevent broad requests that burn API credits
- Add unit tests for pagination and empty filter validation
Add complete SAP S/4HANA integration with:
- Connector for OData API access
- Credential management following Hive patterns
- Unit tests with mocked responses
- Documentation and usage examples
Refs #3182
- Add S3Storage class with upload, download, list, delete operations
- Support IAM roles, environment variables, and credential store
- Implement retry logic with adaptive backoff
- Add MCP tools: s3_upload, s3_download, s3_list, s3_delete, s3_check_credentials
- Include comprehensive tests with moto mocking
- Add documentation for setup and IAM permissions
Closes#3012
- Replace non-existent CLI commands (calculate, interactive, analyze)
with actual commands (run, shell, info) in core/README.md
- Fix test-list argument from <goal_id> to <agent_path> in core/README.md
- Fix misleading docstring on MockProvider.complete_with_tools()
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: hundao <alchemy_wimp@hotmail.com>
* micro-fix: fix wrong credential path and env var in docs
Both docs/configuration.md and docs/environment-setup.md reference a
non-existent ADEN_CREDENTIALS_PATH env var and wrong default path
(~/.aden/credentials). The actual env var is HIVE_CREDENTIAL_KEY and
the default path is ~/.hive/credentials (see storage.py:119,125).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* micro-fix: clarify HIVE_CREDENTIAL_KEY comment wording
Reword comment to avoid implying the env var controls the path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Use Credentials.from_service_account_file() instead of mutating os.environ
- Remove unused dimensions param from _format_report_response
- Remove unused metrics param from _format_realtime_response
- Extract duplicated property_id/limit validation into _validate_inputs helper
- Add credential_group="google_cloud" to GA and BigQuery specs
- Update tests to mock Credentials class
Add read-only GA4 Data API v1 tools: ga_run_report, ga_get_realtime,
ga_get_top_pages, and ga_get_traffic_sources. Includes credential spec,
unit tests, and README.
Add IntercomHealthChecker (subclass of OAuthBearerHealthChecker) and
register it in HEALTH_CHECKERS so the credential registry completeness
test passes in CI.
- Pass assignee_type from intercom_assign_conversation tool function
through to _IntercomClient.assign_conversation() and into the API payload
- Add tests for assignee_type="team" passthrough at client and tool levels
- Add tool README with setup, usage examples, and error handling
Addresses PR #5171 review feedback from @bryanadenhq
Client-facing nodes auto-block on text-only turns (wait for user input).
This breaks weak models (Codex) that output text like "Understood" instead
of calling tools after user responds.
Add _cf_expecting_work state: after user input, text-only turns with
missing output keys skip auto-block and go to judge, which pushes the
LLM to call set_output. Tool calls reset the state back to presenting
mode (auto-block on next text-only).
No behavioral change for strong models (they always call tools after
user input, so the new code path is never triggered).
Judge feedback was saying "Use set_output tool to provide them" which
caused Codex to skip all work and call set_output directly. Changed to
"Follow your system prompt instructions to complete the work."
* Enhance ToolRegistry type inference for function parameters
- Add _infer_schema() helper to handle Union types (Union[T, U] and T | U)
- Support Optional[T] and Union[T, None] with correct optional flag
- Infer generic types: list[T] -> array with items schema, dict[K, V] -> object with additionalProperties
- Detect Pydantic BaseModel parameters and use model_json_schema()
- Correctly mark parameters as required/optional based on type annotations
- Add comprehensive test suite covering all type inference scenarios
- Maintain backward compatibility for unannotated parameters
* Fix asyncio.run crash in GraphBuilder.run_test
* Revert "Enhance ToolRegistry type inference for function parameters"
This reverts commit dacd0fa8b926e01d3f29e7c9b2ff5101b4a52c3b.
* feat(arxiv): implement search_papers and initial download_paper tools
* feat(arxiv): improve PDF download handling with temp files and validation (WIP)
Switch to NamedTemporaryFile for safer temp file handling
Force export.arxiv.org domain for PDF downloads
Add custom User-Agent header
Validate Content-Type to ensure PDF response
Improve error handling and cleanup logic
Add timeout to requests
Work in progress – download_paper still under refinement.
* feat(arxiv): replace NamedTemporaryFile with module-level TemporaryDirectory
Switch from NamedTemporaryFile(delete=False) to a shared _TEMP_DIR for
the lifetime of the server process. Scopes file lifetime to the session,
guarantees cleanup via atexit, and removes the need for manual file
handle management.
Expand README with full args/returns/error reference and implementation
notes explaining the temp storage design decision.
* test(arxiv): add comprehensive tests for search_papers and download_paper
fix(arxiv): return structured error instead of raising on invalid PDF content type
- Add full test coverage for search_papers (validation, success, id_list, errors)
- Add full test coverage for download_paper (success, network errors, invalid content, cleanup)
- Mock arxiv client and requests to isolate behavior
- Ensure partial files are cleaned up on failure
- Align download_paper behavior with tool contract (no exceptions, structured responses)
* style(tools): apply ruff formatting to arxiv tool and update lockfile
- Adds a new autonomous agent template that monitors competitor websites, news, and GitHub
- Implements a 7-node graph workflow to collect, aggregate, and analyze competitive data
- Generates a weekly structured HTML digest with key highlights and 30-day trends
- Utilizes existing web_scrape, web_search, and github MCP tools
- Addresses issue #4153Closes#4153
* docs: fix CLI usage args for test-debug/test-list to match implementation
* docs: restore 'uv run' prefix to test commands
Reverts unintentional removal of 'uv run' in usage examples as requested in code review.
* chore: changes to .gitignore
* perf(json): add json.loads fast path + asyncio.to_thread for extract_json
Addresses maintainer feedback:
- json.loads candidate fast path in find_json_object (300x speedup source)
- asyncio.to_thread wrappers for both _extract_json call sites (unblocks event loop)
- Remove ~480 lines of over-engineered incremental parsing logic
Total: ~16 lines, zero duplication, zero API surface change
* fix: simplify async JSON handling per maintainer feedback and align tests
* fix(test): replace tautology assertion in test_mismatched_then_valid
The original assertion `assert result is not None or result is None`
is always true. Replace with a meaningful type check.
---------
Co-authored-by: hundao <alchemy_wimp@hotmail.com>
- Add brevo_tool with 6 MCP tools: brevo_send_email, brevo_send_sms,
brevo_create_contact, brevo_get_contact, brevo_update_contact,
brevo_get_email_stats
- Add CredentialSpec for BREVO_API_KEY in credentials/brevo.py
- Register brevo_tool in tools/__init__.py and credentials/__init__.py
- Add README with setup instructions and usage examples
- Add 34 unit tests covering all tools, validation and error handling
Closes#5127
- Bump framework version 0.5.0 → 0.5.1
- Add CHANGELOG.md with full release notes
Highlights: Hive Coder meta-agent, multi-graph runtime, TUI revamp,
subscription model support, 5 new tool integrations, deprecated node
type removal.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Bump framework version 0.5.0 → 0.5.1
- Add CHANGELOG.md with full release notes
Highlights: Hive Coder meta-agent, multi-graph runtime, TUI revamp,
subscription model support, 5 new tool integrations, deprecated node
type removal.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously the table listed ~20 of ~50 available tools. This expands
it to cover all tools, grouped into categories: File System, Data Files,
Web & Search, Communication, Productivity & CRM, Cloud & APIs,
Security, and Utilities.
All tool names verified against registered @mcp.tool() functions in source.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- search_people: replaced freetext searchText concatenation with proper
structured Lusha API filters (jobTitles, seniority as list[int],
departments, locations as dict, company_names, industry_ids, search_text)
- search_companies: added locations, company_names, search_text params;
made all params optional for flexible queries
- Pagination: exposed limit param (clamped 10-50 per Lusha API constraints)
on both search tools, replacing hardcoded size=25
- get_signals: changed ids from list[str] to list[int], removed internal
str-to-int conversion as Lusha IDs are always numeric
- seniority type corrected to list[int] (API rejects string-encoded values
despite OpenAPI spec suggesting strings — verified via live integration)
- Unit tests updated for all changes (19/19 pass)
Verified against live Lusha API: all 6 tools return correct responses.
- search_companies: replace names filter with mainIndustriesIds (numeric
industry IDs) per Lusha API schema. Parameter changed from
industry: str to industry_ids: list[int] | None.
- _get_api_key: return None instead of raising TypeError on unexpected
credential type. Lets _get_client handle it with the standard error dict
pattern used across all tools.
- Updated unit tests for new industry_ids parameter and added test for
non-string credential handling.
Lusha API rejects filters.companies.include.searchText (HTTP 400).
Replaced with valid 'names' field in search_companies and removed
redundant company searchText from search_people. Updated unit tests.
Implements AI-powered web search, content extraction, and research tools
via the Exa API for agent workflows.
Tools: exa_search, exa_find_similar, exa_get_contents, exa_answer
Follows existing tool pattern (web_search_tool, hubspot_tool, slack_tool):
- register_tools(mcp, credentials) with @mcp.tool() decorators
- Credential fallback: CredentialStoreAdapter -> EXA_API_KEY env var
- Error handling: always returns dicts, never raises
- Retry with exponential backoff on HTTP 429
Includes:
- Neural/keyword search with domain, date, and category filters
- Similar page discovery via neural embeddings
- Content extraction from up to 10 URLs per request
- Citation-backed answer generation
- CredentialSpec in credentials/search.py
- Comprehensive unit tests (21 tests)
- 500/500 integration CI tests passing
Fixes#4177
- Fix export endpoint: /Export -> /ExportTo
- Add 202 Accepted response handling
- Add notifyOption to refresh_dataset API call
- Rename format parameter to export_format (avoid shadowing builtin)
- Add PNG support to export formats
- All critical API issues from review addressed
Terminals without extended key reporting (VS Code, Cursor) send
identical events for Enter and Shift+Enter, making it impossible
to insert newlines. Ctrl+J produces a distinct key event in all
terminals.
* feat(tools): add Razorpay payment processing integration
Add Razorpay MCP tool integration for payment processing, invoicing,
and refund management. Implements 6 MCP tools:
- razorpay_list_payments: List recent payments with filters (pagination, date range)
- razorpay_get_payment: Fetch detailed payment information by ID
- razorpay_create_payment_link: Create one-time payment links with shareable URLs
- razorpay_list_invoices: List invoices with status and type filtering
- razorpay_get_invoice: Fetch invoice details including line items
- razorpay_create_refund: Create full or partial refunds for payments
Features:
- Authentication via HTTP Basic Auth (RAZORPAY_API_KEY + RAZORPAY_API_SECRET)
- Credential spec in dedicated razorpay.py (follows repo pattern)
- Comprehensive error handling (401, 403, 404, 400, 429, 500, timeouts)
- Input validation (payment IDs, invoice IDs, amounts, currencies)
- Full test coverage (42 unit tests, 26 integration tests)
Closes#4404
* style: fix ruff I001 import order and W291 in tools
* fix: improve Razorpay credential tracking and validation
- Add razorpay_secret CredentialSpec with credential_group
- Fix amount=0 bug by using 'is not None' checks
- Add regex validation for payment/invoice IDs
* fix: use graceful credential handling instead of raising TypeError
Match codebase convention (calcom, lusha) - return None for non-string
credentials instead of raising TypeError, so the tool returns an error
dict instead of crashing.
---------
Co-authored-by: hundao <alchemy_wimp@hotmail.com>
Correct hyphenation and spelling in the product roadmap: change 'outcome oriented' to 'outcome-oriented' and fix 'Workder' to 'Worker' in the Deployment section.
Implements comprehensive Apify integration for web scraping and automation:
- Added 4 new tools: apify_run_actor, apify_get_dataset, apify_get_run, apify_search_actors
- Credential management for APIFY_API_TOKEN with health check
- Support for synchronous (wait=True) and asynchronous (wait=False) actor execution
- Actor ID validation and comprehensive error handling
- Full test coverage (26 tests passing)
- README with usage examples and documentation
Addresses #4510
Add Zoho CRM MCP integration for lead/contact/account/deal workflows with notes support. Implements 5 MCP tools:
- zoho_crm_search: Search Leads/Contacts/Accounts/Deals by criteria or word with pagination
- zoho_crm_get_record: Fetch a single record by module and ID
- zoho_crm_create_record: Create records with pass-through field payloads
- zoho_crm_update_record: Update records by ID with partial field payloads
- zoho_crm_add_note: Create notes linked to CRM records via Parent_Id mapping
Features:
- Zoho OAuth2 provider added in core credentials (refresh-token flow)
- Zoho auth format: Authorization: Zoho-oauthtoken <token>
- Region/DC-aware routing using accounts domain/region + api_domain usage
- Persisted DC metadata on refresh (api_domain/accounts_domain/location)
- Credential spec and health check registration for zoho_crm
- Tool registration and allowed-tool list updates
- Normalized tool responses with retriable 429 handling
- README with setup, auth modes, usage, and testing instructions
- Comprehensive unit/integration coverage updates for tool, provider, and health checks
Validation:
- Scoped ruff lint/format checks passed
- Targeted test suite passed: 563 passed, 18 skipped
Closes#4418
- Keep both APOLLO_CREDENTIALS and AIRTABLE_CREDENTIALS
- Keep both apollo_tool and airtable_tool imports (alphabetical)
Co-authored-by: Cursor <cursoragent@cursor.com>
- Keep both APOLLO_CREDENTIALS and CALENDLY_CREDENTIALS
- Keep both apollo_tool and calendly_tool imports (alphabetical)
Co-authored-by: Cursor <cursoragent@cursor.com>
- Add 429 handling with retry_after from Retry-After header
- Add _request_with_retry (2 retries) for all API calls
- Update tests to use httpx.request
Co-authored-by: Cursor <cursoragent@cursor.com>
- Add 429 handling with retry_after from Retry-After header
- Add _request_with_retry (2 retries) for all API calls
- Validate get_availability date range <= 7 days
- Update tests to use httpx.request
Co-authored-by: Cursor <cursoragent@cursor.com>
Reorganized imports in __init__.py for clarity and consistency. Cleaned up formatting and comments in wikipedia_tool.py. Enhanced test_wikipedia_tool.py by improving patching targets, clarifying comments, and refining test structure for better maintainability.
Reorders imports in tools/__init__.py for clarity and groups web and PDF tools together. Updates Wikipedia tool tests to patch httpx.get using the correct import path, ensuring mocks work as intended. Removes unnecessary print statement in Wikipedia tool error handling.
Introduces a new 'search_wikipedia' tool for searching Wikipedia and retrieving article summaries using the public Wikipedia REST API. Updates documentation and tool registration, and adds unit tests for the new tool.
Implements Reddit API integration for community management and content monitoring.
Features:
- Search & Monitoring: search posts/comments, get subreddit feeds (new/hot), get posts/comments (6 tools)
- Content Creation: submit posts, reply, edit, delete comments (5 tools)
- User Engagement: get profiles, upvote, downvote, save posts (4 tools)
- Moderation: remove/approve posts, ban users (3 tools)
Implementation:
- OAuth 2.0 authentication via REDDIT_CREDENTIALS
- PRAW library for Reddit API integration
- Comprehensive error handling and validation
- Full test coverage (25 tests passing)
Resolves#3595
Add test_x_credentials_share_credential_group to verify all X credentials
share the 'x' credential group. Update test_credential_group_default_empty
to account for X credentials alongside existing Google exceptions.
- Added `x_send_dm` tool using v2 endpoint (`POST /dm_conversations/with/:id/messages`) for reliable 1:1 messaging.
- Fixed 403 Forbidden payload validation errors by simplifying DM payload structure.
- Enhanced `_handle_response` to verify `x_tool.py` returns raw API error details for 403/400 responses, aiding in permission debugging.
- Updated `demo_x_tools.py` to support standard `.env` variable names (e.g., `X_API_KEY`) and added user lookup for DM testing.
- Added unit tests covering new DM functionality and payload verification in `test_x_tool.py`.
- Audited credential handling: Read-only tools (Search/Mentions) correctly use Bearer Token, while Write tools (Post/Reply/Delete/DM) enforce OAuth 1.0a User Context.
Verified with live API tests (see PR description for logs).
- Implement 5 core functions for data warehouse querying
- Add boto3 integration with Redshift Data API
- Security: Read-only SELECT queries by default
- Full credential store support
- 26/26 tests passing (100% coverage)
- Complete documentation with examples
description: Core concepts for goal-driven agents - architecture, node types (event_loop, function), tool discovery, and workflow overview. Use when starting agent development or need to understand agent fundamentals.
license: Apache-2.0
metadata:
author: hive
version: "2.0"
type: foundational
part_of: hive
---
# Building Agents - Core Concepts
Foundational knowledge for building goal-driven agents as Python packages.
-`event_loop` — Multi-turn streaming loop with tool execution and judge-based evaluation. Works with or without tools.
-`function` — Deterministic Python operations. No LLM involved.
```python
search_node=NodeSpec(
id="search-web",
name="Search Web",
description="Search for information and extract results",
node_type="event_loop",
input_keys=["query"],
output_keys=["search_results"],
system_prompt="Search the web for: {query}. Use the web_search tool to find results, then call set_output to store them.",
tools=["web_search"],
)
```
**NodeSpec Fields for Event Loop Nodes:**
| Field | Default | Description |
|-------|---------|-------------|
| `client_facing` | `False` | If True, streams output to user and blocks for input between turns |
| `nullable_output_keys` | `[]` | Output keys that may remain unset (for mutually exclusive outputs) |
| `max_node_visits` | `1` | Max times this node executes per run. Set >1 for feedback loop targets |
### Edge
Connection between nodes (written to agent.py)
**Edge Conditions:**
-`on_success` — Proceed if node succeeds (most common)
-`on_failure` — Handle errors
-`always` — Always proceed
-`conditional` — Based on expression evaluating node output
**Edge Priority:**
Priority controls evaluation order when multiple edges leave the same node. Higher priority edges are evaluated first. Use negative priority for feedback edges (edges that loop back to earlier nodes).
```python
# Forward edge (evaluated first)
EdgeSpec(
id="review-to-campaign",
source="review",
target="campaign-builder",
condition=EdgeCondition.CONDITIONAL,
condition_expr="output.get('approved_contacts') is not None",
priority=1,
)
# Feedback edge (evaluated after forward edges)
EdgeSpec(
id="review-feedback",
source="review",
target="extractor",
condition=EdgeCondition.CONDITIONAL,
condition_expr="output.get('redo_extraction') is not None",
priority=-1,
)
```
### Client-Facing Nodes
For multi-turn conversations with the user, set `client_facing=True` on a node. The node will:
- Stream its LLM output directly to the end user
- Block for user input between conversational turns
- Resume when new input is injected via `inject_event()`
```python
intake_node=NodeSpec(
id="intake",
name="Intake",
description="Gather requirements from the user",
node_type="event_loop",
client_facing=True,
input_keys=[],
output_keys=["repo_url","project_url"],
system_prompt="You are the intake agent. Ask the user for the repo URL and project URL.",
)
```
> **Legacy Note:** The old `pause_nodes` / `entry_points` pattern still works but `client_facing=True` is preferred for new agents.
**STEP 1 / STEP 2 Prompt Pattern:** For client-facing nodes, structure the system prompt with two explicit phases:
```python
system_prompt="""\
**STEP 1 — Respond to the user (text only, NO tool calls):**
[Present information, ask questions, etc.]
**STEP 2 — After the user responds, call set_output:**
[Call set_output with the structured outputs]
"""
```
This prevents the LLM from calling `set_output` prematurely before the user has had a chance to respond.
### Node Design: Fewer, Richer Nodes
Prefer fewer nodes that do more work over many thin single-purpose nodes:
Why: Each node boundary requires serializing outputs and passing context. Fewer nodes means the LLM retains full context of its work within the node. A research node that searches, fetches, and analyzes keeps all the source material in its conversation history.
### nullable_output_keys for Cross-Edge Inputs
When a node receives inputs that only arrive on certain edges (e.g., `feedback` only comes from a review → research feedback loop, not from intake → research), mark those keys as `nullable_output_keys`:
```python
research_node=NodeSpec(
id="research",
input_keys=["research_brief","feedback"],
nullable_output_keys=["feedback"],# Not present on first visit
max_node_visits=3,
...
)
```
## Event Loop Architecture Concepts
### How EventLoopNode Works
An event loop node runs a multi-turn loop:
1. LLM receives system prompt + conversation history
2. LLM responds (text and/or tool calls)
3. Tool calls are executed, results added to conversation
5. Repeat until judge ACCEPTs or max_iterations reached
### EventLoopNode Runtime
EventLoopNodes are **auto-created** by `GraphExecutor` at runtime. You do NOT need to manually register them. Both `GraphExecutor` (direct) and `AgentRuntime` / `create_agent_runtime()` handle event_loop nodes automatically.
```python
# Direct execution — executor auto-creates EventLoopNodes
Nodes produce structured outputs by calling `set_output(key, value)` — a synthetic tool injected by the framework. When the LLM calls `set_output`, the value is stored in the output accumulator and made available to downstream nodes via shared memory.
`set_output` is NOT a real tool — it is excluded from `real_tool_results`. For client-facing nodes, this means a turn where the LLM only calls `set_output` (no other tools) is treated as a conversational boundary and will block for user input.
### JudgeProtocol
**The judge is the SOLE mechanism for acceptance decisions.** Do not add ad-hoc framework gating, output rollback, or premature rejection logic. If the LLM calls `set_output` too early, fix it with better prompts or a custom judge — not framework-level guards.
The judge controls when a node's loop exits:
- **Implicit judge** (default, no judge configured): ACCEPTs when the LLM finishes with no tool calls and all required output keys are set
- **SchemaJudge**: Validates outputs against a Pydantic model
When tool results exceed the context window, the framework automatically saves them to a spillover directory and truncates with a hint. Nodes that produce or consume large data should include the data tools:
-`save_data(filename, data)` — Write data to a file in the data directory
-`load_data(filename, offset=0, limit=50)` — Read data with line-based pagination
-`list_data_files()` — List available data files
-`serve_file_to_user(filename, label="")` — Get a clickable file:// URI for the user
Note: `data_dir` is a framework-injected context parameter — the LLM never sees or passes it. `GraphExecutor.execute()` sets it per-execution via `contextvars`, so data tools and spillover always share the same session-scoped directory.
These are real MCP tools (not synthetic). Add them to nodes that handle large tool results:
Multiple ON_SUCCESS edges from the same source create parallel execution. All branches run concurrently via `asyncio.gather()`. Parallel event_loop nodes must have disjoint `output_keys`.
### max_node_visits
Controls how many times a node can execute in one graph run. Default is 1. Set higher for nodes that are targets of feedback edges (review-reject loops). Set 0 for unlimited (guarded by max_steps).
## Tool Discovery & Validation
**CRITICAL:** Before adding a node with tools, you MUST verify the tools exist.
Tools are provided by MCP servers. Never assume a tool exists - always discover dynamically.
### Step 1: Register MCP Server (if not already done)
description: Set up and install credentials for an agent. Detects missing credentials from agent config, collects them from the user, and stores them securely in the local encrypted store at ~/.hive/credentials.
license: Apache-2.0
metadata:
author: hive
version: "2.3"
type: utility
---
# Setup Credentials
Interactive credential setup for agents with multiple authentication options. Detects what's missing, offers auth method choices, validates with health checks, and stores credentials securely.
## When to Use
- Before running or testing an agent for the first time
- When `AgentRunner.run()` fails with "missing required credentials"
- When a user asks to configure credentials for an agent
- After building a new agent that uses tools requiring API keys
## Workflow
### Step 1: Identify the Agent
Determine which agent needs credentials. The user will either:
- Name the agent directly (e.g., "set up credentials for hubspot-agent")
- Have an agent directory open (check `exports/` for agent dirs)
- Be working on an agent in the current session
Locate the agent's directory under `exports/{agent_name}/`.
### Step 2: Detect Missing Credentials
Use the `check_missing_credentials` MCP tool to detect what the agent needs and what's already configured. This tool loads the agent, inspects its required tools and node types, maps them to credentials via `CREDENTIAL_SPECS`, and checks both the encrypted store and environment variables.
"description":"Brave Search API key for web search",
"help_url":"https://brave.com/search/api/",
"tools":["web_search"]
}
],
"available":[
{
"credential_name":"anthropic",
"env_var":"ANTHROPIC_API_KEY",
"source":"encrypted_store"
}
],
"total_missing":1,
"ready":false
}
```
**If `ready` is true (nothing missing):** Report all credentials as configured and skip Steps 3-5. Example:
```
All required credentials are already configured:
✓ anthropic (ANTHROPIC_API_KEY)
✓ brave_search (BRAVE_SEARCH_API_KEY)
Your agent is ready to run!
```
**If credentials are missing:** Continue to Step 3 with the `missing` list.
### Step 3: Present Auth Options for Each Missing Credential
For each missing credential, check what authentication methods are available:
```python
fromaden_tools.credentialsimportCREDENTIAL_SPECS
spec=CREDENTIAL_SPECS.get("hubspot")
ifspec:
# Determine available auth options
auth_options=[]
ifspec.aden_supported:
auth_options.append("aden")
ifspec.direct_api_key_supported:
auth_options.append("direct")
auth_options.append("custom")# Always available
# Get setup info
setup_info={
"env_var":spec.env_var,
"description":spec.description,
"help_url":spec.help_url,
"api_key_instructions":spec.api_key_instructions,
}
```
Present the available options using AskUserQuestion:
```
Choose how to configure HUBSPOT_ACCESS_TOKEN:
1) Aden Platform (OAuth) (Recommended)
Secure OAuth2 flow via hive.adenhq.com
- Quick setup with automatic token refresh
- No need to manage API keys manually
2) Direct API Key
Enter your own API key manually
- Requires creating a HubSpot Private App
- Full control over scopes and permissions
3) Local Credential Setup (Advanced)
Programmatic configuration for CI/CD
- For automated deployments
- Requires manual API calls
```
### Step 4: Execute Auth Flow Based on User Choice
#### Prerequisite: Ensure HIVE_CREDENTIAL_KEY Is Available
Before storing any credentials, verify `HIVE_CREDENTIAL_KEY` is set (needed to encrypt/decrypt the local store). Check both the current session and shell config:
```bash
# Check current session
printenv HIVE_CREDENTIAL_KEY > /dev/null 2>&1&&echo"session: set"||echo"session: not set"
# Check shell config files
for f in ~/.zshrc ~/.bashrc ~/.profile;do[ -f "$f"]&& grep -q 'HIVE_CREDENTIAL_KEY'"$f"&&echo"$f";done
```
- **In current session** — proceed to store credentials
- **In shell config but NOT in current session** — run `source ~/.zshrc` (or `~/.bashrc`) first, then proceed
- **Not set anywhere** — `EncryptedFileStorage` will auto-generate one. After storing, tell the user to persist it: `export HIVE_CREDENTIAL_KEY="{generated_key}"` in their shell profile
> **⚠️ IMPORTANT: After adding `HIVE_CREDENTIAL_KEY` to the user's shell config, always display:**
> ```
> ⚠️ Environment variables were added to your shell config.
> Open a NEW TERMINAL for them to take effect outside this session.
> ```
#### Option 1: Aden Platform (OAuth)
This is the recommended flow for supported integrations (HubSpot, etc.).
**How Aden OAuth Works:**
The ADEN_API_KEY represents a user who has already completed OAuth authorization on Aden's platform. When users sign up and connect integrations on Aden, those OAuth tokens are stored server-side. Having an ADEN_API_KEY means:
1. User has an Aden account
2. User has already authorized integrations (HubSpot, etc.) via OAuth on Aden
3. We just need to sync those credentials down to the local credential store
**4.1a. Check for ADEN_API_KEY**
```python
importos
aden_key=os.environ.get("ADEN_API_KEY")
```
If not set, guide user to get one from Aden (this is where they do OAuth):
The local encrypted store requires `HIVE_CREDENTIAL_KEY` to encrypt/decrypt credentials.
- If the user doesn't have one, `EncryptedFileStorage` will auto-generate one and log it
- The user MUST persist this key (e.g., in `~/.bashrc`/`~/.zshrc` or a secrets manager)
- Without this key, stored credentials cannot be decrypted
**Shell config rule:** Only TWO keys belong in shell config (`~/.zshrc`/`~/.bashrc`):
-`HIVE_CREDENTIAL_KEY` — encryption key for the credential store
-`ADEN_API_KEY` — Aden platform auth key (needed before the store can sync)
All other API keys (Brave, Google, HubSpot, etc.) must go in the encrypted store only. **Never offer to add them to shell config.**
If `HIVE_CREDENTIAL_KEY` is not set:
1. Let the store generate one
2. Tell the user to save it: `export HIVE_CREDENTIAL_KEY="{generated_key}"`
3. Recommend adding it to `~/.bashrc` or their shell profile
## Security Rules
- **NEVER** log, print, or echo credential values in tool output
- **NEVER** store credentials in plaintext files, git-tracked files, or agent configs
- **NEVER** hardcode credentials in source code
- **NEVER** offer to save API keys to shell config (`~/.zshrc`/`~/.bashrc`) — the **only** keys that belong in shell config are `HIVE_CREDENTIAL_KEY` and `ADEN_API_KEY`. All other credentials (Brave, Google, HubSpot, GitHub, Resend, etc.) go in the encrypted store only.
- **ALWAYS** use `SecretStr` from Pydantic when handling credential values in Python
- **ALWAYS** use the local encrypted store (`~/.hive/credentials`) for persistence
- **ALWAYS** run health checks before storing credentials (when possible)
- **ALWAYS** verify credentials were stored by re-running validation, not by reading them back
- When modifying `~/.bashrc` or `~/.zshrc`, confirm with the user first
## Credential Sources Reference
All credential specs are defined in `tools/src/aden_tools/credentials/`:
description: Best practices, patterns, and examples for building goal-driven agents. Includes client-facing interaction, feedback edges, judge patterns, fan-out/fan-in, context management, and anti-patterns.
license: Apache-2.0
metadata:
author: hive
version: "2.0"
type: reference
part_of: hive
---
# Building Agents - Patterns & Best Practices
Design patterns, examples, and best practices for building robust goal-driven agents.
**Prerequisites:** Complete agent structure using `hive-create`.
## Practical Example: Hybrid Workflow
How to build a node using both direct file writes and optional MCP validation:
```python
# 1. WRITE TO FILE FIRST (Primary - makes it visible)
node_code='''
search_node = NodeSpec(
id="search-web",
node_type="event_loop",
input_keys=["query"],
output_keys=["search_results"],
system_prompt="Search the web for: {query}. Use web_search, then call set_output to store results.",
- Immediately sees node in their editor (from step 1)
- Gets validation feedback (from step 2)
- Can edit the file directly if needed
## Multi-Turn Interaction Patterns
For agents needing multi-turn conversations with users, use `client_facing=True` on event_loop nodes.
### Client-Facing Nodes
A client-facing node streams LLM output to the user and blocks for user input between conversational turns. This replaces the old pause/resume pattern.
```python
# Client-facing node with STEP 1/STEP 2 prompt pattern
intake_node=NodeSpec(
id="intake",
name="Intake",
description="Gather requirements from the user",
node_type="event_loop",
client_facing=True,
input_keys=["topic"],
output_keys=["research_brief"],
system_prompt="""\
You are an intake specialist.
**STEP 1 — Read and respond (text only, NO tool calls):**
1. Read the topic provided
2. If it's vague, ask 1-2 clarifying questions
3. If it's clear, confirm your understanding
**STEP 2 — After the user confirms, call set_output:**
- set_output("research_brief", "Clear description of what to research")
""",
)
# Internal node runs without user interaction
research_node=NodeSpec(
id="research",
name="Research",
description="Search and analyze sources",
node_type="event_loop",
input_keys=["research_brief"],
output_keys=["findings","sources"],
system_prompt="Research the topic using web_search and web_scrape...",
- Client-facing nodes stream LLM text to the user and block for input after each response
- User input is injected via `node.inject_event(text)`
- When the LLM calls `set_output` to produce structured outputs, the judge evaluates and ACCEPTs
- Internal nodes (non-client-facing) run their entire loop without blocking
-`set_output` is a synthetic tool — a turn with only `set_output` calls (no real tools) triggers user input blocking
**STEP 1/STEP 2 pattern:** Always structure client-facing prompts with explicit phases. STEP 1 is text-only conversation. STEP 2 calls `set_output` after user confirmation. This prevents the LLM from calling `set_output` prematurely before the user responds.
| Gathering user requirements | Yes | Need user input |
| Human review/approval checkpoint | Yes | Need human decision |
| Data processing (scanning, scoring) | No | Runs autonomously |
| Report generation | No | No user input needed |
| Final confirmation before action | Yes | Need explicit approval |
> **Legacy Note:** The `pause_nodes` / `entry_points` pattern still works for backward compatibility but `client_facing=True` is preferred for new agents.
## Edge-Based Routing and Feedback Loops
### Conditional Edge Routing
Multiple conditional edges from the same source replace the old `router` node type. Each edge checks a condition on the node's output.
system_prompt="Present the contact list to the operator. If they approve, call set_output('approved_contacts', ...). If they want changes, call set_output('redo_extraction', 'true').",
| Loop-back on rejection | Manual entry_points | Feedback edge with `priority=-1` |
| Multi-way routing | Router with routes dict | Multiple conditional edges with priorities |
## Judge Patterns
**Core Principle: The judge is the SOLE mechanism for acceptance decisions.** Never add ad-hoc framework gating to compensate for LLM behavior. If the LLM calls `set_output` prematurely, fix the system prompt or use a custom judge. Anti-patterns to avoid:
- Output rollback logic
-`_user_has_responded` flags
- Premature set_output rejection
- Interaction protocol injection into system prompts
Judges control when an event_loop node's loop exits. Choose based on validation needs.
### Implicit Judge (Default)
When no judge is configured, the implicit judge ACCEPTs when:
- The LLM finishes its response with no tool calls
- All required output keys have been set via `set_output`
Best for simple nodes where "all outputs set" is sufficient validation.
### SchemaJudge
Validates outputs against a Pydantic model. Use when you need structural validation.
```python
frompydanticimportBaseModel
classScannerOutput(BaseModel):
github_users:list[dict]# Must be a list of user objects
3.**Aggressive compaction** — Keeps only recent messages + summary
4.**Emergency** — Hard reset with tool history preservation
### Spillover Pattern
The framework automatically truncates large tool results and saves full content to a spillover directory. The LLM receives a truncation message with instructions to use `load_data` to read the full result.
For explicit data management, use the data tools (real MCP tools, not synthetic):
```python
# save_data, load_data, list_data_files, serve_file_to_user are real MCP tools
# data_dir is auto-injected by the framework — the LLM never sees it
`data_dir` is a framework context parameter — auto-injected at call time. `GraphExecutor.execute()` sets it per-execution via `ToolRegistry.set_execution_context(data_dir=...)` (using `contextvars` for concurrency safety), ensuring it matches the session-scoped spillover directory.
## Anti-Patterns
### What NOT to Do
- **Don't rely on `export_graph`** — Write files immediately, not at end
- **Don't hide code in session** — Write to files as components are approved
- **Don't wait to write files** — Agent visible from first step
- **Don't batch everything** — Write incrementally, one component at a time
- **Don't create too many thin nodes** — Prefer fewer, richer nodes (see below)
- **Don't add framework gating for LLM behavior** — Fix prompts or use judges instead
### Fewer, Richer Nodes
A common mistake is splitting work into too many small single-purpose nodes. Each node boundary requires serializing outputs, losing in-context information, and adding edge complexity.
**Remember: Agent is actively constructed, visible the whole time. No hidden state. No surprise exports. Just transparent, incremental file building.**
description: Iterative agent testing with session recovery. Execute, analyze, fix, resume from checkpoints. Use when testing an agent, debugging test failures, or verifying fixes without re-running from scratch.
---
# Agent Testing
Test agents iteratively: execute, analyze failures, fix, resume from checkpoint, repeat.
## When to Use
- Testing a newly built agent against its goal
- Debugging a failing agent iteratively
- Verifying fixes without re-running expensive early nodes
- Running final regression tests before deployment
## Prerequisites
1. Agent package at `exports/{agent_name}/` (built with `/hive-create`)
2. Credentials configured (`/hive-credentials`)
3.`ANTHROPIC_API_KEY` set (or appropriate LLM provider key)
These return `file_header`, `test_template`, `constraints_formatted`/`success_criteria_formatted`, and `test_guidelines`. They do NOT generate test code — you write the tests.
- Every test MUST be `async` with `@pytest.mark.asyncio`
- Every test MUST accept `runner, auto_responder, mock_mode` fixtures
- Use `await auto_responder.start()` before running, `await auto_responder.stop()` in `finally`
- Use `await runner.run(input_dict)` — this goes through AgentRunner → AgentRuntime → ExecutionStream
- Access output via `result.output.get("key")` — NEVER `result.output["key"]`
-`result.success=True` means no exception, NOT goal achieved — always check output
- Write 8-15 tests total, not 30+
- Each real test costs ~3 seconds + LLM tokens
- NEVER use `default_agent.run()` — it bypasses the runtime (no sessions, no logs, client-facing nodes hang)
#### Step 1d: Check existing tests
Before generating, check if tests already exist:
```python
list_tests(
goal_id="your-goal-id",
agent_path="exports/{agent_name}"
)
```
---
### Phase 2: Execute
Two execution paths, use the right one for your situation.
#### Iterative debugging (for complex agents)
Run the agent via CLI. This creates sessions with checkpoints at `~/.hive/agents/{agent_name}/sessions/`:
```bash
uv run hive run exports/{agent_name} --input '{"query": "test topic"}'
```
Sessions and checkpoints are saved automatically.
**Client-facing nodes**: Agents with `client_facing=True` nodes (interactive conversation) work in headless mode when run from a real terminal — the agent streams output to stdout and reads user input from stdin via a `>>> ` prompt. In non-interactive shells (like Claude Code's Bash tool), client-facing nodes will hang because there is no stdin. For testing interactive agents from Claude Code, use `run_tests` with mock mode or have the user run the agent manually in their terminal.
#### Automated regression (for CI or final verification)
Use the `run_tests` MCP tool to run all pytest tests:
**Note:**`run_tests` uses `AgentRunner` with `tmp_path` storage, so sessions are isolated per test run. For checkpoint-based recovery with persistent sessions, use CLI execution. Use `run_tests` for quick regression checks and final verification.
---
### Phase 3: Analyze Failures
When a test fails, drill down systematically. Don't guess — use the tools.
#### Step 3a: Get error category
```python
debug_test(
goal_id="your-goal-id",
test_name="test_success_source_diversity",
agent_path="exports/{agent_name}"
)
```
Returns error category (`IMPLEMENTATION_ERROR`, `ASSERTION_FAILURE`, `TIMEOUT`, `IMPORT_ERROR`, `API_ERROR`) plus full traceback and suggestions.
#### Step 3b: Find the failed session
```python
list_agent_sessions(
agent_work_dir="~/.hive/agents/{agent_name}",
status="failed",
limit=5
)
```
Returns session list with IDs, timestamps, current_node (where it failed), execution_quality.
#### Step 3c: Inspect session state
```python
get_agent_session_state(
agent_work_dir="~/.hive/agents/{agent_name}",
session_id="session_20260209_143022_abc12345"
)
```
Returns execution path, which node was current, step count, timestamps — but excludes memory values (to avoid context bloat). Shows `memory_keys` and `memory_size` instead.
Returns checkpoint summaries with IDs, types (`node_start`, `node_complete`), which node, and `is_clean` flag. Clean checkpoints are safe resume points.
#### Step 3g: Compare checkpoints (optional)
To understand what changed between two points in execution:
This skips all nodes before the checkpoint and only re-runs the fixed node onward.
#### When to re-run from scratch
Re-run when ANY of these are true:
- The fix is to the entry node or an early node
- No checkpoints exist (e.g., agent was run via `run_tests`)
- The agent is fast (2-3 nodes, completes in seconds)
- You changed the graph structure (added/removed nodes/edges)
```bash
uv run hive run exports/{agent_name} --input '{"query": "test topic"}'
```
#### Inspecting a checkpoint before resuming
```python
get_agent_checkpoint(
agent_work_dir="~/.hive/agents/{agent_name}",
session_id="session_20260209_143022_abc12345",
checkpoint_id="cp_node_complete_research_143030"
)
```
Returns the full checkpoint: shared_memory snapshot, execution_path, current_node, next_node, is_clean.
#### Loop back to Phase 2
After resuming or re-running, check if the fix worked. If not, go back to Phase 3.
---
### Phase 6: Final Verification
Once the iterative fix loop converges (the agent produces correct output), run the full automated test suite:
```python
run_tests(
goal_id="your-goal-id",
agent_path="exports/{agent_name}"
)
```
All tests should pass. If not, repeat the loop for remaining failures.
---
## Credential Requirements
**CRITICAL: Testing requires ALL credentials the agent depends on.** This includes both the LLM API key AND any tool-specific credentials (HubSpot, Brave Search, etc.).
### Prerequisites
Before running agent tests, you MUST collect ALL required credentials from the user.
**Step 1: LLM API Key (always required)**
```bash
exportANTHROPIC_API_KEY="your-key-here"
```
**Step 2: Tool-specific credentials (depends on agent's tools)**
Inspect the agent's `mcp_servers.json` and tool configuration to determine which tools the agent uses, then check for all required credentials:
When the user asks to test an agent, **ALWAYS check for ALL credentials first**:
1.**Identify the agent's tools** from `mcp_servers.json`
2.**Check ALL required credentials** using `CredentialManager`
3.**Ask the user to provide any missing credentials** before proceeding
4. Collect ALL missing credentials in a single prompt — not one at a time
---
## Safe Test Patterns
### OutputCleaner
The framework automatically validates and cleans node outputs using a fast LLM at edge traversal time. Tests should still use safe patterns because OutputCleaner may not catch all issues.
old_string='system_prompt="Search for information on the user\'s topic."',
new_string='system_prompt="Search for information on the user\'s topic. You MUST find at least 5 diverse, authoritative sources. Use multiple different search queries to ensure source diversity. Do not stop searching until you have at least 5 distinct sources."'
)
```
### Phase 5: Resume from checkpoint
For this example, the fix is to the `research` node. If we had run via CLI with checkpointing, we could resume from the checkpoint after `intake` to skip re-running intake:
old_string='system_prompt="Search for information on the user\'s topic using web search."',
new_string='system_prompt="Search for information on the user\'s topic using web search. You MUST find at least 5 diverse, authoritative sources. Use multiple different search queries with varied keywords. Do NOT call set_output until you have gathered at least 5 distinct sources from different domains."'
)
```
---
## Phase 5: Recover & Resume (Iteration 1)
The fix is to the `research` node. Since this was a `run_tests` execution (no checkpoints), we re-run from scratch:
old_string='system_prompt="Write a comprehensive report based on the research findings."',
new_string='system_prompt="Write a comprehensive report based on the research findings. You MUST include numbered citations [1], [2], etc. for every factual claim. At the end, include a References section listing all sources with their URLs. Every claim must be traceable to a specific source."'
)
```
---
## Phase 5: Resume (Iteration 2)
The fix is to the `report` node (the last node). To demonstrate checkpoint recovery, run via CLI:
```bash
# Run via CLI to get checkpoints
uv run hive run exports/deep_research_agent --input '{"topic": "climate change effects"}'
# After it runs, find the clean checkpoint before report
description: Complete workflow for building, implementing, and testing goal-driven agents. Orchestrates hive-* skills. Use when starting a new agent project, unsure which skill to use, or need end-to-end guidance.
license: Apache-2.0
metadata:
author: hive
version: "2.0"
type: workflow-orchestrator
orchestrates:
- hive-concepts
- hive-create
- hive-patterns
- hive-test
- hive-credentials
- hive-debugger
---
# Agent Development Workflow
**THIS IS AN EXECUTABLE WORKFLOW. DO NOT explore the codebase or read source files. ROUTE to the correct skill IMMEDIATELY.**
When this skill is loaded, **ALWAYS use the AskUserQuestion tool** to present options:
```
Use AskUserQuestion with these options:
- "Build a new agent" → Then invoke /hive-create
- "Test an existing agent" → Then invoke /hive-test
- "Learn agent concepts" → Then invoke /hive-concepts
- "Optimize agent design" → Then invoke /hive-patterns
- "Set up credentials" → Then invoke /hive-credentials
- "Debug a failing agent" → Then invoke /hive-debugger
- "Other" (please describe what you want to achieve)
```
**DO NOT:** Read source files, explore the codebase, search for code, or do any investigation before routing. The sub-skills handle all of that.
---
Complete Standard Operating Procedure (SOP) for building production-ready goal-driven agents.
## Overview
This workflow orchestrates specialized skills to take you from initial concept to production-ready agent:
4.**Preserve working states** - Tag successful iterations
5.**Learn from failures** - Failed tests reveal design issues
## Exit Criteria
You're done with the workflow when:
✅ Agent structure validates
✅ All tests pass
✅ Success criteria met
✅ Constraints verified
✅ Documentation complete
✅ Agent ready for deployment
## Additional Resources
- **hive-concepts**: See `.claude/skills/hive-concepts/SKILL.md`
- **hive-create**: See `.claude/skills/hive-create/SKILL.md`
- **hive-patterns**: See `.claude/skills/hive-patterns/SKILL.md`
- **hive-test**: See `.claude/skills/hive-test/SKILL.md`
- **Agent framework docs**: See `core/README.md`
- **Example agents**: See `exports/` directory
## Summary
This workflow provides a proven path from concept to production-ready agent:
1.**Learn** with `/hive-concepts` → Understand fundamentals (optional)
2.**Build** with `/hive-create` → Get validated structure
3.**Optimize** with `/hive-patterns` → Apply best practices (optional)
4.**Configure** with `/hive-credentials` → Set up API keys (if needed)
5.**Test** with `/hive-test` → Get verified functionality
6.**Debug** with `/hive-debugger` → Fix runtime issues (if needed)
The workflow is **flexible** - skip phases as needed, iterate freely, and adapt to your specific requirements. The goal is **production-ready agents** built with **consistent, repeatable processes**.
## Skill Selection Guide
**Choose hive-concepts when:**
- First time building agents
- Need to understand event loop architecture
- Validating tool availability
- Learning about node types, edges, and judges
**Choose hive-create when:**
- Actually building an agent
- Have clear requirements
- Ready to write code
- Want step-by-step guidance
- Want to start from an existing template and customize it
// Extract Discord ID — look for the numeric value after the "Discord User ID" heading
const discordMatch = body.match(/### Discord User ID\s*\n\s*(\d{17,20})/);
if (!discordMatch) {
await github.rest.issues.createComment({
...context.repo,
issue_number: issue.number,
body: `Could not find a valid Discord ID in the issue body. Please make sure you entered a numeric ID (17-20 digits), not a username.\n\nExample: \`123456789012345678\``
});
await github.rest.issues.update({
...context.repo,
issue_number: issue.number,
state: 'closed',
state_reason: 'not_planned'
});
return;
}
const discordId = discordMatch[1];
// Extract display name (optional)
const nameMatch = body.match(/### Display Name \(optional\)\s*\n\s*(.+)/);
body: `A PR has been created to link your account. A maintainer will merge it shortly — once merged, you'll receive XP and Discord pings when your bounty PRs are merged.`
v0.5.1 is our most ambitious release yet. Hive agents can now **build other agents** -- the new Hive Coder meta-agent writes, tests, and fixes agent packages from natural language. The runtime grows multi-graph support so one session can orchestrate multiple agents simultaneously. The TUI gets a complete overhaul with an in-app agent picker, live streaming, and seamless escalation to the Coder. And we're now provider-agnostic: Claude Code subscriptions, OpenAI-compatible endpoints, and any LiteLLM-supported model work out of the box.
---
## Highlights
### Hive Coder -- The Agent That Builds Agents
A native meta-agent that lives inside the framework at `core/framework/agents/hive_coder/`. Give it a natural-language specification and it produces a complete agent package -- goal definition, node prompts, edge routing, MCP tool wiring, tests, and all boilerplate files.
```bash
# Launch the Coder directly
hive code
# Or escalate from any running agent (TUI)
Ctrl+E # or /coder in chat
```
The Coder ships with:
- **Reference documentation** -- anti-patterns, construction guide, and design patterns baked into its system prompt
- **Guardian watchdog** -- an event-driven monitor that catches agent failures and triggers automatic remediation
- **Test generation** -- structural tests for forever-alive agents that don't hang on `runner.run()`
### Multi-Graph Agent Runtime
`AgentRuntime` now supports loading, managing, and switching between multiple agent graphs within a single session. Six new lifecycle tools give agents (and the TUI) full control:
The Hive Coder uses multi-graph internally -- when you escalate from a worker agent, the Coder loads as a separate graph while the worker stays alive in the background.
### TUI Revamp
The Terminal UI gets a ground-up rebuild with five major additions:
- **Agent Picker** (Ctrl+A) -- tabbed modal screen for browsing Your Agents, Framework agents, and Examples with metadata badges (node count, tool count, session count, tags)
- **Runtime-optional startup** -- TUI launches without a pre-loaded agent, showing the picker on first open
- **Live streaming pane** -- dedicated RichLog widget shows LLM tokens as they arrive, replacing the old one-token-per-line display
- **PDF attachments** -- `/attach` and `/detach` commands with native OS file dialog (macOS, Linux, Windows)
- **Interactive credential setup** -- Guided `CredentialSetupSession` with health checks and encrypted storage, accessible via `hive setup-credentials` or automatic prompting on credential errors. (@RichardTang-Aden)
- **Pre-start confirmation prompt** -- Interactive prompt before agent execution allowing credential updates or abort. (@RichardTang-Aden)
- **Event bus multi-graph support** -- `graph_id` on events, `filter_graph` on subscriptions, `ESCALATION_REQUESTED` event type, `exclude_own_graph` filter. (@TimothyZhang7)
### TUI Improvements
- **In-app agent picker** (Ctrl+A) -- Tabbed modal for browsing agents with metadata badges (nodes, tools, sessions, tags). (@TimothyZhang7)
- **Runtime-optional TUI startup** -- Launches without a pre-loaded agent, shows agent picker on startup. (@TimothyZhang7)
- **Hive Coder escalation** (Ctrl+E) -- Escalate to Hive Coder and return; also available via `/coder` and `/back` chat commands. (@TimothyZhang7)
- **PDF attachment support** -- `/attach` and `/detach` commands with native OS file dialog. (@TimothyZhang7)
- **Streaming output pane** -- Dedicated RichLog widget for live LLM token streaming. (@TimothyZhang7)
- **System prompt datetime injection** -- All system prompts now include current date/time for time-aware agent behavior. (@TimothyZhang7)
- **Utils module exports** -- Proper `__init__.py` exports for the utils module. (@Siddharth2624)
- **Increased default max_tokens** -- Opus 4.6 defaults to 32768, Sonnet 4.5 to 16384 (up from 8192). (@TimothyZhang7)
---
## Bug Fixes
- Flush WIP accumulator outputs on cancel/failure so edge conditions see correct values on resume
- Stall detection state preserved across resume (no more resets on checkpoint restore)
- Skip client-facing blocking for event-triggered executions (timer/webhook)
- Executor retry override scoped to actual EventLoopNode instances only
- Add `_awaiting_input` flag to EventLoopNode to prevent input injection race conditions
- Fix TUI streaming display (tokens no longer appear one-per-line)
- Fix `_return_from_escalation` crash when ChatRepl widgets not yet mounted
- Fix tools registration problems for Google Docs credentials (@RichardTang-Aden)
- Fix email agent version conflicts (@RichardTang-Aden)
- Fix coder tool timeouts (120s for tests, 300s cap for commands)
## Documentation
- Clarify installation and prevent root pip install misuse (@paarths-collab)
---
## Agent Updates
- **Email Inbox Management** -- Consolidate `gmail_inbox_guardian` and `inbox_management` into a single unified agent with updated prompts and config. (@RichardTang-Aden, @bryanadenhq)
- **Job Hunter** -- Updated node prompts, config, and agent metadata; added PDF resume selection. (@bryanadenhq)
- **Deep Research Agent** -- Revised node implementations with updated prompts and output handling.
NodeSpec(node_type="event_loop",...)# or just omit node_type (it's the default now)
```
If your agents set `max_node_visits=1` explicitly, they'll still work. The only change is the _default_ -- new agents without an explicit value now get unlimited visits.
To try the new Hive Coder:
```bash
# Launch Coder directly
hive code
# Or from TUI -- press Ctrl+E to escalate
hive tui
```
---
## What's Next
- **Agent-to-agent communication** -- one agent's output triggers another agent's entry point
- **Cost visibility** -- detailed runtime log of LLM costs per node and per session
- **Persistent webhook subscriptions** -- survive agent restarts without re-registering
- **Remote agent deployment** -- run agents as long-lived services with HTTP APIs
Build autonomous, reliable, self-improving AI agents without hardcoding workflows. Define your goal through conversation with a coding agent, and the framework generates a node graph with dynamically created connection code. When things break, the framework captures failure data, evolves the agent through the coding agent, and redeploys. Built-in human-in-the-loop nodes, credential management, and real-time monitoring give you control without sacrificing adaptability.
Build autonomous, reliable, self-improving AI agents without hardcoding workflows. Define your goal through conversation with hive coding agent(queen), and the framework generates a node graph with dynamically created connection code. When things break, the framework captures failure data, evolves the agent through the coding agent, and redeploys. Built-in human-in-the-loop nodes, credential management, and real-time monitoring give you control without sacrificing adaptability.
Visit [adenhq.com](https://adenhq.com) for complete documentation, examples, and guides.
@@ -50,7 +50,7 @@ Hive is designed for developers and teams who want to build **production-grade A
Hive is a good fit if you:
- Want AI agents that **execute real business processes**, not demos
-Prefer **goal-driven development** over hardcoded workflows
-Need **fast or high volume agent execution** over open workflow
- Need **self-healing and adaptive agents** that improve over time
- Require **human-in-the-loop control**, observability, and cost limits
- Plan to run agents in **production environments**
@@ -71,7 +71,7 @@ Use Hive when you need:
- **[Documentation](https://docs.adenhq.com/)** - Complete guides and API reference
- **[Self-Hosting Guide](https://docs.adenhq.com/getting-started/quickstart)** - Deploy Hive on your infrastructure
- **[Changelog](https://github.com/adenhq/hive/releases)** - Latest updates and releases
- **[Changelog](https://github.com/aden-hive/hive/releases)** - Latest updates and releases
- **[Roadmap](docs/roadmap.md)** - Upcoming features and plans
- **[Report Issues](https://github.com/adenhq/hive/issues)** - Bug reports and feature requests
- **[Contributing](CONTRIBUTING.md)** - How to contribute and submit PRs
@@ -81,17 +81,24 @@ Use Hive when you need:
### Prerequisites
- Python 3.11+ for agent development
-Claude Code, Codex CLI, or Cursor for utilizing agent skills
-An LLM provider that powers the agents
- **ripgrep (optional, recommended on Windows):** The `search_files` tool uses ripgrep for faster file search. If not installed, a Python fallback is used. On Windows: `winget install BurntSushi.ripgrep` or `scoop install ripgrep`
> **Note for Windows Users:** It is strongly recommended to use **WSL (Windows Subsystem for Linux)** or **Git Bash** to run this framework. Some core automation scripts may not execute correctly in standard Command Prompt or PowerShell.
### Installation
> **Note**
> Hive uses a `uv` workspace layout and is not installed with `pip install`.
> Running `pip install -e .` from the repository root will create a placeholder package and Hive will not function correctly.
> Please use the quickstart script below to set up the environment.
```bash
# Clone the repository
git clone https://github.com/adenhq/hive.git
git clone https://github.com/aden-hive/hive.git
cd hive
# Run quickstart setup
./quickstart.sh
```
@@ -104,78 +111,48 @@ This sets up:
- **LLM provider** - Interactive default model configuration
- All required Python dependencies with `uv`
- Finally, it will open the Hive interface in your browser
> **Tip:** To reopen the dashboard later, run `hive open` from the project directory.
# (at separate terminal) Launch the interactive dashboard
hive tui
### Use Template Agents
# Or run directly
hive run exports/your_agent_name --input '{"key": "value"}'
```
## Coding Agent Support
### Codex CLI
Hive includes native support for [OpenAI Codex CLI](https://github.com/openai/codex) (v0.101.0+).
Click "Try a sample agent" and check the templates. You can run a template directly or choose to build your version on top of the existing template.
1.**Config:**`.codex/config.toml` with `agent-builder` MCP server (tracked in git)
2.**Skills:**`.agents/skills/` symlinks to Hive skills (tracked in git)
3.**Launch:** Run `codex` in the repo root, then type `use hive`
### Run Agents
Example:
```
codex> use hive
```
Now you can run an agent by selecting the agent (either an existing agent or example agent). You can click the Run button on the top left, or talk to the queen agent and it can run the agent for you.
### Opencode
Hive includes native support for [Opencode](https://github.com/opencode-ai/opencode).
1.**Setup:** Run the quickstart script
2.**Launch:** Open Opencode in the project root.
3.**Activate:** Type `/hive` in the chat to switch to the Hive Agent.
4.**Verify:** Ask the agent *"List your tools"* to confirm the connection.
The agent has access to all Hive skills and can scaffold agents, add tools, and debug workflows directly from the chat.
**[📖 Complete Setup Guide](docs/environment-setup.md)** - Detailed instructions for agent development
### Antigravity IDE Support
Skills and MCP servers are also available in [Antigravity IDE](https://antigravity.google/) (Google's AI-powered IDE). **Easiest:** open a terminal in the hive repo folder and run (use `./` — the script is inside the repo):
```bash
./scripts/setup-antigravity-mcp.sh
```
**Important:** Always restart/refresh Antigravity IDE after running the setup script—MCP servers only load on startup. After restart, **agent-builder** and **tools** MCP servers should connect. Skills are under `.agent/skills/` (symlinks to `.claude/skills/`). See [docs/antigravity-setup.md](docs/antigravity-setup.md) for manual setup and troubleshooting.
- **[Goal-Driven Development](docs/key_concepts/goals_outcome.md)** - Define objectives in natural language; the coding agent generates the agent graph and connection code to achieve them
- **Browser-Use** - Control the browser on your computer to achieve hard tasks
- **Parallel Execution** - Execute the generated graph in parallel. This way you can have multiple agents completing the jobs for you
- **[Goal-Driven Generation](docs/key_concepts/goals_outcome.md)** - Define objectives in natural language; the coding agent generates the agent graph and connection code to achieve them
- **[Adaptiveness](docs/key_concepts/evolution.md)** - Framework captures failures, calibrates according to the objectives, and evolves the agent graph
- **[Dynamic Node Connections](docs/key_concepts/graph.md)** - No predefined edges; connection code is generated by any capable LLM based on your goals
- **SDK-Wrapped Nodes** - Every node gets shared memory, local RLM memory, monitoring, tools, and LLM access out of the box
- **[Human-in-the-Loop](docs/key_concepts/graph.md#human-in-the-loop)** - Intervention nodes that pause execution for human input with configurable timeouts and escalation
- **Real-time Observability** - WebSocket streaming for live monitoring of agent execution, decisions, and node-to-node communication
- **Interactive TUI Dashboard** - Terminal-based dashboard with live graph view, event log, and chat interface for agent interaction
- **Cost & Budget Control** - Set spending limits, throttles, and automatic model degradation policies
- **Production-Ready** - Self-hostable, built for scale and reliability
Hive is built to be model-agnostic and system-agnostic.
- **LLM flexibility** - Hive Framework is designed to support various types of LLMs, including hosted and local models through LiteLLM-compatible providers.
- **Business system connectivity** - Hive Framework is designed to connect to all kinds of business systems as tools, such as CRM, support, messaging, data, file, and internal APIs via MCP.
## Why Aden
Hive focuses on generating agents that run real business processes rather than generic agents. Instead of requiring you to manually design workflows, define agent interactions, and handle failures reactively, Hive flips the paradigm: **you describe outcomes, and the system builds itself**—delivering an outcome-driven, adaptive experience with an easy-to-use set of tools and integrations.
WTM["Write through Conversation Memory<br/>(Logs/RAM/Harddrive)"]
SM["Shared Memory<br/>(State/Harddrive)"]
EB["Event Bus<br/>(RAM)"]
CS["Credential Store<br/>(Harddrive/Cloud)"]
end
subgraph PC [PC]
B["Browser"]
CB["Codebase<br/>v 0.0.x ... v n.n.n"]
end
%% =========================================
%% CONNECTIONS & DATA FLOW
%% =========================================
%% External Event Routing
E_Sch --> ELN_L
E_WH --> ELN_L
E_SSE --> ELN_L
ELN_L -->|"triggers"| ELN_EL
%% User Interactions
User -->|"Talk"| WB_C
User -->|"Talk"| QB_C
User -->|"Read/Write Access"| CS
%% Inter-System Logic
ELN_C <-->|"Mirror"| WB_C
WB_C -->|"Focus"| AN
WorkerBees -->|"Inquire"| JudgeNode
JudgeNode -->|"Approve"| WorkerBees
%% Judge Alignments
J_C <-.->|"aligns"| WB_SP
J_P <-.->|"aligns"| QB_SP
%% Escalate path
J_EL -->|"Report (Escalate)"| QB_EL
%% Pub/Sub Logic
AN -->|"publish"| EB
EB -->|"subscribe"| QB_C
%% Infra and Process Spawning
ELN_EL -->|"Spawn"| SA
SA -->|"Inform"| ELN_EL
SA -->|"Starts"| B
B -->|"Report"| ELN_EL
TR -->|"Assigned"| ELN_EL
CB -->|"Modify Worker Bee"| WB_C
%% =========================================
%% SHARED MEMORY & LOGS ACCESS
%% =========================================
%% Worker Bees Access (link to node inside Graph subgraph)
AN <-->|"Read/Write"| WTM
AN <-->|"Read/Write"| SM
%% Queen Bee Access
QB_C <-->|"Read/Write"| WTM
QB_EL <-->|"Read/Write"| SM
%% Credentials Access
CS -->|"Read Access"| QB_C
```
## Contributing
We welcome contributions from the community! We’re especially looking for help building tools, integrations, and example agents for the framework ([check #2805](https://github.com/adenhq/hive/issues/2805)). If you’re interested in extending its functionality, this is the perfect place to start. Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
We welcome contributions from the community! We’re especially looking for help building tools, integrations, and example agents for the framework ([check #2805](https://github.com/aden-hive/hive/issues/2805)). If you’re interested in extending its functionality, this is the perfect place to start. Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
**Important:** Please get assigned to an issue before submitting a PR. Comment on an issue to claim it, and a maintainer will assign you. Issues with reproducible steps and proposals are prioritized. This helps prevent duplicate work.
@@ -396,7 +378,7 @@ This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENS
**Q: What LLM providers does Hive support?**
Hive supports 100+ LLM providers through LiteLLM integration, including OpenAI (GPT-4, GPT-4o), Anthropic (Claude models), Google Gemini, DeepSeek, Mistral, Groq, and many more. Simply set the appropriate API key environment variable and specify the model name.
Hive supports 100+ LLM providers through LiteLLM integration, including OpenAI (GPT-4, GPT-4o), Anthropic (Claude models), Google Gemini, DeepSeek, Mistral, Groq, and many more. Simply set the appropriate API key environment variable and specify the model name. We recommend using Claude, GLM and Gemini as they have the best performance.
**Q: Can I use Hive with local AI models like Ollama?**
@@ -438,14 +420,6 @@ Visit [docs.adenhq.com](https://docs.adenhq.com/) for complete guides, API refer
Contributions are welcome! Fork the repository, create your feature branch, implement your changes, and submit a pull request. See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
**Q: When will my team start seeing results from Aden's adaptive agents?**
Aden's adaptation loop begins working from the first execution. When an agent fails, the framework captures the failure data, helping developers evolve the agent graph through the coding agent. How quickly this translates to measurable results depends on the complexity of your use case, the quality of your goal definitions, and the volume of executions generating feedback.
**Q: How does Hive compare to other agent frameworks?**
Hive focuses on generating agents that run real business processes, rather than generic agents. This vision emphasizes outcome-driven design, adaptability, and an easy-to-use set of tools and integrations.
This guide covers the MCP (Model Context Protocol) server for building goal-driven agents.
> **Note:** The standalone `agent-builder` MCP server (`framework.mcp.agent_builder_server`) has been replaced. Agent building is now done via the `coder-tools` server's `initialize_and_build_agent` tool, with underlying logic in `tools/coder_tools_server.py`.
This guide covers the MCP tools available for building goal-driven agents.
## Setup
### Quick Setup
```bash
# Using the setup script (recommended)
python setup_mcp.py
# Or using bash
./setup_mcp.sh
# Run the quickstart script (recommended)
./quickstart.sh
```
### Manual Configuration
@@ -21,10 +20,10 @@ Add to your MCP client configuration (e.g., Claude Desktop):
@@ -17,66 +17,11 @@ Framework provides a runtime framework that captures **decisions**, not just act
uv pip install -e .
```
## MCP Server Setup
## Agent Building
The framework includes an MCP (Model Context Protocol) server for building agents. To set up the MCP server:
Agent scaffolding is handled by the `coder-tools` MCP server (in `tools/coder_tools_server.py`), which provides the `initialize_and_build_agent` tool and related utilities. The package generation logic lives directly in `tools/coder_tools_server.py`.
### Automated Setup
**Using bash (Linux/macOS):**
```bash
./setup_mcp.sh
```
**Using Python (cross-platform):**
```bash
python setup_mcp.py
```
The setup script will:
1. Install the framework package
2. Install MCP dependencies (mcp, fastmcp)
3. Create/verify `.mcp.json` configuration
4. Test the MCP server module
### Manual Setup
If you prefer manual setup:
```bash
# Install framework
uv pip install -e .
# Install MCP dependencies
uv pip install mcp fastmcp
# Test the server
uv run python -m framework.mcp.agent_builder_server
```
### Using with MCP Clients
To use the agent builder with Claude Desktop or other MCP clients, add this to your MCP client configuration:
1.**Using tools that don't exist** — Always verify tools via `list_agent_tools()` before designing. Common hallucinations: `csv_read`, `csv_write`, `file_upload`, `database_query`, `bulk_fetch_emails`.
2.**Wrong mcp_servers.json format** — Flat dict (no `"mcpServers"` wrapper). `cwd` must be `"../../tools"`. `command` must be `"uv"` with args `["run", "python", ...]`.
3.**Missing module-level exports in `__init__.py`** — The runner reads `goal`, `nodes`, `edges`, `entry_node`, `entry_points`, `terminal_nodes`, `conversation_mode`, `identity_prompt`, `loop_config` via `getattr()`. ALL module-level variables from agent.py must be re-exported in `__init__.py`.
## Value Errors
4.**Fabricating tools** — Always verify via `list_agent_tools()` before designing and `validate_agent_package()` after building.
## Design Errors
5.**Adding framework gating for LLM behavior** — Don't add output rollback or premature rejection. Fix with better prompts or custom judges.
6.**Calling set_output in same turn as tool calls** — Call set_output in a SEPARATE turn.
## File Template Errors
7.**Wrong import paths** — Use `from framework.graph import ...`, NOT `from core.framework.graph import ...`.
8.**Missing storage path** — Agent class must set `self._storage_path = Path.home() / ".hive" / "agents" / "agent_name"`.
9.**Missing mcp_servers.json** — Without this, the agent has no tools at runtime.
10.**Bare `python` command** — Use `"command": "uv"` with args `["run", "python", ...]`.
## Testing Errors
11.**Using `runner.run()` on forever-alive agents** — `runner.run()` hangs forever because forever-alive agents have no terminal node. Write structural tests instead: validate graph structure, verify node specs, test `AgentRunner.load()` succeeds (no API key needed).
12.**Stale tests after restructuring** — When changing nodes/edges, update tests to match. Tests referencing old node names will fail.
13.**Running integration tests without API keys** — Use `pytest.skip()` when credentials are missing.
14.**Forgetting sys.path setup in conftest.py** — Tests need `exports/` and `core/` on sys.path.
## GCU Errors
15.**Manually wiring browser tools on event_loop nodes** — Use `node_type="gcu"` which auto-includes browser tools. Do NOT manually list browser tool names.
16.**Using GCU nodes as regular graph nodes** — GCU nodes are subagents only. They must ONLY appear in `sub_agents=["gcu-node-id"]` and be invoked via `delegate_to_sub_agent()`. Never connect via edges or use as entry/terminal nodes.
## Worker Agent Errors
17.**Adding client-facing intake node to workers** — The queen owns intake. Workers should start with an autonomous processing node. Client-facing nodes in workers are for mid-execution review/approval only.
18.**Putting `escalate` or `set_output` in NodeSpec `tools=[]`** — These are synthetic framework tools, auto-injected at runtime. Only list MCP tools from `list_agent_tools()`.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.