chore: ci and release doc

2026-05-01 18:06:39 -07:00
parent 9a75d45351
commit 78fffa63ec
3 changed files with 143 additions and 1 deletions
@@ -60,6 +60,21 @@ _HIVE_PATH_NAMES = (
 )


+@pytest.fixture(autouse=True)
+def _no_seed_mcp_defaults(monkeypatch):
+    """Skip bundled-server seeding in MCPRegistry.initialize() for tests.
+
+    Production wants ``initialize()`` to seed ``hive_tools`` / ``gcu-tools``
+    / ``files-tools`` / ``terminal-tools`` / ``chart-tools`` so a fresh
+    HIVE_HOME comes up with working defaults. Tests want a deterministic
+    empty registry — every assertion about counts, "no servers installed"
+    output, or first-element identity breaks otherwise. Patching here
+    keeps the production API clean and avoids a test-only flag on
+    ``initialize()``.
+    """
+    monkeypatch.setattr(_mcp_registry.MCPRegistry, "_seed_defaults", lambda self: [])
+
+
 @pytest.fixture(autouse=True)
 def _isolate_hive_home_autouse(tmp_path, monkeypatch):
    """Per-test isolation of ``~/.hive`` to ``tmp_path/.hive``.
@@ -889,7 +889,7 @@ def test_concurrency_safe_allowlist_is_conservative():
    allowlist = ToolRegistry.CONCURRENCY_SAFE_TOOLS

    # Positive assertions: known-safe read operations are present.
-    for name in ("read_file", "grep", "glob", "search_files", "web_search"):
+    for name in ("read_file", "terminal_rg", "terminal_find", "search_files", "web_scrape"):
        assert name in allowlist, f"{name} should be concurrency-safe"

    # Negative assertions: nothing that mutates state is allowed in.
@@ -0,0 +1,127 @@
+# 🐝 Hive Agent v0.11.0: Action Plans, Charts, and a Cleaner Queen
+
+> Hive 0.11 is a major feature release. Queen now keeps an action plan for every conversation and can render charts for analytics, and a major refactor of Queen's prompt and tools improves the overall conversation and agent experience.
+
+---
+
+## ✨ Highlights
+
+### 📋 Queen now keeps an action plan for everything
+
+A new file-backed task system gives Queen a persistent, structured plan for every conversation — visible to the user, editable on the fly, and surviving session reload.
+
+- **File-backed task store** under `core/framework/tasks/` with full CRUD, scoping, hooks, and reminders. Tasks live on disk so they outlast a single agent run and can be inspected, replayed, or shared between Queen and colony workers.
+- **Multi-task creation in one call** — Queen can stage a whole plan up front instead of dripping out one task at a time, then tick items off as it works.
+- **Colony task templates** — colonies can publish a template task list that Queen picks up when the colony is invoked, so recurring workflows start with the same plan every time.
+- **Live task list in the UI** — a new `TaskListPanel` renders the plan in real time next to the chat, with item status flowing through the event bus as Queen marks tasks done.
+- **Task reminders + hooks** wire into Queen's loop so the plan stays in front of the model. The structural blockers that prevented `task_*` tool calls are resolved.
+
+### 📊 Charting capability for analytics
+
+Queen can now produce real charts inline in the conversation, not just describe them.
+
+- **New `chart_tools` MCP server** with ECharts and Mermaid renderers, an OpenHive theme, and a `chart-creation-foundations` skill that teaches Queen when to chart vs. when to table.
+- **Inline chart rendering in chat** — `EChartsBlock` and `MermaidBlock` components render the chart spec directly in the transcript; tool results get a contentful display with `ChartToolDetail` instead of a JSON dump.
+- **Chart spec normalization** in the renderer keeps Y-axis scaling, series colors, and theme tokens consistent regardless of how Queen phrases the spec.
+
+### 🧹 Major Queen prompt + tools refactor
+
+The biggest cleanup of Queen's tool surface and prompt since v0.7. Fewer, sharper tools; a shorter, more focused prompt; and a clearer model of what Queen has access to vs. what colonies do.
+
+- **File ops consolidated** — `apply_diff`, `apply_patch`, `hashline_edit`, the old `data_tools`, `grep_search`, and the legacy `coder_tools_server` are gone. A single rewritten `file_ops` module covers read / search / list / edit with a more predictable interface and ~1.7k fewer lines on net.
+- **Search and list-files unified** into one toolkit so Queen stops juggling near-duplicate variants.
+- **Browser tools audit** — interactions, navigation, tabs, and lifecycle trimmed and consolidated; `web_scrape` and `browser_open` merged into a single web-search-and-open path.
+- **New shell/terminal toolkit** (`shell_tools`) — replaces the old `execute_command_tool` and the inline command sanitizer with a typed module that has proper job control, PTY sessions, ring-buffered output, semantic exit codes, and a destructive-command warning gate. Five new preset skills (`shell-tools-foundations`, `-fs-search`, `-job-control`, `-pty-sessions`, `-troubleshooting`) teach Queen the new surface.
+- **Old lifecycle tools removed** — `queen_lifecycle_tools.py` shrunk by ~900 lines as deprecated default tools came out.
+- **Prompt simplification + improvements** — Queen's node prompts dropped redundant `_queen_style` blocks, tightened phrasing, and now lean on the task system for plan-keeping instead of restating the plan every turn.
+- **Tools editor frontend grouping** — `ToolsEditor.tsx` groups tools by category so configuring a queen profile is no longer a flat scroll through 80+ entries.
+
+---
+
+## 🆕 What's New
+
+### Tasks & Action Plans
+
+- **`core/framework/tasks/`** — full task subsystem: `store`, `models`, `events`, `hooks`, `reminders`, `scoping`, plus a `tools/` package exposing session and colony task tools to Queen. (@RichardTang-Aden)
+- **`POST /api/tasks` routes** for the frontend to read and mutate the live plan. (@RichardTang-Aden)
+- **`TaskListPanel` + `TaskItem` + `TaskListContext`** on the frontend render the plan in real time. (@RichardTang-Aden)
+- **Multi-task creation tool** lets Queen stage a whole plan in one call. (@RichardTang-Aden)
+- **Colony task templates** — colonies ship with a default task list that Queen adopts on entry. (@RichardTang-Aden)
+- **Hook + reminder fixes** so Queen reliably uses `task_*` tools instead of skipping them. (@RichardTang-Aden)
+
+### Charts
+
+- **`tools/src/chart_tools/`** — new MCP server with `renderer.py`, `theme.py`, `tools.py`, plus bundled `echarts.min.js` and `mermaid.min.js`. (@TimothyZhang7)
+- **`chart-creation-foundations` skill** teaches Queen when and how to chart. (@TimothyZhang7)
+- **`EChartsBlock` / `MermaidBlock` / `ChartToolDetail`** components render charts inline. (@TimothyZhang7)
+- **OpenHive chart theme** (`openhiveTheme.ts`) keeps chart styling consistent with the rest of the UI. (@TimothyZhang7)
+- **Chart spec normalization** in the renderer fixes Y-axis edge cases and series defaults. (@TimothyZhang7)
+
+### Queen Prompt & Tools Refactor
+
+- **Major file ops refactor** — single rewritten `file_ops` module replaces `apply_diff`, `apply_patch`, `hashline_edit`, `grep_search`, `data_tools`, and the legacy `coder_tools_server`. (@RichardTang-Aden)
+- **Edit-file refactor** with a tighter API surface and ~560 lines of dead `test_file_ops_hashline.py` removed. (@RichardTang-Aden)
+- **Search + list-files consolidation** into one toolkit. (@RichardTang-Aden)
+- **Browser tools audit** — navigation, interactions, lifecycle, and tabs trimmed; `web_scrape` and browser-open merged. (@RichardTang-Aden)
+- **`shell_tools` package** replaces `execute_command_tool` with proper job control, PTY sessions, ring-buffered output, semantic exit codes, and destructive-command warnings. (@TimothyZhang7)
+- **Five new shell preset skills** plus reference docs (`exit_codes.md`, `find_predicates.md`, `ripgrep_cheatsheet.md`, `signals.md`). (@TimothyZhang7)
+- **Old lifecycle tools removed** — `queen_lifecycle_tools.py` lost ~900 lines. (@RichardTang-Aden)
+- **Autocompaction + concurrency tools updated** to play nicely with the new tool registry. (@RichardTang-Aden)
+- **Prompt simplification** — `nodes/__init__.py` dropped redundant `_queen_style` block and tightened phrasing across nodes. (@RichardTang-Aden)
+- **`ToolsEditor` grouping** — frontend tool-config screen now groups tools by category. (@RichardTang-Aden)
+
+### Conversation & Agent Experience
+
+- **`ask_user` questions surface in the chat transcript** instead of vanishing into a side panel, and the question bubble now defers until the user actually answers. (@bryan)
+- **New-session navigation with Queen warm-up UI** — new `queen-routing.tsx` page handles the warm-up so the user sees progress instead of a blank screen. (@bryan)
+- **Sync tool result contentful display** — tool results render as structured cards (charts, file diffs, etc.) instead of raw JSON. (@TimothyZhang7)
+
+### Vision Fallback
+
+- **Vision model retry + fallback** — non-vision models can now route image inputs through a captioning step instead of failing. (@RichardTang-Aden)
+- **Vision fallback with intent** — caption prompts incorporate the user's intent so the caption is task-relevant. (@RichardTang-Aden)
+- **Vision fallback auth** — fallback path now uses the right credentials per provider. (@RichardTang-Aden)
+- **Looser max-token cap** on vision fallback for models that spend output tokens on internal thinking. (@RichardTang-Aden)
+- **Vision fallback model usage logging** for cost visibility. (@RichardTang-Aden)
+
+### Colonies
+
+- **`POST /api/colonies/import`** — onboard a colony from a `tar` / `tar.gz` upload. 50 MB cap, manual path-traversal validation (Python 3.11 compatible), symlinks/hardlinks/devices rejected, mode bits masked. Tests cover happy path, name override, replace flag, traversal, absolute paths, and corrupt archives. (@RichardTang-Aden)
+- **Refactored colony routes** — `routes_colonies.py` gained ~450 lines of structure for import/export/list flows. (@TimothyZhang7)
+
+### MCP & Tools
+
+- **SimilarWeb V5 integration** — 29 new MCP tools covering traffic & engagement, competitor intelligence, keywords/SERP, audience demographics, and segment analysis. Includes credential spec, health checker, README, and tests on Ubuntu and Windows. (#7066)
+- **MCP registry initialization fix** — registry no longer races on first install. (@RichardTang-Aden)
+
+---
+
+## 🐛 Bug Fixes
+
+- **Initial install** path resolution — hardcoded `HIVE_HOME` references replaced; all agent paths now prefixed by the resolved `HIVE_HOME`. (@RichardTang-Aden)
+- **Frontend recovery** after a broken state on session reload. (@RichardTang-Aden)
+- **Compaction issues** when the agent loop runs into the buffer mid-stream. (@RichardTang-Aden)
+- **LiteLLM patch** for a streaming-usage edge case. (@RichardTang-Aden)
+- **`ask_user` question bubble** now defers until the user answers. (@bryan)
+- **Incubating-mode approval guidance** correctly injects into the prompt. (@RichardTang-Aden)
+- **LLM debugger** — fixed timeline order and tool-call display. (@RichardTang-Aden)
+- **Shell split-command** parsing fix. (@TimothyZhang7)
+- **Chart Y-axis** + **chart spec normalization** edge cases. (@TimothyZhang7)
+- **Scroll behavior** on certain element selectors. (@bryan)
+- **CI fixes**: skills `HIVE_HOME` refactor regressions, `run_parallel_workers` losing task text on spawn, `test_capabilities` deprecated model identifiers, `test_colony_runtime_overseer` Windows flake. (#7141, #7149)
+- **Orphan Zoho CRM test directory** removed under `src/` after the MCP refactor. (#7142)
+- **Credentials** — `EnvVarStorage.exists` now matches `load` semantics for empty values. (#5680)
+
+---
+
+## 🚀 Upgrading from v0.10.5
+
+No migration required. Pull `main` at `v0.11.0` and restart Hive — existing `~/.hive/` profiles, queens, colonies, and sessions keep working.
+
+A few things to know:
+
+1. **Queen's default tool surface changed.** If you have a queen profile pinned to a removed tool (e.g. `apply_diff`, `apply_patch`, `hashline_edit`, `grep_search`, the old `execute_command_tool`), it'll fall back to the consolidated replacements. Custom profiles referencing those tool names should be updated.
+2. **Old `queen_lifecycle_tools` entries are gone.** If you wired any external code against those defaults, switch to the new task system.
+3. **Task plan is now persistent.** Queen will start staging a plan automatically on new sessions — if you don't want the panel, you can collapse it from the layout.
+
+Plan the work. Chart the result. 🐝
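
The release notes above describe `POST /api/colonies/import` as enforcing a 50 MB cap, manual path-traversal validation, rejection of symlinks/hardlinks/devices, and masked mode bits. The snippet below is a minimal sketch of what such checks could look like using only the standard-library `tarfile` module. It is not the code from `routes_colonies.py`: the function name `validate_members`, the constant `MAX_ARCHIVE_BYTES`, and the `0o755` mask are assumptions made for illustration.

```python
import io
import tarfile
from pathlib import PurePosixPath

# Hypothetical constant; the 50 MB figure comes from the release notes.
MAX_ARCHIVE_BYTES = 50 * 1024 * 1024


def validate_members(archive_bytes: bytes) -> list[tarfile.TarInfo]:
    """Return the members of a tar / tar.gz upload, or raise ValueError.

    Illustrative only: the checks mirror the ones listed in the release
    notes, but the real implementation may differ.
    """
    if len(archive_bytes) > MAX_ARCHIVE_BYTES:
        raise ValueError("archive exceeds the 50 MB cap")

    try:
        # "r:*" lets tarfile detect plain tar vs. gzip-compressed tar.
        tar = tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*")
    except tarfile.TarError as exc:
        raise ValueError(f"corrupt archive: {exc}") from exc

    safe: list[tarfile.TarInfo] = []
    with tar:
        for member in tar.getmembers():
            path = PurePosixPath(member.name)
            # Manual traversal check: no absolute paths, no ".." components.
            if path.is_absolute() or ".." in path.parts:
                raise ValueError(f"unsafe path in archive: {member.name}")
            # Only regular files and directories are allowed; symlinks,
            # hardlinks, and device nodes are rejected.
            if not (member.isfile() or member.isdir()):
                raise ValueError(f"unsupported member type: {member.name}")
            # Assumed mask: drop setuid/setgid/sticky and group/other write.
            member.mode &= 0o755
            safe.append(member)
    return safe
```

A caller would run this on the uploaded bytes before extracting anything, so that a rejected archive never touches the disk. The sketch validates only POSIX-style member names; a production version would likely also need to consider Windows path separators and per-file size limits.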