Merge branch 'main' into feature/hive-experimental-comp-pipeline

Timothy
2026-04-07 18:49:14 -07:00
88 changed files with 5073 additions and 2659 deletions
+16
@@ -333,6 +333,22 @@ make test-live # Run live API integration tests (requires credentials)
- **WebSocket** for real-time updates
- **Tailwind CSS** for styling
### Frontend Dev Workflow
> **Note:** `./quickstart.sh` handles the full setup including the web UI.
> The commands below are for contributors iterating on the frontend code after
> initial setup is complete.
```bash
# Start the backend server
hive serve
# In a separate terminal, run the frontend dev server with hot-reload
cd core/frontend
npm install # only needed after dependency changes
npm run dev
```
### Useful Development Commands
```bash
+1 -13
@@ -51,7 +51,7 @@ https://github.com/user-attachments/assets/bf10edc3-06ba-48b6-98ba-d069b15fb69d
## Who Is Hive For?
Hive is the harness layer for teams moving AI agents from prototype to production. Models are getting better on their own — the bottleneck is the infrastructure around them: state management, failure recovery, cost control, and observability.
Hive is the multi-agent harness layer for teams moving AI agents from prototype to production. Single agents like Openclaw and Cowork handle personal tasks well enough, but they lack the rigor to run business processes.
Hive is a good fit if you:
@@ -194,18 +194,6 @@ flowchart LR
style V6 fill:#fff,stroke:#ed8c00,stroke-width:1px,color:#cc5d00
```
### The Hive Advantage
| Typical Agent Frameworks | Hive |
| -------------------------- | -------------------------------------- |
| Focus on model orchestration | **Production harness**: state, recovery, observability |
| Hardcode agent workflows | Describe goals in natural language |
| Manual graph definition | Auto-generated agent graphs |
| Reactive error handling | Outcome evaluation and adaptive recovery |
| Static tool configurations | Dynamic SDK-wrapped nodes |
| Separate monitoring setup | Built-in real-time observability |
| DIY budget management | Integrated cost controls & degradation |
### How It Works
1. **[Define Your Goal](docs/key_concepts/goals_outcome.md)** → Describe what you want to achieve in plain English
+4 -34
@@ -83,7 +83,6 @@ _QUEEN_PLANNING_TOOLS = [
# Scaffold + transition to building (requires confirm_and_build first)
# Load existing agent (after user confirms)
"load_built_agent",
"save_global_memory",
]
# Building phase: full coding + agent construction tools.
@@ -92,7 +91,6 @@ _QUEEN_BUILDING_TOOLS = _SHARED_TOOLS + [
"list_credentials",
"replan_agent",
"save_agent_draft", # Re-draft during building → auto-dissolves + updates flowchart
"save_global_memory",
]
# Staging phase: agent loaded but not yet running — inspect, configure, launch.
@@ -112,7 +110,6 @@ _QUEEN_STAGING_TOOLS = [
"set_trigger",
"remove_trigger",
"list_triggers",
"save_global_memory",
]
# Running phase: worker is executing — monitor, control, or switch to editing.
@@ -136,7 +133,6 @@ _QUEEN_RUNNING_TOOLS = [
"set_trigger",
"remove_trigger",
"list_triggers",
"save_global_memory",
]
# Editing phase: worker done, still loaded — tweak config and re-run.
@@ -159,7 +155,6 @@ _QUEEN_EDITING_TOOLS = [
"set_trigger",
"remove_trigger",
"list_triggers",
"save_global_memory",
]
@@ -637,8 +632,6 @@ to fix the currently loaded agent (no draft required).
- load_built_agent(agent_path) Load an existing agent and switch to STAGING \
phase. Only use this when the user explicitly asks to work with an existing agent \
(e.g. "load my_agent", "run the research agent"). Confirm with the user first.
- save_global_memory(category, description, content, name?) Save durable \
cross-queen memory about the user only (profile, preferences, environment, feedback)
## Workflow summary
1. Understand requirements → discover tools → design graph
@@ -670,8 +663,6 @@ updated flowchart immediately. Use this when you make structural changes \
restored (with decision/browser nodes intact) so you can edit it. Use \
when the user wants to change integrations, swap tools, rethink the \
flow, or discuss any design changes before you build them.
- save_global_memory(category, description, content, name?) Save durable \
cross-queen memory about the user only
When you finish building an agent, call load_built_agent(path) to stage it.
"""
@@ -685,8 +676,6 @@ The agent is loaded and ready to run. You can inspect it and launch it:
- get_graph_status(focus?) Brief status
- run_agent_with_input(task) Start the worker and switch to RUNNING phase
- set_trigger / remove_trigger / list_triggers Timer management
- save_global_memory(category, description, content, name?) Save \
durable cross-queen memory about the user only
You do NOT have write tools or backward transition tools in staging. \
To modify the agent, run it first; after it finishes you enter EDITING \
@@ -706,8 +695,6 @@ The worker is running. You have monitoring and lifecycle tools:
for config tweaks, re-runs, or escalation to building/planning
- run_agent_with_input(task) Re-run the worker with new input
- set_trigger / remove_trigger / list_triggers Timer management
- save_global_memory(category, description, content, name?) Save \
durable cross-queen memory about the user only
When the worker finishes on its own, you automatically move to EDITING \
phase. You can also call switch_to_editing() to stop early and tweak.
@@ -723,7 +710,6 @@ The worker has finished executing and is still loaded. You can tweak and re-run:
- run_agent_with_input(task) Re-run the worker with new input
- get_worker_health_summary() Review last run's health data
- set_trigger / remove_trigger / list_triggers Timer management
- save_global_memory Save durable cross-queen memory
You do NOT have write/edit file tools or backward transition tools. \
You can only re-run or tweak from this phase.
@@ -925,26 +911,10 @@ diagnosis mode — you already have a built agent, you just need to fix it.
_queen_memory_instructions = """
## Your Memory
Relevant colony memories from this queen session may appear in context under \
"--- Colony Memories ---". Relevant global user memories may appear under \
"--- Global Memories ---".
Colony memories are shared with the worker for this queen session. Use them \
for continuity about what this user is trying to do, what has worked, and \
what the colony has learned together.
Global memories are shared across queens and are only for durable knowledge \
about the user: who they are, their preferences, their environment, and \
their feedback.
Memories older than 1 day include a staleness warning. Treat these as \
point-in-time observations; verify current details before asserting them \
as fact.
You do NOT need to manually save or recall colony memories. A background \
reflection agent automatically extracts colony learnings from each \
conversation turn. Use `save_global_memory` only when you learn something \
durable about the user that should help future queens.
Relevant global memories about the user may appear at the end of this prompt \
under "--- Global Memories ---". These are automatically maintained across \
sessions. Use them to inform your responses but verify stale claims before \
asserting them as fact.
"""
_queen_behavior_always = _queen_behavior_always + _queen_memory_instructions
-420
@@ -1,420 +0,0 @@
"""Queen global cross-session memory.
Three-tier memory architecture:
~/.hive/queen/MEMORY.md → semantic (who, what, why)
~/.hive/queen/memories/MEMORY-YYYY-MM-DD.md → episodic (daily journals)
~/.hive/queen/session/{id}/data/adapt.md → working (session-scoped)
Semantic and episodic files are injected at queen session start.
Semantic memory (MEMORY.md) is updated automatically at session end via
consolidate_queen_memory(); the queen never rewrites this herself.
Episodic memory (MEMORY-date.md) can be written by the queen during a session
via the write_to_diary tool, and is also appended to at session end by
consolidate_queen_memory().
"""
from __future__ import annotations
import asyncio
import json
import logging
import traceback
from datetime import date, datetime
from pathlib import Path
logger = logging.getLogger(__name__)
def _queen_dir() -> Path:
return Path.home() / ".hive" / "queen"
def format_memory_date(d: date) -> str:
"""Return a cross-platform long date label without a zero-padded day."""
return f"{d.strftime('%B')} {d.day}, {d.year}"
def semantic_memory_path() -> Path:
return _queen_dir() / "MEMORY.md"
def episodic_memory_path(d: date | None = None) -> Path:
d = d or date.today()
return _queen_dir() / "memories" / f"MEMORY-{d.strftime('%Y-%m-%d')}.md"
def read_semantic_memory() -> str:
path = semantic_memory_path()
return path.read_text(encoding="utf-8").strip() if path.exists() else ""
def read_episodic_memory(d: date | None = None) -> str:
path = episodic_memory_path(d)
return path.read_text(encoding="utf-8").strip() if path.exists() else ""
def _find_recent_episodic(lookback: int = 7) -> tuple[date, str] | None:
"""Find the most recent non-empty episodic memory within *lookback* days."""
from datetime import timedelta
today = date.today()
for offset in range(lookback):
d = today - timedelta(days=offset)
content = read_episodic_memory(d)
if content:
return d, content
return None
# Budget (in characters) for episodic memory in the system prompt.
_EPISODIC_CHAR_BUDGET = 6_000
def format_for_injection() -> str:
"""Format cross-session memory for system prompt injection.
Returns an empty string if no meaningful content exists yet (e.g. first
session with only the seed template).
"""
semantic = read_semantic_memory()
recent = _find_recent_episodic()
# Suppress injection if semantic is still just the seed template
if semantic and semantic.startswith("# My Understanding of the User\n\n*No sessions"):
semantic = ""
parts: list[str] = []
if semantic:
parts.append(semantic)
if recent:
d, content = recent
# Trim oversized episodic entries to keep the prompt manageable
if len(content) > _EPISODIC_CHAR_BUDGET:
content = content[:_EPISODIC_CHAR_BUDGET] + "\n\n…(truncated)"
today = date.today()
if d == today:
label = f"## Today — {format_memory_date(d)}"
else:
label = f"## {format_memory_date(d)}"
parts.append(f"{label}\n\n{content}")
if not parts:
return ""
body = "\n\n---\n\n".join(parts)
return "--- Your Cross-Session Memory ---\n\n" + body + "\n\n--- End Cross-Session Memory ---"
_SEED_TEMPLATE = """\
# My Understanding of the User
*No sessions recorded yet.*
## Who They Are
## How They Communicate
## What They're Trying to Achieve
## What's Working
## What I've Learned
"""
def append_episodic_entry(content: str) -> None:
"""Append a timestamped prose entry to today's episodic memory file.
Creates the file (with a date heading) if it doesn't exist yet.
Used both by the queen's diary tool and by the consolidation hook.
"""
ep_path = episodic_memory_path()
ep_path.parent.mkdir(parents=True, exist_ok=True)
today = date.today()
today_str = format_memory_date(today)
timestamp = datetime.now().strftime("%H:%M")
if not ep_path.exists():
header = f"# {today_str}\n\n"
block = f"{header}### {timestamp}\n\n{content.strip()}\n"
else:
block = f"\n\n### {timestamp}\n\n{content.strip()}\n"
with ep_path.open("a", encoding="utf-8") as f:
f.write(block)
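# Demo sketch (hypothetical helper, content illustrative): successive calls
# on the same day append further "### HH:MM" blocks under the single date
# heading created on first write.
def _demo_append_episodic_entry() -> None:
    append_episodic_entry(
        "Helped scaffold a Gmail triage agent; the user preferred a direct "
        "recommendation over a list of options."
    )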
def seed_if_missing() -> None:
"""Create MEMORY.md with a blank template if it doesn't exist yet."""
path = semantic_memory_path()
if path.exists():
return
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(_SEED_TEMPLATE, encoding="utf-8")
# ---------------------------------------------------------------------------
# Consolidation prompt
# ---------------------------------------------------------------------------
_SEMANTIC_SYSTEM = """\
You maintain the persistent cross-session memory of an AI assistant called the Queen.
Review the session notes and rewrite MEMORY.md: the Queen's durable understanding of the
person she works with across all sessions.
Write entirely in the Queen's voice — first person, reflective, honest.
Not a log of events, but genuine understanding of who this person is over time.
Rules:
- Update and synthesise: incorporate new understanding, update facts that have changed, remove
details that are stale, superseded, or no longer say anything meaningful about the person.
- Keep it as structured markdown with named sections about the PERSON, not about today.
- Do NOT include diary sections, daily logs, or session summaries. Those belong elsewhere.
MEMORY.md is about who they are, what they want, what works, not what happened today.
- Maintain a "How They Communicate" section: technical depth, preferred pace
(fast/exploratory/thorough), what communication approaches have worked or not,
tone preferences. Update based on diary reflections about communication.
This section should evolve: "prefers direct answers" is useful on day 1;
"prefers direct answers for technical questions but wants more context when
discussing architecture trade-offs" is better by day 5.
- Reference dates only when noting a lasting milestone (e.g. "since March 8th they prefer X").
- If the session had no meaningful new information about the person,
return the existing text unchanged.
- Do not add fictional details. Only reflect what is evidenced in the notes.
- Stay concise. Prune rather than accumulate. A lean, accurate file is more useful than a
dense one. If something was true once but has been resolved or superseded, remove it.
- Output only the raw markdown content of MEMORY.md. No preamble, no code fences.
"""
_DIARY_SYSTEM = """\
You maintain the daily episodic diary of an AI assistant called the Queen.
You receive: (1) today's existing diary so far, and (2) notes from the latest session.
Rewrite the complete diary for today as a single unified narrative
first person, reflective, honest.
Merge and deduplicate: if the same story (e.g. a research agent stalling) recurred several times,
describe it once with appropriate weight rather than retelling it. Weave in new developments from
the session notes. Preserve important milestones, emotional texture, and session path references.
Preserve reflections about communication effectiveness; these are important inputs for the
Queen's evolving understanding of the user. A reflection like "they responded much better when
I led with the recommendation instead of listing options" is as important as
"we built a Gmail agent."
If today's diary is empty, write the initial entry based on the session notes alone.
Output only the full diary prose: no date heading, no timestamp headers,
no preamble, no code fences.
"""
def read_session_context(session_dir: Path, max_messages: int = 80) -> str:
"""Extract a readable transcript from conversation parts + adapt.md.
Reads the last ``max_messages`` conversation parts and the session's
adapt.md (working memory). Tool results are omitted; only user and
assistant turns (with tool-call names noted) are included.
"""
parts: list[str] = []
# Working notes
adapt_path = session_dir / "data" / "adapt.md"
if adapt_path.exists():
text = adapt_path.read_text(encoding="utf-8").strip()
if text:
parts.append(f"## Session Working Notes (adapt.md)\n\n{text}")
# Conversation transcript
parts_dir = session_dir / "conversations" / "parts"
if parts_dir.exists():
part_files = sorted(parts_dir.glob("*.json"))[-max_messages:]
lines: list[str] = []
for pf in part_files:
try:
data = json.loads(pf.read_text(encoding="utf-8"))
role = data.get("role", "")
content = str(data.get("content", "")).strip()
tool_calls = data.get("tool_calls") or []
if role == "tool":
continue # skip verbose tool results
if role == "assistant" and tool_calls and not content:
names = [tc.get("function", {}).get("name", "?") for tc in tool_calls]
lines.append(f"[queen calls: {', '.join(names)}]")
elif content:
label = "user" if role == "user" else "queen"
lines.append(f"[{label}]: {content[:600]}")
except (KeyError, TypeError) as exc:
logger.debug("Skipping malformed conversation message: %s", exc)
continue
except Exception:
logger.warning("Unexpected error parsing conversation message", exc_info=True)
continue
if lines:
parts.append("## Conversation\n\n" + "\n".join(lines))
return "\n\n".join(parts)
# ---------------------------------------------------------------------------
# Context compaction (binary-split LLM summarisation)
# ---------------------------------------------------------------------------
# If the raw session context exceeds this many characters, compact it first
# before sending to the consolidation LLM. ~200 k chars ≈ 50 k tokens.
_CTX_COMPACT_CHAR_LIMIT = 200_000
_CTX_COMPACT_MAX_DEPTH = 8
_COMPACT_SYSTEM = (
"Summarise this conversation segment. Preserve: user goals, key decisions, "
"what was built or changed, emotional tone, and important outcomes. "
"Write concisely in third person past tense. Omit routine tool invocations "
"unless the result matters."
)
async def _compact_context(text: str, llm: object, *, _depth: int = 0) -> str:
"""Binary-split and LLM-summarise *text* until it fits within the char limit.
Mirrors the recursive binary-splitting strategy used by the main agent
compaction pipeline (EventLoopNode._llm_compact).
"""
if len(text) <= _CTX_COMPACT_CHAR_LIMIT or _depth >= _CTX_COMPACT_MAX_DEPTH:
return text
# Split near the midpoint on a line boundary so we don't cut mid-message
mid = len(text) // 2
split_at = text.rfind("\n", 0, mid) + 1
if split_at <= 0:
split_at = mid
half1, half2 = text[:split_at], text[split_at:]
async def _summarise(chunk: str) -> str:
try:
resp = await llm.acomplete(
messages=[{"role": "user", "content": chunk}],
system=_COMPACT_SYSTEM,
max_tokens=2048,
)
return resp.content.strip()
except Exception:
logger.warning(
"queen_memory: context compaction LLM call failed (depth=%d), truncating",
_depth,
)
return chunk[: _CTX_COMPACT_CHAR_LIMIT // 4]
s1, s2 = await asyncio.gather(_summarise(half1), _summarise(half2))
combined = s1 + "\n\n" + s2
if len(combined) > _CTX_COMPACT_CHAR_LIMIT:
return await _compact_context(combined, llm, _depth=_depth + 1)
return combined
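# Usage sketch (hypothetical ``llm`` provider exposing ``acomplete``): an
# oversized transcript is halved and summarised recursively until it fits
# the character budget, unless the depth cap is reached first.
async def _demo_compact_context(llm: object) -> None:
    transcript = "\n".join(f"[user]: message {i}" for i in range(100_000))
    compacted = await _compact_context(transcript, llm)
    logger.info("compacted %d -> %d chars", len(transcript), len(compacted))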
async def consolidate_queen_memory(
session_id: str,
session_dir: Path,
llm: object,
) -> None:
"""Update MEMORY.md and append a diary entry based on the current session.
Reads conversation parts and adapt.md from session_dir. Called
periodically in the background and once at session end. Failures are
logged and silently swallowed so they never block teardown.
Args:
session_id: The session ID (used for the adapt.md path reference).
session_dir: Path to the session directory (~/.hive/queen/session/{id}).
llm: LLMProvider instance (must support acomplete()).
"""
try:
session_context = read_session_context(session_dir)
if not session_context:
logger.debug("queen_memory: no session context, skipping consolidation")
return
logger.info("queen_memory: consolidating memory for session %s ...", session_id)
# If the transcript is very large, compact it with recursive binary LLM
# summarisation before sending to the consolidation model.
if len(session_context) > _CTX_COMPACT_CHAR_LIMIT:
logger.info(
"queen_memory: session context is %d chars — compacting first",
len(session_context),
)
session_context = await _compact_context(session_context, llm)
logger.info("queen_memory: compacted to %d chars", len(session_context))
existing_semantic = read_semantic_memory()
today_journal = read_episodic_memory()
today = date.today()
today_str = format_memory_date(today)
adapt_path = session_dir / "data" / "adapt.md"
user_msg = (
f"## Existing Semantic Memory (MEMORY.md)\n\n"
f"{existing_semantic or '(none yet)'}\n\n"
f"## Today's Diary So Far ({today_str})\n\n"
f"{today_journal or '(none yet)'}\n\n"
f"{session_context}\n\n"
f"## Session Reference\n\n"
f"Session ID: {session_id}\n"
f"Session path: {adapt_path}\n"
)
logger.debug(
"queen_memory: calling LLM (%d chars of context, ~%d tokens est.)",
len(user_msg),
len(user_msg) // 4,
)
from framework.agents.queen.config import default_config
semantic_resp, diary_resp = await asyncio.gather(
llm.acomplete(
messages=[{"role": "user", "content": user_msg}],
system=_SEMANTIC_SYSTEM,
max_tokens=default_config.max_tokens,
),
llm.acomplete(
messages=[{"role": "user", "content": user_msg}],
system=_DIARY_SYSTEM,
max_tokens=default_config.max_tokens,
),
)
new_semantic = semantic_resp.content.strip()
diary_entry = diary_resp.content.strip()
if new_semantic:
path = semantic_memory_path()
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(new_semantic, encoding="utf-8")
logger.info("queen_memory: semantic memory updated (%d chars)", len(new_semantic))
if diary_entry:
# Rewrite today's episodic file in-place — the LLM has merged and
# deduplicated the full day's content, so we replace rather than append.
ep_path = episodic_memory_path()
ep_path.parent.mkdir(parents=True, exist_ok=True)
heading = f"# {today_str}"
ep_path.write_text(f"{heading}\n\n{diary_entry}\n", encoding="utf-8")
logger.info(
"queen_memory: episodic diary rewritten for %s (%d chars)",
today_str,
len(diary_entry),
)
except Exception:
tb = traceback.format_exc()
logger.exception("queen_memory: consolidation failed")
# Write to file so the cause is findable regardless of log verbosity.
error_path = _queen_dir() / "consolidation_error.txt"
try:
error_path.parent.mkdir(parents=True, exist_ok=True)
error_path.write_text(
f"session: {session_id}\ntime: {datetime.now().isoformat()}\n\n{tb}",
encoding="utf-8",
)
except OSError:
pass # Cannot write error file; original exception already logged
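# Invocation sketch (hypothetical session id; the real caller passes the
# live session directory and an LLMProvider instance):
async def _demo_consolidate(llm: object) -> None:
    session_id = "abc123"
    session_dir = Path.home() / ".hive" / "queen" / "session" / session_id
    await consolidate_queen_memory(session_id, session_dir, llm)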
+14 -341
@@ -1,24 +1,17 @@
"""Shared memory helpers for queen/worker recall and reflection.
"""Queen global memory helpers.
Each memory is an individual ``.md`` file in ``~/.hive/queen/memories/``
with optional YAML frontmatter (name, type, description). Frontmatter
is a convention enforced by prompt instructions; parsing is lenient and
malformed files degrade gracefully (appear in scans with ``None`` metadata).
Cursor-based incremental processing tracks which conversation messages
have already been processed by the reflection agent.
Global memory lives in ``~/.hive/queen/global_memory/`` and stores durable
cross-session knowledge about the user (profile, preferences, environment,
feedback). Each memory is an individual ``.md`` file with optional YAML
frontmatter (name, type, description).
"""
from __future__ import annotations
import logging
import re
import shutil
import time
from dataclasses import dataclass, field
from datetime import date
from pathlib import Path
from typing import Any
logger = logging.getLogger(__name__)
@@ -26,49 +19,15 @@ logger = logging.getLogger(__name__)
# Constants
# ---------------------------------------------------------------------------
MEMORY_TYPES: tuple[str, ...] = ("goal", "environment", "technique", "reference", "diary")
GLOBAL_MEMORY_CATEGORIES: tuple[str, ...] = ("profile", "preference", "environment", "feedback")
_HIVE_QUEEN_DIR = Path.home() / ".hive" / "queen"
# Legacy shared v2 root. Colony memory now lives under queen sessions.
MEMORY_DIR: Path = _HIVE_QUEEN_DIR / "memories"
MAX_FILES: int = 200
MAX_FILE_SIZE_BYTES: int = 4096 # 4 KB hard limit per memory file
# How many lines of a memory file to read for header scanning.
_HEADER_LINE_LIMIT: int = 30
_MIGRATION_MARKER = ".migrated-from-shared-memory"
_GLOBAL_MEMORY_CODE_PATTERN = re.compile(
r"(/Users/|~/.hive|\.py\b|\.ts\b|\.tsx\b|\.js\b|"
r"\b(graph|node|runtime|session|execution|worker|queen|checkpoint|flowchart)\b)",
re.IGNORECASE,
)
# Frontmatter example provided to the reflection agent via prompt.
MEMORY_FRONTMATTER_EXAMPLE: list[str] = [
"```markdown",
"---",
"name: {{memory name}}",
(
"description: {{one-line description — used to decide "
"relevance in future conversations, so be specific}}"
),
f"type: {{{{{', '.join(MEMORY_TYPES)}}}}}",
"---",
"",
(
"{{memory content — for feedback/project types, "
"structure as: rule/fact, then **Why:** "
"and **How to apply:** lines}}"
),
"```",
]
def colony_memory_dir(colony_id: str) -> Path:
"""Return the colony memory directory for a queen session."""
return _HIVE_QUEEN_DIR / "session" / colony_id / "memory" / "colony"
def global_memory_dir() -> Path:
@@ -107,15 +66,6 @@ def parse_frontmatter(text: str) -> dict[str, str]:
return result
def parse_memory_type(raw: str | None) -> str | None:
"""Validate *raw* against supported memory categories."""
if raw is None:
return None
normalized = raw.strip().lower()
allowed = set(MEMORY_TYPES) | set(GLOBAL_MEMORY_CATEGORIES)
return normalized if normalized in allowed else None
def parse_global_memory_category(raw: str | None) -> str | None:
"""Validate *raw* against ``GLOBAL_MEMORY_CATEGORIES``."""
if raw is None:
@@ -164,7 +114,7 @@ class MemoryFile:
filename=path.name,
path=path,
name=fm.get("name"),
type=parse_memory_type(fm.get("type")),
type=parse_global_memory_category(fm.get("type")),
description=fm.get("description"),
header_lines=lines,
mtime=mtime,
@@ -182,7 +132,7 @@ def scan_memory_files(memory_dir: Path | None = None) -> list[MemoryFile]:
Files are sorted by modification time (newest first). Dotfiles and
subdirectories are ignored.
"""
d = memory_dir or MEMORY_DIR
d = memory_dir or global_memory_dir()
if not d.is_dir():
return []
@@ -235,307 +185,30 @@ def build_memory_document(
)
def diary_filename(d: date | None = None) -> str:
"""Return the diary memory filename for date *d* (default: today)."""
d = d or date.today()
return f"MEMORY-{d.strftime('%Y-%m-%d')}.md"
def build_diary_document(*, date_str: str, body: str) -> str:
"""Build a diary memory file with frontmatter."""
return build_memory_document(
name=f"diary-{date_str}",
description=f"Daily session narrative for {date_str}",
mem_type="diary",
body=body,
)
def validate_global_memory_payload(
*,
category: str,
description: str,
content: str,
) -> str:
"""Validate a queen-global memory save request."""
parsed = parse_global_memory_category(category)
if parsed is None:
raise ValueError(
"Invalid global memory category. Use one of: " + ", ".join(GLOBAL_MEMORY_CATEGORIES)
)
if not description.strip():
raise ValueError("Global memory description cannot be empty.")
if not content.strip():
raise ValueError("Global memory content cannot be empty.")
probe = f"{description}\n{content}"
if _GLOBAL_MEMORY_CODE_PATTERN.search(probe):
raise ValueError(
"Global memory is only for durable user profile, preferences, "
"environment, or feedback — not task/code/runtime details."
)
return parsed
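# Behaviour sketch (hypothetical demo, illustrative payloads): a durable
# preference passes, while content matching the code/runtime pattern above
# is rejected.
def _demo_validate_global_memory() -> None:
    category = validate_global_memory_payload(
        category="preference",
        description="Prefers concise answers",
        content="Asks for short, direct replies with one recommendation.",
    )
    assert category == "preference"
    try:
        validate_global_memory_payload(
            category="environment",
            description="Project layout",
            content="Main graph lives in core/runtime/graph.py",
        )
    except ValueError:
        pass  # expected: looks like task/code detail, not user knowledge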
def save_global_memory(
*,
category: str,
description: str,
content: str,
name: str | None = None,
memory_dir: Path | None = None,
) -> tuple[str, Path]:
"""Persist one queen-global memory entry."""
parsed = validate_global_memory_payload(
category=category,
description=description,
content=content,
)
target_dir = memory_dir or global_memory_dir()
target_dir.mkdir(parents=True, exist_ok=True)
memory_name = (name or description).strip()
filename = allocate_memory_filename(target_dir, memory_name)
doc = build_memory_document(
name=memory_name,
description=description,
mem_type=parsed,
body=content,
)
if len(doc.encode("utf-8")) > MAX_FILE_SIZE_BYTES:
raise ValueError(f"Global memory entry exceeds the {MAX_FILE_SIZE_BYTES} byte limit.")
path = target_dir / filename
path.write_text(doc, encoding="utf-8")
return filename, path
# ---------------------------------------------------------------------------
# Manifest formatting
# ---------------------------------------------------------------------------
def _age_label(mtime: float) -> str:
"""Human-readable age string from an mtime."""
age_days = memory_age_days(mtime)
if age_days <= 0:
return "today"
if age_days == 1:
return "1 day ago"
return f"{age_days} days ago"
def format_memory_manifest(files: list[MemoryFile]) -> str:
"""One-line-per-file text manifest for the recall selector / reflection agent.
"""One-line-per-file text manifest.
Format: ``[type] filename (age): description``
Format: ``[type] filename: description``
"""
lines: list[str] = []
for mf in files:
t = mf.type or "unknown"
desc = mf.description or "(no description)"
age = _age_label(mf.mtime)
lines.append(f"[{t}] {mf.filename} ({age}): {desc}")
lines.append(f"[{t}] {mf.filename}: {desc}")
return "\n".join(lines)
# ---------------------------------------------------------------------------
# Freshness / staleness
# ---------------------------------------------------------------------------
_SECONDS_PER_DAY = 86_400
def memory_age_days(mtime: float) -> int:
"""Return the age of a memory file in whole days."""
if mtime <= 0:
return 0
return int((time.time() - mtime) / _SECONDS_PER_DAY)
def memory_freshness_text(mtime: float) -> str:
"""Return a staleness warning for injection, or empty string if fresh."""
d = memory_age_days(mtime)
if d <= 1:
return ""
return (
f"This memory is {d} days old. "
"Memories are point-in-time observations, not live state — "
"claims about code behavior or file:line citations may be outdated. "
"Verify against current code before asserting as fact."
)
# ---------------------------------------------------------------------------
# Cursor-based incremental processing
# Initialisation
# ---------------------------------------------------------------------------
async def read_conversation_parts(session_dir: Path) -> list[dict[str, Any]]:
"""Read all conversation parts for a session using FileConversationStore.
Returns a list of raw message dicts in sequence order.
"""
from framework.storage.conversation_store import FileConversationStore
store = FileConversationStore(session_dir / "conversations")
return await store.read_parts()
# ---------------------------------------------------------------------------
# Initialisation and legacy migration
# ---------------------------------------------------------------------------
def init_memory_dir(
memory_dir: Path | None = None,
*,
migrate_legacy: bool = False,
) -> None:
"""Create the memory directory if missing.
When ``migrate_legacy`` is true, migrate both v1 memory files and the
previous shared v2 queen memory store into this directory.
"""
d = memory_dir or MEMORY_DIR
first_run = not d.exists()
def init_memory_dir(memory_dir: Path | None = None) -> None:
"""Create the memory directory if missing."""
d = memory_dir or global_memory_dir()
d.mkdir(parents=True, exist_ok=True)
if migrate_legacy:
migrate_legacy_memories(d)
migrate_shared_v2_memories(d)
elif first_run and d == MEMORY_DIR:
migrate_legacy_memories(d)
def migrate_legacy_memories(memory_dir: Path | None = None) -> None:
"""Convert old MEMORY.md + MEMORY-YYYY-MM-DD.md files to individual memory files.
Originals are moved to ``{memory_dir}/.legacy/``.
"""
d = memory_dir or MEMORY_DIR
queen_dir = _HIVE_QUEEN_DIR
legacy_archive = d / ".legacy"
migrated_any = False
# --- Semantic memory (MEMORY.md) ---
semantic = queen_dir / "MEMORY.md"
if semantic.exists():
content = semantic.read_text(encoding="utf-8").strip()
# Skip the blank seed template.
if content and not content.startswith("# My Understanding of the User\n\n*No sessions"):
_write_migration_file(
d,
filename="legacy-semantic-memory.md",
name="legacy-semantic-memory",
mem_type="reference",
description="Migrated semantic memory from previous memory system",
body=content,
)
migrated_any = True
# Archive original.
legacy_archive.mkdir(parents=True, exist_ok=True)
semantic.rename(legacy_archive / "MEMORY.md")
# --- Episodic memories (MEMORY-YYYY-MM-DD.md) ---
old_memories_dir = queen_dir / "memories"
if old_memories_dir.is_dir():
for ep_file in sorted(old_memories_dir.glob("MEMORY-*.md")):
content = ep_file.read_text(encoding="utf-8").strip()
if not content:
continue
date_part = ep_file.stem.replace("MEMORY-", "")
slug = f"legacy-diary-{date_part}.md"
_write_migration_file(
d,
filename=slug,
name=f"legacy-diary-{date_part}",
mem_type="diary",
description=f"Migrated diary entry from {date_part}",
body=content,
)
migrated_any = True
# Archive original.
legacy_archive.mkdir(parents=True, exist_ok=True)
ep_file.rename(legacy_archive / ep_file.name)
if migrated_any:
logger.info("queen_memory_v2: migrated legacy memory files to %s", d)
def migrate_shared_v2_memories(
memory_dir: Path | None = None,
*,
source_dir: Path | None = None,
) -> None:
"""Move shared queen v2 memory files into a colony directory once."""
d = memory_dir or MEMORY_DIR
d.mkdir(parents=True, exist_ok=True)
src = source_dir or MEMORY_DIR
if d.resolve() == src.resolve():
return
marker = d / _MIGRATION_MARKER
if marker.exists():
return
if not src.is_dir():
return
md_files = sorted(f for f in src.glob("*.md") if f.is_file() and not f.name.startswith("."))
if not md_files:
marker.write_text("no shared memories found\n", encoding="utf-8")
return
archive = src / ".legacy_colony_migration"
archive.mkdir(parents=True, exist_ok=True)
migrated_any = False
for src_file in md_files:
target = d / src_file.name
if not target.exists():
try:
shutil.copy2(src_file, target)
migrated_any = True
except OSError:
logger.debug("shared memory migration copy failed for %s", src_file, exc_info=True)
continue
archived = archive / src_file.name
counter = 2
while archived.exists():
archived = archive / f"{src_file.stem}-{counter}{src_file.suffix}"
counter += 1
try:
src_file.rename(archived)
except OSError:
logger.debug("shared memory migration archive failed for %s", src_file, exc_info=True)
if migrated_any:
logger.info("queen_memory_v2: migrated shared queen memories to %s", d)
marker.write_text(
f"migrated_at={int(time.time())}\nsource={src}\n",
encoding="utf-8",
)
def _write_migration_file(
memory_dir: Path,
filename: str,
name: str,
mem_type: str,
description: str,
body: str,
) -> None:
"""Write a single migrated memory file with frontmatter."""
# Truncate body to respect file size limit (leave room for frontmatter).
header = f"---\nname: {name}\ndescription: {description}\ntype: {mem_type}\n---\n\n"
max_body = MAX_FILE_SIZE_BYTES - len(header.encode("utf-8"))
if len(body.encode("utf-8")) > max_body:
# Rough truncation — cut at character level then trim to last newline.
body = body[: max_body - 20]
nl = body.rfind("\n")
if nl > 0:
body = body[:nl]
body += "\n\n...(truncated during migration)"
path = memory_dir / filename
path.write_text(header + body + "\n", encoding="utf-8")
+24 -131
@@ -1,11 +1,11 @@
"""Recall selector — pre-turn memory selection for queen and worker memory.
"""Recall selector — pre-turn global memory selection for the queen.
Before each conversation turn the system:
1. Scans the memory directory for ``.md`` files (cap: 200).
1. Scans the global memory directory for ``.md`` files (cap: 200).
2. Reads headers (frontmatter + first 30 lines).
3. Uses a single LLM call with structured JSON output to pick the ~5
most relevant memories.
4. Injects them into context with staleness warnings for older ones.
4. Injects them into the system prompt.
The selector only sees the user's query string — no full conversation
context. This keeps it cheap and fast. Errors are caught and return
@@ -20,9 +20,8 @@ from pathlib import Path
from typing import Any
from framework.agents.queen.queen_memory_v2 import (
MEMORY_DIR,
format_memory_manifest,
memory_freshness_text,
global_memory_dir,
scan_memory_files,
)
@@ -32,29 +31,6 @@ logger = logging.getLogger(__name__)
# Structured output schema
# ---------------------------------------------------------------------------
RECALL_SCHEMA: dict[str, Any] = {
"type": "json_schema",
"json_schema": {
"name": "memory_selection",
"strict": True,
"schema": {
"type": "object",
"properties": {
"selected_memories": {
"type": "array",
"items": {"type": "string"},
},
},
"required": ["selected_memories"],
"additionalProperties": False,
},
},
}
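# For reference, a model response conforming to RECALL_SCHEMA parses to a
# plain list of candidate memory files (filenames illustrative):
_EXAMPLE_RECALL_RESPONSE = '{"selected_memories": ["prefers-direct-answers.md", "dev-env.md"]}'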
# ---------------------------------------------------------------------------
# System prompt
# ---------------------------------------------------------------------------
SELECT_MEMORIES_SYSTEM_PROMPT = """\
You are selecting memories that will be useful to the Queen agent as it \
processes a user's query.
@@ -72,9 +48,6 @@ name and description.
query, then do not include it in your list. Be selective and discerning.
- If there are no memories in the list that would clearly be useful, \
return an empty list.
- If a list of recently-used tools is provided, do not select memories \
that are usage reference or API documentation for those tools (the Queen \
is already exercising them). Still select warnings or gotchas about them.
"""
# ---------------------------------------------------------------------------
@@ -86,7 +59,6 @@ async def select_memories(
query: str,
llm: Any,
memory_dir: Path | None = None,
active_tools: list[str] | None = None,
*,
max_results: int = 5,
) -> list[str]:
@@ -94,51 +66,48 @@ async def select_memories(
Returns a list of filenames. Best-effort: on any error returns ``[]``.
"""
mem_dir = memory_dir or MEMORY_DIR
mem_dir = memory_dir or global_memory_dir()
files = scan_memory_files(mem_dir)
if not files:
logger.debug("recall: no memory files found, skipping selection")
return []
logger.debug("recall: selecting from %d memory files for query: %.80s", len(files), query)
logger.debug("recall: selecting from %d memories for query: %.100s", len(files), query)
manifest = format_memory_manifest(files)
user_msg_parts = [f"## User query\n\n{query}\n\n## Available memories\n\n{manifest}"]
if active_tools:
user_msg_parts.append(f"\n\n## Recently-used tools\n\n{', '.join(active_tools)}")
user_msg = "".join(user_msg_parts)
user_msg = f"## User query\n\n{query}\n\n## Available memories\n\n{manifest}"
try:
resp = await llm.acomplete(
messages=[{"role": "user", "content": user_msg}],
system=SELECT_MEMORIES_SYSTEM_PROMPT,
max_tokens=512,
response_format=RECALL_SCHEMA,
max_tokens=1024,
response_format={"type": "json_object"},
)
data = json.loads(resp.content)
raw = (resp.content or "").strip()
if not raw:
logger.warning(
"recall: LLM returned empty response (model=%s, stop=%s)",
resp.model,
resp.stop_reason,
)
return []
data = json.loads(raw)
selected = data.get("selected_memories", [])
# Validate: only return filenames that actually exist.
valid_names = {f.filename for f in files}
result = [s for s in selected if s in valid_names][:max_results]
logger.debug("recall: selected %d memories: %s", len(result), result)
return result
except Exception:
logger.debug("recall: memory selection failed, returning []", exc_info=True)
except Exception as exc:
logger.warning("recall: memory selection failed (%s), returning []", exc)
return []
def format_recall_injection(
filenames: list[str],
memory_dir: Path | None = None,
*,
heading: str = "Selected Memories",
) -> str:
"""Read selected memory files and format for system prompt injection.
Prepends a staleness warning for memories older than 1 day.
"""
mem_dir = memory_dir or MEMORY_DIR
"""Read selected memory files and format for system prompt injection."""
mem_dir = memory_dir or global_memory_dir()
if not filenames:
return ""
@@ -151,86 +120,10 @@ def format_recall_injection(
content = path.read_text(encoding="utf-8").strip()
except OSError:
continue
try:
mtime = path.stat().st_mtime
except OSError:
mtime = 0.0
freshness = memory_freshness_text(mtime)
header = f"### {fname}"
if freshness:
header += f"\n\n> {freshness}"
blocks.append(f"{header}\n\n{content}")
blocks.append(f"### {fname}\n\n{content}")
if not blocks:
return ""
body = "\n\n---\n\n".join(blocks)
logger.debug("recall: injecting %d memory blocks into context", len(blocks))
return f"--- {heading} ---\n\n{body}\n\n--- End {heading} ---"
# ---------------------------------------------------------------------------
# Cache update (called after each queen turn)
# ---------------------------------------------------------------------------
async def update_recall_cache(
session_dir: Path,
llm: Any,
phase_state: Any | None = None,
memory_dir: Path | None = None,
*,
cache_setter: Any = None,
heading: str = "Selected Memories",
active_tools: list[str] | None = None,
) -> None:
"""Update the recall cache on *phase_state* for the next turn.
Reads the latest user message from conversation parts to use as the
query for memory selection.
"""
mem_dir = memory_dir or MEMORY_DIR
# Extract latest user message as the query.
query = _extract_latest_user_query(session_dir)
if not query:
logger.debug("recall: no user query found, skipping cache update")
return
logger.debug("recall: updating cache for query: %.80s", query)
try:
selected = await select_memories(
query,
llm,
mem_dir,
active_tools=active_tools,
)
injection = format_recall_injection(selected, mem_dir, heading=heading)
if cache_setter is not None:
cache_setter(injection)
elif phase_state is not None:
phase_state._cached_recall_block = injection
except Exception:
logger.debug("recall: cache update failed", exc_info=True)
def _extract_latest_user_query(session_dir: Path) -> str:
"""Read the most recent user message from conversation parts."""
parts_dir = session_dir / "conversations" / "parts"
if not parts_dir.is_dir():
return ""
part_files = sorted(parts_dir.glob("*.json"), reverse=True)
for f in part_files[:20]: # Look back at most 20 messages.
try:
data = json.loads(f.read_text(encoding="utf-8"))
if data.get("role") == "user":
content = str(data.get("content", "")).strip()
if content:
# Truncate very long queries.
return content[:1000] if len(content) > 1000 else content
except (json.JSONDecodeError, OSError):
continue
return ""
return f"--- Global Memories ---\n\n{body}\n\n--- End Global Memories ---"
+160 -399
@@ -1,21 +1,20 @@
"""Reflect agent — background memory extraction for queen and worker memory.
"""Reflection agent — background global memory extraction for the queen.
A lightweight side agent that runs after each queen LLM turn. It
inspects recent conversation messages (cursor-based incremental
processing) and extracts learnings into individual memory files.
A lightweight side agent that runs after each queen LLM turn. It inspects
recent conversation messages and extracts durable user knowledge into
individual memory files in ``~/.hive/queen/global_memory/``.
Two reflection types:
- **Short reflection**: every queen turn. Distills learnings. Nudged
toward a 2-turn pattern (batch reads → batch writes).
- **Long reflection**: every 5 short reflections, on CONTEXT_COMPACTED,
and at session end. Organises, deduplicates, trims holistically.
- **Short reflection**: after conversational queen turns. Distills
learnings about the user (profile, preferences, environment, feedback).
- **Long reflection**: every 5 short reflections and on CONTEXT_COMPACTED.
Organises, deduplicates, trims the global memory directory.
The agent has restricted tool access: it can only read/write/delete
memory files in ``~/.hive/queen/memories/`` and list them.
Concurrency: an ``asyncio.Lock`` prevents overlapping runs. If a trigger
fires while a reflection is already active, the event is skipped.
Concurrency: an ``asyncio.Lock`` prevents overlapping runs. If a
trigger fires while a reflection is already active, the event is skipped
(cursor hasn't advanced, so messages will be reconsidered next time).
All reflections are fire-and-forget (spawned via ``asyncio.create_task``)
so they never block the queen's event loop.
"""
from __future__ import annotations
@@ -23,23 +22,18 @@ from __future__ import annotations
import asyncio
import json
import logging
import re
import traceback
from datetime import datetime
from pathlib import Path
from typing import Any
from framework.agents.queen.queen_memory_v2 import (
GLOBAL_MEMORY_CATEGORIES,
MAX_FILE_SIZE_BYTES,
MAX_FILES,
MEMORY_DIR,
MEMORY_FRONTMATTER_EXAMPLE,
MEMORY_TYPES,
build_diary_document,
diary_filename,
format_memory_manifest,
global_memory_dir,
parse_frontmatter,
read_conversation_parts,
scan_memory_files,
)
from framework.llm.provider import LLMResponse, Tool
@@ -54,7 +48,7 @@ _REFLECTION_TOOLS: list[Tool] = [
Tool(
name="list_memory_files",
description=(
"List all memory files with their type, name, age, and description. "
"List all memory files with their type, name, and description. "
"Returns a text manifest — one line per file."
),
parameters={
@@ -135,28 +129,7 @@ def _safe_memory_path(filename: str, memory_dir: Path) -> Path:
return candidate
# Memory types that workers are NOT allowed to write.
_WORKER_BLOCKED_TYPES: frozenset[str] = frozenset(
{"environment", "technique", "reference", "diary", "goal"}
)
def _inject_last_modified_by(content: str, caller: str) -> str:
"""Inject or update ``last_modified_by`` in frontmatter."""
m = re.match(r"^---\s*\n(.*?)\n---", content, re.DOTALL)
if not m:
return content
fm_body = m.group(1)
# Remove existing last_modified_by line if present.
fm_lines = [
ln for ln in fm_body.splitlines() if not ln.strip().lower().startswith("last_modified_by")
]
fm_lines.append(f"last_modified_by: {caller}")
new_fm = "\n".join(fm_lines)
return f"---\n{new_fm}\n---{content[m.end() :]}"
def _execute_tool(name: str, args: dict[str, Any], memory_dir: Path, caller: str) -> str:
def _execute_tool(name: str, args: dict[str, Any], memory_dir: Path) -> str:
"""Execute a reflection tool synchronously. Returns the result string."""
if name == "list_memory_files":
files = scan_memory_files(memory_dir)
@@ -183,16 +156,14 @@ def _execute_tool(name: str, args: dict[str, Any], memory_dir: Path, caller: str
content = args.get("content", "")
if not filename.endswith(".md"):
return "ERROR: Filename must end with .md"
# Enforce caller-based type restrictions.
# Enforce global memory type restrictions.
fm = parse_frontmatter(content)
mem_type = (fm.get("type") or "").strip().lower()
if caller == "worker" and mem_type in _WORKER_BLOCKED_TYPES:
if mem_type and mem_type not in GLOBAL_MEMORY_CATEGORIES:
return (
f"ERROR: Workers cannot write memory type '{mem_type}'. "
f"Blocked types for workers: {', '.join(sorted(_WORKER_BLOCKED_TYPES))}."
f"ERROR: Invalid memory type '{mem_type}'. "
f"Allowed types: {', '.join(GLOBAL_MEMORY_CATEGORIES)}."
)
# Inject last_modified_by into frontmatter.
content = _inject_last_modified_by(content, caller)
# Enforce file size limit.
if len(content.encode("utf-8")) > MAX_FILE_SIZE_BYTES:
return f"ERROR: Content exceeds {MAX_FILE_SIZE_BYTES} byte limit."
@@ -207,9 +178,7 @@ def _execute_tool(name: str, args: dict[str, Any], memory_dir: Path, caller: str
return f"ERROR: File cap reached ({MAX_FILES}). Delete a file first."
memory_dir.mkdir(parents=True, exist_ok=True)
path.write_text(content, encoding="utf-8")
logger.debug(
"reflect: tool write_memory_file [%s] → %s (%d chars)", caller, filename, len(content)
)
logger.debug("reflect: tool write_memory_file → %s (%d chars)", filename, len(content))
return f"Wrote {filename} ({len(content)} chars)."
if name == "delete_memory_file":
@@ -221,7 +190,7 @@ def _execute_tool(name: str, args: dict[str, Any], memory_dir: Path, caller: str
if not path.exists():
return f"ERROR: File not found: {filename}"
path.unlink()
logger.debug("reflect: tool delete_memory_file [%s]%s", caller, filename)
logger.debug("reflect: tool delete_memory_file → %s", filename)
return f"Deleted {filename}."
return f"ERROR: Unknown tool: {name}"
@@ -239,36 +208,18 @@ async def _reflection_loop(
system: str,
user_msg: str,
memory_dir: Path,
caller: str,
max_turns: int = _MAX_TURNS,
) -> tuple[bool, list[str], str]:
"""Run a mini tool-use loop: LLM → tool calls → repeat.
Hard cap of *max_turns* iterations. Prompt nudges the LLM toward a
2-turn pattern (batch reads in turn 1, batch writes in turn 2).
Returns a tuple of (success, changed_files, last_text) where *success*
is ``True`` if the loop completed without LLM errors, *changed_files*
lists filenames that were written or deleted, and *last_text* is the
final assistant text (useful as a skip-reason when no files changed).
Returns (success, changed_files, last_text).
"""
messages: list[dict[str, Any]] = [{"role": "user", "content": user_msg}]
changed_files: list[str] = []
last_text: str = ""
logger.debug("reflect: starting loop (caller=%s, max %d turns)", caller, max_turns)
for _turn in range(max_turns):
# Log what we're sending to the LLM.
user_content = messages[-1].get("content", "") if messages else ""
preview = user_content[:300] if isinstance(user_content, str) else str(user_content)[:300]
logger.debug(
"reflect: turn %d — sending %d messages to LLM, last msg role=%s, preview=%s",
_turn,
len(messages),
messages[-1].get("role", "?") if messages else "?",
preview,
)
logger.info("reflect: loop turn %d/%d (msgs=%d)", _turn + 1, max_turns, len(messages))
try:
resp: LLMResponse = await llm.acomplete(
messages=messages,
@@ -276,45 +227,49 @@ async def _reflection_loop(
tools=_REFLECTION_TOOLS,
max_tokens=2048,
)
except asyncio.CancelledError:
logger.warning("reflect: LLM call cancelled (task cancelled)")
return False, changed_files, last_text
except Exception:
logger.warning("reflect: LLM call failed", exc_info=True)
return False, changed_files, last_text
# Build assistant message.
# Extract tool calls from litellm/OpenAI response object.
tool_calls_raw: list[dict[str, Any]] = []
if resp.raw_response and isinstance(resp.raw_response, dict):
tool_calls_raw = resp.raw_response.get("tool_calls", [])
raw = resp.raw_response
if raw is not None:
# litellm returns a ModelResponse object; tool calls live on
# choices[0].message.tool_calls as a list of ChatCompletionMessageToolCall.
try:
msg_obj = raw.choices[0].message
if hasattr(msg_obj, "tool_calls") and msg_obj.tool_calls:
for tc in msg_obj.tool_calls:
fn = tc.function
try:
args = json.loads(fn.arguments) if fn.arguments else {}
except (json.JSONDecodeError, TypeError):
args = {}
tool_calls_raw.append(
{
"id": tc.id,
"name": fn.name,
"input": args,
}
)
except (AttributeError, IndexError):
pass
# Log the full LLM response for debugging.
raw_keys = (
list(resp.raw_response.keys())
if isinstance(resp.raw_response, dict)
else type(resp.raw_response).__name__
)
logger.debug(
"reflect: turn %d — LLM response: content=%r (len=%d), stop_reason=%s, "
"tool_calls=%d, model=%s, tokens=%d/%d, raw_keys=%s",
_turn,
(resp.content or "")[:200],
logger.info(
"reflect: LLM responded, text=%d chars, tool_calls=%d",
len(resp.content or ""),
resp.stop_reason,
len(tool_calls_raw),
resp.model,
resp.input_tokens,
resp.output_tokens,
raw_keys,
)
# Accumulate non-empty text across turns so we don't lose a reason
# given alongside tool calls on an earlier turn.
turn_text = resp.content or ""
if turn_text:
last_text = turn_text
assistant_msg: dict[str, Any] = {
"role": "assistant",
"content": turn_text,
}
assistant_msg: dict[str, Any] = {"role": "assistant", "content": turn_text}
if tool_calls_raw:
# Convert to OpenAI format for the conversation.
assistant_msg["tool_calls"] = [
{
"id": tc["id"],
@@ -328,32 +283,16 @@ async def _reflection_loop(
]
messages.append(assistant_msg)
# No tool calls → agent is done.
if not tool_calls_raw:
logger.debug("reflect: loop done after %d turn(s) (no tool calls)", _turn + 1)
break
# Execute each tool call and append results.
logger.debug(
"reflect: turn %d — executing %d tool call(s): %s",
_turn + 1,
len(tool_calls_raw),
[tc["name"] for tc in tool_calls_raw],
)
for tc in tool_calls_raw:
result = _execute_tool(tc["name"], tc.get("input", {}), memory_dir, caller)
# Track files that were written or deleted.
result = _execute_tool(tc["name"], tc.get("input", {}), memory_dir)
if tc["name"] in ("write_memory_file", "delete_memory_file"):
fname = tc.get("input", {}).get("filename", "")
if fname and not result.startswith("ERROR"):
changed_files.append(fname)
messages.append(
{
"role": "tool",
"tool_call_id": tc["id"],
"content": result,
}
)
messages.append({"role": "tool", "tool_call_id": tc["id"], "content": result})
return True, changed_files, last_text
@@ -362,51 +301,55 @@ async def _reflection_loop(
# System prompts
# ---------------------------------------------------------------------------
_FRONTMATTER_EXAMPLE = "\n".join(MEMORY_FRONTMATTER_EXAMPLE)
_CATEGORIES_STR = ", ".join(GLOBAL_MEMORY_CATEGORIES)
_SHORT_REFLECT_SYSTEM = f"""\
You are a reflection agent that distills learnings from a conversation into
persistent memory files. You run in the background after each assistant turn.
You are a reflection agent that distills durable knowledge about the USER
into persistent global memory files. You run in the background after each
assistant turn.
Your goal: identify anything from the recent messages worth remembering across
future sessions: user preferences, project context, techniques that worked,
goals, environment details, reference pointers.
Your goal: identify anything from the recent messages worth remembering
about the user across ALL future sessions: their profile, preferences,
environment setup, or feedback on assistant behavior.
Memory types: {", ".join(MEMORY_TYPES)}
Memory categories: {_CATEGORIES_STR}
Expected format for each memory file:
{_FRONTMATTER_EXAMPLE}
```markdown
---
name: {{{{memory name}}}}
description: {{{{one-line description specific and search-friendly}}}}
type: {{{{{_CATEGORIES_STR}}}}}
---
{{{{memory content}}}}
```
Workflow (aim for 2 turns):
Turn 1: call list_memory_files to see what already exists, then
read_memory_file for any that might need updating.
Turn 1: call list_memory_files to see what exists, then read_memory_file
for any that might need updating.
Turn 2: call write_memory_file for new/updated memories.
Rules:
- Only persist information that would be useful in a *future* conversation.
Skip ephemeral task details, routine tool output, and anything obvious
from the code or git history.
- ONLY persist durable knowledge about the USER: who they are, how they
like to work, their tech environment, their feedback on your behavior.
- Do NOT store task-specific details, code patterns, file paths, or
ephemeral session state.
- Keep files concise. Each file should cover ONE topic.
- If an existing memory already covers the learning, UPDATE it rather than
creating a duplicate.
- If there is nothing worth remembering from these messages, do nothing
(respond with a brief reason why nothing was saved; no tool calls needed).
- IMPORTANT: Always end with a text message (no tool calls) summarising what
you did or why you skipped. Never end on an empty response.
- If there is nothing worth remembering, do nothing (respond with a brief
reason; no tool calls needed).
- File names should be kebab-case slugs ending in .md.
- Include a specific, search-friendly description in the frontmatter.
- Do NOT exceed {MAX_FILE_SIZE_BYTES} bytes per file or {MAX_FILES} total files.
"""
_LONG_REFLECT_SYSTEM = f"""\
You are a reflection agent performing a periodic housekeeping pass over the
memory directory. Your job is to organise, deduplicate, and trim noise from
the accumulated memory files.
global memory directory. Your job is to organise, deduplicate, and trim
noise from the accumulated memory files.
Memory types: {", ".join(MEMORY_TYPES)}
Expected format for each memory file:
{_FRONTMATTER_EXAMPLE}
Memory categories: {_CATEGORIES_STR}
Workflow:
1. list_memory_files to get the full manifest.
@@ -420,29 +363,6 @@ Rules:
- Remove memories that are no longer relevant or are superseded.
- Keep the total collection lean and high-signal.
- Do NOT invent new information only reorganise what exists.
- Do NOT delete or merge MEMORY-*.md diary files. These are daily narratives
managed by a separate process. You may read them for context but should not
modify them.
"""
_DIARY_SYSTEM = """\
You maintain a daily diary entry for an AI colony session. You receive:
(1) Today's existing diary content (may be empty if this is the first entry).
(2) A transcript of recent conversation messages.
Write a cohesive 3-8 sentence narrative about what happened in this session today.
Cover: what the user asked for, what was accomplished, key decisions or obstacles,
and current status.
Rules:
- If an existing diary is provided, rewrite it as a unified narrative incorporating
the new developments. Merge and deduplicate; do not simply append.
- Keep the total narrative under 3000 characters.
- Focus on the story arc of the day, not individual tool calls or code details.
- If the recent messages contain nothing substantive (greetings, routine
confirmations), return the existing diary text unchanged.
- Output only the diary prose. No headings, no timestamps, no code fences, no
frontmatter.
"""
@@ -451,31 +371,33 @@ Rules:
# ---------------------------------------------------------------------------
async def _read_conversation_parts(session_dir: Path) -> list[dict[str, Any]]:
"""Read conversation parts from the queen session directory."""
from framework.storage.conversation_store import FileConversationStore
store = FileConversationStore(session_dir / "conversations")
return await store.read_parts()
async def run_short_reflection(
session_dir: Path,
llm: Any,
memory_dir: Path | None = None,
*,
caller: str,
) -> None:
"""Run a short reflection: extract learnings from conversation."""
mem_dir = memory_dir or MEMORY_DIR
"""Run a short reflection: extract user knowledge from conversation."""
logger.info("reflect: starting short reflection for %s", session_dir)
mem_dir = memory_dir or global_memory_dir()
messages = await read_conversation_parts(session_dir)
messages = await _read_conversation_parts(session_dir)
if not messages:
logger.debug("reflect: short [%s] — no conversation parts", caller)
logger.info("reflect: no conversation parts found in %s, skipping", session_dir)
return
logger.debug("reflect: short [%s] — %d conversation parts", caller, len(messages))
# Build a readable transcript from recent messages.
transcript_lines: list[str] = []
for msg in messages[-50:]:
role = msg.get("role", "")
content = str(msg.get("content", "")).strip()
if role == "tool":
continue # Skip verbose tool results.
if not content:
if role == "tool" or not content:
continue
label = "user" if role == "user" else "assistant"
if len(content) > 800:
@@ -483,6 +405,7 @@ async def run_short_reflection(
transcript_lines.append(f"[{label}]: {content}")
if not transcript_lines:
logger.info("reflect: no transcript lines after filtering, skipping")
return
transcript = "\n".join(transcript_lines)
@@ -492,38 +415,26 @@ async def run_short_reflection(
f"Timestamp: {datetime.now().isoformat(timespec='minutes')}"
)
_, changed, reason = await _reflection_loop(
llm,
_SHORT_REFLECT_SYSTEM,
user_msg,
mem_dir,
caller=caller,
)
_, changed, reason = await _reflection_loop(llm, _SHORT_REFLECT_SYSTEM, user_msg, mem_dir)
if changed:
logger.debug("reflect: short reflection done [%s], changed files: %s", caller, changed)
logger.info("reflect: short reflection done, changed files: %s", changed)
else:
logger.debug(
"reflect: short reflection done [%s], no changes — %s",
caller,
reason or "no reason given",
)
logger.info("reflect: short reflection done, no changes — %s", reason or "no reason")
async def run_long_reflection(
llm: Any,
memory_dir: Path | None = None,
*,
caller: str,
) -> None:
"""Run a long reflection: organise and deduplicate all memories."""
mem_dir = memory_dir or MEMORY_DIR
"""Run a long reflection: organise and deduplicate all global memories."""
logger.debug("reflect: starting long reflection")
mem_dir = memory_dir or global_memory_dir()
files = scan_memory_files(mem_dir)
if not files:
logger.debug("reflect: long [%s] — no memory files to organise", caller)
logger.debug("reflect: no memory files, skipping long reflection")
return
logger.debug("reflect: long [%s] — organising %d memory files", caller, len(files))
manifest = format_memory_manifest(files)
user_msg = (
f"## Current memory manifest ({len(files)} files)\n\n"
@@ -531,105 +442,43 @@ async def run_long_reflection(
f"Timestamp: {datetime.now().isoformat(timespec='minutes')}"
)
_, changed, reason = await _reflection_loop(
llm,
_LONG_REFLECT_SYSTEM,
user_msg,
mem_dir,
caller=caller,
)
_, changed, reason = await _reflection_loop(llm, _LONG_REFLECT_SYSTEM, user_msg, mem_dir)
if changed:
logger.debug(
"reflect: long reflection done [%s] (%d files), changed files: %s",
caller,
len(files),
changed,
)
logger.debug("reflect: long reflection done (%d files), changed: %s", len(files), changed)
else:
logger.debug(
"reflect: long reflection done [%s] (%d files), no changes — %s",
caller,
"reflect: long reflection done (%d files), no changes — %s",
len(files),
reason or "no reason given",
reason or "no reason",
)
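# Invocation sketch (hypothetical; paths and the llm object are illustrative):
# both reflections can also be fired ad hoc, e.g. from a maintenance script
# running inside an event loop:
#
#   session_dir = Path.home() / ".hive" / "queen" / "session" / "<session-id>"
#   await run_short_reflection(session_dir, llm)  # defaults to global_memory_dir()
#   await run_long_reflection(llm)                # reorganise the whole collection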
async def run_diary_update(
async def run_shutdown_reflection(
session_dir: Path,
llm: Any,
memory_dir: Path | None = None,
) -> None:
"""Update today's diary file with a narrative of recent activity."""
mem_dir = memory_dir or MEMORY_DIR
"""Run a final short reflection on session shutdown.
fname = diary_filename()
diary_path = mem_dir / fname
today_str = datetime.now().strftime("%Y-%m-%d")
# Read existing diary body (strip frontmatter).
existing_body = ""
if diary_path.exists():
Called during session teardown so recent conversation insights are
persisted before the session is destroyed.
"""
logger.info("reflect: running shutdown reflection for %s", session_dir)
mem_dir = memory_dir or global_memory_dir()
try:
raw = diary_path.read_text(encoding="utf-8")
m = re.match(r"^---\s*\n.*?\n---\s*\n?", raw, re.DOTALL)
existing_body = raw[m.end() :].strip() if m else raw.strip()
except OSError:
pass
# Read all conversation messages for context.
messages = await read_conversation_parts(session_dir)
transcript_lines: list[str] = []
for msg in messages[-40:]:
role = msg.get("role", "")
content = str(msg.get("content", "")).strip()
if role == "tool" or not content:
continue
label = "user" if role == "user" else "assistant"
if len(content) > 600:
content = content[:600] + "..."
transcript_lines.append(f"[{label}]: {content}")
if not transcript_lines:
return
transcript = "\n".join(transcript_lines)
user_msg = (
f"## Today's Diary So Far\n\n"
f"{existing_body or '(no entries yet)'}\n\n"
f"## Recent Conversation\n\n"
f"{transcript}\n\n"
f"Date: {today_str}"
)
try:
from framework.agents.queen.config import default_config
resp = await llm.acomplete(
messages=[{"role": "user", "content": user_msg}],
system=_DIARY_SYSTEM,
max_tokens=min(default_config.max_tokens, 1024),
)
new_body = (resp.content or "").strip()
if not new_body:
return
doc = build_diary_document(date_str=today_str, body=new_body)
if len(doc.encode("utf-8")) > MAX_FILE_SIZE_BYTES:
new_body = new_body[:2800]
doc = build_diary_document(date_str=today_str, body=new_body)
mem_dir.mkdir(parents=True, exist_ok=True)
diary_path.write_text(doc, encoding="utf-8")
logger.debug("diary: updated %s (%d chars)", fname, len(doc))
await run_short_reflection(session_dir, llm, mem_dir)
logger.info("reflect: shutdown reflection completed for %s", session_dir)
except asyncio.CancelledError:
logger.warning("reflect: shutdown reflection cancelled for %s", session_dir)
except Exception:
logger.warning("diary: update failed", exc_info=True)
logger.warning("reflect: shutdown reflection failed", exc_info=True)
_write_error("shutdown reflection")
# ---------------------------------------------------------------------------
# Event-bus integration
# ---------------------------------------------------------------------------
# Run a long reflection every N short reflections.
_LONG_REFLECT_INTERVAL = 5
@@ -638,7 +487,6 @@ async def subscribe_reflection_triggers(
session_dir: Path,
llm: Any,
memory_dir: Path | None = None,
phase_state: Any = None,
) -> list[str]:
"""Subscribe to queen turn events and return subscription IDs.
@@ -647,103 +495,74 @@ async def subscribe_reflection_triggers(
"""
from framework.host.event_bus import EventType
mem_dir = memory_dir or MEMORY_DIR
mem_dir = memory_dir or global_memory_dir()
_lock = asyncio.Lock()
_short_count = 0
_background_tasks: set[asyncio.Task] = set()
async def _do_turn_reflect(is_interval: bool, count: int) -> None:
async with _lock:
try:
if is_interval:
await run_short_reflection(session_dir, llm, mem_dir)
await run_long_reflection(llm, mem_dir)
else:
await run_short_reflection(session_dir, llm, mem_dir)
except Exception:
logger.warning("reflect: reflection failed", exc_info=True)
_write_error("short/long reflection")
async def _do_compaction_reflect() -> None:
async with _lock:
try:
await run_long_reflection(llm, mem_dir)
except Exception:
logger.warning("reflect: compaction-triggered reflection failed", exc_info=True)
_write_error("compaction reflection")
def _fire_and_forget(coro: Any) -> None:
"""Spawn a background task and prevent GC before it finishes."""
task = asyncio.create_task(coro)
_background_tasks.add(task)
task.add_done_callback(_background_tasks.discard)
async def _on_turn_complete(event: Any) -> None:
nonlocal _short_count
# Only process queen turns.
if getattr(event, "stream_id", None) != "queen":
return
_short_count += 1
# Decide whether to reflect: only when the LLM turn ended without
# tool calls (a conversational response) OR every _LONG_REFLECT_INTERVAL turns.
event_data = getattr(event, "data", {}) or {}
stop_reason = event_data.get("stop_reason", "")
is_tool_turn = stop_reason in ("tool_use", "tool_calls")
is_interval = _short_count % _LONG_REFLECT_INTERVAL == 0
if is_tool_turn and not is_interval:
logger.debug(
"reflect: skipping turn %d (stop_reason=%s, next reflect at %d)",
_short_count,
stop_reason,
(_short_count // _LONG_REFLECT_INTERVAL + 1) * _LONG_REFLECT_INTERVAL,
)
logger.debug("reflect: skipping tool turn (count=%d)", _short_count)
return
if _lock.locked():
logger.debug("reflect: skipping — reflection already in progress")
logger.debug("reflect: skipping, already running (count=%d)", _short_count)
return
async with _lock:
try:
logger.debug(
"reflect: turn complete — count %d/%d (stop_reason=%s)",
"reflect: triggered (count=%d, interval=%s, stop_reason=%s)",
_short_count,
_LONG_REFLECT_INTERVAL,
is_interval,
stop_reason,
)
if is_interval:
await run_short_reflection(session_dir, llm, mem_dir, caller="queen")
await run_long_reflection(llm, mem_dir, caller="queen")
else:
await run_short_reflection(session_dir, llm, mem_dir, caller="queen")
except Exception:
logger.warning("reflect: reflection failed", exc_info=True)
_write_error("short/long reflection")
# Update daily diary after reflection.
try:
await run_diary_update(session_dir, llm, mem_dir)
except Exception:
logger.warning("reflect: diary update failed", exc_info=True)
# Update recall cache after reflection completes, guaranteeing
# recall sees the current turn's extracted memories.
if phase_state is not None:
try:
from framework.agents.queen.recall_selector import update_recall_cache
await update_recall_cache(
session_dir,
llm,
cache_setter=lambda block: (
setattr(phase_state, "_cached_colony_recall_block", block),
setattr(phase_state, "_cached_recall_block", block),
),
memory_dir=mem_dir,
heading="Colony Memories",
)
await update_recall_cache(
session_dir,
llm,
cache_setter=lambda block: setattr(
phase_state, "_cached_global_recall_block", block
),
memory_dir=getattr(phase_state, "global_memory_dir", None),
heading="Global Memories",
)
except Exception:
logger.debug("recall: cache update failed", exc_info=True)
_fire_and_forget(_do_turn_reflect(is_interval, _short_count))
async def _on_compaction(event: Any) -> None:
if getattr(event, "stream_id", None) != "queen":
return
if _lock.locked():
logger.debug("reflect: skipping compaction trigger, already running")
return
async with _lock:
try:
await run_long_reflection(llm, mem_dir, caller="queen")
except Exception:
logger.warning("reflect: compaction-triggered reflection failed", exc_info=True)
_write_error("compaction reflection")
logger.debug("reflect: compaction triggered long reflection")
_fire_and_forget(_do_compaction_reflect())
sub_ids: list[str] = []
@@ -762,68 +581,10 @@ async def subscribe_reflection_triggers(
return sub_ids
async def subscribe_worker_memory_triggers(
event_bus: Any,
llm: Any,
*,
worker_sessions_dir: Path,
colony_memory_dir: Path,
recall_cache: dict[str, str],
) -> list[str]:
"""Subscribe colony memory lifecycle events for worker runs.
Short reflection is now handled synchronously at node handoff in
``WorkerAgent._reflect_colony_memory()``. This function only manages:
- Recall cache initialisation on execution start
- Final long reflection + cleanup on execution end
"""
from framework.host.event_bus import EventType
_terminal_lock = asyncio.Lock()
def _is_worker_event(event: Any) -> bool:
return bool(
getattr(event, "execution_id", None)
and getattr(event, "stream_id", None) not in ("queen", "judge")
)
async def _on_execution_started(event: Any) -> None:
if not _is_worker_event(event):
return
if event.execution_id is not None:
recall_cache[event.execution_id] = ""
async def _on_execution_terminal(event: Any) -> None:
if not _is_worker_event(event):
return
execution_id = event.execution_id
if execution_id is None:
return
async with _terminal_lock:
try:
await run_long_reflection(llm, colony_memory_dir, caller="worker")
except Exception:
logger.warning("reflect: worker final reflection failed", exc_info=True)
_write_error("worker final reflection")
finally:
recall_cache.pop(execution_id, None)
return [
event_bus.subscribe(
event_types=[EventType.EXECUTION_STARTED],
handler=_on_execution_started,
),
event_bus.subscribe(
event_types=[EventType.EXECUTION_COMPLETED, EventType.EXECUTION_FAILED],
handler=_on_execution_terminal,
),
]
def _write_error(context: str) -> None:
"""Best-effort write of the last traceback to an error file."""
try:
error_path = MEMORY_DIR / ".reflection_error.txt"
error_path = global_memory_dir() / ".reflection_error.txt"
error_path.parent.mkdir(parents=True, exist_ok=True)
error_path.write_text(
f"context: {context}\ntime: {datetime.now().isoformat()}\n\n{traceback.format_exc()}",
-9
View File
@@ -252,11 +252,6 @@ class AgentHost:
self._dynamic_memory_provider_factory: Callable[[str], Callable[[], str] | None] | None = (
None
)
# Colony memory config for reflection-at-handoff (set by session_manager)
self._colony_memory_dir: Any = None
self._colony_worker_sessions_dir: Any = None
self._colony_recall_cache: dict[str, str] | None = None
self._colony_reflect_llm: Any = None
self._accounts_data = accounts_data
self._tool_provider_map = tool_provider_map
@@ -385,10 +380,6 @@ class AgentHost:
context_warn_ratio=self.context_warn_ratio,
batch_init_nudge=self.batch_init_nudge,
dynamic_memory_provider_factory=self._dynamic_memory_provider_factory,
colony_memory_dir=self._colony_memory_dir,
colony_worker_sessions_dir=self._colony_worker_sessions_dir,
colony_recall_cache=self._colony_recall_cache,
colony_reflect_llm=self._colony_reflect_llm,
)
await stream.start()
self._streams[ep_id] = stream
-14
View File
@@ -192,11 +192,6 @@ class ExecutionManager:
context_warn_ratio: float | None = None,
batch_init_nudge: str | None = None,
dynamic_memory_provider_factory: Callable[[str], Callable[[], str] | None] | None = None,
colony_memory_dir: Any = None,
colony_worker_sessions_dir: Any = None,
colony_recall_cache: dict[str, str] | None = None,
colony_reflect_llm: Any = None,
execution_middleware: list | None = None,
):
"""
Initialize execution stream.
@@ -252,11 +247,6 @@ class ExecutionManager:
self._context_warn_ratio: float | None = context_warn_ratio
self._batch_init_nudge: str | None = batch_init_nudge
self._dynamic_memory_provider_factory = dynamic_memory_provider_factory
self._colony_memory_dir = colony_memory_dir
self._colony_worker_sessions_dir = colony_worker_sessions_dir
self._colony_recall_cache = colony_recall_cache
self._colony_reflect_llm = colony_reflect_llm
self._execution_middleware = execution_middleware or []
_es_logger = logging.getLogger(__name__)
if protocols_prompt:
@@ -750,10 +740,6 @@ class ExecutionManager:
if self._dynamic_memory_provider_factory is not None
else None
),
colony_memory_dir=self._colony_memory_dir,
colony_worker_sessions_dir=self._colony_worker_sessions_dir,
colony_recall_cache=self._colony_recall_cache,
colony_reflect_llm=self._colony_reflect_llm,
)
# Track executor so inject_input() can reach EventLoopNode instances
self._active_executors[execution_id] = executor
+5 -1
View File
@@ -2051,9 +2051,13 @@ class LiteLLMProvider(LLMProvider):
if accumulated_text and "<tool_code>" in accumulated_text:
extracted, cleaned = _extract_text_tool_calls(accumulated_text)
if extracted:
tool_names = [tc.tool_name for tc in extracted]
logger.info(
"[stream] extracted %d hallucinated tool call(s) from text",
"[stream] Model emitted %d tool call(s) as <tool_code> text "
"instead of structured function calls; converting to "
"synthetic ToolCallEvents: %s",
len(extracted),
tool_names,
)
accumulated_text = cleaned
# Emit a corrected TextDeltaEvent so the caller's
-6
View File
@@ -67,12 +67,6 @@ class GraphContext:
# Retry tracking: worker_id → retry_count (for execution quality assessment)
retry_counts: dict[str, int] = field(default_factory=dict)
nodes_with_retries: set[str] = field(default_factory=set)
# Colony memory reflection at node handoff
colony_memory_dir: Any = None # Path | None
worker_sessions_dir: Any = None # Path | None
colony_recall_cache: dict[str, str] = field(default_factory=dict)
colony_reflect_llm: Any = None # LLMProvider for reflection
_colony_reflect_lock: asyncio.Lock = field(default_factory=asyncio.Lock)
def build_scoped_buffer(buffer: DataBuffer, node_spec: NodeSpec) -> DataBuffer:
@@ -320,8 +320,6 @@ class NodeWorker:
self.lifecycle = WorkerLifecycle.COMPLETED
self._last_result = result
self._last_activations = activations
# Colony memory reflection — runs before downstream activation
await self._reflect_colony_memory()
completion = WorkerCompletion(
worker_id=node_spec.id,
success=True,
@@ -342,8 +340,6 @@ class NodeWorker:
self.lifecycle = WorkerLifecycle.FAILED
self._last_result = result
self._last_activations = activations
# Colony memory reflection — capture learnings even on failure
await self._reflect_colony_memory()
await self._publish_failure(result.error or "Unknown error")
except Exception as exc:
error = str(exc) or type(exc).__name__
@@ -658,61 +654,6 @@ class NodeWorker:
pause_event=self._pause_requested,
)
async def _reflect_colony_memory(self) -> None:
"""Run colony memory reflection at node handoff.
Awaits the shared colony lock so parallel workers queue (never skip).
"""
gc = self._gc
if gc.colony_memory_dir is None or gc.colony_reflect_llm is None:
return
if gc.worker_sessions_dir is None:
return
from pathlib import Path
session_dir = Path(gc.worker_sessions_dir) / gc.execution_id
if not session_dir.exists():
return
# Await lock — serializes reflection but never skips
async with gc._colony_reflect_lock:
try:
from framework.agents.queen.reflection_agent import run_short_reflection
await run_short_reflection(
session_dir,
gc.colony_reflect_llm,
gc.colony_memory_dir,
caller="worker",
)
except Exception:
logger.warning(
"Worker %s: colony reflection failed",
self.node_spec.id,
exc_info=True,
)
# Update recall cache outside lock (per-execution key, no write races)
try:
from framework.agents.queen.recall_selector import update_recall_cache
await update_recall_cache(
session_dir,
gc.colony_reflect_llm,
memory_dir=gc.colony_memory_dir,
cache_setter=lambda block: gc.colony_recall_cache.__setitem__(
gc.execution_id, block
),
heading="Colony Memories",
)
except Exception:
logger.warning(
"Worker %s: recall cache update failed",
self.node_spec.id,
exc_info=True,
)
# ------------------------------------------------------------------
# Event publishing
# ------------------------------------------------------------------
@@ -160,10 +160,6 @@ class Orchestrator:
skill_dirs: list[str] | None = None,
context_warn_ratio: float | None = None,
batch_init_nudge: str | None = None,
colony_memory_dir: Any = None,
colony_worker_sessions_dir: Any = None,
colony_recall_cache: dict[str, str] | None = None,
colony_reflect_llm: Any = None,
):
"""
Initialize the executor.
@@ -232,11 +228,6 @@ class Orchestrator:
self.skill_dirs: list[str] = skill_dirs or []
self.context_warn_ratio: float | None = context_warn_ratio
self.batch_init_nudge: str | None = batch_init_nudge
self.colony_memory_dir = colony_memory_dir
self.colony_worker_sessions_dir = colony_worker_sessions_dir
self.colony_recall_cache = colony_recall_cache or {}
self.colony_reflect_llm = colony_reflect_llm
if protocols_prompt:
self.logger.info(
"GraphExecutor[%s] received protocols_prompt (%d chars)",
@@ -1341,10 +1332,6 @@ class Orchestrator:
iteration_metadata_provider=self.iteration_metadata_provider,
loop_config=self._loop_config,
node_visit_counts=dict(node_visit_counts),
colony_memory_dir=self.colony_memory_dir,
worker_sessions_dir=self.colony_worker_sessions_dir,
colony_recall_cache=self.colony_recall_cache,
colony_reflect_llm=self.colony_reflect_llm,
)
# Create one WorkerAgent per node
+107 -4
View File
@@ -1,7 +1,84 @@
import ast
import operator
import signal
import threading
import time
from contextlib import contextmanager
from typing import Any
# Power operations can allocate extremely large integers. Keep conservative
# limits here so untrusted edge conditions cannot exhaust CPU or memory.
MAX_POWER_ABS_EXPONENT = 1_000
MAX_POWER_RESULT_BITS = 4_096
# Typical edge-condition evaluations in this repo complete well under 1ms.
# 100ms leaves ample headroom for legitimate checks while failing fast on abuse.
DEFAULT_TIMEOUT_MS = 100
def _safe_pow(base: Any, exp: Any) -> Any:
if isinstance(exp, (int, float)) and abs(exp) > MAX_POWER_ABS_EXPONENT:
raise ValueError(f"Power exponent exceeds safe limit ({MAX_POWER_ABS_EXPONENT})")
if isinstance(base, int) and isinstance(exp, int) and exp > 0:
abs_base = abs(base)
if abs_base > 1:
# Estimate bit growth instead of materializing a huge integer.
estimated_bits = exp * abs_base.bit_length()
if estimated_bits > MAX_POWER_RESULT_BITS:
raise ValueError("Power operation exceeds safe size limit")
return operator.pow(base, exp)
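# Worked example of the size guard above (illustrative): for base 10**20 and
# exp 100, abs_base.bit_length() is 67, so estimated_bits = 100 * 67 = 6700,
# which exceeds MAX_POWER_RESULT_BITS (4096) and raises ValueError before any
# huge integer is allocated. By contrast, 2 ** 20 estimates 20 * 2 = 40 bits
# and evaluates normally.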
def _timeout_message(timeout_ms: int) -> str:
return f"safe_eval exceeded {timeout_ms}ms execution timeout"
def _check_timeout(deadline: float | None, timeout_ms: int | None) -> None:
if deadline is not None and timeout_ms is not None and time.perf_counter() >= deadline:
raise TimeoutError(_timeout_message(timeout_ms))
@contextmanager
def _execution_timeout(timeout_ms: int | None):
if timeout_ms is None:
yield
return
if timeout_ms <= 0:
raise ValueError("timeout_ms must be greater than 0")
can_use_alarm = (
hasattr(signal, "SIGALRM")
and hasattr(signal, "ITIMER_REAL")
and hasattr(signal, "getitimer")
and hasattr(signal, "setitimer")
and threading.current_thread() is threading.main_thread()
)
if not can_use_alarm:
yield
return
current_delay, current_interval = signal.getitimer(signal.ITIMER_REAL)
if current_delay > 0 or current_interval > 0:
# safe_eval runs inside a shared framework process, so it must not
# replace a timer another subsystem already owns.
yield
return
def _handle_timeout(signum, frame):
raise TimeoutError(_timeout_message(timeout_ms))
old_handler = signal.getsignal(signal.SIGALRM)
signal.signal(signal.SIGALRM, _handle_timeout)
old_delay, old_interval = signal.setitimer(signal.ITIMER_REAL, timeout_ms / 1000)
try:
yield
finally:
signal.signal(signal.SIGALRM, old_handler)
signal.setitimer(signal.ITIMER_REAL, old_delay, old_interval)
# Safe operators whitelist
SAFE_OPERATORS = {
ast.Add: operator.add,
@@ -10,7 +87,7 @@ SAFE_OPERATORS = {
ast.Div: operator.truediv,
ast.FloorDiv: operator.floordiv,
ast.Mod: operator.mod,
ast.Pow: operator.pow,
ast.Pow: _safe_pow,
ast.LShift: operator.lshift,
ast.RShift: operator.rshift,
ast.BitOr: operator.or_,
@@ -54,10 +131,19 @@ SAFE_FUNCTIONS = {
class SafeEvalVisitor(ast.NodeVisitor):
def __init__(self, context: dict[str, Any]):
def __init__(
self,
context: dict[str, Any],
*,
deadline: float | None = None,
timeout_ms: int | None = None,
):
self.context = context
self.deadline = deadline
self.timeout_ms = timeout_ms
def visit(self, node: ast.AST) -> Any:
_check_timeout(self.deadline, self.timeout_ms)
# Override visit to prevent default behavior and ensure only explicitly allowed nodes work
method = "visit_" + node.__class__.__name__
visitor = getattr(self, method, self.generic_visit)
@@ -183,6 +269,7 @@ class SafeEvalVisitor(ast.NodeVisitor):
raise AttributeError(f"Object has no attribute '{node.attr}'")
def visit_Call(self, node: ast.Call) -> Any:
_check_timeout(self.deadline, self.timeout_ms)
# Only allow calling whitelisted functions
func = self.visit(node.func)
@@ -226,16 +313,24 @@ class SafeEvalVisitor(ast.NodeVisitor):
args = [self.visit(arg) for arg in node.args]
keywords = {kw.arg: self.visit(kw.value) for kw in node.keywords}
_check_timeout(self.deadline, self.timeout_ms)
return func(*args, **keywords)
def safe_eval(expr: str, context: dict[str, Any] | None = None) -> Any:
def safe_eval(
expr: str,
context: dict[str, Any] | None = None,
*,
timeout_ms: int | None = DEFAULT_TIMEOUT_MS,
) -> Any:
"""
Safely evaluate a python expression string.
Args:
expr: The expression string to evaluate.
context: Dictionary of variables available in the expression.
timeout_ms: Maximum evaluation time in milliseconds. Use ``None`` to
disable the timeout.
Returns:
The result of the evaluation.
@@ -251,10 +346,18 @@ def safe_eval(expr: str, context: dict[str, Any] | None = None) -> Any:
full_context = context.copy()
full_context.update(SAFE_FUNCTIONS)
deadline = None if timeout_ms is None else time.perf_counter() + (timeout_ms / 1000)
with _execution_timeout(timeout_ms):
try:
tree = ast.parse(expr, mode="eval")
except SyntaxError as e:
raise SyntaxError(f"Invalid syntax in expression: {e}") from e
visitor = SafeEvalVisitor(full_context)
_check_timeout(deadline, timeout_ms)
visitor = SafeEvalVisitor(
full_context,
deadline=deadline,
timeout_ms=timeout_ms,
)
return visitor.visit(tree)
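# Usage sketch (illustrative; the import path is an assumption, and whether a
# given builtin is callable depends on SAFE_FUNCTIONS):
#
#   from framework.utils.safe_eval import safe_eval  # hypothetical path
#
#   safe_eval("a + b * 2", {"a": 1, "b": 3})   # -> 7
#   safe_eval("9 ** 5000")                     # ValueError: exponent over limit
#   safe_eval("1 + 1", timeout_ms=None)        # explicit opt-out of the timeout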
+9 -4
View File
@@ -133,15 +133,20 @@ async def cors_middleware(request: web.Request, handler):
@web.middleware
async def error_middleware(request: web.Request, handler):
"""Catch exceptions and return JSON error responses."""
"""Catch exceptions and return JSON error responses.
Returns a generic error message to the client to avoid leaking
internal details (file paths, config values, stack traces).
The full exception is still logged server-side.
"""
try:
return await handler(request)
except web.HTTPException:
raise # Let aiohttp handle its own HTTP exceptions
except Exception as e:
logger.exception(f"Unhandled error: {e}")
except Exception:
logger.exception("Unhandled error on %s %s", request.method, request.path)
return web.json_response(
{"error": str(e), "type": type(e).__name__},
{"error": "Internal server error"},
status=500,
)
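# Wiring sketch (illustrative handler and route): with the middleware installed,
# a raising handler logs the full traceback server-side while the client only
# ever sees the generic body.
#
#   app = web.Application(middlewares=[cors_middleware, error_middleware])
#
#   async def boom(request: web.Request) -> web.Response:
#       raise RuntimeError("includes /secret/config/path")  # stays in the logs
#
#   app.router.add_get("/boom", boom)
#   # GET /boom -> HTTP 500, {"error": "Internal server error"}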
+44 -13
View File
@@ -180,16 +180,13 @@ async def create_queen(
phase_state.running_tools = [t for t in queen_tools if t.name in running_names]
phase_state.editing_tools = [t for t in queen_tools if t.name in editing_names]
# ---- Cross-session memory ----------------------------------------
# ---- Global memory -------------------------------------------------
from framework.agents.queen.queen_memory_v2 import (
colony_memory_dir,
global_memory_dir,
init_memory_dir,
)
colony_dir = colony_memory_dir(session.id)
global_dir = global_memory_dir()
init_memory_dir(colony_dir, migrate_legacy=True)
init_memory_dir(global_dir)
phase_state.global_memory_dir = global_dir
@@ -278,13 +275,33 @@ async def create_queen(
_session_llm = session.llm
_session_event_bus = session.event_bus
async def _persona_hook(ctx: HookContext) -> HookResult | None:
from framework.agents.queen.queen_memory import format_for_injection
memory_context = format_for_injection()
result = await select_expert_persona(
ctx.trigger or "", _session_llm, memory_context=memory_context
# ---- Recall on each real user turn --------------------------------
async def _recall_on_user_input(event: AgentEvent) -> None:
"""Re-select memories when real user input arrives."""
content = (event.data or {}).get("content", "")
if not content or not isinstance(content, str):
return
try:
from framework.agents.queen.recall_selector import (
format_recall_injection,
select_memories,
)
mem_dir = phase_state.global_memory_dir
selected = await select_memories(content, _session_llm, mem_dir)
phase_state._cached_global_recall_block = format_recall_injection(selected, mem_dir)
except Exception:
logger.debug("recall: user-turn cache update failed", exc_info=True)
session.event_bus.subscribe(
[EventType.CLIENT_INPUT_RECEIVED],
_recall_on_user_input,
filter_stream="queen",
)
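# Flow sketch: each real user message re-runs select_memories() against the
# global memory directory; the formatted block is cached on phase_state and
# later appended to the system prompt by QueenPhaseState.get_current_prompt().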
async def _persona_hook(ctx: HookContext) -> HookResult | None:
trigger = ctx.trigger or ""
result = await select_expert_persona(trigger, _session_llm, memory_context="")
if not result:
return None
# Store on phase_state so persona/style persist across dynamic prompt refreshes
@@ -298,6 +315,21 @@ async def create_queen(
data={"persona": result.persona_prefix},
)
)
# Seed recall cache so the first turn has relevant memories.
if trigger:
try:
from framework.agents.queen.recall_selector import (
format_recall_injection,
select_memories,
)
mem_dir = phase_state.global_memory_dir
selected = await select_memories(trigger, _session_llm, mem_dir)
phase_state._cached_global_recall_block = format_recall_injection(selected, mem_dir)
except Exception:
logger.debug("recall: initial seeding failed", exc_info=True)
return HookResult(system_prompt=phase_state.get_current_prompt())
# ---- Graph preparation -------------------------------------------
@@ -424,15 +456,14 @@ async def create_queen(
)
session_manager._subscribe_worker_handoffs(session, session.queen_executor)
# ---- Reflection + recall memory subscriptions ----------------
# ---- Global memory reflection + recall -------------------------
from framework.agents.queen.reflection_agent import subscribe_reflection_triggers
_reflection_subs = await subscribe_reflection_triggers(
session.event_bus,
queen_dir,
session.llm,
memory_dir=colony_dir,
phase_state=phase_state,
memory_dir=global_dir,
)
session.memory_reflection_subs = _reflection_subs
+9 -8
View File
@@ -185,14 +185,8 @@ async def handle_chat(request: web.Request) -> web.Response:
logger.error("[handle_chat] Node still not available after 5s wait")
if node is not None and hasattr(node, "inject_event"):
try:
logger.debug("[handle_chat] Calling node.inject_event()...")
await node.inject_event(message, is_client_input=True, image_content=image_content)
logger.debug("[handle_chat] inject_event() completed successfully")
except Exception as e:
logger.exception("[handle_chat] inject_event() failed: %s", e)
raise
# Publish to EventBus so the session event log captures user messages
# Publish BEFORE inject_event so handlers (e.g. memory recall)
# complete before the event loop unblocks and starts the LLM turn.
from framework.host.event_bus import AgentEvent, EventType
await session.event_bus.publish(
@@ -209,6 +203,13 @@ async def handle_chat(request: web.Request) -> web.Response:
},
)
)
try:
logger.debug("[handle_chat] Calling node.inject_event()...")
await node.inject_event(message, is_client_input=True, image_content=image_content)
logger.debug("[handle_chat] inject_event() completed successfully")
except Exception as e:
logger.exception("[handle_chat] inject_event() failed: %s", e)
raise
return web.json_response(
{
"status": "queen",
+26 -115
View File
@@ -45,12 +45,8 @@ class Session:
phase_state: Any = None # QueenPhaseState
# Worker handoff subscription
worker_handoff_sub: str | None = None
# Memory reflection + recall subscriptions
# Memory reflection + recall subscriptions (global memory)
memory_reflection_subs: list = field(default_factory=list) # list[str]
# Worker colony memory subscriptions
worker_memory_subs: list = field(default_factory=list) # list[str]
# Per-execution colony recall cache for worker prompts
worker_colony_recall_blocks: dict[str, str] = field(default_factory=dict)
# Trigger definitions loaded from agent's triggers.json (available but inactive)
available_triggers: dict[str, TriggerDefinition] = field(default_factory=dict)
# Active trigger tracking (IDs currently firing + their asyncio tasks)
@@ -69,6 +65,8 @@ class Session:
# directory instead of creating a new one. This lets cold-restores accumulate
# all messages in the original session folder so history is never fragmented.
queen_resume_from: str | None = None
# Queen session directory (set during _start_queen, used for shutdown reflection)
queen_dir: Path | None = None
class SessionManager:
@@ -84,6 +82,9 @@ class SessionManager:
self._model = model
self._credential_store = credential_store
self._lock = asyncio.Lock()
# Strong references for fire-and-forget background tasks (e.g. shutdown
# reflections) so they aren't garbage-collected before completion.
self._background_tasks: set[asyncio.Task] = set()
# ------------------------------------------------------------------
# Session lifecycle
@@ -323,16 +324,6 @@ class SessionManager:
runtime = runner._agent_runtime
if runtime is not None:
runtime._dynamic_memory_provider_factory = lambda execution_id, session=session: (
lambda execution_id=execution_id, session=session: (
session.worker_colony_recall_blocks.get(
execution_id,
"",
)
)
)
# Load triggers from the agent's triggers.json definition file.
from framework.tools.queen_lifecycle_tools import _read_agent_triggers_json
@@ -368,18 +359,6 @@ class SessionManager:
session.graph_runtime = runtime
session.worker_info = info
# Colony memory is additive; worker loading should still succeed if
# that optional subscription path hits an import/runtime issue while
# restoring an older session.
try:
await self._subscribe_worker_colony_memory(session)
except Exception:
logger.warning(
"Worker colony memory subscription failed for '%s'; continuing without it",
resolved_graph_id,
exc_info=True,
)
async with self._lock:
self._loading.discard(session.id)
@@ -617,14 +596,6 @@ class SessionManager:
await self._emit_trigger_events(session, "removed", session.available_triggers)
session.available_triggers.clear()
for sub_id in session.worker_memory_subs:
try:
session.event_bus.unsubscribe(sub_id)
except Exception:
pass
session.worker_memory_subs.clear()
session.worker_colony_recall_blocks.clear()
graph_id = session.graph_id
session.graph_id = None
session.worker_path = None
@@ -650,11 +621,6 @@ class SessionManager:
if session is None:
return False
# Capture session data for memory consolidation before teardown
_llm = getattr(session, "llm", None)
_storage_id = getattr(session, "queen_resume_from", None) or session_id
_session_dir = Path.home() / ".hive" / "queen" / "session" / _storage_id
if session.worker_handoff_sub is not None:
try:
session.event_bus.unsubscribe(session.worker_handoff_sub)
@@ -662,21 +628,31 @@ class SessionManager:
pass
session.worker_handoff_sub = None
for sub_id in session.worker_memory_subs:
try:
session.event_bus.unsubscribe(sub_id)
except Exception:
pass
session.worker_memory_subs.clear()
session.worker_colony_recall_blocks.clear()
# Stop queen and memory reflection/recall subscriptions
# Stop memory reflection/recall subscriptions
for sub_id in session.memory_reflection_subs:
try:
session.event_bus.unsubscribe(sub_id)
except Exception:
pass
session.memory_reflection_subs.clear()
# Run a final shutdown reflection so recent conversation insights
# are persisted before the session is destroyed (fire-and-forget).
if session.queen_dir is not None:
try:
from framework.agents.queen.reflection_agent import run_shutdown_reflection
task = asyncio.create_task(
asyncio.shield(run_shutdown_reflection(session.queen_dir, session.llm)),
)
self._background_tasks.add(task)
task.add_done_callback(self._background_tasks.discard)
logger.info("Session '%s': shutdown reflection spawned", session_id)
except Exception:
logger.warning(
"Session '%s': failed to spawn shutdown reflection", session_id, exc_info=True
)
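# Pattern note (sketch): asyncio.shield() lets the inner reflection run to
# completion even if this wrapper task is cancelled during teardown, while
# _background_tasks holds a strong reference so neither task is
# garbage-collected mid-flight.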
if session.queen_task is not None:
session.queen_task.cancel()
session.queen_task = None
@@ -708,20 +684,6 @@ class SessionManager:
except Exception as e:
logger.error("Error cleaning up worker: %s", e)
# Final long reflection — fire-and-forget so teardown isn't blocked.
if _llm is not None:
import asyncio
from framework.agents.queen.queen_memory_v2 import colony_memory_dir
from framework.agents.queen.reflection_agent import run_long_reflection
asyncio.create_task(
run_long_reflection(
_llm, memory_dir=colony_memory_dir(_storage_id), caller="queen"
),
name=f"queen-memory-long-reflection-{session_id}",
)
# Close per-session event log
session.event_bus.close_session_log()
@@ -757,54 +719,6 @@ class SessionManager:
else:
logger.warning("Worker handoff received but queen node not ready")
async def _subscribe_worker_colony_memory(self, session: Session) -> None:
"""Subscribe shared colony reflection/recall for top-level worker runs."""
for sub_id in session.worker_memory_subs:
try:
session.event_bus.unsubscribe(sub_id)
except Exception:
pass
session.worker_memory_subs.clear()
session.worker_colony_recall_blocks.clear()
runtime = session.graph_runtime
if runtime is None:
return
worker_sessions_dir = getattr(runtime, "_session_store", None)
worker_sessions_dir = getattr(worker_sessions_dir, "sessions_dir", None)
if worker_sessions_dir is None:
return
from framework.agents.queen.queen_memory_v2 import colony_memory_dir, init_memory_dir
from framework.agents.queen.reflection_agent import subscribe_worker_memory_triggers
colony_dir = colony_memory_dir(session.id)
init_memory_dir(colony_dir, migrate_legacy=True)
runtime._dynamic_memory_provider_factory = lambda execution_id, session=session: (
lambda execution_id=execution_id, session=session: (
session.worker_colony_recall_blocks.get(
execution_id,
"",
)
)
)
# Colony memory config for reflection-at-handoff
runtime._colony_memory_dir = colony_dir
runtime._colony_worker_sessions_dir = worker_sessions_dir
runtime._colony_recall_cache = session.worker_colony_recall_blocks
runtime._colony_reflect_llm = session.llm
session.worker_memory_subs = await subscribe_worker_memory_triggers(
session.event_bus,
session.llm,
worker_sessions_dir=worker_sessions_dir,
colony_memory_dir=colony_dir,
recall_cache=session.worker_colony_recall_blocks,
)
def _subscribe_worker_handoffs(self, session: Session, executor: Any) -> None:
"""Subscribe queen to worker/subagent escalation handoff events."""
from framework.host.event_bus import EventType as _ET
@@ -849,6 +763,7 @@ class SessionManager:
storage_session_id = session.queen_resume_from or session.id
queen_dir = hive_home / "queen" / "session" / storage_session_id
queen_dir.mkdir(parents=True, exist_ok=True)
session.queen_dir = queen_dir
# Always write/update session metadata so history sidebar has correct
# agent name, path, and last-active timestamp (important so the original
@@ -970,10 +885,6 @@ class SessionManager:
except Exception:
logger.warning("Cold restore: failed to auto-load worker", exc_info=True)
# Memory reflection/recall subscriptions are set up inside
# queen_orchestrator.create_queen() → _queen_loop() and stored
# on session.memory_reflection_subs for teardown.
# ------------------------------------------------------------------
# Queen notifications
# ------------------------------------------------------------------
+3 -93
View File
@@ -133,10 +133,9 @@ class QueenPhaseState:
persona_prefix: str = "" # e.g. "You are a CFO. I am a CFO with 20 years..."
style_directive: str = "" # e.g. "## Communication Style: Peer\n\n..."
# Cached recall block — populated async by recall_selector after each turn.
_cached_recall_block: str = ""
_cached_colony_recall_block: str = ""
# Cached global recall block — populated async by recall_selector after each turn.
_cached_global_recall_block: str = ""
# Global memory directory.
global_memory_dir: Path | None = None
def get_current_tools(self) -> list:
@@ -152,7 +151,7 @@ class QueenPhaseState:
return list(self.building_tools)
def get_current_prompt(self) -> str:
"""Return the system prompt for the current phase, with fresh memory appended."""
"""Return the system prompt for the current phase."""
if self.phase == "planning":
base = self.prompt_planning
elif self.phase == "running":
@@ -164,9 +163,6 @@ class QueenPhaseState:
else:
base = self.prompt_building
from framework.agents.queen.queen_memory import format_for_injection
_memory = format_for_injection() # noqa: F841
parts = []
if self.persona_prefix:
parts.append(self.persona_prefix)
@@ -177,9 +173,6 @@ class QueenPhaseState:
parts.append(self.skills_catalog_prompt)
if self.protocols_prompt:
parts.append(self.protocols_prompt)
colony_memory = self._cached_colony_recall_block or self._cached_recall_block
if colony_memory:
parts.append(colony_memory)
if self._cached_global_recall_block:
parts.append(self._cached_global_recall_block)
return "\n\n".join(parts)
@@ -3608,89 +3601,6 @@ def register_queen_lifecycle_tools(
)
tools_registered += 1
# --- save_global_memory --------------------------------------------------
async def save_global_memory_entry(
category: str,
description: str,
content: str,
name: str | None = None,
) -> str:
"""Persist a queen-global memory entry about the user."""
from framework.agents.queen.queen_memory_v2 import (
global_memory_dir as _global_memory_dir,
init_memory_dir as _init_memory_dir,
save_global_memory as _save_global_memory,
)
target_dir = (
phase_state.global_memory_dir
if phase_state is not None and phase_state.global_memory_dir is not None
else _global_memory_dir()
)
_init_memory_dir(target_dir)
try:
filename, path = _save_global_memory(
category=category,
description=description,
content=content,
name=name,
memory_dir=target_dir,
)
return json.dumps(
{
"status": "saved",
"filename": filename,
"path": str(path),
"category": category,
}
)
except ValueError as exc:
return json.dumps({"error": str(exc)})
_save_global_memory_tool = Tool(
name="save_global_memory",
description=(
"Save durable global memory about the user. "
"Only use for user profile, preferences, environment, or feedback."
),
parameters={
"type": "object",
"properties": {
"category": {
"type": "string",
"enum": ["profile", "preference", "environment", "feedback"],
},
"description": {
"type": "string",
"description": "Specific one-line description for future recall selection.",
},
"content": {
"type": "string",
"description": "Durable user-centric memory content.",
},
"name": {
"type": "string",
"description": "Optional short memory title.",
},
},
"required": ["category", "description", "content"],
"additionalProperties": False,
},
)
registry.register(
"save_global_memory",
_save_global_memory_tool,
lambda inputs: save_global_memory_entry(
inputs["category"],
inputs["description"],
inputs["content"],
inputs.get("name"),
),
)
tools_registered += 1
# --- list_triggers ---------------------------------------------------------
async def list_triggers() -> str:
-100
View File
@@ -1,100 +0,0 @@
"""Tools for the queen to read and write episodic memory.
The queen can consciously record significant moments during a session, like
writing in a diary, and recall past diary entries when needed. Semantic
memory (MEMORY.md) is updated automatically at session end and is never
written by the queen directly.
"""
from __future__ import annotations
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from framework.loader.tool_registry import ToolRegistry
def write_to_diary(entry: str) -> str:
"""Write a prose entry to today's episodic memory.
Use this when something significant just happened: a pipeline went live, the
user shared an important preference, a goal was achieved or abandoned, or
you want to record something that should be remembered across sessions.
Write in first person, as you would in a private diary. Be specific: what
happened, how the user responded, what it means going forward. One or two
paragraphs is enough.
You do not need to include a timestamp or date heading; those are added
automatically.
"""
from framework.agents.queen.queen_memory import append_episodic_entry
append_episodic_entry(entry)
return "Diary entry recorded."
def recall_diary(query: str = "", days_back: int = 7) -> str:
"""Search recent diary entries (episodic memory).
Use this when the user asks about what happened in the past: "what did we
do yesterday?", "what happened last week?", "remind me about the pipeline
issue", etc. Also use it proactively when you need context from recent
sessions to answer a question or make a decision.
Args:
query: Optional keyword or phrase to filter entries. If empty, all
recent entries are returned.
days_back: How many days to look back (1-30). Defaults to 7.
"""
from datetime import date, timedelta
from framework.agents.queen.queen_memory import format_memory_date, read_episodic_memory
days_back = max(1, min(int(days_back), 30))
today = date.today()
results: list[str] = []
total_chars = 0
char_budget = 12_000
for offset in range(days_back):
d = today - timedelta(days=offset)
content = read_episodic_memory(d)
if not content:
continue
# If a query is given, only include entries that mention it
if query:
# Check each section (split by ###) for relevance
sections = content.split("### ")
matched = [s for s in sections if query.lower() in s.lower()]
if not matched:
continue
content = "### ".join(matched)
label = format_memory_date(d)
if d == today:
label = f"Today — {label}"
entry = f"## {label}\n\n{content}"
if total_chars + len(entry) > char_budget:
remaining = char_budget - total_chars
if remaining > 200:
# Fit a partial entry within budget
trimmed = content[: remaining - 100] + "\n\n…(truncated)"
results.append(f"## {label}\n\n{trimmed}")
else:
results.append(f"## {label}\n\n(truncated — hit size limit)")
break
results.append(entry)
total_chars += len(entry)
if not results:
if query:
return f"No diary entries matching '{query}' in the last {days_back} days."
return f"No diary entries found in the last {days_back} days."
return "\n\n---\n\n".join(results)
def register_queen_memory_tools(registry: ToolRegistry) -> None:
"""Register the episodic memory tools into the queen's tool registry."""
registry.register_function(write_to_diary)
registry.register_function(recall_diary)
+1 -1
View File
@@ -1,5 +1,5 @@
<!DOCTYPE html>
<html lang="en" class="dark">
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
@@ -0,0 +1,20 @@
import { Sun, Moon } from "lucide-react";
import { useTheme } from "../context/ThemeContext";
export default function ThemeToggle() {
const { theme, setTheme } = useTheme();
return (
<button
onClick={() => setTheme(theme === "dark" ? "light" : "dark")}
className="px-3 py-1 text-xs border rounded-md flex items-center gap-1.5 text-muted-foreground hover:text-foreground hover:bg-muted/50 transition-colors"
aria-label="Toggle theme"
>
{theme === "dark" ? (
<Sun className="w-3.5 h-3.5" />
) : (
<Moon className="w-3.5 h-3.5" />
)}
</button>
);
}
+56 -15
View File
@@ -1,8 +1,14 @@
import { useState, useCallback } from "react";
import { useNavigate } from "react-router-dom";
import { Crown, X } from "lucide-react";
import {
loadPersistedTabs,
savePersistedTabs,
TAB_STORAGE_KEY,
type PersistedTabState,
} from "@/lib/tab-persistence";
import ThemeToggle from "./ThemeToggle";
import { sessionsApi } from "@/api/sessions";
import { loadPersistedTabs, savePersistedTabs, TAB_STORAGE_KEY, type PersistedTabState } from "@/lib/tab-persistence";
import BrowserStatusBadge from "@/components/BrowserStatusBadge";
export interface TopBarTab {
@@ -27,65 +33,89 @@ interface TopBarProps {
children?: React.ReactNode;
}
export default function TopBar({ tabs: tabsProp, onTabClick, onCloseTab, canCloseTabs, afterTabs, children }: TopBarProps) {
export default function TopBar({
tabs: tabsProp,
onTabClick,
onCloseTab,
canCloseTabs,
afterTabs,
children,
}: TopBarProps) {
const navigate = useNavigate();
// Fallback: read persisted tabs when no live tabs provided
const [persisted, setPersisted] = useState<PersistedTabState | null>(() =>
tabsProp ? null : loadPersistedTabs()
tabsProp ? null : loadPersistedTabs(),
);
const tabs: TopBarTab[] = tabsProp ?? deriveTabs(persisted);
const showClose = canCloseTabs ?? true;
const handleTabClick = useCallback((agentType: string) => {
const handleTabClick = useCallback(
(agentType: string) => {
if (onTabClick) {
onTabClick(agentType);
} else {
navigate(`/workspace?agent=${encodeURIComponent(agentType)}`);
}
}, [onTabClick, navigate]);
},
[onTabClick, navigate],
);
const handleCloseTab = useCallback((agentType: string, e: React.MouseEvent) => {
const handleCloseTab = useCallback(
(agentType: string, e: React.MouseEvent) => {
e.stopPropagation();
if (onCloseTab) {
onCloseTab(agentType);
return;
}
// Kill the backend session (queen/worker) even outside workspace
sessionsApi.list()
sessionsApi
.list()
.then(({ sessions }) => {
const match = sessions.find(s => s.agent_path.endsWith(agentType));
const match = sessions.find((s) => s.agent_path.endsWith(agentType));
if (match) return sessionsApi.stop(match.session_id);
})
.catch(() => {}); // fire-and-forget
// Fallback: update localStorage directly (non-workspace pages)
setPersisted(prev => {
setPersisted((prev) => {
if (!prev) return null;
const nextTabs = prev.tabs.filter(t => t.agentType !== agentType);
const nextTabs = prev.tabs.filter((t) => t.agentType !== agentType);
if (nextTabs.length === 0) {
localStorage.removeItem(TAB_STORAGE_KEY);
return null;
}
const removedIds = new Set(prev.tabs.filter(t => t.agentType === agentType).map(t => t.id));
const removedIds = new Set(
prev.tabs.filter((t) => t.agentType === agentType).map((t) => t.id),
);
const nextSessions = { ...prev.sessions };
for (const id of removedIds) delete nextSessions[id];
const nextActiveSession = { ...prev.activeSessionByAgent };
delete nextActiveSession[agentType];
const nextActiveWorker = prev.activeWorker === agentType
const nextActiveWorker =
prev.activeWorker === agentType
? nextTabs[0].agentType
: prev.activeWorker;
const nextState: PersistedTabState = {
tabs: nextTabs,
activeSessionByAgent: nextActiveSession,
activeWorker: nextActiveWorker,
sessions: nextSessions,
};
savePersistedTabs(nextState);
return nextState;
});
}, [onCloseTab]);
},
[onCloseTab],
);
return (
<div className="relative h-12 flex items-center justify-between px-5 border-b border-border/60 bg-card/50 backdrop-blur-sm flex-shrink-0">
@@ -115,6 +145,7 @@ export default function TopBar({ tabs: tabsProp, onTabClick, onCloseTab, canClos
<span className="relative inline-flex rounded-full h-1.5 w-1.5 bg-primary" />
</span>
)}
<span>{tab.label}</span>
{showClose && (
<X
@@ -125,12 +156,14 @@ export default function TopBar({ tabs: tabsProp, onTabClick, onCloseTab, canClos
</button>
))}
</div>
{afterTabs}
</>
)}
</div>
<div className="flex items-center gap-3 flex-shrink-0">
<ThemeToggle />
<BrowserStatusBadge />
{children && (
<div className="flex items-center gap-1">
@@ -145,15 +178,23 @@ export default function TopBar({ tabs: tabsProp, onTabClick, onCloseTab, canClos
/** Derive TopBarTab[] from persisted localStorage state (used outside workspace). */
function deriveTabs(persisted: PersistedTabState | null): TopBarTab[] {
if (!persisted) return [];
const seen = new Set<string>();
const tabs: TopBarTab[] = [];
for (const tab of persisted.tabs) {
if (seen.has(tab.agentType)) continue;
seen.add(tab.agentType);
const sessionData = persisted.sessions?.[tab.id];
const hasRunning = sessionData?.graphNodes?.some(
(n) => n.status === "running" || n.status === "looping"
const hasRunning =
sessionData?.graphNodes?.some(
(n) => n.status === "running" || n.status === "looping",
) ?? false;
tabs.push({
agentType: tab.agentType,
label: tab.label,
@@ -0,0 +1,49 @@
import React, { createContext, useContext, useEffect, useState } from "react";
type Theme = "light" | "dark";
interface ThemeContextValue {
theme: Theme;
setTheme: (theme: Theme) => void;
}
const ThemeContext = createContext<ThemeContextValue | null>(null);
export function ThemeProvider({ children }: { children: React.ReactNode }) {
const [theme, setTheme] = useState<Theme>(() => {
const stored = localStorage.getItem("theme");
if (stored === "light" || stored === "dark") {
return stored;
}
return window.matchMedia("(prefers-color-scheme: dark)").matches
? "dark"
: "light";
});
useEffect(() => {
const root = document.documentElement;
root.classList.remove("light", "dark");
root.classList.add(theme);
localStorage.setItem("theme", theme);
}, [theme]);
return (
<ThemeContext.Provider value={{ theme, setTheme }}>
{children}
</ThemeContext.Provider>
);
}
export function useTheme(): ThemeContextValue {
const context = useContext(ThemeContext);
if (!context) {
throw new Error("useTheme must be used within a ThemeProvider");
}
return context;
}
+3
View File
@@ -1,10 +1,13 @@
import ReactDOM from "react-dom/client";
import { BrowserRouter } from "react-router-dom";
import { ThemeProvider } from "./context/ThemeContext";
import App from "./App";
import "./index.css";
ReactDOM.createRoot(document.getElementById("root")!).render(
<ThemeProvider>
<BrowserRouter>
<App />
</BrowserRouter>
</ThemeProvider>,
);
+4 -214
View File
@@ -1,7 +1,6 @@
import { useState, useCallback, useRef, useEffect, useMemo } from "react";
import ReactDOM from "react-dom";
import { useSearchParams, useNavigate } from "react-router-dom";
import { Plus, KeyRound, Sparkles, Layers, ChevronLeft, Bot, Loader2, WifiOff, X, FolderOpen } from "lucide-react";
import { Plus, KeyRound, Loader2, WifiOff, X, FolderOpen } from "lucide-react";
import type { GraphNode, NodeStatus } from "@/components/graph-types";
import DraftGraph from "@/components/DraftGraph";
import ChatPanel, { type ChatMessage } from "@/components/ChatPanel";
@@ -9,12 +8,11 @@ import TopBar from "@/components/TopBar";
import { TAB_STORAGE_KEY, loadPersistedTabs, savePersistedTabs, type PersistedTabState } from "@/lib/tab-persistence";
import NodeDetailPanel from "@/components/NodeDetailPanel";
import CredentialsModal, { type Credential, createFreshCredentials, cloneCredentials, allRequiredCredentialsMet, clearCredentialCache } from "@/components/CredentialsModal";
import { agentsApi } from "@/api/agents";
import { executionApi } from "@/api/execution";
import { graphsApi } from "@/api/graphs";
import { sessionsApi } from "@/api/sessions";
import { useMultiSSE } from "@/hooks/use-sse";
import type { LiveSession, AgentEvent, DiscoverEntry, NodeSpec, DraftGraph as DraftGraphData } from "@/api/types";
import type { LiveSession, AgentEvent, NodeSpec, DraftGraph as DraftGraphData } from "@/api/types";
import { sseEventToChatMessage, formatAgentDisplayName } from "@/lib/chat-helpers";
import { topologyToGraphNodes } from "@/lib/graph-converter";
import { cronToLabel } from "@/lib/graphUtils";
@@ -90,152 +88,6 @@ function createSession(agentType: string, label: string, existingCredentials?: C
};
}
// --- NewTabPopover ---
type PopoverStep = "root" | "new-agent-choice" | "clone-pick";
interface NewTabPopoverProps {
open: boolean;
onClose: () => void;
anchorRef: React.RefObject<HTMLButtonElement | null>;
activeWorker: string;
discoverAgents: DiscoverEntry[];
onFromScratch: () => void;
onCloneAgent: (agentPath: string, agentName: string) => void;
}
function NewTabPopover({ open, onClose, anchorRef, discoverAgents, onFromScratch, onCloneAgent }: NewTabPopoverProps) {
const [step, setStep] = useState<PopoverStep>("root");
const [pos, setPos] = useState<{ top: number; left: number } | null>(null);
const ref = useRef<HTMLDivElement>(null);
useEffect(() => { if (open) setStep("root"); }, [open]);
// Compute position from anchor button
useEffect(() => {
if (open && anchorRef.current) {
const rect = anchorRef.current.getBoundingClientRect();
const POPUP_WIDTH = 240; // w-60 = 15rem = 240px
const overflows = rect.left + POPUP_WIDTH > window.innerWidth - 8;
console.log("Anchor rect:", rect, "Overflows:", overflows);
setPos({
top: rect.bottom + 4,
left: overflows ? rect.right - POPUP_WIDTH : rect.left,
});
}
}, [open, anchorRef]);
// Close on outside click
useEffect(() => {
if (!open) return;
const handler = (e: MouseEvent) => {
if (
ref.current && !ref.current.contains(e.target as Node) &&
anchorRef.current && !anchorRef.current.contains(e.target as Node)
) onClose();
};
document.addEventListener("mousedown", handler);
return () => document.removeEventListener("mousedown", handler);
}, [open, onClose, anchorRef]);
// Close on Escape
useEffect(() => {
if (!open) return;
const handler = (e: KeyboardEvent) => { if (e.key === "Escape") onClose(); };
document.addEventListener("keydown", handler);
return () => document.removeEventListener("keydown", handler);
}, [open, onClose]);
if (!open || !pos) return null;
const optionClass =
"flex items-center gap-3 w-full px-3 py-2.5 rounded-lg text-sm text-left transition-colors hover:bg-muted/60 text-foreground";
const iconWrap =
"w-7 h-7 rounded-md flex items-center justify-center bg-muted/80 flex-shrink-0";
return ReactDOM.createPortal(
<div
ref={ref}
style={{ position: "fixed", top: pos.top, left: pos.left, zIndex: 9999 }}
className="w-60 rounded-xl border border-border/60 bg-card shadow-xl shadow-black/30 overflow-hidden"
>
<div className="flex items-center gap-2 px-3 py-2.5 border-b border-border/40">
{step !== "root" && (
<button
onClick={() => setStep(step === "clone-pick" ? "new-agent-choice" : "root")}
className="p-0.5 rounded hover:bg-muted/60 transition-colors text-muted-foreground hover:text-foreground"
>
<ChevronLeft className="w-3.5 h-3.5" />
</button>
)}
<span className="text-xs font-semibold text-muted-foreground uppercase tracking-wider">
{step === "root" ? "Add Tab" : step === "new-agent-choice" ? "New Agent" : "Open Agent"}
</span>
</div>
<div className="p-1.5">
{step === "root" && (
<>
<button className={optionClass} onClick={() => setStep("clone-pick")}>
<span className={iconWrap}><Layers className="w-3.5 h-3.5 text-muted-foreground" /></span>
<div>
<div className="font-medium leading-tight">Existing agent</div>
<div className="text-xs text-muted-foreground mt-0.5">Open another agent's workspace</div>
</div>
</button>
<button className={optionClass} onClick={() => setStep("new-agent-choice")}>
<span className={iconWrap}><Sparkles className="w-3.5 h-3.5 text-primary" /></span>
<div>
<div className="font-medium leading-tight">New agent</div>
<div className="text-xs text-muted-foreground mt-0.5">Build or clone a fresh agent</div>
</div>
</button>
</>
)}
{step === "new-agent-choice" && (
<>
<button className={optionClass} onClick={() => { onFromScratch(); onClose(); }}>
<span className={iconWrap}><Sparkles className="w-3.5 h-3.5 text-primary" /></span>
<div>
<div className="font-medium leading-tight">From scratch</div>
<div className="text-xs text-muted-foreground mt-0.5">Empty pipeline + Queen Bee setup</div>
</div>
</button>
<button className={optionClass} onClick={() => setStep("clone-pick")}>
<span className={iconWrap}><Layers className="w-3.5 h-3.5 text-muted-foreground" /></span>
<div>
<div className="font-medium leading-tight">Clone existing</div>
<div className="text-xs text-muted-foreground mt-0.5">Start from an existing agent</div>
</div>
</button>
</>
)}
{step === "clone-pick" && (
<div className="flex flex-col max-h-64 overflow-y-auto">
{discoverAgents.map(agent => (
<button
key={agent.path}
onClick={() => { onCloneAgent(agent.path, agent.name); onClose(); }}
className="flex items-center gap-2.5 w-full px-3 py-2 rounded-lg text-left transition-colors hover:bg-muted/60 text-foreground"
>
<div className="w-6 h-6 rounded-md bg-muted/80 flex items-center justify-center flex-shrink-0">
<Bot className="w-3.5 h-3.5 text-muted-foreground" />
</div>
<span className="text-sm font-medium">{agent.name}</span>
</button>
))}
{discoverAgents.length === 0 && (
<p className="text-xs text-muted-foreground px-3 py-2">No agents found</p>
)}
</div>
)}
</div>
</div>,
document.body
);
}
function fmtLogTs(ts: string): string {
try {
const d = new Date(ts);
@@ -581,8 +433,6 @@ export default function Workspace() {
const [triggerScheduleSaving, setTriggerScheduleSaving] = useState(false);
const [triggerCronSaved, setTriggerCronSaved] = useState(false);
const [triggerTaskSaved, setTriggerTaskSaved] = useState(false);
const [newTabOpen, setNewTabOpen] = useState(false);
const newTabBtnRef = useRef<HTMLButtonElement>(null);
const [graphPanelPct, setGraphPanelPct] = useState(30);
const savedGraphPanelPct = useRef(30);
const resizing = useRef(false);
@@ -734,15 +584,6 @@ export default function Workspace() {
}
}, [agentStates, activeWorker, updateAgentState]);
// --- Fetch discovered agents for NewTabPopover ---
const [discoverAgents, setDiscoverAgents] = useState<DiscoverEntry[]>([]);
useEffect(() => {
agentsApi.discover().then(result => {
const { Framework: _fw, ...userFacing } = result;
const all = Object.values(userFacing).flat();
setDiscoverAgents(all);
}).catch(() => { });
}, []);
// --- Agent loading: loadAgentForType ---
const loadingRef = useRef(new Set<string>());
@@ -1144,7 +985,7 @@ export default function Workspace() {
i === 0 ? {
...s,
// Preserve existing label if it was already set with a #N suffix by
// addAgentSession/handleHistoryOpen. Only overwrite with the bare
// handleHistoryOpen. Only overwrite with the bare
// displayName when the label doesn't match the resolved display name.
label: s.label.startsWith(displayName) ? s.label : displayName,
backendSessionId: session.session_id,
@@ -2749,45 +2590,6 @@ export default function Workspace() {
}
}, [sessionsByAgent, activeWorker, navigate, agentStates]);
// Open a tab for an agent type. If a tab already exists, switch to it
// instead of creating a duplicate — each agent gets one session.
// Exception: "new-agent" tabs always create a new instance since each
// represents a distinct conversation the user is starting from scratch.
const addAgentSession = useCallback((agentType: string, agentLabel?: string) => {
const isNewAgent = agentType === "new-agent" || agentType.startsWith("new-agent-");
if (!isNewAgent) {
const existingTabKey = Object.keys(sessionsByAgent).find(
k => baseAgentType(k) === agentType && (sessionsByAgent[k] || []).length > 0,
);
if (existingTabKey) {
setActiveWorker(existingTabKey);
const existing = sessionsByAgent[existingTabKey]?.[0];
if (existing) {
setActiveSessionByAgent(prev => ({ ...prev, [existingTabKey]: existing.id }));
}
return;
}
}
const tabKey = isNewAgent ? `new-agent-${makeId()}` : agentType;
const existingNewAgentCount = isNewAgent
? Object.keys(sessionsByAgent).filter(
k => (k === "new-agent" || k.startsWith("new-agent-")) && (sessionsByAgent[k] || []).length > 0
).length
: 0;
const rawLabel = agentLabel || (isNewAgent ? "New Agent" : formatAgentDisplayName(agentType));
const displayLabel = existingNewAgentCount === 0 ? rawLabel : `${rawLabel} #${existingNewAgentCount + 1}`;
const newSession = createSession(tabKey, displayLabel);
setSessionsByAgent(prev => ({
...prev,
[tabKey]: [newSession],
}));
setActiveSessionByAgent(prev => ({ ...prev, [tabKey]: newSession.id }));
setActiveWorker(tabKey);
}, [sessionsByAgent]);
// Open a history session: switch to its existing tab, or open a new tab.
// Async so we can pre-fetch messages before creating the tab — this gives
// instant visual feedback without waiting for loadAgentForType.
@@ -2894,25 +2696,13 @@ export default function Workspace() {
}}
onCloseTab={closeAgentTab}
afterTabs={
<>
<button
ref={newTabBtnRef}
onClick={() => setNewTabOpen(o => !o)}
onClick={() => navigate("/")}
className="flex-shrink-0 p-1.5 rounded-md text-muted-foreground hover:text-foreground hover:bg-muted/50 transition-colors"
title="Add tab"
>
<Plus className="w-3.5 h-3.5" />
</button>
<NewTabPopover
open={newTabOpen}
onClose={() => setNewTabOpen(false)}
anchorRef={newTabBtnRef}
activeWorker={activeWorker}
discoverAgents={discoverAgents}
onFromScratch={() => { addAgentSession("new-agent"); }}
onCloneAgent={(agentPath, agentName) => { addAgentSession(agentPath, agentName); }}
/>
</>
}
>
<button
+1 -1
View File
@@ -74,4 +74,4 @@ dev = [
"pytest>=8.0",
"pytest-asyncio>=0.23",
"pytest-xdist>=3.0",
]
]
+18
View File
@@ -0,0 +1,18 @@
"""Test setup for framework tests."""
from __future__ import annotations
# Ensure framework.runner submodules are bound as attributes on their parent
# package. Under this repo's layout, `from framework.runner.foo import X` does
# not always bind `foo` onto `framework.runner` (observed via dir() inspection),
# which breaks `monkeypatch.setattr("framework.runner.foo.Y", ...)` because the
# pytest path resolver walks attributes. Force the bindings here so tests can
# patch submodule attributes via the dotted-string API.
import framework.runner # noqa: F401 — load parent package first
import framework.runner.mcp_client as _mcp_client
import framework.runner.mcp_connection_manager as _mcp_connection_manager
import framework.runner.mcp_registry as _mcp_registry
framework.runner.mcp_registry = _mcp_registry
framework.runner.mcp_connection_manager = _mcp_connection_manager
framework.runner.mcp_client = _mcp_client
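The bindings above exist so the dotted-string form of `monkeypatch.setattr` resolves. A minimal sketch of the pattern they unblock (the attribute name `MCPRegistry` is a placeholder for illustration, not necessarily a real symbol in `mcp_registry`):

```python
def test_can_patch_runner_submodule(monkeypatch):
    # With the parent-package bindings in place, pytest can walk the
    # dotted path "framework.runner.mcp_registry" attribute by attribute.
    sentinel = object()
    monkeypatch.setattr(
        "framework.runner.mcp_registry.MCPRegistry", sentinel, raising=False
    )
    import framework.runner.mcp_registry as reg
    assert reg.MCPRegistry is sentinel
```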
+129
View File
@@ -0,0 +1,129 @@
"""
Tests for error_middleware in framework.server.app.
Verifies that the error middleware does NOT leak internal exception
details (file paths, config values, stack traces) to HTTP clients.
"""
import pytest
from aiohttp import web
from aiohttp.test_utils import TestClient, TestServer
from framework.server.app import error_middleware
# ---------------------------------------------------------------------------
# Handlers used in tests
# ---------------------------------------------------------------------------
async def _handler_raise_value_error(request: web.Request) -> web.Response:
"""Handler that raises ValueError with sensitive path info."""
raise ValueError("/home/user/.hive/credentials/secret_key.json not found")
async def _handler_raise_runtime_error(request: web.Request) -> web.Response:
"""Handler that raises RuntimeError with internal details."""
raise RuntimeError("Connection to postgres://admin:s3cret@db:5432/hive failed")
async def _handler_raise_key_error(request: web.Request) -> web.Response:
"""Handler that raises KeyError with config key name."""
raise KeyError("ANTHROPIC_API_KEY")
async def _handler_success(request: web.Request) -> web.Response:
"""Handler that returns a normal 200 response."""
return web.json_response({"status": "ok"})
async def _handler_http_not_found(request: web.Request) -> web.Response:
"""Handler that raises aiohttp's HTTP 404."""
raise web.HTTPNotFound(reason="Agent not found")
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _make_app() -> web.Application:
"""Create a minimal aiohttp app with error_middleware and test routes."""
app = web.Application(middlewares=[error_middleware])
app.router.add_get("/value-error", _handler_raise_value_error)
app.router.add_get("/runtime-error", _handler_raise_runtime_error)
app.router.add_get("/key-error", _handler_raise_key_error)
app.router.add_get("/success", _handler_success)
app.router.add_get("/not-found", _handler_http_not_found)
return app
# ---------------------------------------------------------------------------
# Tests
# ---------------------------------------------------------------------------
class TestErrorMiddlewareInfoLeak:
"""Verify error_middleware returns generic messages, not internal details."""
@pytest.mark.asyncio
async def test_does_not_leak_file_paths(self):
"""ValueError with file path must not appear in response body."""
async with TestClient(TestServer(_make_app())) as client:
resp = await client.get("/value-error")
assert resp.status == 500
body = await resp.json()
assert body["error"] == "Internal server error"
# Ensure no sensitive details leaked
assert ".hive" not in body["error"]
assert "secret_key" not in body["error"]
assert "type" not in body # type field should not exist
@pytest.mark.asyncio
async def test_does_not_leak_connection_strings(self):
"""RuntimeError with DB connection string must not appear in response."""
async with TestClient(TestServer(_make_app())) as client:
resp = await client.get("/runtime-error")
assert resp.status == 500
body = await resp.json()
assert body["error"] == "Internal server error"
assert "postgres" not in body["error"]
assert "s3cret" not in body["error"]
@pytest.mark.asyncio
async def test_does_not_leak_env_var_names(self):
"""KeyError with env var name must not appear in response body."""
async with TestClient(TestServer(_make_app())) as client:
resp = await client.get("/key-error")
assert resp.status == 500
body = await resp.json()
assert body["error"] == "Internal server error"
assert "ANTHROPIC_API_KEY" not in body["error"]
@pytest.mark.asyncio
async def test_does_not_leak_exception_type(self):
"""Response must not include the Python exception type name."""
async with TestClient(TestServer(_make_app())) as client:
resp = await client.get("/value-error")
body = await resp.json()
assert "type" not in body
assert "ValueError" not in str(body)
@pytest.mark.asyncio
async def test_success_response_unchanged(self):
"""Normal 200 responses must pass through untouched."""
async with TestClient(TestServer(_make_app())) as client:
resp = await client.get("/success")
assert resp.status == 200
body = await resp.json()
assert body == {"status": "ok"}
@pytest.mark.asyncio
async def test_http_exceptions_pass_through(self):
"""aiohttp HTTPExceptions (404, etc.) must not be caught."""
async with TestClient(TestServer(_make_app())) as client:
resp = await client.get("/not-found")
assert resp.status == 404
if __name__ == "__main__":
pytest.main([__file__, "-v"])
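For orientation, the contract these tests pin down can be sketched roughly as follows. This is an illustrative shape, not the shipped `framework.server.app` implementation:

```python
from aiohttp import web

@web.middleware
async def sketch_error_middleware(request: web.Request, handler):
    try:
        return await handler(request)
    except web.HTTPException:
        raise  # intentional HTTP errors (404 etc.) pass through unmasked
    except Exception:
        # Log the real exception server-side if desired; never echo
        # str(exc) or the exception type back to the client.
        return web.json_response({"error": "Internal server error"}, status=500)
```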
+27 -6
View File
@@ -168,6 +168,13 @@ class MockConversationStore:
async def close(self) -> None:
pass
async def clear(self) -> None:
# Clear parts, cursor, and meta — keep the store object alive.
# Matches the real store (storage/conversation_store.py:clear).
self._parts.clear()
self._cursor = None
self._meta = None
async def destroy(self) -> None:
self._parts.clear()
self._meta = None
@@ -246,6 +253,7 @@ def make_ctx(
client_facing: bool = False,
available_tools: list[Tool] | None = None,
stream_id: str = "",
is_subagent_mode: bool = False,
) -> NodeContext:
"""Build a NodeContext for direct EventLoopNode testing."""
runtime = MagicMock(spec=Runtime)
@@ -278,6 +286,7 @@ def make_ctx(
llm=llm,
available_tools=available_tools or [],
stream_id=stream_id,
is_subagent_mode=is_subagent_mode,
)
@@ -474,7 +483,12 @@ async def test_event_loop_with_event_bus():
scripts = [StreamScript(text="All done.")]
llm = make_llm(scripts)
ctx = make_ctx(llm=llm, output_keys=[])
# is_subagent_mode=True bypasses worker auto-escalation in EventLoopNode.
# When event_bus is provided, a non-queen/non-subagent node is treated as
# a worker and auto-escalates to queen after a text-only turn (grace=1),
# then blocks forever on _await_user_input waiting for queen guidance.
# Standalone unit tests have no queen, so we mark as subagent to opt out.
ctx = make_ctx(llm=llm, output_keys=[], is_subagent_mode=True)
node = EventLoopNode(
event_bus=bus,
@@ -1000,10 +1014,13 @@ async def test_context_handoff_between_nodes(runtime):
result = await executor.execute(graph, goal, {})
assert result.success
assert "lead_score" in result.output
# After hive-v1 executor refactor, result.output only contains terminal
# node outputs. Full buffer (with handoff data) is in session_state.
assert "strategy" in result.output
buffer_data = result.session_state.get("data_buffer", {})
assert "lead_score" in buffer_data
if USE_MOCK_LLM:
assert result.output["lead_score"] == 92
assert buffer_data["lead_score"] == 92
assert result.output["strategy"] == "premium"
@@ -1068,7 +1085,8 @@ async def test_internal_node_no_client_output():
scripts = [StreamScript(text="Internal processing.")]
llm = make_llm(scripts)
ctx = make_ctx(llm=llm, output_keys=[], client_facing=False)
# is_subagent_mode=True: standalone test, opts out of worker auto-escalation.
ctx = make_ctx(llm=llm, output_keys=[], client_facing=False, is_subagent_mode=True)
node = EventLoopNode(
event_bus=bus,
@@ -1167,10 +1185,13 @@ async def test_mixed_node_graph(runtime):
result = await executor.execute(graph, goal, {})
assert result.success
assert "summary" in result.output
# Terminal node is "format" - only its output appears in result.output.
# Intermediate outputs are in session_state's data buffer.
assert "report" in result.output
buffer_data = result.session_state.get("data_buffer", {})
assert "summary" in buffer_data
if USE_MOCK_LLM:
assert "3 leads processed" in result.output["summary"]
assert "3 leads processed" in buffer_data["summary"]
# ===========================================================================
+33 -47
View File
@@ -147,8 +147,16 @@ def build_ctx(
input_data=None,
goal_context="",
stream_id=None,
is_subagent_mode=False,
):
"""Build a NodeContext for testing."""
"""Build a NodeContext for testing.
When EventLoopNode is constructed with event_bus, a non-queen/non-subagent
node is treated as a worker and auto-escalates to queen on text-only turns
(see event_loop_node.py:1277). Standalone tests with event_bus but no queen
should pass is_subagent_mode=True to opt out, otherwise the loop hangs
forever waiting for queen guidance that never arrives.
"""
return NodeContext(
runtime=runtime,
node_id=node_spec.id,
@@ -159,6 +167,7 @@ def build_ctx(
available_tools=tools or [],
goal_context=goal_context,
stream_id=stream_id,
is_subagent_mode=is_subagent_mode,
)
@@ -423,7 +432,8 @@ class TestEventBusLifecycle:
handler=lambda e: received_events.append(e.type),
)
ctx = build_ctx(runtime, node_spec, buffer, llm)
# Subagent mode opts out of worker auto-escalation (no queen in tests).
ctx = build_ctx(runtime, node_spec, buffer, llm, is_subagent_mode=True)
node = EventLoopNode(event_bus=bus, config=LoopConfig(max_iterations=5))
result = await node.execute(ctx)
@@ -805,15 +815,13 @@ class TestEscalate:
bus.subscribe(event_types=[EventType.ESCALATION_REQUESTED], handler=capture)
ctx = build_ctx(runtime, node_spec, buffer, llm, stream_id="worker")
# is_subagent_mode=True: test drives node.execute() directly, so this
# runs in subagent pattern (no queen). Opts out of worker auto-escalation
# that would otherwise fire extra ESCALATION_REQUESTED events on
# subsequent text-only turns.
ctx = build_ctx(runtime, node_spec, buffer, llm, stream_id="worker", is_subagent_mode=True)
node = EventLoopNode(event_bus=bus, config=LoopConfig(max_iterations=5))
async def queen_reply():
await asyncio.sleep(0.05)
await node.inject_event("Acknowledged, proceed.")
task = asyncio.create_task(queen_reply())
@@ -855,7 +863,9 @@ class TestEscalate:
queen_executor.node_registry = {"queen": queen_node}
manager._subscribe_worker_handoffs(session, queen_executor)
ctx = build_ctx(runtime, node_spec, buffer, llm, stream_id="worker")
# is_subagent_mode=True opts out of worker auto-escalation.
# Standalone test without real queen loop, see other escalate tests.
ctx = build_ctx(runtime, node_spec, buffer, llm, stream_id="worker", is_subagent_mode=True)
node = EventLoopNode(event_bus=bus, config=LoopConfig(max_iterations=5))
async def queen_reply():
@@ -1300,6 +1310,9 @@ class TestCrashRecovery:
output_keys=["result"],
store=store,
)
# Tag messages with phase_id matching the node so restore() finds them.
# Restore filters parts by phase_id=ctx.node_id in non-continuous mode.
conv.set_current_phase(node_spec.id)
await conv.add_user_message("Initial input")
await conv.add_assistant_message("Working on it...")
@@ -1754,7 +1767,8 @@ class TestTransientErrorRetry:
handler=lambda e: retry_events.append(e),
)
ctx = build_ctx(runtime, node_spec, buffer, llm)
# is_subagent_mode=True opts out of worker auto-escalation.
ctx = build_ctx(runtime, node_spec, buffer, llm, is_subagent_mode=True)
node = EventLoopNode(
event_bus=bus,
config=LoopConfig(
@@ -2084,12 +2098,14 @@ class TestToolDoomLoopIntegration:
is_error=False,
)
# is_subagent_mode=True opts out of worker auto-escalation.
ctx = build_ctx(
runtime,
node_spec,
buffer,
llm,
tools=[Tool(name="search", description="s", parameters={})],
is_subagent_mode=True,
)
node = EventLoopNode(
judge=judge,
@@ -2147,6 +2163,9 @@ class TestToolDoomLoopIntegration:
is_error=False,
)
# is_subagent_mode=True opts out of worker auto-escalation. The
# test still exercises worker doom-loop escalation (a separate path)
# via the doom-loop detection at event_loop_node.py:1229.
ctx = build_ctx(
runtime,
spec,
@@ -2154,6 +2173,7 @@ class TestToolDoomLoopIntegration:
llm,
tools=[Tool(name="search", description="s", parameters={})],
stream_id="worker",
is_subagent_mode=True,
)
node = EventLoopNode(
judge=judge,
@@ -2352,12 +2372,14 @@ class TestToolDoomLoopIntegration:
is_error=True,
)
# is_subagent_mode=True opts out of worker auto-escalation.
ctx = build_ctx(
runtime,
node_spec,
buffer,
llm,
tools=[Tool(name="failing_tool", description="s", parameters={})],
is_subagent_mode=True,
)
node = EventLoopNode(
judge=judge,
@@ -2409,42 +2431,6 @@ class TestExecutionId:
adapter = StreamRuntimeAdapter(stream_runtime=mock_stream_runtime, execution_id="exec_456")
assert adapter.execution_id == "exec_456"
def test_build_context_passes_execution_id_from_adapter(self):
"""_build_context picks up execution_id from a StreamRuntimeAdapter runtime."""
from framework.graph.executor import GraphExecutor
from framework.graph.goal import Goal
runtime = MagicMock()
runtime.execution_id = "exec_123"
executor = GraphExecutor(runtime=runtime)
goal = Goal(id="g1", name="test", description="test", success_criteria=[])
node_spec = NodeSpec(
id="n1", name="n1", description="test", node_type="event_loop", output_keys=["r"]
)
ctx = executor._build_context(
node_spec=node_spec, buffer=DataBuffer(), goal=goal, input_data={}
)
assert ctx.execution_id == "exec_123"
def test_build_context_defaults_execution_id_for_plain_runtime(self):
"""Plain Runtime.execution_id returns '' by default."""
from framework.graph.executor import GraphExecutor
from framework.graph.goal import Goal
runtime = MagicMock(spec=Runtime)
runtime.execution_id = ""
executor = GraphExecutor(runtime=runtime)
goal = Goal(id="g1", name="test", description="test", success_criteria=[])
node_spec = NodeSpec(
id="n1", name="n1", description="test", node_type="event_loop", output_keys=["r"]
)
ctx = executor._build_context(
node_spec=node_spec, buffer=DataBuffer(), goal=goal, input_data={}
)
assert ctx.execution_id == ""
# ---------------------------------------------------------------------------
# Subagent data buffer snapshot includes accumulator outputs
+8 -2
View File
@@ -476,7 +476,13 @@ class TestPersistence:
assert restored.messages[0].content == "u1"
@pytest.mark.asyncio
async def test_restore_ignores_run_id_and_loads_all_parts(self):
async def test_restore_filters_by_run_id_for_crash_recovery(self):
"""Restore with a non-legacy run_id only loads parts from that run.
This ensures intentional restarts (new run_id) start fresh while
crash recovery (same run_id) resumes correctly. Legacy parts (no
run_id) and other runs' parts are excluded.
"""
store = MockConversationStore()
await store.write_meta({"system_prompt": "hello"})
await store.write_part(0, {"seq": 0, "role": "user", "content": "legacy"})
@@ -489,7 +495,7 @@ class TestPersistence:
restored = await NodeConversation.restore(store, run_id="run-a")
assert restored is not None
assert [m.content for m in restored.messages] == ["legacy", "run-a", "run-b"]
assert [m.content for m in restored.messages] == ["run-a"]
assert restored.next_seq == 3
@pytest.mark.asyncio
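The restore semantics under test reduce to a simple filter. A sketch, assuming each persisted part dict carries an optional `run_id` field as in the parts written above:

```python
def parts_for_run(parts: list[dict], run_id: str) -> list[dict]:
    # Keep only parts written by the requested run; legacy parts with no
    # run_id and parts from other runs are excluded.
    return [p for p in parts if p.get("run_id") == run_id]
```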
+73 -296
View File
@@ -1,4 +1,4 @@
"""Tests for the queen memory v2 system (reflection + recall)."""
"""Tests for the queen global memory system (reflection + recall)."""
from __future__ import annotations
@@ -15,9 +15,7 @@ from framework.agents.queen.recall_selector import (
format_recall_injection,
select_memories,
)
from framework.agents.queen.reflection_agent import subscribe_worker_memory_triggers
from framework.graph.prompting import build_system_prompt_for_node_context
from framework.runtime.event_bus import AgentEvent, EventBus, EventType
from framework.tools.queen_lifecycle_tools import QueenPhaseState
# ---------------------------------------------------------------------------
@@ -26,9 +24,9 @@ from framework.tools.queen_lifecycle_tools import QueenPhaseState
def test_parse_frontmatter_valid():
text = "---\nname: foo\ntype: goal\ndescription: bar baz\n---\ncontent"
text = "---\nname: foo\ntype: profile\ndescription: bar baz\n---\ncontent"
fm = qm.parse_frontmatter(text)
assert fm == {"name": "foo", "type": "goal", "description": "bar baz"}
assert fm == {"name": "foo", "type": "profile", "description": "bar baz"}
def test_parse_frontmatter_missing():
@@ -42,34 +40,30 @@ def test_parse_frontmatter_empty():
def test_parse_frontmatter_broken_yaml():
text = "---\n: bad\nno colon\n---\n"
fm = qm.parse_frontmatter(text)
# ": bad" has colon at pos 0, so key is empty → skipped
# "no colon" has no colon → skipped
assert fm == {}
# ---------------------------------------------------------------------------
# parse_memory_type
# parse_global_memory_category
# ---------------------------------------------------------------------------
def test_parse_memory_type_valid():
assert qm.parse_memory_type("goal") == "goal"
assert qm.parse_memory_type("environment") == "environment"
assert qm.parse_memory_type("technique") == "technique"
assert qm.parse_memory_type("reference") == "reference"
assert qm.parse_memory_type("profile") == "profile"
assert qm.parse_memory_type("feedback") == "feedback"
def test_parse_global_memory_category_valid():
assert qm.parse_global_memory_category("profile") == "profile"
assert qm.parse_global_memory_category("preference") == "preference"
assert qm.parse_global_memory_category("environment") == "environment"
assert qm.parse_global_memory_category("feedback") == "feedback"
def test_parse_memory_type_case_insensitive():
assert qm.parse_memory_type("Goal") == "goal"
assert qm.parse_memory_type(" TECHNIQUE ") == "technique"
def test_parse_global_memory_category_case_insensitive():
assert qm.parse_global_memory_category("Profile") == "profile"
assert qm.parse_global_memory_category(" FEEDBACK ") == "feedback"
def test_parse_memory_type_invalid():
assert qm.parse_memory_type("user") is None
assert qm.parse_memory_type("unknown") is None
assert qm.parse_memory_type(None) is None
def test_parse_global_memory_category_invalid():
assert qm.parse_global_memory_category("goal") is None
assert qm.parse_global_memory_category("unknown") is None
assert qm.parse_global_memory_category(None) is None
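A parser consistent with these tests normalizes case and whitespace and accepts only the four global-memory categories. A sketch (the set name `GLOBAL_MEMORY_CATEGORIES` mirrors the constant referenced later in this file, but the body is illustrative):

```python
GLOBAL_MEMORY_CATEGORIES = {"profile", "preference", "environment", "feedback"}

def parse_global_memory_category_sketch(value) -> str | None:
    # Non-strings (including None) and unknown categories map to None.
    if not isinstance(value, str):
        return None
    normalized = value.strip().lower()
    return normalized if normalized in GLOBAL_MEMORY_CATEGORIES else None
```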
# ---------------------------------------------------------------------------
@@ -79,11 +73,11 @@ def test_parse_memory_type_invalid():
def test_memory_file_from_path(tmp_path: Path):
f = tmp_path / "test.md"
f.write_text("---\nname: test\ntype: goal\ndescription: a test\n---\nbody\n")
f.write_text("---\nname: test\ntype: profile\ndescription: a test\n---\nbody\n")
mf = qm.MemoryFile.from_path(f)
assert mf.filename == "test.md"
assert mf.name == "test"
assert mf.type == "goal"
assert mf.type == "profile"
assert mf.description == "a test"
assert mf.mtime > 0
@@ -145,7 +139,7 @@ def test_format_memory_manifest():
filename="a.md",
path=Path("a.md"),
name="a",
type="goal",
type="profile",
description="desc a",
mtime=time.time(),
),
@@ -159,54 +153,12 @@ def test_format_memory_manifest():
),
]
manifest = qm.format_memory_manifest(files)
assert "[goal] a.md" in manifest
assert "[profile] a.md" in manifest
assert "desc a" in manifest
assert "[unknown] b.md" in manifest
assert "(no description)" in manifest
# ---------------------------------------------------------------------------
# memory_freshness_text
# ---------------------------------------------------------------------------
def test_memory_freshness_text_recent():
assert qm.memory_freshness_text(time.time()) == ""
def test_memory_freshness_text_old():
three_days_ago = time.time() - 3 * 86_400
text = qm.memory_freshness_text(three_days_ago)
assert "3 days old" in text
assert "point-in-time" in text
# ---------------------------------------------------------------------------
# read_conversation_parts
# ---------------------------------------------------------------------------
@pytest.mark.asyncio
async def test_read_conversation_parts(tmp_path: Path):
parts_dir = tmp_path / "conversations" / "parts"
parts_dir.mkdir(parents=True)
for i in range(5):
(parts_dir / f"{i:010d}.json").write_text(
json.dumps({"role": "user" if i % 2 == 0 else "assistant", "content": f"msg {i}"})
)
msgs = await qm.read_conversation_parts(tmp_path)
assert len(msgs) == 5
assert msgs[0]["content"] == "msg 0"
assert msgs[4]["content"] == "msg 4"
@pytest.mark.asyncio
async def test_read_conversation_parts_empty(tmp_path: Path):
msgs = await qm.read_conversation_parts(tmp_path)
assert msgs == []
# ---------------------------------------------------------------------------
# init_memory_dir
# ---------------------------------------------------------------------------
@@ -233,8 +185,10 @@ async def test_select_memories_empty_dir(tmp_path: Path):
@pytest.mark.asyncio
async def test_select_memories_with_files(tmp_path: Path):
(tmp_path / "a.md").write_text("---\nname: a\ndescription: about A\ntype: goal\n---\nbody")
(tmp_path / "b.md").write_text("---\nname: b\ndescription: about B\ntype: reference\n---\nbody")
(tmp_path / "a.md").write_text("---\nname: a\ndescription: about A\ntype: profile\n---\nbody")
(tmp_path / "b.md").write_text(
"---\nname: b\ndescription: about B\ntype: preference\n---\nbody"
)
llm = AsyncMock()
llm.acomplete.return_value = MagicMock(content=json.dumps({"selected_memories": ["a.md"]}))
@@ -258,7 +212,7 @@ async def test_select_memories_error_returns_empty(tmp_path: Path):
def test_format_recall_injection(tmp_path: Path):
(tmp_path / "a.md").write_text("---\nname: a\n---\nbody of a")
result = format_recall_injection(["a.md"], memory_dir=tmp_path)
assert "Selected Memories" in result
assert "Global Memories" in result
assert "body of a" in result
@@ -266,12 +220,6 @@ def test_format_recall_injection_empty():
assert format_recall_injection([]) == ""
def test_format_recall_injection_custom_heading(tmp_path: Path):
(tmp_path / "a.md").write_text("---\nname: a\n---\nbody of a")
result = format_recall_injection(["a.md"], memory_dir=tmp_path, heading="Colony Memories")
assert "Colony Memories" in result
# ---------------------------------------------------------------------------
# reflection_agent
# ---------------------------------------------------------------------------
@@ -279,10 +227,9 @@ def test_format_recall_injection_custom_heading(tmp_path: Path):
@pytest.mark.asyncio
async def test_short_reflection(tmp_path: Path):
"""Short reflection reads new messages and writes a memory file via LLM tools."""
"""Short reflection reads messages and writes a global memory file via LLM tools."""
from framework.agents.queen.reflection_agent import run_short_reflection
# Set up a fake session dir with conversation parts.
parts_dir = tmp_path / "session" / "conversations" / "parts"
parts_dir.mkdir(parents=True)
for i in range(3):
@@ -291,13 +238,12 @@ async def test_short_reflection(tmp_path: Path):
json.dumps({"role": role, "content": f"message {i}"})
)
mem_dir = tmp_path / "memories"
mem_dir = tmp_path / "global_memory"
mem_dir.mkdir()
# Mock LLM: turn 1 writes a memory file, turn 2 stops.
llm = AsyncMock()
llm.acomplete.side_effect = [
# Turn 1: LLM calls write_memory_file
# Turn 1: LLM writes a global memory file
MagicMock(
content="",
raw_response={
@@ -309,7 +255,7 @@ async def test_short_reflection(tmp_path: Path):
"filename": "user-likes-tests.md",
"content": (
"---\nname: user-likes-tests\n"
"type: technique\n"
"type: preference\n"
"description: User values thorough testing\n"
"---\nObserved emphasis on test coverage."
),
@@ -318,23 +264,36 @@ async def test_short_reflection(tmp_path: Path):
]
},
),
# Turn 2: LLM has no more tool calls → done
# Turn 2: done
MagicMock(content="Done reflecting.", raw_response={}),
]
session_dir = tmp_path / "session"
await run_short_reflection(
session_dir,
llm,
memory_dir=mem_dir,
caller="queen",
)
await run_short_reflection(session_dir, llm, memory_dir=mem_dir)
# Verify the memory file was created.
written = mem_dir / "user-likes-tests.md"
assert written.exists()
assert "user-likes-tests" in written.read_text()
assert llm.acomplete.call_count == 2
@pytest.mark.asyncio
async def test_short_reflection_rejects_non_global_types(tmp_path: Path):
"""Reflection agent rejects memory types not in GLOBAL_MEMORY_CATEGORIES."""
from framework.agents.queen.reflection_agent import _execute_tool
mem_dir = tmp_path / "global_memory"
mem_dir.mkdir()
result = _execute_tool(
"write_memory_file",
{
"filename": "bad-type.md",
"content": "---\nname: bad\ntype: goal\n---\nbody",
},
mem_dir,
)
assert "ERROR" in result
assert not (mem_dir / "bad-type.md").exists()
@pytest.mark.asyncio
@@ -342,18 +301,17 @@ async def test_long_reflection(tmp_path: Path):
"""Long reflection reads all memories and can merge/delete them."""
from framework.agents.queen.reflection_agent import run_long_reflection
mem_dir = tmp_path / "memories"
mem_dir = tmp_path / "global_memory"
mem_dir.mkdir()
(mem_dir / "dup-a.md").write_text(
"---\nname: dup-a\ntype: goal\ndescription: goal A\n---\nGoal A details."
"---\nname: dup-a\ntype: profile\ndescription: profile A\n---\nProfile A details."
)
(mem_dir / "dup-b.md").write_text(
"---\nname: dup-b\ntype: goal\ndescription: goal A duplicate\n---\nSame goal A."
"---\nname: dup-b\ntype: profile\ndescription: profile A dup\n---\nSame profile A."
)
llm = AsyncMock()
llm.acomplete.side_effect = [
# Turn 1: LLM lists files
MagicMock(
content="",
raw_response={
@@ -362,7 +320,6 @@ async def test_long_reflection(tmp_path: Path):
]
},
),
# Turn 2: LLM merges dup-b into dup-a and deletes dup-b
MagicMock(
content="",
raw_response={
@@ -373,10 +330,9 @@ async def test_long_reflection(tmp_path: Path):
"input": {
"filename": "dup-a.md",
"content": (
"---\nname: dup-a\ntype: goal\n"
"description: goal A (merged)\n"
"---\nGoal A details."
" Also same goal A."
"---\nname: dup-a\ntype: profile\n"
"description: profile A (merged)\n"
"---\nProfile A details. Also same profile A."
),
},
},
@@ -388,21 +344,18 @@ async def test_long_reflection(tmp_path: Path):
]
},
),
# Turn 3: done
MagicMock(content="Housekeeping complete.", raw_response={}),
]
await run_long_reflection(llm, memory_dir=mem_dir, caller="queen")
await run_long_reflection(llm, memory_dir=mem_dir)
# dup-b should be deleted, dup-a should be updated.
assert not (mem_dir / "dup-b.md").exists()
assert (mem_dir / "dup-a.md").exists()
assert "merged" in (mem_dir / "dup-a.md").read_text()
assert llm.acomplete.call_count == 3
# ---------------------------------------------------------------------------
# Bug 1: Path traversal prevention
# Path traversal prevention
# ---------------------------------------------------------------------------
@@ -412,7 +365,6 @@ def test_path_traversal_read(tmp_path: Path):
(tmp_path / "safe.md").write_text("safe content")
result = _execute_tool("read_memory_file", {"filename": "../../etc/passwd"}, tmp_path)
assert "ERROR" in result
assert "path components not allowed" in result.lower() or "escapes" in result.lower()
def test_path_traversal_write(tmp_path: Path):
@@ -427,21 +379,12 @@ def test_path_traversal_write(tmp_path: Path):
assert not (tmp_path.parent / "escape.md").exists()
def test_path_traversal_delete(tmp_path: Path):
from framework.agents.queen.reflection_agent import _execute_tool
(tmp_path / "target.md").write_text("content")
result = _execute_tool("delete_memory_file", {"filename": "../target.md"}, tmp_path)
assert "ERROR" in result
assert (tmp_path / "target.md").exists() # not deleted
def test_safe_path_accepted(tmp_path: Path):
from framework.agents.queen.reflection_agent import _execute_tool
result = _execute_tool(
"write_memory_file",
{"filename": "good-file.md", "content": "---\nname: good\n---\ncontent"},
{"filename": "good-file.md", "content": "---\nname: good\ntype: profile\n---\ncontent"},
tmp_path,
)
assert "Wrote" in result
@@ -454,71 +397,9 @@ def test_safe_path_accepted(tmp_path: Path):
assert "Deleted" in result
def test_init_memory_dir_migrates_shared_memories_into_colony(tmp_path: Path):
source = tmp_path / "legacy-shared"
source.mkdir()
(source / "shared-memory.md").write_text(
"---\nname: shared\ndescription: old shared memory\ntype: goal\n---\nbody",
encoding="utf-8",
)
target = tmp_path / "colony"
qm.migrate_shared_v2_memories(target, source_dir=source)
assert (target / "shared-memory.md").exists()
assert not (source / "shared-memory.md").exists()
assert (target / ".migrated-from-shared-memory").exists()
def test_shared_memory_migration_marker_prevents_repeat(tmp_path: Path):
source = tmp_path / "legacy-shared"
source.mkdir()
target = tmp_path / "colony"
target.mkdir()
(target / ".migrated-from-shared-memory").write_text("done\n", encoding="utf-8")
(source / "shared-memory.md").write_text("body", encoding="utf-8")
qm.migrate_shared_v2_memories(target, source_dir=source)
assert not (target / "shared-memory.md").exists()
assert (source / "shared-memory.md").exists()
def test_global_memory_is_not_populated_by_colony_migration(tmp_path: Path):
source = tmp_path / "legacy-shared"
source.mkdir()
(source / "shared-memory.md").write_text("body", encoding="utf-8")
colony = tmp_path / "colony"
global_dir = tmp_path / "global"
qm.migrate_shared_v2_memories(colony, source_dir=source)
qm.init_memory_dir(global_dir)
assert list(global_dir.glob("*.md")) == []
def test_save_global_memory_rejects_runtime_details(tmp_path: Path):
with pytest.raises(ValueError):
qm.save_global_memory(
category="profile",
description="codebase preference",
content="The user wants the worker graph to use node retries.",
memory_dir=tmp_path,
)
def test_save_global_memory_persists_frontmatter(tmp_path: Path):
filename, path = qm.save_global_memory(
category="preference",
description="Prefers concise updates",
content="The user prefers concise, direct status updates.",
memory_dir=tmp_path,
)
assert filename.endswith(".md")
text = path.read_text(encoding="utf-8")
assert "type: preference" in text
assert "Prefers concise updates" in text
# ---------------------------------------------------------------------------
# system prompt integration
# ---------------------------------------------------------------------------
def test_build_system_prompt_injects_dynamic_memory():
@@ -532,134 +413,30 @@ def test_build_system_prompt_injects_dynamic_memory():
skills_catalog_prompt="",
protocols_prompt="",
memory_prompt="",
dynamic_memory_provider=lambda: "--- Colony Memories ---\nremember this",
dynamic_memory_provider=lambda: "--- Global Memories ---\nremember this",
is_subagent_mode=False,
)
prompt = build_system_prompt_for_node_context(ctx)
assert "Colony Memories" in prompt
assert "Global Memories" in prompt
assert "remember this" in prompt
def test_queen_phase_state_appends_colony_and_global_memory_blocks():
def test_queen_phase_state_appends_global_memory_block():
phase = QueenPhaseState(
prompt_building="base prompt",
_cached_colony_recall_block="--- Colony Memories ---\ncolony",
_cached_global_recall_block="--- Global Memories ---\nglobal",
_cached_global_recall_block="--- Global Memories ---\nglobal stuff",
)
prompt = phase.get_current_prompt()
assert "base prompt" in prompt
assert "Colony Memories" in prompt
assert "Global Memories" in prompt
assert "global stuff" in prompt
@pytest.mark.asyncio
async def test_worker_colony_reflection_at_handoff(tmp_path: Path):
"""Colony reflection runs via WorkerAgent._reflect_colony_memory at node handoff."""
import asyncio
def test_queen_phase_state_prompt_without_memory():
phase = QueenPhaseState(prompt_building="base prompt")
from framework.graph.context import GraphContext
from framework.graph.worker_agent import WorkerAgent
worker_sessions_dir = tmp_path / "worker-sessions"
execution_id = "exec-1"
session_dir = worker_sessions_dir / execution_id / "conversations" / "parts"
session_dir.mkdir(parents=True)
(session_dir / "0000000000.json").write_text(
json.dumps({"role": "user", "content": "Please remember I like terse summaries."}),
encoding="utf-8",
)
(session_dir / "0000000001.json").write_text(
json.dumps({"role": "assistant", "content": "I'll keep that in mind."}),
encoding="utf-8",
)
colony_dir = tmp_path / "colony"
colony_dir.mkdir()
recall_cache: dict[str, str] = {execution_id: ""}
reflect_llm = AsyncMock()
reflect_llm.acomplete.side_effect = [
# Short reflection: write a memory file
MagicMock(
content="",
raw_response={
"tool_calls": [
{
"id": "tc_1",
"name": "write_memory_file",
"input": {
"filename": "user-prefers-terse-summaries.md",
"content": (
"---\n"
"name: user-prefers-terse-summaries\n"
"description: Prefers terse summaries\n"
"type: preference\n"
"---\n\n"
"The user prefers terse summaries."
),
},
}
]
},
),
# Short reflection done
MagicMock(content="done", raw_response={}),
# Recall selector picks the new memory
MagicMock(content=json.dumps({"selected_memories": ["user-prefers-terse-summaries.md"]})),
]
# Build a minimal GraphContext with colony memory fields
gc = MagicMock(spec=GraphContext)
gc.colony_memory_dir = colony_dir
gc.worker_sessions_dir = worker_sessions_dir
gc.colony_recall_cache = recall_cache
gc.colony_reflect_llm = reflect_llm
gc.execution_id = execution_id
gc._colony_reflect_lock = asyncio.Lock()
node_spec = SimpleNamespace(id="test-node")
worker = WorkerAgent.__new__(WorkerAgent)
worker._gc = gc
worker.node_spec = node_spec
await worker._reflect_colony_memory()
assert (colony_dir / "user-prefers-terse-summaries.md").exists()
assert "Colony Memories" in recall_cache[execution_id]
assert "terse summaries" in recall_cache[execution_id]
@pytest.mark.asyncio
async def test_subscribe_worker_triggers_only_lifecycle_events(tmp_path: Path):
"""After simplification, worker triggers only subscribe to start and terminal events."""
colony_dir = tmp_path / "colony"
colony_dir.mkdir()
recall_cache: dict[str, str] = {}
bus = EventBus()
llm = AsyncMock()
subs = await subscribe_worker_memory_triggers(
bus,
llm,
worker_sessions_dir=tmp_path / "sessions",
colony_memory_dir=colony_dir,
recall_cache=recall_cache,
)
try:
# Should have exactly 2 subscriptions (start + terminal)
assert len(subs) == 2
# EXECUTION_STARTED initialises cache
await bus.publish(
AgentEvent(
type=EventType.EXECUTION_STARTED,
stream_id="default",
execution_id="exec-1",
)
)
assert recall_cache.get("exec-1") == ""
finally:
for sub_id in subs:
bus.unsubscribe(sub_id)
prompt = phase.get_current_prompt()
assert "base prompt" in prompt
assert "Global Memories" not in prompt
+115
View File
@@ -9,6 +9,7 @@ AST nodes, disallowed function calls).
import pytest
import framework.graph.safe_eval as safe_eval_module
from framework.graph.safe_eval import safe_eval
# ---------------------------------------------------------------------------
@@ -94,10 +95,124 @@ class TestArithmetic:
def test_power(self):
assert safe_eval("2 ** 10") == 1024
def test_power_large_exponent_blocked(self):
with pytest.raises(ValueError, match="Power exponent"):
safe_eval("2 ** 1001")
def test_power_large_result_blocked(self):
with pytest.raises(ValueError, match="Power operation"):
safe_eval("99 ** 1000")
def test_nested_power_blocked(self):
with pytest.raises(ValueError, match="Power exponent"):
safe_eval("2 ** 2 ** 20")
def test_complex_expression(self):
assert safe_eval("(2 + 3) * 4 - 1") == 19
class TestExecutionTimeout:
def test_default_timeout(self):
assert safe_eval_module.DEFAULT_TIMEOUT_MS == 100
def test_timeout_must_be_positive(self):
with pytest.raises(ValueError, match="timeout_ms"):
safe_eval("1 + 1", timeout_ms=0)
def test_timeout_can_be_disabled(self):
assert safe_eval("1 + 1", timeout_ms=None) == 2
def test_timeout_exceeded_raises(self, monkeypatch):
ticks = iter([0.0, 1.0])
monkeypatch.setattr(safe_eval_module.time, "perf_counter", lambda: next(ticks))
with pytest.raises(TimeoutError, match="1ms"):
safe_eval("1 + 1", timeout_ms=1)
def test_existing_process_timer_is_preserved(self, monkeypatch):
calls: list[tuple[str, object]] = []
main_thread = object()
monkeypatch.setattr(safe_eval_module.signal, "SIGALRM", object(), raising=False)
monkeypatch.setattr(safe_eval_module.signal, "ITIMER_REAL", object(), raising=False)
monkeypatch.setattr(
safe_eval_module.signal,
"getitimer",
lambda which: (5.0, 0.0),
raising=False,
)
monkeypatch.setattr(
safe_eval_module.signal,
"setitimer",
lambda *args: calls.append(("setitimer", args)),
raising=False,
)
monkeypatch.setattr(
safe_eval_module.signal,
"signal",
lambda *args: calls.append(("signal", args)),
)
monkeypatch.setattr(safe_eval_module.threading, "main_thread", lambda: main_thread)
monkeypatch.setattr(
safe_eval_module.threading,
"current_thread",
lambda: main_thread,
)
with safe_eval_module._execution_timeout(100):
pass
assert calls == []
def test_timeout_restores_alarm_state(self, monkeypatch):
calls: list[tuple[str, object]] = []
main_thread = object()
old_handler = object()
monkeypatch.setattr(safe_eval_module.signal, "SIGALRM", object(), raising=False)
monkeypatch.setattr(safe_eval_module.signal, "ITIMER_REAL", object(), raising=False)
monkeypatch.setattr(
safe_eval_module.signal,
"getitimer",
lambda which: (0.0, 0.0),
raising=False,
)
monkeypatch.setattr(
safe_eval_module.signal,
"getsignal",
lambda which: old_handler,
)
def fake_signal(which, handler):
calls.append(("signal", handler))
def fake_setitimer(which, delay, interval=0.0):
calls.append(("setitimer", (delay, interval)))
return (0.0, 0.0)
monkeypatch.setattr(safe_eval_module.signal, "signal", fake_signal)
monkeypatch.setattr(
safe_eval_module.signal,
"setitimer",
fake_setitimer,
raising=False,
)
monkeypatch.setattr(safe_eval_module.threading, "main_thread", lambda: main_thread)
monkeypatch.setattr(
safe_eval_module.threading,
"current_thread",
lambda: main_thread,
)
with safe_eval_module._execution_timeout(100):
pass
assert calls[0][0] == "signal"
assert calls[1] == ("setitimer", (0.1, 0.0))
assert calls[2] == ("signal", old_handler)
assert calls[3] == ("setitimer", (0.0, 0.0))
# ---------------------------------------------------------------------------
# Unary operators
# ---------------------------------------------------------------------------
@@ -147,8 +147,6 @@ async def test_load_worker_core_defaults_to_session_llm_model(monkeypatch, tmp_p
monkeypatch.setattr("framework.runner.AgentRunner.load", fake_load)
monkeypatch.setattr(manager, "_cleanup_stale_active_sessions", lambda *_args: None)
monkeypatch.setattr(manager, "_subscribe_worker_digest", lambda *_args: None)
monkeypatch.setattr(manager, "_subscribe_worker_colony_memory", AsyncMock())
monkeypatch.setattr(
"framework.tools.queen_lifecycle_tools._read_agent_triggers_json",
lambda *_args: [],
@@ -159,7 +157,6 @@ async def test_load_worker_core_defaults_to_session_llm_model(monkeypatch, tmp_p
assert load_calls[0]["model"] == "queen-shared-model"
assert session.runner is runner
assert session.runner._llm is session_llm
assert runtime._dynamic_memory_provider_factory is not None
@pytest.mark.asyncio
@@ -184,8 +181,6 @@ async def test_load_worker_core_keeps_explicit_worker_model_override(monkeypatch
monkeypatch.setattr("framework.runner.AgentRunner.load", fake_load)
monkeypatch.setattr(manager, "_cleanup_stale_active_sessions", lambda *_args: None)
monkeypatch.setattr(manager, "_subscribe_worker_digest", lambda *_args: None)
monkeypatch.setattr(manager, "_subscribe_worker_colony_memory", AsyncMock())
monkeypatch.setattr(
"framework.tools.queen_lifecycle_tools._read_agent_triggers_json",
lambda *_args: [],
@@ -201,38 +196,4 @@ async def test_load_worker_core_keeps_explicit_worker_model_override(monkeypatch
assert session.runner is runner
assert session.runner._llm is None
@pytest.mark.asyncio
async def test_load_worker_core_continues_when_colony_memory_subscription_fails(
monkeypatch, tmp_path
) -> None:
bus = EventBus()
manager = SessionManager(model="manager-default")
session_llm = SimpleNamespace(model="queen-shared-model")
session = Session(id="session_memory_warning", event_bus=bus, llm=session_llm, loaded_at=0.0)
runtime = SimpleNamespace(is_running=True)
runner = SimpleNamespace(
_llm=None,
_agent_runtime=runtime,
info=MagicMock(return_value={"id": "worker"}),
)
monkeypatch.setattr("framework.runner.AgentRunner.load", lambda *args, **kwargs: runner)
monkeypatch.setattr(manager, "_cleanup_stale_active_sessions", lambda *_args: None)
monkeypatch.setattr(manager, "_subscribe_worker_digest", lambda *_args: None)
monkeypatch.setattr(
manager,
"_subscribe_worker_colony_memory",
AsyncMock(side_effect=ImportError("optional memory hook unavailable")),
)
monkeypatch.setattr(
"framework.tools.queen_lifecycle_tools._read_agent_triggers_json",
lambda *_args: [],
)
await manager._load_worker_core(session, tmp_path / "worker_agent")
assert session.runner is runner
assert session.graph_runtime is runtime
assert session.worker_path == tmp_path / "worker_agent"
@@ -11,6 +11,7 @@ def _make_conversation() -> NodeConversation:
conv._next_seq = 0
conv._current_phase = None
conv._store = None
conv._run_id = None
return conv
+5 -5
View File
@@ -688,20 +688,20 @@ def test_convert_mcp_tool_strips_context_params():
input_schema={
"type": "object",
"properties": {
"workspace_id": {"type": "string"}, # context param → stripped
"agent_id": {"type": "string"}, # context param → stripped
"data_dir": {"type": "string"}, # context param → stripped
"query": {"type": "string"}, # regular param → kept
},
"required": ["workspace_id", "query"],
"required": ["agent_id", "query"],
},
)
tool = registry._convert_mcp_tool_to_framework_tool(mcp_tool) # noqa: SLF001
props = tool.parameters["properties"]
assert "workspace_id" not in props
assert "agent_id" not in props
assert "data_dir" not in props
assert "query" in props
# workspace_id should also be stripped from required
assert "workspace_id" not in tool.parameters["required"]
# agent_id should also be stripped from required
assert "agent_id" not in tool.parameters["required"]
assert "query" in tool.parameters["required"]
+291
View File
@@ -0,0 +1,291 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Hive — Browser Extension Setup</title>
<style>
:root {
--bg: #0d1117;
--surface: #161b22;
--border: #30363d;
--text: #e6edf3;
--text-muted: #8b949e;
--accent: #58a6ff;
--accent-subtle: #1f6feb33;
--green: #3fb950;
--yellow: #d29922;
--orange: #f0883e;
}
* { box-sizing: border-box; margin: 0; padding: 0; }
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Helvetica, Arial, sans-serif;
background: var(--bg);
color: var(--text);
line-height: 1.6;
padding: 2rem;
}
.container {
max-width: 720px;
margin: 0 auto;
}
h1 {
font-size: 1.75rem;
margin-bottom: 0.5rem;
}
.subtitle {
color: var(--text-muted);
margin-bottom: 2rem;
}
.step {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 8px;
padding: 1.5rem;
margin-bottom: 1.25rem;
}
.step-header {
display: flex;
align-items: center;
gap: 0.75rem;
margin-bottom: 0.75rem;
}
.step-number {
display: inline-flex;
align-items: center;
justify-content: center;
width: 28px;
height: 28px;
border-radius: 50%;
background: var(--accent);
color: #fff;
font-weight: 600;
font-size: 0.875rem;
flex-shrink: 0;
}
.step-title {
font-weight: 600;
font-size: 1.1rem;
}
.step p {
margin-bottom: 0.5rem;
color: var(--text-muted);
}
a {
color: var(--accent);
text-decoration: none;
}
a:hover {
text-decoration: underline;
}
.chrome-link {
display: inline-block;
background: var(--accent);
color: #fff;
border: none;
border-radius: 6px;
padding: 0.6rem 1.25rem;
margin-top: 0.75rem;
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Helvetica, Arial, sans-serif;
font-size: 0.95rem;
font-weight: 600;
cursor: pointer;
transition: background 0.15s;
text-decoration: none;
}
.chrome-link:hover {
background: #1f6feb;
text-decoration: none;
}
code {
background: #1c2128;
padding: 0.15rem 0.4rem;
border-radius: 4px;
font-size: 0.9rem;
}
.path-box {
background: #1c2128;
border: 1px solid var(--border);
border-radius: 6px;
padding: 0.75rem 1rem;
margin-top: 0.5rem;
font-family: monospace;
font-size: 0.9rem;
word-break: break-all;
user-select: all;
display: flex;
align-items: center;
justify-content: space-between;
gap: 0.75rem;
}
.copy-btn {
background: var(--border);
color: var(--text);
border: none;
border-radius: 4px;
padding: 0.35rem 0.7rem;
font-size: 0.8rem;
font-weight: 600;
cursor: pointer;
white-space: nowrap;
flex-shrink: 0;
transition: background 0.15s;
}
.copy-btn:hover {
background: var(--accent);
}
.screenshot-placeholder {
background: #1c2128;
border: 1px dashed var(--border);
border-radius: 6px;
padding: 2rem;
margin-top: 0.75rem;
text-align: center;
color: var(--text-muted);
font-size: 0.85rem;
}
.note {
background: #d2992215;
border-left: 3px solid var(--yellow);
border-radius: 0 6px 6px 0;
padding: 0.75rem 1rem;
margin-top: 0.75rem;
font-size: 0.9rem;
color: var(--text-muted);
}
.done-section {
text-align: center;
margin-top: 2rem;
padding: 1.5rem;
background: var(--surface);
border: 1px solid var(--border);
border-radius: 8px;
}
.done-section p {
color: var(--green);
font-weight: 600;
font-size: 1.1rem;
}
.done-section .hint {
color: var(--text-muted);
font-size: 0.9rem;
margin-top: 0.5rem;
}
</style>
</head>
<body>
<div class="container">
<h1>Hive Browser Extension Setup</h1>
<p class="subtitle">Follow these steps to load the Hive Browser Bridge extension into Chrome.</p>
<!-- Step 1 -->
<div class="step">
<div class="step-header">
<span class="step-number">1</span>
<span class="step-title">Go to Chrome Extension Settings</span>
</div>
<p>Open the Chrome extensions page and enable <strong style="color:var(--text)">Developer mode</strong> using the toggle in the top-right corner.</p>
<button class="chrome-link" id="chrome-link-btn">Copy chrome://extensions/</button>
<div class="note">Chrome doesn't allow pages to open <code>chrome://</code> URLs directly. Click the button above to copy the link, then paste it into your address bar and press Enter.</div>
<img width="1236" height="921" alt="Image" src="https://github.com/user-attachments/assets/36aef101-ac02-4940-9bbd-4613167d6030" style="max-width:100%;height:auto;border-radius:6px;margin-top:0.75rem;" />
</div>
<!-- Step 2 -->
<div class="step">
<div class="step-header">
<span class="step-number">2</span>
<span class="step-title">Click "Load unpacked"</span>
</div>
<p>Once Developer mode is enabled, you'll see a button bar appear. Click <strong style="color:var(--text)">Load unpacked</strong>.</p>
<img width="1236" height="921" alt="Image" src="https://github.com/user-attachments/assets/9200925c-1959-484f-b05c-e2e77b858b32" style="max-width:100%;height:auto;border-radius:6px;margin-top:0.75rem;" />
</div>
<!-- Step 3 -->
<div class="step">
<div class="step-header">
<span class="step-number">3</span>
<span class="step-title">Select the extension folder</span>
</div>
<p>In the folder picker, paste the path below. Click <strong style="color:var(--text)">Copy</strong> to copy it to your clipboard.</p>
<div class="path-box">
<span id="extension-path">tools/browser-extension</span>
<button class="copy-btn" id="copy-path-btn">Copy</button>
</div>
<div class="note">Alternatively, you can navigate there manually: open the folder where you cloned the Hive repo, then go into <code>tools</code><code>browser-extension</code>.</div>
<img width="1236" height="921" alt="Image" src="https://github.com/user-attachments/assets/119b8794-d956-4278-9284-3f122597b34c" style="max-width:100%;height:auto;border-radius:6px;margin-top:0.75rem;" />
</div>
<!-- Step 4 -->
<div class="step">
<div class="step-header">
<span class="step-number">4</span>
<span class="step-title">Verify the extension loaded</span>
</div>
<p>You should see <strong style="color:var(--text)">Hive Browser Bridge</strong> appear in your extensions list. Make sure it is enabled.</p>
<img width="423" height="250" alt="Image" src="https://github.com/user-attachments/assets/2c2bb008-d49e-4dcf-8431-44209ded1783" style="max-width:100%;height:auto;border-radius:6px;margin-top:0.75rem;" />
</div>
<div class="done-section">
<p>You're all set!</p>
<p class="hint">Return to your terminal and press Enter to continue the quickstart.</p>
</div>
</div>
<script>
// Populate extension path from URL parameter, or derive from this file's location
const params = new URLSearchParams(window.location.search);
const extPath = params.get('path');
if (extPath) {
document.getElementById('extension-path').textContent = extPath;
} else if (window.location.protocol === 'file:') {
// Derive absolute path from this HTML file's own location:
// this file is at <repo>/docs/browser-extension-setup.html
// the extension is at <repo>/tools/browser-extension
const filePath = decodeURIComponent(window.location.pathname);
const docsDir = filePath.substring(0, filePath.lastIndexOf('/'));
const repoDir = docsDir.substring(0, docsDir.lastIndexOf('/'));
document.getElementById('extension-path').textContent = repoDir + '/tools/browser-extension';
}
// Copy chrome://extensions/ button
document.getElementById('chrome-link-btn').addEventListener('click', function() {
navigator.clipboard.writeText('chrome://extensions/').then(() => {
this.textContent = 'Copied!';
setTimeout(() => { this.textContent = 'Copy chrome://extensions/'; }, 1500);
});
});
// Copy path button
document.getElementById('copy-path-btn').addEventListener('click', function() {
const path = document.getElementById('extension-path').textContent;
navigator.clipboard.writeText(path).then(() => {
this.textContent = 'Copied!';
setTimeout(() => { this.textContent = 'Copy'; }, 1500);
});
});
</script>
</body>
</html>
-1
View File
@@ -99,7 +99,6 @@ hive/ # Repository root
│ │ ├── storage/ # File-based persistence
│ │ ├── testing/ # Testing utilities
│ │ ├── tools/ # Built-in tool implementations
│ │ ├── tui/ # Terminal UI dashboard
│ │ └── utils/ # Shared utilities
│ ├── tests/ # Unit and E2E tests (including dummy agents)
│ ├── pyproject.toml # Package metadata and dependencies
+2 -2
View File
@@ -589,14 +589,14 @@ if ($NodeAvailable) {
Write-Host " Installing npm packages... " -NoNewline
Push-Location $frontendDir
try {
$installOutput = & npm install --no-fund --no-audit 2>&1
$installOutput = npm install --no-fund --no-audit 2>&1
if ($LASTEXITCODE -eq 0) {
Write-Ok "ok"
# Clean stale tsbuildinfo cache — tsc -b incremental builds fail
# silently when these are out of sync with source files
Get-ChildItem -Path $frontendDir -Filter "tsconfig*.tsbuildinfo" -ErrorAction SilentlyContinue | Remove-Item -Force
Write-Host " Building frontend... " -NoNewline
$buildOutput = & npm run build 2>&1
$buildOutput = npm run build 2>&1
if ($LASTEXITCODE -eq 0) {
Write-Ok "ok"
Write-Ok "Frontend built -> core/frontend/dist/"
+14 -33
View File
@@ -1932,47 +1932,28 @@ else
printf '%s' "$EXTENSION_PATH" | pbcopy 2>/dev/null && _copied=true
fi
# Show instructions first, then wait for the user before opening Chrome
echo -e " ${BOLD}When Chrome opens to the extensions page, you will need to:${NC}"
echo ""
echo -e " ${CYAN}1.${NC} Enable ${BOLD}Developer mode${NC} (toggle in the top-right corner)"
echo -e " ${CYAN}2.${NC} Click ${BOLD}Load unpacked${NC}"
echo -e " ${CYAN}3.${NC} Paste this path into the folder picker:"
echo ""
echo -e " ${BOLD}$EXTENSION_PATH${NC}"
echo ""
if [ "${_copied:-false}" = "true" ]; then
echo -e " ${DIM}(path already copied to clipboard — just Ctrl+V in the folder picker)${NC}"
echo ""
fi
read -r -p " Press Enter when you are ready to set up the Chrome extension... " _dummy || true
echo ""
# Open chrome://extensions in Chrome
echo " Opening chrome://extensions in Chrome..."
# Open setup guide in default browser
SETUP_URL="file://$SCRIPT_DIR/docs/browser-extension-setup.html?path=$(printf '%s' "$EXTENSION_PATH" | sed 's/ /%20/g')"
echo -e " Opening browser extension setup guide..."
if [ "${_copied:-false}" = "true" ]; then
echo -e " ${DIM}(extension path copied to clipboard — paste it in the folder picker)${NC}"
fi
if [[ "$OSTYPE" == darwin* ]]; then
# macOS: use open -a to properly handle chrome:// URLs
_chrome_app=""
if [[ "$CHROME_BIN" == *"Google Chrome"* ]]; then
_chrome_app="Google Chrome"
elif [[ "$CHROME_BIN" == *"Microsoft Edge"* ]]; then
_chrome_app="Microsoft Edge"
elif [[ "$CHROME_BIN" == *"Chromium"* ]]; then
_chrome_app="Chromium"
fi
if [ -n "$_chrome_app" ]; then
open -a "$_chrome_app" "chrome://extensions" 2>/dev/null
open "$SETUP_URL" 2>/dev/null
elif command -v xdg-open &> /dev/null; then
xdg-open "$SETUP_URL" > /dev/null 2>&1 &
elif command -v wslview &> /dev/null; then
wslview "$SETUP_URL" > /dev/null 2>&1 &
else
"$CHROME_BIN" "chrome://extensions" > /dev/null 2>&1 &
echo -e " ${DIM}Could not open browser automatically. Visit:${NC}"
echo -e " ${BOLD}$SETUP_URL${NC}"
fi
else
"$CHROME_BIN" "chrome://extensions" > /dev/null 2>&1 &
fi
sleep 1
echo ""
read -r -p " Press Enter once you see 'Hive Browser Bridge' in the extensions list... " _dummy || true
read -r -p " Press Enter once you've finished the extension setup... " _dummy || true
CHROME_LAUNCHED=true
fi
+2 -4
View File
@@ -26,6 +26,8 @@ dependencies = [
"fastmcp>=2.0.0",
"diff-match-patch>=20230430",
"python-dotenv>=1.0.0",
"playwright>=1.40.0",
"playwright-stealth>=2.0.0",
"litellm==1.81.7", # pinned: supply chain attack in >=1.82.7 (adenhq/hive#6783)
"dnspython>=2.4.0",
"resend>=2.0.0",
@@ -49,8 +51,6 @@ sandbox = [
]
browser = [
"pillow>=10.0.0",
"playwright>=1.40.0",
"playwright-stealth>=2.0.0",
]
ocr = [
"pytesseract>=0.3.10",
@@ -78,8 +78,6 @@ all = [
"google-cloud-bigquery>=3.0.0",
"databricks-sdk>=0.30.0",
"databricks-mcp>=0.1.0",
"playwright>=1.40.0",
"playwright-stealth>=2.0.0",
]
[tool.uv.sources]
@@ -0,0 +1,59 @@
# AWS S3 Tool
Manage Amazon S3 buckets and objects using AWS Signature V4 authentication.
## Tools
| Tool | Description |
|------|-------------|
| `s3_list_buckets` | List all S3 buckets in the account |
| `s3_list_objects` | List objects in a bucket with optional prefix filter |
| `s3_get_object` | Download an object's content (text or base64) |
| `s3_put_object` | Upload content to an S3 object |
| `s3_delete_object` | Delete an object from a bucket |
| `s3_copy_object` | Copy an object between buckets or keys |
| `s3_get_object_metadata` | Get object metadata (size, content type, ETag) |
| `s3_generate_presigned_url` | Generate a pre-signed URL for temporary access |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `AWS_ACCESS_KEY_ID` | AWS access key |
| `AWS_SECRET_ACCESS_KEY` | AWS secret key |
| `AWS_REGION` | AWS region (default: `us-east-1`) |
Get credentials at: [AWS Console](https://console.aws.amazon.com/iam/)
## Usage Examples
### List buckets
```python
s3_list_buckets()
```
### List objects with prefix
```python
s3_list_objects(bucket="my-bucket", prefix="data/", max_keys=20)
```
### Upload a file
```python
s3_put_object(bucket="my-bucket", key="reports/q1.csv", content="col1,col2\n1,2")
```
### Generate a pre-signed URL
```python
s3_generate_presigned_url(bucket="my-bucket", key="file.pdf", expires_in=3600)
```
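### Download and decode a binary object
Per the tools table, non-text content comes back base64-encoded; a minimal decoding sketch (the `content` response field name is an assumption about the response shape):
```python
import base64
obj = s3_get_object(bucket="my-bucket", key="file.pdf")
pdf_bytes = base64.b64decode(obj["content"])  # response field name assumed
with open("file.pdf", "wb") as f:
    f.write(pdf_bytes)
```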
## Error Handling
All tools return error dicts on failure:
```python
{"error": "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are required", "help": "Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables"}
{"error": "HTTP 404: <NoSuchKey>...</NoSuchKey>"}
{"error": "Request timed out"}
```
@@ -0,0 +1,62 @@
# Azure SQL Tool
Manage Azure SQL servers, databases, and firewall rules via the Azure Management REST API.
## Tools
| Tool | Description |
|------|-------------|
| `azure_sql_list_servers` | List SQL servers in a subscription or resource group |
| `azure_sql_get_server` | Get details of a specific SQL server |
| `azure_sql_list_databases` | List databases on a SQL server |
| `azure_sql_get_database` | Get details of a specific database |
| `azure_sql_list_firewall_rules` | List firewall rules for a SQL server |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `AZURE_SUBSCRIPTION_ID` | Azure subscription ID |
| `AZURE_SQL_ACCESS_TOKEN` | Azure Management API bearer token |
To obtain a token:
1. Register an app in Azure AD (Entra ID)
2. Assign SQL DB Contributor or Reader role
3. Obtain a token via client credentials flow with scope `https://management.azure.com/.default`
See: [Azure SQL REST API](https://learn.microsoft.com/en-us/rest/api/sql/)
Note: Access tokens typically expire within 1 hour and require refresh.
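A minimal sketch of step 3 using `requests` (the `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, and `AZURE_CLIENT_SECRET` names are illustrative, not read by this tool):
```python
import os
import requests
# Client credentials flow against the standard Azure AD token endpoint.
tenant = os.environ["AZURE_TENANT_ID"]
resp = requests.post(
    f"https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token",
    data={
        "grant_type": "client_credentials",
        "client_id": os.environ["AZURE_CLIENT_ID"],
        "client_secret": os.environ["AZURE_CLIENT_SECRET"],
        "scope": "https://management.azure.com/.default",
    },
    timeout=30,
)
resp.raise_for_status()
# Tokens expire in about an hour, so refresh before long-running work.
os.environ["AZURE_SQL_ACCESS_TOKEN"] = resp.json()["access_token"]
```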
## Usage Examples
### List all SQL servers
```python
azure_sql_list_servers()
```
### List servers in a resource group
```python
azure_sql_list_servers(resource_group="my-rg")
```
### Get databases on a server
```python
azure_sql_list_databases(resource_group="my-rg", server_name="my-server")
```
### Check firewall rules
```python
azure_sql_list_firewall_rules(resource_group="my-rg", server_name="my-server")
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "AZURE_SQL_ACCESS_TOKEN and AZURE_SUBSCRIPTION_ID are required"}
{"error": "Azure API error (HTTP 404): Resource not found"}
{"error": "Request timed out"}
```
@@ -0,0 +1,59 @@
# Cloudinary Tool
Upload, manage, search, and transform media assets using the Cloudinary API.
## Tools
| Tool | Description |
|------|-------------|
| `cloudinary_upload` | Upload an image or file to Cloudinary |
| `cloudinary_list_resources` | List resources with optional type and prefix filters |
| `cloudinary_get_resource` | Get detailed info about a specific resource |
| `cloudinary_delete_resource` | Delete a resource by public ID |
| `cloudinary_search` | Search resources using Cloudinary's search API |
| `cloudinary_get_usage` | Get account usage statistics |
| `cloudinary_rename_resource` | Rename a resource's public ID |
| `cloudinary_add_tag` | Add a tag to one or more resources |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `CLOUDINARY_CLOUD_NAME` | Your Cloudinary cloud name |
| `CLOUDINARY_API_KEY` | API key |
| `CLOUDINARY_API_SECRET` | API secret |
Get credentials at: [Cloudinary Console](https://console.cloudinary.com/)
## Usage Examples
### Upload an image
```python
cloudinary_upload(file_url="https://example.com/photo.jpg", public_id="my-photo")
```
### Search for resources
```python
cloudinary_search(expression="cat AND format:jpg", max_results=10)
```
### Get account usage
```python
cloudinary_get_usage()
```
### Delete a resource
```python
cloudinary_delete_resource(public_id="my-photo")
```
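### Add a tag
A sketch based on the tools table above; the exact parameter names (`public_ids`, `tag`) are assumptions:
```python
cloudinary_add_tag(public_ids=["my-photo"], tag="approved")
```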
## Error Handling
All tools return error dicts on failure:
```python
{"error": "CLOUDINARY_CLOUD_NAME, CLOUDINARY_API_KEY, and CLOUDINARY_API_SECRET not set", "help": "Get credentials from your Cloudinary dashboard at https://console.cloudinary.com/"}
{"error": "Cloudinary API error (HTTP 404): Resource not found"}
{"error": "Request timed out"}
```
@@ -0,0 +1,198 @@
# Confluence Tool
Wiki and knowledge management via Confluence Cloud REST API v2.
## Available Functions
### Spaces & Pages
- `confluence_list_spaces(limit=25)`
- `limit` (int, optional): Max results (1-250, default 25)
- Returns: `{"spaces": [...], "count": N}` with id, key, name, type, status
- `confluence_list_pages(space_id="", title="", limit=25)`
- `space_id` (str, optional): Filter by space ID
- `title` (str, optional): Filter by exact page title
- `limit` (int, optional): Max results (1-250)
- Returns: `{"pages": [...], "count": N}` with id, title, space_id, version
- `confluence_get_page(page_id, body_format="storage")`
- `page_id` (str): Page ID (required)
- `body_format` (str, optional): `"storage"`, `"view"`, or `"atlas_doc_format"`
- Returns: Full page details with body content (truncated to 5000 chars)
- `confluence_get_page_children(page_id, limit=25)`
- `page_id` (str): Parent page ID (required)
- `limit` (int, optional): Max results (1-250)
- Returns: `{"children": [...], "count": N}`
### CRUD Operations
- `confluence_create_page(space_id, title, body, parent_id="")`
- `space_id` (str): Space ID to create page in (required)
- `title` (str): Page title (required)
- `body` (str): Page content in Confluence storage format (XHTML) (required)
- `parent_id` (str, optional): Parent page ID for child pages
- Returns: `{"id": "...", "title": "...", "status": "created"}`
- `confluence_update_page(page_id, title, body, version_number)`
- `page_id` (str): Page ID (required)
- `title` (str): Page title (required, even if unchanged)
- `body` (str): New content in storage format (required)
- `version_number` (int): Current version + 1 (required)
- Returns: `{"id": "...", "title": "...", "version": N, "status": "updated"}`
- `confluence_delete_page(page_id)`
- `page_id` (str): Page ID to delete (required)
- Returns: `{"page_id": "...", "status": "deleted"}`
### Search
- `confluence_search(query, space_key="", limit=25)`
- `query` (str): Search text (used in CQL `text~` query) (required)
- `space_key` (str, optional): Filter by space key (e.g., `"DEV"`)
- `limit` (int, optional): Max results (1-50)
- Returns: `{"results": [...], "count": N}` with title, excerpt, page_id, space
## Required Credentials
Set these environment variables:
```bash
# Your Confluence domain (e.g., your-company.atlassian.net)
export CONFLUENCE_DOMAIN="your-company.atlassian.net"
# Your Atlassian account email
export CONFLUENCE_EMAIL="you@company.com"
# Generate an API token at https://id.atlassian.com/manage/api-tokens
export CONFLUENCE_API_TOKEN="your_api_token_here"
```
> 💡 **Tip**: Make sure your Atlassian account has permission to access the spaces and pages you want to interact with.
## Example Usage
```python
# List all spaces
spaces = confluence_list_spaces(limit=10)
# Returns: {"spaces": [{"id": "123", "key": "DEV", "name": "Development", ...}], ...}
# List pages in a specific space
pages = confluence_list_pages(space_id="123", limit=20)
# Get a specific page's content
page = confluence_get_page(page_id="456", body_format="storage")
# Returns: {"id": "456", "title": "...", "body": "<p>Content...</p>", ...}
# Search for pages containing "API documentation"
results = confluence_search(query="API documentation", space_key="DEV")
# Returns: {"results": [{"title": "...", "excerpt": "...", "page_id": "..."}], ...}
# Create a new page
new_page = confluence_create_page(
space_id="123",
title="Meeting Notes 2026-03-31",
body="<h1>Meeting Notes</h1><p>Attendees: Alice, Bob</p>",
parent_id="456" # Optional: make it a child page
)
# Update an existing page (must increment version number)
# First get current version
current = confluence_get_page(page_id="789")
current_version = current["version"] # e.g., 5
confluence_update_page(
page_id="789",
title="Updated Title",
body="<h1>Updated Content</h1>",
version_number=current_version + 1 # Must be current + 1
)
# Get child pages of a parent
children = confluence_get_page_children(page_id="456")
# Delete a page
confluence_delete_page(page_id="789")
```
## Body Format (Storage Format)
The `body` parameter uses Confluence **storage format** (XHTML-like). Examples:
```python
# Simple paragraph
body = "<p>This is a paragraph.</p>"
# Heading and list
body = """
<h1>Meeting Notes</h1>
<h2>Attendees</h2>
<ul>
<li>Alice</li>
<li>Bob</li>
</ul>
<h2>Action Items</h2>
<ol>
<li>Review PR #123</li>
<li>Update documentation</li>
</ol>
"""
# Code block
body = """
<ac:structured-macro ac:name="code">
<ac:parameter ac:name="language">python</ac:parameter>
<ac:plain-text-body><![CDATA[
def hello():
print("Hello, World!")
]]></ac:plain-text-body>
</ac:structured-macro>
"""
```
## Version Number Requirement
When updating a page, you **must** provide the next version number:
```python
# 1. Get current page
page = confluence_get_page(page_id="123")
current_version = page["version"] # e.g., 5
# 2. Update with version + 1
confluence_update_page(
page_id="123",
title="Same Title",
body="<p>Updated content</p>",
version_number=current_version + 1 # 6 in this example
)
```
## Error Handling
All functions return error dicts on failure:
```python
# Missing credentials
{"error": "CONFLUENCE_DOMAIN, CONFLUENCE_EMAIL, and CONFLUENCE_API_TOKEN not set", "help": "Generate an API token at https://id.atlassian.com/manage/api-tokens"}
# Unauthorized
{"error": "Unauthorized. Check your Confluence credentials."}
# Not found
{"error": "Not found"}
# Wrong version number on update
{"error": "Confluence API error 409: Version mismatch"}
# Request timeout
{"error": "Request to Confluence timed out"}
```
## Reference
- [Confluence Cloud API v2 Docs](https://developer.atlassian.com/cloud/confluence/rest/v2/intro/)
- [Get API Token](https://id.atlassian.com/manage/api-tokens)
- [CQL (Confluence Query Language)](https://developer.atlassian.com/cloud/confluence/advanced-searching-using-cql/)
- [Storage Format Reference](https://developer.atlassian.com/cloud/confluence/rest/v2/api-group-content/#content-storage-format)
@@ -0,0 +1,124 @@
# Docker Hub Tool
Search repositories, list tags, inspect images, manage webhooks, and delete tags via the Docker Hub API v2.
## Tools
| Tool | Description |
|------|-------------|
| `docker_hub_search` | Search Docker Hub for public repositories |
| `docker_hub_list_repos` | List repositories for a user or organization |
| `docker_hub_get_repo` | Get detailed info about a specific repository |
| `docker_hub_list_tags` | List tags for a repository |
| `docker_hub_get_tag_detail` | Get details for a specific image tag |
| `docker_hub_delete_tag` | Delete a tag from a repository |
| `docker_hub_list_webhooks` | List webhooks configured for a repository |
## Setup
Requires a Docker Hub Personal Access Token (PAT):
1. Go to [hub.docker.com](https://hub.docker.com) → **Account Settings → Security → New Access Token**
2. Give it a name and select the required permissions (Read, Write, Delete as needed)
3. Copy the token immediately — it is only shown once
```bash
DOCKER_HUB_TOKEN=your-personal-access-token
DOCKER_HUB_USERNAME=your-docker-hub-username
```
> `DOCKER_HUB_USERNAME` is used as the default namespace when listing repos. If it is unset and no `namespace` is passed to `docker_hub_list_repos`, the tool will return an error: `"namespace is required (or set DOCKER_HUB_USERNAME)"`.
## Usage Examples
### Search for public repositories
```python
docker_hub_search(query="nginx", max_results=10)
```
### List your own repositories
```python
docker_hub_list_repos(namespace="myusername", max_results=25)
```
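If `DOCKER_HUB_USERNAME` is set (see Setup), the namespace can be omitted:
```python
docker_hub_list_repos()  # falls back to DOCKER_HUB_USERNAME as the namespace
```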
### Get repository details
```python
docker_hub_get_repo(repository="library/nginx")
```
### List tags for a repository
```python
docker_hub_list_tags(repository="library/nginx", max_results=20)
```
### Get details for a specific tag
```python
docker_hub_get_tag_detail(
repository="library/nginx",
tag="latest",
)
```
### Delete a tag
```python
docker_hub_delete_tag(
repository="myusername/myapp",
tag="old-release-1.0",
)
```
### List webhooks for a repository
```python
docker_hub_list_webhooks(repository="myusername/myapp")
```
## Response Format
`docker_hub_list_tags` returns tags sorted by `last_updated` descending:
```python
{
"repository": "library/nginx",
"tags": [
{
"name": "latest",
"full_size": 68000000,
"last_updated": "2025-05-01T12:00:00Z",
"digest": "sha256:abc123...",
},
...
]
}
```
`docker_hub_get_tag_detail` includes per-architecture image info:
```python
{
"repository": "library/nginx",
"tag": "latest",
"full_size": 68000000,
"images": [
{"architecture": "amd64", "os": "linux", "size": 34000000, "digest": "sha256:..."},
{"architecture": "arm64", "os": "linux", "size": 32000000, "digest": "sha256:..."},
]
}
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "DOCKER_HUB_TOKEN not set", "help": "Create a PAT at https://hub.docker.com/settings/security"}
{"error": "Unauthorized. Check your DOCKER_HUB_TOKEN."}
{"error": "Not found"}
{"error": "Request to Docker Hub timed out"}
```
@@ -0,0 +1,64 @@
# DuckDuckGo Tool
Search the web, news, and images using DuckDuckGo. No API key required.
## Tools
| Tool | Description |
|------|-------------|
| `duckduckgo_search` | Search the web for pages and results |
| `duckduckgo_news` | Search for recent news articles |
| `duckduckgo_images` | Search for images |
## Setup
No credentials required. DuckDuckGo searches are free and unauthenticated.
## Usage Examples
### Web search
```python
duckduckgo_search(query="python async best practices", max_results=5)
```
### Search with optional parameters
```python
duckduckgo_search(
query="AI frameworks",
max_results=10,
region="us-en",
safesearch="moderate",
timelimit="m", # past month
)
```
### News search
```python
duckduckgo_news(query="AI agents 2026", max_results=10, region="us-en")
```
### Image search
```python
duckduckgo_images(query="neural network diagram", max_results=5, size="Large")
```
## Optional Parameters
| Parameter | Tools | Description |
|-----------|-------|-------------|
| `region` | All | Region code (default: `us-en`) |
| `safesearch` | search, images | `off`, `moderate`, `strict` (default: `moderate`) |
| `timelimit` | search, news | `d` (day), `w` (week), `m` (month), `y` (year) |
| `size` | images | `Small`, `Medium`, `Large`, `Wallpaper` |
## Error Handling
All tools return error dicts on failure:
```python
{"error": "Search failed: connection timeout"}
```
When no results are found, tools return a successful response with an empty list:
```python
{"query": "obscure search", "results": [], "count": 0}
```
@@ -0,0 +1,65 @@
# File System Toolkits
A collection of file system tools for reading, writing, searching, and executing commands within the agent workspace.
## Tools
| Tool | Description |
|------|-------------|
| `apply_diff` | Apply a unified diff to a file |
| `apply_patch` | Apply a patch file to modify source files |
| `hashline_edit` | Edit a file using hashline-addressed replacements |
| `replace_file_content` | Find and replace content in a file |
| `grep_search` | Search file contents using regex patterns |
| `list_dir` | List directory contents with metadata |
| `execute_command_tool` | Execute a shell command in the workspace |
| `save_data` | Save data to a file in the agent's data directory |
| `load_data` | Load data from a file in the data directory |
| `serve_file_to_user` | Serve a file to the user for download |
| `list_data_files` | List files in the agent's data directory |
| `append_data` | Append data to an existing file |
## Sub-modules
| Module | Description |
|--------|-------------|
| `apply_diff/` | Unified diff application |
| `apply_patch/` | Patch file application |
| `data_tools/` | Data persistence (save, load, append, list, serve) |
| `execute_command_tool/` | Shell command execution with sanitization |
| `grep_search/` | File content search (uses ripgrep if available) |
| `hashline_edit/` | Hashline-based file editing |
| `list_dir/` | Directory listing |
| `replace_file_content/` | Find-and-replace in files |
## Setup
No external credentials required. File operations are scoped to the agent's workspace directory.
## Security
- `command_sanitizer.py` validates and sanitizes shell commands before execution
- `security.py` provides path traversal protection
- All file operations are workspace-scoped
## Usage Examples
### Search for a pattern in files
```python
grep_search(pattern="def register_tools", path="tools/src/", include="*.py")
```
### List directory contents
```python
list_dir(path="core/framework/", workspace_id="ws1", agent_id="agent1", session_id="s1")
```
### Save data to a file
```python
save_data(filename="results.json", data='{"status": "complete"}', data_dir="/path/to/data")
```
### Execute a command
```python
execute_command_tool(command="python -m pytest tests/ -v")
```
@@ -0,0 +1,62 @@
# GitLab Tool
Manage GitLab projects, issues, and merge requests via the GitLab REST API v4.
## Tools
| Tool | Description |
|------|-------------|
| `gitlab_list_projects` | List projects with optional search and visibility filters |
| `gitlab_get_project` | Get details of a specific project |
| `gitlab_list_issues` | List issues with state, label, and assignee filters |
| `gitlab_get_issue` | Get details of a specific issue |
| `gitlab_create_issue` | Create a new issue in a project |
| `gitlab_update_issue` | Update an existing issue (title, description, state, labels, assignee) |
| `gitlab_list_merge_requests` | List merge requests with state and label filters |
| `gitlab_get_merge_request` | Get details of a specific merge request |
| `gitlab_create_merge_request_note` | Add a comment to a merge request |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `GITLAB_TOKEN` | GitLab personal access token |
| `GITLAB_URL` | GitLab instance URL (optional, defaults to `https://gitlab.com`) |
Get a token at: [GitLab Access Tokens](https://gitlab.com/-/user_settings/personal_access_tokens)
Required scopes: `api` (full API access) or `read_api` + `read_repository` for read-only.
## Usage Examples
### List your projects
```python
gitlab_list_projects(membership=True, per_page=10)
```
### Search for issues
```python
gitlab_list_issues(project_id="12345", state="opened", labels="bug")
```
### Create an issue
```python
gitlab_create_issue(project_id="12345", title="Fix login bug", description="Steps to reproduce...")
```
### Add a comment to a merge request
```python
gitlab_create_merge_request_note(project_id="12345", merge_request_iid=42, body="LGTM!")
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "GITLAB_TOKEN not set", "help": "Create a personal access token at https://gitlab.com/-/user_settings/personal_access_tokens"}
{"error": "Unauthorized. Check your GitLab token."}
{"error": "Forbidden. Insufficient permissions."}
{"error": "Request to GitLab timed out"}
```
@@ -0,0 +1,68 @@
# Google Search Console Tool
Analyze search performance, manage sitemaps, and inspect URLs using the Google Search Console API.
## Tools
| Tool | Description |
|------|-------------|
| `gsc_search_analytics` | Query search analytics data with dimension and date filters |
| `gsc_list_sites` | List all verified sites in the account |
| `gsc_list_sitemaps` | List sitemaps for a site |
| `gsc_inspect_url` | Inspect a URL's indexing status |
| `gsc_submit_sitemap` | Submit a sitemap URL for a site |
| `gsc_delete_sitemap` | Delete a submitted sitemap |
| `gsc_top_queries` | Get top search queries for a site |
| `gsc_top_pages` | Get top pages by clicks for a site |
## Setup
Requires Google OAuth2 via Aden:
1. Connect your Google account at [hive.adenhq.com](https://hive.adenhq.com)
2. The `GOOGLE_SEARCH_CONSOLE_TOKEN` is managed automatically by the Aden credential system
Or set manually:
| Variable | Description |
|----------|-------------|
| `GOOGLE_SEARCH_CONSOLE_TOKEN` | Google OAuth2 access token |
Required OAuth scopes: `https://www.googleapis.com/auth/webmasters.readonly` (read) or `https://www.googleapis.com/auth/webmasters` (read/write).
## Usage Examples
### Get top queries for the last 7 days
```python
gsc_top_queries(site_url="https://example.com", days=7, limit=20)
```
### Check a URL's index status
```python
gsc_inspect_url(site_url="https://example.com", inspection_url="https://example.com/page")
```
### Submit a sitemap
```python
gsc_submit_sitemap(site_url="https://example.com", sitemap_url="https://example.com/sitemap.xml")
```
### Query search analytics with filters
```python
gsc_search_analytics(
site_url="https://example.com",
start_date="2026-01-01",
end_date="2026-01-31",
dimensions=["query", "page"],
row_limit=50,
)
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "GOOGLE_SEARCH_CONSOLE_TOKEN not set", "help": "Set GOOGLE_SEARCH_CONSOLE_TOKEN or connect via hive.adenhq.com"}
{"error": "Unauthorized. Check your GOOGLE_SEARCH_CONSOLE_TOKEN."}
{"error": "Request timed out"}
```
@@ -0,0 +1,60 @@
# Greenhouse Tool
Manage jobs, candidates, applications, and offers using the Greenhouse Harvest API.
## Tools
| Tool | Description |
|------|-------------|
| `greenhouse_list_jobs` | List jobs with optional status and department filters |
| `greenhouse_get_job` | Get details of a specific job |
| `greenhouse_list_candidates` | List candidates with optional search and date filters |
| `greenhouse_get_candidate` | Get details of a specific candidate |
| `greenhouse_list_applications` | List applications with optional job and status filters |
| `greenhouse_get_application` | Get details of a specific application |
| `greenhouse_list_offers` | List offers with optional status filter |
| `greenhouse_add_candidate_note` | Add a note to a candidate's profile |
| `greenhouse_list_scorecards` | List scorecards for an application |
## Setup
Set the following environment variable:
| Variable | Description |
|----------|-------------|
| `GREENHOUSE_API_TOKEN` | Greenhouse Harvest API token |
Get a token at: Configure > Dev Center > API Credential Management in your Greenhouse account.
The token uses HTTP Basic Auth (token as username, empty password).
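For reference, the authorization header looks roughly like this (a sketch for illustration; the tool builds it for you):
```python
import base64
import os
token = os.environ["GREENHOUSE_API_TOKEN"]
# Basic auth with the token as username and an empty password: base64("TOKEN:")
auth_header = "Basic " + base64.b64encode(f"{token}:".encode()).decode()
```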
## Usage Examples
### List open jobs
```python
greenhouse_list_jobs(status="open", per_page=20)
```
### Search candidates
```python
greenhouse_list_candidates(search="jane@example.com", per_page=10)
```
### Get application details
```python
greenhouse_get_application(application_id=12345)
```
### Add a note to a candidate
```python
greenhouse_add_candidate_note(candidate_id=12345, body="Strong technical interview performance.")
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "GREENHOUSE_API_TOKEN not set", "help": "Get your API key from Greenhouse: Configure > Dev Center > API Credential Management"}
{"error": "Greenhouse API error (HTTP 404): Resource not found"}
{"error": "Request timed out"}
```
@@ -0,0 +1,73 @@
# HubSpot Tool
Manage contacts, companies, deals, and associations using the HubSpot CRM API v3/v4.
## Tools
| Tool | Description |
|------|-------------|
| `hubspot_search_contacts` | Search contacts by name, email, phone |
| `hubspot_get_contact` | Get a contact by ID |
| `hubspot_create_contact` | Create a new contact |
| `hubspot_update_contact` | Update a contact's properties |
| `hubspot_search_companies` | Search companies by name, domain |
| `hubspot_get_company` | Get a company by ID |
| `hubspot_create_company` | Create a new company |
| `hubspot_update_company` | Update a company's properties |
| `hubspot_search_deals` | Search deals by name |
| `hubspot_get_deal` | Get a deal by ID |
| `hubspot_create_deal` | Create a new deal |
| `hubspot_update_deal` | Update a deal's properties |
| `hubspot_delete_object` | Delete (archive) a contact, company, or deal |
| `hubspot_list_associations` | List associations between CRM objects |
| `hubspot_create_association` | Create an association between two objects |
## Setup
Set the following environment variable or use Aden OAuth:
| Variable | Description |
|----------|-------------|
| `HUBSPOT_ACCESS_TOKEN` | HubSpot private app access token |
Get a token at: [HubSpot Developer Portal](https://developers.hubspot.com/docs/api/creating-an-app)
Supports multi-account via `account` parameter for Aden OAuth users.
## Usage Examples
### Search contacts
```python
hubspot_search_contacts(query="jane@example.com", properties=["email", "firstname", "lastname"])
```
### Create a deal
```python
hubspot_create_deal(properties={"dealname": "New Partnership", "amount": "50000"})
```
### Link a contact to a company
```python
hubspot_create_association(
from_object_type="contacts",
from_object_id="101",
to_object_type="companies",
to_object_id="202",
association_type_id=1,
)
```
### Delete a contact
```python
hubspot_delete_object(object_type="contacts", object_id="101")
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "HubSpot credentials not configured", "help": "Set HUBSPOT_ACCESS_TOKEN or configure via credential store"}
{"error": "Invalid or expired HubSpot access token"}
{"error": "HubSpot rate limit exceeded. Try again later."}
{"error": "Request timed out"}
```
@@ -0,0 +1,101 @@
# HuggingFace Tool
Discover models, datasets, and spaces on HuggingFace Hub, run model inference, generate embeddings, and manage inference endpoints.
## Tools
| Tool | Description |
|------|-------------|
| `huggingface_search_models` | Search for models by query, author, or popularity |
| `huggingface_get_model` | Get details about a specific model |
| `huggingface_search_datasets` | Search for datasets by query or author |
| `huggingface_get_dataset` | Get details about a specific dataset |
| `huggingface_search_spaces` | Search for Spaces by query or author |
| `huggingface_whoami` | Get info about the authenticated HuggingFace user |
| `huggingface_run_inference` | Run inference on any model via the Inference API |
| `huggingface_run_embedding` | Generate text embeddings via the Inference API |
| `huggingface_list_inference_endpoints` | List deployed Inference Endpoints |
## Setup
Requires a HuggingFace API token:
```bash
export HUGGINGFACE_TOKEN="hf_your_token_here"
```
> Get your token at https://huggingface.co/settings/tokens
## Usage Examples
### Search for models
```python
huggingface_search_models(query="llama", sort="downloads", limit=10)
```
### Get model details
```python
huggingface_get_model(model_id="meta-llama/Llama-3-8B")
```
### Search for datasets
```python
huggingface_search_datasets(query="squad", author="rajpurkar", limit=5)
```
### Get dataset details
```python
huggingface_get_dataset(dataset_id="openai/gsm8k")
```
### Search for Spaces
```python
huggingface_search_spaces(query="stable diffusion", sort="likes", limit=10)
```
### Get authenticated user info
```python
huggingface_whoami()
```
### Run inference
```python
huggingface_run_inference(
model_id="facebook/bart-large-cnn",
inputs="HuggingFace is a company that builds NLP tools and hosts models...",
parameters='{"max_new_tokens": 128}'
)
```
### Generate embeddings
```python
huggingface_run_embedding(
model_id="sentence-transformers/all-MiniLM-L6-v2",
inputs="The quick brown fox jumps over the lazy dog"
)
```
### List inference endpoints
```python
huggingface_list_inference_endpoints(namespace="my-org")
```
## Sort Options
| Value | Description |
|-------|-------------|
| `downloads` | Sort by download count (default for models/datasets) |
| `likes` | Sort by likes (default for spaces) |
| `lastModified` | Sort by last modified date |
## Error Handling
All tools return error dicts on failure:
```python
{"error": "HUGGINGFACE_TOKEN not set", "help": "Get a token at https://huggingface.co/settings/tokens"}
{"error": "Unauthorized. Check your HUGGINGFACE_TOKEN."}
{"error": "Model is loading", "estimated_time": 20, "help": "The model is being loaded. Retry after the estimated time."}
{"error": "Inference request timed out. Try a smaller input or a faster model."}
{"error": "Model not found: <url>"}
```
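The `Model is loading` case is retryable; a minimal sketch using the fields from the error dict above:
```python
import time
result = huggingface_run_inference(
    model_id="facebook/bart-large-cnn",
    inputs="Summarize this article...",
)
# Retry while the Inference API reports the model is still warming up.
while isinstance(result, dict) and result.get("error") == "Model is loading":
    time.sleep(result.get("estimated_time", 20))
    result = huggingface_run_inference(
        model_id="facebook/bart-large-cnn",
        inputs="Summarize this article...",
    )
```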
@@ -0,0 +1,127 @@
# Jira Tool
Search, create, update, and transition Jira issues and projects via the Jira Cloud REST API v3.
## Tools
| Tool | Description |
|------|-------------|
| `jira_search_issues` | Search issues using JQL |
| `jira_get_issue` | Get full details of a specific issue |
| `jira_create_issue` | Create a new issue in a project |
| `jira_update_issue` | Update fields on an existing issue |
| `jira_list_transitions` | List available status transitions for an issue |
| `jira_transition_issue` | Move an issue to a new status |
| `jira_add_comment` | Add a comment to an issue |
| `jira_list_projects` | List all projects in the workspace |
| `jira_get_project` | Get details about a specific project |
## Setup
Requires Jira Cloud credentials:
```bash
export JIRA_DOMAIN="your-org.atlassian.net"
export JIRA_EMAIL="you@example.com"
export JIRA_API_TOKEN="your_api_token"
```
> Create an API token at https://id.atlassian.com/manage/api-tokens
## Usage Examples
### Search issues with JQL
```python
jira_search_issues(
jql="project = PROJ AND status = 'In Progress'",
max_results=25
)
```
### Get issue details
```python
jira_get_issue(issue_key="PROJ-123")
```
### Create a new issue
```python
jira_create_issue(
project_key="PROJ",
summary="Fix login bug",
issue_type="Bug",
description="Users cannot log in with SSO.",
priority="High",
labels="auth,sso"
)
```
### Update an issue
```python
jira_update_issue(
issue_key="PROJ-123",
summary="Updated title",
priority="Medium"
)
```
### Transition an issue to a new status
```python
# Step 1: find available transitions
jira_list_transitions(issue_key="PROJ-123")
# Step 2: apply the transition
jira_transition_issue(
issue_key="PROJ-123",
transition_id="31",
comment="Moving to done after review."
)
```
### Add a comment
```python
jira_add_comment(
issue_key="PROJ-123",
body="This has been fixed in the latest deploy."
)
```
### List all projects
```python
jira_list_projects(max_results=50, query="backend")
```
### Get project details
```python
jira_get_project(project_key="PROJ")
```
## Issue Types
| Type | Description |
|------|-------------|
| `Task` | Standard work item (default) |
| `Bug` | Defect or problem |
| `Story` | User story |
| `Epic` | Large body of work |
## Priority Levels
| Priority | Description |
|----------|-------------|
| `Highest` | Critical |
| `High` | Important |
| `Medium` | Normal |
| `Low` | Minor |
| `Lowest` | Trivial |
## Error Handling
All tools return error dicts on failure:
```python
{"error": "JIRA_DOMAIN, JIRA_EMAIL, and JIRA_API_TOKEN not set", "help": "Create an API token at https://id.atlassian.com/manage/api-tokens"}
{"error": "Unauthorized. Check your Jira credentials."}
{"error": "Forbidden. Check your Jira permissions."}
{"error": "Rate limited. Try again shortly."}
{"error": "Not found."}
```
@@ -0,0 +1,58 @@
# Kafka Tool
Manage Apache Kafka topics, produce messages, and monitor consumer groups via the Confluent Kafka REST API.
## Tools
| Tool | Description |
|------|-------------|
| `kafka_list_topics` | List all topics in the Kafka cluster |
| `kafka_get_topic` | Get details and configuration of a topic |
| `kafka_create_topic` | Create a new topic with partition and replication settings |
| `kafka_produce_message` | Produce a message to a topic |
| `kafka_list_consumer_groups` | List all consumer groups |
| `kafka_get_consumer_group_lag` | Get consumer lag for a group |
## Setup
Set the following environment variables:
| Variable | Required | Description |
|----------|----------|-------------|
| `KAFKA_REST_URL` | Yes | Confluent Kafka REST Proxy URL |
| `KAFKA_CLUSTER_ID` | Yes | Kafka cluster ID |
| `KAFKA_API_KEY` | No | API key (for authenticated clusters) |
| `KAFKA_API_SECRET` | No | API secret (for authenticated clusters) |
Get credentials at: [Confluent Cloud](https://confluent.cloud/)
## Usage Examples
### List topics
```python
kafka_list_topics()
```
### Create a topic
```python
kafka_create_topic(topic_name="events", partitions_count=3, replication_factor=3)
```
### Produce a message
```python
kafka_produce_message(topic_name="events", key="user-123", value='{"action": "login"}')
```
### Check consumer lag
```python
kafka_get_consumer_group_lag(consumer_group_id="my-consumer-group")
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "KAFKA_REST_URL is required", "help": "Set KAFKA_REST_URL environment variable"}
{"error": "KAFKA_CLUSTER_ID is required", "help": "Set KAFKA_CLUSTER_ID environment variable"}
{"error": "Request timed out"}
```
@@ -0,0 +1,102 @@
# Langfuse Tool
LLM observability for tracing, scoring, and prompt management using Langfuse.
## Tools
| Tool | Description |
|------|-------------|
| `langfuse_list_traces` | List traces with optional filters |
| `langfuse_get_trace` | Get full details of a specific trace |
| `langfuse_list_scores` | List scores with optional filters |
| `langfuse_create_score` | Create a score for a trace or observation |
| `langfuse_list_prompts` | List prompts from prompt management |
| `langfuse_get_prompt` | Get a specific prompt by name and version |
## Setup
Requires Langfuse public and secret key pair:
```bash
export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
# Optional: defaults to US cloud
export LANGFUSE_HOST="https://cloud.langfuse.com"
# EU cloud:
# export LANGFUSE_HOST="https://eu.cloud.langfuse.com"
# Self-hosted:
# export LANGFUSE_HOST="https://your-self-hosted-langfuse.com"
```
> Get your keys from https://cloud.langfuse.com/project/<id>/settings
## Usage Examples
### List recent traces
```python
langfuse_list_traces(user_id="user_123", limit=20)
```
### Get full trace details
```python
langfuse_get_trace(trace_id="trace_abc123")
```
### List scores for a trace
```python
langfuse_list_scores(trace_id="trace_abc123")
```
### Create a score
```python
langfuse_create_score(
trace_id="trace_abc123",
name="correctness",
value=0.95,
data_type="NUMERIC",
comment="Output matches expected format perfectly"
)
```
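### Create a boolean score
Boolean scores are encoded numerically, `1.0` for pass and `0.0` for fail (see Score Data Types below):
```python
langfuse_create_score(
    trace_id="trace_abc123",
    name="contains_citation",
    value=1.0,  # 1.0 = pass, 0.0 = fail
    data_type="BOOLEAN",
)
```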
### List production prompts
```python
langfuse_list_prompts(label="production")
```
### Get a specific prompt version
```python
langfuse_get_prompt(
prompt_name="customer-support-agent",
label="production"
)
```
## Score Data Types
| Type | Description | Example Value |
|------|-------------|---------------|
| `NUMERIC` | Continuous numeric score | `0.95`, `85.0` |
| `CATEGORICAL` | Category label | `"good"`, `"bad"` |
| `BOOLEAN` | Binary pass/fail | `1.0` (pass), `0.0` (fail) |
## Score Sources
| Source | Description |
|--------|-------------|
| `API` | Score created via API |
| `ANNOTATION` | Human annotation via Langfuse UI |
| `EVAL` | Automated evaluation job |
## Error Handling
All tools return error dicts on failure:
```python
{"error": "Langfuse credentials not configured", "help": "Set LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY environment variables or configure via credential store"}
{"error": "Invalid Langfuse API keys"}
{"error": "Insufficient permissions for this Langfuse resource"}
{"error": "Langfuse resource not found"}
{"error": "Langfuse rate limit exceeded. Try again later."}
{"error": "Request timed out"}
```
@@ -0,0 +1,151 @@
# Linear Tool
Manage issues, projects, teams, labels, cycles, and users via the Linear GraphQL API.
## Tools
| Tool | Description |
|------|-------------|
| `linear_issue_create` | Create a new issue |
| `linear_issue_get` | Get issue details by ID or identifier |
| `linear_issue_update` | Update an existing issue |
| `linear_issue_delete` | Delete an issue |
| `linear_issue_search` | Search issues with filters |
| `linear_issue_add_comment` | Add a comment to an issue |
| `linear_issue_comments_list` | List comments on an issue |
| `linear_issue_relation_create` | Create a relation between two issues |
| `linear_project_create` | Create a new project |
| `linear_project_get` | Get project details |
| `linear_project_update` | Update a project |
| `linear_project_list` | List projects with optional filters |
| `linear_teams_list` | List all teams in the workspace |
| `linear_team_get` | Get team details including states and members |
| `linear_workflow_states_get` | Get workflow states for a team |
| `linear_label_create` | Create a new label for a team |
| `linear_labels_list` | List all labels |
| `linear_users_list` | List all users in the workspace |
| `linear_user_get` | Get user details and assigned issues |
| `linear_viewer` | Get details about the authenticated user |
| `linear_cycles_list` | List cycles (sprints) for a team |
## Setup
Requires a Linear personal API key:
```bash
export LINEAR_API_KEY="lin_api_your_api_key"
```
> Get your API key at https://linear.app/settings/api
## Usage Examples
### Create an issue
```python
linear_issue_create(
title="Fix login bug",
team_id="TEAM_UUID",
description="Users cannot log in with SSO.",
priority=1
)
```
### Get an issue
```python
linear_issue_get(issue_id="ENG-123")
```
### Search issues
```python
linear_issue_search(
query="login bug",
team_id="TEAM_UUID",
limit=20
)
```
### Update an issue
```python
linear_issue_update(
issue_id="ENG-123",
state_id="STATE_UUID",
priority=2
)
```
### Add a comment
```python
linear_issue_add_comment(
issue_id="ENG-123",
body="Fixed in PR #456. Ready for review."
)
```
### Create a relation between issues
```python
linear_issue_relation_create(
issue_id="ENG-123",
related_issue_id="ENG-456",
relation_type="blocks"
)
```
### List teams
```python
linear_teams_list()
```
### Get workflow states for a team
```python
linear_workflow_states_get(team_id="TEAM_UUID")
```
### List cycles (sprints)
```python
linear_cycles_list(team_id="TEAM_UUID", limit=10)
```
### Get authenticated user
```python
linear_viewer()
```
## Priority Levels
| Value | Description |
|-------|-------------|
| `0` | No priority |
| `1` | Urgent |
| `2` | High |
| `3` | Medium |
| `4` | Low |
## Project States
| State | Description |
|-------|-------------|
| `planned` | Not yet started |
| `started` | In progress |
| `paused` | On hold |
| `completed` | Done |
| `canceled` | Canceled |
## Issue Relation Types
| Type | Description |
|------|-------------|
| `related` | Generally related (default) |
| `blocks` | This issue blocks the other |
| `duplicate` | Duplicate of the other issue |
## Error Handling
All tools return error dicts on failure:
```python
{"error": "Linear credentials not configured", "help": "Set LINEAR_API_KEY environment variable or configure via credential store. Get an API key at https://linear.app/settings/api"}
{"error": "Invalid or expired Linear API key"}
{"error": "Insufficient permissions. Check your Linear API key scopes."}
{"error": "Linear rate limit exceeded. Try again later."}
{"error": "Request timed out"}
```
@@ -0,0 +1,83 @@
# Microsoft Graph Tool
Access Outlook mail, Microsoft Teams, and OneDrive files via the Microsoft Graph API v1.0.
## Tools
### Outlook Mail
| Tool | Description |
|------|-------------|
| `outlook_list_messages` | List emails with optional folder and search filters |
| `outlook_get_message` | Get details of a specific email |
| `outlook_send_mail` | Send an email |
### Microsoft Teams
| Tool | Description |
|------|-------------|
| `teams_list_teams` | List teams the user belongs to |
| `teams_list_channels` | List channels in a team |
| `teams_send_channel_message` | Send a message to a team channel |
| `teams_get_channel_messages` | Get recent messages from a channel |
### OneDrive
| Tool | Description |
|------|-------------|
| `onedrive_search_files` | Search for files across OneDrive |
| `onedrive_list_files` | List files in a folder |
| `onedrive_download_file` | Download a file's content |
| `onedrive_upload_file` | Upload a small file to OneDrive (up to 4MB) |
## Setup
Set the following environment variable or use Aden OAuth:
| Variable | Description |
|----------|-------------|
| `MICROSOFT_GRAPH_ACCESS_TOKEN` | Microsoft Graph API access token |
Get credentials at: [Azure App Registrations](https://portal.azure.com/#view/Microsoft_AAD_RegisteredApps)
Required permissions: `Mail.Read`, `Mail.Send`, `Team.ReadBasic.All`, `Channel.ReadBasic.All`, `ChannelMessage.Send`, `ChannelMessage.Read.All`, `Files.ReadWrite`
## Usage Examples
### List unread emails
```python
outlook_list_messages(folder="inbox", search="is:unread", top=10)
```
### Send an email
```python
outlook_send_mail(
to=["jane@example.com"],
subject="Meeting Notes",
body="Here are the notes from today's meeting.",
)
```
### List Teams channels
```python
teams_list_channels(team_id="team-abc-123")
```
### Search OneDrive files
```python
onedrive_search_files(query="quarterly report", top=5)
```
### Upload a file to OneDrive
```python
onedrive_upload_file(file_path="Documents/notes.txt", content="Meeting notes here")
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "MICROSOFT_GRAPH_ACCESS_TOKEN not set", "help": "Set MICROSOFT_GRAPH_ACCESS_TOKEN or connect via hive.adenhq.com"}
{"error": "Microsoft Graph API error (HTTP 403): Insufficient privileges"}
{"error": "Request timed out"}
```
@@ -0,0 +1,100 @@
# MongoDB Tool
Perform document CRUD and aggregation on MongoDB collections via the Atlas Data API (or compatible replacements like Delbridge and RESTHeart).
## Tools
| Tool | Description |
|------|-------------|
| `mongodb_find` | Find multiple documents matching a filter |
| `mongodb_find_one` | Find a single document matching a filter |
| `mongodb_insert_one` | Insert a single document into a collection |
| `mongodb_update_one` | Update a single document matching a filter |
| `mongodb_delete_one` | Delete a single document matching a filter |
| `mongodb_aggregate` | Run an aggregation pipeline on a collection |
## Setup
Requires MongoDB Atlas Data API credentials:
```bash
export MONGODB_DATA_API_URL="https://data.mongodb-api.com/app/<app-id>/endpoint/data/v1"
export MONGODB_API_KEY="your_api_key"
export MONGODB_DATA_SOURCE="your_cluster_name" # e.g. "Cluster0"
```
> Enable the Data API and get credentials from https://cloud.mongodb.com under **App Services → Data API**
> **Note:** The Atlas Data API reached EOL in September 2025. Compatible replacements like [Delbridge](https://github.com/stdatlas/delbridge) and [RESTHeart](https://restheart.org/) use the same interface.
## Usage Examples
### Find documents
```python
mongodb_find(
database="mydb",
collection="users",
filter='{"status": "active"}',
sort='{"created": -1}',
limit=10
)
```
### Find a single document
```python
mongodb_find_one(
database="mydb",
collection="users",
filter='{"email": "alice@example.com"}',
projection='{"name": 1, "email": 1, "_id": 0}'
)
```
### Insert a document
```python
mongodb_insert_one(
database="mydb",
collection="users",
document='{"name": "Alice", "email": "alice@example.com", "status": "active"}'
)
```
### Update a document
```python
mongodb_update_one(
database="mydb",
collection="users",
filter='{"email": "alice@example.com"}',
update='{"$set": {"status": "inactive"}}',
upsert=False
)
```
### Delete a document
```python
mongodb_delete_one(
database="mydb",
collection="users",
filter='{"email": "alice@example.com"}'
)
```
### Run an aggregation pipeline
```python
mongodb_aggregate(
database="mydb",
collection="orders",
pipeline='[{"$match": {"status": "completed"}}, {"$group": {"_id": "$userId", "total": {"$sum": "$amount"}}}]'
)
```
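### Build filters from dynamic values
`filter`, `update`, and `pipeline` are JSON strings; when values come from variables, build them with `json.dumps` instead of hand-formatting (a sketch):
```python
import json
email = "alice@example.com"
mongodb_find_one(
    database="mydb",
    collection="users",
    filter=json.dumps({"email": email}),  # guards against invalid-JSON errors
)
```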
## Error Handling
All tools return error dicts on failure:
```python
{"error": "MONGODB_DATA_API_URL and MONGODB_API_KEY are required", "help": "Set MONGODB_DATA_API_URL and MONGODB_API_KEY environment variables"}
{"error": "HTTP 401: ..."}
{"error": "filter must be valid JSON"}
{"error": "no document found matching filter"}
```
@@ -0,0 +1,56 @@
# n8n Tool
Manage n8n workflows and executions via the n8n REST API.
## Tools
| Tool | Description |
|------|-------------|
| `n8n_list_workflows` | List workflows with optional status and tag filters |
| `n8n_get_workflow` | Get details of a specific workflow |
| `n8n_activate_workflow` | Activate a workflow |
| `n8n_deactivate_workflow` | Deactivate a workflow |
| `n8n_list_executions` | List workflow executions with optional status filter |
| `n8n_get_execution` | Get details of a specific execution |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `N8N_API_KEY` | n8n API key |
| `N8N_BASE_URL` | n8n instance URL (e.g., `https://your-n8n.example.com`) |
Get an API key at: Settings → API → Create API Key in your n8n instance.
## Usage Examples
### List active workflows
```python
n8n_list_workflows(active="true")
```
### Get workflow details
```python
n8n_get_workflow(workflow_id="123")
```
### Activate a workflow
```python
n8n_activate_workflow(workflow_id="123")
```
### List recent executions
```python
n8n_list_executions(status="success", limit=10)
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "n8n credentials not configured", "help": "Set N8N_API_KEY and N8N_BASE_URL environment variables or configure via credential store"}
{"error": "n8n API error (HTTP 404): Workflow not found"}
{"error": "Request timed out"}
```
@@ -0,0 +1,54 @@
# Obsidian Tool
Read, write, search, and manage notes in an Obsidian vault via the Obsidian Local REST API.
## Tools
| Tool | Description |
|------|-------------|
| `obsidian_read_note` | Read the content of a note by path |
| `obsidian_write_note` | Create or overwrite a note |
| `obsidian_append_note` | Append content to an existing note |
| `obsidian_search` | Search notes by text or regex |
| `obsidian_list_files` | List files and folders in a vault path |
| `obsidian_get_active` | Get the currently active note |
## Setup
Requires the [Obsidian Local REST API](https://github.com/coddingtonbear/obsidian-local-rest-api) plugin.
| Variable | Description |
|----------|-------------|
| `OBSIDIAN_REST_API_KEY` | API key from the Local REST API plugin |
| `OBSIDIAN_REST_BASE_URL` | REST API URL (default: `https://127.0.0.1:27124`) |
## Usage Examples
### Read a note
```python
obsidian_read_note(path="Projects/hive-contributions.md")
```
### Write a note
```python
obsidian_write_note(path="Daily/2026-03-30.md", content="# Today\n\n- Submitted PR")
```
### Search the vault
```python
obsidian_search(query="event bus tests", context_length=100)
```
### List files in a folder
```python
obsidian_list_files(path="Projects/")
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "OBSIDIAN_REST_API_KEY not set", "help": "Set OBSIDIAN_REST_API_KEY environment variable or configure via credential store"}
{"error": "Obsidian API error (HTTP 404): Note not found"}
{"error": "Request timed out"}
```
@@ -0,0 +1,62 @@
# PagerDuty Tool
Manage incidents, services, on-calls, and escalation policies via the PagerDuty REST API v2.
## Tools
| Tool | Description |
|------|-------------|
| `pagerduty_list_incidents` | List incidents with status, urgency, and date filters |
| `pagerduty_get_incident` | Get details of a specific incident |
| `pagerduty_create_incident` | Create a new incident |
| `pagerduty_update_incident` | Update an incident's status or assignment |
| `pagerduty_list_services` | List services with optional name filter |
| `pagerduty_list_oncalls` | List current on-call schedules |
| `pagerduty_add_incident_note` | Add a note to an incident |
| `pagerduty_list_escalation_policies` | List escalation policies |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `PAGERDUTY_API_KEY` | PagerDuty REST API token |
| `PAGERDUTY_FROM_EMAIL` | Email address for write operations (used in `From` header) |
Get a token at: [PagerDuty API Access Keys](https://support.pagerduty.com/docs/api-access-keys)
## Usage Examples
### List triggered incidents
```python
pagerduty_list_incidents(statuses=["triggered", "acknowledged"], limit=10)
```
### Create an incident
```python
pagerduty_create_incident(
title="Database connection pool exhausted",
service_id="P1234AB",
urgency="high",
)
```
### Acknowledge an incident
```python
pagerduty_update_incident(incident_id="P5678CD", status="acknowledged")
```
### Check who's on call
```python
pagerduty_list_oncalls()
```
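### Add a note to an incident
A sketch based on the tools table above; the note body parameter name (`content`) is an assumption:
```python
pagerduty_add_incident_note(incident_id="P5678CD", content="Mitigated by failing over to the replica.")
```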
## Error Handling
All tools return error dicts on failure:
```python
{"error": "PAGERDUTY_API_KEY is required", "help": "Set PAGERDUTY_API_KEY environment variable"}
{"error": "PagerDuty API error (HTTP 404): Incident not found"}
{"error": "Request timed out"}
```
@@ -0,0 +1,127 @@
# Pinecone Tool
Manage Pinecone vector indexes and perform vector operations for semantic search and RAG workflows.
## Tools
| Tool | Description |
|------|-------------|
| `pinecone_list_indexes` | List all indexes in your Pinecone project |
| `pinecone_create_index` | Create a new serverless index |
| `pinecone_describe_index` | Get configuration and status of a specific index |
| `pinecone_delete_index` | Delete an index (irreversible) |
| `pinecone_upsert_vectors` | Insert or update vectors in an index |
| `pinecone_query_vectors` | Query an index for similar vectors |
| `pinecone_fetch_vectors` | Fetch specific vectors by ID |
| `pinecone_delete_vectors` | Delete vectors by ID, filter, or entire namespace |
| `pinecone_index_stats` | Get vector counts and namespace statistics for an index |
## Setup
Requires a Pinecone API key:
```bash
export PINECONE_API_KEY="your_api_key_here"
```
> Get your API key at https://app.pinecone.io/ under **API Keys**
## Usage Examples
### List all indexes
```python
pinecone_list_indexes()
```
### Create a new index
```python
pinecone_create_index(
name="my-index",
dimension=1536,
metric="cosine",
cloud="aws",
region="us-east-1"
)
```
### Describe an index
```python
pinecone_describe_index(index_name="my-index")
```
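Vector operations take an `index_host` rather than the index name. Assuming the describe response includes the index's `host`, a typical flow chains the two (the field name is an assumption):
```python
info = pinecone_describe_index(index_name="my-index")
index_host = "https://" + info["host"]  # "host" field name assumed
pinecone_index_stats(index_host=index_host)
```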
### Upsert vectors
```python
pinecone_upsert_vectors(
index_host="https://my-index-abc123.svc.pinecone.io",
vectors=[
{"id": "vec1", "values": [0.1, 0.2, 0.3], "metadata": {"source": "doc1"}},
{"id": "vec2", "values": [0.4, 0.5, 0.6], "metadata": {"source": "doc2"}},
],
namespace="my-namespace"
)
```
### Query for similar vectors
```python
pinecone_query_vectors(
index_host="https://my-index-abc123.svc.pinecone.io",
vector=[0.1, 0.2, 0.3],
top_k=5,
filter={"source": {"$eq": "doc1"}},
include_metadata=True
)
```
### Fetch vectors by ID
```python
pinecone_fetch_vectors(
index_host="https://my-index-abc123.svc.pinecone.io",
ids=["vec1", "vec2"],
namespace="my-namespace"
)
```
### Delete vectors
```python
# By ID
pinecone_delete_vectors(
index_host="https://my-index-abc123.svc.pinecone.io",
ids=["vec1", "vec2"]
)
# All vectors in a namespace
pinecone_delete_vectors(
index_host="https://my-index-abc123.svc.pinecone.io",
namespace="my-namespace",
delete_all=True
)
```
### Get index stats
```python
pinecone_index_stats(index_host="https://my-index-abc123.svc.pinecone.io")
```
### Delete an index
```python
pinecone_delete_index(index_name="my-index")
```
## Distance Metrics
| Metric | Description |
|--------|-------------|
| `cosine` | Cosine similarity (default, recommended for text embeddings) |
| `euclidean` | Euclidean distance |
| `dotproduct` | Dot product (for normalized vectors) |
## Error Handling
All tools return error dicts on failure:
```python
{"error": "PINECONE_API_KEY not set", "help": "Get an API key at https://app.pinecone.io/ under API Keys"}
{"error": "Unauthorized. Check your PINECONE_API_KEY."}
{"error": "Pinecone API error 400: ..."}
{"error": "Request to Pinecone timed out"}
```
@@ -0,0 +1,72 @@
# Pipedrive Tool
Manage deals, contacts, organizations, activities, and pipelines using the Pipedrive CRM API.
## Tools
| Tool | Description |
|------|-------------|
| `pipedrive_list_deals` | List deals with status, stage, and sort filters |
| `pipedrive_get_deal` | Get details of a specific deal |
| `pipedrive_create_deal` | Create a new deal |
| `pipedrive_update_deal` | Update a deal's properties |
| `pipedrive_list_persons` | List contacts with optional search |
| `pipedrive_search_persons` | Search contacts by name or email |
| `pipedrive_create_person` | Create a new contact |
| `pipedrive_list_organizations` | List organizations |
| `pipedrive_list_activities` | List activities with type and date filters |
| `pipedrive_create_activity` | Create a new activity |
| `pipedrive_list_pipelines` | List all sales pipelines |
| `pipedrive_list_stages` | List stages in a pipeline |
| `pipedrive_add_note` | Add a note to a deal, person, or organization |
## Setup
Set the following environment variable:
| Variable | Description |
|----------|-------------|
| `PIPEDRIVE_API_TOKEN` | Pipedrive API token |
Get a token at: Settings > Personal preferences > API in your Pipedrive account.
## Usage Examples
### List open deals
```python
pipedrive_list_deals(status="open", sort="update_time DESC", limit=20)
```
### Search for a contact
```python
pipedrive_search_persons(term="jane@example.com")
```
### Create a deal
```python
pipedrive_create_deal(title="Enterprise License", value=50000, currency="USD")
```
### Create a contact
```python
pipedrive_create_person(name="Jane Doe", email="jane@example.com")
```
### Create an activity
```python
pipedrive_create_activity(subject="Follow-up call", activity_type="call", due_date="2026-04-15")
```
### Add a note to a deal
```python
pipedrive_add_note(content="Follow up scheduled for next week.", deal_id=12345)
```
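### Move a deal to a different stage
Stage moves usually follow a `pipedrive_list_stages` lookup. The `pipeline_id` and `stage_id` parameter names below are assumptions, since only the tool names are documented:
```python
# Look up stage IDs in a pipeline (pipeline_id=1 is illustrative)
pipedrive_list_stages(pipeline_id=1)

# Move the deal; passing stage_id as an update field is an assumption
pipedrive_update_deal(deal_id=12345, stage_id=3)
```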
## Error Handling
All tools return error dicts on failure:
```python
{"error": "PIPEDRIVE_API_TOKEN not set", "help": "Get your API token from Pipedrive Settings > Personal preferences > API"}
{"error": "Pipedrive API error (HTTP 404): Deal not found"}
{"error": "Request timed out"}
```
@@ -0,0 +1,57 @@
# Plaid Tool
Access bank accounts, balances, and transactions using the Plaid API.
## Tools
| Tool | Description |
|------|-------------|
| `plaid_get_accounts` | List linked bank accounts |
| `plaid_get_balance` | Get real-time account balances |
| `plaid_sync_transactions` | Sync new transactions incrementally |
| `plaid_get_transactions` | Get transactions with date range and filters |
| `plaid_get_institution` | Get details about a financial institution |
| `plaid_search_institutions` | Search for institutions by name |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `PLAID_CLIENT_ID` | Plaid client ID |
| `PLAID_SECRET` | Plaid secret key |
| `PLAID_ENV` | Environment: `sandbox`, `development`, or `production` (default: `sandbox`) |
Get credentials at: [Plaid Dashboard](https://dashboard.plaid.com/developers/keys)
## Usage Examples
### Get account balances
```python
plaid_get_balance(access_token="access-sandbox-abc123")
```
### Get recent transactions
```python
plaid_get_transactions(
access_token="access-sandbox-abc123",
start_date="2026-01-01",
end_date="2026-01-31",
count=50,
)
```
### Search for a bank
```python
plaid_search_institutions(query="Chase", count=5)
```
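### Sync transactions incrementally
`plaid_sync_transactions` is cursor-based: the first call returns the full history plus a cursor, and later calls replay only changes since then. The `cursor` parameter and `next_cursor` response key below are assumptions modeled on Plaid's `/transactions/sync` endpoint:
```python
# First sync: omitting the cursor fetches the full history
page = plaid_sync_transactions(access_token="access-sandbox-abc123")

# Later syncs: pass the cursor from the previous call (assumed key name)
page = plaid_sync_transactions(
    access_token="access-sandbox-abc123",
    cursor=page["next_cursor"],
)
```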
## Error Handling
All tools return error dicts on failure:
```python
{"error": "PLAID_CLIENT_ID and PLAID_SECRET not set", "help": "Get credentials at https://dashboard.plaid.com/developers/keys"}
{"error": "Plaid API error: INVALID_ACCESS_TOKEN"}
{"error": "Request timed out"}
```
@@ -0,0 +1,56 @@
# Power BI Tool
Manage Power BI workspaces, datasets, and reports via the Power BI REST API.
## Tools
| Tool | Description |
|------|-------------|
| `powerbi_list_workspaces` | List workspaces with optional name filter |
| `powerbi_list_datasets` | List datasets in a workspace |
| `powerbi_list_reports` | List reports in a workspace |
| `powerbi_refresh_dataset` | Trigger a dataset refresh |
| `powerbi_get_refresh_history` | Get refresh history for a dataset |
## Setup
Set the following environment variable:
| Variable | Description |
|----------|-------------|
| `POWERBI_ACCESS_TOKEN` | Power BI REST API bearer token |
Get a token via Azure AD: [Power BI REST API](https://learn.microsoft.com/en-us/power-bi/developer/embedded/register-app)
Required permissions: `Dataset.ReadWrite.All`, `Workspace.Read.All`
## Usage Examples
### List workspaces
```python
powerbi_list_workspaces()
```
### List datasets in a workspace
```python
powerbi_list_datasets(workspace_id="abc-123")
```
### Trigger a dataset refresh
```python
powerbi_refresh_dataset(workspace_id="abc-123", dataset_id="def-456")
```
### Check refresh history
```python
powerbi_get_refresh_history(workspace_id="abc-123", dataset_id="def-456")
```
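### Wait for a refresh to complete
Dataset refreshes are asynchronous, so a caller typically triggers one and then polls the history. A rough sketch; the `value` list and `status` field (where `Unknown` means in progress) follow the Power BI refresh-history payload, but treating them as this tool's exact return shape is an assumption:
```python
import time

powerbi_refresh_dataset(workspace_id="abc-123", dataset_id="def-456")

while True:
    history = powerbi_get_refresh_history(workspace_id="abc-123", dataset_id="def-456")
    status = history["value"][0]["status"]  # assumed response shape
    if status != "Unknown":  # "Completed" or "Failed"
        break
    time.sleep(30)
```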
## Error Handling
All tools return error dicts on failure:
```python
{"error": "POWERBI_ACCESS_TOKEN is required", "help": "Set POWERBI_ACCESS_TOKEN environment variable"}
{"error": "Power BI API error (HTTP 403): Insufficient permissions"}
{"error": "Request timed out"}
```
@@ -0,0 +1,61 @@
# QuickBooks Tool
Manage customers, invoices, payments, and company info using the QuickBooks Online API.
## Tools
| Tool | Description |
|------|-------------|
| `quickbooks_query` | Run a QuickBooks SQL-like query |
| `quickbooks_get_entity` | Get any entity by type and ID |
| `quickbooks_create_customer` | Create a new customer |
| `quickbooks_create_invoice` | Create a new invoice |
| `quickbooks_get_company_info` | Get company information |
| `quickbooks_list_invoices` | List invoices with date and status filters |
| `quickbooks_get_customer` | Get a customer by ID |
| `quickbooks_create_payment` | Record a payment against an invoice |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `QUICKBOOKS_ACCESS_TOKEN` | OAuth2 access token |
| `QUICKBOOKS_REALM_ID` | QuickBooks company/realm ID |
Get credentials at: [Intuit Developer Portal](https://developer.intuit.com/)
## Usage Examples
### Query customers
```python
quickbooks_query(query="SELECT * FROM Customer WHERE DisplayName LIKE '%Acme%'")
```
### Create a customer
```python
quickbooks_create_customer(display_name="Acme Corp", email="billing@acme.com")
```
### Create an invoice
```python
quickbooks_create_invoice(
customer_id="123",
line_items=[{"description": "Consulting", "amount": 5000, "quantity": 1}],
)
```
### List recent invoices
```python
quickbooks_list_invoices(start_date="2026-01-01", status="Overdue")
```
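### Record a payment against an invoice
The parameter names below are assumptions, since only the `quickbooks_create_payment` tool name is documented:
```python
quickbooks_create_payment(
    customer_id="123",
    invoice_id="456",  # assumed parameter name
    amount=5000,
)
```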
## Error Handling
All tools return error dicts on failure:
```python
{"error": "QUICKBOOKS_ACCESS_TOKEN and QUICKBOOKS_REALM_ID are required", "help": "Set QUICKBOOKS_ACCESS_TOKEN and QUICKBOOKS_REALM_ID environment variables"}
{"error": "QuickBooks API error (HTTP 401): AuthenticationFailed"}
{"error": "Request timed out"}
```
@@ -0,0 +1,96 @@
# Redis Tool
Key-value, hash, list, pub/sub, and utility operations for Redis via a connection URL.
## Tools
| Tool | Description |
|------|-------------|
| `redis_get` | Get the value of a key |
| `redis_set` | Set a key-value pair with optional TTL |
| `redis_delete` | Delete one or more keys |
| `redis_keys` | List keys matching a pattern (non-blocking SCAN) |
| `redis_hset` | Set a field in a hash |
| `redis_hgetall` | Get all fields and values from a hash |
| `redis_lpush` | Push values to the head of a list |
| `redis_lrange` | Get a range of elements from a list |
| `redis_publish` | Publish a message to a channel |
| `redis_ttl` | Get the time-to-live of a key in seconds |
| `redis_info` | Get Redis server information and statistics |
## Setup
Requires a Redis connection URL:
```bash
export REDIS_URL="redis://localhost:6379"
# With password:
# export REDIS_URL="redis://:yourpassword@host:6379/0"
# With TLS:
# export REDIS_URL="rediss://:yourpassword@host:6379/0"
```
## Usage Examples
### Get and set a key
```python
redis_set(key="user:123:name", value="Alice", ttl=3600)
redis_get(key="user:123:name")
```
### Delete keys
```python
redis_delete(keys="user:123:name, user:123:session")
```
### List keys matching a pattern
```python
redis_keys(pattern="user:*", count=50)
```
### Work with a hash
```python
redis_hset(key="user:123", field="email", value="alice@example.com")
redis_hgetall(key="user:123")
```
### Work with a list
```python
redis_lpush(key="task_queue", values="task1, task2, task3")
redis_lrange(key="task_queue", start=0, stop=-1)
```
### Publish a message
```python
redis_publish(channel="notifications", message="New order received")
```
### Check TTL
```python
redis_ttl(key="user:123:session")
# Returns: {"key": "user:123:session", "ttl": 3542}
# -1 = no expiry, -2 = key doesn't exist
```
### Get server info
```python
redis_info()
```
## TTL Reference
| TTL Value | Meaning |
|-----------|---------|
| `> 0` | Seconds remaining until expiry |
| `-1` | Key exists with no expiry |
| `-2` | Key does not exist |
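These TTL values map directly onto session-expiry checks. A minimal sketch using only the documented return shape:
```python
result = redis_ttl(key="user:123:session")
if result["ttl"] == -2:
    print("session key does not exist")
elif result["ttl"] == -1:
    print("session never expires")
else:
    print(f"session expires in {result['ttl']} seconds")
```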
## Error Handling
All tools return error dicts on failure:
```python
{"error": "REDIS_URL not set", "help": "Set REDIS_URL (e.g. redis://localhost:6379 or redis://:password@host:6379/0)"}
{"error": "Redis GET failed: Connection refused"}
{"error": "Redis SET failed: ..."}
```
@@ -0,0 +1,66 @@
# Salesforce Tool
Query, create, update, and manage Salesforce CRM records via the Salesforce REST API.
## Tools
| Tool | Description |
|------|-------------|
| `salesforce_soql_query` | Execute a SOQL query |
| `salesforce_get_record` | Get a record by object type and ID |
| `salesforce_create_record` | Create a new record |
| `salesforce_update_record` | Update an existing record |
| `salesforce_delete_record` | Delete a record |
| `salesforce_describe_object` | Get metadata and fields for an object type |
| `salesforce_list_objects` | List all available Salesforce objects |
| `salesforce_search_records` | Search records using SOSL |
| `salesforce_get_record_count` | Get the total count of records for an object |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `SALESFORCE_ACCESS_TOKEN` | OAuth2 access token |
| `SALESFORCE_INSTANCE_URL` | Salesforce instance URL (e.g., `https://yourorg.salesforce.com`) |
Get credentials at: [Salesforce Connected Apps](https://help.salesforce.com/s/articleView?id=sf.connected_app_overview.htm)
## Usage Examples
### Query contacts
```python
salesforce_soql_query(query="SELECT Id, Name, Email FROM Contact WHERE Email != null LIMIT 10")
```
### Create a lead
```python
salesforce_create_record(
object_type="Lead",
fields={"FirstName": "Jane", "LastName": "Doe", "Company": "Acme"},
)
```
### Update an opportunity
```python
salesforce_update_record(
object_type="Opportunity",
record_id="006xx000001234",
fields={"StageName": "Closed Won", "Amount": 50000},
)
```
### Search across objects
```python
salesforce_search_records(search_query="FIND {Acme} IN ALL FIELDS RETURNING Account, Contact")
```
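### Inspect an object's schema and size
Before writing SOQL against an unfamiliar object, it helps to describe it and check its row count. The `object_type` parameter mirrors the other record tools and is assumed to apply to these two as well:
```python
# List fields, types, and metadata for Account
salesforce_describe_object(object_type="Account")

# Total number of Lead records (assumed to take the same object_type parameter)
salesforce_get_record_count(object_type="Lead")
```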
## Error Handling
All tools return error dicts on failure:
```python
{"error": "Salesforce credentials not configured", "help": "Set SALESFORCE_ACCESS_TOKEN and SALESFORCE_INSTANCE_URL environment variables or configure via credential store"}
{"error": "Salesforce API error (HTTP 400): MALFORMED_QUERY"}
{"error": "Request timed out"}
```
@@ -0,0 +1,56 @@
# SAP Tool
Access SAP S/4HANA data via OData APIs — purchase orders, business partners, products, and sales orders.
## Tools
| Tool | Description |
|------|-------------|
| `sap_list_purchase_orders` | List purchase orders with optional filters |
| `sap_get_purchase_order` | Get details of a specific purchase order |
| `sap_list_business_partners` | List business partners with search and category filters |
| `sap_list_products` | List products with optional search |
| `sap_list_sales_orders` | List sales orders with customer and date filters |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `SAP_BASE_URL` | SAP S/4HANA OData base URL |
| `SAP_USERNAME` | SAP username for Basic Auth |
| `SAP_PASSWORD` | SAP password for Basic Auth |
Get credentials from your SAP system administrator.
## Usage Examples
### List purchase orders
```python
sap_list_purchase_orders(top=20)
```
### Get a specific purchase order
```python
sap_get_purchase_order(purchase_order="4500000001")
```
### Search business partners
```python
sap_list_business_partners(search="Acme", category="supplier", top=10)
```
### List recent sales orders
```python
sap_list_sales_orders(top=20)
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "SAP_BASE_URL, SAP_USERNAME, and SAP_PASSWORD are required", "help": "Set SAP_BASE_URL, SAP_USERNAME, and SAP_PASSWORD environment variables"}
{"error": "SAP API error (HTTP 404): Resource not found"}
{"error": "Request timed out"}
```
@@ -0,0 +1,135 @@
# Shopify Tool
Order, product, and customer management via the Shopify Admin REST API.
## Tools
| Tool | Description |
|------|-------------|
| `shopify_list_orders` | List orders with optional status and fulfillment filters |
| `shopify_get_order` | Get full details of a specific order by ID |
| `shopify_list_products` | List products with optional status, type, and vendor filters |
| `shopify_get_product` | Get full product details including variants and images |
| `shopify_update_product` | Update title, body_html, status, tags, or vendor of a product |
| `shopify_list_customers` | List customers in the store |
| `shopify_get_customer` | Get full customer details including addresses and order stats |
| `shopify_search_customers` | Search customers by email, name, or other fields |
| `shopify_create_draft_order` | Create a draft order with line items |
## Setup
Requires a Shopify Custom App access token and your store name:
1. Go to your Shopify admin → **Settings → Apps and sales channels → Develop apps**
2. Create a custom app and install it with the required API scopes
3. Copy the **Admin API access token** from the app credentials
```bash
SHOPIFY_ACCESS_TOKEN=shpat_xxxxxxxxxxxxxxxxxxxxxxxxxxxx
SHOPIFY_STORE_NAME=your-store-name
```
> `SHOPIFY_STORE_NAME` is the subdomain of your store. For `https://my-shop.myshopify.com`, use `my-shop`.
Required API scopes:
- `read_orders`, `write_orders` — order management
- `read_products`, `write_products` — product management
- `read_customers`, `write_customers` — customer management
- `read_draft_orders`, `write_draft_orders` — draft order creation
## Usage Examples
### List open orders
```python
shopify_list_orders(status="open", limit=50)
```
### List paid but unfulfilled orders
```python
shopify_list_orders(
financial_status="paid",
fulfillment_status="unshipped",
limit=25,
)
```
### Get a specific order
```python
shopify_get_order(order_id="5678901234")
```
### List active products
```python
shopify_list_products(status="active", limit=50)
```
### Get a specific product
```python
shopify_get_product(product_id="1234567890")
```
### Update a product
```python
shopify_update_product(
product_id="1234567890",
title="Updated Product Name",
status="active",
tags="sale,featured",
)
```
### List customers
```python
shopify_list_customers(limit=100)
```
### Search customers by email
```python
shopify_search_customers(query="email:alice@example.com")
```
### Search customers by name
```python
shopify_search_customers(query="first_name:Alice")
```
### Create a draft order
```python
shopify_create_draft_order(
line_items_json='[{"variant_id": 12345678, "quantity": 2}]',
customer_id="987654321",
note="VIP order",
tags="manual,vip",
)
```
## Order Status Values
| Filter | Values |
|--------|--------|
| `status` | `open`, `closed`, `cancelled`, `any` |
| `financial_status` | `paid`, `pending`, `refunded`, `voided` |
| `fulfillment_status` | `fulfilled`, `partial`, `on_hold`, `null` (unfulfilled) |
> **Note:** `fulfillment_status=null` filters for unfulfilled orders. Shopify silently ignores unrecognized filter values rather than returning an error.
## Error Handling
All tools return error dicts on failure:
```python
{"error": "Shopify credentials not configured", "help": "Set SHOPIFY_ACCESS_TOKEN and SHOPIFY_STORE_NAME environment variables or configure via credential store"}
{"error": "Invalid Shopify access token"}
{"error": "Insufficient API scopes for this Shopify resource"}
{"error": "Shopify rate limit exceeded. Try again later."}
```
@@ -0,0 +1,111 @@
# Snowflake Tool
SQL statement execution and async query management via the Snowflake REST API v2.
## Tools
| Tool | Description |
|------|-------------|
| `snowflake_execute_sql` | Execute a SQL statement and return results |
| `snowflake_get_statement_status` | Poll the status and results of an async query |
| `snowflake_cancel_statement` | Cancel a running SQL statement |
## Setup
Requires a Snowflake account identifier and an OAuth or JWT access token:
1. Note your **Account Identifier** (e.g. `orgname-accountname` or `xy12345.us-east-1`)
2. Generate an access token via OAuth, key-pair authentication, or Snowflake programmatic access
```bash
SNOWFLAKE_ACCOUNT=orgname-accountname
SNOWFLAKE_TOKEN=your-oauth-or-jwt-token
```
Optional — set default context values to avoid repeating them per query:
```bash
SNOWFLAKE_WAREHOUSE=COMPUTE_WH
SNOWFLAKE_DATABASE=MY_DATABASE
SNOWFLAKE_SCHEMA=PUBLIC
SNOWFLAKE_TOKEN_TYPE=OAUTH
```
> `SNOWFLAKE_TOKEN_TYPE` defaults to `OAUTH`. Set to `KEYPAIR_JWT` if using key-pair auth.
## Usage Examples
### Run a simple query
```python
snowflake_execute_sql(statement="SELECT CURRENT_USER(), CURRENT_DATABASE()")
```
### Query a specific database and schema
```python
snowflake_execute_sql(
statement="SELECT * FROM orders WHERE status = 'pending' LIMIT 100",
database="SALES_DB",
schema="PUBLIC",
warehouse="COMPUTE_WH",
)
```
### Run a long query asynchronously
```python
# Returns immediately with status="running"
result = snowflake_execute_sql(
statement="SELECT COUNT(*) FROM very_large_table",
timeout=120,
)
# Poll until complete
snowflake_get_statement_status(
statement_handle=result["statement_handle"]
)
```
### Cancel a running query
```python
snowflake_cancel_statement(
statement_handle="01abc123-0000-0001-0000-000100020003"
)
```
## Response Format
A completed query returns:
```python
{
"statement_handle": "01abc...",
"status": "complete",
"num_rows": 42,
"columns": ["ID", "NAME", "CREATED_AT"],
"rows": [["1", "Alice", "2024-01-01"], ...],
"truncated": False, # True if > 100 rows returned
}
```
An async query in progress returns:
```python
{
"statement_handle": "01abc...",
"status": "running",
"message": "Asynchronous execution in progress",
}
```
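Putting the two shapes together, a caller can poll until the handle leaves the `running` state. A minimal sketch using only the documented statuses and keys:
```python
import time

result = snowflake_execute_sql(statement="SELECT COUNT(*) FROM very_large_table")

# Keep polling while the statement is still executing
while result.get("status") == "running":
    time.sleep(5)
    result = snowflake_get_statement_status(
        statement_handle=result["statement_handle"]
    )

print(result.get("rows"))
```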
## Error Handling
All tools return error dicts on failure:
```python
{"error": "SNOWFLAKE_ACCOUNT and SNOWFLAKE_TOKEN are required", "help": "Set SNOWFLAKE_ACCOUNT and SNOWFLAKE_TOKEN environment variables"}
{"error": "HTTP 422: Query failed"}
{"error": "statement is required"}
```
@@ -0,0 +1,131 @@
# Supabase Tool
Database queries, auth, and edge function invocation via the Supabase REST API.
## Tools
| Tool | Description |
|------|-------------|
| `supabase_select` | Query rows from a table using PostgREST filters |
| `supabase_insert` | Insert one or more rows into a table |
| `supabase_update` | Update rows matching PostgREST filters |
| `supabase_delete` | Delete rows matching PostgREST filters |
| `supabase_auth_signup` | Register a new user via Supabase Auth (GoTrue) |
| `supabase_auth_signin` | Sign in a user and retrieve an access token |
| `supabase_edge_invoke` | Invoke a Supabase Edge Function |
## Setup
Requires a Supabase project URL and anon/service key:
1. Go to [supabase.com/dashboard](https://supabase.com/dashboard) → your project → **Project Settings → API**
2. Copy your **Project URL** and **anon public** key (or service role key for elevated access)
Set the following environment variables:
```bash
SUPABASE_URL=https://your-project-id.supabase.co
SUPABASE_ANON_KEY=your-anon-or-service-key
```
## Usage Examples
### Query rows with filters
```python
supabase_select(
table="users",
columns="id,name,email",
filters="status=eq.active&role=eq.admin",
order="created_at.desc",
limit=50,
)
```
### Insert a single row
```python
supabase_insert(
table="orders",
rows='{"customer_id": 42, "total": 99.99, "status": "pending"}',
)
```
### Insert multiple rows
```python
supabase_insert(
table="products",
rows='[{"name": "Widget A", "price": 10}, {"name": "Widget B", "price": 20}]',
)
```
### Update rows matching a filter
```python
supabase_update(
table="orders",
filters="id=eq.123",
data='{"status": "shipped"}',
)
```
### Delete rows matching a filter
```python
supabase_delete(
table="sessions",
filters="expires_at=lt.2024-01-01",
)
```
### Sign up a new user
```python
supabase_auth_signup(
email="alice@example.com",
password="securepassword",
)
```
### Sign in and get an access token
```python
supabase_auth_signin(
email="alice@example.com",
password="securepassword",
)
```
### Invoke an Edge Function
```python
supabase_edge_invoke(
function_name="send-welcome-email",
body='{"user_id": "abc123"}',
method="POST",
)
```
## PostgREST Filter Syntax
| Operator | Meaning | Example |
|----------|---------|---------|
| `eq` | equals | `status=eq.active` |
| `neq` | not equals | `role=neq.admin` |
| `gt` | greater than | `age=gt.18` |
| `lt` | less than | `price=lt.100` |
| `like` | pattern match | `name=like.*Alice*` |
| `is` | is null/true/false | `deleted_at=is.null` |
Combine multiple filters with `&`: `"status=eq.active&role=eq.admin"`
## Error Handling
All tools return error dicts on failure:
```python
{"error": "SUPABASE_ANON_KEY or SUPABASE_URL not set", "help": "Get your keys at https://supabase.com/dashboard → Project Settings → API"}
{"error": "Supabase error 403: ..."}
{"error": "Request to Supabase timed out"}
```
@@ -0,0 +1,57 @@
# Terraform Tool
Manage Terraform Cloud/Enterprise workspaces and runs via the Terraform API.
## Tools
| Tool | Description |
|------|-------------|
| `terraform_list_workspaces` | List workspaces in an organization |
| `terraform_get_workspace` | Get details of a specific workspace |
| `terraform_list_runs` | List runs for a workspace |
| `terraform_get_run` | Get details of a specific run |
| `terraform_create_run` | Trigger a new plan/apply run |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `TFC_TOKEN` | Terraform Cloud/Enterprise API token |
| `TFC_URL` | Terraform Enterprise URL (optional, defaults to Terraform Cloud) |
Get a token at: [Terraform Cloud Tokens](https://app.terraform.io/app/settings/tokens)
Note: The `organization` name is passed as a parameter to tools, not as an environment variable.
## Usage Examples
### List workspaces
```python
terraform_list_workspaces(organization="my-org")
```
### Get workspace details
```python
terraform_get_workspace(workspace_id="ws-abc123")
```
### List runs for a workspace
```python
terraform_list_runs(workspace_id="ws-abc123")
```
### Trigger a new run
```python
terraform_create_run(workspace_id="ws-abc123", message="Deploy v2.1.0")
```
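### Wait for a run to finish
To block until a triggered run completes, a caller can poll `terraform_get_run`. A rough sketch; the response keys below are assumptions modeled on the TFC runs API, while the terminal status names are standard Terraform Cloud run states:
```python
import time

run = terraform_create_run(workspace_id="ws-abc123", message="Deploy v2.1.0")

# Poll until the run reaches a terminal state (assumed response layout)
while True:
    status = terraform_get_run(run_id=run["data"]["id"])["data"]["attributes"]["status"]
    if status in ("applied", "planned_and_finished", "errored", "discarded", "canceled"):
        break
    time.sleep(10)
```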
## Error Handling
All tools return error dicts on failure:
```python
{"error": "TFC_TOKEN is required", "help": "Set TFC_TOKEN environment variable"}
{"error": "organization is required"}
{"error": "Request timed out"}
```
@@ -0,0 +1,55 @@
# Tines Tool
Manage Tines automation stories and actions via the Tines API.
## Tools
| Tool | Description |
|------|-------------|
| `tines_list_stories` | List stories with optional status filter |
| `tines_get_story` | Get details of a specific story |
| `tines_list_actions` | List actions in a story |
| `tines_get_action` | Get details of a specific action |
| `tines_get_action_logs` | Get execution logs for an action |
## Setup
Set the following environment variables:
| Variable | Description |
|----------|-------------|
| `TINES_API_KEY` | Tines API key |
| `TINES_DOMAIN` | Tines tenant URL (e.g., `https://your-tenant.tines.com`) |
Get an API key at: Settings → API Keys in your Tines account.
## Usage Examples
### List all stories
```python
tines_list_stories()
```
### Get story details
```python
tines_get_story(story_id=12345)
```
### List actions in a story
```python
tines_list_actions(story_id=12345)
```
### Get action logs
```python
tines_get_action_logs(action_id=67890)
```
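### Get action details
The remaining tool, `tines_get_action`, takes the same action ID used by the logs call above:
```python
tines_get_action(action_id=67890)
```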
## Error Handling
All tools return error dicts on failure:
```python
{"error": "TINES_DOMAIN and TINES_API_KEY are required", "help": "Set TINES_DOMAIN and TINES_API_KEY environment variables"}
{"error": "Tines API error (HTTP 404): Story not found"}
{"error": "Request timed out"}
```
@@ -0,0 +1,111 @@
# Twilio Tool
SMS and WhatsApp messaging, call logs, and phone number management via the Twilio REST API.
## Tools
| Tool | Description |
|------|-------------|
| `twilio_send_sms` | Send an SMS message |
| `twilio_send_whatsapp` | Send a WhatsApp message |
| `twilio_list_messages` | List recent messages with optional filters |
| `twilio_get_message` | Get details of a specific message by SID |
| `twilio_delete_message` | Delete a message from your Twilio account |
| `twilio_list_phone_numbers` | List phone numbers owned by the account |
| `twilio_list_calls` | List recent calls with optional filters |
## Setup
Requires a Twilio Account SID and Auth Token:
1. Go to [console.twilio.com](https://console.twilio.com)
2. Copy your **Account SID** and **Auth Token** from the dashboard
```bash
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=your-auth-token
```
> Phone numbers must be in **E.164 format**: `+14155552671`
## Usage Examples
### Send an SMS
```python
twilio_send_sms(
to="+14155552671",
from_number="+18005550100",
body="Your verification code is 123456",
)
```
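### Check delivery status of a sent message
The SID returned by a send can be fed back into `twilio_get_message` to confirm delivery. Reading `msg["sid"]` is an assumption based on Twilio's message resource:
```python
msg = twilio_send_sms(
    to="+14155552671",
    from_number="+18005550100",
    body="Your order has shipped",
)
# Status progresses through "queued", "sent", "delivered", etc.
twilio_get_message(message_sid=msg["sid"])
```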
### Send a WhatsApp message
```python
twilio_send_whatsapp(
to="+14155552671",
from_number="+14155550000",
body="Hello from Twilio WhatsApp!",
)
```
### List recent messages
```python
twilio_list_messages(page_size=20)
```
### Filter messages by recipient
```python
twilio_list_messages(to="+14155552671", page_size=10)
```
### Get a specific message
```python
twilio_get_message(message_sid="SMxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
```
### Delete a message
```python
twilio_delete_message(message_sid="SMxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")
```
### List phone numbers on the account
```python
twilio_list_phone_numbers()
```
### List recent calls
```python
twilio_list_calls(status="completed", page_size=20)
```
## Call Status Values
| Status | Meaning |
|--------|---------|
| `queued` | Call is queued |
| `ringing` | Recipient's phone is ringing |
| `in-progress` | Call is active |
| `completed` | Call ended successfully |
| `busy` | Recipient was busy |
| `failed` | Call failed |
| `no-answer` | No answer |
| `canceled` | Call was canceled |
## Error Handling
All tools return error dicts on failure:
```python
{"error": "TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN not set", "help": "Get credentials from https://console.twilio.com/"}
{"error": "Unauthorized. Check your Twilio credentials."}
{"error": "Rate limited. Try again shortly."}
{"error": "Message not found."}
```
@@ -0,0 +1,108 @@
# Twitter Tool
Tweet search, user lookup, and timeline access via the X (Twitter) API v2.
## Tools
| Tool | Description |
|------|-------------|
| `twitter_search_tweets` | Search recent tweets from the last 7 days |
| `twitter_get_user` | Get a user profile by username |
| `twitter_get_user_tweets` | Get recent tweets from a user's timeline |
| `twitter_get_tweet` | Get details of a specific tweet by ID |
| `twitter_get_user_followers` | Get followers of a user |
| `twitter_get_tweet_replies` | Get replies to a specific tweet |
| `twitter_get_list_tweets` | Get recent tweets from a Twitter/X list |
## Setup
Requires an X (Twitter) Bearer Token for read-only API v2 access:
1. Go to [developer.x.com](https://developer.x.com) → **Projects & Apps → Your App → Keys and Tokens**
2. Copy the **Bearer Token** under **Authentication Tokens**
```bash
X_BEARER_TOKEN=your-bearer-token
```
> The Bearer Token provides read-only access. Write operations (post, like, retweet) are not supported by this tool.
## Usage Examples
### Search recent tweets
```python
twitter_search_tweets(
query="python machine learning -is:retweet lang:en",
max_results=25,
sort_order="recency",
)
```
### Search tweets from a specific user
```python
twitter_search_tweets(query="from:openai has:media", max_results=10)
```
### Get a user profile
```python
twitter_get_user(username="elonmusk")
```
### Get a user's recent tweets
```python
twitter_get_user_tweets(
user_id="44196397",
max_results=20,
exclude_replies=True,
exclude_retweets=True,
)
```
### Get a specific tweet
```python
twitter_get_tweet(tweet_id="1234567890123456789")
```
### Get followers of a user
```python
twitter_get_user_followers(user_id="44196397", max_results=50)
```
### Get replies to a tweet
```python
twitter_get_tweet_replies(tweet_id="1234567890123456789", max_results=20)
```
### Get tweets from a list
```python
twitter_get_list_tweets(list_id="84839422", max_results=10)
```
## Search Query Operators
| Operator | Example | Meaning |
|----------|---------|---------|
| `from:` | `from:nasa` | Tweets by a specific user |
| `to:` | `to:support` | Replies to a specific user |
| `-is:retweet` | `-is:retweet` | Exclude retweets |
| `has:media` | `has:media` | Tweets with media |
| `lang:` | `lang:en` | Filter by language |
| `#` | `#python` | Hashtag search |
## Error Handling
All tools return error dicts on failure:
```python
{"error": "X_BEARER_TOKEN is required", "help": "Set X_BEARER_TOKEN environment variable"}
{"error": "HTTP 429: ..."}
{"error": "tweet_id is required"}
```
@@ -0,0 +1,106 @@
# Vercel Tool
Manage deployments, projects, domains, and environment variables via the Vercel REST API.
## Tools
| Tool | Description |
|------|-------------|
| `vercel_list_deployments` | List deployments, optionally filtered by project or state |
| `vercel_get_deployment` | Get details of a specific deployment |
| `vercel_list_projects` | List all Vercel projects |
| `vercel_get_project` | Get details of a specific project |
| `vercel_list_project_domains` | List domains configured for a project |
| `vercel_list_env_vars` | List environment variables for a project |
| `vercel_create_env_var` | Create an environment variable for a project |
## Setup
Requires a Vercel access token:
```bash
export VERCEL_TOKEN="your_vercel_token"
```
> Get a token at https://vercel.com/account/tokens
## Usage Examples
### List recent deployments
```python
vercel_list_deployments(limit=10, state="READY")
```
### List deployments for a specific project
```python
vercel_list_deployments(project_id="my-project", limit=5)
```
### Get deployment details
```python
vercel_get_deployment(deployment_id="dpl_abc123")
```
### List all projects
```python
vercel_list_projects(limit=20)
```
### Get project details
```python
vercel_get_project(project_id="my-project")
```
### List domains for a project
```python
vercel_list_project_domains(project_id="my-project")
```
### List environment variables
```python
vercel_list_env_vars(project_id="my-project")
```
### Create an environment variable
```python
vercel_create_env_var(
project_id="my-project",
key="DATABASE_URL",
value="postgresql://user:pass@host/db",
target="production,preview",
env_type="encrypted"
)
```
## Deployment States
| State | Description |
|-------|-------------|
| `BUILDING` | Currently building |
| `READY` | Live and serving traffic |
| `ERROR` | Build or runtime error |
| `QUEUED` | Waiting to build |
| `INITIALIZING` | Starting up |
| `CANCELED` | Manually canceled |
## Environment Variable Types
| Type | Description |
|------|-------------|
| `encrypted` | Encrypted at rest, not visible after creation (default) |
| `secret` | Reference to a shared secret, value not stored directly |
| `plain` | Plaintext, visible in dashboard |
| `sensitive` | Encrypted, never shown after creation |
| `system` | System-provided variable |
## Error Handling
All tools return error dicts on failure:
```python
{"error": "VERCEL_TOKEN not set", "help": "Get a token at https://vercel.com/account/tokens"}
{"error": "Unauthorized. Check your VERCEL_TOKEN."}
{"error": "Forbidden: ..."}
{"error": "Vercel API error 404: ..."}
{"error": "Request to Vercel timed out"}
```
@@ -0,0 +1,118 @@
# Yahoo Finance Tool
Latest available stock quotes, historical prices, financial statements, and company info via the `yfinance` library.
> **Note:** Data is sourced from Yahoo Finance and may be delayed by 15 minutes or more depending on the exchange.
## Tools
| Tool | Description |
|------|-------------|
| `yahoo_finance_quote` | Get current stock quote and key statistics |
| `yahoo_finance_history` | Get historical OHLCV price data |
| `yahoo_finance_financials` | Get income statement, balance sheet, or cash flow |
| `yahoo_finance_info` | Get detailed company information |
| `yahoo_finance_search` | Search for ticker symbols by company name or keyword |
## Setup
No API key or credentials required. The tool uses the `yfinance` Python library, which accesses Yahoo Finance data directly.
> Data is provided by Yahoo Finance and is subject to their terms of use.
## Usage Examples
### Get a stock quote
```python
yahoo_finance_quote(symbol="AAPL")
```
### Get historical daily prices for the last month
```python
yahoo_finance_history(
symbol="MSFT",
period="1mo",
interval="1d",
)
```
### Get intraday prices (last 5 days, hourly)
```python
yahoo_finance_history(
symbol="GOOGL",
period="5d",
interval="1h",
)
```
### Get the income statement
```python
yahoo_finance_financials(symbol="AAPL", statement="income")
```
### Get the balance sheet
```python
yahoo_finance_financials(symbol="TSLA", statement="balance")
```
### Get the cash flow statement
```python
yahoo_finance_financials(symbol="NVDA", statement="cashflow")
```
### Get company info
```python
yahoo_finance_info(symbol="AMZN")
```
### Search for a ticker symbol
```python
yahoo_finance_search(query="Tesla")
```
## Period Values
| Period | Meaning |
|--------|---------|
| `1d` | 1 day |
| `5d` | 5 days |
| `1mo` | 1 month |
| `3mo` | 3 months |
| `6mo` | 6 months |
| `1y` | 1 year |
| `2y` | 2 years |
| `5y` | 5 years |
| `ytd` | Year to date |
| `max` | All available history |
## Interval Values
| Interval | Meaning |
|----------|---------|
| `1m` | 1 minute (last 7 days only) |
| `5m` | 5 minutes |
| `1h` | 1 hour |
| `1d` | Daily |
| `1wk` | Weekly |
| `1mo` | Monthly |
> **Interval constraints:** Intraday intervals have range limits enforced by yfinance. `1m` is limited to the last 7 days. `5m`, `15m`, `30m`, and `1h` are limited to the last 60 days. Invalid combinations return an empty result silently rather than raising an error.
## Error Handling
All tools return error dicts on failure:
```python
{"error": "symbol is required"}
{"error": "No data found for symbol 'XYZ'"}
{"error": "Invalid statement type: xyz. Use: income, balance, cashflow"}
{"error": "Failed to fetch quote for AAPL: ..."}
```
@@ -0,0 +1,110 @@
# YouTube Transcript Tool
Retrieve video transcripts and list available caption tracks via the `youtube-transcript-api` library.
## Tools
| Tool | Description |
|------|-------------|
| `youtube_get_transcript` | Get the transcript/captions for a YouTube video |
| `youtube_list_transcripts` | List all available transcript languages for a video |
## Setup
No API key or authentication required. The tool uses the `youtube-transcript-api` Python library and works with any public video that has captions enabled.
Ensure the package is installed:
```bash
pip install youtube-transcript-api
```
> Only videos with captions enabled (auto-generated or manual) can be transcribed. Private or age-restricted videos may not be accessible.
## Usage Examples
### Get the English transcript of a video
```python
youtube_get_transcript(
video_id="dQw4w9WgXcQ",
language="en",
)
```
### Get transcript in another language
```python
youtube_get_transcript(
video_id="dQw4w9WgXcQ",
language="de",
)
```
### Get transcript preserving HTML formatting tags
```python
youtube_get_transcript(
video_id="dQw4w9WgXcQ",
language="en",
preserve_formatting=True,
)
```
### List all available transcript languages
```python
youtube_list_transcripts(video_id="dQw4w9WgXcQ")
```
## Response Format
`youtube_get_transcript` returns:
```python
{
"video_id": "dQw4w9WgXcQ",
"language": "English",
"language_code": "en",
"is_generated": True,
"snippet_count": 312,
"snippets": [
{"text": "Never gonna give you up", "start": 18.44, "duration": 1.72},
...
]
}
```
`youtube_list_transcripts` returns:
```python
{
"video_id": "dQw4w9WgXcQ",
"count": 3,
"transcripts": [
{"language": "English", "language_code": "en", "is_generated": True, "is_translatable": True},
{"language": "German", "language_code": "de", "is_generated": False, "is_translatable": True},
]
}
```
## Finding the Video ID
The video ID is the `v=` parameter in a YouTube URL:
```
https://www.youtube.com/watch?v=dQw4w9WgXcQ
^^^^^^^^^^^
This is the video ID
```
## Error Handling
All tools return error dicts on failure:
```python
{"error": "video_id is required"}
{"error": "TranscriptsDisabled: ..."}
{"error": "NoTranscriptFound: ..."}
{"error": "youtube-transcript-api package not installed. Run: pip install youtube-transcript-api"}
```
@@ -0,0 +1,134 @@
# Zendesk Tool
Ticket management, comments, user listing, and search via the Zendesk Support API.
## Tools
| Tool | Description |
|------|-------------|
| `zendesk_list_tickets` | List tickets in the account |
| `zendesk_get_ticket` | Get full details of a specific ticket |
| `zendesk_create_ticket` | Create a new support ticket |
| `zendesk_update_ticket` | Update ticket status, priority, or tags |
| `zendesk_search_tickets` | Search tickets using Zendesk query syntax |
| `zendesk_get_ticket_comments` | List all comments on a ticket |
| `zendesk_add_ticket_comment` | Add a public reply or internal note to a ticket |
| `zendesk_list_users` | List users filtered by role |
## Setup
Requires a Zendesk subdomain, agent email, and API token:
1. Log in to your Zendesk admin panel
2. Go to **Admin → Apps and integrations → APIs → Zendesk API**
3. Enable **Token Access** and create a new API token
```bash
ZENDESK_SUBDOMAIN=your-subdomain
ZENDESK_EMAIL=agent@yourcompany.com
ZENDESK_API_TOKEN=your-api-token
```
> `ZENDESK_SUBDOMAIN` is the part before `.zendesk.com`. For `https://acme.zendesk.com`, use `acme`.
## Usage Examples
### List open tickets
```python
zendesk_list_tickets(page_size=25)
```
### Get a specific ticket
```python
zendesk_get_ticket(ticket_id=12345)
```
### Create a new ticket
```python
zendesk_create_ticket(
subject="Login button not working",
body="Users are reporting that the login button on mobile is unresponsive.",
priority="high",
ticket_type="incident",
tags="mobile,login,bug",
)
```
### Update a ticket status
```python
zendesk_update_ticket(
ticket_id=12345,
status="pending",
priority="urgent",
)
```
### Add a public reply to a ticket
```python
zendesk_add_ticket_comment(
ticket_id=12345,
body="We have identified the issue and a fix is being deployed.",
public=True,
)
```
### Add an internal note
```python
zendesk_add_ticket_comment(
ticket_id=12345,
body="Escalated to the backend team via Slack #incidents.",
public=False,
)
```
### Search tickets
```python
zendesk_search_tickets(
query="status:open priority:urgent",
sort_by="updated_at",
sort_order="desc",
)
```
### Search by assignee and tag
```python
zendesk_search_tickets(query="assignee:agent@company.com tags:billing")
```
### List all agents
```python
zendesk_list_users(role="agent", page_size=50)
```
## Ticket Status Values
| Status | Meaning |
|--------|---------|
| `new` | Newly created, unassigned |
| `open` | Assigned and being worked on |
| `pending` | Waiting for requester response |
| `hold` | Waiting on a third party |
| `solved` | Resolved by agent |
| `closed` | Permanently closed |
## Error Handling
All tools return error dicts on failure:
```python
{"error": "ZENDESK_SUBDOMAIN, ZENDESK_EMAIL, and ZENDESK_API_TOKEN not set", "help": "Create an API token in Zendesk Admin > Apps and integrations > APIs > Zendesk API"}
{"error": "Unauthorized. Check your Zendesk credentials."}
{"error": "Forbidden. Check your Zendesk permissions."}
{"error": "Rate limited. Try again shortly."}
```
@@ -0,0 +1,130 @@
# Zoom Tool
Meeting management, recordings, and user info via the Zoom API v2.
## Tools
| Tool | Description |
|------|-------------|
| `zoom_get_user` | Get Zoom user profile information |
| `zoom_list_meetings` | List scheduled, live, or upcoming meetings for a user |
| `zoom_get_meeting` | Get full details of a specific meeting |
| `zoom_create_meeting` | Create a new instant or scheduled meeting |
| `zoom_update_meeting` | Update topic, time, duration, or agenda of a meeting |
| `zoom_delete_meeting` | Cancel and delete a meeting |
| `zoom_list_recordings` | List cloud recordings within a date range |
| `zoom_list_meeting_participants` | List participants from a past meeting |
| `zoom_list_meeting_registrants` | List registrants for a registration-enabled meeting |
## Setup
Requires a Zoom Server-to-Server OAuth access token:
1. Go to [marketplace.zoom.us](https://marketplace.zoom.us) → **Develop → Build App → Server-to-Server OAuth**
2. Create an app and note the **Account ID**, **Client ID**, and **Client Secret**
3. Generate an access token and set it as an environment variable:
```bash
ZOOM_ACCESS_TOKEN=your-server-to-server-oauth-token
```
> **Token expiry:** Server-to-Server OAuth tokens expire after **1 hour**. You will need to regenerate the token and update `ZOOM_ACCESS_TOKEN` when you see an `"Invalid or expired Zoom access token"` error.
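Because the token is short-lived, long-running agents often mint a fresh one at startup. A sketch of the standard Server-to-Server token exchange; the `ZOOM_CLIENT_ID`, `ZOOM_CLIENT_SECRET`, and `ZOOM_ACCOUNT_ID` variable names are illustrative, not part of this tool:
```python
import base64
import os

import requests

# Exchange Server-to-Server OAuth credentials for a fresh access token
creds = f"{os.environ['ZOOM_CLIENT_ID']}:{os.environ['ZOOM_CLIENT_SECRET']}"
resp = requests.post(
    "https://zoom.us/oauth/token",
    params={
        "grant_type": "account_credentials",
        "account_id": os.environ["ZOOM_ACCOUNT_ID"],
    },
    headers={"Authorization": "Basic " + base64.b64encode(creds.encode()).decode()},
    timeout=30,
)
resp.raise_for_status()
os.environ["ZOOM_ACCESS_TOKEN"] = resp.json()["access_token"]
```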
Required OAuth scopes:
- `meeting:read` — list and read meetings
- `meeting:write` — create, update, delete meetings
- `recording:read` — list cloud recordings
- `user:read` — read user profiles
## Usage Examples
### Get authenticated user info
```python
zoom_get_user(user_id="me")
```
### List upcoming meetings
```python
zoom_list_meetings(user_id="me", type="upcoming", page_size=10)
```
### Create a scheduled meeting
```python
zoom_create_meeting(
topic="Sprint Planning",
start_time="2025-06-01T10:00:00Z",
duration=60,
timezone="America/New_York",
agenda="Review sprint backlog and assign tasks",
)
```
### Create an instant meeting
```python
zoom_create_meeting(topic="Quick Sync")
```
### Update a meeting
```python
zoom_update_meeting(
meeting_id="123456789",
topic="Sprint Planning - Updated",
duration=90,
)
```
### Delete a meeting
```python
zoom_delete_meeting(meeting_id="123456789")
```
### List cloud recordings for a date range
```python
zoom_list_recordings(
from_date="2025-05-01",
to_date="2025-05-31",
user_id="me",
)
```
### List participants from a past meeting
```python
zoom_list_meeting_participants(meeting_id="123456789")
```
### List approved registrants
```python
zoom_list_meeting_registrants(
meeting_id="123456789",
status="approved",
)
```
## Meeting Types
| Type value | Meaning |
|------------|---------|
| `upcoming` | All upcoming meetings |
| `scheduled` | Scheduled meetings only |
| `live` | Currently live meetings |
| `previous_meetings` | Past meetings |
## Error Handling
All tools return error dicts on failure:
```python
{"error": "Zoom credentials not configured", "help": "Set ZOOM_ACCESS_TOKEN environment variable or configure via credential store"}
{"error": "Invalid or expired Zoom access token"}
{"error": "Insufficient Zoom API scopes for this operation"}
{"error": "Zoom rate limit exceeded. Try again later."}
```
-84
View File
@@ -1,84 +0,0 @@
"""
Manual test script for browser highlight animations.
Launches a visible browser, goes to Google, searches "aden hive",
and clicks the first result with highlight animations on each action.
Usage:
python tools/test_highlights.py
"""
import asyncio
import sys
# Ensure the package is importable
sys.path.insert(0, "tools/src")
from gcu.browser.highlight import highlight_coordinate, highlight_element
from gcu.browser.session import BrowserSession
async def step(label: str) -> None:
print(f"\n{label}")
async def main() -> None:
session = BrowserSession(profile="highlight-test")
try:
# 1. Start browser (visible)
await step("Starting browser (headless=False)")
result = await session.start(headless=False, persistent=False)
print(f" {result}")
# 2. Open a tab and navigate to Google
await step("Navigating to google.com")
result = await session.open_tab("https://www.google.com")
print(f" {result}")
page = session.get_active_page()
assert page, "No active page"
# Small pause so you can see the page load
await asyncio.sleep(1)
# 3. Highlight + fill the search bar
selector = 'textarea[name="q"]'
await step(f"Highlighting search bar: {selector}")
await highlight_element(page, selector)
await step("Filling search bar with 'aden hive'")
await page.fill(selector, "aden hive")
await asyncio.sleep(0.5)
# 4. Press Enter to search
await step("Pressing Enter")
await page.press(selector, "Enter")
await page.wait_for_load_state("domcontentloaded", timeout=10000)
await asyncio.sleep(1)
# 5. Highlight + click the first search result link
first_result = "#search a h3"
await step(f"Highlighting first result: {first_result}")
await highlight_element(page, first_result)
await step("Clicking first result")
await page.click(first_result, timeout=10000)
await page.wait_for_load_state("domcontentloaded", timeout=10000)
await asyncio.sleep(1)
# 6. Bonus: test coordinate highlight at center of viewport
await step("Testing coordinate highlight at viewport center (960, 540)")
await highlight_coordinate(page, 960, 540)
print("\n✓ All steps complete. Browser stays open for 5 seconds...")
await asyncio.sleep(5)
finally:
await step("Stopping browser")
await session.stop()
print("Done.")
if __name__ == "__main__":
asyncio.run(main())
@@ -1,42 +0,0 @@
"""Tests for browser advanced tools."""
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from fastmcp import FastMCP
from gcu.browser.tools.advanced import register_advanced_tools
@pytest.fixture
def mcp() -> FastMCP:
"""Create a fresh FastMCP instance for testing."""
return FastMCP("test-browser-advanced")
@pytest.fixture
def browser_wait_fn(mcp):
"""Register browser tools and return the browser_wait function."""
register_advanced_tools(mcp)
return mcp._tool_manager._tools["browser_wait"].fn
@pytest.mark.asyncio
async def test_browser_wait_passes_text_as_function_argument(browser_wait_fn):
"""Quoted and multiline text should be passed as data, not JS source."""
text = "O'Reilly\nMedia"
page = MagicMock()
page.wait_for_function = AsyncMock()
session = MagicMock()
session.get_page.return_value = page
with patch("gcu.browser.tools.advanced.get_session", return_value=session):
result = await browser_wait_fn(text=text, timeout_ms=1234)
assert result == {"ok": True, "action": "wait", "condition": "text", "text": text}
page.wait_for_function.assert_awaited_once_with(
"(text) => document.body.innerText.includes(text)",
arg=text,
timeout=1234,
)
+38 -14
View File
@@ -622,7 +622,7 @@ class TestInspection:
# browser_screenshot returns list of content blocks
assert isinstance(result, list)
mock_bridge.screenshot.assert_awaited_once_with(100, full_page=True)
mock_bridge.screenshot.assert_awaited_once_with(100, full_page=True, selector=None)
class TestAdvancedTools:
@@ -671,9 +671,27 @@ class TestAdvancedTools:
assert result["result"]["value"]["status"] == "success"
@pytest.mark.asyncio
async def test_file_upload(self, mcp: FastMCP, mock_bridge: MagicMock):
async def test_file_upload(self, mcp: FastMCP, mock_bridge: MagicMock, tmp_path):
"""Test file upload functionality."""
mock_bridge.upload_file = AsyncMock(return_value={"ok": True, "files": 2})
# Create real files — browser_upload validates they exist on disk
file1 = tmp_path / "file1.pdf"
file2 = tmp_path / "file2.pdf"
file1.write_bytes(b"fake pdf 1")
file2.write_bytes(b"fake pdf 2")
# Mock the CDP calls used by browser_upload
mock_bridge.cdp_attach = AsyncMock(return_value={"ok": True})
async def mock_cdp(tab_id, method, params=None):
if method == "DOM.getDocument":
return {"root": {"nodeId": 1}}
if method == "DOM.querySelector":
return {"nodeId": 42}
if method == "DOM.setFileInputFiles":
return {"ok": True}
return {"ok": True}
mock_bridge._cdp = AsyncMock(side_effect=mock_cdp)
register_advanced_tools(mcp)
browser_upload = mcp._tool_manager._tools["browser_upload"].fn
@@ -685,10 +703,11 @@ class TestAdvancedTools:
):
result = await browser_upload(
selector="input[type='file']",
file_paths=["/tmp/file1.pdf", "/tmp/file2.pdf"],
file_paths=[str(file1), str(file2)],
)
assert result.get("ok") is True
assert result.get("count") == 2
class TestErrorHandling:
@@ -745,8 +764,14 @@ class TestIFWrapping:
"""Tests for JavaScript IIFE wrapping to handle return statements."""
@pytest.mark.asyncio
async def test_evaluate_with_bare_return(self, mcp: FastMCP, mock_bridge: MagicMock):
"""Test that scripts with bare return statements are wrapped properly."""
async def test_evaluate_passes_script_through_to_bridge(
self, mcp: FastMCP, mock_bridge: MagicMock
):
"""browser_evaluate should pass the script through to bridge.evaluate unchanged.
IIFE wrapping happens inside bridge.evaluate (see bridge.py), not in
the tool layer. The tool's job is just to forward the script.
"""
call_args = []
async def mock_evaluate_capture(tab_id: int, script: str) -> dict:
@@ -763,15 +788,12 @@ class TestIFWrapping:
"gcu.browser.tools.advanced._get_context",
return_value={"groupId": 1, "activeTabId": 100},
):
# Script with bare return at top level
result = await browser_evaluate(script="return 42;")
# Verify the script was wrapped in IIFE
assert len(call_args) == 1
wrapped_script = call_args[0]
assert wrapped_script.startswith("(function()")
assert wrapped_script.endswith("})()")
assert result.get("ok") is True
# Tool passes script through unchanged — wrapping is bridge's job
assert call_args == ["return 42;"]
# Tool returns bridge's raw result
assert result == {"result": {"value": 42}}
@pytest.mark.asyncio
async def test_evaluate_complex_script(self, mcp: FastMCP, mock_bridge: MagicMock):
@@ -798,7 +820,9 @@ class TestIFWrapping:
"""
result = await browser_evaluate(script=complex_script)
assert result.get("ok") is True
# browser_evaluate returns bridge.evaluate's raw result
assert "result" in result
assert result["result"]["value"] == {"total": 100, "filtered": 50}
class TestConcurrentOperations:
+14 -10
View File
@@ -111,7 +111,7 @@ class TestResolveRef:
"e0": RefEntry(role="button", name="Submit", nth=0),
}
result = resolve_ref("e0", ref_map)
assert result == 'role=button[name="Submit"] >> nth=0'
assert result == '[role="button"][aria-label="Submit"]:nth-of-type(1)'
def test_passes_through_css_selectors(self):
ref_map = {"e0": RefEntry(role="button", name="OK", nth=0)}
@@ -133,33 +133,37 @@ class TestResolveRef:
with pytest.raises(ValueError, match="no snapshot"):
resolve_ref("e0", None)
def test_escapes_quotes_in_name(self):
def test_quoted_name_passes_through(self):
# Note: the CSS selector output does not currently escape inner quotes.
# This produces technically-broken CSS when name contains double quotes,
# but the bridge-based matcher appears to tolerate it. Tracked
# separately as a follow-up.
ref_map = {
"e0": RefEntry(role="button", name='Say "Hello"', nth=0),
}
result = resolve_ref("e0", ref_map)
assert result == 'role=button[name="Say \\"Hello\\""] >> nth=0'
assert result == '[role="button"][aria-label="Say "Hello""]:nth-of-type(1)'
def test_no_name_produces_role_only_selector(self):
ref_map = {
"e0": RefEntry(role="textbox", name=None, nth=0),
}
result = resolve_ref("e0", ref_map)
assert result == "role=textbox >> nth=0"
assert result == '[role="textbox"]:nth-of-type(1)'
def test_empty_name(self):
ref_map = {
"e0": RefEntry(role="button", name="", nth=0),
}
result = resolve_ref("e0", ref_map)
assert result == 'role=button[name=""] >> nth=0'
assert result == '[role="button"][aria-label=""]:nth-of-type(1)'
def test_nth_in_selector(self):
ref_map = {
"e0": RefEntry(role="link", name="Next", nth=2),
}
result = resolve_ref("e0", ref_map)
assert result == 'role=link[name="Next"] >> nth=2'
assert result == '[role="link"][aria-label="Next"]:nth-of-type(3)'
# ---------------------------------------------------------------------------
@@ -172,13 +176,13 @@ class TestRoundTrip:
snapshot = '- button "Submit"\n- textbox "Email"\n- link "Home"'
_, ref_map = annotate_snapshot(snapshot)
# Each ref should resolve to a valid Playwright role selector
# Each ref should resolve to a valid CSS selector (bridge-based API)
for ref_id, entry in ref_map.items():
resolved = resolve_ref(ref_id, ref_map)
assert resolved.startswith(f"role={entry.role}")
assert resolved.startswith(f'[role="{entry.role}"]')
if entry.name is not None:
assert f'name="{entry.name}"' in resolved
assert f"nth={entry.nth}" in resolved
assert f'[aria-label="{entry.name}"]' in resolved
assert f":nth-of-type({entry.nth + 1})" in resolved
def test_css_selectors_still_work_after_annotate(self):
snapshot = '- button "OK"'
Generated
+4 -8
View File
@@ -3498,6 +3498,8 @@ dependencies = [
{ name = "jsonpath-ng" },
{ name = "litellm" },
{ name = "pandas" },
{ name = "playwright" },
{ name = "playwright-stealth" },
{ name = "psycopg2-binary" },
{ name = "pydantic" },
{ name = "pypdf" },
@@ -3516,8 +3518,6 @@ all = [
{ name = "google-cloud-bigquery" },
{ name = "openpyxl" },
{ name = "pillow" },
{ name = "playwright" },
{ name = "playwright-stealth" },
{ name = "pytesseract" },
{ name = "restrictedpython" },
]
@@ -3526,8 +3526,6 @@ bigquery = [
]
browser = [
{ name = "pillow" },
{ name = "playwright" },
{ name = "playwright-stealth" },
]
databricks = [
{ name = "databricks-mcp" },
@@ -3585,10 +3583,8 @@ requires-dist = [
{ name = "pillow", marker = "extra == 'all'", specifier = ">=10.0.0" },
{ name = "pillow", marker = "extra == 'browser'", specifier = ">=10.0.0" },
{ name = "pillow", marker = "extra == 'ocr'", specifier = ">=10.0.0" },
{ name = "playwright", marker = "extra == 'all'", specifier = ">=1.40.0" },
{ name = "playwright", marker = "extra == 'browser'", specifier = ">=1.40.0" },
{ name = "playwright-stealth", marker = "extra == 'all'", specifier = ">=2.0.0" },
{ name = "playwright-stealth", marker = "extra == 'browser'", specifier = ">=2.0.0" },
{ name = "playwright", specifier = ">=1.40.0" },
{ name = "playwright-stealth", specifier = ">=2.0.0" },
{ name = "psycopg2-binary", specifier = ">=2.9.0" },
{ name = "pydantic", specifier = ">=2.0.0" },
{ name = "pypdf", specifier = ">=4.0.0" },