Richard Tang 12ffacccab feat: tools config frontend grouping and tools cleanup

2026-05-01 15:28:40 -07:00

Skill Registry — Product & Business Requirements Document

Status: Draft v1
Last updated: 2026-03-13
Authors: Timothy
Reviewers: Platform, Product, OSS/Community, Developer Experience

1. Executive Summary

This document proposes a Skill System for Hive — a portable implementation of the open Agent Skills standard — combined with a community registry and a set of built-in default skills that give every worker agent runtime resiliency out of the box.

1.1 The Agent Skills Standard

Agent Skills is an open format, originally developed by Anthropic, for giving agents new capabilities and expertise. It has been adopted by 30+ products, including Claude Code, Cursor, VS Code, GitHub Copilot, Gemini CLI, OpenHands, Goose, Roo Code, and OpenAI Codex.

A skill is a directory containing a SKILL.md file — YAML frontmatter (name, description) plus markdown instructions — optionally accompanied by scripts, reference docs, and assets. Agents discover skills at startup, load only the name and description into context (progressive disclosure tier 1), and activate the full instructions on demand when the task matches (tier 2). Supporting files are loaded only when the instructions reference them (tier 3).

my-skill/
├── SKILL.md          # Required: metadata + instructions
├── scripts/          # Optional: executable code
├── references/       # Optional: documentation
├── assets/           # Optional: templates, resources
└── evals/            # Optional: test cases and assertions

1.2 What Hive Adds

Hive implements the Agent Skills standard faithfully — no forks, no proprietary extensions to the SKILL.md format. A skill written for Claude Code, Cursor, or any other compatible product works in Hive with zero changes, and vice versa.

On top of the standard, Hive adds two things:

Default skills — Six built-in skills shipped with the Hive framework that every worker agent loads automatically (enumerated in DS-1). Five encode runtime operational discipline: structured note-taking, colony progress tracking, context preservation, quality self-assessment, and error recovery protocols. The sixth is `hive.writing-hive-skills`. They are the "muscle memory" that makes agents reliable by default.
Community registry (hive-skill-registry) — A curated GitHub repository where contributors submit skill packages via pull request. Skills in the registry are standard Agent Skills packages. Includes CI validation, trust tiers, starter packs, and bounty program integration.

1.3 Abstraction Hierarchy

Layer	What it is	Example
Tool	A single function call via MCP	`web_search`, `gmail_send`, `jira_create_issue`
Skill	A `SKILL.md` with instructions, scripts, and references	"Deep Research", "Code Review", "Data Analysis"
Default Skill	A built-in skill for runtime resiliency	"Structured Note-Taking", "Colony Progress Tracker"
Agent	A complete goal-driven worker composed of skills	"Sales Outreach Agent", "Support Triage Agent"

2. Problem Statement

2.1 Current State

Worker agents have no skill system. There is no mechanism to discover, load, or follow reusable procedural instructions on demand.
The 12 example templates in examples/templates/ are copy-paste only — they cannot be composed, imported, versioned, or discovered at runtime.
Agent builders must either hand-write all prompts and tool orchestration from scratch, or copy patterns from other agents manually.
Skills written for Claude Code, Cursor, and other Agent Skills-compatible products do not work in Hive. Users who adopt Hive lose access to the growing ecosystem of community skills.
Worker agents have no standardized operational discipline. The framework provides mechanical safeguards (stall detection, doom-loop fingerprinting, checkpoint/resume), but there is no cognitive protocol for how an agent should take structured notes when processing a 50-item batch, when to proactively save data before context pruning, or how to self-assess quality degradation. Each agent author either reinvents these patterns in their system prompts or — more commonly — skips them entirely.
When a community member builds a battle-tested skill (research pattern, triage workflow, outreach playbook), there is no pathway to share it, no discovery mechanism, no versioning, and no quality signals.

2.2 Who Is Affected

Persona	Pain Point
OSS contributor	Built a great skill for another Agent Skills-compatible product; wants it to work in Hive too, or wants to share a Hive skill with the wider ecosystem
Agent builder (beginner)	Overwhelmed by framework concepts; wants to install a "deep research" skill and use it without understanding graph internals
Agent builder (advanced)	Copies the same prompt patterns and tool orchestration across agents; wants reusable, version-pinned building blocks
Platform team	Cannot codify best practices as reusable runtime primitives; every quality improvement is a docs change, not a skill update
Enterprise user	Wants an internal skill library so teams share proven patterns; needs cross-product compatibility

2.3 Impact of Not Solving

Hive is incompatible with the Agent Skills ecosystem — a growing open standard adopted by 30+ products. Users choosing Hive lose access to community skills; contributors targeting the ecosystem skip Hive.
Agent quality depends entirely on individual author skill. No mechanism to propagate proven patterns.
Worker agents are unreliable during long-running or batch processing sessions — no built-in operational discipline.
The self-improvement loop's output (better prompts, better patterns) stays locked in individual deployments with no pathway to contribute back.

3. Goals & Success Criteria

3.1 Primary Goals

#	Goal	Metric
G1	Any `SKILL.md` from the Agent Skills ecosystem works in Hive with zero modifications	Compatibility test suite against `github.com/anthropics/skills` example skills
G2	A Hive skill works in Claude Code, Cursor, and other compatible products with zero modifications	Cross-product verification on 5+ skills
G3	A user can install and use a community skill in under 2 minutes	Time from `hive skill install X` to skill activating in a session
G4	A contributor can publish a skill in under 10 minutes	Time from `hive skill init` to PR submission
G5	Default skills measurably improve agent reliability on batch processing tasks	A/B comparison: agents with default skills vs. without on 10+ batch scenarios
G6	Zero breaking changes to existing agent configurations	All current agents continue to work unchanged

3.2 Community & Ecosystem Goals

#	Goal	Metric
G7	Registry has 100+ community skills within 30 days of launch	Skill count in registry
G8	All registry skills are portable Agent Skills packages — usable in any compatible product	100% of registry entries conform to the standard
G9	Bounty program integrates with skill contributions	Skill submissions tracked in bounty-tracker
G10	Contributors receive attribution when their skills are used	Skill metadata includes author; agent logs credit loaded skills
G11	Existing skills from `github.com/anthropics/skills` are installable via `hive skill install`	All example skills pass validation and activate correctly

3.3 Non-Goals (Explicit Exclusions)

Forking or extending the Agent Skills standard — Hive implements the spec faithfully. No proprietary sidecar files, no Hive-specific schema extensions.
Runtime skill marketplace — no billing, licensing, or monetization. The registry is free and open-source.
Hosting skill execution — the registry stores packages; execution happens locally.
AI-generated skills — automatic skill generation from natural language is a future phase.
Graph-level skill composition — skills are instruction-following units, not graph fragments. Agents compose skills by activating multiple skills and following their combined instructions.

4. Agent Skills Standard — Implementation Spec

This section defines how Hive implements the open Agent Skills standard. The specification at agentskills.io/specification is authoritative; this section describes Hive's conforming implementation.

4.1 Skill Discovery

At session startup, Hive scans for skill directories containing a SKILL.md file. Both cross-client and Hive-specific locations are scanned:

Scope	Path	Purpose
Project	`<project>/.agents/skills/`	Cross-client interoperability (standard convention)
Project	`<project>/.hive/skills/`	Hive-specific project skills
User	`~/.agents/skills/`	Cross-client user-level skills
User	`~/.hive/skills/`	Hive-specific user-level skills
Framework	`<hive-install>/skills/defaults/`	Built-in default skills

Precedence (deterministic): Project-level skills override user-level skills. Within the same scope, .hive/skills/ overrides .agents/skills/. Framework-level default skills have lowest precedence and can be overridden at any scope.
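
The precedence rule can be read as a sort order. The following is a minimal sketch, assuming each discovered skill is tagged with its scope and the directory family it came from. `Candidate`, `RANK`, and `resolve_collisions` are hypothetical names for illustration, not Hive's actual API.

```python
from dataclasses import dataclass

# Lower rank wins. Order follows the precedence rule above:
# project before user before framework; .hive before .agents within a scope.
RANK = {
    ("project", ".hive"): 0,
    ("project", ".agents"): 1,
    ("user", ".hive"): 2,
    ("user", ".agents"): 3,
    ("framework", "defaults"): 4,
}


@dataclass(frozen=True)
class Candidate:
    name: str
    scope: str     # "project", "user", or "framework"
    family: str    # ".hive", ".agents", or "defaults"
    location: str  # path to SKILL.md


def resolve_collisions(candidates):
    """Return ({name: winning Candidate}, [shadowed candidates])."""
    winners, shadowed = {}, []
    for cand in sorted(candidates, key=lambda c: RANK[(c.scope, c.family)]):
        if cand.name in winners:
            shadowed.append(cand)  # a caller would log a warning here
        else:
            winners[cand.name] = cand
    return winners, shadowed
```

Returning the shadowed candidates alongside the winners lets the caller emit the collision warning that AS-12 asks for.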

Scanning rules:

Skip .git/, node_modules/, __pycache__/, .venv/ directories
Max depth: 4 levels from the skills root
Max directories: 2000 per scope
Respect .gitignore in project scope
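
A bounded directory walk covering the skip list and the two caps might look like the sketch below. It rests on two assumptions: "4 levels" is read as "directories up to four levels below the skills root are visited", and `.gitignore` handling is omitted for brevity. `find_skill_files` is a hypothetical helper name.

```python
import os

SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv"}
MAX_DEPTH = 4
MAX_DIRS = 2000


def find_skill_files(root, max_depth=MAX_DEPTH, max_dirs=MAX_DIRS):
    """Yield paths to SKILL.md files under root, honoring the skip list and caps."""
    root = os.path.abspath(root)
    visited = 0
    for dirpath, dirnames, filenames in os.walk(root):
        visited += 1
        if visited > max_dirs:
            break
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        # Prune in place so os.walk never descends into skipped or too-deep dirs.
        if depth >= max_depth:
            dirnames[:] = []
        else:
            dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        if "SKILL.md" in filenames:
            yield os.path.join(dirpath, "SKILL.md")
```

Sorting the child directories keeps the scan order stable across runs, which makes the directory cap deterministic as well.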

Trust: Project-level skills from untrusted repositories (not marked trusted by the user) require explicit user consent before loading.

4.2 `SKILL.md` Parsing

Each discovered SKILL.md is parsed per the standard:

Extract YAML frontmatter between --- delimiters
Parse required fields: name, description
Parse optional fields: license, compatibility, metadata, allowed-tools
Everything after the closing --- is the skill's markdown body (instructions)

Validation (lenient):

Name doesn't match parent directory → warn, load anyway
Name exceeds 64 characters → warn, load anyway
Description missing or empty → skip the skill, log error
YAML unparseable → retry after wrapping unquoted colon values in quotes; if parsing still fails, skip the skill and log an error

In-memory record per skill:

Field	Source
`name`	Frontmatter
`description`	Frontmatter
`location`	Absolute path to `SKILL.md`
`base_dir`	Parent directory of `SKILL.md`
`source_scope`	`project`, `user`, or `framework`
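
The parsing and lenient-validation rules can be sketched with the standard library alone. This is an illustration under stated assumptions, not Hive's parser: `parse_flat_frontmatter` handles only top-level `key: value` lines (a real implementation would use a full YAML parser), and `SkillRecord` and `load_skill` are hypothetical names. Splitting on the first colon is what lets values containing unquoted colons through.

```python
import re
from dataclasses import dataclass
from pathlib import Path

FRONTMATTER = re.compile(r"\A---\s*\n(.*?)\n---\s*\n?(.*)\Z", re.DOTALL)


@dataclass
class SkillRecord:
    name: str
    description: str
    location: str
    base_dir: str
    source_scope: str
    body: str


def parse_flat_frontmatter(text):
    """Parse top-level `key: value` lines only (simplified stand-in for YAML)."""
    fields = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith((" ", "\t", "#")):
            continue  # skip blanks, comments, and nested/continuation lines
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip("\"'")
    return fields


def load_skill(skill_md, source_scope, warn=print):
    """Return a SkillRecord, or None when the skill must be skipped."""
    path = Path(skill_md).resolve()
    match = FRONTMATTER.match(path.read_text(encoding="utf-8"))
    if match is None:
        warn(f"{path}: no frontmatter found, skipping")
        return None
    fields = parse_flat_frontmatter(match.group(1))
    description = fields.get("description", "")
    if not description:
        warn(f"{path}: description missing or empty, skipping")
        return None
    name = fields.get("name", "")
    if name != path.parent.name:
        warn(f"{path}: name {name!r} does not match directory, loading anyway")
    if len(name) > 64:
        warn(f"{path}: name exceeds 64 characters, loading anyway")
    return SkillRecord(name, description, str(path), str(path.parent),
                       source_scope, match.group(2))
```

Passing `warn` as a parameter keeps the sketch testable; a real loader would route these messages to the framework's logger.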

4.3 Progressive Disclosure

Hive implements the standard three-tier loading model:

Tier	What's loaded	When	Token cost
1. Catalog	Name + description per skill	Session start	~50-100 tokens per skill
2. Instructions	Full `SKILL.md` body	When skill is activated	<5000 tokens recommended
3. Resources	Scripts, references, assets	When instructions reference them	Varies

Catalog disclosure: At session start, all discovered skill names and descriptions are injected into the system prompt:

<available_skills>
  <skill>
    <name>deep-research</name>
    <description>Multi-step web research with source verification. Use when the task requires gathering and synthesizing information from multiple sources.</description>
    <location>/home/user/.hive/skills/deep-research/SKILL.md</location>
  </skill>
  ...
</available_skills>
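
Rendering the catalog is a small string-building step. A minimal sketch follows, assuming skills arrive as `(name, description, location)` tuples; `render_catalog` is a hypothetical helper. Escaping matters because descriptions are free text and may contain `<` or `&`.

```python
from xml.sax.saxutils import escape


def render_catalog(skills):
    """Build the <available_skills> block from (name, description, location) tuples."""
    lines = ["<available_skills>"]
    for name, description, location in skills:
        lines += [
            "  <skill>",
            f"    <name>{escape(name)}</name>",
            f"    <description>{escape(description)}</description>",
            f"    <location>{escape(location)}</location>",
            "  </skill>",
        ]
    lines.append("</available_skills>")
    return "\n".join(lines)
```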

Behavioral instruction injected alongside the catalog:

## Skills (mandatory)
Before replying: scan <available_skills> <description> entries.
- If exactly one skill clearly applies: read its SKILL.md at <location> with `read_file`, then follow it.
- If multiple could apply: choose the most specific one, then read/follow it.
- If none clearly apply: do not read any SKILL.md.
- When a selected skill references a relative path, resolve it against the
  skill directory (parent of SKILL.md) and use that absolute path in tool commands.

4.4 Skill Activation

Skills are activated via two mechanisms:

Model-driven: The agent reads the skill catalog, decides a skill is relevant, and reads the SKILL.md file using its file-read tool. No special infrastructure needed — the agent's standard file-reading capability is sufficient.

User-driven: Users can activate skills explicitly via @skill-name mention syntax or via agent configuration that pre-activates specific skills for every session.

What happens on activation:

The full SKILL.md body is loaded into context
Bundled resources (scripts, references) are listed but NOT eagerly loaded
The skill directory is allowlisted for file access (no permission prompts for bundled files)
Activation is logged: {skill_name, scope, timestamp}

Deduplication: If a skill is already active in the current session, re-activation is skipped.

Context protection: Activated skill content is exempt from context pruning/compaction — skill instructions are durable behavioral guidance that must persist for the session duration.
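
Activation bookkeeping (load the body once, log it, and skip repeats) can be sketched as below. `SkillSession` and its injected `read_body` callable are hypothetical. How active content is actually exempted from pruning is framework-specific and not shown.

```python
from datetime import datetime, timezone


class SkillSession:
    """Tracks which skills are active in one session."""

    def __init__(self, read_body):
        self._read_body = read_body  # callable: location -> SKILL.md body
        self.active = {}             # name -> body kept for the session
        self.log = []                # activation records

    def activate(self, name, scope, location):
        """Load a skill once; return False if it was already active."""
        if name in self.active:
            return False             # deduplicate re-activation
        self.active[name] = self._read_body(location)
        self.log.append({
            "skill_name": name,
            "scope": scope,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
        return True
```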

4.5 Skill Execution

The agent follows the instructions in SKILL.md. It can:

Execute bundled scripts from scripts/
Read reference materials from references/
Use assets from assets/
Call any MCP tools available in the agent's tool registry

This is identical to how skills work in Claude Code, Cursor, or any other Agent Skills-compatible product.

4.6 Pre-Activated Skills

Agents can declare skills that should be activated at session start — bypassing model-driven activation. This is useful for skills that an agent always needs (e.g., a coding standards skill for a code review agent).

In agent config (agent.json):

{
  "skills": ["deep-research", "code-review"]
}

In Python:

agent = Agent(
    name="my-agent",
    skills=["deep-research", "code-review"],
)

Pre-activated skills have their full SKILL.md body loaded into context at session start (tier 2), skipping the catalog-only tier 1 phase.

5. Default Skills

Default skills are built-in skills shipped with the Hive framework that every worker agent loads automatically. They use the Agent Skills format (SKILL.md) but live in the framework's install directory and serve as runtime operational protocols.

5.1 Why Default Skills

The framework provides mechanical safeguards: stall detection via n-gram similarity, doom-loop fingerprinting, checkpoint/resume, token budget pruning, and max iteration limits. But these are reactive — they trigger after something has gone wrong.

Default skills encode proactive cognitive protocols: how to take structured notes so you don't lose track of a 50-item batch, when to pause and summarize before you hit context limits, how to self-assess whether your output quality is degrading. They are the operational habits that experienced agent builders already encode in their system prompts — standardized so every agent benefits.

5.2 Integration Model

Default skills differ from community skills in how they integrate:

Aspect	Default Skills	Community Skills
Loaded by	Framework automatically	Agent decides at runtime (or pre-activated in config)
Integration	System prompt injection + shared buffer hooks	Instruction-following (standard Agent Skills)
Graph impact	No dedicated nodes — woven into existing nodes	None (just context)
Overridable	Yes (disable, configure, or replace)	N/A

Default skills integrate at four injection points in the EventLoopNode:

System prompt injection (before first LLM call): Default skill protocols are appended to the node's system prompt
Iteration boundary callbacks (between iterations): Quality check, notes staleness warning, budget tracking
Node completion hooks (when node finishes): Batch completeness check, handoff summary
Phase transition hooks (on edge traversal): Context carry-over, notes persistence

5.3 Default Skill Catalog

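Six default skills ship with Hive (see DS-1). The five runtime protocols are specified below; `hive.writing-hive-skills` is listed in DS-1 but not detailed in this section.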

5.3.1 Structured Note-Taking (`hive.note-taking`)

Purpose: Maintain a structured working document throughout execution so the agent never loses track of what it knows, what it's decided, and what's pending.

Problem: Without structured notes, agents processing long sessions rely entirely on conversation history. When context is pruned (automatically at 60% token usage), intermediate reasoning is lost. Agents repeat work, contradict earlier decisions, or silently drop items.

Protocol (injected into system prompt):

## Operational Protocol: Structured Note-Taking

Maintain structured working notes in shared buffer key `_working_notes`.
Update at these checkpoints:

- After completing each discrete subtask or batch item
- After receiving new information that changes your plan
- Before any tool call that will produce substantial output

Structure:

### Objective — restate the goal

### Current Plan — numbered steps, mark completed with ✓

### Key Decisions — decisions made and WHY

### Working Data — intermediate results, extracted values

### Open Questions — uncertainties to verify

### Blockers — anything preventing progress

Update incrementally — do not rewrite from scratch each time.

Shared memory: _working_notes (string), _notes_updated_at (timestamp)

Config: enabled (default true), update_frequency (default per_subtask), max_notes_length (default 4000 chars)

5.3.2 Colony Progress Tracker (`hive.colony-progress-tracker`)

Purpose: When workers in a colony share a queue of tasks, claim/complete them through a per-colony SQLite ledger (progress.db) so no item is skipped, duplicated, or silently dropped — across workers, runs, and crashes.

Problem: Agents processing batches lose track of which items they've handled, especially after context compaction, checkpoint resume, or worker hand-off. In-memory ledgers don't survive crashes and don't synchronize across parallel workers.

Background: Replaces the older in-memory _batch_ledger (and _working_notes → Current Plan decomposition) — both were removed on 2026-04-15 because they duplicated state that belongs in SQLite. The queue, per-task steps decomposition, and sop_checklist hard-gates now all live in progress.db and are authoritative.

Protocol (injected into system prompt): Workers receive db_path and colony_id (and optionally task_id) in their spawn message and interact with the ledger via sqlite3 through terminal_exec. The full claim → load plan → execute step → SOP-gate → mark done loop is documented in the skill's SKILL.md.

Tables:

tasks — queue: pending → claimed → done|failed, with worker_id and atomic claim tokens
steps — per-task decomposition with status and evidence
sop_checklist — hard gates that must be checked off before a task can be marked done
colony_meta — colony-level metadata

Config: enabled (default true). Concurrency is handled by SQLite WAL mode + BEGIN IMMEDIATE claims; no checkpoint frequency knob.
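
The claim pattern named above (WAL mode plus `BEGIN IMMEDIATE`) can be illustrated with Python's standard `sqlite3` module. This is a sketch only: workers are described as driving `sqlite3` through `terminal_exec`, and the `tasks` schema here is a reduced, hypothetical version without claim tokens, steps, or SOP gates.

```python
import sqlite3


def open_ledger(db_path):
    # isolation_level=None puts the connection in autocommit mode so that
    # BEGIN IMMEDIATE / COMMIT can be issued explicitly.
    conn = sqlite3.connect(db_path, isolation_level=None, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        " id INTEGER PRIMARY KEY,"
        " payload TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " worker_id TEXT)"
    )
    return conn


def claim_next(conn, worker_id):
    """Atomically claim one pending task; return (id, payload) or None."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, payload FROM tasks WHERE status = 'pending' "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE tasks SET status = 'claimed', worker_id = ? WHERE id = ?",
                (worker_id, row[0]),
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Because `BEGIN IMMEDIATE` acquires the write lock up front, two workers cannot both read the same pending row and then both update it; the second worker waits (up to `timeout`) and then sees the row as already claimed.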

5.3.3 Context Preservation (`hive.context-preservation`)

Purpose: Proactively preserve critical information before automatic context pruning destroys it.

Problem: The framework's prune_old_tool_results() at 60% token usage removes content indiscriminately. Agents that don't proactively save important data into working notes lose it permanently.

Protocol (injected into system prompt):

## Operational Protocol: Context Preservation

You operate under a finite context window. Important information WILL be pruned.

Save-As-You-Go: After any tool call producing information you'll need later,
immediately extract key data into `_working_notes` or `_preserved_data`.
Do NOT rely on referring back to old tool results.

What to extract: URLs and key snippets (not full pages), relevant API fields
(not raw JSON), specific lines/values (not entire files), analysis results
(not raw data).

Before transitioning to the next phase/node, write a handoff summary to
`_handoff_context` with everything the next phase needs to know.

Shared memory: _handoff_context (string), _preserved_data (dict)

Config: enabled (default true), warn_at_usage_ratio (default 0.45), require_handoff (default true)
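
The relationship between the warning threshold and the framework's prune threshold can be shown in a few lines. `context_status` is a hypothetical helper; the 0.45 and 0.60 defaults come from this section.

```python
def context_status(used_tokens, window_tokens, warn_at=0.45, prune_at=0.60):
    """Classify context usage relative to the warn and prune thresholds."""
    ratio = used_tokens / window_tokens
    if ratio >= prune_at:
        return "prune"
    if ratio >= warn_at:
        return "warn"
    return "ok"
```

The gap between the two thresholds is the window in which the agent is expected to move important data into `_working_notes` or `_preserved_data`.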

5.3.4 Quality Self-Assessment (`hive.quality-monitor`)

Purpose: Periodically prompt the agent to self-evaluate output quality, catching degradation before the judge does.

Problem: The judge system evaluates at node completion — once per node, not during execution. An agent can degrade gradually over many iterations without detection until the node completes.

Protocol (injected into system prompt):

## Operational Protocol: Quality Self-Assessment

Every 5 iterations, self-assess:

1. On-task? Still working toward the stated objective?
2. Thorough? Cutting corners compared to earlier?
3. Non-repetitive? Producing new value or rehashing?
4. Consistent? Does the latest output contradict earlier decisions?
5. Complete? Tracking all items, or silently dropped some?

If degrading: write assessment to `_quality_log`, re-read `_working_notes`,
change approach explicitly. If acceptable: brief note in `_quality_log`.

Shared memory: _quality_log (list), _quality_degradation_count (int)

Config: enabled (default true), assessment_interval (default 5), degradation_threshold (default 3)

5.3.5 Error Recovery Protocol (`hive.error-recovery`)

Purpose: When a tool call fails or returns unexpected results, follow a structured recovery protocol instead of blindly retrying or giving up.

Problem: The framework retries transient errors automatically. But non-transient failures (wrong input, business logic error, missing resource) are handed back to the agent with no guidance. Agents often retry the same call or abandon the task.

Protocol (injected into system prompt):

## Operational Protocol: Error Recovery

When a tool call fails:

1. Diagnose — record error in notes, classify as transient or structural
2. Decide — transient: retry once. Structural fixable: fix and retry.
   Structural unfixable: record as failed, move to next item.
   Blocking all progress: record escalation note.
3. Adapt — if same tool failed 3+ times, stop using it and find alternative.
   Update plan in notes. Never silently drop the failed item.

Shared memory: _error_log (list), _failed_tools (dict), _escalation_needed (bool)

Config: enabled (default true), max_retries_per_tool (default 3), escalation_on_block (default true)
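
The protocol is followed by the model rather than executed as code, but writing its decision table out can help when building evals for it. A sketch, with hypothetical action labels and a hypothetical `recovery_action` helper:

```python
from collections import Counter

ACTIONS = {
    "transient": "retry_once",
    "fixable": "fix_and_retry",
    "unfixable": "record_failed_and_continue",
    "blocking": "record_escalation",
}


def recovery_action(tool, kind, failures, max_failures=3):
    """Pick the next step after a failed tool call.

    failures is a Counter of failures per tool, updated in place.
    """
    failures[tool] += 1
    if failures[tool] >= max_failures:
        return "find_alternative"  # the Adapt step: stop using this tool
    return ACTIONS[kind]
```

Checking the failure count before the error class mirrors the Adapt step, which applies regardless of why the tool failed.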

5.4 Default Skill Configuration

Agents configure default skills via default_skills in their agent definition:

Declarative (agent.json):

{
  "default_skills": {
    "hive.note-taking": { "enabled": true },
    "hive.colony-progress-tracker": { "enabled": true },
    "hive.context-preservation": {
      "enabled": true,
      "warn_at_usage_ratio": 0.4
    },
    "hive.quality-monitor": { "enabled": false },
    "hive.error-recovery": { "enabled": true }
  }
}

Disable all: "default_skills": {"_all": {"enabled": false}}
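
One way to resolve this configuration is to layer three sources: framework defaults, then the `_all` entry, then per-skill entries. The ordering is an assumption (the text does not say whether a per-skill entry can override `_all`), and `DEFAULTS` and `resolve_default_skills` are hypothetical names. The default values are copied from the Config lines in section 5.3.

```python
DEFAULTS = {
    "hive.note-taking": {"enabled": True},
    "hive.colony-progress-tracker": {"enabled": True},
    "hive.context-preservation": {"enabled": True, "warn_at_usage_ratio": 0.45},
    "hive.quality-monitor": {"enabled": True, "assessment_interval": 5},
    "hive.error-recovery": {"enabled": True, "max_retries_per_tool": 3},
}


def resolve_default_skills(user_config):
    """Merge framework defaults, the `_all` entry, then per-skill overrides."""
    user_config = user_config or {}
    blanket = user_config.get("_all", {})
    resolved = {}
    for name, base in DEFAULTS.items():
        # Later dicts win: per-skill settings override `_all` (assumed order).
        resolved[name] = {**base, **blanket, **user_config.get(name, {})}
    return resolved
```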

5.5 Prompt Budget

All default skill protocols combined must total under 2000 tokens to minimize impact on the agent's domain reasoning budget. Protocols are terse operational checklists, not verbose documentation.

5.6 Shared Memory Convention

All default skill shared buffer keys use the _ prefix (_working_notes, _preserved_data, etc.) to avoid collisions with domain-level keys. These keys are:

Visible to the agent (for self-reference)
Visible to the judge (for evaluation context)
Excluded from the agent's declared output contract (operational, not domain output)
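
The prefix convention makes it cheap to separate operational state from domain output. A minimal sketch with a hypothetical `split_buffer` helper:

```python
def split_buffer(shared_buffer):
    """Separate operational (`_`-prefixed) keys from domain output keys."""
    operational = {k: v for k, v in shared_buffer.items() if k.startswith("_")}
    domain = {k: v for k, v in shared_buffer.items() if not k.startswith("_")}
    return operational, domain
```

A judge could be handed both dictionaries, while output-contract validation would look only at the second.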

6. Community Registry

6.1 Registry Repository

A public GitHub repository (hive-skill-registry) serves as the curated community index. Every entry is a standard Agent Skills package — portable to any compatible product.

hive-skill-registry/
├── registry/
│   ├── skills/
│   │   ├── deep-research/
│   │   │   ├── SKILL.md
│   │   │   ├── scripts/
│   │   │   ├── references/
│   │   │   ├── evals/
│   │   │   └── README.md
│   │   ├── email-triage/
│   │   └── ...
│   ├── packs/
│   │   ├── research-pack.json
│   │   └── ...
│   └── _template/
├── skill_index.json               (auto-generated)
├── CONTRIBUTING.md
└── README.md

6.2 Trust Tiers

Tier	Meaning	Requirements
`official`	Maintained by Hive team	Internal review
`verified`	Audited community contribution	Code audit, maintainer SLA, test coverage
`community`	Community-submitted	Passes CI validation, maintainer review on PR

6.3 Registry Index

The registry auto-generates a skill_index.json on merge for client consumption:

{
  "name": "deep-research",
  "description": "Multi-step web research with source verification...",
  "status": "verified",
  "author": { "name": "Alex Researcher", "github": "alexr" },
  "maintainer": { "github": "alexr" },
  "version": "1.2.0",
  "license": "MIT",
  "tags": ["research", "web", "synthesis"],
  "categories": ["knowledge-work"],
  "install_count": 342,
  "last_validated_at": "2026-03-13T10:00:00Z",
  "deprecated": false
}

6.4 Starter Packs

Themed collections of skills that work well together:

{
  "name": "research-pack",
  "display_name": "Research & Analysis Pack",
  "description": "Skills for research-heavy agents",
  "skills": [
    { "name": "deep-research", "version": ">=1.0.0" },
    { "name": "synthesis", "version": ">=1.0.0" },
    { "name": "executive-summary", "version": ">=1.0.0" }
  ]
}
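
Installing a pack means checking each entry against the registry index. The sketch below assumes the index is available as a list of entries shaped like the example in 6.3 and supports only the `>=X.Y.Z` constraint form shown above. `satisfies` and `resolve_pack` are hypothetical helpers.

```python
def parse_version(text):
    """Turn 'X.Y.Z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, constraint):
    """Supports only the `>=X.Y.Z` form used in the pack example."""
    if not constraint.startswith(">="):
        raise ValueError(f"unsupported constraint: {constraint}")
    return parse_version(version) >= parse_version(constraint[2:])


def resolve_pack(pack, index):
    """Return (installable, problems) for a pack against index entries."""
    by_name = {entry["name"]: entry for entry in index}
    installable, problems = [], []
    for want in pack["skills"]:
        entry = by_name.get(want["name"])
        if entry is None:
            problems.append(f"{want['name']}: not in index")
        elif not satisfies(entry["version"], want["version"]):
            problems.append(
                f"{want['name']}: {entry['version']} does not satisfy {want['version']}"
            )
        else:
            installable.append((entry["name"], entry["version"]))
    return installable, problems
```

Comparing integer tuples rather than strings avoids ordering `1.10.0` before `1.9.0`.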

6.5 Evaluation Framework

Skills in the registry can include an evals/ directory following the Agent Skills evaluation pattern:

{
  "skill_name": "deep-research",
  "evals": [
    {
      "id": 1,
      "prompt": "Research the current state of quantum computing and summarize the top 3 breakthroughs from the past year.",
      "expected_output": "A structured summary with 3 breakthroughs, each with source citations.",
      "assertions": [
        "Output includes at least 3 distinct breakthroughs",
        "Each breakthrough has at least one source URL",
        "Sources are from the past 12 months"
      ]
    }
  ]
}

CI runs these evals on submitted skills to validate quality.

6.6 Bounty Integration

Contribution	Points
New skill	75
Skill improvement PR	30
Skill tests/evals	20
Skill docs	20

7. Requirements

7.1 Functional Requirements — Agent Skills Standard

ID	Requirement	Priority
AS-1	Discover skills by scanning `.agents/skills/` and `.hive/skills/` at project and user scopes	P0
AS-2	Parse `SKILL.md` YAML frontmatter per the Agent Skills spec: `name`, `description` (required), `license`, `compatibility`, `metadata`, `allowed-tools` (optional)	P0
AS-3	Lenient validation: warn on non-critical issues, skip only on missing description or unparseable YAML	P0
AS-4	Progressive disclosure tier 1: skill catalog (name + description + location) injected into system prompt at session start	P0
AS-5	Progressive disclosure tier 2: full `SKILL.md` body loaded into context when agent or user activates a skill	P0
AS-6	Progressive disclosure tier 3: scripts, references, and assets loaded on demand when instructions reference them	P0
AS-7	Model-driven activation: agent reads `SKILL.md` via file-read tool when it decides a skill is relevant	P0
AS-8	User-driven activation: `@skill-name` mention syntax intercepted by harness	P1
AS-9	Skill directories allowlisted for file access — no permission prompts for bundled resources	P0
AS-10	Activated skill content protected from context pruning/compaction	P0
AS-11	Duplicate activations in the same session deduplicated	P1
AS-12	Name collisions resolved deterministically: project overrides user, `.hive/` overrides `.agents/`, log warning	P0
AS-13	Trust gating: project-level skills from untrusted repos require user consent	P1
AS-14	Compatibility with `github.com/anthropics/skills` example skills — all pass validation and activate correctly	P0
AS-15	Cross-client YAML compatibility: handle unquoted colon values via automatic fixup	P1
AS-16	Pre-activated skills via `skills` list in agent config (`agent.json` and Python API)	P0
AS-17	Subagent delegation: optionally run a skill's instructions in an isolated sub-session	P2

7.2 Functional Requirements — Default Skills

ID	Requirement	Priority
DS-1	Ship default skills: `hive.note-taking`, `hive.colony-progress-tracker`, `hive.context-preservation`, `hive.quality-monitor`, `hive.error-recovery`, `hive.writing-hive-skills`	P0
DS-2	Default skills are valid Agent Skills packages (`SKILL.md` format) in the framework install directory	P0
DS-3	All default skills loaded automatically for every worker agent unless explicitly disabled	P0
DS-4	Default skills integrate via system prompt injection — no additional graph nodes	P0
DS-5	Default skills use `_`-prefixed shared buffer keys to avoid domain collisions	P0
DS-6	Each default skill independently configurable via `default_skills` in agent config	P0
DS-7	All defaults disableable at once: `{"_all": {"enabled": false}}`	P0
DS-8	Default skill protocols appended in a `## Operational Protocols` system prompt section	P0
DS-9	Iteration boundary callbacks for quality check and notes staleness	P0
DS-10	Node completion hooks for batch completeness and handoff write	P0
DS-11	Phase transition hooks for context carry-over and notes persistence	P1
DS-13	`hive.context-preservation` warns at a token-usage ratio of 0.45 (before the framework prunes at 0.6)	P0
DS-14	Combined default skill prompts total under 2000 tokens	P0
DS-15	Agent startup logs active default skills and config	P0

7.3 Functional Requirements — CLI

ID	Requirement	Priority
CLI-1	`hive skill list` — list discovered skills (all scopes) with source and status	P0
CLI-2	`hive skill install <name> [--version X]` — install from registry to `~/.hive/skills/`	P0
CLI-3	`hive skill install --pack <name>` — install a starter pack	P1
CLI-4	`hive skill remove <name>` — uninstall	P0
CLI-5	`hive skill search <query>` — search registry by name, tag, description	P1
CLI-6	`hive skill info <name>` — show details: description, author, scripts, references	P0
CLI-7	`hive skill init [--name X]` — scaffold a skill directory with `SKILL.md` template	P0
CLI-8	`hive skill validate <path>` — validate `SKILL.md` against the Agent Skills spec	P0
CLI-9	`hive skill test <path> [--input <json>]` — run skill in isolation, execute evals if present	P1
CLI-10	`hive skill doctor [name]` — check health: SKILL.md parseable, scripts executable, deps available	P0
CLI-11	`hive skill doctor --defaults` — check all default skills operational	P1
CLI-12	`hive skill fork <name> [--name new-name]` — create local editable copy of a registry skill	P1
CLI-13	`hive skill update [name]` — update registry cache or specific skill	P1

7.4 Functional Requirements — Registry

ID	Requirement	Priority
REG-1	Public GitHub repo with defined directory structure	P0
REG-2	CI validates `SKILL.md` on every PR using `skills-ref validate`	P0
REG-3	Flat index (`skill_index.json`) auto-generated on merge	P0
REG-4	`_template/` directory with starter skill for contributors	P0
REG-5	`CONTRIBUTING.md` with step-by-step submission guide	P0
REG-6	CI runs skill evals when `evals/` directory is present	P1
REG-7	Trust tiers: `official`, `verified`, `community`	P0
REG-8	Tags follow controlled taxonomy	P1
REG-9	Seed with 10+ skills: extract from existing templates + port from `github.com/anthropics/skills`	P0
REG-10	Starter pack definitions in `registry/packs/`	P1

7.5 Failure Handling & Diagnostics

ID	Requirement	Priority
DX-1	Structured error codes: `SKILL_NOT_FOUND`, `SKILL_PARSE_ERROR`, `SKILL_ACTIVATION_FAILED`	P0
DX-2	Every error includes: what failed, why, and suggested fix	P0
DX-3	Agent startup logs per-skill summary: `{name, scope, status}`	P0
DX-4	`hive skill doctor` machine-parseable with `--json` flag	P2
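
DX-1 and DX-2 together suggest an error type that carries a code plus the three explanatory fields. A sketch follows; the class and field names are hypothetical, and only the three error codes come from the table above.

```python
from enum import Enum


class SkillErrorCode(str, Enum):
    SKILL_NOT_FOUND = "SKILL_NOT_FOUND"
    SKILL_PARSE_ERROR = "SKILL_PARSE_ERROR"
    SKILL_ACTIVATION_FAILED = "SKILL_ACTIVATION_FAILED"


class SkillError(Exception):
    """Error carrying what failed, why, and a suggested fix."""

    def __init__(self, code, what, why, fix):
        super().__init__(f"[{code.value}] {what}: {why}. Suggested fix: {fix}")
        self.code = code
        self.what = what
        self.why = why
        self.fix = fix

    def to_dict(self):
        """Machine-readable form, e.g. for a --json diagnostic output."""
        return {
            "code": self.code.value,
            "what": self.what,
            "why": self.why,
            "fix": self.fix,
        }
```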

7.6 Non-Functional Requirements

ID	Requirement	Priority
NFR-1	Skill discovery (scanning + parsing) completes in <500ms for up to 50 skills	P1
NFR-2	Installing a skill does not require a Hive restart	P0
NFR-3	All new code has unit test coverage	P0
NFR-4	Registry CI runs in <120s	P1
NFR-5	`hive skill install` prints security notice on first use	P0
NFR-6	Skills loaded at runtime are read-only — modifications require forking	P0

8. Architecture Overview

                    ┌───────────────────────────────────────┐
                    │     hive-skill-registry (GitHub)      │
                    │                                       │
                    │  registry/skills/deep-research/       │
                    │    ├── SKILL.md                       │
                    │    ├── scripts/                       │
                    │    └── evals/                         │
                    │  registry/packs/research-pack.json    │
                    │  skill_index.json (auto-built)        │
                    └──────────────┬────────────────────────┘
                                   │  hive skill install
                                   ▼
┌──────────────────────────────────────────────────────────────────────┐
│                           Skill Sources                              │
│                                                                      │
│  ~/.hive/skills/           .agents/skills/       <hive>/skills/     │
│  (user, Hive-specific)     (project, cross-      defaults/          │
│                             client portable)      (framework built-  │
│                                                    in defaults)      │
└──────────────────────┬───────────────────────────────────────────────┘
                       │
                       ▼
              ┌────────────────────┐
              │   SkillDiscovery   │
              │                    │
              │ scan() → catalog   │
              │ parse SKILL.md     │
              │ resolve collisions │
              └────────┬───────────┘
                       │
           ┌───────────┴───────────┐
           │                       │
           ▼                       ▼
  ┌──────────────────┐   ┌───────────────────────┐
  │ Community Skills │   │ Default Skills         │
  │                  │   │                        │
  │ Catalog injected │   │ DefaultSkillManager    │
  │ into system      │   │ • prompt injection     │
  │ prompt (tier 1)  │   │ • iteration hooks      │
  │                  │   │ • completion hooks      │
  │ Activated on     │   │ • transition hooks      │
  │ demand (tier 2)  │   │                        │
  │                  │   │ Always active           │
  │ Agent follows    │   │ (unless disabled)       │
  │ SKILL.md         │   │                        │
  │ instructions     │   │ Protocols woven into   │
  │                  │   │ existing node prompts   │
  └──────────────────┘   └───────────────────────┘
           │                       │
           └───────────┬───────────┘
                       │
                       ▼
              ┌────────────────────┐
              │   EventLoopNode    │
              │                    │
              │ System prompt =    │
              │   agent prompt     │
              │ + node prompt      │
              │ + default skill    │
              │   protocols        │
              │ + activated skill  │
              │   instructions     │
              │                    │
              │ Same iteration     │
              │ loop, tools,       │
              │ judges             │
              └────────────────────┘
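
The prompt assembly shown in the `EventLoopNode` box can be sketched as a simple ordered concatenation. The function name `compose_system_prompt`, its parameters, and the blank-line separator are assumptions for illustration; only the ordering of the four parts comes from the diagram.

```python
from typing import Sequence


def compose_system_prompt(
    agent_prompt: str,
    node_prompt: str,
    default_protocols: Sequence[str],
    activated_instructions: Sequence[str],
) -> str:
    """Join the four prompt parts in the order shown in the diagram."""
    parts = [
        agent_prompt,
        node_prompt,
        *default_protocols,       # default skill protocols
        *activated_instructions,  # bodies of skills activated on demand
    ]
    # Drop empty parts so disabled defaults leave no stray blank sections.
    return "\n\n".join(p.strip() for p in parts if p and p.strip())
```

With no default protocols and no activated skills, the result reduces to the agent prompt plus the node prompt, which matches the additive behavior described for existing agents.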

Component Responsibilities

Component	Responsibility
SkillDiscovery	Scan skill directories, parse `SKILL.md`, resolve collisions, build catalog
SkillCatalog	In-memory index of discovered skills; injected into system prompt at session start
DefaultSkillManager	Load, configure, and inject the 6 built-in default skills; manage prompt injection and hook registration
EventLoopNode (extended)	New hook points for default skills: iteration callbacks, node completion hooks, and phase transition hooks. Appends default skill protocols and activated skill content to the system prompt.
AgentRunner (extended)	Resolve `skills` (pre-activation) and `default_skills` config; trigger discovery; log skill summary at startup
hive skill CLI	User-facing commands for install, search, validate, test, doctor
hive-skill-registry (GitHub)	Community-curated skill packages; CI validation; trust tiers; starter packs
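
The `SkillDiscovery` responsibility (scan, parse, resolve collisions, build catalog) could look roughly like the sketch below. Several parts are assumptions rather than specified behavior: the `SkillEntry` type, the "first source listed wins" collision rule, and the minimal frontmatter parser, which only handles flat `key: value` pairs and stands in for a real YAML parser.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class SkillEntry:
    """Hypothetical catalog record (tier 1 metadata only)."""
    name: str
    description: str
    scope: str
    path: Path


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Parse flat `key: value` lines between two `---` delimiters."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip().strip("\"'")
    return {}  # closing delimiter never found


def scan(sources: Iterable[Tuple[str, Path]]) -> Dict[str, SkillEntry]:
    """Build a name -> entry catalog; earlier sources win collisions (assumed)."""
    catalog: Dict[str, SkillEntry] = {}
    for scope, root in sources:
        if not root.is_dir():
            continue
        for skill_md in sorted(root.glob("*/SKILL.md")):
            meta = parse_frontmatter(skill_md.read_text(encoding="utf-8"))
            name = meta.get("name")
            if not name or name in catalog:
                continue  # skip unparseable skills and shadowed duplicates
            catalog[name] = SkillEntry(
                name, meta.get("description", ""), scope, skill_md.parent
            )
    return catalog
```

Only `name` and `description` are read here, which mirrors the tier 1 cost of progressive disclosure: the skill body stays on disk until activation.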

9. Risks & Mitigations

Risk	Impact	Likelihood	Mitigation
Agent Skills spec evolves in breaking ways	Hive implementation falls out of sync	Low	Standard is backed by Anthropic and adopted by 30+ products; changes are conservative. Track spec repo; participate in governance.
Low community adoption — nobody submits skills	Registry empty, no value	Medium	Seed with 10+ skills from existing templates + ported from `github.com/anthropics/skills`; bounty program; `hive skill init` trivializes creation
Prompt injection via malicious skill instructions	Skill manipulates agent behavior	Medium	Trust gating for project-level skills; maintainer review on registry PRs; `verified` tier requires audit; security notice on install
Default skill prompts bloat system prompt	Reduced token budget for reasoning	Medium	Hard cap of 2000 tokens total; individually disableable; terse checklist format
Default skills create rigid behavior for simple tasks	Agent follows queue protocol on trivial single-item task	Medium	`hive.colony-progress-tracker` only activates when the spawn message has `db_path:`; all defaults individually disableable
Context window consumed by too many active skills	Multiple skills + default skills exhaust context	Medium	Progressive disclosure limits base cost (~100 tokens/skill); skills activated one-at-a-time on demand; skill body recommended <5000 tokens; default skills capped at 2000 tokens
Skill quality inconsistent across registry	Users install ineffective skills	Medium	Trust tiers; eval framework in CI; `hive skill test`; community signals (install count); `deprecated` flag
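
The token figures quoted in the mitigations above can be combined into a rough worst-case estimate. This is back-of-the-envelope arithmetic only: the constants are the approximate values from the table, and the function name `worst_case_skill_tokens` is hypothetical.

```python
CATALOG_TOKENS_PER_SKILL = 100     # approximate tier 1 cost per discovered skill
DEFAULT_SKILLS_CAP = 2000          # hard cap across all default skill prompts
SKILL_BODY_RECOMMENDED_MAX = 5000  # recommended upper bound for one skill body


def worst_case_skill_tokens(discovered: int, activated: int) -> int:
    """Estimate skill-related prompt tokens under the stated caps."""
    catalog = discovered * CATALOG_TOKENS_PER_SKILL
    bodies = activated * SKILL_BODY_RECOMMENDED_MAX
    return catalog + DEFAULT_SKILLS_CAP + bodies
```

For 50 discovered skills (the NFR-1 ceiling) with one skill activated, the estimate is 50 × 100 + 2000 + 5000 = 12,000 tokens.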

10. Backward Compatibility

This system is fully additive:

Existing agents without skills continue to work unchanged.
Default skills are loaded automatically but are behaviorally non-breaking: they add operational instructions to system prompts but do not change graph structure, tool availability, or output contracts.
Default skills can be fully disabled via `"default_skills": {"_all": {"enabled": false}}`.
Agents without a `skills` list pre-activate no community skills; the model may still activate discovered skills from the catalog on demand.
The GraphExecutor is unchanged — no new execution model.
Existing `tools.py`, `mcp_servers.json`, and `mcp_registry.json` work alongside skills.
Skills from the Agent Skills ecosystem (Claude Code, Cursor, etc.) work without modification.
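
One way the `default_skills` setting could be resolved is sketched below. The `_all` key and the `enabled` flag come from the configuration shown above; the per-skill key shape (for example `"hive.quality-monitor": {"enabled": false}`) and the rule that a per-skill entry overrides `_all` are assumptions for illustration.

```python
from typing import Any, Mapping


def is_default_skill_enabled(agent_config: Mapping[str, Any], skill_name: str) -> bool:
    """Return whether a default skill should load for this agent config."""
    overrides = agent_config.get("default_skills", {})
    enabled = True  # defaults load automatically when nothing is configured
    # Assumed precedence: the per-skill entry is checked last, so it wins.
    for key in ("_all", skill_name):
        entry = overrides.get(key)
        if isinstance(entry, Mapping) and "enabled" in entry:
            enabled = bool(entry["enabled"])
    return enabled
```

An agent with no `default_skills` key resolves every default skill to enabled, which is the automatic loading behavior described in this section.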

11. Interaction with MCP Registry

Skills and MCP servers are complementary:

Concern	MCP Registry	Skill System
What it shares	Tool infrastructure (servers, connections)	Agent behavior (instructions, prompts, scripts)
Format	Manifest JSON (Hive-specific)	`SKILL.md` (open standard)
Granularity	Atomic tool functions	Multi-step behavioral patterns

Integration: Skills reference tools by name in their `SKILL.md` instructions; the agent resolves them via the normal tool registry. If a skill requires a tool that isn't available, the agent will encounter an error at execution time; `hive skill doctor` can pre-check this.
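
A pre-check of this kind reduces to a set difference between the tools a skill needs and the tools the agent has. The sketch below assumes the required tool names have already been extracted from the skill; how `hive skill doctor` would determine them is not specified here, and `missing_tools` is a hypothetical helper.

```python
from typing import Iterable, List


def missing_tools(required: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return required tool names absent from the tool registry, sorted."""
    available_set = set(available)
    return sorted({name for name in required if name not in available_set})
```

An empty result means every referenced tool resolves; a non-empty one can be reported before the agent reaches execution time.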

12. Documentation & Examples Strategy

Doc	Audience	Deliverable
"Install and use your first skill"	Users	From `hive skill search` to skill activating in a session
"Write your first skill"	Contributors	Step-by-step: `hive skill init` → write SKILL.md → validate → submit PR
"Port a skill from Claude Code/Cursor"	Contributors	Usually just install it — guide explains verification
"Default skills reference"	All users	All 6 defaults: purpose, config, shared buffer keys, tuning
"Tuning default skills"	Advanced builders	When to disable vs. configure; per-agent overrides; measuring impact
Skill cookbook	Contributors	Annotated examples: research, triage, draft, review, outreach, data extraction
"Evaluating skill quality"	Contributors	Setting up evals, writing assertions, iterating with the eval-driven loop
Starter pack guide	Users	Finding, installing, and customizing starter packs

13. Phased Delivery

Phase	Scope	Depends On
Phase 0: Default Skills	Implement 6 default skills as `SKILL.md` packages; `DefaultSkillManager` with system prompt injection, iteration callbacks, node completion hooks, phase transition hooks; `DefaultSkillConfig` in Python API and `agent.json`; `_`-prefixed shared buffer convention; startup logging	—
Phase 1: Agent Skills Standard	`SkillDiscovery` scanning `.agents/skills/` and `.hive/skills/`; `SKILL.md` parsing with lenient validation; progressive disclosure (catalog injection, activation, resource loading); model-driven and user-driven activation; context protection; deduplication; pre-activated skills config; compatibility tests against `github.com/anthropics/skills`	—
Phase 2: CLI & Contributor Tooling	`hive skill init`, `validate`, `test`, `fork`; `hive skill doctor`; `hive skill install/remove/list/search/info/update`; version pinning; `skills-ref` integration for validation	Phase 1
Phase 3: Registry Repo	Create `hive-skill-registry` GitHub repo; CI validation using `skills-ref`; `_template/`; `CONTRIBUTING.md`; seed with 10+ skills (extracted from templates + ported from anthropics/skills); eval CI	Phase 1
Phase 4: Docs & Launch	All documentation from section 12; example agents using skills; announcement; bounty program integration	Phase 2, 3
Phase 5: Community Growth	Trust tier promotion process; starter packs; community signals (install counts); monthly skill spotlight; eval-driven quality ranking	Phase 4
Phase 6: Advanced Features (future)	Subagent delegation for skill execution; skill-level telemetry; AI-assisted skill creation	Phase 5

Phase 0 and Phase 1 can proceed in parallel: default skills depend on the prompt injection pipeline, while the Agent Skills standard work depends on discovery, parsing, and activation.
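
The "Depends On" column can be checked mechanically by grouping phases into waves that may run concurrently. The sketch below uses Python's standard `graphlib.TopologicalSorter`; the dictionary transcribes the table above, and the function name `delivery_waves` is hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each phase maps to the phases it depends on, per the table above.
PHASE_DEPENDENCIES: Dict[str, Set[str]] = {
    "Phase 0": set(),
    "Phase 1": set(),
    "Phase 2": {"Phase 1"},
    "Phase 3": {"Phase 1"},
    "Phase 4": {"Phase 2", "Phase 3"},
    "Phase 5": {"Phase 4"},
    "Phase 6": {"Phase 5"},
}


def delivery_waves(deps: Dict[str, Set[str]]) -> List[List[str]]:
    """Group phases into waves; phases in the same wave can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the dependencies are circular
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

Applied to the table, the first wave contains Phase 0 and Phase 1, the second contains Phase 2 and Phase 3, and the remaining phases follow one at a time.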

14. Open Questions

#	Question	Owner	Status
Q1	Should the registry repo live under `aden-hive` org or a shared `agentskills` org?	Platform	Open
Q2	Should default skill protocols be adaptive (e.g., `hive.colony-progress-tracker` adjusts SOP-gate strictness based on task type)?	Engineering	Open
Q3	Should default skills be tunable per-node (not just per-agent)?	Engineering	Open
Q4	Should `hive.quality-monitor` self-assessments feed into judge decisions (auto-trigger RETRY on self-reported degradation)?	Engineering	Open
Q5	What is the right combined token budget for default skill prompts? 2000 tokens proposed — configurable or fixed?	Engineering	Open
Q6	Should Hive support subagent delegation for skill execution (run skill in isolated session, return summary)?	Engineering	Open
Q7	Should Hive also scan `.claude/skills/` for pragmatic compatibility with Claude Code's native skill location?	Engineering	Open
Q8	What is the process for promoting a `community` skill to `verified`?	Platform + Security	Open
Q9	Should the registry support private/enterprise skill indexes (`hive skill config --index-url`)?	Platform	Open
Q10	Should `hive skill test` use the official `skills-ref` library or a Hive-native implementation?	Engineering	Open
Q12	How should skill-level telemetry (activation counts, eval pass rates) be collected without compromising privacy?	Product + Privacy	Open

15. Stakeholder Sign-Off

Role	Name	Status
Engineering Lead		Pending
Product		Pending
OSS / Community		Pending
Security		Pending
Developer Experience		Pending
