Accepts a multipart upload of `tar` / `tar.gz` (any compression
tarfile.open auto-detects) containing a single top-level directory and
unpacks it into HIVE_HOME/colonies/<name>. Lets a desktop client (or any
external tool) hand a colony spec to a remote runtime to run.
Form fields:
file (required) the archive blob
name (optional) override the colony name; defaults to
the archive's top-level dir
replace_existing (optional) "true" to overwrite; else 409 if the
target dir already exists
Safety:
- 50 MB upload cap (multipart reader streams + caps each part)
- Manual path-traversal validation per member (Python 3.11 compatible —
tarfile's safe `filter='data'` only landed in 3.12)
- Symlinks, hardlinks, device, fifo entries all rejected
- Colony name validated against the existing [a-z0-9_]+ pattern used by
routes_colony_workers + queen_lifecycle_tools
- Mode bits masked to 0o755 / 0o644 so a tampered tar can't ship
world-writable scripts
Tests cover happy path, name override, 409 / 201 around replace_existing,
path traversal, absolute paths, symlinks, multiple top-level dirs,
invalid colony name, missing file part, corrupt tar, non-multipart, and
uncompressed tar.
Future work (not in this PR): export endpoint, colony list/delete via
this same prefix, and an MCP tool wrapper so queens can move colonies
between hosts mid-conversation.
Two unrelated test failures were keeping main red:
- test_capabilities.py: fixtures referenced deprecated model identifiers
no longer in model_catalog.json. After the catalog refactor unknown
models default to vision-capable, so 12 "expect False" assertions
flipped to True. Replace fixtures with current catalog entries that
carry an explicit supports_vision flag.
- test_colony_runtime_overseer.py: a 200ms hard sleep racing the
background worker was flaky on Windows CI. Poll for llm.stream_calls
with a 5s deadline instead.
Adds 29 MCP tools for SimilarWeb V5 covering traffic and engagement,
competitor intelligence, keywords/SERP, audience demographics, and
segment analysis. Includes credential spec, health checker, README,
and tests on ubuntu and windows.
Closes#7022
Cache-aware cost reporting + new frontier models (GPT-5.5, DeepSeek V4
Pro/Flash, GLM-5.1). cache_control now propagates through OpenRouter
sub-providers (anthropic / gemini / glm / minimax) so the static system
prefix actually hits cache, and every response/finish event carries
cost_usd computed from a four-source fallback chain.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The compaction trigger now reserves headroom equal to
compaction_buffer_tokens + compaction_buffer_ratio * max_context_tokens.
The fixed component (default 8k, sized for one max-sized tool result)
gives a floor on small windows; the ratio (default 0.15) keeps the
trigger meaningful on large windows where any constant buffer becomes
a rounding error (8k buffer is 75% on a 32k window but 96% on a 200k
window). Result: ~80% pre-turn trigger on 200k+ windows so the inner
tool loop has room to grow without firing the mid-turn pre-send guard.
Compaction + worker-storage copy moved to a background task in f39c1c87;
the test checked the worker-storage file before the task ran, which flaked
under CI load.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>