fix: skills for colonies

This commit is contained in:
Timothy
2026-04-14 16:23:17 -07:00
parent 958bafea29
commit 256b52b818
3 changed files with 365 additions and 77 deletions
+153 -66
@@ -695,29 +695,64 @@ a saved agent.
## Forking the session into a persistent colony
**When to use create_colony:** the user needs work to run \
**headless, recurring, or in parallel to this chat**: something \
that keeps going after you stop talking. Typical triggers:
**Prove the work inline BEFORE scaling to a colony.** This is the \
most important rule in this section. A colony is a durable, \
unattended runtime; you must know the task mechanics work before \
you bake them into one. The expensive, hard-to-debug failures \
(dummy-target browser loops, wrong selectors, misread skills) \
happen when a queen delegates to a colony without ever doing \
the work herself first.
**The inline-first, scale-after pattern:**
1. **Do one instance of the work yourself, inline**, right in \
this chat. Use your own tools. Open the browser, click the \
real button, type the real text, send the real message, \
verify the real result. This is the shortest path from \
"vague intent" to "known-working procedure" you learn \
the exact selectors, the exact quirks, the exact sequence \
that works on this site / API / system right now.
2. **Report the result to the user.** "I sent the message to \
Dimitris; here's the confirmation. Before I scale this to \
your whole connection list, want me to tweak anything?" \
This gives the user a concrete sample to react to AND \
gives you feedback before the cost of scaling multiplies.
3. **Only after a successful inline run**, decide whether to:
- stay inline and iterate by hand (small batches)
- fan out via `run_parallel_workers` (one-shot batch, \
results needed RIGHT NOW, no persistence needed)
- scale via `create_colony` (headless / recurring / needs \
to survive this chat ending)
**When to use create_colony:** after step 2 has succeeded, and \
the user needs work to run **headless, recurring, or in parallel \
to this chat**. Typical triggers:
- "run this every morning / every hour / on a cron"
- "keep monitoring X and alert me when Y"
- "fire this off in the background, I'll check on it later"
- "spin up a dedicated agent for this so I can keep working here"
- any task that should survive the current conversation ending
**When NOT to use it:** if the user just wants results RIGHT NOW \
in this chat, use `run_parallel_workers` instead. If they want to \
iterate on an agent design, stay in the planning/building flow. \
Don't create a colony just because you "learned something \
reusable" — the trigger is operational (needs to keep running), \
not epistemic (knowledge worth saving).
**When NOT to use it:**
- You haven't actually done the work once yet. STOP. Do it \
inline first. Delegating an untested procedure to a colony \
is the single most common cause of silent worker failure.
- The user wants results RIGHT NOW and doesn't need the task \
to persist; stay inline or use `run_parallel_workers`.
- You "learned something reusable" but there's no operational \
need to keep running; knowledge worth saving goes in a \
skill file, not a colony.
**Two-step flow:**
**Two-step flow (assuming steps 1-2 above have succeeded):**
1. AUTHOR A SKILL FIRST so the colony worker has the operational \
context it needs to run unattended. Use write_file to create a \
skill folder (recommended location: \
`~/.hive/skills/{skill-name}/SKILL.md`) capturing the \
procedure: API endpoints, auth flow, response shapes, \
gotchas, conventions, query patterns, rate limits. The \
context it needs to run unattended and write it from the \
knowledge you just earned doing the work inline, not from \
speculation. Include the EXACT selectors, tool call \
sequences, and gotchas you hit in your own run. Use \
write_file to create the skill folder (recommended \
location: `~/.hive/skills/{skill-name}/SKILL.md`). The \
SKILL.md needs YAML frontmatter with `name` (matching the \
directory name) and `description` (1-1024 chars including \
trigger keywords), followed by a markdown body. Optional \
@@ -726,12 +761,13 @@ not epistemic (knowledge worth saving).
2. create_colony(colony_name, task, skill_path) Validates the \
skill folder, installs it under ~/.hive/skills/ if it isn't \
already there, and forks this session into a new colony. \
NOTHING RUNS after this call: the task is baked into \
worker.json and the user starts the worker (or wires up a \
trigger) later from the new colony page. The task string \
must be FULL and self-contained; when the worker eventually \
runs, it has zero memory of your chat. The skill you wrote is \
discovered on first scan so the worker starts informed.
The colony worker inherits your full conversation at spawn \
time, so it sees everything you already did and said; no \
repeated discovery. NOTHING RUNS immediately after this \
call: the task is baked into worker.json and the user starts \
the worker (or wires up a trigger) later from the new colony \
page. The task string still must be FULL and self-contained \
because triggers fire without your chat context.
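The skill-authoring half of step 1 can be sketched in a few lines. This is a minimal illustration with a hypothetical skill name; it writes to a temp dir so it runs anywhere, while the recommended real location is `~/.hive/skills/{skill-name}/SKILL.md`. `create_colony` is this framework's tool and is shown only as a comment:

```python
from pathlib import Path
import tempfile

# Temp dir so the sketch runs without touching ~/.hive; in practice
# the recommended location is ~/.hive/skills/{skill-name}/SKILL.md.
base = Path(tempfile.mkdtemp())
skill_dir = base / "linkedin-dm"  # hypothetical skill name
skill_dir.mkdir(parents=True)

skill_md = (
    "---\n"
    "name: linkedin-dm\n"
    "description: Send LinkedIn DMs via the browser bridge. "
    "Triggers: linkedin, dm, message, outreach.\n"
    "---\n\n"
    "# LinkedIn DM procedure\n\n"
    "1. Click the composer rect (real CDP pointer click).\n"
    "2. Insert text via document.execCommand('insertText').\n"
    "3. Click Send only if the button is enabled.\n"
)
(skill_dir / "SKILL.md").write_text(skill_md, encoding="utf-8")

# Frontmatter `name` must match the directory name.
assert skill_dir.name == "linkedin-dm"

# Step 2 is the framework call (not runnable outside the agent):
# create_colony("outreach", task="...full, self-contained task...",
#               skill_path=str(skill_dir))
```

The body captures exactly what the inline run proved, so the worker's first scan finds a procedure known to work.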
## Workflow summary
1. Understand requirements → discover tools → design the layout
@@ -843,32 +879,62 @@ synthesis.
## Forking this session into a persistent colony
**When to use create_colony:** the user needs work to run \
**headless, recurring, or in parallel to this chat**: something \
that should keep going after this conversation ends. Typical \
triggers:
**Prove the work inline BEFORE scaling to a colony.** This is the \
most important rule in this section. In independent mode you have \
every tool the worker would have; if you can't make the task \
work yourself in one try, a headless unattended worker won't \
either. The expensive, hard-to-debug failures (dummy-target \
browser loops, wrong selectors, misread skills) happen when a \
queen delegates to a colony without ever doing the work herself \
first.
**The inline-first, scale-after pattern:**
1. **Do one instance of the work yourself, inline**, right in \
this chat. Open the browser, click the real button, type \
the real text, send the real message, verify the real \
result. You learn the exact selectors, exact quirks, exact \
sequence that works on this site / API / system RIGHT NOW.
2. **Report the result to the user.** Show them the concrete \
sample. Ask if they want anything adjusted before you \
scale up.
3. **Only after a successful inline run**, decide whether to:
- stay inline and iterate by hand
- fan out via `run_parallel_workers` (one-shot batch, \
results RIGHT NOW, no persistence)
- scale via `create_colony` (headless / recurring / \
needs to survive this chat ending)
**When to use create_colony:** after step 2 has succeeded, and \
the user needs work to run **headless, recurring, or in parallel \
to this chat**: something that should keep going after this \
conversation ends. Typical triggers:
- "run this every morning / every hour / on a cron"
- "keep monitoring X and alert me when Y changes"
- "fire this off in the background so I can keep working here"
- "spin up a dedicated agent for this job"
- any task that needs to survive the current session
**When NOT to use it:** if the user just wants results RIGHT NOW \
in this chat, use `run_parallel_workers` instead. Don't create a \
colony just because you "learned something reusable"; the \
trigger is operational (needs to keep running), not epistemic \
(knowledge worth saving).
**When NOT to use it:**
- You haven't actually done the work once yet. STOP. Do it \
inline first. This is the #1 cause of silent worker failure.
- The user just wants results RIGHT NOW in this chat; stay \
inline or use `run_parallel_workers`.
- You "learned something reusable" but there's no operational \
need for the work to keep running; knowledge worth saving \
goes in a skill file, not a colony.
**Two-step flow:**
**Two-step flow (assuming steps 1-2 above have succeeded):**
1. AUTHOR A SKILL FIRST in a SCRATCH location so the colony \
worker has the operational context it needs to run \
unattended. Use write_file to create a skill folder \
unattended and write it from the knowledge you just \
earned doing the work inline, not from speculation. Include \
the EXACT selectors, tool call sequences, and gotchas you \
hit in your own run. Use write_file to create a skill folder \
somewhere temporary (e.g. `/tmp/{skill-name}/` or your \
working directory) capturing the procedure: API endpoints, \
auth flow, pagination, gotchas, rate limits, response \
shapes. DO NOT author it under `~/.hive/skills/`; that path \
is user-global and would leak the skill to every other \
agent. The SKILL.md needs YAML frontmatter with `name` \
working directory). DO NOT author it under `~/.hive/skills/`; \
that path is user-global and would leak the skill to every \
other agent. The SKILL.md needs YAML frontmatter with `name` \
(matching the directory name) and `description` (1-1024 \
chars including trigger keywords), followed by a markdown \
body. Optional subdirs: scripts/, references/, assets/. \
@@ -878,12 +944,14 @@ trigger is operational (needs to keep running), not epistemic \
the skill folder, forks this session into a new colony, and \
installs the skill COLONY-SCOPED at \
`~/.hive/colonies/{colony_name}/skills/{skill_name}/`. Only \
that colony's worker sees it; no other agent. NOTHING RUNS \
after this call; the task is baked into worker.json and \
the user starts the worker (or wires up a trigger) later \
from the new colony page. The task string must be FULL and \
self-contained because the worker has zero memory of your \
chat when it eventually runs.
that colony's worker sees it; no other agent. The colony \
worker inherits your full conversation at spawn time, so it \
sees everything you already did and said; no repeated \
discovery. NOTHING RUNS immediately after this call; the \
task is baked into worker.json and the user starts the \
worker (or wires up a trigger) later from the new colony \
page. The task string must still be FULL and self-contained \
because triggers fire without your chat context.
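The scratch-vs-colony-scoped path rule can be sketched concretely. A minimal illustration with hypothetical skill and colony names, showing where the skill lives before the fork and where `create_colony` is documented to install it:

```python
from pathlib import Path
import tempfile

# Hypothetical names, for illustration only.
skill_name = "inbox-triage"
colony_name = "triage-colony"

# Author in scratch (e.g. /tmp or the working directory), NOT under
# ~/.hive/skills/, which is user-global and would leak the skill.
scratch = Path(tempfile.mkdtemp()) / skill_name
scratch.mkdir(parents=True)
(scratch / "SKILL.md").write_text(
    f"---\nname: {skill_name}\n"
    "description: Triage the inbox nightly and flag urgent threads.\n"
    "---\n",
    encoding="utf-8",
)

# After create_colony(colony_name, task, skill_path=str(scratch)),
# the skill is installed COLONY-SCOPED at this path, visible only
# to that colony's worker:
installed = (
    Path.home() / ".hive" / "colonies" / colony_name / "skills" / skill_name
)
```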
"""
_queen_behavior_editing = """
@@ -899,33 +967,52 @@ Report the last run's results to the user and ask what they want to do next.
"""
_queen_behavior_independent = """
## Independent — do the work yourself
## Independent — do the work yourself (inline first, always)
You are the agent. No pre-loaded worker; you execute directly.
1. Understand the task from the user
2. Plan your approach briefly (no flowcharts or agent design)
3. Execute using your tools: file I/O, shell commands, browser automation
4. Report results, iterate if needed
You are the agent. No pre-loaded worker; you execute directly. \
**Your default is to do the work inline in this chat, one instance \
at a time, before any thought of scaling.**
## Scaling up from independent mode
1. Understand the task from the user.
2. Plan your approach briefly (no flowcharts, no agent design).
3. **Do the work yourself, inline. One real instance.** Open the \
browser, call the real API, write to the real file, send the \
real message. Use your actual tools against real state. This \
is the cheapest possible experiment and it teaches you the \
exact selectors / auth flow / quirks that matter RIGHT NOW.
4. **Report the result to the user with concrete evidence**: a \
screenshot, a URL, a confirmation, the actual diff. Let them \
react before you scale.
5. Iterate if needed; STAY INLINE while you figure out the \
mechanics. Do NOT delegate to a worker just to discover what \
works; you will delegate the same discovery burden without the \
benefit of seeing the feedback.
6. Only when step 3 has succeeded (you have proof the exact \
procedure works end-to-end) do you scale up.
You have no pre-loaded worker in this phase, but you DO have two \
lifecycle tools for spinning up work dynamically:
**Scaling pathways** (in order of cost, cheapest first):
- **Stay inline, run it again.** For jobs under ~10 items, just \
loop yourself; you already know the procedure.
- **`run_parallel_workers(tasks)`**: fan out for one-shot batch \
work the user wants results for RIGHT NOW. No persistence, no \
colony. Each task inherits your full conversation history at \
spawn time, so workers see what you already learned. Use when \
you need concurrency to beat wall-clock time.
- **`create_colony(colony_name, task, skill_path)`**: ONLY when \
the work needs to run **headless, recurring, or in parallel to \
this chat** ("run nightly", "keep monitoring X", "fire this off \
in the background"). Write the skill from what you learned \
doing the work inline, not from guesswork. Then fork. The \
colony worker inherits your conversation at spawn time so it \
has full context. Do NOT use this just because you "learned \
something reusable" — the trigger is operational (needs to \
keep running), not epistemic.
- **run_parallel_workers(tasks)** for one-off batch work the user \
wants results for RIGHT NOW. Fan out N subtasks concurrently and \
synthesize the aggregated reports. No colony is created; the \
workers exist only for this call.
- **create_colony(colony_name, task, skill_path)** when the user \
wants work to run **headless, recurring, or in parallel to this \
chat** (e.g. "run nightly", "keep monitoring X", "fire this off \
in the background"). Write a skill folder to scratch capturing \
the operational procedure, then call this to fork the session \
and install the skill colony-scoped. Nothing runs after fork; \
the user starts the worker (or sets a trigger) later from the \
new colony page. Do NOT use this just because you "learned \
something reusable" — the trigger is operational (needs to keep \
running), not epistemic.
**Hard rule: NEVER call `run_parallel_workers` or `create_colony` \
before you have successfully completed the task once inline.** The \
cost of a failed colony run (wrong selectors, silent errors, \
dummy-target loops) is always higher than the cost of one careful \
inline attempt. When in doubt, do it yourself first.
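The scaling pathways and the hard rule above reduce to a small routing function. This is an illustrative condensation only; the predicate names are assumptions for the sketch, not framework API:

```python
def choose_pathway(proven_inline: bool, persistent: bool, batch_now: bool) -> str:
    """Illustrative routing for the scaling pathways described above."""
    if not proven_inline:
        # Hard rule: never fan out or fork before one successful inline run.
        return "stay inline: do one real instance first"
    if persistent:  # headless / recurring / must survive this chat
        return "create_colony"
    if batch_now:   # one-shot batch, results needed right now
        return "run_parallel_workers"
    return "stay inline"  # small jobs: just loop yourself

assert choose_pathway(False, True, True) == "stay inline: do one real instance first"
assert choose_pathway(True, True, False) == "create_colony"
assert choose_pathway(True, False, True) == "run_parallel_workers"
```

Note that `persistent` outranks `batch_now`: recurring or background needs always point at a colony, regardless of how urgent the first batch feels.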
You do NOT have the agent-building lifecycle (no save_agent_draft, \
confirm_and_build, load_built_agent, run_agent_with_input). If the \
+177
@@ -41,6 +41,42 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
def _format_spawn_task_message(task: str, input_data: dict[str, Any]) -> str:
"""Render the spawn task into the worker's next user message.
Spawned workers inherit the queen's conversation via
``ColonyRuntime._fork_parent_conversation``; this helper builds
the content of the trailing user message that carries the new
task. The queen's chat already provides the context for the
task, so we frame this as an explicit hand-off.
Additional keys from ``input_data`` (other than the task itself)
are rendered below the hand-off line so the worker sees them as
structured hand-off data. This mirrors the fresh-path
``AgentLoop._build_initial_message`` shape so worker prompts look
roughly the same whether or not inheritance fired.
"""
lines = [
"# New task delegated by the queen",
"",
"The queen's conversation up to this point is visible above. "
"Use it as context (who the user is, what was already decided, "
"which skills apply). Your own system prompt and tool set are "
"set by the framework — the queen's tools may differ from "
"yours, so treat her prior tool calls as history only.",
"",
f"task: {task}",
]
for key, value in (input_data or {}).items():
if key in ("task", "user_request"):
# Already rendered above; don't duplicate.
continue
if value is None:
continue
lines.append(f"{key}: {value}")
return "\n".join(lines)
@dataclass
class ColonyConfig:
max_concurrent_workers: int = 100
@@ -432,6 +468,131 @@ class ColonyRuntime:
def resume_timers(self) -> None:
self._timers_paused = False
async def _fork_parent_conversation(
self,
dest_conv_dir: Path,
*,
task: str,
input_data: dict[str, Any] | None = None,
) -> None:
"""Fork the colony's parent queen conversation into ``dest_conv_dir``.
Copies the queen's ``parts/*.json`` and ``meta.json`` into the
worker's fresh conversation dir, then appends a synthetic user
message carrying the new task. The worker's subsequent
``AgentLoop._restore`` reads this conversation via the usual
path: the queen's history is visible as prior turns, the task \
appears as the most recent user message, and the worker starts
acting on it with full context.
This is a no-op if the colony runtime doesn't own a parent
queen conversation (e.g. a standalone colony started without a
queen wrapper).
Notes on filtering compatibility:
- Queen parts have ``phase_id=None``. When the worker's
restore applies its own phase filter, the backward-compat
fallback in NodeConversation.restore kicks in: an
all-None-phased store bypasses the filter. See
``conversation.py:1369-1378``.
- ``cursor.json`` is deliberately NOT copied. The worker
should start fresh at iteration 0; copying the queen's
cursor would make the worker think it had already done
work.
- The queen's ``meta.json`` is copied but the AgentLoop
immediately rebuilds ``system_prompt`` from the worker's
own context post-restore (see agent_loop.py:533-535), so
the queen's system prompt does not leak into the worker.
"""
# Resolve the queen's own conversation dir. For a queen-backed
# ColonyRuntime, storage_path points at the queen's session dir
# and conversations/ lives inside it. For standalone runtimes
# (tests, legacy fork path under ~/.hive/agents/{name}/worker/)
# there's no parent conversation — fall through to the fresh
# spawn path.
src_conv_dir = self._storage_path / "conversations"
src_parts_dir = src_conv_dir / "parts"
if not src_parts_dir.exists():
# No queen conversation to inherit — the worker starts with
# only the task, same as the pre-fork behavior. AgentLoop's
# fresh-conversation branch will call _build_initial_message
# and render input_data into the worker's first user message.
return
def _copy_and_append() -> None:
dest_parts = dest_conv_dir / "parts"
dest_parts.mkdir(parents=True, exist_ok=True)
# Copy each queen part. Use json.dumps round-trip (not raw
# file copy) so we can be defensive about unreadable files —
# a corrupted queen part file shouldn't take down the worker
# spawn, just drop that one part.
max_seq = -1
for part_file in sorted(src_parts_dir.glob("*.json")):
try:
data = json.loads(part_file.read_text(encoding="utf-8"))
except (json.JSONDecodeError, OSError) as exc:
logger.warning(
"spawn fork: skipping unreadable queen part %s: %s",
part_file.name,
exc,
)
continue
seq = data.get("seq")
if isinstance(seq, int) and seq > max_seq:
max_seq = seq
(dest_parts / part_file.name).write_text(
json.dumps(data, ensure_ascii=False),
encoding="utf-8",
)
# Copy the queen's meta.json so the worker's restore finds
# the conversation during its first run. The meta fields
# (system_prompt, max_context_tokens, etc.) get overridden
# by the worker's own AgentLoop config + context after
# restore, so nothing here bleeds into runtime behavior.
src_meta = src_conv_dir / "meta.json"
if src_meta.exists():
try:
meta_data = json.loads(src_meta.read_text(encoding="utf-8"))
(dest_conv_dir / "meta.json").write_text(
json.dumps(meta_data, ensure_ascii=False),
encoding="utf-8",
)
except (json.JSONDecodeError, OSError) as exc:
logger.warning(
"spawn fork: failed to copy queen meta.json: %s", exc
)
# Append the task as the next user message so the worker's
# LLM sees it as the most recent turn in the conversation
# after restore. This replaces the fresh-path call to
# _build_initial_message for spawned workers.
task_content = _format_spawn_task_message(task, input_data or {})
next_seq = max_seq + 1
task_part = {
"seq": next_seq,
"role": "user",
"content": task_content,
# phase_id omitted (None) so the backward-compat
# fallback in NodeConversation.restore keeps it visible
# to both queen-style and phase-filtered restores.
# run_id omitted so the worker's run_id filter (off by
# default since ctx.run_id is empty) doesn't reject it.
}
task_filename = f"{next_seq:010d}.json"
(dest_parts / task_filename).write_text(
json.dumps(task_part, ensure_ascii=False),
encoding="utf-8",
)
logger.info(
"spawn fork: inherited %d queen parts + appended task at seq %d",
max_seq + 1,
next_seq,
)
await asyncio.to_thread(_copy_and_append)
# ── Worker Spawning ─────────────────────────────────────────
async def spawn(
@@ -497,6 +658,22 @@ class ColonyRuntime:
# (worse) the process CWD.
worker_storage = self._storage_path / "workers" / worker_id
worker_storage.mkdir(parents=True, exist_ok=True)
# Fork the queen's conversation into the worker's store.
# The queen already accumulated the user chat, read relevant
# skills, and made decisions about how to approach the task;
# the worker would repeat that discovery work (and often
# mis-step — see the 2026-04-14 "dummy-target" incident)
# if spawned with a blank store. We snapshot the queen's
# parts + meta at spawn time, then append the task as the
# next user message so the worker's AgentLoop restores into
# a conversation that already ends with its new instruction.
await self._fork_parent_conversation(
worker_storage / "conversations",
task=task,
input_data=input_data,
)
worker_conv_store = FileConversationStore(
worker_storage / "conversations"
)
@@ -98,10 +98,20 @@ textarea = browser_evaluate("""
browser_click_coordinate(textarea['cx'], textarea['cy'])
sleep(0.6)
# 6. Insert text via CDP Input.insertText (browser_type does this by default now).
# Per-char keyDown fails on Lexical composers — the keys dispatch but
# the editor never turns them into text.
browser_type(<appropriate-selector-or-skip-selector-and-use-bridge-insertText>, text)
# 6. Insert text via document.execCommand('insertText') through browser_evaluate.
# This is the ONLY reliable approach for LinkedIn's Lexical composer.
# See the "Lexical composer quirks" section below for why browser_type
# with a selector does NOT work here (the contenteditable lives inside
# the #interop-outlet shadow root which document.querySelector can't
# reach). The click in step 5 already put Lexical into edit mode, so
# execCommand injects straight into the focused editor's state.
browser_evaluate("""
(function(){
document.execCommand('insertText', false, %s);
return true;
})();
""" % json.dumps(message_text)) # json.dumps gives you a safely-escaped JS string literal
sleep(1.0) # let Lexical commit state + enable Send button
# 7. Find the modal Send button (filter by in-viewport, reject pinned bar)
send = browser_evaluate("""
@@ -133,11 +143,24 @@ send = browser_evaluate("""
})();
""")
# 8. ONLY click Send if it's enabled — if disabled, the editor didn't register the input.
# Don't click blindly; the framework state is the source of truth, not the DOM text.
if not send['disabled']:
browser_click_coordinate(send['cx'], send['cy'])
sleep(2.5) # wait for send + bubble render
# 8. ONLY click Send if it's enabled — if disabled, the execCommand
# didn't land. DO NOT retry with a different tool; the fix is
# always: re-click the composer rect, re-run execCommand, re-check.
# The Send button's `disabled` state IS the ground truth — if
# Lexical registered your text, it enables the button. If it's
# still disabled, your text did not reach the editor, regardless
# of what any tool call claims.
if send['disabled']:
# The editor didn't receive your text. Do NOT click Send. Do NOT
# fall back to browser_type with a dummy selector (see anti-pattern
# in Common Pitfalls). Instead: re-click the textarea rect from
# step 4, wait a beat, re-run the execCommand insertText from step
# 6. If that still fails after 2 retries, bail and surface — the
# modal may have been reclaimed by a stale state or auth wall.
raise Exception("Send button disabled after insertText — editor did not receive input")
browser_click_coordinate(send['cx'], send['cy'])
sleep(2.5) # wait for send + bubble render
```
**Verify post-send**: the composer textarea should now be empty (`innerText === ''`) and `.msg-s-event-listitem__message-bubble` count should have grown by 1. Walk the shadow tree via `browser_evaluate` to check.
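A sketch of that post-send check in the same bridge-call style as the flows above. The shadow-walk is shortened to a direct `shadowRoot` access, which assumes the `#interop-outlet` shadow root is open; if your walk differs, reuse the shadow-walk helper from the earlier steps:

```
# Post-send verification sketch (hypothetical, matching the skill's style)
state = browser_evaluate("""
(function(){
  // assumes an open shadow root on #interop-outlet
  var outlet = document.querySelector('#interop-outlet');
  var root = (outlet && outlet.shadowRoot) ? outlet.shadowRoot : document;
  var composer = root.querySelector('div.msg-form__contenteditable');
  var bubbles = root.querySelectorAll('.msg-s-event-listitem__message-bubble');
  return {
    composer_empty: !composer || composer.innerText.trim() === '',
    bubble_count: bubbles.length
  };
})();
""")
# Compare bubble_count against the count captured BEFORE clicking Send;
# success means composer_empty AND bubble_count grew by exactly 1.
```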
@@ -247,8 +270,9 @@ If any of those show up, **stop the run, screenshot the state, and surface the i
## Common pitfalls
- **`innerHTML` injection is silently dropped** — LinkedIn's Trusted Types CSP discards any `innerHTML = "<...>"` from injected scripts, no console error. Always use `createElement` + `appendChild` + `setAttribute` for DOM injection. `textContent`, `style.cssText`, and `.value` assignments are fine.
- **Per-char keyDown on the message composer produces empty text** — Lexical intercepts `beforeinput` and drops raw keys. Use `browser_type` (which now routes through CDP `Input.insertText`), or call `Input.insertText` directly via the bridge on the focused shadow element.
- **`browser_type(selector=...)` can't see the message composer** — it's inside `#interop-outlet` shadow. `document.querySelector('div.msg-form__contenteditable')` returns nothing. Use the shadow-walk + click-to-focus pattern above.
- **Do NOT use `browser_type` on the message composer — use `document.execCommand('insertText', false, text)` via `browser_evaluate` instead.** The Lexical contenteditable lives inside the `#interop-outlet` shadow root which `document.querySelector` (what `browser_type` uses under the hood) cannot see. Attempts to work around this with `browser_shadow_query` fail because `browser_type` doesn't support the `>>>` shadow-pierce syntax. The ONLY reliable insert path is: (1) `browser_click_coordinate` on the composer rect (put Lexical in edit mode via a real CDP pointer click) → (2) `browser_evaluate` with `document.execCommand('insertText', false, <message>)` against the focused editor. This pattern is verified end-to-end across 15+ successful sends in session `session_20260414_113244_a98cfd66` (2026-04-14).
- **Per-char keyDown on the message composer produces empty text** — Lexical intercepts `beforeinput` and drops raw keys. Ignore `browser_type` entirely for LinkedIn DMs; use the `execCommand('insertText')` path above.
- **ANTI-PATTERN: "inject a dummy `<div id='dummy-target'>` and pass it as the `selector` arg to `browser_type`".** This looks tempting but fails compoundingly: `browser_type` clicks the **dummy div's** rect (not the editor's), the click lands on the Lexical wrapper's non-editable chrome, the contenteditable never receives focus, and `Input.insertText` fires against nothing. The bridge will still return `{"ok": true, "action": "type", "length": N}` because it has no way to verify the text actually landed. Symptom: Send button stays `disabled: true` forever. Fix: use `execCommand('insertText')` exactly as shown in the profile-message flow above. (See `session_20260414_114820_08bd3c4d` for the failed attempt.)
- **Multiple Send buttons on the page** — the pinned bottom-right messaging bar has its own `msg-form__send-button` that's usually below `innerHeight`. Filter by in-viewport before clicking.
- **`window.onbeforeunload` hangs navigation/close** — after typing in a composer, any `browser_navigate` or `close_tab` can pop a native "unsent message, leave?" confirm dialog that deadlocks the bridge. Always strip `onbeforeunload` before any navigation, and wrap composer flows in a `try/finally` that runs the cleanup block: