feat: encourage action plan prompts
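
The per-step flow this change encourages (plan with `task_create_batch`, then mark each step in_progress and completed one at a time) can be sketched roughly as below. This is only an illustrative sketch: `TaskBoard` and `run_plan` are hypothetical in-memory stand-ins, not the real tool implementations, and the method signatures are assumptions. The `ask_user` clarification step is left out.

```python
from dataclasses import dataclass, field


@dataclass
class TaskBoard:
    """Hypothetical in-memory stand-in for the task tools named in the prompt."""
    tasks: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def task_create_batch(self, subjects):
        # Assumed behavior: create one pending task per atomic action.
        ids = []
        for subject in subjects:
            task_id = len(self.tasks) + 1
            self.tasks[task_id] = {"subject": subject, "status": "pending"}
            ids.append(task_id)
        return ids

    def task_update(self, task_id, status):
        # Each call is a discrete transition; there is no batch update.
        self.tasks[task_id]["status"] = status
        self.history.append((task_id, status))


def run_plan(board, steps):
    """Run (subject, action) pairs one at a time and collect evidence."""
    ids = board.task_create_batch([subject for subject, _ in steps])
    evidence = {}
    for task_id, (_, action) in zip(ids, steps):
        board.task_update(task_id, "in_progress")  # before starting the step
        evidence[task_id] = action()               # do the real work
        board.task_update(task_id, "completed")    # the moment it is done
    return evidence


board = TaskBoard()
evidence = run_plan(
    board,
    [("fetch page", lambda: "200 OK"), ("write summary", lambda: "wrote 1 file")],
)
```

Because every step gets its own in_progress and completed transition, `board.history` records progress in order instead of all at once at the end.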

2026-05-04 17:53:26 -07:00
parent e7d4ce0057
commit 07600c5ab5
2 changed files with 350 additions and 23 deletions
@@ -240,19 +240,15 @@ See "Independent execution" for the per-step flow and granularity rule.

 ## File I/O (files-tools MCP)
 - read_file, write_file, edit_file, search_files
-  - edit_file covers single-file fuzzy find/replace (mode='replace', default) \
+- edit_file covers single-file fuzzy find/replace (mode='replace', default) \
 and multi-file structured patches (mode='patch'). Patch mode supports \
 Update / Add / Delete / Move atomically across many files in one call.
-  - search_files covers grep/find/ls in one tool: target='content' to \
+- search_files covers grep/find/ls in one tool: target='content' to \
 search inside files, target='files' (with a glob like '*.py') to list \
-or find files. Mtime-sorted in files mode.
+or find files.

 ## Browser Automation (gcu-tools MCP)
- Use `browser_*` tools — `browser_open(url)` is the cold-start entry point \
-  (lazy-creates the context; no separate "start" call). Then `browser_navigate`, \
-  `browser_click`, `browser_type`, `browser_snapshot`, \
-  <!-- vision-only -->`browser_screenshot`, <!-- /vision-only -->`browser_scroll`, \
-  `browser_tabs`, `browser_close`, `browser_evaluate`, etc.
+- Use `browser_*` tools — `browser_open(url)` is the cold-start entry point 
 - MUST Follow the browser-automation skill protocol before using browser tools.

 ## Hand off to a colony
@@ -261,9 +257,7 @@ or find files. Mtime-sorted in files mode.
  chat. It does NOT fork on its own; it spawns a one-shot evaluator \
  that reads this conversation and decides whether the spec is settled \
  enough to proceed. On approval your phase flips to INCUBATING and a \
-  new tool surface (including create_colony itself) unlocks. On \
-  rejection you stay here and keep the conversation going to fill the \
-  gaps the evaluator named.
+  new tool surface (including create_colony itself) unlocks.
 """

 _queen_tools_incubating = """
@@ -411,17 +405,19 @@ asks for specifics. Do not invent a new pass unless the user asks for one.
 _queen_behavior_independent = """
 ## Independent execution

-You are the agent. **For multi-step work (2+ atomic actions): call \
-`task_create_batch`** with one entry per atomic action, \
-before you touch any other tool. \
-Then work the list one task at a time:
+You are the agent. Behave this way:
+1. Identify whether the user's prompt is a task assignment. If it is, \
+use `ask_user` to clarify the scope and detailed requirements, then always use \
+`task_create_batch` to create a multi-step action plan.

-1. `task_update` → in_progress before you start the step.
-2. Do one real inline instance — open the browser, call the real API, \
+2. `task_update` → in_progress before you start the step.
+
+3. Do one real inline instance: open the browser, call the real API, or \
 write to the real file. If the action is irreversible or touches \
 shared systems, show and confirm before executing. Report concrete \
 evidence (actual output, what worked / failed) after the run.
-3. `task_update` → completed THE MOMENT it's done. **Do not let \
+
+4. `task_update` → completed THE MOMENT it's done. **Do not let \
 multiple finished tasks pile up unmarked.** There is no batch update \
 tool by design — each `completed` transition is a discrete progress \
 heartbeat in the user's right-rail panel. Without those transitions \
@@ -430,14 +426,14 @@ done.

 **Granularity: one task per atomic action, not one umbrella per project.** \

-Once finishing all current tasks, discuss with user about building \
-a colony so this sucess can be repeated or scaled
+After finishing a current task, discuss building a colony with the user \
+so this successful outcome can be repeated or scaled.

 ### How to handle large scale tasks
-If the user ask you to finish the same task repeatly or at large scale \
-(more than 10 times), tell the user that you can do it once first then \
+If the user asks you to finish the same task repeatedly or at large scale \
+(more than 3 times), tell the user that you can do it once first then \
 build a colony to fulfill the request but succeeding it once will be \
-beneficial to run it in the future, \
+beneficial to transfer it to a swarm of workers (through start_incubating_colony), \
 then focus on finishing the task once first.

 ### How to handle simple task (less then 2 atomic items)