feat(prompts): add retry/loop-breaking rules to bu-flash system prompt #41

Draft
caffeinum wants to merge 1 commit into webllm:main from caffeinum:feat/bu-flash-retry-rules

@caffeinum
Contributor

Summary

Adds an 8-line <retry_strategy> block to system_prompt_browser_use_flash.md. The rules condense guidance that already exists in the main system_prompt.md but was stripped from the slim flash variants on the assumption that the bu-2-0 fine-tune would internalize it. In production we've observed bu-2-0 looping (re-emitting the same failing action 3+ times without changing approach), particularly on auth/MFA flows.

Added rules (paraphrased)

  1. Don't repeat the same action 3+ times without visible progress.
  2. If the URL hasn't changed for 3+ steps and the page looks identical, you're stuck — change strategy.
  3. If a click/input did nothing, don't click the same index again — pick a different element or scroll.
  4. If a field's actual value differs from what you typed (page reformatted/autocompleted), wait one step before retrying.
  5. If credentials are needed and not provided, stop and report — don't invent emails/passwords/codes.
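
For reviewers who want to see what rules 2 and 3 describe in mechanical terms, here is a minimal sketch. It is not part of this PR and not how the agent enforces anything (the rules live only in the prompt). The `Step` shape, its field names, and both function names are hypothetical, chosen for illustration.

```typescript
// Hypothetical step record; field names are illustrative, not from the codebase.
interface Step {
  url: string;           // URL observed before the action was taken
  pageHash: string;      // any fingerprint of the page as observed before the action
  action: string;        // e.g. "click", "input", "scroll"
  elementIndex?: number; // index of the targeted element, if any
}

// Rule 2: the URL and the observed page are identical across the last `window` steps.
function isStuckOnPage(history: Step[], window = 3): boolean {
  if (history.length < window) return false;
  const recent = history.slice(-window);
  const first = recent[0];
  return recent.every(
    (s) => s.url === first.url && s.pageHash === first.pageHash,
  );
}

// Rule 3: the last two steps clicked the same element index and the page
// looked the same before both clicks, i.e. the first click did nothing.
function repeatedDeadClick(history: Step[]): boolean {
  if (history.length < 2) return false;
  const prev = history[history.length - 2];
  const last = history[history.length - 1];
  return (
    prev.action === "click" &&
    last.action === "click" &&
    last.elementIndex !== undefined &&
    last.elementIndex === prev.elementIndex &&
    last.pageHash === prev.pageHash
  );
}
```

A check like this could be run over recorded traces to flag stuck runs; it assumes a page fingerprint is available, which the PR itself does not provide.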

Cross-reference with upstream main prompt

Equivalent guidance already exists in upstream system_prompt.md:

  • L98: "if you are on the same URL for 3+ steps without meaningful progress, or the same action fails 2-3 times, try a different approach" (rules 1 & 2)
  • L88: autocomplete/combobox handling (rule 4)
  • L89 / L149: don't login without credentials, blocking-error check (rule 5)
  • L97, L249, L264, L267: don't repeat failing actions, break loops (rules 1 & 3)

So this isn't fork-specific tuning — it's restoring upstream's own guidance into the slim flash variant where it was dropped.

Test plan

  • Smoke-test bu-2-0 on a scenario known to loop (e.g. Auth0 login with TOTP) and check that retry counts drop
  • No regressions on the existing prompt test suite (pnpm test:unit)
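
One possible way to quantify "retry counts" for the smoke test is the longest run of consecutive identical actions in a trace, compared before and after the prompt change. The sketch below assumes each action has been serialized to a string such as `"click:5"`; that format and the function name `longestRepeatRun` are hypothetical and not taken from the repo.

```typescript
// Length of the longest run of consecutive identical actions in a trace.
// Each action is assumed to be pre-serialized to a string (hypothetical format).
function longestRepeatRun(actions: string[]): number {
  let best = 0;
  let run = 0;
  for (let i = 0; i < actions.length; i++) {
    // Extend the current run if this action equals the previous one, else restart at 1.
    run = i > 0 && actions[i] === actions[i - 1] ? run + 1 : 1;
    if (run > best) best = run;
  }
  return best;
}

// Example: three identical clicks in a row, then a scroll.
const exampleRun = longestRepeatRun(["click:5", "click:5", "click:5", "scroll:down"]); // 3
```

Under rule 1, a run of 3 or more on the patched prompt would suggest the model is still looping on that scenario.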

Note

system_prompt_flash.md (the non-browser-use flash prompt) has the same gap. Out of scope for this PR; happy to follow up if maintainers want.

🤖 Generated with Claude Code

The browser-use provider auto-flips flash_mode (service.ts:599), which
selects the 15-line minimal prompt. That prompt lacked any guidance for
loop avoidance, retry strategy, or autocomplete handling — gaps the
fine-tuned bu-2-0 model was supposed to internalize but doesn't always.

Append a short <retry_strategy> block (5 generic rules, no URL or
provider matching) covering: same-action-3x, stuck-URL, dead clicks,
autocomplete value mismatch, and missing credentials. Keeps the prompt
small (15 -> 23 lines) without re-inflating to the full 269-line variant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>