From 0902957f991ef5c860fa9dc7b9f3b2145e2e2447 Mon Sep 17 00:00:00 2001
From: Himadri M
Date: Thu, 28 May 2026 12:59:35 +0530
Subject: [PATCH] Prevent agents from treating scraped search as the default

Search skills framed --scrape as the efficient quick-start path even
though it adds scrape credits per result. Put plain search first and
label inline scraping as advanced so agents do not accidentally burn
large credit volumes.

Constraint: End-of-trial AX evidence showed Codex copied the skill example into costly default behavior.
Rejected: Removing --scrape examples entirely | Full-content search remains useful when the user needs every result scraped.
Confidence: high
Scope-risk: narrow
Directive: Keep cost-sensitive flags near explicit credit warnings in agent-facing skills.
Tested: git diff --check; grep confirmed removed misleading "saves credits" wording from edited skills.
Not-tested: No runtime CLI invocation; markdown-only skill guidance change.
---
 skills/firecrawl-cli/SKILL.md    |  5 +++++
 skills/firecrawl-search/SKILL.md | 11 +++++++----
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/skills/firecrawl-cli/SKILL.md b/skills/firecrawl-cli/SKILL.md
index 8105cc8721..f31a618449 100644
--- a/skills/firecrawl-cli/SKILL.md
+++ b/skills/firecrawl-cli/SKILL.md
@@ -40,6 +40,10 @@ firecrawl scrape "https://firecrawl.dev" -o .firecrawl/install-check.md
 ```
 
 ```bash
+# Basic search first; inspect URLs/snippets before deciding what to scrape.
+firecrawl search "query" --limit 5
+
+# Advanced: scrape each result inline. Adds scrape credits per result; keep limits small.
 firecrawl search "query" --scrape --limit 3
 ```
 
@@ -203,6 +207,7 @@ Use `modes: ["json", "git-diff"]` for **mixed mode**: you get both `diff.json` (
 
 **Avoid redundant fetches:**
 
+- Start with plain `search`; use `search --scrape` only when you need full content for every result. `--scrape` adds scrape credits per result, so keep `--limit` small.
 - `search --scrape` already fetches full page content. Don't re-scrape those URLs.
 - Check `.firecrawl/` for existing data before fetching again.
 
diff --git a/skills/firecrawl-search/SKILL.md b/skills/firecrawl-search/SKILL.md
index 87b426cf92..586f0358ad 100644
--- a/skills/firecrawl-search/SKILL.md
+++ b/skills/firecrawl-search/SKILL.md
@@ -23,11 +23,12 @@ Web search with optional content scraping. Returns search results as JSON, optio
 # Basic search
 firecrawl search "your query" -o .firecrawl/result.json --json
 
-# Search and scrape full page content from results
-firecrawl search "your query" --scrape -o .firecrawl/scraped.json --json
-
 # News from the past day
 firecrawl search "your query" --sources news --tbs qdr:d -o .firecrawl/news.json --json
+
+# Advanced: search and scrape full page content for each result
+# Use only when snippets/URLs are not enough; this adds scrape credits per result.
+firecrawl search "your query" --scrape --limit 3 -o .firecrawl/scraped.json --json
 ```
 
 ## Options
@@ -47,7 +48,9 @@ firecrawl search "your query" --sources news --tbs qdr:d -o .firecrawl/news.json
 
 ## Tips
 
-- **`--scrape` fetches full content** — don't re-scrape URLs from search results. This saves credits and avoids redundant fetches.
+- **Start without `--scrape` by default.** Plain search costs the least and usually gives enough URLs/snippets to choose what to inspect next.
+- **Use `--scrape` only when you need full page content for every returned result.** It adds normal scrape credits for each result page on top of the search cost, so keep `--limit` small.
+- **If you already used `--scrape`, do not re-scrape those URLs.** The full content is already in the search output.
 - Always write results to `.firecrawl/` with `-o` to avoid context window bloat.
 - Use `jq` to extract URLs or titles: `jq -r '.data.web[].url' .firecrawl/search.json`
 - Naming convention: `.firecrawl/search-{query}.json` or `.firecrawl/search-{query}-scraped.json`