Open
Aligns the CLI with scrapegraph-js PR #13 (v2 SDK). The v2 API consolidates endpoints and drops legacy ones; the CLI follows suit.

Commands kept (rewritten against v2 types):
- scrape — multi-format (markdown/html/screenshot/branding/links/images/summary/json)
- crawl — polls until the job reaches a terminal state
- history — new response shape (data/pagination), service filter optional
- credits, validate — re-wired to getCredits / checkHealth

Commands added:
- extract — structured extraction with prompt + schema
- search — web search + optional extraction
- monitor — create/list/get/update/delete/pause/resume/activity

Commands removed (no longer in v2 API):
- smart-scraper (use `scrape -f json -p ...` or `extract`)
- search-scraper (use `search`)
- markdownify (use `scrape` — markdown is the default format)
- sitemap, agentic-scraper, generate-schema

Other changes:
- package.json: scrapegraph-js pinned to github:ScrapeGraphAI/scrapegraph-js#096c110, CLI bumped 0.2.1 → 1.0.0 to track SDK v2.0.0
- src/lib/env.ts: bridges legacy SGAI_TIMEOUT_S / JUST_SCRAPE_TIMEOUT_S → SGAI_TIMEOUT (renamed by SDK v2)
- README + smoke test updated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
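The `crawl` behaviour described above (poll until the job reaches a terminal state) can be sketched roughly as follows. This is an illustration only, not the code in this PR: `CrawlJob`, `pollUntilTerminal`, the `getJob` callback, the non-terminal status names, and the interval and attempt defaults are all hypothetical. Only the three terminal states (completed / failed / deleted) come from the PR description.

```typescript
// Hypothetical shapes: the real v2 SDK types are not shown in this PR.
// "pending" and "running" are assumed names for non-terminal states.
type CrawlStatus = "pending" | "running" | "completed" | "failed" | "deleted";

interface CrawlJob {
  id: string;
  status: CrawlStatus;
}

// Terminal states as listed in the PR description.
const TERMINAL: ReadonlySet<CrawlStatus> = new Set<CrawlStatus>([
  "completed",
  "failed",
  "deleted",
]);

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls getJob (a stand-in for a crawl.get request) until the job is terminal,
// waiting intervalMs between polls and giving up after maxAttempts polls.
async function pollUntilTerminal(
  getJob: () => Promise<CrawlJob>,
  intervalMs = 2000,
  maxAttempts = 150,
): Promise<CrawlJob> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await getJob();
    if (TERMINAL.has(job.status)) return job;
    await sleep(intervalMs);
  }
  throw new Error(
    `crawl job did not reach a terminal state after ${maxAttempts} polls`,
  );
}
```

Bounding the number of attempts keeps the CLI from hanging forever if a job never leaves a non-terminal state; the actual command may handle this differently.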
scrapegraph-js is pinned to a GitHub commit (PR #13 head) that ships without a prebuilt dist/, so module resolution fails on a fresh install. Build it in-place after bun install so tsc/biome/bun test can resolve it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary

Migrates `just-scrape` to the scrapegraph-js v2 SDK (head `096c110`). The v2 API consolidates endpoints and drops legacy ones; the CLI now mirrors that surface.

### Commands kept (rewritten against v2)
| Command | Notes |
| --- | --- |
| `scrape <url>` | `-f markdown,html,screenshot,branding,links,images,summary,json` (comma-separate for multi-format output) |
| `crawl <url>` | Polls `crawl.start` → `crawl.get` until completed / failed / deleted |
| `history [service] [id]` | New response shape (`data[]` + `pagination`), service filter optional |
| `credits` | `getCredits`; response exposes `remaining`, `used`, `plan`, `jobs.{crawl,monitor}` |
| `validate` | `checkHealth` (`/health`) |

### Commands added
- `extract <url> -p "…"` — structured extraction with optional `--schema`
- `search "<query>"` — web search with optional `-p` / `--schema`, `--country`, `--time-range`, `--format`
- `monitor <action>` — `create / list / get / update / delete / pause / resume / activity`

### Commands removed (not in v2 API)
| Removed | Replacement |
| --- | --- |
| `smart-scraper <url> -p "…"` | `scrape <url> -f json -p "…"` or `extract <url> -p "…"` |
| `search-scraper "…"` | `search "…"` (query is now positional; `-p` is the extraction prompt) |
| `markdownify <url>` | `scrape <url>` (markdown is the default format) |
| `scrape <url>` (raw HTML only) | `scrape <url> -f html` |
| `sitemap <url>` | `crawl` with `--include-patterns` |
| `agentic-scraper` | |
| `generate-schema` | |

### Other changes
- `package.json`: `scrapegraph-js` pinned to `github:ScrapeGraphAI/scrapegraph-js#096c110` (PR #13 head); CLI version bumped `0.2.1` → `1.0.0` to track SDK v2.0.0.
- `src/lib/env.ts`: bridges legacy `SGAI_TIMEOUT_S` / `JUST_SCRAPE_TIMEOUT_S` → `SGAI_TIMEOUT` (SDK v2 renamed the var).
- `SGAI_API_URL` default is now `https://api.scrapegraphai.com/api/v2` (baked into the SDK).
- README and smoke test updated for the v2 exports (`scrape`, `extract`, `search`, `crawl.start`, `monitor.create`, `ScrapeGraphAI`).

## Test plan
- `bun run check` — tsc + biome clean
- `bun test` — smoke test passes (v2 exports resolvable)
- `bun run build` — bundles cleanly to `dist/cli.mjs`
- `just-scrape --help` — lists 8 commands (scrape/extract/search/crawl/monitor/history/credits/validate)
- `just-scrape <cmd> --help` — all subcommand help renders correctly
- `/api/v2/*` returns 404 in prod, which is expected while the SDK PR is open

🤖 Generated with Claude Code
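For reviewers who want a picture of the timeout bridging in `src/lib/env.ts`, here is a minimal sketch of one way such a bridge could work. It is not the file's actual contents: the function name `bridgeTimeoutEnv`, the precedence order between the two legacy variables, the rule that an already-set `SGAI_TIMEOUT` wins, and the assumption that old and new variables share the same unit are all guesses made for illustration.

```typescript
type Env = Record<string, string | undefined>;

// Legacy names from this PR, in an assumed order of precedence.
const LEGACY_TIMEOUT_VARS = ["SGAI_TIMEOUT_S", "JUST_SCRAPE_TIMEOUT_S"] as const;

// Copies the first non-empty legacy timeout into SGAI_TIMEOUT, unless
// SGAI_TIMEOUT is already set. Mutates and returns the same env object.
function bridgeTimeoutEnv(env: Env): Env {
  const current = env["SGAI_TIMEOUT"];
  // Assumption: an explicitly set SGAI_TIMEOUT always takes priority.
  if (current !== undefined && current !== "") return env;

  for (const name of LEGACY_TIMEOUT_VARS) {
    const value = env[name];
    if (value !== undefined && value !== "") {
      // Copied verbatim: assumes both variables use the same unit.
      env["SGAI_TIMEOUT"] = value;
      break;
    }
  }
  return env;
}
```

A CLI would typically call something like this once at startup with `process.env`, before the SDK reads its configuration, so existing setups that only define the legacy names keep working.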