feat: sync CLI with scrapegraph-js v2 PR #13 head #14

Closed
VinciGit00 wants to merge 11 commits into main from feat/sync-sdk-v2-pr13-head

@VinciGit00
Member

Summary

Syncs just-scrape with the latest commits on scrapegraph-js#13 (head 096c110), picking up changes made after the previous pin (0738786):

  • New monitor.activity endpoint (scrapegraph-js 096c110) — paginated tick history with diffs
  • SGAI_TIMEOUT_S → SGAI_TIMEOUT env var rename (scrapegraph-js 2eba148)
  • Default base URL now https://api.scrapegraphai.com/api/v2 (baked into SDK)
  • Health endpoint path fixed to /health (relative to /api/v2)
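
The paginated tick history from the first bullet could be consumed with a cursor loop along these lines. This is only a sketch: the `ActivityPage` shape (`items`, `nextCursor`) and the `fetchPage` callback are hypothetical stand-ins rather than the SDK's confirmed response types, and an in-memory fake takes the place of the real endpoint.

```typescript
// Hypothetical page shape; the real monitor.activity response may differ.
interface ActivityPage<T> {
  items: T[];
  nextCursor?: string | undefined; // assumed: absent on the last page
}

type FetchPage<T> = (opts: {
  limit: number;
  cursor?: string | undefined;
}) => Promise<ActivityPage<T>>;

// Walk every page by feeding each returned cursor into the next request.
async function collectAll<T>(fetchPage: FetchPage<T>, limit: number): Promise<T[]> {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  const all: T[] = [];
  let cursor: string | undefined;
  do {
    const page = await fetchPage({ limit, cursor });
    all.push(...page.items);
    cursor = page.nextCursor;
  } while (cursor !== undefined);
  return all;
}

// In-memory stand-in for the endpoint: the cursor is an offset encoded as a string.
function fakeActivity(ticks: string[]): FetchPage<string> {
  return async ({ limit, cursor }) => {
    const start = cursor === undefined ? 0 : Number(cursor);
    const end = start + limit;
    return {
      items: ticks.slice(start, end),
      nextCursor: end < ticks.length ? String(end) : undefined,
    };
  };
}
```

Swapping `fakeActivity` for a thin wrapper around the real SDK call would be the only change needed, provided the actual response exposes some equivalent of a next-page cursor.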

CLI changes

  • just-scrape monitor activity --id <id> [--limit N] [--cursor C] — new action calling sgai.monitor.activity()
  • src/lib/env.ts bridges legacy SGAI_TIMEOUT_S (and JUST_SCRAPE_TIMEOUT_S) → SGAI_TIMEOUT so nothing breaks for existing users
  • package.json bumps scrapegraph-js pin 0738786 → 096c110, and CLI version 0.3.0 → 1.0.0 to track SDK v2.0.0
  • README.md: documents the new activity action, updates env-var table, adds SGAI_DEBUG
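
The legacy env-var bridge described above might look roughly like the following. This is a sketch, not the contents of src/lib/env.ts: the function name `bridgeTimeoutEnv`, the lookup order between the two legacy names, and the rule that an already-set `SGAI_TIMEOUT` wins are all assumptions.

```typescript
type Env = Record<string, string | undefined>;

// Legacy names taken from the PR description; their relative priority is assumed.
const LEGACY_TIMEOUT_VARS = ["SGAI_TIMEOUT_S", "JUST_SCRAPE_TIMEOUT_S"] as const;

// Copy the first non-empty legacy value into SGAI_TIMEOUT,
// leaving the new name untouched when it is already set.
function bridgeTimeoutEnv(env: Env): Env {
  if (env["SGAI_TIMEOUT"] !== undefined) return env;
  for (const name of LEGACY_TIMEOUT_VARS) {
    const value = env[name];
    if (value !== undefined && value !== "") {
      env["SGAI_TIMEOUT"] = value;
      break;
    }
  }
  return env;
}
```

Calling `bridgeTimeoutEnv(process.env)` once at startup, before the SDK reads its configuration, would be the natural place for it in a Node or Bun CLI.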

Test plan

  • bun run check — tsc + biome clean
  • bun run build — bundles successfully
  • bun run dev --help / monitor --help show the new activity action and --limit / --cursor flags
  • Live smoke test against a real monitor ID once the SDK PR is merged

Docs follow-up in docs-mintlify#39.

🤖 Generated with Claude Code

VinciGit00 and others added 11 commits March 31, 2026 11:48
Align the CLI with ScrapeGraphAI/scrapegraph-js#11 (v2 SDK migration):

- Rename smart-scraper → extract, search-scraper → search
- Remove commands dropped from the API: agentic-scraper, generate-schema, sitemap, validate
- Add client factory (src/lib/client.ts) using the new scrapegraphai({ apiKey }) pattern
- Update scrape command with --format flag (markdown, html, screenshot, branding)
- Update crawl to use crawl.start/status polling lifecycle
- Update history to use v2 service names and parameters
- All commands now use try/catch (v2 throws on error) and self-timed elapsed

BREAKING CHANGE: CLI commands have been renamed and removed to match the v2 API surface.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
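
The try/catch plus self-timed elapsed pattern from the commit above can be sketched as a small wrapper. The `timed` helper and the `TimedResult` shape are hypothetical names for illustration, not code from the CLI.

```typescript
// Result of a wrapped call: either the data or an error message, plus elapsed time.
type TimedResult<T> =
  | { ok: true; data: T; elapsedMs: number }
  | { ok: false; error: string; elapsedMs: number };

// Run an async call, catching thrown errors and measuring wall-clock time locally.
async function timed<T>(call: () => Promise<T>): Promise<TimedResult<T>> {
  const start = Date.now();
  try {
    const data = await call();
    return { ok: true, data, elapsedMs: Date.now() - start };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { ok: false, error, elapsedMs: Date.now() - start };
  }
}
```

Each command could then branch on `ok` to print either the payload or the error alongside the elapsed time.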
Aligns CLI with scrapegraph-js v2 SDK change (b570a57) that replaced
stealth/render booleans with a unified fetch mode enum:
auto, fast, js, direct+stealth, js+stealth.

- All commands: --stealth boolean → --mode <mode> string
- Pin SDK to commit b570a57 (includes fetch mode change)
- Update README and SKILL.md with new flag syntax

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Bump scrapegraph-js pin b570a57 → c5bf757
- scrape: support 8 formats (markdown, html, screenshot, branding,
  links, images, summary, json), multi-format via comma-separated -f,
  add --html-mode, --scrolls, --prompt/--schema for json format
- search: add --location-geo-code, --time-range, --format
- crawl: add --format flag
- README: document new flags and formats

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
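
Parsing a comma-separated `-f` value into the eight formats listed in the commit above could be done along these lines. This is a sketch: the helper name `parseFormats`, the whitespace trimming, the de-duplication, and the error messages are assumptions, not the CLI's actual behavior.

```typescript
// The eight formats named in the commit message.
const SCRAPE_FORMATS = [
  "markdown", "html", "screenshot", "branding",
  "links", "images", "summary", "json",
] as const;
type ScrapeFormat = (typeof SCRAPE_FORMATS)[number];

// Type guard so validated strings narrow to the ScrapeFormat union.
function isScrapeFormat(value: string): value is ScrapeFormat {
  return (SCRAPE_FORMATS as readonly string[]).indexOf(value) !== -1;
}

// Split on commas, trim, reject unknown names, and drop duplicates while keeping order.
function parseFormats(raw: string): ScrapeFormat[] {
  const out: ScrapeFormat[] = [];
  for (const part of raw.split(",")) {
    const name = part.trim();
    if (name === "") continue;
    if (!isScrapeFormat(name)) throw new Error(`Unknown format: ${name}`);
    if (out.indexOf(name) === -1) out.push(name);
  }
  if (out.length === 0) throw new Error("At least one format is required");
  return out;
}
```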
The SDK is pinned to a GitHub commit (not on npm yet) and ships without a
prebuilt dist/, so module resolution fails right after bun install. Build it
as a post-install CI step until v2 lands on npm.

Also rewrite tests/smoke.test.ts — the old test still imported the v1
symbols (smartScraper, HISTORY_SERVICES) that no longer exist; replace
with a sanity check against the v2 scrapegraphai() factory.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split compound fetch modes (direct+stealth, js+stealth) into separate
--mode (auto|fast|js) and --stealth boolean flag. Add --nationality
param to search command. Update SDK dependency to latest PR commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
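
After the split described in the commit above, flag handling could validate `--mode` against the three base modes and carry `--stealth` as a separate boolean. This is a sketch: `resolveFetchFlags`, the `FetchFlags` shape, and the `auto` default are assumptions for illustration.

```typescript
// Base fetch modes named in the commit; stealth is now a separate flag.
const FETCH_MODES = ["auto", "fast", "js"] as const;
type FetchMode = (typeof FETCH_MODES)[number];

interface FetchFlags {
  mode: FetchMode;
  stealth: boolean;
}

// Validate the raw --mode string and combine it with the --stealth boolean.
function resolveFetchFlags(mode: string | undefined, stealth: boolean | undefined): FetchFlags {
  const chosen = mode ?? "auto"; // assumed default when --mode is omitted
  const match = FETCH_MODES.find((m) => m === chosen);
  if (match === undefined) {
    const hint = chosen.indexOf("+stealth") !== -1 ? " (use --stealth as a separate flag)" : "";
    throw new Error(`Invalid mode "${chosen}"${hint}; expected one of ${FETCH_MODES.join(", ")}`);
  }
  return { mode: match, stealth: stealth === true };
}
```

Rejecting the old compound strings with a pointed hint, rather than silently translating them, keeps the mapping for `direct+stealth` out of the code, since the source does not say which base mode it corresponds to.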
Import ApiExtractOptions, ApiScrapeOptions, ApiSearchOptions, and
ApiCrawlOptions from scrapegraph-js to satisfy biome noExplicitAny rule.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The SDK's Zod-inferred types have strict required fields (from
.default()) that don't match partial CLI arg construction. Allow
`as any` in src/commands/ where we bridge string args to the SDK.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace SDK factory with raw function imports (scrape, extract, search, crawl, monitor, history, getCredits)
- Add monitor command (create, list, get, update, delete, pause, resume)
- Update crawl to use formats array and crawl.get instead of crawl.status
- Update history to use history.list/history.get with new pagination response
- Update search to pass query in params, remove nationality flag
- Update extract to pass url in params
- Make history service filter optional
- Update README with monitor docs and v2 migration notes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- bump scrapegraph-js to latest PR HEAD (adds monitor.activity, renames SGAI_TIMEOUT_S → SGAI_TIMEOUT, bakes /api/v2 into default base URL)
- add `just-scrape monitor activity --id <id> [--limit] [--cursor]` for paginated tick history
- bridge legacy SGAI_TIMEOUT_S (and JUST_SCRAPE_TIMEOUT_S) to new SGAI_TIMEOUT
- README: document activity command, update default base URL, note SGAI_DEBUG
- bump CLI to v1.0.0 to track SDK v2.0.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
VinciGit00 added a commit to ScrapeGraphAI/docs-mintlify that referenced this pull request Apr 15, 2026
Brings the CLI docs in line with the CLI changes in
ScrapeGraphAI/just-scrape#14 (which pulls in
scrapegraph-js v2 PR #13 head 096c110):

- Document the full `just-scrape monitor` action set, including the new
  `monitor activity --id <id> [--limit] [--cursor]` for paginated tick history
- Replace stale `-m direct+stealth` / `-m js+stealth` with real CLI syntax
  (`-m js --stealth`, fetch modes: auto/fast/js)
- Env vars: `SGAI_TIMEOUT_S` → `SGAI_TIMEOUT`, default base URL now
  `https://api.scrapegraphai.com/api/v2`, document `SGAI_DEBUG`
- Credits example uses `.remaining` (v2 response shape)
- Add `schema` to the history services list
- Fix `--location-geo-code` → `--country` in search example
- Add monitor usage examples (webhook, activity, jq filter for changes)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@VinciGit00
Member Author

Closing to re-open once the command-file consolidation refactor lands on this branch.

@VinciGit00 VinciGit00 closed this Apr 15, 2026