
feat: migrate SDK to v2 API #13

Open
FrancescoSaverioZuppichini wants to merge 22 commits into main from feat/v2-migration

Conversation


FrancescoSaverioZuppichini (Member) commented Apr 14, 2026

Summary

  • Complete SDK rewrite for v2 API endpoints
  • Add ScrapeGraphAI({ apiKey? }) client factory (reads SGAI_API_KEY from env)
  • Add crawl namespace (start, get, stop, resume, delete)
  • Add monitor namespace (create, list, get, update, delete, pause, resume)
  • Add comprehensive Zod schemas matching API exactly
  • Remove generateSchema (no longer in API)
  • Rename: getCredits → credits, checkHealth → healthy

Usage

import { ScrapeGraphAI } from "scrapegraph-js";

// reads SGAI_API_KEY from env, or pass explicitly: ScrapeGraphAI({ apiKey: "..." })
const sgai = ScrapeGraphAI();

// scrape - minimal (defaults to markdown)
const { data } = await sgai.scrape({ url: "https://example.com" });

// scrape - multiple formats
const { data: multi } = await sgai.scrape({ 
  url: "https://example.com", 
  formats: [
    { type: "markdown", mode: "reader" },
    { type: "screenshot", fullPage: true },
    { type: "json", prompt: "Extract product info" },
  ],
  fetchConfig: { mode: "js", stealth: true },
});

// extract
const { data: extracted } = await sgai.extract({ url: "https://example.com", prompt: "Extract prices" });

// search
const { data: results } = await sgai.search({ query: "best tools 2024", numResults: 5 });

// crawl
const { data: crawl } = await sgai.crawl.start({ url: "https://example.com", maxPages: 10 });
await sgai.crawl.get(crawl.id);
await sgai.crawl.stop(crawl.id);

// monitor
const { data: mon } = await sgai.monitor.create({ url: "...", interval: "0 * * * *", formats: [...] });
await sgai.monitor.pause(mon.cronId);

// utilities
await sgai.credits();
await sgai.healthy();
await sgai.history.list({ service: "scrape", limit: 10 });

Breaking Changes

  • All function signatures changed to match v2 API
  • Old v1 functions removed (smartScraper, markdownify, searchScraper, sitemap, etc.)
  • generateSchema removed
  • getCredits → credits, checkHealth → healthy
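
Sketched as a minimal migration for callers (only the factory and the two renames confirmed above; the old v1 argument shapes shown in comments are assumptions and may not match the previous SDK exactly):

```typescript
// Before (v1, removed) — standalone functions:
//   const credits = await getCredits(apiKey);
//   const ok = await checkHealth(apiKey);

// After (v2) — client factory plus renamed methods:
import { ScrapeGraphAI } from "scrapegraph-js";

const sgai = ScrapeGraphAI({ apiKey: "sgai-..." }); // or ScrapeGraphAI() to read SGAI_API_KEY
const credits = await sgai.credits();  // was getCredits
const health = await sgai.healthy();   // was checkHealth
```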

Test plan

  • bun run format - no fixes needed
  • bun run lint - passes
  • bunx tsc --noEmit - passes
  • bun run build - passes
  • bun test - 52 unit tests pass
  • bun run test:integration - 10 integration tests pass (includes format variations)

🤖 Generated with Claude Code

- Add all API request/response schemas matching v2 API exactly
- Remove llmConfig from schemas (not exposed in SDK)
- Add comprehensive types for all endpoints

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add scrape, extract, search, generateSchema endpoints
- Add crawl namespace: start, get, stop, resume, delete
- Add monitor namespace: create, list, get, update, delete, pause, resume
- Add getCredits, checkHealth, getHistory, getHistoryEntry
- Export schemas for client-side validation
- Add zod dependency

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace cancelled with deleted in ApiCrawlStatus
- Add deleted to ApiHistoryStatus
- Move types from src/types/index.ts to src/types.ts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update unit tests for new SDK structure
- Add integration tests for live API
- Fix schemas to use deleted instead of cancelled
- Move types.ts out of folder

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…types

- Add unit tests for all scrape formats (markdown, html, json, screenshot, summary, branding, links, images)
- Add tests for fetchConfig options (mode, stealth, timeout, headers, cookies, country, scrolls)
- Add tests for PDF/DOCX/image document scraping with OCR
- Add extract tests for URL, HTML, and markdown inputs with schema
- Add search tests with filters (location, timeRange, numResults)
- Add crawl/monitor tests with full config options
- Fix types to use z.input for request types (allows omitting fields with defaults)
- Remove obsolete v1 integration_test.ts

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Follow namespace pattern consistent with crawl.* and monitor.*

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Rename integration tests to *.spec.ts (excluded from CI)
- `bun run test` runs only *.test.ts (unit tests for CI)
- `bun run test:integration` runs *.spec.ts (live API tests)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
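
One plausible package.json shape for the split described above (the script names come from the commit; the filter arguments are assumptions — `bun test` treats positional arguments as path filters, so matching on the file suffix keeps unit and integration runs separate):

```json
{
  "scripts": {
    "test": "bun test .test.ts",
    "test:integration": "bun test .spec.ts"
  }
}
```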
- Remove old v1 examples (smartscraper, markdownify, searchscraper, sitemap, agenticscraper)
- Add scrape examples (basic, multi-format, pdf, fetchConfig)
- Add extract examples (basic, with-schema)
- Add search examples (basic, with-extraction)
- Add monitor examples (basic, with-webhook)
- Update crawl examples for namespace API
- Update schema examples for camelCase fields
- Update utilities for v2 response shapes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update all API documentation for v2 endpoints
- Add examples table with path and description
- Add scrape_json_extraction example
- Enhance scrape_pdf and scrape_multi_format examples
- Update environment variables section

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove process.exit() from crawl example, use if/else instead
- Fix non-null assertion in crawl example
- Fix undefined variable references in README crawl section
- Use consistent example.com URLs across all examples

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Require SGAI_API_KEY env var instead of fallback

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
VinciGit00 added a commit to ScrapeGraphAI/docs-mintlify that referenced this pull request Apr 14, 2026
Rewrite all JavaScript code examples to match the new v2 SDK API from
ScrapeGraphAI/scrapegraph-js#13. Key changes:
- Replace factory pattern (scrapegraphai({ apiKey })) with direct imports
- All functions use (apiKey, params) signature
- scrape() uses formats array instead of single format string
- Return type is ApiResult<T> with status check, not throw-on-error
- crawl.status() renamed to crawl.get(), crawl.delete() added
- monitor.create() uses formats array, not prompt
- Restore generateSchema and checkHealth in docs
- Schema params use JSON objects, not Zod instances
- history is now history.list() and history.get()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
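
The "ApiResult<T> with status check, not throw-on-error" pattern referenced above can be sketched as a discriminated union (this is an illustrative standalone example, not the SDK's actual type definitions): callers branch on `status` and TypeScript narrows the type, instead of wrapping calls in try/catch.

```typescript
// Hypothetical result type: success and error are distinguished by `status`.
type ApiResult<T> =
  | { status: "success"; data: T }
  | { status: "error"; error: { message: string } };

// A stand-in function returning ApiResult instead of throwing.
function divide(a: number, b: number): ApiResult<number> {
  if (b === 0) {
    return { status: "error", error: { message: "division by zero" } };
  }
  return { status: "success", data: a / b };
}

const result = divide(10, 2);
if (result.status === "success") {
  console.log(result.data); // narrowed to the success branch, so .data is safe
} else {
  console.error(result.error.message); // narrowed to the error branch
}
```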
- Add ScrapeGraphAI({ apiKey? }) factory that reads SGAI_API_KEY from env
- Rename client methods: getCredits → credits, checkHealth → healthy
- Remove generateSchema (no longer in API)
- Update all examples to use new client pattern
- Update README with client usage

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use new client pattern instead of standalone functions
- Add test for scrape with no formats (defaults to markdown)
- Rename tests for clarity

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>