ScrAPI is your ultimate web scraping solution, offering powerful, reliable, and easy-to-use features to extract data from any website effortlessly.
Official Python SDK for the ScrAPI web scraping service.
- Website: https://scrapi.tech
- API docs: https://scrapi.tech/docs
- Source repository: https://github.com/DevEnterpriseSoftware/scrapi-sdk-python
- Installation
- Quick start (sync)
- Quick start (async)
- Scrape request options
- Browser commands
- Scrape response data
- Scrape request defaults
- Lookups
- Exceptions
- HTML helper utilities (optional)
- Sample app
- Development
- Build and publish
Install the SDK:

```shell
pip install scrapi-sdk
```

Install optional HTML helpers:

```shell
pip install "scrapi-sdk[html]"
```

Quick start (sync):

```python
from scrapi_sdk import ScrapeRequest, ScrapiClient

with ScrapiClient("YOUR_API_KEY") as client:
    response = client.scrape(ScrapeRequest("https://deventerprise.com"))
    print(response.content if response else "No response")
```

Quick start (async):

```python
import asyncio

from scrapi_sdk import AsyncScrapiClient


async def main() -> None:
    async with AsyncScrapiClient("YOUR_API_KEY") as client:
        response = await client.scrape("https://deventerprise.com")
        print(response.content if response else "No response")


asyncio.run(main())
```

All options map to ScrAPI API fields while exposing Pythonic snake_case names.
| Python field | Type | Description |
|---|---|---|
| `url` | `str` | URL to scrape. Relative inputs are normalized to `https://...`. |
| `response_format` | `ResponseFormat` | Must be `ResponseFormat.JSON` when using this SDK client. |
| `response_selector` | `str \| None` | CSS/XPath selector for response filtering. |
| `cookies` | `dict[str, str]` | Cookies sent with the target request. |
| `headers` | `dict[str, str]` | Headers sent with the target request. |
| `request_method` | `str` | HTTP method override (default `GET`). |
| `request_body_base64` | `str \| None` | Base64-encoded request payload. |
| `proxy_type` | `ProxyType` | `NONE`, `FREE`, `RESIDENTIAL`, `DATACENTER`, `TOR`, `CUSTOM`. |
| `proxy_country` | `str \| None` | Three-letter country code, e.g. `USA`. |
| `proxy_city` | `str \| None` | City key (requires `proxy_country`). |
| `custom_proxy_url` | `str \| None` | Custom proxy URL. |
| `use_browser` | `bool` | Enable browser mode. |
| `solve_captchas` | `bool` | Automatically solve captchas (browser mode only). |
| `include_screenshot` | `bool` | Include screenshot URL in response (browser mode only). |
| `include_pdf` | `bool` | Include PDF URL in response (browser mode only). |
| `include_video` | `bool` | Include video URL in response (browser mode only). |
| `accept_dialogs` | `bool` | Accept browser dialogs/popups. |
| `session_id` | `str \| None` | Reuse session context across calls. |
| `callback_url` | `str \| None` | Webhook URL called when the scrape completes. |
| `browser_commands` | `BrowserCommandList` | Ordered browser action commands. |
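
The `request_body_base64` field takes an already encoded payload. The following is a minimal sketch of preparing one with the standard library only. It assumes the API expects standard Base64 of the raw body bytes and that the body is UTF-8 JSON; the helper name `encode_request_body` is hypothetical and not part of the SDK.

```python
import base64
import json


def encode_request_body(payload: dict) -> str:
    """Serialize a payload to UTF-8 JSON and return it as standard Base64 text.

    Hypothetical helper for illustration; the SDK may offer its own way to do this.
    """
    raw = json.dumps(payload).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


body = encode_request_body({"query": "laptops", "page": 1})
print(body)

# The result would then be assigned to the fields listed in the table, e.g.:
#   request.request_method = "POST"
#   request.request_body_base64 = body
```

Keeping the encoding step in one small function makes it easy to check, by decoding the string again, that the payload sent to the target site is exactly what was intended.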
Example:

```python
from scrapi_sdk import ProxyType, ResponseFormat, ScrapeRequest

request = ScrapeRequest("https://deventerprise.com")
request.proxy_type = ProxyType.RESIDENTIAL
request.proxy_country = "USA"
request.use_browser = True
request.solve_captchas = True
request.include_screenshot = True
request.response_format = ResponseFormat.JSON
```

When `use_browser=True`, chain browser commands with `BrowserCommandList`:
```python
from scrapi_sdk import ScrapeRequest

request = ScrapeRequest("https://www.roboform.com/filling-test-all-fields")
request.use_browser = True
request.accept_dialogs = True
request.browser_commands \
    .input("input[name='01___title']", "Mr") \
    .input("input[name='02frstname']", "Werner") \
    .input("input[name='04lastname']", "van Deventer") \
    .select("select[name='40cc__type']", "Discover") \
    .wait(3000) \
    .wait_for("input[type='reset']") \
    .click("input[type='reset']") \
    .wait(1000) \
    .scroll(1000) \
    .evaluate("console.log('any valid code...')")
```

`ScrapeResponse` includes all API response details.
```python
# Assumes `client` is an open ScrapiClient, as in the quick start.
response = client.scrape("https://deventerprise.com")
if response:
    print(response.request_url)
    print(response.response_url)
    print(response.duration)
    print(response.attempts)
    print(response.credits_used)
    print(response.status_code)
    print(response.screenshot_url)
    print(response.pdf_url)
    print(response.video_url)
    print(response.content)
    print(response.content_hash)  # SHA1 of UTF-16LE content to match .NET SDK parity.

    for captcha_name, solved_count in response.captchas_solved.items():
        print(f"{captcha_name}: {solved_count}")

    for key, value in response.headers.items():
        print(f"{key}: {value}")

    for key, value in response.cookies.items():
        print(f"{key}: {value}")

    for message in response.error_messages or []:
        print(message)
```

If `beautifulsoup4` is installed, `response.html` returns a parsed `BeautifulSoup` object.
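
The comment on `content_hash` in the example above describes it as a SHA1 over the UTF-16LE bytes of the content. The sketch below shows how such a hash could be recomputed locally with `hashlib`, for instance to detect whether a page changed between scrapes. The function name `sha1_utf16le` is hypothetical, and the output format (lowercase hex here) is an assumption; the SDK may use a different casing or encoding.

```python
import hashlib


def sha1_utf16le(content: str) -> str:
    """Return the SHA1 hex digest of the text encoded as UTF-16LE (no BOM).

    Hypothetical helper; hex output is an assumption, not confirmed SDK behavior.
    """
    return hashlib.sha1(content.encode("utf-16-le")).hexdigest()


previous = sha1_utf16le("<html><body>v1</body></html>")
current = sha1_utf16le("<html><body>v2</body></html>")
print("changed" if previous != current else "unchanged")
```

Note that UTF-16LE encodes each ASCII character as two bytes, so this digest differs from a SHA1 over the UTF-8 bytes of the same text.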
ScrapeRequestDefaults applies defaults to every new ScrapeRequest.
```python
from scrapi_sdk import ProxyType, ScrapeRequest, ScrapeRequestDefaults

ScrapeRequestDefaults.proxy_type = ProxyType.RESIDENTIAL
ScrapeRequestDefaults.use_browser = True
ScrapeRequestDefaults.solve_captchas = True
ScrapeRequestDefaults.headers["Sample"] = "Custom-Value"

request = ScrapeRequest("https://deventerprise.com")
request.proxy_type = ProxyType.TOR  # explicit override

assert request.proxy_type == ProxyType.TOR
assert request.use_browser is True
assert request.solve_captchas is True
assert request.headers["Sample"] == "Custom-Value"
```

Lookups:

```python
balance = client.get_credit_balance()
print(balance)
```

```python
countries = client.get_supported_countries()
for country in countries:
    print(country.key, country.name, country.proxy_count)
```

```python
cities = client.get_supported_cities("USA")
for city in cities:
    print(city.key, city.name, city.proxy_count)
```

Any client/API errors are raised as `ScrapiException` with HTTP status code details.
```python
from scrapi_sdk import ScrapeRequest, ScrapiClient, ScrapiException

with ScrapiClient("YOUR_API_KEY") as client:
    try:
        response = client.scrape(ScrapeRequest("https://deventerprise.com"))
    except ScrapiException as ex:
        print(f"Error ({ex.status_code}): {ex}")
        raise
```

Install the optional dependency first:

```shell
pip install "scrapi-sdk[html]"
```

Helpers exported from `scrapi_sdk`:
- `numbers_only(text, include_decimal_points=False, trim=True)`
- `html_with_no_script(html)`
- `next_element(node)`
- `is_visible(node, check_parent_nodes=True)`
Example:
```python
from scrapi_sdk import html_with_no_script, numbers_only

print(numbers_only("USD 1,299.95", include_decimal_points=True))
print(html_with_no_script("<p>safe</p><script>alert(1)</script>"))
```

A runnable sample app is included at `examples/basic_scrape/main.py`.
It reads SCRAPI_API_KEY and scrapes https://deventerprise.com.
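
How the sample app loads the key is not shown here. As a rough sketch under that caveat, an API key can be read from the `SCRAPI_API_KEY` environment variable with a clear failure when it is missing; the function name `load_api_key` is hypothetical and not taken from the sample.

```python
import os


def load_api_key(env_var: str = "SCRAPI_API_KEY") -> str:
    """Read the API key from the environment, failing early if it is unset or blank.

    Hypothetical helper; the bundled sample may handle this differently.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key


# Typical use (not executed here):
#   with ScrapiClient(load_api_key()) as client:
#       ...
```

Reading the key from the environment keeps it out of source control, which matters for a sample that is meant to be copied.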
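Set up a development environment and run the tests:

```shell
python -m venv .venv
. .venv/Scripts/activate  # Windows PowerShell: .venv\Scripts\Activate.ps1; Linux/macOS: . .venv/bin/activate
pip install -e ".[dev,html]"
pytest
```

Build and check the distribution:

```shell
python -m pip install --upgrade pip build twine
python -m build
python -m twine check dist/*
```

Upload to TestPyPI:

```powershell
# PowerShell
$env:TWINE_USERNAME="__token__"
$env:TWINE_PASSWORD="pypi-..."
python -m twine upload -r testpypi dist/*
```

Upload to PyPI:

```powershell
# PowerShell
$env:TWINE_USERNAME="__token__"
$env:TWINE_PASSWORD="pypi-..."
python -m twine upload dist/*
```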