diff --git a/README.md b/README.md
index 24cef390..560124f6 100644
--- a/README.md
+++ b/README.md
@@ -1,47 +1,131 @@
-Apify SDK for Python
+Apify SDK for Python
-
+The official Python SDK for building Apify Actors.
-The Apify SDK for Python is the official library to create [Apify Actors](https://docs.apify.com/platform/actors)
-in Python. It provides useful features like Actor lifecycle management, local storage emulation, and Actor
-event handling.
+
+`apify` is the official SDK for building [Apify Actors](https://docs.apify.com/platform/actors) in Python. Actors are serverless programs that run on the [Apify platform](https://apify.com), where you can scale them, schedule them, and monetize them. The SDK handles the Actor lifecycle, [storage](https://docs.apify.com/platform/storage) access, platform events, [Apify Proxy](https://docs.apify.com/platform/proxy), and pay-per-event charging.
-If you just need to access the [Apify API](https://docs.apify.com/api/v2) from your Python applications,
-check out the [Apify Client for Python](https://docs.apify.com/api/client/python) instead.
+> If you only need to **consume** the [Apify API](https://docs.apify.com/api/v2) from Python (running Actors, reading datasets, managing storages) rather than building Actors, use the [Apify API client for Python](https://docs.apify.com/api/client/python) instead. It comes bundled with this SDK.
+
+## Table of contents
+
+- [Installation](#installation)
+- [Quick start](#quick-start)
+- [What are Actors?](#what-are-actors)
+- [Features](#features)
+- [What you can build](#what-you-can-build)
+- [Usage examples](#usage-examples)
+- [Documentation](#documentation)
+- [Related projects](#related-projects)
+- [Support and community](#support-and-community)
+- [Contributing](#contributing)
+- [License](#license)
## Installation
-The Apify SDK for Python is available on PyPI as the `apify` package.
-For default installation, using Pip, run the following:
+The Apify SDK for Python requires **Python 3.11 or higher**. It is published on [PyPI](https://pypi.org/project/apify/) as the `apify` package and can be installed with [pip](https://pip.pypa.io/):
```bash
pip install apify
```
-For users interested in integrating Apify with Scrapy, we provide a package extra called `scrapy`.
-To install Apify with the `scrapy` extra, use the following command:
+or with [uv](https://docs.astral.sh/uv/):
```bash
-pip install apify[scrapy]
+uv add apify
```
-## Documentation
+To use the Scrapy integration, install the `scrapy` extra:
+
+```bash
+pip install 'apify[scrapy]'
+```
+
+## Quick start
+
+An Actor is a Python program that runs inside the `async with Actor:` context. The context initializes the Actor when it starts and tears it down when it finishes. Here's a minimal Actor that reads its input and stores a result:
+
+```python
+from apify import Actor
+
+
+async def main() -> None:
+    async with Actor:
+        actor_input = await Actor.get_input()
+        Actor.log.info('Actor input: %s', actor_input)
+        await Actor.set_value('OUTPUT', 'Hello, world!')
+```
+
+The quickest way to scaffold a full Actor project, with the `.actor` configuration, input schema, and Dockerfile already in place, is the [Apify CLI](https://docs.apify.com/cli):
+
+1. Install the CLI:
+
+   ```bash
+   npm install -g apify-cli
+   ```
+
+2. Create a new Actor from the Python "getting started" template:
+
+   ```bash
+   apify create my-actor --template python-start
+   ```
-For usage instructions, check the documentation on [Apify Docs](https://docs.apify.com/sdk/python/).
+3. Run it locally:
-## Examples
+   ```bash
+   cd my-actor
+   apify run
+   ```
-Below are few examples demonstrating how to use the Apify SDK with some web scraping-related libraries.
+To create, run, and deploy your first Actor step by step, see the [Quick start guide](https://docs.apify.com/sdk/python/docs/quick-start).
-### Apify SDK with HTTPX and BeautifulSoup
+## What are Actors?
+
+Actors are serverless cloud programs that can do almost anything a human can do in a web browser. They range from small tasks, such as filling in forms or unsubscribing from online services, all the way up to scraping and processing vast numbers of web pages.
+
+They run either locally or on the [Apify platform](https://docs.apify.com/platform/), where you can run them at scale, monitor them, schedule them, or publish and monetize them. If you're new to Apify, learn [what Apify is](https://docs.apify.com/platform/about) in the platform documentation.
+
+## Features
+
+- Run the full Actor lifecycle inside `async with Actor:`, covering init, exit, failures, status messages, and reboots ([Actor lifecycle](https://docs.apify.com/sdk/python/docs/concepts/actor-lifecycle)).
+- Read Actor input validated against your input schema with `Actor.get_input()` ([Actor input](https://docs.apify.com/sdk/python/docs/concepts/actor-input)).
+- Read and write datasets, key-value stores, and request queues, locally or on the platform ([Working with storages](https://docs.apify.com/sdk/python/docs/concepts/storages)).
+- React to platform events such as system info, migration, and abort ([Actor events](https://docs.apify.com/sdk/python/docs/concepts/actor-events)).
+- Route requests through Apify Proxy with group selection, country targeting, and rotation ([Proxy management](https://docs.apify.com/sdk/python/docs/concepts/proxy-management)).
+- Start, call, abort, and metamorph other Actors and tasks, and attach webhooks to run events ([Interacting with other Actors](https://docs.apify.com/sdk/python/docs/concepts/interacting-with-other-actors), [Webhooks](https://docs.apify.com/sdk/python/docs/concepts/webhooks)).
+- Monetize your Actor with pay-per-event charging ([Pay-per-event](https://docs.apify.com/sdk/python/docs/concepts/pay-per-event)).
+- Reach the full [Apify API](https://docs.apify.com/api/v2) through a preconfigured `ApifyClient` ([Accessing the Apify API](https://docs.apify.com/sdk/python/docs/concepts/access-apify-api)).
+
+## What you can build
+
+Almost any Python project can become an Actor, including projects for:
+
+- **Web scraping and crawling** — The SDK is fully compatible with [Crawlee](https://crawlee.dev/python), which makes Apify a natural place to deploy and scale your crawlers (see the [Crawlee guide](https://docs.apify.com/sdk/python/docs/guides/crawlee)). It also works with other popular scraping libraries, such as [Scrapy](https://docs.apify.com/sdk/python/docs/guides/scrapy), [Scrapling](https://docs.apify.com/sdk/python/docs/guides/scrapling), or [Crawl4AI](https://docs.apify.com/sdk/python/docs/guides/crawl4ai).
+- **Browser automation** — Drive a real browser with [Playwright](https://docs.apify.com/sdk/python/docs/guides/playwright) or [Selenium](https://docs.apify.com/sdk/python/docs/guides/selenium), or with higher-level tools such as [Browser Use](https://docs.apify.com/sdk/python/docs/guides/browser-use).
+- **Web servers and APIs** — Run a [web server](https://docs.apify.com/sdk/python/docs/guides/running-webserver) inside an Actor to serve HTTP requests, for example to expose your scraper as a live API.
+- **AI agents** — Host agents built with your framework of choice. Ready-made Actor templates cover [PydanticAI](https://apify.com/templates/python-pydanticai), [CrewAI](https://apify.com/templates/python-crewai), [LangGraph](https://apify.com/templates/python-langgraph), [LlamaIndex](https://apify.com/templates/python-llamaindex-agent), and [Smolagents](https://apify.com/templates/python-smolagents).
+- **MCP servers** — Deploy a Python MCP server as an Actor and make its tools available to any MCP client. See the [MCP server](https://apify.com/templates/python-mcp-empty) and [MCP proxy](https://apify.com/templates/python-mcp-proxy) templates.
+
+Whatever you build, the Apify SDK doesn't lock you into a particular framework. Bring the libraries you already use, and let Apify run your project in the cloud.
+
+## Usage examples
+
+The examples below show two common setups, but the same `async with Actor:` pattern works with any stack. For more, see the [guides](https://docs.apify.com/sdk/python/docs/guides/beautifulsoup-httpx).
+
+### HTTPX with BeautifulSoup
-This example illustrates how to integrate the Apify SDK with [HTTPX](https://www.python-httpx.org/) and [BeautifulSoup](https://pypi.org/project/beautifulsoup4/) to scrape data from web pages.
+Scrape pages with [HTTPX](https://www.python-httpx.org/) and [BeautifulSoup](https://pypi.org/project/beautifulsoup4/), using the Actor's request queue to track URLs:
```python
from bs4 import BeautifulSoup
@@ -52,45 +136,31 @@ from apify import Actor
async def main() -> None:
     async with Actor:
-        # Retrieve the Actor input, and use default values if not provided.
         actor_input = await Actor.get_input() or {}
         start_urls = actor_input.get('start_urls', [{'url': 'https://apify.com'}])
-        # Open the default request queue for handling URLs to be processed.
+        # Enqueue the start URLs into the default request queue.
         request_queue = await Actor.open_request_queue()
-
-        # Enqueue the start URLs.
         for start_url in start_urls:
-            url = start_url.get('url')
-            await request_queue.add_request(url)
+            await request_queue.add_request(start_url['url'])
-        # Process the URLs from the request queue.
+        # Process the queue until it's empty.
         while request := await request_queue.fetch_next_request():
             Actor.log.info(f'Scraping {request.url} ...')
-
-            # Fetch the HTTP response from the specified URL using HTTPX.
             async with AsyncClient() as client:
                 response = await client.get(request.url)
-
-            # Parse the HTML content using Beautiful Soup.
             soup = BeautifulSoup(response.content, 'html.parser')
-            # Extract the desired data.
-            data = {
+            # Push the extracted data to the default dataset.
+            await Actor.push_data({
                 'url': request.url,
-                'title': soup.title.string,
-                'h1s': [h1.text for h1 in soup.find_all('h1')],
-                'h2s': [h2.text for h2 in soup.find_all('h2')],
-                'h3s': [h3.text for h3 in soup.find_all('h3')],
-            }
-
-            # Store the extracted data to the default dataset.
-            await Actor.push_data(data)
+                'title': soup.title.string if soup.title else None,
+            })
```
-### Apify SDK with PlaywrightCrawler from Crawlee
+### Crawlee with Playwright
-This example demonstrates how to use the Apify SDK alongside `PlaywrightCrawler` from [Crawlee](https://crawlee.dev/python) to perform web scraping.
+Scrape pages with [Crawlee](https://crawlee.dev/python)'s `PlaywrightCrawler`, which handles queueing, concurrency, and the browser for you:
```python
from crawlee.crawlers import PlaywrightCrawler, PlaywrightCrawlingContext
@@ -100,83 +170,61 @@ from apify import Actor
async def main() -> None:
     async with Actor:
-        # Retrieve the Actor input, and use default values if not provided.
         actor_input = await Actor.get_input() or {}
-        start_urls = [url.get('url') for url in actor_input.get('start_urls', [{'url': 'https://apify.com'}])]
+        start_urls = [url['url'] for url in actor_input.get('start_urls', [{'url': 'https://apify.com'}])]
-        # Exit if no start URLs are provided.
-        if not start_urls:
-            Actor.log.info('No start URLs specified in Actor input, exiting...')
-            await Actor.exit()
+        crawler = PlaywrightCrawler(max_requests_per_crawl=50, headless=True)
-        # Create a crawler.
-        crawler = PlaywrightCrawler(
-            # Limit the crawl to max requests. Remove or increase it for crawling all links.
-            max_requests_per_crawl=50,
-            headless=True,
-        )
-
-        # Define a request handler, which will be called for every request.
         @crawler.router.default_handler
-        async def request_handler(context: PlaywrightCrawlingContext) -> None:
-            url = context.request.url
-            Actor.log.info(f'Scraping {url}...')
-
-            # Extract the desired data.
-            data = {
+        async def handler(context: PlaywrightCrawlingContext) -> None:
+            Actor.log.info(f'Scraping {context.request.url} ...')
+            await context.push_data({
                 'url': context.request.url,
                 'title': await context.page.title(),
-                'h1s': [await h1.text_content() for h1 in await context.page.locator('h1').all()],
-                'h2s': [await h2.text_content() for h2 in await context.page.locator('h2').all()],
-                'h3s': [await h3.text_content() for h3 in await context.page.locator('h3').all()],
-            }
-
-            # Store the extracted data to the default dataset.
-            await context.push_data(data)
-
-            # Enqueue additional links found on the current page.
+            })
+            # Follow links found on the page.
             await context.enqueue_links()
-        # Run the crawler with the starting URLs.
         await crawler.run(start_urls)
```
-## What are Actors?
+## Documentation
-Actors are serverless cloud programs that can do almost anything a human can do in a web browser.
-They can do anything from small tasks such as filling in forms or unsubscribing from online services,
-all the way up to scraping and processing vast numbers of web pages.
+The full SDK documentation lives at **[docs.apify.com/sdk/python](https://docs.apify.com/sdk/python)**. For the Apify platform itself, see the [Apify documentation](https://docs.apify.com/).
-They can be run either locally, or on the [Apify platform](https://docs.apify.com/platform/),
-where you can run them at scale, monitor them, schedule them, or publish and monetize them.
+| Section | What you'll find |
+|---|---|
+| [Overview](https://docs.apify.com/sdk/python/docs/overview) | What the SDK is, what Actors are, and how the pieces fit together. |
+| [Quick start](https://docs.apify.com/sdk/python/docs/quick-start) | Create, run, and deploy your first Python Actor. |
+| [Concepts](https://docs.apify.com/sdk/python/docs/concepts/actor-lifecycle) | Actor lifecycle, input, storages, events, proxy management, interacting with other Actors, webhooks, accessing the Apify API, logging, configuration, and pay-per-event. |
+| [Guides](https://docs.apify.com/sdk/python/docs/guides/beautifulsoup-httpx) | Integrations with BeautifulSoup, Parsel, Playwright, Selenium, Crawlee, Scrapy, Crawl4AI, and Browser Use, plus running a web server and using uv. |
+| [Upgrading](https://docs.apify.com/sdk/python/docs/upgrading/upgrading-to-v4) | Migrating between major versions. |
+| [API reference](https://docs.apify.com/sdk/python/reference) | Generated reference for every class and method. |
+| [Changelog](https://docs.apify.com/sdk/python/docs/changelog) | Release history and breaking changes. |
-If you're new to Apify, learn [what is Apify](https://docs.apify.com/platform/about)
-in the Apify platform documentation.
+## Related projects
-## Creating Actors
+- **[Apify API client for Python](https://docs.apify.com/api/client/python)** — talk to the Apify API directly from Python (bundled with this SDK).
+- **[Crawlee for Python](https://crawlee.dev/python)** — web scraping and browser automation framework; fully compatible with this SDK.
+- **[Apify SDK for JavaScript / TypeScript](https://docs.apify.com/sdk/js)** — the equivalent SDK for Node.js.
+- **[Apify API client for JavaScript / TypeScript](https://docs.apify.com/api/client/js)** — the equivalent API client for Node.js.
+- **[Crawlee for JavaScript / TypeScript](https://crawlee.dev)** — the original Node.js implementation of Crawlee.
+- **[Apify CLI](https://docs.apify.com/cli)** — command-line tool for creating, running, and deploying Actors locally and on the platform.
-To create and run Actors through Apify Console,
-see the [Console documentation](https://docs.apify.com/academy/getting-started/creating-actors#choose-your-template).
+## Support and community
-To create and run Python Actors locally, check the documentation for
-[how to create and run Python Actors locally](https://docs.apify.com/sdk/python/docs/quick-start).
+- **Discord** — chat with the team and other users on the [Apify Discord server](https://discord.gg/jyEM2PRvMU).
+- **GitHub issues** — report a bug or request a feature in the [issue tracker](https://github.com/apify/apify-sdk-python/issues).
-## Guides
+## Contributing
-To see how you can use the Apify SDK with other popular libraries used for web scraping,
-check out our guides for using
-[BeautifulSoup with HTTPX](https://docs.apify.com/sdk/python/docs/guides/beautifulsoup-httpx),
-[Parsel with Impit](https://docs.apify.com/sdk/python/docs/guides/parsel-impit),
-[Playwright](https://docs.apify.com/sdk/python/docs/guides/playwright),
-[Selenium](https://docs.apify.com/sdk/python/docs/guides/selenium),
-[Crawlee](https://docs.apify.com/sdk/python/docs/guides/crawlee),
-or [Scrapy](https://docs.apify.com/sdk/python/docs/guides/scrapy).
+Bug reports, fixes, and improvements are welcome! See [CONTRIBUTING.md](./CONTRIBUTING.md) for the development setup, coding standards, testing, and release process. The project uses [uv](https://docs.astral.sh/uv/) for project management and [Poe the Poet](https://poethepoet.natn.io/) as a task runner; the typical loop is:
+
+```bash
+uv run poe install-dev # install dev dependencies and git hooks
+uv run poe check-code # lint, type-check, and unit tests
+```
-## Usage concepts
+## License
-To learn more about the features of the Apify SDK and how to use them,
-check out the Usage Concepts section in the sidebar,
-particularly the guides for the [Actor lifecycle](https://docs.apify.com/sdk/python/docs/concepts/actor-lifecycle),
-[working with storages](https://docs.apify.com/sdk/python/docs/concepts/storages),
-[handling Actor events](https://docs.apify.com/sdk/python/docs/concepts/actor-events)
-or [how to use proxies](https://docs.apify.com/sdk/python/docs/concepts/proxy-management).
+Released under the [Apache License 2.0](./LICENSE).