Add v0.1.0 Python client SDK#1

Open
AlexLiu190625 wants to merge 14 commits into
xorbitsai:mainfrom
AlexLiu190625:main

@AlexLiu190625
Collaborator

Summary

Phase 1 Python client SDK for the v1 HTTP API shipped in xagent PR #384.
Eight commits land an end-to-end usable 0.1.0 release: client class,
endpoint methods, polling helpers, full exception hierarchy, dataclass
response models, 68 unit tests, e2e test scaffolding, and a complete
README.

Public surface

  • XAgentClient with env-var fallback (XAGENT_API_KEY / XAGENT_BASE_URL)
  • client.me() identity probe
  • client.tasks.create / append / get / steps / wait / run
  • RunResult bundles the final TaskInfo plus the step timeline
  • Six exception classes mapping the V1 envelope codes (InvalidAPIKey,
    AgentNotFound, TaskNotFound, TaskBusy, RateLimited,
    InternalError) plus three SDK-coined (InvalidInput,
    XAgentTransportError, TaskTimeout)

Design notes

  • base_url is required (kwarg or env); no production default is
    hard-coded while the prod endpoint is being finalized. A future
    release will add it as a non-breaking default.
  • wait() / run() terminal states mirror backend v1/tasks.py:170:
    only COMPLETED and FAILED. PAUSED stays non-terminal so
    multi-process workflows (one caller polls while another appends a
    resume turn) can observe the transition rather than returning early.
  • The sleep between polls is capped to the time remaining before the
    deadline so an unusually long poll_interval cannot overshoot the
    caller's requested wall-clock timeout.
  • Pydantic v2 is used internally for response parsing via
    TypeAdapter, but the public surface stays @dataclass(frozen=True)
    so downstream apps are not pinned to a specific pydantic version.
  • The transport layer wraps every httpx.HTTPError into
    XAgentTransportError, so every exception escaping the SDK descends
    from XAgentError.

Test plan

  • uv sync --group dev && uv run pre-commit install
  • uv run pre-commit run --all-files — ruff, mypy strict,
    codespell all pass
  • uv run pytest — 68 unit tests, hermetic, sub-second
  • uv run pytest -m e2e skips cleanly when XAGENT_API_KEY or
    XAGENT_BASE_URL is unset
  • With both env vars set against a running backend,
    uv run pytest -m e2e passes; NO_PROXY=localhost,127.0.0.1
    may be needed on macOS / corporate networks
  • CI workflow (.github/workflows/ci.yml) runs pre-commit and the
    pytest matrix on Python 3.11 and 3.14

Notes for maintainers

  • The repo currently only carries LICENSE and .gitignore; this PR
    drops in the full SDK layout under src/xagent_sdk/ plus tests, CI,
    and docs.
  • pyproject.toml pins the version to 0.1.0. Tagging a release as
    v0.1.0 after merge will activate the README install command
    (pip install "xagent-sdk @ git+https://github.com/xorbitsai/xagent-sdk-python@v0.1.0").
  • Tooling versions (ruff v0.12.3, mypy v1.19.0, pre-commit hooks
    v5.0.0, codespell v2.4.1) match the xagent backend repo so
    cross-repo lint behavior stays consistent.

Set up the initial Python tooling for the xAgent SDK client:

- pyproject.toml: hatchling build, py>=3.11, httpx + pydantic deps,
  ruff + mypy + pytest configuration, PEP 735 dev dependency group.
- .pre-commit-config.yaml: ruff-check / ruff-format / mypy / codespell
  hooks pinned to the same versions used by the xagent backend repo.
- .github/workflows/ci.yml: pre-commit gate plus pytest matrix on
  Python 3.11 and 3.14.
- src/xagent_sdk/{__init__,_version}.py: empty package exposing
  __version__ = "0.1.0" so importers and tooling have an anchor.
- tests/__init__.py: package marker so future tests are discovered.
- README.md: minimal stub describing project status and dev workflow.
- .gitignore: ignore uv.lock and .claude/.

Introduce HTTPClient in xagent_sdk._http, the internal transport layer
that future endpoint methods will share. Responsibilities are scoped
narrowly: configure an httpx.Client with the Bearer header, a
User-Agent identifying the SDK version, base_url normalization,
30s/10s timeouts, a 10-connection pool, and context-manager close.

The class returns raw httpx.Response objects. Status-code-to-exception
mapping and response parsing are deferred to subsequent commits so
this layer stays free of v1 contract assumptions.
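A rough, standard-library-only sketch of the header and URL setup this commit describes. `normalize_base_url` and `build_headers` are hypothetical names, and both the trailing-slash rule and the User-Agent format are assumptions rather than what `xagent_sdk._http` actually does.

```python
SDK_VERSION = "0.1.0"  # mirrors the __version__ anchor from the bootstrap commit


def normalize_base_url(base_url: str) -> str:
    """Strip surrounding whitespace and trailing slashes (assumed rule)."""
    cleaned = base_url.strip().rstrip("/")
    if not cleaned:
        raise ValueError("base_url must not be empty")
    return cleaned


def build_headers(api_key: str) -> dict[str, str]:
    """Bearer auth plus a User-Agent that names the SDK version."""
    return {
        "Authorization": f"Bearer {api_key}",
        # Assumed format; the real User-Agent string is not shown in this PR text.
        "User-Agent": f"xagent-sdk-python/{SDK_VERSION}",
    }
```

Keeping these two steps as pure functions is one way to let the transport stay free of v1 contract assumptions, since neither needs to know any endpoint path.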

A transport= parameter is exposed for test injection (httpx.MockTransport)
without yet adding tests; test coverage lands in a later commit.

Define the SDK's exception layer mirroring the backend's stable error
codes from V1 envelope responses.

XAgentError is the base class with code/message/http_status. Six
subclasses map 1:1 to the backend's stable codes (invalid_api_key,
agent_not_found, task_not_found, task_busy, rate_limited,
internal_error). Three SDK-coined subclasses cover cases the server
does not model: InvalidInput (from FastAPI 422 {"detail": [...]}),
XAgentTransportError (network or timeout below HTTP), TaskTimeout
(reserved for wait/run local deadlines, used in a later commit).

errors.from_response() maps an httpx.Response to the right subclass.
Malformed or non-V1 bodies fall back to InternalError. HTTPClient is
extended to wrap httpx.HTTPError into XAgentTransportError so all
errors raised from this SDK descend from XAgentError.

All public types are re-exported from xagent_sdk so users can write
`from xagent_sdk import TaskBusy` etc.

Materialize the v1 success-path response types as frozen dataclasses
with module-level pydantic TypeAdapter parsers.

Two enums (TaskStatus, StepType) capture the closed status / type
vocabularies the backend ships. Five dataclasses (MeResponse,
CreateTaskResult, AppendResult, TaskInfo, Step) cover the five success
responses; AppendResult carries accepted_at (not created_at) to match
the backend wire shape, and Step.id is str (not int) carrying a
"<type>:<seq>" prefix the backend exposes for client-side de-dupe.

TypeAdapter handles ISO datetime parsing, enum coercion, and Optional
handling without bespoke conversion code. Pydantic is kept strictly
internal: the public surface is plain @dataclass(frozen=True) so the
SDK does not pin downstream apps to a particular pydantic version.
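To illustrate why a string Step.id helps with client-side de-dupe, here is a small sketch using a plain frozen dataclass. The `type` and `content` fields and the `merge_steps` helper are hypothetical; the real Step model has whatever fields the backend ships, and real parsing goes through the pydantic TypeAdapter rather than manual construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    id: str       # "<type>:<seq>", e.g. "tool_call:3" (example value is assumed)
    type: str     # hypothetical field for this sketch
    content: str  # hypothetical field for this sketch


def merge_steps(seen: list[Step], fresh: list[Step]) -> list[Step]:
    """Append only the steps whose id has not been observed yet, keeping order."""
    known = {step.id for step in seen}
    merged = list(seen)
    for step in fresh:
        if step.id not in known:
            known.add(step.id)
            merged.append(step)
    return merged
```

A caller that polls `steps()` repeatedly could pass each new page through `merge_steps` and never show a step twice. Since the dataclass is frozen, assigning to a field raises `dataclasses.FrozenInstanceError`.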

All seven new types are re-exported from xagent_sdk. The five parsers
are module-private; endpoint methods in a later commit will call them
directly.

Wire the v1 endpoints into the public SDK surface.

XAgentClient is the user-facing class. It resolves api_key and base_url
in the order: explicit kwarg -> environment variable
(XAGENT_API_KEY / XAGENT_BASE_URL) -> raise ValueError. A future
release will add a hardcoded production base_url default once the
xAgent team finalizes the prod endpoint; the resolution order is
designed to keep that addition backward-compatible.

The client owns one HTTPClient (connection pool), exposes me() for the
identity probe, and dispatches a single _request helper that maps
4xx/5xx responses to the right XAgentError subclass before returning
to callers.

TasksAPI (mounted as client.tasks) provides four methods on top of
that helper: create, append, get, steps. Both write methods take
message: str and wrap it as {"role": "user", "content": ...}
internally; the v1 contract pins role to "user" so exposing the field
would only mislead.

Docstrings call out thread-safety, the fork() caveat, the
transport= power-user knob, and that me() does not cache.

XAgentClient is re-exported from xagent_sdk. Polling helpers (wait,
run) and tests land in subsequent commits.

Add the two helpers that close the loop on single-turn task usage:

- TasksAPI.wait(task_id, timeout, poll_interval) polls GET
  /v1/chat/tasks/{id} until the task reaches a terminal status and
  returns the final TaskInfo. Raises TaskTimeout when the wall-clock
  deadline elapses. Other errors from the underlying get() propagate
  unchanged -- retry semantics are the caller's business.

- TasksAPI.run(agent_id, message, timeout, poll_interval, metadata)
  bundles create + wait + steps into one call and returns a
  RunResult carrying the final snapshot plus the full step timeline.
  Equivalent to the lower-level trio with a single deadline; use the
  trio directly for multi-turn flows.

RunResult is a frozen dataclass with .output / .status property
shortcuts over the embedded TaskInfo, re-exported from xagent_sdk.

Terminal states mirror the backend's own definition
(v1/tasks.py:170): only COMPLETED and FAILED. PAUSED is non-terminal
because the backend allows append() onto a paused task, which flips
it back to RUNNING; a wait()ing observer should see that transition
rather than return early with PAUSED.

The sleep between polls is capped to the time remaining before the
deadline so an unusually long poll_interval cannot overshoot the
caller's requested wall-clock timeout.
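A minimal sketch of such a polling loop, with the clock and sleep injected so the cap can be checked deterministically. `wait_for_terminal`, the lowercase status strings, and the local `TaskTimeout` class are stand-ins for illustration, not the SDK's code.

```python
import time
from collections.abc import Callable

# Assumed wire values; PAUSED is deliberately not in the terminal set.
TERMINAL = frozenset({"completed", "failed"})


class TaskTimeout(Exception):
    pass


def wait_for_terminal(
    get_status: Callable[[], str],
    timeout: float = 120.0,
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until a terminal status appears or the wall-clock deadline passes."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in TERMINAL:
            return status
        remaining = deadline - clock()
        if remaining <= 0:
            raise TaskTimeout(f"last observed status: {status}")
        # Never sleep past the deadline, however large poll_interval is.
        sleep(min(poll_interval, remaining))
```

With `timeout=10` and `poll_interval=4`, the sleeps in this sketch are 4, 4, and then 2 seconds, so the final poll happens at the deadline instead of 2 seconds after it.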

The wait/run defaults (timeout=120s, poll_interval=1.0s) match the
hand-off document. TaskTimeout, previously reserved by the exception
hierarchy, is now raised by wait() when its deadline elapses.

Add tests/unit/ with full coverage of the v0.1.0 surface: error
envelope parsing and exception classes (test_errors.py), pydantic
dataclass parsing and frozen behavior (test_types.py), the HTTPClient
wrapper including transport-error wrapping (test_http.py), XAgentClient
construction, env-var fallback, and the me() probe (test_client.py),
and the TasksAPI write+read endpoints plus the wait/run polling
helpers (test_tasks.py). 68 unit tests total, hermetic, ~0.5s.

Tests use httpx.MockTransport rather than pytest-httpx so they exercise
the SDK's own transport= injection point. A clean_xagent_env autouse
fixture strips XAGENT_* environment variables between tests to prevent
ambient-config bleed. Polling-helper tests check terminal-status
membership (only COMPLETED and FAILED, mirroring backend), the PAUSED
non-terminal contract, last-observed-status reporting in TaskTimeout,
propagation of underlying errors, and the sleep cap that prevents
poll_interval from overshooting the caller's wall-clock timeout.

Add tests/e2e/ scaffolding with a single smoke test marked
@pytest.mark.e2e. The fixture skips when XAGENT_API_KEY or
XAGENT_BASE_URL is unset, so CI naturally skips e2e while local
developers can run it with both env vars set.

pyproject.toml registers the e2e marker and adds `-m 'not e2e'` to
addopts so the default `pytest` invocation never tries to reach a
real backend. Remove pytest-httpx from the dev dependency group and
the mypy pre-commit hook since the suite uses MockTransport directly;
add it back if a future test needs declarative HTTP mocking.

Replace the bootstrap stub with a full README aimed at v0.1.0 users:
install (with a git-tag pin and Python 3.11+ note), quick-start
one-liner, a four-example walkthrough (identity probe, single-turn
run with tool_call step inspection, explicit multi-turn append,
error handling), an API reference table, configuration knobs,
status semantics including the PAUSED non-terminal contract, the
version policy (SemVer in 0.x means minor bumps may break; never
install from @main), and a development section with the
NO_PROXY=localhost gotcha that bites users running e2e through a
system proxy.

All code examples use prompts and outputs that have been verified
end-to-end against a real xAgent backend so the docs do not drift
from actual SDK behavior.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements the initial version of the xAgent Python SDK, providing a synchronous client for interacting with the xAgent v1 HTTP API. Key features include task lifecycle management via the TasksAPI, a robust exception hierarchy, and Pydantic-based data models. Review feedback identifies a contradiction in documentation regarding terminal task states, suggests adding validation for polling intervals, recommends improved deadline tracking in the run method to ensure strict timeout adherence, and points out an outdated comment in the error definitions.

Comment thread src/xagent_sdk/types.py Outdated
"""Lifecycle states a task can hold.

The full set the SDK may observe is fixed at 5 values; ``run()`` and
``wait()`` treat ``COMPLETED``, ``FAILED``, and ``PAUSED`` as terminal.

medium

The docstring states that wait() treats PAUSED as a terminal state, but the implementation in src/xagent_sdk/tasks.py (lines 28 and 150) explicitly excludes PAUSED from _TERMINAL_STATUSES. This contradiction should be resolved to match the intended behavior described in tasks.py.

Collaborator Author

Fixed in 3145b06 — the TaskStatus docstring now matches _TERMINAL_STATUSES (only COMPLETED and FAILED) and explains why PAUSED stays non-terminal (backend allows append() onto a paused task, which transitions it back to RUNNING).

Raises:
TaskTimeout: when ``timeout`` elapses without a terminal state.
"""
deadline = time.monotonic() + timeout

medium

The poll_interval should be validated to ensure it is a positive value. A zero or negative value will cause issues in the polling loop or when calling time.sleep().

        if poll_interval <= 0:
            raise ValueError("poll_interval must be positive")
        deadline = time.monotonic() + timeout

Collaborator Author

Fixed in 3df6de6: wait() now rejects negative timeout and negative poll_interval up front with ValueError. Zero is still accepted for both (timeout=0 polls exactly once; poll_interval=0 tight-loops via time.sleep(0)). Two unit tests added (test_negative_timeout_rejected, test_negative_poll_interval_rejected).

Comment thread src/xagent_sdk/tasks.py Outdated
Comment on lines +191 to +192
created = self.create(agent_id=agent_id, message=message, metadata=metadata)
info = self.wait(created.task_id, timeout=timeout, poll_interval=poll_interval)

medium

The run() method docstring mentions a "single deadline," but the implementation applies the timeout only to the wait() phase. The time consumed by the create() call is not subtracted from the timeout passed to wait(). To strictly adhere to a single deadline, the elapsed time should be tracked.

Suggested change
- created = self.create(agent_id=agent_id, message=message, metadata=metadata)
- info = self.wait(created.task_id, timeout=timeout, poll_interval=poll_interval)
+ start_time = time.monotonic()
+ created = self.create(agent_id=agent_id, message=message, metadata=metadata)
+ remaining = max(0.0, timeout - (time.monotonic() - start_time))
+ info = self.wait(created.task_id, timeout=remaining, poll_interval=poll_interval)

Collaborator Author

Fixed in 3df6de6: run() now captures time.monotonic() before create(), then forwards max(0.0, timeout - elapsed) to wait() so the combined create + wait wall-clock is bounded by the caller's timeout. steps() still runs after the terminal state is observed; the docstring now calls it out separately as a small additional cheap GET. Test test_shared_deadline injects a create_delay=0.1 and asserts elapsed < timeout + create_delay/2, which strictly distinguishes the new bound from the old behavior.

Comment thread src/xagent_sdk/errors.py Outdated
Comment on lines +76 to +77
Not yet raised by any code path; reserved for the polling layer added
in a later commit.

medium

This comment is outdated as the polling layer that raises TaskTimeout is implemented in this pull request (see src/xagent_sdk/tasks.py).

Collaborator Author

Fixed in 3145b06 — the TaskTimeout docstring no longer claims it is unraised. It now describes when it actually fires (TasksAPI.wait, and indirectly TasksAPI.run) and why http_status is None (the deadline is purely client-side, no HTTP exchange surfaces the failure).

…havior

Two docstring drifts surfaced in review:

- TaskStatus said wait() treats COMPLETED, FAILED, and PAUSED as
  terminal, but the polling helper added in 25c49f4 only treats
  COMPLETED and FAILED that way (PAUSED is non-terminal so callers
  observe append()-driven resume transitions). Update the enum
  docstring to match.

- TaskTimeout still carried "Not yet raised by any code path;
  reserved for the polling layer added in a later commit." That
  reservation lapsed once 25c49f4 wired wait() to raise it. Replace
  the comment with a description of when it actually fires and why
  http_status is None.

No behavior change; docstring-only fix.

…eate()

Two issues surfaced in review:

1. wait() did not validate timeout / poll_interval. A negative
   poll_interval crashes inside time.sleep() with a ValueError
   originating in stdlib, exposing the SDK's internals rather than
   failing at the boundary with a clear message. Add explicit checks
   that reject negative values up front; zero is still accepted for
   both (timeout=0 polls exactly once; poll_interval=0 tight-loops
   while yielding the GIL).

2. run()'s docstring promised a "single deadline" but the
   implementation passed the full timeout to wait() without
   subtracting the time spent in create(). For a slow create(),
   the wall-clock could exceed the user's requested timeout. Now
   run() captures monotonic time before create(), subtracts the
   elapsed from timeout, and forwards the remaining budget to
   wait(). steps() runs after the terminal state is observed and
   is documented separately as additional cheap latency.
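The shared-deadline arithmetic from item 2 can be sketched in isolation. `create` and `wait` are passed in as plain callables and the clock is injectable, so the budget calculation can be checked without a backend; `run_with_shared_deadline` and `remaining_budget` are hypothetical names.

```python
import time
from collections.abc import Callable


def remaining_budget(timeout: float, started: float, now: float) -> float:
    """Seconds left of `timeout`, never negative."""
    return max(0.0, timeout - (now - started))


def run_with_shared_deadline(
    create: Callable[[], str],
    wait: Callable[[str, float], str],
    timeout: float,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call create(), then give wait() only what is left of the caller's timeout."""
    if timeout < 0:
        # Fail at the boundary with a clear message, as item 1 describes for wait().
        raise ValueError("timeout must be non-negative")
    started = clock()
    task_id = create()
    return wait(task_id, remaining_budget(timeout, started, clock()))
```

If create() takes 3 seconds of a 10-second timeout, wait() receives 7.0. If create() alone overruns the timeout, wait() receives 0.0, which per the commit message means it polls exactly once.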

Tests:
  - test_negative_timeout_rejected / test_negative_poll_interval_rejected
    on TestWait
  - test_shared_deadline on TestRun, injecting create_delay=0.1 and
    asserting elapsed < timeout + create_delay/2 so the assertion
    strictly distinguishes the new behavior from the old

@AlexLiu190625
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request establishes the initial codebase for the xAgent Python SDK, featuring a synchronous XAgentClient for the v1 HTTP API. Key components include task lifecycle management, a robust error mapping system, and Pydantic models for type safety. The PR also sets up development tooling with pre-commit hooks and an extensive test suite. Reviewers recommended enhancing the readability of validation error messages and increasing the resilience of response parsing by adding explicit type checks for incoming data.

Comment thread src/xagent_sdk/errors.py Outdated
Comment on lines +134 to +135
if isinstance(detail, list):
return "; ".join(str(item) for item in detail)

medium

The current implementation of _format_422_detail for list inputs will produce string representations of dictionary objects (e.g., "{'msg': 'field required'}") which are not very user-friendly. Since FastAPI validation errors typically provide a msg field within each detail dictionary, it is better to extract that field specifically when available.

Suggested change
- if isinstance(detail, list):
- return "; ".join(str(item) for item in detail)
+ if isinstance(detail, list):
+ return "; ".join(str(i.get("msg", i) if isinstance(i, dict) else i) for i in detail)

Collaborator Author

Fixed in 396e334 — went one step beyond the suggestion: extract loc as a dotted path together with msg, since loc is the part that tells the caller which field failed. A standard FastAPI 422 entry now formats as body.message.content: field required instead of field required (or the raw dict repr). Falls back to msg alone when loc is missing, and to str(item) when neither is a string. Three tests cover the three branches.

``client.tasks.steps(task_id)`` so those wrapper fields are redundant
and dropped here.
"""
return _STEP_LIST_ADAPTER.validate_python(data.get("steps", []))

medium

This line assumes that data is always a dictionary. However, resp.json() can return other types such as a list or None if the server response is unexpected. Accessing .get() on a non-dictionary object will raise an AttributeError. Adding a type check ensures the SDK handles malformed or unexpected successful responses gracefully.

Suggested change
- return _STEP_LIST_ADAPTER.validate_python(data.get("steps", []))
+ steps_data = data.get("steps", []) if isinstance(data, dict) else []
+ return _STEP_LIST_ADAPTER.validate_python(steps_data)

Collaborator Author

Fixed in 396e334 — applied the suggestion plus widened the signature from dict[str, Any] to Any so the runtime check is no longer fighting the static type. Non-dict input (None, list, str, int, bool) now returns [] instead of crashing in dict.get. A parametrized test (test_non_dict_returns_empty) covers all five non-dict types.

Second-round review surfaced two more graceful-degradation gaps:

1. _format_422_detail joined str() of each list entry, producing the
   raw dict repr (e.g. "{'loc': ['body', 'message', 'content'],
   'msg': 'field required', 'type': 'missing'}") rather than the
   human-readable loc.msg form FastAPI 422 entries support. Add a
   _format_422_item helper that emits "body.message.content: field
   required" when both fields are present, falls back to msg alone
   when loc is missing, and to str(item) when neither field is a
   string. Preserves more information than the suggested msg-only
   fix because loc tells the caller which field failed, not just
   what failed.

2. _parse_steps called .get("steps", []) on its argument, which
   raises AttributeError when the server or an upstream proxy
   returns a non-dict body (list, null, etc.). Widen the signature
   to Any and short-circuit to [] when the input is not a dict, so
   malformed responses degrade gracefully instead of leaking a
   builtin exception that does not descend from XAgentError.
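Both fallbacks can be sketched without the SDK. The function names below drop the leading underscore to mark them as illustrations rather than the actual helpers, and `steps_payload` stops before the TypeAdapter validation step that the real `_parse_steps` performs.

```python
from typing import Any


def format_422_item(item: Any) -> str:
    """Format one FastAPI 422 entry as "loc.path: msg", degrading step by step."""
    if isinstance(item, dict):
        msg = item.get("msg")
        loc = item.get("loc")
        if isinstance(msg, str):
            if isinstance(loc, (list, tuple)) and loc:
                return f"{'.'.join(str(part) for part in loc)}: {msg}"
            return msg  # loc missing: msg alone
    return str(item)  # no usable msg: raw representation


def format_422_detail(detail: Any) -> str:
    if isinstance(detail, list):
        return "; ".join(format_422_item(item) for item in detail)
    return str(detail)


def steps_payload(data: Any) -> list[Any]:
    """Return the raw steps list, or [] when the body is not a dict."""
    steps = data.get("steps", []) if isinstance(data, dict) else []
    # Extra guard that is not part of this PR: a non-list "steps" also yields [].
    return steps if isinstance(steps, list) else []
```

The `loc` entries go through `str()` before joining because FastAPI can put integer list indexes in `loc` alongside field names.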

Tests:
  - test_422_detail_list now asserts the formatted message
    ("field required" for msg-only input)
  - test_422_detail_list_with_loc covers the new loc.msg path
  - test_422_detail_list_of_strings covers raw-string entries
  - test_non_dict_returns_empty (parameterized over None, list,
    str, int, bool) verifies _parse_steps no longer crashes on
    malformed responses

The upstream repository was renamed from xagent-sdk-python to
xagent-sdk and will host clients for multiple languages, each in its
own top-level directory. This commit moves the existing Python SDK
into python/ and adds the monorepo-wide bits (top-level README,
single pre-commit config, language-specific CI workflow) so other
language clients can be added later without re-organizing.

Layout changes:
- All existing SDK files (src, tests, pyproject.toml, README.md) are
  now under python/.
- A new top-level README.md serves as the monorepo navigator and
  points readers at python/README.md for SDK usage.
- .pre-commit-config.yaml moved back to repo root with file filters
  scoped to ^python/. The mypy hook is converted to a local hook that
  cd's into python/ before invoking `uv run mypy --package xagent_sdk`,
  which lets python/pyproject.toml keep mypy_path = "src" and works
  both via pre-commit and via direct invocation from python/.
- .github/workflows/ci.yml renamed to python-ci.yml with paths filters
  limiting the workflow to python/ + shared/ + the config files, and
  working-directory: python on the dependency/test steps.

Install command in python/README.md updated to reference the new
repo URL with `#subdirectory=python` so pip finds pyproject.toml in
the right place.

No SDK behavior change. All 78 unit tests still pass; e2e tests are
unaffected.

Set up shared/fixtures/v1/ as the single source of truth for the v1
HTTP wire contract across all language clients. Each JSON file holds
the raw body the server emits (no wrapping, no metadata) so a future
TypeScript / JavaScript client can drive its tests off the same files
that the Python client uses, and a wire-shape change can be made in
one place rather than once per language.

Fixtures included:
- responses/: me, create_task, append_task, task_info_completed,
  steps_full (covers all four Step types in one body).
- errors/: the six stable V1 envelope codes plus validation_422
  (FastAPI's {"detail": [...]} shape). Status codes are documented in
  shared/README.md rather than embedded in the JSON, because they are
  implicit per error code in the wire contract.

Python integration:
- python/tests/unit/_fixtures.py provides `response(name)` and
  `error_envelope(name)` helpers that resolve paths relative to the
  repo root via Path(__file__).resolve().parents[3].
- python/tests/unit/test_errors.py::TestFromResponseStableCodes now
  loads each envelope from a fixture rather than inlining a JSON
  literal. The parametrize names the fixture instead of the code,
  which is the same string by construction.
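A sketch of how helpers like these might resolve and load fixtures. It assumes one `<name>.json` file per fixture under `responses/` and `errors/`; `fixtures_root` and `load_fixture` are hypothetical names and take their paths as arguments so the logic is easy to check against a temporary directory.

```python
import json
from pathlib import Path
from typing import Any


def fixtures_root(test_file: Path) -> Path:
    """Walk up from python/tests/unit/<file> to the repo root, then into shared/.

    parents[0] is unit/, parents[1] is tests/, parents[2] is python/,
    and parents[3] is the repository root.
    """
    return test_file.resolve().parents[3] / "shared" / "fixtures" / "v1"


def load_fixture(root: Path, kind: str, name: str) -> Any:
    """Load the raw JSON body for a fixture; kind is "responses" or "errors"."""
    path = root / kind / f"{name}.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

In the test module, `response(name)` and `error_envelope(name)` could then be thin wrappers such as `load_fixture(fixtures_root(Path(__file__)), "responses", name)`.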

The remaining Python tests (frozen-dataclass behavior, env-var
fallback, polling deadlines) stay inline because they exercise
Python-specific behavior, not the wire contract.

78 unit tests still pass, hermetic, <1s. No SDK behavior change.

A long-standing backend bug (POST /v1/chat/tasks blocking until the
LLM call completes) slipped past our earlier e2e tests because they
only checked final output, not request timing. The bug surfaced when
LLM latency crept past the SDK's 30s per-request HTTP timeout and
the smoke test started failing with a generic "timed out" message
that did not point at the contract violation.

This commit adds test_create_is_async to enforce the timing contract
directly:

- A new patient_client fixture provides a 60s-timeout XAgentClient
  so the test observes POST's actual latency even when the backend
  is synchronous, rather than surfacing as a transport timeout.

- test_create_is_async measures monotonic time around tasks.create()
  and asserts (a) the response status is PENDING and (b) elapsed is
  under 5s. A correct backend implementation returns in well under
  a second; the 5s bound leaves slack for cold start and slow
  networks.

When this test fires, the failure message includes the measured
elapsed seconds and a one-paragraph explanation of the async-polling
contract it is checking, so the next person diagnosing the failure
sees the contract violation rather than a generic transport timeout.
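The shape of that check, reduced to a helper that can be exercised with a fake clock. `check_create_is_async` is a hypothetical name, `create` stands in for a call that returns the new task's status, and the lowercase `"pending"` value is an assumption about the wire format.

```python
import time
from collections.abc import Callable


def check_create_is_async(
    create: Callable[[], str],
    max_seconds: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Time one create() call and fail with a contract-focused message if slow."""
    started = clock()
    status = create()
    elapsed = clock() - started
    assert status == "pending", f"expected PENDING right after create, got {status!r}"
    assert elapsed < max_seconds, (
        f"POST took {elapsed:.2f}s; v1 contract requires async return "
        f"(bound for this check: {max_seconds:.0f}s)"
    )
    return elapsed
```

Running this through a client whose HTTP timeout is longer than the bound, as the patient_client fixture does, is what lets a blocking backend show up as a timing failure rather than a transport timeout.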

Verified against the currently-broken backend: the test fails with
"POST took 37.26s; v1 contract requires async return (typically <=1s
in practice)." That message is the failure surface we wanted.