
feat(closes OPEN-10341): add native async runner for testset batches #648

Open
gustavocidornelas wants to merge 1 commit into main from gustavo/open-10341-add-native-async-runner-for-testset-batches

@gustavocidornelas
Contributor

Pull Request

Summary

Adds native async support to OpenlayerModel.run_batch_from_df so that customers whose per-row work hits slow APIs
(~5s/row) can run testset batches concurrently instead of strictly sequentially.

Users opt in by defining run as async def run(...); the framework then drives rows through asyncio.gather gated by a semaphore. Sync run keeps today's behavior byte-for-byte.
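
The dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `run_rows_gated` and `run_batch` are hypothetical names, rows are assumed to be plain dicts passed to `run` as keyword arguments, and the real `run_batch_from_df` works on a DataFrame with additional bookkeeping.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable, Dict, List


async def run_rows_gated(
    run: Callable[..., Awaitable[Any]],
    rows: List[Dict[str, Any]],
    max_workers: int,
) -> List[Any]:
    """Hypothetical helper: await run(**row) for each row, at most max_workers at a time."""
    semaphore = asyncio.Semaphore(max_workers)

    async def run_one(row: Dict[str, Any]) -> Any:
        # The semaphore caps how many rows are inside `run` simultaneously.
        async with semaphore:
            return await run(**row)

    # gather returns results in input order, regardless of completion order.
    return await asyncio.gather(*(run_one(row) for row in rows))


def run_batch(run: Callable[..., Any], rows: List[Dict[str, Any]], max_workers: int = 4) -> List[Any]:
    """Hypothetical entry point: async run goes through the gated path, sync run stays a plain loop."""
    if inspect.iscoroutinefunction(run):
        return asyncio.run(run_rows_gated(run, rows, max_workers))
    return [run(**row) for row in rows]
```

Because `asyncio.gather` preserves input order, results can be written back to their rows by position even though rows finish out of order.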

Changes

  • run may now be defined as async def run(...). When it is, run_batch_from_df dispatches rows concurrently
    via asyncio.gather + asyncio.Semaphore(max_workers).
  • New max_workers kwarg on run_batch_from_df and batch, plus --max-workers on the CLI. Default resolves to
    4 for async run and 1 for sync run, since writing async def is the opt-in signal that interleaving is safe.
  • max_workers > 1 with a sync run raises ValueError rather than silently ignoring it.
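  • Per-row exceptions still propagate and abort the batch (fail-fast, same as today). For the async path,
    in-flight siblings are cancelled before re-raising. asyncio.gather does not cancel siblings on its own
    when one raises, so the cancellation has to be done explicitly.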
  • Extracted _run_rows_async, _apply_row_result, and _build_config helpers so the row bookkeeping is shared
    between the sync and async paths.
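
As a rough sketch of two of the rules listed above (default resolution for max_workers, and fail-fast with sibling cancellation), assuming hypothetical helper names `resolve_max_workers` and `gather_fail_fast` that do not appear in this PR. One way to get the cancel-then-re-raise behavior is to cancel the remaining tasks explicitly, since a bare `asyncio.gather` call leaves them running after the first exception. The `max_workers < 1` check is an added assumption.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable, List, Optional


def resolve_max_workers(run: Callable[..., Any], max_workers: Optional[int] = None) -> int:
    """Hypothetical helper: pick the effective concurrency for a given run callable."""
    is_async = inspect.iscoroutinefunction(run)
    if max_workers is None:
        # Defaults stated in this PR: 4 for async run, 1 for sync run.
        return 4 if is_async else 1
    if max_workers < 1:
        # Assumption: the PR does not say how non-positive values are handled.
        raise ValueError("max_workers must be at least 1")
    if max_workers > 1 and not is_async:
        raise ValueError("max_workers > 1 requires run to be defined as async def")
    return max_workers


async def gather_fail_fast(coros: List[Awaitable[Any]]) -> List[Any]:
    """Hypothetical helper: re-raise the first failure after cancelling unfinished siblings."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    try:
        return await asyncio.gather(*tasks)
    except BaseException:
        for task in tasks:
            task.cancel()  # no-op for tasks that already finished
        # Wait for the cancellations to settle so no task is left pending.
        await asyncio.gather(*tasks, return_exceptions=True)
        raise
```

On Python 3.11+, `asyncio.TaskGroup` gives similar cancel-on-first-error behavior, though it wraps the failure in an `ExceptionGroup`.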

Context

OPEN-10341: Add native async runner for testset batches

Testing

  • Manual testing

Monitoring

  • No expected impact

Notes

  • Backwards compatible: existing sync run implementations behave identically. They take the same sequential
    code path with the same fail-fast semantics and no executor or asyncio overhead.
  • Customer-facing usage: if a customer's openlayer_run.py already defines async def run(...), they get 4-way
    concurrency automatically in the next release. To override, append --max-workers N to the batchCommand in
    openlayer.json.
