feat(closes OPEN-10341): add native async runner for testset batches #648
Open
gustavocidornelas wants to merge 1 commit into
Pull Request
Summary
Adds native async support to `OpenlayerModel.run_batch_from_df` so that customers whose per-row work hits slow APIs (~5s/row) can run testset batches concurrently instead of strictly sequentially.
Users opt in by defining `run` as `async def run(...)`; the framework then drives rows through `asyncio.gather` gated by a semaphore. Sync `run` keeps today's behavior byte-for-byte.
Changes
- `run` may now be defined as `async def run(...)`. When it is, `run_batch_from_df` dispatches rows concurrently via `asyncio.gather` + `asyncio.Semaphore(max_workers)`.
- `max_workers` kwarg on `run_batch_from_df` and `batch`, plus `--max-workers` on the CLI. Default resolves to 4 for async `run`, 1 for sync `run`; writing `async def` is the opt-in signal that interleaving is safe.
- `max_workers > 1` with a sync `run` raises `ValueError` rather than silently ignoring it.
- `asyncio.gather` cancels in-flight siblings before re-raising.
- `_run_rows_async`, `_apply_row_result`, and `_build_config` helpers so the row bookkeeping is shared between the sync and async paths.
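For reviewers unfamiliar with the pattern, here is a minimal sketch of semaphore-gated dispatch with fail-fast cancellation. It is not the code in this PR: `run_rows_concurrently` and `_demo_run` are hypothetical names, and the real `_run_rows_async` helper may be structured differently. One detail worth keeping in mind is that `asyncio.gather` with the default `return_exceptions=False` propagates the first exception but leaves the other awaitables running, so the sketch assumes the cancellation is done explicitly around it.

```python
import asyncio
from typing import Any, Awaitable, Callable, List


async def run_rows_concurrently(
    rows: List[Any],
    run: Callable[[Any], Awaitable[Any]],
    max_workers: int = 4,
) -> List[Any]:
    """Call `run(row)` for every row, with at most `max_workers` in flight.

    Hypothetical helper for illustration only.
    """
    semaphore = asyncio.Semaphore(max_workers)

    async def gated(row: Any) -> Any:
        # The semaphore caps how many rows execute `run` at the same time.
        async with semaphore:
            return await run(row)

    tasks = [asyncio.ensure_future(gated(row)) for row in rows]
    try:
        # gather returns results in the same order as the input rows.
        return await asyncio.gather(*tasks)
    except BaseException:
        # gather does not cancel siblings on its own: cancel them explicitly,
        # wait for the cancellations to settle, then re-raise the original error.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
        raise


async def _demo_run(row: int) -> int:
    # Stand-in for a slow per-row API call.
    await asyncio.sleep(0.01)
    return row * 2


if __name__ == "__main__":
    print(asyncio.run(run_rows_concurrently([1, 2, 3], _demo_run, max_workers=2)))
```

Wrapping each row in a task before gathering is what makes the explicit cancellation possible; passing bare coroutines to `gather` would leave no handle to cancel.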
Context
OPEN-10341: Add native async runner for testset batches
Testing
Monitoring
Notes
- Sync `run` implementations behave identically. Same sequential code path, same fail-fast semantics, no executor or asyncio overhead.
- If a customer's `openlayer_run.py` already defines `async def run(...)`, they get 4-way concurrency automatically in the next release. To override, append `--max-workers N` to the `batchCommand` in `openlayer.json`.
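The default resolution described in these notes can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `resolve_max_workers` and the two constants are hypothetical names, and the `max_workers < 1` check is an added assumption that the description does not mention.

```python
import inspect
from typing import Callable, Optional

DEFAULT_ASYNC_WORKERS = 4  # default stated in the PR for an async `run`
DEFAULT_SYNC_WORKERS = 1   # a sync `run` stays strictly sequential


def resolve_max_workers(run: Callable, max_workers: Optional[int] = None) -> int:
    """Pick the effective worker count for a given `run` callable.

    Hypothetical helper for illustration only.
    """
    is_async = inspect.iscoroutinefunction(run)
    if max_workers is None:
        # No explicit value: `async def` is treated as the opt-in signal.
        return DEFAULT_ASYNC_WORKERS if is_async else DEFAULT_SYNC_WORKERS
    if max_workers < 1:
        # Assumption: non-positive values are rejected (not stated in the PR).
        raise ValueError("max_workers must be >= 1")
    if max_workers > 1 and not is_async:
        # Fail loudly instead of silently ignoring the requested concurrency.
        raise ValueError("max_workers > 1 requires `run` to be an async def")
    return max_workers
```

Under this sketch, an explicit `--max-workers N` would simply be passed through as `max_workers=N`, overriding the automatic default of 4 for an async `run`.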