chore: consolidate gptoss example + fixes by arekay-nv · Pull Request #283 · mlcommons/endpoints

arekay-nv · 2026-04-15T01:29:27Z

What does this PR do?

Consolidate the GPT-OSS example and pipe-clean it to make sure it works end to end after recent refactors.

Type of change

Bug fix
New feature
Documentation update
Refactor/cleanup

Related issues

Testing

Tests added/updated
All tests pass locally
Manual testing completed

Checklist

Code follows project style
Pre-commit hooks pass
Documentation updated (if needed)

github-actions · 2026-04-15T01:29:37Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

gemini-code-assist

Code Review

This pull request focuses on extensive documentation updates, example consolidation, and internal API refactoring. Key changes include restructuring the GPT-OSS-120B benchmarking examples, updating CLI argument references in READMEs, and transitioning the CPU affinity configuration to a boolean flag. Additionally, the internal issue_query method was renamed to issue. Feedback is provided regarding the removal of conditional checks in the endpoint client configuration, which inadvertently prevents the programmatic override of adapters and accumulators.
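The override concern raised in this review can be illustrated with a minimal sketch. It is not the project's actual `config.py`: the class name, the registry dictionaries, and the `api_type` values below are hypothetical, chosen only to show why the "only resolve when unset" checks matter.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical registries; the real project maps api_type to its own classes.
DEFAULT_ADAPTERS = {"openai": "OpenAIAdapter", "sglang": "SGLangAdapter"}
DEFAULT_ACCUMULATORS = {"openai": "OpenAIAccumulator", "sglang": "SGLangAccumulator"}


@dataclass
class ClientConfigSketch:
    """Illustrative stand-in for an endpoint client config (assumed shape)."""

    api_type: str = "openai"
    adapter: Optional[str] = None      # None means "derive from api_type"
    accumulator: Optional[str] = None  # None means "derive from api_type"

    def __post_init__(self) -> None:
        # Fill in defaults only when the caller did not supply a value.
        # Without these checks, an explicit override would be overwritten.
        if self.adapter is None:
            self.adapter = DEFAULT_ADAPTERS[self.api_type]
        if self.accumulator is None:
            self.accumulator = DEFAULT_ACCUMULATORS[self.api_type]


default_cfg = ClientConfigSketch(api_type="sglang")
custom_cfg = ClientConfigSketch(api_type="sglang", adapter="MyAdapter")
```

In this sketch `custom_cfg` keeps the caller's adapter while still deriving its accumulator from `api_type`; removing the two `is None` checks would silently replace `"MyAdapter"` with the default.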

Copilot

Pull request overview

This PR updates documentation and examples to align with recent endpoint-client refactors and consolidates the GPT-OSS-120B end-to-end example (configs + accuracy scripts) so it runs cleanly again.

Changes:

Update docs/READMEs to reflect refactored endpoint client API (issue/poll/drain, updated worker internals, CLI flag updates).
Consolidate GPT-OSS-120B example guidance (vLLM + SGLang) and remove the separate SGLang-only example README.
Add standalone evaluation scripts (GPQA/AIME25/LiveCodeBench) and plumb --force-regenerate through dataset generation in the example runner.
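As a rough illustration of the `--force-regenerate` plumbing described in the last item, the sketch below passes a CLI flag through to each dataset generator. It assumes an `argparse`-based runner; `generate_dataset`, `build_parser`, `run`, and the dataset names used as arguments are hypothetical and are not taken from the example's `run.py`.

```python
import argparse


def generate_dataset(name: str, force_regenerate: bool = False) -> str:
    """Hypothetical generator: reuse a cached dataset unless forced."""
    action = "regenerated" if force_regenerate else "reused cache"
    return f"{name}: {action}"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="example runner sketch")
    parser.add_argument(
        "--force-regenerate",
        action="store_true",
        help="rebuild datasets even if a cached copy exists",
    )
    return parser


def run(argv: list[str]) -> list[str]:
    args = build_parser().parse_args(argv)
    # The flag is forwarded unchanged to every generator call.
    return [
        generate_dataset(name, force_regenerate=args.force_regenerate)
        for name in ("gpqa", "aime25", "livecodebench")
    ]
```

Calling `run(["--force-regenerate"])` forwards `True` to every generator, while `run([])` leaves the cached path in place.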

Reviewed changes

Copilot reviewed 13 out of 16 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `src/inference_endpoint/profiling/README.md` | Updates profiling callouts to match refactored client/worker method names. |
| `src/inference_endpoint/endpoint_client/config.py` | Adjusts default resolution so adapter/accumulator are always derived from `api_type`. |
| `src/inference_endpoint/endpoint_client/README.md` | Updates client API usage from `issue_query` to `issue`. |
| `src/inference_endpoint/dataset_manager/README.md` | Fixes import path for `DatasetFormat`. |
| `examples/README.md` | Updates GPT-OSS-120B example description and removes separate SGLang example entry. |
| `examples/07_GPT-OSS-120B_SGLang_Example/README.md` | Removes redundant SGLang-only README (content consolidated into 04 example). |
| `examples/04_GPTOSS120B_Example/run.py` | Adds `force_regenerate` passthrough for dataset generation. |
| `examples/04_GPTOSS120B_Example/gptoss_120b_example.yaml` | Updates default endpoint port to `30000`. |
| `examples/04_GPTOSS120B_Example/eval_livecodebench.py` | Adds standalone LiveCodeBench re-scoring script from an existing report dir. |
| `examples/04_GPTOSS120B_Example/eval_gpqa.py` | Adds standalone GPQA re-scoring script from an existing report dir. |
| `examples/04_GPTOSS120B_Example/eval_aime.py` | Adds standalone AIME25 re-scoring script from an existing report dir. |
| `examples/04_GPTOSS120B_Example/Readme.md` | Consolidates end-to-end instructions for vLLM/SGLang + accuracy suite + troubleshooting. |
| `examples/02_ServerBenchmarking/README.md` | Updates CLI example flags to current `--endpoints/--dataset/--model` usage. |
| `docs/CLI_QUICK_REFERENCE.md` | Updates `init` command guidance to include `concurrency` and removes redundant line. |
| `docs/CLIENT_PERFORMANCE_TUNING.md` | Updates CPU affinity docs to match `enable_cpu_affinity` + `--no-cpu-affinity`. |
| `AGENTS.md` | Updates repository structure notes for `core/` layout and record location. |
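The `enable_cpu_affinity` + `--no-cpu-affinity` pairing mentioned for `docs/CLIENT_PERFORMANCE_TUNING.md` is a common way to expose a boolean setting that is on by default. A minimal sketch, assuming an `argparse`-style CLI and a default of `True` (neither is confirmed by this PR; `parse_affinity` is a hypothetical helper):

```python
import argparse


def parse_affinity(argv: list[str]) -> bool:
    """Return the effective enable_cpu_affinity value for the given argv."""
    parser = argparse.ArgumentParser(description="cpu affinity flag sketch")
    # store_false makes the destination default to True and flips it
    # to False only when the flag is present on the command line.
    parser.add_argument(
        "--no-cpu-affinity",
        dest="enable_cpu_affinity",
        action="store_false",
        help="disable CPU pinning (assumed to be enabled by default)",
    )
    return parser.parse_args(argv).enable_cpu_affinity
```

With this shape, `parse_affinity([])` leaves affinity enabled and `parse_affinity(["--no-cpu-affinity"])` turns it off.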

Copilot

Pull request overview

Consolidates and repairs the gpt-oss-120b examples and related client configuration behavior so the selected endpoint_config.api_type reliably drives adapter/accumulator selection end-to-end after recent refactors.

Changes:

Propagate endpoint_config.api_type into settings.client.api_type at config construction time, and make HTTPClientConfig.with_updates() clear stale auto-resolved fields when api_type changes.
Update benchmark execution/docs to rely on the propagated api_type (no runtime patching in execute.py) and refresh client/profiling documentation for renamed methods.
Consolidate GPT-OSS examples (remove the separate SGLang example README, enhance the unified example, add accuracy eval scripts, and update example docs/imports).
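The first change above (propagation at construction time, plus clearing stale auto-resolved fields in `with_updates()`) can be sketched as follows. This uses plain dataclasses rather than the project's schema validator, and every class name, the `ADAPTERS` registry, and the `api_type` values are hypothetical stand-ins.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Optional

# Hypothetical mapping from api_type to an adapter name.
ADAPTERS = {"openai": "OpenAIAdapter", "sglang": "SGLangAdapter"}


@dataclass
class HTTPClientConfigSketch:
    api_type: str = "openai"
    adapter: Optional[str] = None  # None means "derive from api_type"

    def __post_init__(self) -> None:
        if self.adapter is None:
            self.adapter = ADAPTERS[self.api_type]

    def with_updates(self, **updates: Any) -> "HTTPClientConfigSketch":
        new_type = updates.get("api_type", self.api_type)
        if new_type != self.api_type and "adapter" not in updates:
            # The current adapter was resolved for the old api_type;
            # clear it so __post_init__ re-derives it for the new one.
            updates["adapter"] = None
        return replace(self, **updates)


@dataclass
class EndpointConfigSketch:
    api_type: str = "openai"


@dataclass
class SettingsSketch:
    endpoint_config: EndpointConfigSketch = field(default_factory=EndpointConfigSketch)
    client: HTTPClientConfigSketch = field(default_factory=HTTPClientConfigSketch)

    def __post_init__(self) -> None:
        # Propagate the endpoint's api_type into the client config once,
        # at construction, so no runtime patching is needed later.
        self.client = self.client.with_updates(api_type=self.endpoint_config.api_type)


settings = SettingsSketch(endpoint_config=EndpointConfigSketch(api_type="sglang"))
```

Here `settings.client` ends up with the `sglang` adapter even though the client config was built with defaults. An adapter passed explicitly in the same `with_updates()` call is kept as given.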

Reviewed changes

Copilot reviewed 16 out of 19 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `tests/unit/config/test_schema.py` | Adds regression tests ensuring `api_type` propagation and adapter/accumulator resolution behave correctly across `with_updates()`. |
| `src/inference_endpoint/profiling/README.md` | Updates profiling guide to match renamed client/worker methods. |
| `src/inference_endpoint/endpoint_client/config.py` | Adds `HTTPClientConfig.with_updates()` logic to clear adapter/accumulator when `api_type` changes. |
| `src/inference_endpoint/endpoint_client/README.md` | Updates docs to reflect `HTTPEndpointClient.issue()` API naming. |
| `src/inference_endpoint/dataset_manager/README.md` | Fixes import guidance for `DatasetFormat` (not exported from package root). |
| `src/inference_endpoint/config/schema.py` | Adds validator to propagate `endpoint_config.api_type` into the internal HTTP client config at construction. |
| `src/inference_endpoint/commands/benchmark/execute.py` | Removes runtime `api_type` override, relying on schema propagation. |
| `examples/README.md` | Updates GPT-OSS example description and removes separate SGLang example entry. |
| `examples/07_GPT-OSS-120B_SGLang_Example/README.md` | Deletes now-redundant standalone SGLang example README. |
| `examples/04_GPTOSS120B_Example/run.py` | Adds `force_regenerate` passthrough to dataset generators. |
| `examples/04_GPTOSS120B_Example/gptoss_120b_example.yaml` | Updates endpoint default port in the example config. |
| `examples/04_GPTOSS120B_Example/eval_livecodebench.py` | Adds standalone scoring script for LiveCodeBench from a saved dataset/report. |
| `examples/04_GPTOSS120B_Example/eval_gpqa.py` | Adds standalone scoring script for GPQA from a saved dataset/report. |
| `examples/04_GPTOSS120B_Example/eval_aime.py` | Adds standalone scoring script for AIME25 from a saved dataset/report. |
| `examples/04_GPTOSS120B_Example/Readme.md` | Expands and consolidates end-to-end instructions for vLLM + SGLang and accuracy workflows. |
| `examples/02_ServerBenchmarking/README.md` | Updates CLI usage example to current long-form flags. |
| `docs/CLI_QUICK_REFERENCE.md` | Updates `init` command quick reference to include concurrency template option. |
| `docs/CLIENT_PERFORMANCE_TUNING.md` | Updates CPU-affinity documentation to match `enable_cpu_affinity` config/CLI. |
| `AGENTS.md` | Updates repo structure documentation (core/types, record location). |

Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>

arekay-nv requested review from a team and Copilot April 15, 2026 01:29

github-actions Bot requested a review from nvzhihanj April 15, 2026 01:29

Copilot started reviewing on behalf of arekay-nv April 15, 2026 01:29

gemini-code-assist Bot reviewed Apr 15, 2026

Comment thread src/inference_endpoint/endpoint_client/config.py Outdated

Copilot AI reviewed Apr 15, 2026

Comment thread src/inference_endpoint/endpoint_client/config.py Outdated

Comment thread src/inference_endpoint/endpoint_client/config.py Outdated

viraatc approved these changes Apr 22, 2026

Copilot AI review requested due to automatic review settings April 27, 2026 16:36

Copilot started reviewing on behalf of arekay-nv April 27, 2026 16:37

Copilot AI reviewed Apr 27, 2026

Comment thread examples/04_GPTOSS120B_Example/gptoss_120b_example.yaml

arekay-nv added 4 commits April 27, 2026 12:01

Fixup+consolidate gpt-oss-120b

999a6e6

Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>

Fix adapter setup

38835b8

Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>

Refactor api_type handling

072a1df

Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>

Fix.

5b19767

Signed-off-by: Rashid Kaleem <230885705+arekay-nv@users.noreply.github.com>

Copilot AI review requested due to automatic review settings April 27, 2026 17:04

arekay-nv force-pushed the arekay/chore_consolidate_gptoss branch from f782cec to 5b19767 Compare April 27, 2026 17:04

arekay-nv merged commit efd0180 into main Apr 27, 2026
8 checks passed

arekay-nv deleted the arekay/chore_consolidate_gptoss branch April 27, 2026 17:14

github-actions Bot locked and limited conversation to collaborators Apr 27, 2026
