ci: Improve benchmarking tests with CodSpeed #236

Open
Copilot wants to merge 27 commits into main from copilot/improve-benchmarking-tests

Conversation

Contributor

Copilot AI commented Mar 25, 2026

Summary

Replace pytest-benchmark with pytest-codspeed for benchmarking, extend test coverage to full process lifecycle and run-only benchmarks, and parameterize across connector types.

Changes

  • pyproject.toml: Swap pytest-benchmark for pytest-codspeed>=4.4.0
  • tests/benchmark/test_benchmarking.py:
    • test_benchmark_process_lifecycle — benchmarks full init → run → destroy using @pytest.mark.benchmark
    • test_benchmark_process_run — benchmarks only process.run() using BenchmarkFixture.pedantic with setup/teardown callbacks and uvloop.run, excluding init and destroy from timing
    • Both parameterized over AsyncioConnector/LocalProcess, ZMQConnector/LocalProcess, and RayConnector/RayProcess
  • tests/benchmark/conftest.py: Added benchmark-specific ray_ctx fixture that overrides the session-scoped one from tests/conftest.py, using include_dashboard=False to prevent the Ray dashboard's MetricsHead module from timing out under CodSpeed instrumentation in CI.
  • .github/workflows/benchmarks.yaml: Replace the manual checkout-both-branches-and-compare workflow with CodSpeedHQ/action@v4 on ubuntu-latest in default mode. Add push-to-main and workflow_dispatch triggers for baseline tracking and backtesting, use OIDC auth (id-token: write), and add the RAY_ENABLE_UV_RUN_RUNTIME_ENV and PLUGBOARD_IO_READ_TIMEOUT env vars for the Ray-based benchmarks.
  • uv.lock: Updated to reflect dependency change
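The lifecycle benchmark described above might look roughly like the following sketch. The `Process` class here is a hypothetical stand-in for plugboard's real process/connector pairs, not the actual API; the point is the shape of the test — with pytest-codspeed installed, `@pytest.mark.benchmark` measures the entire test body, so init, run, and destroy are all timed.

```python
import asyncio

import pytest


class Process:
    """Hypothetical stand-in for a plugboard process; records lifecycle calls."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def init(self) -> None:
        self.calls.append("init")

    async def run(self) -> None:
        self.calls.append("run")

    async def destroy(self) -> None:
        self.calls.append("destroy")


@pytest.mark.benchmark  # with pytest-codspeed installed, the whole test body is measured
def test_benchmark_process_lifecycle() -> None:
    async def _lifecycle() -> Process:
        proc = Process()
        await proc.init()
        await proc.run()
        await proc.destroy()
        return proc

    proc = asyncio.run(_lifecycle())
    assert proc.calls == ["init", "run", "destroy"]
```

In the PR this pattern is additionally parameterized over the connector/process combinations listed above.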

- Replace pytest-benchmark with pytest-codspeed in test dependencies
- Rewrite benchmark tests to use @pytest.mark.benchmark decorator
- Add full process lifecycle benchmark (init + run + destroy)
- Parameterize benchmarks across AsyncioConnector and ZMQConnector
- Replace GitHub Actions workflow with CodSpeed integration
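For reference, the reworked workflow might look something like this sketch. The step names, the uv setup step, and the env var values are assumptions for illustration, not the PR's actual file; only the action, runner, triggers, permissions, and env var names come from the description above.

```yaml
# Hypothetical sketch of .github/workflows/benchmarks.yaml after this PR.
name: benchmarks

on:
  push:
    branches: [main]   # baseline tracking on the default branch
  pull_request:
  workflow_dispatch:   # manual runs for backtesting

permissions:
  contents: read
  id-token: write      # OIDC auth for CodSpeed

jobs:
  benchmarks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5      # assumed setup step
      - run: uv sync --all-extras
      - uses: CodSpeedHQ/action@v4       # default instrumentation mode
        with:
          run: uv run pytest tests/benchmark/ --codspeed
        env:
          RAY_ENABLE_UV_RUN_RUNTIME_ENV: "0"   # placeholder value
          PLUGBOARD_IO_READ_TIMEOUT: "60"      # placeholder value
```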

Co-authored-by: toby-coleman <13170610+toby-coleman@users.noreply.github.com>
Agent-Logs-Url: https://github.com/plugboard-dev/plugboard/sessions/b8d70f1d-7f6f-412b-9452-e70611a349de
Copilot AI changed the title from [WIP] Enhance benchmarking tests using codspeed to ci: Improve benchmarking tests with CodSpeed Mar 25, 2026
Copilot AI requested a review from toby-coleman March 25, 2026 07:47
…tion

- test_benchmark_process_run now uses benchmark.pedantic with setup to
  only time process.run(), excluding init
- Added RayConnector/RayProcess to both benchmark test parametrizations
- Added Ray env vars to CI workflow for benchmark tests

Co-authored-by: toby-coleman <13170610+toby-coleman@users.noreply.github.com>
Agent-Logs-Url: https://github.com/plugboard-dev/plugboard/sessions/2d017dc2-a948-4b46-8c61-c42be32507a4
Contributor Author

Copilot AI commented Mar 25, 2026

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Details below.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • metadata.google.internal
    • Triggering command: /home/REDACTED/work/plugboard/plugboard/.venv/bin/python /home/REDACTED/work/plugboard/plugboard/.venv/bin/python /home/REDACTED/work/plugboard/plugboard/.venv/lib/python3.12/site-packages/ray/dashboard/dashboard.py --host=127.0.0.1 --port=8265 --port-retries=50 --temp-dir=/tmp/ray --log-dir=/tmp/ray/session_2026-03-25_20-28-43_333217_4102/logs --session-dir=/tmp/ray/session_2026-03-25_20-28-43_333217_4102 --logging-rotate-bytes=536870912 --logging-rotate-backup-count=5 --gcs-address=127.0.0.1:33279 --cluster-id-hex=9ec9fda9fcb85e65d5dc26651d1db3efb8d117964fb0bb2db1d9c416 --node-ip-address=127.0.0.1 --stdout-filepath=/tmp/ray/session_2026-03-25_20-28-43_333217_4102/logs/dashboard.out --stderr-filepath=/tmp/ray/session_2026-03-25_20-28-43_333217_4102/logs/dashboard.err (dns block)
    • Triggering command: /home/REDACTED/work/plugboard/plugboard/.venv/bin/python /home/REDACTED/work/plugboard/plugboard/.venv/bin/python /home/REDACTED/work/plugboard/plugboard/.venv/lib/python3.12/site-packages/ray/dashboard/dashboard.py --host=127.0.0.1 --port=8265 --port-retries=50 --temp-dir=/tmp/ray --log-dir=/tmp/ray/session_2026-03-25_20-33-01_962890_5585/logs --session-dir=/tmp/ray/session_2026-03-25_20-33-01_962890_5585 --logging-rotate-bytes=536870912 --logging-rotate-backup-count=5 --gcs-address=127.0.0.1:37307 --cluster-id-hex=6c37b0969d30736700cb2e0a3ff6b3b023662ffa335ba1d9fb65db86 --node-ip-address=127.0.0.1 --stdout-filepath=/tmp/ray/session_2026-03-25_20-33-01_962890_5585/logs/dashboard.out --stderr-filepath=/tmp/ray/session_2026-03-25_20-33-01_962890_5585/logs/dashboard.err (dns block)
    • Triggering command: /home/REDACTED/work/plugboard/plugboard/.venv/bin/python3 /home/REDACTED/work/plugboard/plugboard/.venv/bin/python3 /home/REDACTED/work/plugboard/plugboard/.venv/lib/python3.12/site-packages/ray/dashboard/dashboard.py --host=127.0.0.1 --port=8265 --port-retries=50 --temp-dir=/tmp/ray --log-dir=/tmp/ray/session_2026-03-25_20-34-45_613600_6660/logs --session-dir=/tmp/ray/session_2026-03-25_20-34-45_613600_6660 --logging-rotate-bytes=536870912 --logging-rotate-backup-count=5 --gcs-address=127.0.0.1:34471 --cluster-id-hex=e65bcc378675174b4494fed2ef08f5c8492840459a2b9e0ece8361a3 --node-ip-address=127.0.0.1 --stdout-filepath=/tmp/ray/session_2026-03-25_20-34-45_613600_6660/logs/dashboard.out --stderr-filepath=/tmp/ray/session_2026-03-25_20-34-45_613600_6660/logs/dashboard.err (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI requested a review from toby-coleman March 25, 2026 20:42
@toby-coleman toby-coleman marked this pull request as ready for review March 25, 2026 20:50
@codspeed-hq

codspeed-hq bot commented Mar 25, 2026

Congrats! CodSpeed is installed 🎉

🆕 6 new benchmarks were detected.

You will start to see performance impacts in the reports once the benchmarks are run from your default branch.


@codecov

codecov bot commented Mar 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@toby-coleman
Contributor

Wait for CodSpeedHQ/pytest-codspeed#113 before merging this.

@toby-coleman
Contributor

@copilot Reinstate the run-only tests - take a look at c663755 to see what to add back in. Update the version of pytest-codspeed to >=4.4.0

Contributor Author

Copilot AI commented Apr 17, 2026

@copilot Reinstate the run-only tests - take a look at c663755 to see what to add back in. Update the version of pytest-codspeed to >=4.4.0

Reinstated the run-only tests based on c663755 and updated pytest-codspeed to >=4.4.0 in 85362c6. Also fixed the _run function signature (removed unused process parameter) and added a teardown callback for proper cleanup in f585742.

@toby-coleman
Contributor

@copilot change the benchmark tests back to using ubuntu-latest rather than codspeed macro, and don't use walltime mode - go back to default

Contributor Author

Copilot AI commented Apr 19, 2026

@copilot change the benchmark tests back to using ubuntu-latest rather than codspeed macro, and don't use walltime mode - go back to default

Done in 29feb10 — switched runs-on back to ubuntu-latest and removed mode: walltime so CodSpeed uses its default instrumentation mode.

@toby-coleman
Contributor

@copilot determine why benchmark tests in CI fail with RuntimeError: Module MetricsHead failed to start. Timeout after 30.0 seconds related to ray dashboard. Fix.

Contributor Author

Copilot AI commented Apr 19, 2026

@copilot determine why benchmark tests in CI fail with RuntimeError: Module MetricsHead failed to start. Timeout after 30.0 seconds related to ray dashboard. Fix.

Fixed in 76af665. The ray_ctx session fixture in tests/conftest.py uses include_dashboard=True, which causes the Ray dashboard's MetricsHead module to time out under CodSpeed's instrumented environment. Added a benchmark-specific tests/benchmark/conftest.py that overrides ray_ctx with include_dashboard=False. Also removed the explicit mode: simulation from the workflow (it was leftover from a previous change).
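The override described here might look like this sketch of tests/benchmark/conftest.py. The fixture name and the `include_dashboard=False` flag come from the PR; everything else (scope, lazy import, shutdown handling) is an assumption for illustration.

```python
import pytest


@pytest.fixture(scope="session")
def ray_ctx():
    """Benchmark-only override of the session-scoped ray_ctx fixture.

    Disabling the dashboard avoids the MetricsHead startup timeout seen
    under CodSpeed's instrumented CI environment.
    """
    import ray  # imported lazily so collection works without ray installed

    ctx = ray.init(include_dashboard=False)
    yield ctx
    ray.shutdown()
```

Because pytest resolves fixtures from the nearest conftest.py, placing this in tests/benchmark/ shadows the session-scoped fixture from tests/conftest.py for benchmark tests only.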

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • astral.sh
    • Triggering command: /usr/bin/curl curl -LsSf REDACTED (dns block)

If you need me to access, download, or install something from one of these locations, you can either:
