Add standby compression start delay#184

Draft
sjmiller609 wants to merge 14 commits into main from codex/standby-compression-delay

@sjmiller609 sjmiller609 commented Apr 4, 2026

Summary

  • add a standby-only compression delay override on POST /instances/{id}/standby and a per-instance default in snapshot_policy
  • keep delayed standby compression jobs cancelable before start and distinguish pending-delay skips from active compression cancellation
  • add metrics, logs, traces, OpenAPI updates, and tests for the new standby compression delay behavior

Testing

  • go test ./lib/instances
  • go test ./cmd/api/api -run 'Test(CreateInstance_MapsStandbyCompressionDelayInSnapshotPolicy|CreateInstance_InvalidStandbyCompressionDelayInSnapshotPolicy|InstanceToOAPI_EmitsStandbyCompressionDelayInSnapshotPolicy|StandbyInstance_MapsCompressionDelay|StandbyInstance_InvalidCompressionDelay|StandbyInstance_InvalidRequest)$'

Notes

  • go test ./cmd/api/api still hits unrelated environment-dependent volume tests on this machine because mkfs.ext4 is not available in $PATH.

Note

Medium Risk
Changes standby snapshot compression scheduling and cancellation semantics (including restart recovery) and updates networking iptables rule management to wait on the xtables lock; these touch core instance lifecycle and host networking paths but are gated behind new optional fields and covered by tests.

Overview
Adds a standby-only snapshot compression delay that can be set per request via POST /instances/{id}/standby (compression_delay) and as a per-instance default via snapshot_policy.standby_compression_delay (OpenAPI + API/domain mappings + validation).
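
As a rough illustration of how the two settings could interact, the sketch below resolves an effective delay by preferring the per-request `compression_delay` and falling back to `snapshot_policy.standby_compression_delay`. The helper name `resolveStandbyCompressionDelay`, the use of Go duration strings such as "30s", and the rejection of negative values are assumptions made for this example; the PR description does not state the wire format or the validation rules.

```go
package main

import (
	"fmt"
	"time"
)

// resolveStandbyCompressionDelay is a hypothetical helper, not code from the PR.
// It prefers the per-request override, then the per-instance default, and
// otherwise returns zero, meaning compression may start immediately.
// Assumption: both values are Go duration strings such as "30s".
func resolveStandbyCompressionDelay(requestDelay, policyDelay *string) (time.Duration, error) {
	raw := policyDelay
	if requestDelay != nil {
		raw = requestDelay
	}
	if raw == nil {
		return 0, nil
	}
	d, err := time.ParseDuration(*raw)
	if err != nil {
		return 0, fmt.Errorf("invalid compression delay %q: %w", *raw, err)
	}
	// Assumption: a negative delay is treated as a validation error.
	if d < 0 {
		return 0, fmt.Errorf("compression delay must not be negative, got %s", d)
	}
	return d, nil
}

// strPtr is a small convenience for building optional string fields.
func strPtr(s string) *string { return &s }

func main() {
	// The request override is present, so it wins over the policy default.
	d, err := resolveStandbyCompressionDelay(strPtr("45s"), strPtr("5m"))
	fmt.Println(d, err)

	// No override on the request, so the per-instance default applies.
	d, err = resolveStandbyCompressionDelay(nil, strPtr("5m"))
	fmt.Println(d, err)
}
```

Using pointers for both inputs keeps "field not set" distinct from an explicit zero delay, which matters if a caller wants to override a non-zero default with immediate compression.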

Implements delayed compression jobs in the instance manager. A job is either pending its delay or running, and a job canceled before it starts is recorded as skipped. The manager persists a PendingStandbyCompression plan in instance metadata for restart recovery and clears that plan on restore and snapshot operations. Preemption metrics are now recorded only when active compression is interrupted.
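
The distinction between a pending-delay skip and an active cancellation can be sketched as follows. This is a minimal, in-memory illustration and not the instance manager's code: the type `delayedJob`, the result names, and the stand-in compression functions are all hypothetical, and persistence of the PendingStandbyCompression plan is left out.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type jobResult string

const (
	resultCompleted jobResult = "completed"
	resultSkipped   jobResult = "skipped"  // canceled while still waiting out the delay
	resultCanceled  jobResult = "canceled" // canceled after compression had started
	resultFailed    jobResult = "failed"
)

// delayedJob is a hypothetical model of one delayed compression job.
type delayedJob struct {
	cancel  context.CancelFunc
	started chan struct{}  // closed once the delay has elapsed and compression begins
	result  chan jobResult // receives exactly one result
}

// startDelayedJob waits for delay and then runs compress, unless canceled first.
func startDelayedJob(delay time.Duration, compress func(context.Context) error) *delayedJob {
	ctx, cancel := context.WithCancel(context.Background())
	j := &delayedJob{cancel: cancel, started: make(chan struct{}), result: make(chan jobResult, 1)}
	go func() {
		defer cancel()
		timer := time.NewTimer(delay)
		defer timer.Stop()
		select {
		case <-ctx.Done():
			j.result <- resultSkipped // never started, so nothing was preempted
			return
		case <-timer.C:
		}
		close(j.started)
		err := compress(ctx)
		switch {
		case err == nil:
			j.result <- resultCompleted
		case ctx.Err() != nil:
			j.result <- resultCanceled // active compression was interrupted
		default:
			j.result <- resultFailed
		}
	}()
	return j
}

func (j *delayedJob) Cancel()                  { j.cancel() }
func (j *delayedJob) Started() <-chan struct{} { return j.started }

// Wait returns the job's single result; call it at most once per job.
func (j *delayedJob) Wait() jobResult { return <-j.result }

// blockUntilCanceled and noopCompress are stand-ins for real compression work.
func blockUntilCanceled(ctx context.Context) error { <-ctx.Done(); return ctx.Err() }
func noopCompress(ctx context.Context) error       { return nil }

func main() {
	pending := startDelayedJob(time.Hour, noopCompress)
	pending.Cancel()
	fmt.Println(pending.Wait()) // canceled during the delay

	active := startDelayedJob(0, blockUntilCanceled)
	<-active.Started()
	active.Cancel()
	fmt.Println(active.Wait()) // canceled while compressing
}
```

Keeping the two cancellation outcomes as separate results is what would let a preemption metric count only the second case, as the overview describes.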

Extends observability with new snapshot compression metrics (wait duration, active vs pending gauges, skipped result), additional logs/traces, and includes targeted unit/integration test hardening (guest exec retries/readiness probe) plus more reliable iptables operations/tests using iptables -w 5 to avoid xtables lock flakes.
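
For readers unfamiliar with the xtables lock, the sketch below shows one way a Go caller might always pass the wait flag when invoking iptables. The helper names `iptablesArgs` and `iptablesCommand` and the example rule arguments are hypothetical; only the use of `iptables -w 5` comes from the PR description. The example builds the command without running it, so it needs neither root nor an installed iptables binary.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
)

// iptablesArgs prepends "-w <seconds>" so iptables waits for the xtables lock
// instead of failing immediately when another process holds it.
// Hypothetical helper for illustration only.
func iptablesArgs(waitSeconds int, ruleArgs ...string) []string {
	args := []string{"-w", strconv.Itoa(waitSeconds)}
	return append(args, ruleArgs...)
}

// iptablesCommand builds (but does not run) the iptables invocation.
func iptablesCommand(waitSeconds int, ruleArgs ...string) *exec.Cmd {
	return exec.Command("iptables", iptablesArgs(waitSeconds, ruleArgs...)...)
}

func main() {
	// Example rule arguments are placeholders, not rules taken from the PR.
	cmd := iptablesCommand(5, "-t", "nat", "-S")
	fmt.Println(cmd.Args)
}
```

Routing every call through a single builder is one way to make sure no call site forgets the flag, which is the kind of omission that tends to surface only as intermittent lock failures under concurrent tests.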

Reviewed by Cursor Bugbot for commit 17cb581.

github-actions bot commented Apr 4, 2026

✱ Stainless preview builds

This PR will update the hypeman SDKs with the following commit message.

feat: Add standby compression start delay

Edit this comment to update it. It will appear in the SDK's changelogs.

hypeman-openapi

Your SDK build had at least one "note" diagnostic, but this did not represent a regression.
generate ✅

hypeman-go

Your SDK build had at least one "note" diagnostic, but this did not represent a regression.
generate ✅build ✅lint ✅test ✅

go get github.com/stainless-sdks/hypeman-go@03373f4b24a61da25d1e13c6a82b7b4869d02699

hypeman-typescript

Your SDK build had at least one "note" diagnostic, but this did not represent a regression.
generate ✅build ✅lint ✅test ✅

npm install https://pkg.stainless.com/s/hypeman-typescript/3a8978e60b5af2d2fcb4f189392f17fb97784254/dist.tar.gz

This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-04-08 21:00:35 UTC

sjmiller609

This comment was marked as resolved.

@sjmiller609 sjmiller609 marked this pull request as ready for review April 8, 2026 13:12

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 080050b.

@sjmiller609 sjmiller609 requested a review from hiroTamada April 8, 2026 14:56
@sjmiller609
Collaborator Author

Waiting until data or a use case justifies this change.

@sjmiller609 sjmiller609 marked this pull request as draft April 9, 2026 19:09