refactor(server): use ComputeDriver RPC surface in-process #839

Merged
drew merged 1 commit into main from refactor-compute-driver-surface-anewberry on Apr 15, 2026

drew commented on Apr 15, 2026

Summary

Bind openshell-server directly to the generated ComputeDriver RPC surface in-process so the server and driver share one method surface without network transport. Keep public GetSandbox and ListSandboxes store-backed, and reconcile that store from driver snapshots without regressing newer watch updates or pruning live sandboxes.
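As a rough illustration of the in-process binding described above, the sketch below has a server hold the driver behind a trait and call its handlers directly. It is std-only: `Request` and `Status` are simplified stand-ins for `tonic::Request` and `tonic::Status`, the trait is a hypothetical synchronous version of the generated ComputeDriver surface, and `InMemoryDriver`, `Server`, and the message types are invented for the example rather than taken from this PR.

```rust
use std::collections::BTreeSet;
use std::sync::{Arc, Mutex};

/// Simplified stand-in for `tonic::Request<T>` (assumption, not the real type).
pub struct Request<T>(T);

impl<T> Request<T> {
    pub fn new(message: T) -> Self {
        Request(message)
    }
    pub fn into_inner(self) -> T {
        self.0
    }
}

/// Simplified stand-in for `tonic::Status`, limited to two codes.
#[derive(Debug, PartialEq)]
pub enum Status {
    AlreadyExists(String),
    NotFound(String),
}

// Hypothetical message types; the real ones are generated from the proto.
pub struct CreateSandboxRequest {
    pub name: String,
}
pub struct GetSandboxRequest {
    pub name: String,
}

#[derive(Debug, PartialEq)]
pub struct Sandbox {
    pub name: String,
}

/// Hypothetical synchronous version of the driver's method surface.
pub trait ComputeDriver: Send + Sync {
    fn create_sandbox(&self, req: Request<CreateSandboxRequest>) -> Result<Sandbox, Status>;
    fn get_sandbox(&self, req: Request<GetSandboxRequest>) -> Result<Sandbox, Status>;
}

/// Toy in-memory driver used only for illustration.
#[derive(Default)]
pub struct InMemoryDriver {
    names: Mutex<BTreeSet<String>>,
}

impl ComputeDriver for InMemoryDriver {
    fn create_sandbox(&self, req: Request<CreateSandboxRequest>) -> Result<Sandbox, Status> {
        let name = req.into_inner().name;
        let mut names = self.names.lock().unwrap();
        // `insert` returns false when the name was already present.
        if !names.insert(name.clone()) {
            return Err(Status::AlreadyExists(name));
        }
        Ok(Sandbox { name })
    }

    fn get_sandbox(&self, req: Request<GetSandboxRequest>) -> Result<Sandbox, Status> {
        let name = req.into_inner().name;
        if self.names.lock().unwrap().contains(&name) {
            Ok(Sandbox { name })
        } else {
            Err(Status::NotFound(name))
        }
    }
}

/// The server calls the same handler surface a remote client would,
/// but through a trait object instead of a network channel.
pub struct Server {
    driver: Arc<dyn ComputeDriver>,
}

impl Server {
    pub fn new(driver: Arc<dyn ComputeDriver>) -> Self {
        Server { driver }
    }

    /// Wraps the message in a `Request` and forwards the driver's status unchanged.
    pub fn create(&self, name: &str) -> Result<Sandbox, Status> {
        let message = CreateSandboxRequest { name: name.to_string() };
        self.driver.create_sandbox(Request::new(message))
    }
}
```

In this sketch a second `create` call with the same name returns the driver's `AlreadyExists` status as-is, which is the kind of status preservation the Changes list mentions. The real handlers are async and return `tonic::Response` values; that detail is omitted here to keep the example free of external crates.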

Related Issue

N/A

Changes

  • Remove the local ComputeBackend adapter and call ComputeDriver RPC handlers directly with tonic::Request.
  • Keep public sandbox get and list store-backed while reconciling the store from ListSandboxes and GetSandbox.
  • Fix reconcile races so stale list snapshots cannot overwrite newer watch state or delete a sandbox that now exists in the driver.
  • Preserve driver gRPC status mapping for already-exists and precondition errors, and update the gateway architecture doc.
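The reconcile fix in the list above could look roughly like the following. This is a sketch under stated assumptions, not the code from this PR: it assumes each record carries a monotonically increasing `version` so a stale snapshot can be detected, and it models the follow-up GetSandbox check as a `lookup` closure. `Record`, `Store`, `apply`, and `reconcile` are hypothetical names.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical store record; the `version` field is an assumption.
#[derive(Clone, Debug, PartialEq)]
pub struct Record {
    pub name: String,
    pub version: u64,
    pub phase: String,
}

#[derive(Default)]
pub struct Store {
    records: HashMap<String, Record>,
}

impl Store {
    /// Applies an update unless the store already holds a newer version.
    /// Returns true when the record was written.
    pub fn apply(&mut self, incoming: Record) -> bool {
        let stale = self
            .records
            .get(&incoming.name)
            .map_or(false, |existing| existing.version > incoming.version);
        if stale {
            return false;
        }
        self.records.insert(incoming.name.clone(), incoming);
        true
    }

    pub fn get(&self, name: &str) -> Option<&Record> {
        self.records.get(name)
    }

    /// Reconciles the store from a list snapshot.
    /// `lookup` stands in for a live per-sandbox query against the driver.
    pub fn reconcile<F>(&mut self, snapshot: Vec<Record>, lookup: F)
    where
        F: Fn(&str) -> Option<Record>,
    {
        let listed: HashSet<String> = snapshot.iter().map(|r| r.name.clone()).collect();

        // Older snapshot entries never overwrite newer watch state.
        for record in snapshot {
            self.apply(record);
        }

        // Entries absent from the snapshot are re-checked before pruning,
        // because the snapshot may predate the sandbox's creation.
        let missing: Vec<String> = self
            .records
            .keys()
            .filter(|name| !listed.contains(*name))
            .cloned()
            .collect();
        for name in missing {
            match lookup(&name) {
                Some(live) => {
                    self.apply(live);
                }
                None => {
                    self.records.remove(&name);
                }
            }
        }
    }
}
```

The two guards correspond to the two races named above: the version comparison keeps a stale list snapshot from regressing newer watch updates, and the per-sandbox lookup keeps a sandbox that is missing from the snapshot but present in the driver from being deleted.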

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated (not applicable)
  • cargo test -p openshell-server compute:: -- --nocapture
  • cargo test -p openshell-driver-kubernetes grpc::tests -- --nocapture
  • mise run pre-commit failed in an unrelated openshell-cli --test sandbox_create_lifecycle_integration run because port 8080 was already in use.

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

Signed-off-by: Drew Newberry <anewberry@nvidia.com>
drew requested a review from a team as a code owner on April 15, 2026 01:17
drew self-assigned this on Apr 15, 2026

copy-pr-bot bot commented Apr 15, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@maxamillion

LGTM

+1 to converging on the same API call path regardless of whether the driver is local or remote gRPC

drew merged commit 0bf4216 into main on Apr 15, 2026
10 checks passed
drew deleted the refactor-compute-driver-surface-anewberry branch on April 15, 2026 04:49
3 participants