Instrument guest exec steps #246

Open: sjmiller609 wants to merge 1 commit into main from hypeship/instrument-guest-exec
@sjmiller609 (Collaborator) commented May 29, 2026

Summary

  • add child spans for guest exec connection lookup, stream open, start send, and receive-until-exit steps when waiting for guest-agent readiness
  • record receive-path byte counts and exit code so slow reconfigure execs can be separated from connection/setup time
  • cover the detailed wait-for-agent tracing path and the no-wait path in guest exec tests

Tests

  • go test ./lib/guest
  • go test ./lib/instances (fails in this checkout: missing embedded hypervisor and guest-agent binaries)

Note

Low Risk
Observability-only changes gated on WaitForAgent; exec and retry behavior are unchanged aside from tracing hooks.

Overview
Adds OpenTelemetry child spans around each phase of a single guest exec attempt when WaitForAgent is set, so traces can separate agent-wait/retry time from connection setup vs. stream I/O vs. command runtime.

Under the existing guest.exec parent span, new steps are guest.exec.get_conn, open_stream, send_start, and recv_until_exit. The receive step records stdout/stderr byte counts, bytes sent, and exit code on success or failure. Step spans are not created when WaitForAgent is zero, avoiding duplicate noise next to API-level exec spans.

Tests use an in-memory span recorder to assert the full span tree and attributes on the wait path, and that no guest.exec* spans appear without waiting. Retry tests now clean up pooled gRPC connections via CloseConn.
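
The two assertions described above might look roughly like the following. This is a sketch under assumptions: `recordedSpan` stands in for whatever the in-memory recorder returns (the description does not say which recorder is used), `checkWaitPath` and `checkNoWaitPath` are hypothetical helpers, and the step names are assumed to carry the `guest.exec.` prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// recordedSpan is a stand-in for a span captured by an in-memory recorder.
type recordedSpan struct {
	Name, Parent string
}

// checkWaitPath verifies that each expected step appears at least once
// as a child of the guest.exec parent span.
func checkWaitPath(spans []recordedSpan) error {
	seen := map[string]int{}
	for _, s := range spans {
		if s.Parent == "guest.exec" {
			seen[s.Name]++
		}
	}
	for _, step := range []string{"get_conn", "open_stream", "send_start", "recv_until_exit"} {
		if seen["guest.exec."+step] == 0 {
			return fmt.Errorf("missing step span %q", "guest.exec."+step)
		}
	}
	return nil
}

// checkNoWaitPath verifies that no guest.exec* span was recorded.
func checkNoWaitPath(spans []recordedSpan) error {
	for _, s := range spans {
		if strings.HasPrefix(s.Name, "guest.exec") {
			return fmt.Errorf("unexpected span %q", s.Name)
		}
	}
	return nil
}

func main() {
	wait := []recordedSpan{
		{"guest.exec", ""},
		{"guest.exec.get_conn", "guest.exec"},
		{"guest.exec.open_stream", "guest.exec"},
		{"guest.exec.send_start", "guest.exec"},
		{"guest.exec.recv_until_exit", "guest.exec"},
	}
	fmt.Println("wait path:", checkWaitPath(wait))
	fmt.Println("no-wait path:", checkNoWaitPath(nil))
}
```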

Reviewed by Cursor Bugbot for commit 4b6e973. Bugbot is set up for automated code reviews on this repo.

@sjmiller609 sjmiller609 marked this pull request as ready for review May 29, 2026 20:58
@firetiger-agent

Firetiger deploy monitoring skipped

This PR didn't match the auto-monitor filter configured on your GitHub connection:

PRs in the kernel, infra, hypeman, and hypeship repos. kernel is a ~mono repo with many logical services underneath, ensure to focus on the implicated service for the PR

Reason: Cannot determine repository from PR information; please clarify if this PR is in the kernel, infra, hypeman, or hypeship repo, and if in kernel, confirm the affected service.

To monitor this PR anyway, reply with @firetiger monitor this.

@sjmiller609 sjmiller609 requested a review from rgarcia May 29, 2026 20:58