fix(vhd): mask fwupd on Ubuntu 24.04 to unblock E2E PR gate (AB#38355676) #8662
djsly wants to merge 2 commits
fwupd ships in the Ubuntu 24.04 cloud image and tries to start on boot. On AKS Linux nodes there is no firmware to manage -- firmware on Azure VMs is handled out-of-band by the host -- and recent fwupd releases on 24.04 exit non-zero, which trips the ValidateNoFailedSystemdUnits E2E validator (e2e/validators.go:995) on every Ubuntu 2404 scenario in the PR check-in gate (pipeline 119535).
Mask fwupd.service, fwupd-refresh.service, and fwupd-refresh.timer during VHD build (Ubuntu 24.04 only) in vhdbuilder/packer/install-dependencies.sh, mirroring the apt-daily masking pattern in the same Ubuntu block. Masking (vs. disabling) prevents systemctl preset-all from re-enabling these units.
AB#38355676
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add testFwupdMaskedOnUbuntu2404 to linux-vhd-content-test.sh so that any future regression in the fwupd masking applied by vhdbuilder/packer/install-dependencies.sh is caught at VHD-build time rather than reaching the E2E gate.
AB#38355676
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Masks fwupd-related systemd units during the Ubuntu 24.04 VHD build to prevent fwupd.service from entering a failed state on boot (which trips the E2E ValidateNoFailedSystemdUnits validator), and adds a VHD-content test to ensure the units remain masked.
Changes:
- Mask/disable `fwupd.service`, `fwupd-refresh.service`, and `fwupd-refresh.timer` during Ubuntu 24.04 VHD build.
- Add `linux-vhd-content-test.sh` coverage asserting those units are masked (or absent) on Ubuntu 24.04.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| vhdbuilder/packer/install-dependencies.sh | Adds Ubuntu 24.04-specific masking/disabling of fwupd units during VHD build. |
| vhdbuilder/packer/test/linux-vhd-content-test.sh | Adds and wires a VHD-content test that asserts fwupd units are masked (or not present) on Ubuntu 24.04. |
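For orientation, the pass/fail logic described for the content test might look like the sketch below. This is not the code from `linux-vhd-content-test.sh`: the function names `fwupd_state_verdict` and `check_fwupd_units` are hypothetical, and the sketch assumes `systemctl is-enabled` prints `masked` for a masked unit and either `not-found` or nothing on stdout for a unit that does not exist.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): map a unit-file state to a verdict.
fwupd_state_verdict() {
  case "$1" in
    masked|masked-runtime) echo "pass" ;;  # unit is masked, as intended
    not-found|"")          echo "pass" ;;  # unit absent: image does not ship fwupd
    *)                     echo "fail" ;;  # enabled, disabled, static, ...
  esac
}

# Hypothetical driver: query each unit and stop at the first unexpected state.
check_fwupd_units() {
  local unit state
  for unit in fwupd.service fwupd-refresh.service fwupd-refresh.timer; do
    # is-enabled exits non-zero for masked or absent units, so ignore its status.
    state="$(systemctl is-enabled "$unit" 2>/dev/null || true)"
    if [ "$(fwupd_state_verdict "$state")" != "pass" ]; then
      echo "unexpected state for $unit: $state" >&2
      return 1
    fi
  done
}
```

Keeping the state-to-verdict mapping in its own function means the decision table can be exercised without a running systemd; only `check_fwupd_units` needs to run inside the built image.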
```bash
# `|| true` because the units only exist when fwupd is installed (24.04 cloud image
# default; not guaranteed on minimal or future SKUs) and `mask` against a non-existent
# unit can fail under newer systemd.
if [ "$OS_VERSION" = "24.04" ]; then
    systemctl mask fwupd.service fwupd-refresh.service fwupd-refresh.timer || true
    systemctl disable --now fwupd.service fwupd-refresh.service fwupd-refresh.timer 2>/dev/null || true
fi
```
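As a side note on what the `mask` call in the snippet above leaves behind: `systemctl mask` links the unit name under `/etc/systemd/system` to `/dev/null`, so the result can also be inspected on an offline image root without a running systemd. The sketch below assumes that layout; the function name `unit_is_masked_in_root` and its root-directory argument are illustrative and not part of the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): succeed if UNIT is masked under ROOT.
unit_is_masked_in_root() {
  local root="$1" unit="$2"
  local link="${root}/etc/systemd/system/${unit}"
  # A persistently masked unit is a symlink whose target is /dev/null.
  [ -L "$link" ] && [ "$(readlink "$link")" = "/dev/null" ]
}
```

On a live node this could be called as `unit_is_masked_in_root / fwupd.service`; runtime-only masks (created with `--runtime`) live under `/run` and are not covered by this check.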
AgentBaker Linux PR gate — Build VHD fails on every distro: CRLF line endings in
Exact first-failure signature (Packer shell provisioner, immediately after The Three-level analysis:
Build-vs-test: build/VHD regression introduced by this PR.
Recommended next action / owner: PR author (Sylvain), re-save
And/or:
Also worth confirming
Note: with this fix in flight, this PR is the proposed mitigation for the recurring Ubuntu 24.04
Posted by Clawpilot AgentBaker gate detective.
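Since the diagnosis in this comment points at CRLF line endings, a quick local check before pushing might look like the following sketch. The affected file names are not recoverable from the comment, so the helpers simply take path arguments; `has_crlf` and `strip_crlf` are hypothetical names, not existing repository tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for spotting and removing CRLF line endings.

# Succeed if FILE contains at least one carriage return.
has_crlf() {
  local cr
  cr="$(printf '\r')"
  grep -q "$cr" "$1"
}

# Write a copy of SRC to DST with every carriage return removed.
strip_crlf() {
  tr -d '\r' < "$1" > "$2"
}
```

Writing to a separate destination file avoids truncating the source while it is still being read.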
not needed, disabling phasing instead
What this PR does
Masks `fwupd.service`, `fwupd-refresh.service`, and `fwupd-refresh.timer` during VHD build for Ubuntu 24.04, and adds a VHD-content test asserting they stay masked.
Why
Starting around 2026-06-08 ~21:15 UTC, every Ubuntu 24.04 E2E scenario in pipeline 119535 (AKS Linux VHD Build - PR check-in gate) started failing deterministically with:
The failure is distro-scoped to Ubuntu 24.04 and hits every 2404 scenario in unrelated PRs (~20 leaf failures/build). AzureLinux and Ubuntu 22.04 SKUs are NOT affected.
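To see on a node roughly what the `ValidateNoFailedSystemdUnits` validator mentioned in the overview reacts to, one could list failed units and drop any that are allowlisted. The sketch below is a shell approximation under that assumption, not the Go validator itself; `unexpected_failed_units` is a hypothetical name, and the input is expected in the format printed by `systemctl list-units --state=failed --plain --no-legend` (unit name in the first column).

```shell
#!/usr/bin/env bash
# Hypothetical filter (not the E2E validator): read failed-unit lines on stdin
# and print every unit name that is not among the allowlisted arguments.
unexpected_failed_units() {
  local unit allowed skip
  while read -r unit _; do
    if [ -z "$unit" ]; then
      continue
    fi
    skip=0
    for allowed in "$@"; do
      if [ "$unit" = "$allowed" ]; then
        skip=1
      fi
    done
    if [ "$skip" -eq 0 ]; then
      echo "$unit"
    fi
  done
}
```

A possible invocation on a booted node would be `systemctl list-units --state=failed --plain --no-legend | unexpected_failed_units fwupd-refresh.service`; any unit name it prints is one that a strict "no failed units" check would flag.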
Affected builds
Sample detective comment: #8642 (comment)
RCA (one paragraph)
`fwupd` (firmware-update daemon) is installed by the Ubuntu 24.04 cloud image and `fwupd.service` is enabled by default. On Azure VMs (Hyper-V Gen2) the daemon has no usable firmware-update surface, because node firmware is managed out-of-band by the Azure host, and the recent fwupd version included in the rolled-up Ubuntu archive snapshot (likely via the 2026-05-24 security-patch refresh in #8582) exits non-zero at boot. The existing `ValidateNoFailedSystemdUnits` validator (`e2e/validators.go:995`) already allowlists the sibling `fwupd-refresh.service` (see line 936) but not `fwupd.service` itself, so every 24.04 scenario trips. Sample test-log entries from build 167206065:
Fix and why (option chosen: mask at VHD-build time, not validator allowlist)
We chose to mask the units rather than allowlist `fwupd.service` in the E2E validator because:
- It mirrors existing precedent (`apt-daily` masking 10 lines above in the same file; Ubuntu Pro inert on 20.04/FIPS in "fix: make Ubuntu Pro inert on 20.04/FIPS VHDs to stop phone-home (AB#38255910)" #8638).
- `systemctl mask` (vs. `disable`) survives `systemctl preset-all` and any reinstall of fwupd.
Changes
- `vhdbuilder/packer/install-dependencies.sh`: in the same Ubuntu block as the `apt-daily` mask, added an `OS_VERSION = "24.04"` guard that `systemctl mask`s and `disable --now`s `fwupd.service`, `fwupd-refresh.service`, and `fwupd-refresh.timer`. Trailing `|| true` because the units only exist when fwupd is installed (the 24.04 cloud-image default, but not guaranteed on minimal/future SKUs).
- `vhdbuilder/packer/test/linux-vhd-content-test.sh`: added `testFwupdMaskedOnUbuntu2404` (mirrors the existing `testNfsServerService` pattern) and wired it into the test dispatch right after `testNfsServerService`. The test treats `masked` as pass, treats absent/`not-found` as pass (variant doesn't ship fwupd), and fails on any other state, so a future regression is caught at VHD-build time rather than at E2E.
Tests run locally
This change is in `vhdbuilder/packer/` (not `parts/` or `pkg/`), so `make generate` snapshot regen is not triggered and was not run. The Windows agent this PR was authored from has no `go`, `shellcheck`, or `docker` available, so unit-level lint/build was not executed locally. The ADO CI on this PR will exercise:
- `vhdbuilder/packer/` build via the linux-vhd-build pipeline
- `testFwupdMaskedOnUbuntu2404` assertion will execute as part of the in-VHD `linux-vhd-content-test.sh` suite during the build
- `fwupd.service` as a failed unit
Scope / blast radius
- Ubuntu 24.04 only (`[ "$OS_VERSION" = "24.04" ]` guard).
- No change to `e2e/validators.go`: the validator stays strict, which is what we want.
- No change to `.pipelines/.vsts-vhd-builder.yaml` or any other pipeline definition.
- `|| true` falls back gracefully if a future minimal SKU does not ship fwupd.
Tracking
Note on commit signatures
Both commits in this PR were authored from a sandboxed Windows agent without a local GPG key, and the GitHub Contents API used for the push did not auto-sign with the web-flow key (this happens for some PAT-authenticated requests on org repos). Per `CONTRIBUTING.md` the recommended remediation is `git commit --amend -S` + `git push --force-with-lease` from a GPG-equipped machine; I (Sylvain) will do that before merge. The diffs themselves are intentionally minimal and easy to review without rebasing.
Do NOT auto-merge
This PR is deliberately not marked auto-complete — reviewer eyes on the masking decision are wanted before this lands.
cc @djsly @cameronmeissner @Devinwong @lilypan26 @r2k1