vminit: replace initramfs with erofs rootfs #208
Pull request overview
Replaces the gzip CPIO initramfs with an EROFS block-device rootfs mounted as /dev/vda. The kernel now boots directly into the EROFS image (root=/dev/vda rootfstype=erofs ro init=/sbin/vminitd), avoiding the previous initramfs decompression + switch_root. Build, host shim, and guest init code are updated accordingly.
Changes:
- Build: replace the `initrd-build` Dockerfile stage with `erofs-build` (`mkfs.erofs -zlz4`), and rename the bake target / Make rule / Taskfile entry from `initrd` to `rootfs`.
- Host shim: add the EROFS rootfs as the first virtio-blk disk (`/dev/vda`), shift container disk allocation to start at `vdb`, move dynamic mount targets to `/run/mnt`, and fix the libkrun `SetKernel` binding so that an empty `initrd` argument maps to a C `NULL` pointer.
- Guest init: drop the explicit `devtmpfs` mount (kernel auto-mounts it), mount a tmpfs over `/etc` so `resolv.conf`/`hosts` writes succeed on the read-only rootfs, fix a `tmpsfs` typo, and stop forcing `NoPivot=true` so `runc` honors the runtime option.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| Dockerfile | Replaces the cpio/gzip initrd stage with an mkfs.erofs -zlz4 rootfs stage, lays out /sbin/vminitd, /sbin/crun, mount-point directories, and var/run -> /run. |
| docker-bake.hcl | Renames the initrd bake target to rootfs targeting the new erofs stage. |
| Makefile | Renames _output/nerdbox-initrd target to _output/nerdbox-rootfs.erofs and wires it to build:rootfs. |
| Taskfile.yml | Renames build:initrd to build:rootfs and updates descriptions. |
| README.md | Updates documented artifact name to nerdbox-rootfs.erofs. |
| internal/vm/libkrun/instance.go | Looks up nerdbox-rootfs.erofs, adds it as the first virtio-blk disk, stores rootfsPath, and boots with the new kernel cmdline and no initrd. |
| internal/vm/libkrun/krun.go | Changes SetKernel's initramfs to unsafe.Pointer so "" becomes a NULL C pointer. |
| internal/shim/task/mount.go | Starts the shared diskAllocator at 'b' (since vda is the VM rootfs) and moves bind/block VM mount targets to /run/mnt/.... |
| internal/shim/task/mount_test.go | Updates expected disk IDs, device names, and mount paths to reflect vdb+ and /run/mnt. |
| internal/vminit/process/init.go | Uses p.NoPivotRoot instead of forcing NoPivot: true in runc.CreateOpts. |
| pkg/vminit/initd/initd.go | Restructures systemInit to return a single error, adds the DHCP renewer goroutine inside it, fixes the tmpfs source typo, replaces the explicit devtmpfs mount with a tmpfs-over-/etc mount, and logs system-init elapsed time. |
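The `mount.go` row describes starting disk allocation at `'b'` because `vda` is now the VM rootfs. A minimal sketch of that idea follows. The type `diskLetters` and its methods are hypothetical stand-ins, not the shim's actual `diskAllocator`, and the single-letter limit (`vdb` through `vdz`) is an assumption made to keep the example short.

```go
package main

import "errors"

// diskLetters is a hypothetical stand-in for the shim's diskAllocator.
// It hands out guest virtio-blk device paths in order.
type diskLetters struct {
	next byte
}

// newDiskLetters starts at 'b' because /dev/vda is reserved for the
// EROFS rootfs, which is attached as the first virtio-blk disk.
func newDiskLetters() *diskLetters {
	return &diskLetters{next: 'b'}
}

// Next returns the next free device path, or an error once /dev/vdz
// has been handed out (an assumed limit for this sketch).
func (d *diskLetters) Next() (string, error) {
	if d.next > 'z' {
		return "", errors.New("no virtio-blk device letters left")
	}
	dev := "/dev/vd" + string(d.next)
	d.next++
	return dev, nil
}
```

With this shape, the first container disk resolves to `/dev/vdb` and the second to `/dev/vdc`, which matches the shifted device names the updated tests are described as expecting.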
Note (outside diff):
`.github/workflows/ci.yml:250-264` still references `nerdbox-initrd` (including `file _output/nerdbox-initrd`), and `.github/workflows/benchmarks.yml:109` still labels the step "initrd and shim". CI will fail because the build no longer produces `_output/nerdbox-initrd`. These files are not part of this diff, but should be updated alongside this PR.
Replace the gzip-compressed CPIO initramfs with an EROFS block device image as the VM root filesystem, eliminating the tmpfs switch_root that was required to make pivot_root available to containers.

The kernel boots directly into the erofs image on /dev/vda via 'root=/dev/vda rootfstype=erofs ro init=/sbin/vminitd', removing the initramfs decompression which contributed significantly to kernel boot time.

Build changes:
- Dockerfile: replace cpio/gzip initrd-build stage with erofs-build stage using mkfs.erofs -zlz4; vminitd placed at /sbin/vminitd
- docker-bake.hcl, Makefile: rename initrd target to rootfs

Host shim changes:
- instance.go: search for nerdbox-rootfs.erofs; add it as the first virtio-blk device in NewInstance so it is always /dev/vda; pass the erofs boot cmdline to krun_set_kernel with no initrd
- krun.go: change SetKernel's initramfs parameter to unsafe.Pointer so that an empty string maps to a C null pointer (purego converts an empty Go string to a non-null pointer, causing libkrun to fail)
- mount.go: start diskAllocator at 'b' since /dev/vda is now the VM rootfs; container disks begin at /dev/vdb

Guest init changes:
- initd.go: replace with systemMounts that mounts proc, sysfs, cgroup2, run, tmp, and a tmpfs over /etc (so runtime writes such as resolv.conf succeed on the read-only erofs). /dev is omitted: CONFIG_DEVTMPFS_MOUNT=y mounts devtmpfs before init starts, and a redundant mount returns EBUSY on the block-device root.

Signed-off-by: Derek McGowan <derek@mcg.dev>