NixOS support #379
- build.yml: single build + Cachix push + unstable channel updates
- release.yml: manual release workflow for stable/beta channels

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The SD image module provides filesystems, but toplevel builds need a minimal stub to evaluate successfully. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Required for NixOS module system to accept devMode setting. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Required when module has both options and config sections. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replaces FIXME placeholders with actual SRI hashes. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Uses Pi5 runner when RUNNER_LABELS variable is set, falls back to ubuntu with QEMU emulation otherwise. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Filter to only Pi 4B device tree (CM4 incompatible with our overlays)
- Use shorthand DTS syntax for PWM overlay

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Untracked file was excluded from Nix flake source tree, causing "No module named 'PiFinder.sys_utils_base'" on SD card boot. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add camera overlay (imx477) to netboot config.txt via flake.nix
- Fix sys_utils import in main.py to use utils.get_sys_utils()
- Add hip_main.dat fetch to pifinder-src.nix for starfield plotting
- Add dma_heap udev rule for libcamera/picamera2 access
- Fix shared memory naming in solver.py (remove leading /)
- Add DNS nameservers for netboot environment
- Document power control scripts in CLAUDE.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add runtimeCameraSelection option to hardware.nix (default: true)
- SD image includes config.txt with "include camera.txt" directive
- Users can edit camera.txt and reboot to switch cameras
- Supported cameras: imx296, imx290 (imx462), imx477
- Fix cameraDriver scope in hardware.nix (moved to top-level let)
- Add sudoers rules for systemctl stop/start pifinder.service
- Add DMA heap udev rule for libcamera video group access
- Netboot config sets cameraType = "imx477" for HQ camera dev

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Refactor sys_utils modules to use common base class
- Add sys_utils_nixos.py for NixOS-specific implementations
- Add get_sys_utils() detection in utils.py for platform selection
- Add flake.lock for reproducible builds
- Add NetworkManager config to networking.nix
- Add deploy-image-to-nfs.sh for netboot development workflow

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
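The platform selection mentioned above could look roughly like the following. This is a minimal sketch, not the code in utils.py: the helper names `is_nixos` and `sys_utils_module_name` are hypothetical, and the detection rule (the `/etc/NIXOS` marker file, falling back to `ID=nixos` in `/etc/os-release`) is an assumption about how one might detect NixOS, not something the commit states.

```python
from pathlib import Path


def is_nixos(root: Path = Path("/")) -> bool:
    """Return True when the filesystem under `root` looks like NixOS.

    Assumption: detection uses the /etc/NIXOS marker file, falling back
    to the ID field of /etc/os-release.
    """
    if (root / "etc" / "NIXOS").exists():
        return True
    try:
        text = (root / "etc" / "os-release").read_text()
    except OSError:
        return False
    for line in text.splitlines():
        key, _, value = line.partition("=")
        if key == "ID":
            return value.strip().strip('"') == "nixos"
    return False


def sys_utils_module_name(root: Path = Path("/")) -> str:
    """Pick the module a get_sys_utils()-style helper would import.

    Hypothetical helper: it only returns the dotted module name, so it
    can be exercised without the PiFinder package installed.
    """
    return "PiFinder.sys_utils_nixos" if is_nixos(root) else "PiFinder.sys_utils"
```

A caller would then pass the returned name to `importlib.import_module`, which keeps the platform check in one place instead of spreading it over every import site.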
- Update build.yml CI workflow
- Fix fonts.py import
- Fix marking_menus.py formatting
- Add missing import to preview.py
- Simplify objects_db.py
- Add catalog_imports improvements
- Update pifinder_objects.db

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Switch to NFSv4 with caching disabled (noac, actimeo=0)
- Disable auto-optimise-store in devMode (hard links fail on NFS)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ServerAliveInterval/CountMax to prevent timeout during transfers
- Use rsync -R (relative) to preserve directory structure correctly

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Comets.txt is downloaded at runtime and must be in a writable location, not the read-only Nix store. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Extend eth0 wait to 30 seconds with debug output
- Wait for link carrier before DHCP
- Add DHCP retries (3 attempts)
- Add LIBCAMERA_IPA_MODULE_PATH to pifinder service environment

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Restore SUBSYSTEM=="pwm" udev rule that was accidentally removed. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Turns on keypad LEDs during sysinit for early visual boot feedback. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- boot-splash.c: displays welcome image with scanning animation
- Starts at sysinit, stops when pifinder.service starts
- Much faster than Python splash

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove nixos-hardware module (saves 659MB linux-firmware)
- Fetch nixos-rebuild at runtime (saves ~500MB llvm/nix deps)
- Remove git from systemPackages (nix has built-in git for flakes)

Target: ~150MB vs current 1.7GB
- Remove default packages (vim, nano, etc)
- Disable polkit, udisks2, speechd
- Should reduce closure significantly
NetworkManager-vpnc alone has 1.1GB closure (webkitgtk, llvm, etc). Disable all NM plugins for bootstrap - we just need WiFi.
A trusted build (secrets + write token) is the only thing that can push to the cache and write the nixos-manifest branch — a fork PR never can. Restore the push trigger but gate the build jobs on github.repository so it acts only in brickbots: a mrosseel push still builds nothing, while a merge to brickbots builds and publishes (attic push + update-manifest, whose fork-PR skip doesn't apply to pushes).
Adopt brickbots' canonical CI + the trusted testable-PR builder from brickbots#493 instead of the nixos branch's divergent workflows:

+ nixos-pr-build.yml (pull_request_target, label-gated, trusted)
+ nox.yml (brickbots standard lint/test)
~ web-integration-tests.yml (match brickbots main)
- build.yml, lint.yml, release.yml (nixos-only; superseded)

publish_manifest.sh / update_manifest.py already match brickbots#493.
…over rotation The cutover recreated the Attic cache with a fresh key (Vkem), but nothing deployed trusted it — the whole 8UU fleet got stranded. Attic can't dual-sign (serves only its own key), so the fix is to put the cache back on 8UU (done server-side) and revert the config to match. pifinder-release key kept (new cache, never previously trusted).
It holds always-on device config (avahi, hostname, substituters/keys, sudo), used long after the Debian→NixOS migration — the name was misleading. Update the flake import and the RELEASE.md reference. Pure rename, no content change.
…ostname

nixos_upgrade.py:
- Download progress now shows a size bar that moves *within* a path and names the package being copied, parsed live from nix's internal-json (resProgress byte events summed over copyPath activities; denominator = the dry-run 'unpacked' total, since Attic narinfos omit a compressed FileSize). All accounting is throttled and wrapped so a counter bug can never stall the stream or abort the upgrade; only a short log tail is kept (not the ~800k-line stream). Dropped the now-dead per-path path-info query, the size map, and write_sizes_file/UPGRADE_SIZES_FILE.
- key-proof: fetch each cache's current signing key from its anonymous Attic cache-config endpoint and trust it for the pull (extra-trusted-public-keys; verification stays on) so a cache key rotation can't strand the fleet.

networking.nix: own avahi here (the module shared by the running system and the migration build); fix the boot race (NM dispatcher re-scans avahi on connect) and make the PiFinder_data hostname stick: hostname-mode=none plus the dispatcher re-asserting hostname + avahi-set-host-name (NixOS bakes host-name=<static> into avahi's config, which a restart would otherwise revert).

sys_utils.py / software.py: parse and show the package label under the bar.
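The byte accounting described for nixos_upgrade.py can be sketched as below. This is an illustration under assumptions, not the PR's code: the class name `CopyProgress` is hypothetical, the `@nix ` line prefix and the numeric codes 100 (copyPath activity) and 105 (resProgress result) are what nix's internal-json log format is assumed to emit, and throttling and the log tail are left out.

```python
import json
import os

ACT_COPY_PATH = 100  # assumed numeric code for nix's copyPath activity
RES_PROGRESS = 105   # assumed numeric code for nix's resProgress result


class CopyProgress:
    """Sum per-activity byte counters into one overall fraction."""

    def __init__(self, total_bytes: int):
        self.total_bytes = total_bytes   # e.g. the dry-run 'unpacked' total
        self.done_by_activity = {}       # activity id -> bytes copied so far
        self.label = ""                  # package currently being copied

    def feed(self, line: str) -> None:
        """Consume one log line; never raise, so a counter bug cannot abort."""
        try:
            if not line.startswith("@nix "):
                return
            event = json.loads(line[len("@nix "):])
            action = event.get("action")
            if action == "start" and event.get("type") == ACT_COPY_PATH:
                self.done_by_activity[event["id"]] = 0
                fields = event.get("fields") or []
                if fields:
                    # "/nix/store/<hash>-<name>" -> "<name>"
                    name = os.path.basename(str(fields[0]))
                    self.label = name.split("-", 1)[-1]
            elif action == "result" and event.get("type") == RES_PROGRESS:
                if event.get("id") in self.done_by_activity:
                    self.done_by_activity[event["id"]] = int(event["fields"][0])
        except Exception:
            pass  # accounting is best-effort by design

    def fraction(self) -> float:
        if self.total_bytes <= 0:
            return 0.0
        done = sum(self.done_by_activity.values())
        return min(1.0, done / self.total_bytes)
```

Keeping the counter per activity id, rather than one running total, is what lets the bar move within a path: each resProgress event overwrites that activity's byte count instead of being added to it.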
…crashing
The external observing-lists feature added `list_descriptions` to
CompositeObject, but CACHE_VERSION was not bumped. Devices upgrading from the
prior release keep their existing composite_objects.pkl, whose unpickled
objects lack the new field (dataclass defaults are not applied on unpickle),
so opening any object's details crashes the whole app:
AttributeError: 'CompositeObject' object has no attribute 'list_descriptions'
The main process hosts the multiprocessing shared-state manager, so its death
cascades BrokenPipe/connection-reset into every worker — the symptom seen in
the logs; the real cause was masked because main()'s handler logs via the
multiprocess queue and then os._exit()s before the record is written.
- catalog_cache: bump CACHE_VERSION 1 -> 2 so pre-list_descriptions caches are
rebuilt on upgrade (the real fix for deployed devices).
- composite_object: getattr guard in composed_sections so a stale-cached object
degrades gracefully instead of taking down the process.
- main: print + flush the traceback before os._exit so a fatal exception in
main() lands in the journal instead of being lost to the log queue.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RjeCZ17KqhKzhKikWGwBDo
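The failure mode and both fixes can be reproduced in a few lines. This is a minimal sketch with stand-in names: the one-field `CompositeObject`, the `{"version": ..., "objects": ...}` cache layout, and the helpers `make_stale_pickle`, `load_cache`, and `descriptions_of` are assumptions for illustration, not the actual PiFinder code.

```python
import pickle
from dataclasses import dataclass, field

CACHE_VERSION = 2  # bumped from 1 when list_descriptions was added


@dataclass
class CompositeObject:
    name: str = ""
    list_descriptions: list = field(default_factory=list)


def make_stale_pickle() -> bytes:
    """Build the bytes an older release would have written.

    Pickle stores only the instance __dict__, so dropping the attribute
    before dumping yields state without a list_descriptions key.
    """
    obj = CompositeObject(name="M31")
    del obj.list_descriptions
    return pickle.dumps(obj)


def descriptions_of(obj) -> list:
    """Guarded access: a stale object degrades to an empty list."""
    return getattr(obj, "list_descriptions", [])


def load_cache(blob: bytes):
    """Return cached objects, or None when the cache must be rebuilt.

    Assumed cache layout: a dict with 'version' and 'objects' keys.
    """
    payload = pickle.loads(blob)
    if payload.get("version") != CACHE_VERSION:
        return None
    return payload["objects"]


stale = pickle.loads(make_stale_pickle())
# Unpickling restores __dict__ without calling __init__, so the
# default_factory never runs and the attribute is simply absent.
print(hasattr(stale, "list_descriptions"))
print(descriptions_of(stale))
```

The version bump is the real fix because it discards every stale object at once; the `getattr` guard only limits the damage if some other stale cache slips through.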
Camera switching only changed the persisted choice and ran switch-to-configuration boot, but the generic-extlinux builder always writes DEFAULT=nixos-default (the base camera) and device-tree overlays load only at boot. So a device set to imx477 kept booting the base imx462 DTB, the imx290 driver bound to absent hardware (Error writing reg 0x3038), and the camera never worked.

Fix A keeps the one-image/no-rebuild specialisation design and just makes the chosen specialisation the boot default:
- set-extlinux-default: fail-safe helper that repoints extlinux DEFAULT to the latest-generation nixos-<gen>-<camera> entry (base camera -> nixos-default); leaves a bootable DEFAULT untouched if the entry is missing.
- pifinder-switch-camera: repoint DEFAULT to the chosen camera, then reboot (DT overlays are boot-only).
- nixos_upgrade: re-apply the persisted camera's DEFAULT after activation, before the upgrade reboot (every rebuild resets DEFAULT to base).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RjeCZ17KqhKzhKikWGwBDo
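The fail-safe repointing done by set-extlinux-default can be sketched as a pure text transformation. This is not the helper from the PR: the function name `repoint_default` is hypothetical, and the sketch assumes extlinux.conf has one `DEFAULT <label>` line and `LABEL <name>` lines following the nixos-<gen>-<camera> naming described above.

```python
import re


def repoint_default(conf: str, camera: str, base_camera: str) -> str:
    """Return extlinux.conf text with DEFAULT pointing at `camera`.

    Fail-safe: if the wanted entry does not exist, or there is no
    DEFAULT line to rewrite, the text is returned unchanged.
    """
    labels = re.findall(r"^LABEL[ \t]+(\S+)[ \t]*$", conf, flags=re.MULTILINE)

    if camera == base_camera:
        target = "nixos-default"
    else:
        wanted = re.compile(rf"^nixos-(\d+)-{re.escape(camera)}$")
        candidates = [
            (int(m.group(1)), label)
            for label in labels
            if (m := wanted.match(label))
        ]
        # Highest generation number wins.
        target = max(candidates)[1] if candidates else None

    if target is None or target not in labels:
        return conf  # keep the existing, bootable DEFAULT

    new_conf, count = re.subn(
        r"^DEFAULT[ \t]+\S+[ \t]*$",
        lambda _match: f"DEFAULT {target}",
        conf,
        count=1,
        flags=re.MULTILINE,
    )
    return new_conf if count == 1 else conf
```

Working on a string rather than the file keeps the decision logic testable; a real helper would presumably also write the result atomically (temporary file plus rename) so that a power cut cannot leave a half-written boot config.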
The ADR said device update downloads ship only the genuinely-new chunks (~80 MB for a 1.5 GB closure). That conflates server-side storage / CI-upload dedup with the device download. Attic serves whole NARs over the standard binary-cache protocol: a device fetches the full compressed NAR of every changed store path, with no chunk-delta against the previous version. The only device-side saving is path-level (unchanged paths are not refetched). True client-side chunk-delta needs a casync/desync client with a local chunk store. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01RjeCZ17KqhKzhKikWGwBDo
pifinder-src is rebuilt (new store hash) on every code change, and Attic ships whole NARs — so each change re-downloads ~43MB and rewrites ~70MB to the SD, even for a one-line edit. 56MB of that is stable: fonts (~31MB) and the pinned tetra3/cedar-solve solver (~25MB, ~15MB after trimming examples/tests/docs). Move both into their own derivations (like astro-data) and symlink them into pifinder-src. A routine code change now rewrites only the ~16MB code path; fonts/tetra3 are distributed once and shared across changes. tetra3 keeps cedar_detect_pb2 (ships in the repo) and is pre-compiled; it is symlinked after compileall so bytecode isn't written into the read-only store path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01RjeCZ17KqhKzhKikWGwBDo
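The effect of the split on a routine update can be shown with a small path-level model. It is a sketch only: `bytes_to_fetch` is a hypothetical helper, the store-path names are made up, and the sizes are the approximate unpacked figures from the message above (fonts ~31 MB, tetra3 ~25 MB or ~15 MB trimmed, code ~16 MB), in megabytes.

```python
def bytes_to_fetch(old_closure: dict, new_closure: dict) -> int:
    """Sum the sizes of store paths the device does not already have.

    Models path-level saving only: a changed path is fetched whole,
    an unchanged path (same store hash) is not refetched.
    """
    return sum(
        size for path, size in new_closure.items() if path not in old_closure
    )


# Before the split: fonts + tetra3 + code live in one path, so any code
# change produces a new store hash for the whole ~72 MB.
monolithic_old = {"aaaa-pifinder-src": 31 + 25 + 16}
monolithic_new = {"bbbb-pifinder-src": 31 + 25 + 16}

# After the split: fonts and tetra3 keep their hashes across code changes.
split_old = {"cccc-fonts": 31, "dddd-tetra3": 15, "eeee-pifinder-src": 16}
split_new = {"cccc-fonts": 31, "dddd-tetra3": 15, "ffff-pifinder-src": 16}

print(bytes_to_fetch(monolithic_old, monolithic_new))  # whole path again
print(bytes_to_fetch(split_old, split_new))            # code path only
```

The model ignores compression, which is why it works in unpacked sizes rather than the ~43 MB download figure.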
Adds a 'Rollback' channel next to stable/beta/unstable that lists the on-disk system generations you can roll back to (all but the current one).

- sys_utils.list_rollback_targets(): reads ONLY immutable generation data (the /nix/var/nix/profiles symlinks + store-path labels), with no sidecar JSON state to evolve or corrupt across up/downgrades. Entry = label + generation + date.
- software.py: the Rollback channel is built locally, so it's available even when the manifest fetch fails, i.e. exactly when you're stranded on a bad build. Select -> confirm -> reuses update_software() (the entry's ref is the generation's store path, so it activates + reboots with no download).
- nixos_upgrade.cleanup_old_generations: keep +3 (was +2) -> 2 rollback targets.

Labels currently come from the store-path name; a follow-up can set system.nixos.label at build time for 'PR-379'-style names (and matching bootloader entries).
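Reading rollback targets from the profile symlinks alone might look like this. It is a sketch under assumptions rather than the function in sys_utils: the entry dict keys, the use of the `system` symlink to identify the current generation, and taking the date from the generation link's own mtime are all illustrative choices.

```python
import os
import re
from datetime import datetime
from pathlib import Path

GEN_LINK = re.compile(r"^system-(\d+)-link$")


def list_rollback_targets(profiles_dir="/nix/var/nix/profiles"):
    """List system generations other than the current one, newest first."""
    profiles = Path(profiles_dir)
    if not profiles.is_dir():
        return []
    try:
        # "system" -> "system-<N>-link" names the current generation.
        current = os.path.basename(os.readlink(profiles / "system"))
    except OSError:
        current = ""

    targets = []
    for entry in profiles.iterdir():
        match = GEN_LINK.match(entry.name)
        if not match or not entry.is_symlink() or entry.name == current:
            continue
        store_path = os.readlink(entry)
        targets.append({
            "generation": int(match.group(1)),
            # "/nix/store/<hash>-<name>" -> "<name>"
            "label": os.path.basename(store_path).split("-", 1)[-1],
            # lstat reads the link itself, not the store path it points to
            "date": datetime.fromtimestamp(entry.lstat().st_mtime)
            .date()
            .isoformat(),
            "ref": store_path,
        })
    return sorted(targets, key=lambda t: t["generation"], reverse=True)
```

Because every field is derived from the symlinks at call time, there is nothing to migrate when the software is upgraded or downgraded, which is the property the commit message is after.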
{ config, lib, pkgs, ... }:
{
  networking = {
    hostName = "pifinder";
Does networking.nix hardcode the hostName and revert it every time nix is invoked? (Cf. Hashed password vs. initialPassword)
It seems that the hostName is taken care of later on. Please confirm.
Hi, yes, the hostname is stored in the PiFinder data area.
# NTP server can't block the clock. FallbackNTP alone is skipped whenever a
# per-interface server is known; too fragile to rely on for first-boot
# migration, which gates the binary-cache fetch on a synchronized clock.
services.timesyncd.servers = [
Does timesyncd support sourcing from GPS time?
No idea. Was this a feature we had before? If not, I propose to fix that in a PR against this branch. It's already a big migration, and I want to wrap it up ASAP.
[wifi]
mode=ap
ssid=PiFinderAP
How does PiFinder with nix handle changes of the WiFi AP name? Is this overwritten every time a new version is selected and downloaded?
🤖 Generated with Claude Code