feat: add dev mode toggle and screen off burn-in protection #385

Open
mrosseel wants to merge 252 commits into brickbots:main from mrosseel:screensaver

Conversation

@mrosseel

Collaborator

Summary

  • Press square 7 times anywhere to toggle dev mode (shows popup feedback)
  • Dev mode unlocks "Screen Off" option in Status menu and unstable software channel
  • Screen off blanks display and LEDs after configurable timeout (burn-in protection)
  • Hourly LED heartbeat pulse while screen is off
  • Test mode now shows popup "Test Mode ON/OFF" and uses hollow camera icon
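
The 7x square gesture can be sketched as a small press counter. This is a minimal illustration, not the PR's implementation: the class name `DevModeGesture`, the key name `"square"`, and the rule that any other key resets the streak are all assumptions made for the example.

```python
from typing import Optional


class DevModeGesture:
    """Toggle a flag after N consecutive presses of one key (hypothetical helper)."""

    def __init__(self, key: str = "square", presses_needed: int = 7) -> None:
        self.key = key
        self.presses_needed = presses_needed
        self.count = 0
        self.dev_mode = False

    def on_key(self, key: str) -> Optional[str]:
        """Feed one key event; return popup text when the mode flips, else None."""
        if key != self.key:
            self.count = 0  # assumption: any other key breaks the streak
            return None
        self.count += 1
        if self.count < self.presses_needed:
            return None
        self.count = 0
        self.dev_mode = not self.dev_mode
        return "DEV MODE ON" if self.dev_mode else "DEV MODE OFF"
```

Feeding seven `"square"` events in a row returns the popup text on the seventh; a second run of seven flips the flag back.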

Test plan

  • Press square 7x → see "DEV MODE ON" popup
  • Go to Status → "Screen Off" option visible
  • Set sleep_timeout="10s", screen_off_timeout="30s"
  • Wait 10s → screen dims (sleep)
  • Wait 30s more → screen blanks, LEDs off
  • Press any key → full wake
  • Press square 7x → "DEV MODE OFF", Screen Off hidden
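
The timing in this test plan implies that the screen-off timeout is counted from the moment the display dims, not from the last key press. A rough sketch of that reading follows; the function name `display_state` and the state names are hypothetical, and the timeouts are taken as plain seconds, not the `"10s"` config strings.

```python
def display_state(idle_seconds: float,
                  sleep_timeout: float,
                  screen_off_timeout: float) -> str:
    """Map time since the last key press to a display state.

    Assumption: the screen-off timeout starts counting once the
    display has dimmed, matching "wait 30s more" in the test plan.
    """
    if idle_seconds < sleep_timeout:
        return "awake"
    if idle_seconds < sleep_timeout + screen_off_timeout:
        return "dimmed"
    return "off"  # display blanked and LEDs off
```

With `sleep_timeout=10` and `screen_off_timeout=30`, the display dims at 10 s idle and blanks at 40 s idle; any key press resets `idle_seconds` to zero, which returns the state to `"awake"`.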

🤖 Generated with Claude Code

mrosseel and others added 30 commits February 4, 2026 19:08
Required for NixOS module system to accept devMode setting.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Required when module has both options and config sections.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replaces FIXME placeholders with actual SRI hashes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Uses Pi5 runner when RUNNER_LABELS variable is set, falls back to
ubuntu with QEMU emulation otherwise.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Filter to only Pi 4B device tree (CM4 incompatible with our overlays)
- Use shorthand DTS syntax for PWM overlay

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Untracked file was excluded from Nix flake source tree, causing
"No module named 'PiFinder.sys_utils_base'" on SD card boot.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add camera overlay (imx477) to netboot config.txt via flake.nix
- Fix sys_utils import in main.py to use utils.get_sys_utils()
- Add hip_main.dat fetch to pifinder-src.nix for starfield plotting
- Add dma_heap udev rule for libcamera/picamera2 access
- Fix shared memory naming in solver.py (remove leading /)
- Add DNS nameservers for netboot environment
- Document power control scripts in CLAUDE.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add runtimeCameraSelection option to hardware.nix (default: true)
- SD image includes config.txt with "include camera.txt" directive
- Users can edit camera.txt and reboot to switch cameras
- Supported cameras: imx296, imx290 (imx462), imx477
- Fix cameraDriver scope in hardware.nix (moved to top-level let)
- Add sudoers rules for systemctl stop/start pifinder.service
- Add DMA heap udev rule for libcamera video group access
- Netboot config sets cameraType = "imx477" for HQ camera dev

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Refactor sys_utils modules to use common base class
- Add sys_utils_nixos.py for NixOS-specific implementations
- Add get_sys_utils() detection in utils.py for platform selection
- Add flake.lock for reproducible builds
- Add NetworkManager config to networking.nix
- Add deploy-image-to-nfs.sh for netboot development workflow

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update build.yml CI workflow
- Fix fonts.py import
- Fix marking_menus.py formatting
- Add missing import to preview.py
- Simplify objects_db.py
- Add catalog_imports improvements
- Update pifinder_objects.db

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Switch to NFSv4 with caching disabled (noac, actimeo=0)
- Disable auto-optimise-store in devMode (hard links fail on NFS)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ServerAliveInterval/CountMax to prevent timeout during transfers
- Use rsync -R (relative) to preserve directory structure correctly

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Comets.txt is downloaded at runtime and must be in a writable
location, not the read-only Nix store.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Extend eth0 wait to 30 seconds with debug output
- Wait for link carrier before DHCP
- Add DHCP retries (3 attempts)
- Add LIBCAMERA_IPA_MODULE_PATH to pifinder service environment

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Restore SUBSYSTEM=="pwm" udev rule that was accidentally removed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Turns on keypad LEDs during sysinit for early visual boot feedback.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- boot-splash.c: displays welcome image with scanning animation
- Starts at sysinit, stops when pifinder.service starts
- Much faster than Python splash

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove nixos-hardware module (saves 659MB linux-firmware)
- Fetch nixos-rebuild at runtime (saves ~500MB llvm/nix deps)
- Remove git from systemPackages (nix has built-in git for flakes)

Target: ~150MB vs current 1.7GB
- Remove default packages (vim, nano, etc)
- Disable polkit, udisks2, speechd
- Should reduce closure significantly
NetworkManager-vpnc alone has 1.1GB closure (webkitgtk, llvm, etc).
Disable all NM plugins for bootstrap - we just need WiFi.
- Disable xdg.mime/icons/sounds (pulls xdg-utils -> perl 112MB)
- Disable command-not-found (pulls perl)
- Disable fuse (86MB)
- Disable initrd extra filesystems
github-actions Bot and others added 5 commits June 24, 2026 16:52
The update-manifest job branched nixos-manifest off the full source
checkout and only git-added the manifest, so the branch inherited the
entire 1400+ file source tree and history. Concurrent trunk/PR stamps
also clobbered or failed each other on the single update-manifest.json.

Extract publishing into .github/scripts/publish_manifest.sh, shared by
build.yml and release.yml:
- rebuild nixos-manifest as a single-file orphan tree every run
- fetch -> re-apply this run's entry -> retry on a rejected push, so
  concurrent writers cannot lose an update (a git ref update is CAS)
Add a manifest-write concurrency group on the build update-manifest job
as a coarse serializer in front of the retry loop.
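
The fetch, re-apply, retry loop described above can be modelled in a few lines. This is an in-memory sketch of the idea only, not the contents of publish_manifest.sh: `FakeRemote`, `publish_entry`, and the `before_push` hook are hypothetical names, and the version counter stands in for the branch tip that a real `git push` compares against.

```python
class FakeRemote:
    """In-memory stand-in for a branch tip with compare-and-swap pushes."""

    def __init__(self) -> None:
        self.version = 0
        self.manifest = {}

    def fetch(self):
        """Return the current tip version and a copy of the manifest."""
        return self.version, dict(self.manifest)

    def push(self, expected_version: int, manifest: dict) -> bool:
        """Accept the push only if the tip has not moved since the fetch."""
        if expected_version != self.version:
            return False  # rejected, like a non-fast-forward push
        self.version += 1
        self.manifest = dict(manifest)
        return True


def publish_entry(remote, key, entry, max_attempts=5, before_push=None):
    """Fetch, re-apply this run's entry, push; retry if the push is rejected."""
    for attempt in range(1, max_attempts + 1):
        version, manifest = remote.fetch()
        manifest[key] = entry  # re-apply on top of whatever is there now
        if before_push is not None:
            before_push(attempt)  # test hook: lets a rival writer sneak in
        if remote.push(version, manifest):
            return attempt
    raise RuntimeError("manifest push rejected %d times" % max_attempts)
```

Because each retry starts from a fresh fetch, a rival writer's entry is kept instead of overwritten; the concurrency group only reduces how often the retry path is taken.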

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RjeCZ17KqhKzhKikWGwBDo
mrosseel and others added 8 commits June 25, 2026 10:45
…ollapse

nixos-manifest is metadata-only (just update-manifest.json); the find-prune +
add -A was one-time cleanup of the already-collapsed branch. Keep only: fetch
tip, rewrite the entry, push, with the concurrency retry.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RjeCZ17KqhKzhKikWGwBDo
cache.pifinder.eu's chunk store moved from local disk to AWS S3 (bucket
pifinder-nix-cache, us-west-1). The cutover used a fresh atticd DB, which
regenerated both caches' NAR signing keypairs. Rotate the pinned dev key
and wire in the now-provisioned pifinder-release key.

  pifinder:         8UU/...jBmE=  ->  Vkem...3gck=
  pifinder-release: (newly pinned)    WG/F...KpoM=

ATTIC_TOKEN is unchanged (RS256 server secret preserved), so CI keeps
working once these land.
A push to main/nixos triggered the whole build pipeline (build-native →
build-emulated → update-manifest → migration-tarball). By design mrosseel
commits must build nothing; only labeled PRs (brickbots contributions)
and pre-releases/releases (release.yml) should build. Drop the `push:`
trigger and the event_name=='push' arms in build-native/native-wait;
update-manifest still follows the build jobs, so it runs on labeled PRs.
…n fork PRs

- software.py: read update-manifest.json from brickbots/PiFinder (canonical),
  not the mrosseel fork — devices were pointed at the wrong repo.
- build.yml build-native: guard 'Push to Attic' on $ATTIC_TOKEN so fork PRs
  (no secrets) build verify-only instead of failing with AccessError, matching
  build-emulated.
- build.yml update-manifest: drop 'ref: <head_ref>' checkout — a fork PR's head
  branch doesn't exist here, so it 404'd; default ref resolves for both.
A trusted build (secrets + write token) is the only thing that can push to
the cache and write the nixos-manifest branch — a fork PR never can. Restore
the push trigger but gate the build jobs on github.repository so it acts only
in brickbots: a mrosseel push still builds nothing, while a merge to brickbots
builds and publishes (attic push + update-manifest, whose fork-PR skip doesn't
apply to pushes).
Adopt brickbots' canonical CI + the trusted testable-PR builder from brickbots#493
instead of the nixos branch's divergent workflows:
  + nixos-pr-build.yml  (pull_request_target, label-gated, trusted)
  + nox.yml             (brickbots standard lint/test)
  ~ web-integration-tests.yml  (match brickbots main)
  - build.yml, lint.yml, release.yml  (nixos-only; superseded)
publish_manifest.sh / update_manifest.py already match brickbots#493.
@mrosseel added and then removed the `testable` label (Ready for testing via PiFinder software update) on Jun 25, 2026
mrosseel and others added 5 commits June 26, 2026 00:21
…over rotation

The cutover recreated the Attic cache with a fresh key (Vkem), but nothing
deployed trusted it — the whole 8UU fleet got stranded. Attic can't dual-sign
(serves only its own key), so the fix is to put the cache back on 8UU (done
server-side) and revert the config to match. pifinder-release key kept (new
cache, never previously trusted).
It holds always-on device config (avahi, hostname, substituters/keys, sudo),
used long after the Debian→NixOS migration — the name was misleading. Update
the flake import and the RELEASE.md reference. Pure rename, no content change.
…ostname

nixos_upgrade.py:
- Download progress now shows a size bar that moves *within* a path and names
  the package being copied, parsed live from nix's internal-json (resProgress
  byte events summed over copyPath activities; denominator = the dry-run
  'unpacked' total, since Attic narinfos omit a compressed FileSize). All
  accounting is throttled and wrapped so a counter bug can never stall the
  stream or abort the upgrade; only a short log tail is kept (not the
  ~800k-line stream). Dropped the now-dead per-path path-info query, the size
  map, and write_sizes_file/UPGRADE_SIZES_FILE.
- key-proof: fetch each cache's current signing key from its anonymous Attic
  cache-config endpoint and trust it for the pull (extra-trusted-public-keys;
  verification stays on) so a cache key rotation can't strand the fleet.
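
As a rough illustration of the byte accounting described in the first bullet, the sketch below sums the latest `done` value of progress results over copy-path activities in an internal-json stream. It is not the code in nixos_upgrade.py. The numeric codes (100 for a copy-path activity, 105 for a progress result), the `@nix ` line prefix, and the `fields[0]` position of the byte count are assumptions about nix's log format; the function names are hypothetical.

```python
import json

ACT_COPY_PATH = 100  # assumed activity type code for copyPath
RES_PROGRESS = 105   # assumed result type code for resProgress
PREFIX = "@nix "     # assumed prefix on internal-json log lines


def copied_bytes(lines) -> int:
    """Sum the latest 'done' byte count of every copy-path activity."""
    copy_ids = set()
    done = {}
    for line in lines:
        if not line.startswith(PREFIX):
            continue
        try:
            event = json.loads(line[len(PREFIX):])
        except ValueError:
            continue  # a malformed line must never abort the accounting
        if not isinstance(event, dict):
            continue
        action = event.get("action")
        if action == "start" and event.get("type") == ACT_COPY_PATH:
            copy_ids.add(event.get("id"))
        elif (action == "result" and event.get("type") == RES_PROGRESS
              and event.get("id") in copy_ids):
            fields = event.get("fields") or []
            if fields and isinstance(fields[0], int):
                done[event["id"]] = fields[0]  # keep only the newest value
    return sum(done.values())


def progress_fraction(done: int, total_unpacked: int) -> float:
    """Bar position against the dry-run 'unpacked' total, clamped to 1.0."""
    if total_unpacked <= 0:
        return 0.0
    return min(1.0, done / total_unpacked)
```

Keeping one running value per activity id means the bar advances while a single large path is still copying, which is the behaviour the bullet describes.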

networking.nix: own avahi here (the module shared by the running system and the
migration build); fix the boot race (NM dispatcher re-scans avahi on connect)
and make the PiFinder_data hostname stick — hostname-mode=none plus the
dispatcher re-asserting hostname + avahi-set-host-name (NixOS bakes
host-name=<static> into avahi's config, which a restart would otherwise revert).

sys_utils.py / software.py: parse and show the package label under the bar.
…crashing

The external observing-lists feature added `list_descriptions` to
CompositeObject, but CACHE_VERSION was not bumped. Devices upgrading from the
prior release keep their existing composite_objects.pkl, whose unpickled
objects lack the new field (dataclass defaults are not applied on unpickle),
so opening any object's details crashes the whole app:

    AttributeError: 'CompositeObject' object has no attribute 'list_descriptions'

The main process hosts the multiprocessing shared-state manager, so its death
cascades BrokenPipe/connection-reset into every worker — the symptom seen in
the logs; the real cause was masked because main()'s handler logs via the
multiprocess queue and then os._exit()s before the record is written.

- catalog_cache: bump CACHE_VERSION 1 -> 2 so pre-list_descriptions caches are
  rebuilt on upgrade (the real fix for deployed devices).
- composite_object: getattr guard in composed_sections so a stale-cached object
  degrades gracefully instead of taking down the process.
- main: print + flush the traceback before os._exit so a fatal exception in
  main() lands in the journal instead of being lost to the log queue.
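
The failure mode and the getattr guard can be reproduced in isolation. The sketch below uses a two-field stand-in for `CompositeObject`, not the real class, and assumes the new field uses a `default_factory` (a plain class-level default would still be found through the class). `load_stale` and `descriptions_of` are hypothetical helpers; `load_stale` mimics pickle's default behaviour of restoring the saved attributes without calling `__init__`.

```python
from dataclasses import dataclass, field


@dataclass
class CompositeObject:
    """Two-field stand-in for the real class (illustration only)."""
    name: str = ""
    # The newly added field; default_factory leaves no class-level fallback.
    list_descriptions: list = field(default_factory=list)


def load_stale(state: dict) -> CompositeObject:
    """Rebuild an instance the way pickle does by default:
    allocate without __init__, then restore the saved attributes."""
    obj = object.__new__(CompositeObject)
    obj.__dict__.update(state)
    return obj


def descriptions_of(obj) -> list:
    """Guarded read: a stale cached object yields an empty list."""
    return getattr(obj, "list_descriptions", [])
```

`load_stale({"name": "M31"})` yields an object whose `list_descriptions` attribute is missing, so a direct read raises `AttributeError` while `descriptions_of` returns `[]`. The guard only softens the crash; bumping the cache version is what actually rebuilds the stale objects.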

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RjeCZ17KqhKzhKikWGwBDo
Camera switching only changed the persisted choice and ran
switch-to-configuration boot, but the generic-extlinux builder always writes
DEFAULT=nixos-default (the base camera) and device-tree overlays load only at
boot. So a device set to imx477 kept booting the base imx462 DTB, the imx290
driver bound to absent hardware (Error writing reg 0x3038), and the camera
never worked.

Fix A keeps the one-image/no-rebuild specialisation design and just makes the
chosen specialisation the boot default:
- set-extlinux-default: fail-safe helper that repoints extlinux DEFAULT to the
  latest-generation nixos-<gen>-<camera> entry (base camera -> nixos-default);
  leaves a bootable DEFAULT untouched if the entry is missing.
- pifinder-switch-camera: repoint DEFAULT to the chosen camera, then reboot
  (DT overlays are boot-only).
- nixos_upgrade: re-apply the persisted camera's DEFAULT after activation,
  before the upgrade reboot (every rebuild resets DEFAULT to base).
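
The fail-safe repointing can be sketched as a pure function over the text of extlinux.conf. This is an illustration in Python and not the actual set-extlinux-default helper; the function name `repoint_default` and the exact `DEFAULT <label>` / `LABEL <label>` line layout are assumptions.

```python
import re


def repoint_default(conf: str, camera: str, base_camera: str) -> str:
    """Return conf with DEFAULT pointing at the chosen camera's entry.

    If no matching entry exists, the text is returned unchanged so a
    bootable DEFAULT is never replaced by a dangling one.
    """
    labels = re.findall(r"^LABEL[ \t]+(\S+)[ \t]*$", conf, flags=re.MULTILINE)
    target = None
    if camera == base_camera:
        if "nixos-default" in labels:
            target = "nixos-default"
    else:
        pattern = re.compile(r"nixos-(\d+)-" + re.escape(camera))
        candidates = []
        for label in labels:
            match = pattern.fullmatch(label)
            if match:
                candidates.append((int(match.group(1)), label))
        if candidates:
            target = max(candidates)[1]  # highest generation number wins
    if target is None:
        return conf  # fail-safe: leave the existing DEFAULT alone
    new_conf, count = re.subn(
        r"^DEFAULT[ \t]+\S+[ \t]*$",
        lambda _m: "DEFAULT " + target,
        conf,
        count=1,
        flags=re.MULTILINE,
    )
    return new_conf if count else conf
```

Generations are compared as integers, so `nixos-12-imx477` is preferred over `nixos-9-imx477`. Since every rebuild resets DEFAULT to the base entry, the same function would need to run again after activation.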

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RjeCZ17KqhKzhKikWGwBDo
mrosseel and others added 4 commits June 26, 2026 12:37
The ADR said device update downloads ship only the genuinely-new chunks
(~80 MB for a 1.5 GB closure). That conflates server-side storage / CI-upload
dedup with the device download. Attic serves whole NARs over the standard
binary-cache protocol: a device fetches the full compressed NAR of every
changed store path, with no chunk-delta against the previous version. The only
device-side saving is path-level (unchanged paths are not refetched). True
client-side chunk-delta needs a casync/desync client with a local chunk store.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RjeCZ17KqhKzhKikWGwBDo
pifinder-src is rebuilt (new store hash) on every code change, and Attic ships
whole NARs — so each change re-downloads ~43MB and rewrites ~70MB to the SD,
even for a one-line edit. 56MB of that is stable: fonts (~31MB) and the pinned
tetra3/cedar-solve solver (~25MB, ~15MB after trimming examples/tests/docs).

Move both into their own derivations (like astro-data) and symlink them into
pifinder-src. A routine code change now rewrites only the ~16MB code path;
fonts/tetra3 are distributed once and shared across changes. tetra3 keeps
cedar_detect_pb2 (ships in the repo) and is pre-compiled; it is symlinked after
compileall so bytecode isn't written into the read-only store path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RjeCZ17KqhKzhKikWGwBDo
Adds a 'Rollback' channel next to stable/beta/unstable that lists the on-disk
system generations you can roll back to (all but the current one).

- sys_utils.list_rollback_targets(): reads ONLY immutable generation data (the
  /nix/var/nix/profiles symlinks + store-path labels) — no sidecar JSON state to
  evolve or corrupt across up/downgrades. Entry = label + generation + date.
- software.py: the Rollback channel is built locally, so it's available even
  when the manifest fetch fails — i.e. exactly when you're stranded on a bad
  build. Select -> confirm -> reuses update_software() (the entry's ref is the
  generation's store path, so it activates + reboots with no download).
- nixos_upgrade.cleanup_old_generations: keep +3 (was +2) -> 2 rollback targets.

Labels currently come from the store-path name; a follow-up can set
system.nixos.label at build time for 'PR-379'-style names (and matching
bootloader entries).
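
A minimal sketch of reading rollback targets from the profile symlinks alone is shown below. It is not the PR's sys_utils code: the dictionary keys, the `system-<N>-link` naming, the use of the link's mtime as the date, and the label rule (store-path name with the hash prefix removed) are assumptions made for the example.

```python
import os
import re
from datetime import datetime


def list_rollback_targets(profiles_dir: str = "/nix/var/nix/profiles"):
    """List system generations other than the current one, newest first."""
    try:
        current = os.path.basename(
            os.readlink(os.path.join(profiles_dir, "system")))
    except OSError:
        current = None  # no readable 'system' link: list everything
    targets = []
    for name in os.listdir(profiles_dir):
        match = re.fullmatch(r"system-(\d+)-link", name)
        if not match or name == current:
            continue
        link = os.path.join(profiles_dir, name)
        try:
            store_path = os.readlink(link)
            mtime = os.lstat(link).st_mtime
        except OSError:
            continue  # skip entries that are not readable symlinks
        base = os.path.basename(store_path)
        targets.append({
            "generation": int(match.group(1)),
            # Store paths are '<hash>-<name>'; drop the hash for display.
            "label": base.split("-", 1)[-1],
            "date": datetime.fromtimestamp(mtime).strftime("%Y-%m-%d"),
            "ref": store_path,  # activating this path needs no download
        })
    return sorted(targets, key=lambda t: t["generation"], reverse=True)
```

Because the function reads only the symlinks, it works without network access, which is the situation the Rollback channel is meant for.
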
Re-applied the dev-mode / test-mode + screen-off burn-in feature as a clean
delta on top of the updated nixos branch (which now carries the upstream
2.6.0 merge), dropping the stale nixos-infra commits the old branch carried.

Conflicts resolved against nixos: use _imu.moving (not dict access) in the
sleep/screen-off loop; gate debug-camera image send on test_mode via the
existing debug flag; keep nixos's computed camera-icon position; add the
Screen Off item to the Experimental menu without resurrecting the SQM entry
that nixos moved to top level. status.py taken from nixos (its _config_options
table was refactored away, so the feature's edits there are superseded).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
dev_mode_only was declared on the Screen Off menu item but never read by
any menu filter, so it had no effect. Remove it; Screen Off stays under
Tools > Experimental > Dev Tools next to Telemetry. Dev mode keeps its one
real effect (the 7x SQUARE gesture unlocking the Unstable update channel).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01RjeCZ17KqhKzhKikWGwBDo

Labels

testable Ready for testing via PiFinder software update

2 participants