Marcos idempotency #343

Open
JedMeister wants to merge 5 commits into turnkeylinux:19.x-dev from JedMeister:marcos-idempotency
@JedMeister
Member

Idempotency updates provided by @marcos-mendez - pulled from #339 (rebased on current HEAD of 19.x-dev branch to fix merge conflicts).

navigator and others added 5 commits April 10, 2026 10:57
- plans/turnkey/base: add libsocket6-perl + libio-socket-ssl-perl (IPv6 Webmin)
- plans/turnkey/base: uncomment tklbam (migrated to Python 3.13)
- conf/turnkey.d/webmin-conf: enable ipv6=1 by default
- overlays/turnkey.d/networking/etc/gai.conf: prefer IPv4 for external connections

Tested: Built turnkey-core v19 ISO (406MB), LXC container running with
Webmin on IPv4+IPv6, SSH, systemd, Python 3.13, kernel 6.12.
@OnGle
Member

OnGle commented Apr 13, 2026

@JedMeister Do we actually assert that common conf scripts are idempotent?

@JedMeister
Member Author

> @JedMeister Do we actually assert that common conf scripts are idempotent?

No, we don't, but I figured that making them more idempotent was not a bad thing. Thanks for asking the question though, and getting me to look at it again with a fresh set of eyes. Looking at it again, I have some thoughts...

It occurs to me that the current behavior is quite handy. When things fail loudly and fatally if they are not in an "expected" state it's easy to see issues. That's particularly likely to occur during a transition. I recall that earlier on in the v19.x dev cycle we actually caught a few issues because of the current behavior. Having to rebuild the whole root.patched can suck, but arguably it's better than having a "successful" build that includes bugs that don't get discovered until later - perhaps even by a user!

Even though these updates do display warnings, I could imagine that it would be easy to miss them. I often put a build on and then do other things and it's usually done when I come back.

If we did implement this (or similar), it probably should be configurable - defaulting to failing when things are in unexpected state. But that sounds like a fair bit of work...

A "build report" could potentially mitigate the risk of bugs slipping through, but we don't currently have one, so that means more work... Obviously it would be a good thing regardless, but it becomes more important if changes make breakage both more likely and easier to miss.

Regardless, IIRC there isn't an easy way to rerun common conf scripts on an existing root.patched. Obviously you can use fab to explicitly (re)run specific scripts but I'm not sure how often that would be happening... So TBH I'm not even 100% sure of the problem this is solving?

@marcos-mendez I assume that there is some sort of friction you hit while developing that led you to make these changes? Can you please share some more context on your use case/workflow that makes things better for you with these changes applied?

@marcos-mendez
Member

@JedMeister @OnGle — fair questions. Here's the concrete context.

The friction

While building the first v19 appliances (ejabberd, Moodle, Mastodon), I was iterating heavily on common — fixing plans, overlays, and conf scripts as Trixie-related issues surfaced one by one. The cycle looked like this:

  1. make → build fails at some conf script (e.g. fail2ban-fixes patch doesn't apply on Trixie's new sshd.conf)
  2. Fix the script in common
  3. make again → the deck-based root.patched already has state from the previous partial run
  4. A different script fails on something trivial: mkdir on an existing directory, ln -s on an existing symlink, patch already applied

Each of these required a make clean + full rebuild from scratch just to get past a mkdir that should have been mkdir -p. On our hardware that's ~15-20 minutes per cycle. Over a few days of active development, it adds up significantly.

What the changes actually are

They're minimal — the kind of thing you'd see in any shell script best practice guide:

  • mkdir → mkdir -p
  • ln -s → ln -sf
  • patch with a guard to check if already applied
  • conf scripts that check if a package is installed before trying to configure it

These aren't masking real errors. A missing directory being created is correct state; mkdir -p just doesn't fail if it's already there. A symlink pointing to the right target is correct state; ln -sf just replaces it atomically.
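As a rough illustration of the patterns listed above (not the actual diff in this PR), the following sketch runs each step twice in a throwaway temp directory. Every path, file name, helper name, and the patch content is hypothetical. The patch guard uses the common "would reversing it succeed?" dry-run check, which is an assumption about how such a guard might be written.

```shell
#!/bin/sh
# Sketch only: all paths, names and the patch below are made up for illustration.
set -eu

root=$(mktemp -d)   # stands in for the build root

# mkdir -p: succeeds whether or not the directory already exists.
ensure_dir() { mkdir -p "$1"; }

# ln -sfn: replace an existing link. -n stops ln from following an old link
# that resolves to a directory and creating the new link inside it.
ensure_link() { ln -sfn "$1" "$2"; }

# Apply a patch only if it is not applied yet: if a reverse dry run succeeds,
# the patch is already in place and is skipped with a warning.
apply_patch_once() {
    dir=$1
    patchfile=$2
    if patch -d "$dir" -p1 -R -s -f --dry-run < "$patchfile" >/dev/null 2>&1; then
        echo "WARNING: $patchfile already applied, skipping" >&2
    else
        patch -d "$dir" -p1 -s < "$patchfile"
    fi
}

# Hypothetical config file and a one-line patch against it.
mkdir -p "$root/etc"
printf 'port = ssh\n' > "$root/etc/sshd.conf"
cat > "$root/fix.patch" <<'EOF'
--- a/etc/sshd.conf
+++ b/etc/sshd.conf
@@ -1 +1 @@
-port = ssh
+port = 22
EOF

# Run every step twice; the second pass must not fail or change anything.
for pass in 1 2; do
    ensure_dir "$root/etc/example.d"
    ensure_link example.d "$root/etc/example"
    if command -v patch >/dev/null 2>&1; then
        apply_patch_once "$root" "$root/fix.patch"
    fi
done
```

One detail worth noting in the sketch is the `-n` flag: when the existing link resolves to a directory, plain `ln -sf` creates the new link inside that directory instead of replacing the old link.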

On Jed's concern about silent failures

I agree that builds should fail loudly on unexpected state. But these aren't unexpected — they're the expected result of a previous successful run of the same script. The distinction is:

  • Unexpected state (should fail): a config file has wrong content, a service can't start, a dependency is missing → these still fail loudly with these changes
  • Already-done state (should be idempotent): directory exists, symlink exists, patch already applied → these are safe to skip

The warnings are still printed so they show up in build logs — nothing is silently swallowed.
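The distinction between the two kinds of state can be made explicit in a guard. The sketch below is one possible shape for it, not code from this PR: `ensure_link` and all paths are hypothetical. A link that already points at the right target is skipped with a warning, while a link with a different target, or a non-link in the way, is still a fatal error.

```shell
#!/bin/sh
# Sketch only: ensure_link and every path here are hypothetical.
set -eu

ensure_link() {
    target=$1
    link=$2
    if [ -L "$link" ]; then
        current=$(readlink "$link")
        if [ "$current" = "$target" ]; then
            # Already-done state: same link, same target.
            echo "WARNING: $link already points at $target, skipping" >&2
            return 0
        fi
        # Unexpected state: a link exists but points somewhere else.
        echo "FATAL: $link points at $current, expected $target" >&2
        return 1
    elif [ -e "$link" ]; then
        echo "FATAL: $link exists and is not a symlink" >&2
        return 1
    fi
    ln -s "$target" "$link"
}

work=$(mktemp -d)
ensure_link /usr/bin/python3 "$work/python"               # creates the link
ensure_link /usr/bin/python3 "$work/python" 2>/dev/null   # already done: skipped

# A link with the wrong target is unexpected state and still fails.
if (ensure_link /usr/bin/perl "$work/python") 2>/dev/null; then
    wrong_target=accepted
else
    wrong_target=rejected
fi
echo "wrong_target=$wrong_target"
```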

On @OnGle's question

No, common doesn't formally assert idempotency. But I'd argue it's still good practice — it costs nothing (the changes are trivially small), it doesn't hide real errors, and it makes the development workflow significantly smoother for anyone actively iterating on appliance builds.

That said, if you'd prefer a different approach (like a FAB_IDEMPOTENT=1 env var or a --force flag), I'm open to that too. The important thing to me is not having to make clean because of a mkdir without -p.
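A minimal sketch of what an opt-in switch could look like, assuming the `FAB_IDEMPOTENT` name floated above. It is not an existing fab option, and `make_dir` is a hypothetical helper. The default stays strict, so a re-run fails loudly unless the variable is set.

```shell
#!/bin/sh
# Sketch only: FAB_IDEMPOTENT is a proposed name, not an existing fab setting,
# and make_dir is a hypothetical helper.
set -eu

make_dir() {
    if [ -d "$1" ]; then
        if [ "${FAB_IDEMPOTENT:-0}" = "1" ]; then
            echo "WARNING: $1 already exists, continuing" >&2
            return 0
        fi
        echo "FATAL: $1 already exists" >&2
        return 1
    fi
    mkdir "$1"
}

work=$(mktemp -d)
make_dir "$work/conf.d"   # first run: creates the directory

# Default (strict): a second run is an error.
if (make_dir "$work/conf.d") 2>/dev/null; then
    strict=passed
else
    strict=failed
fi

# Opt-in: a second run warns and carries on.
if (FAB_IDEMPOTENT=1; make_dir "$work/conf.d") 2>/dev/null; then
    relaxed=passed
else
    relaxed=failed
fi

echo "strict=$strict relaxed=$relaxed"
```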

@OnGle
Member

OnGle commented Apr 14, 2026

> Each of these required a make clean + full rebuild from scratch just to get past a mkdir that should have been mkdir -p. On our hardware that's ~15-20 minutes per cycle. Over a few days of active development, it adds up significantly.

The fab/build system only caches the last finished target. If something fails, you can just re-run make; you don't need to make clean. The exception is when the script you're working on finishes successfully. In that case, just add an exit 1 so the build never gets past what you want to check.


> I agree that builds should fail loudly on unexpected state. But these aren't unexpected — they're the expected result of a previous successful run of the same script.
> ...
> No, common doesn't formally assert idempotency. But I'd argue it's still good practice — it costs nothing (the changes are trivially small), it doesn't hide real errors, and it makes the development workflow significantly smoother for anyone actively iterating on appliance builds.

I see the confusion here. As far as I can tell, this is a workflow issue. Instead of doing full rebuilds, do this:

  1. add exit 1 just before whatever it is you want to test (if it's already erroring where you want to test, don't bother)
  2. look at the build/ directory in your appliance and see which one is the latest
  3. deck it to a new deck, for example: deck build/root.patched build/root.testing
  4. chroot into the deck with fab-chroot build/root.testing, then manually apply the changes you're testing or run the script
  5. if you encounter issues and want to reset, exit the deck, run deck -D build/root.testing, and repeat from step 3 until the script works as you want it to
  6. remount the deck that the build originally worked on with deck -m build/root.patched, remove your exit 1, then continue your build with make or rebuild from clean
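Steps 3 to 6 above can be sketched as a command sequence. The deck and fab-chroot invocations are the ones quoted in the steps. The `run` wrapper and the `DRY_RUN` variable are hypothetical additions so the sequence can be read, and printed, on a machine without fab installed.

```shell
#!/bin/sh
# Sketch of steps 3-6. With DRY_RUN=1 (the default here) commands are only
# printed; run and DRY_RUN are hypothetical helpers, not part of fab.
set -eu

DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

src=build/root.patched          # latest deck left behind by the stopped build
test_deck=build/root.testing    # scratch deck; the name is arbitrary

run deck "$src" "$test_deck"    # step 3: new deck on top of the build's deck
run fab-chroot "$test_deck"     # step 4: interactive chroot; test the script here
run deck -D "$test_deck"        # step 5: discard the scratch deck, then repeat from step 3
run deck -m "$src"              # step 6: remount the deck the build was using
```

Setting `DRY_RUN=0` would execute the commands for real, which only makes sense from inside an appliance directory on a TurnKey build host.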

> I agree that builds should fail loudly on unexpected state. But these aren't unexpected — they're the expected result of a previous successful run of the same script.

A previous successful run of the same script is an unexpected state. The build system does not expect scripts to ever be re-run.

@marcos-mendez
Member

Ok thanks for explaining

@JedMeister
Member Author

Thanks @OnGle

I don't want to muddy the waters but at step 4, I recommend using fab-investigate instead of fab-chroot. The differences:

  • fab-chroot is a fairly simple wrapper around chroot with a few extra "batteries".
  • fab-investigate is a wrapper around fab-chroot which sets up a more TurnKey "dev friendly" chroot, including:
    • adjusts the prompt so (when possible) it is clear at a glance that you are in a specific appliance chroot.
    • disables confconsole auto start (which will otherwise trigger when you manually fab-chroot in).
    • includes our fake systemctl & service scripts
    • winds all that back when you exit

Also have a look at fab-rewind.

I've also opened a somewhat related issue: turnkeylinux/tracker#2106
