Marcos idempotency by JedMeister · Pull Request #343 · turnkeylinux/common

JedMeister · 2026-04-12T23:57:08Z

Idempotency updates provided by @marcos-mendez - pulled from #339 (rebased on current HEAD of 19.x-dev branch to fix merge conflicts).

- plans/turnkey/base: add libsocket6-perl + libio-socket-ssl-perl (IPv6 Webmin)
- plans/turnkey/base: uncomment tklbam (migrated to Python 3.13)
- conf/turnkey.d/webmin-conf: enable ipv6=1 by default
- overlays/turnkey.d/networking/etc/gai.conf: prefer IPv4 for external connections

Tested: Built turnkey-core v19 ISO (406MB), LXC container running with Webmin on IPv4+IPv6, SSH, systemd, Python 3.13, kernel 6.12.

OnGle · 2026-04-13T06:19:03Z

@JedMeister Do we actually assert that common conf scripts are idempotent?

JedMeister · 2026-04-13T06:58:40Z

> @JedMeister Do we actually assert that common conf scripts are idempotent?

No, we don't, but I figured that making them more idempotent was not a bad thing. Thanks for asking the question though, and for getting me to look at it again with a fresh set of eyes. Looking at it again, I have some thoughts...

It occurs to me that the current behavior is quite handy. When things fail loudly and fatally if they are not in an "expected" state it's easy to see issues. That's particularly likely to occur during a transition. I recall that earlier on in the v19.x dev cycle we actually caught a few issues because of the current behavior. Having to rebuild the whole root.patched can suck, but arguably it's better than having a "successful" build that includes bugs that don't get discovered until later - perhaps even by a user!

Even though these updates do display warnings, I could imagine that it would be easy to miss them. I often put a build on and then do other things and it's usually done when I come back.

If we did implement this (or similar), it probably should be configurable, defaulting to failing when things are in an unexpected state. But that sounds like a fair bit of work...

A "build report" could potentially mitigate the risk of bugs slipping through, but that's something we don't currently have, so it means more work... Obviously that would be a good thing regardless, but it becomes more important if changes raise the possibility of things breaking in ways that are easy to miss.
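As a rough illustration of what a minimal "build report" might look like, the sketch below summarises warning lines from a log at the end of a build. The log contents and the "WARNING:" prefix are assumptions made for the example; nothing like this exists in fab or common today.

```shell
#!/bin/sh
# Sketch only: the log contents and the "WARNING:" prefix are assumptions.
set -eu

log=$(mktemp)
# Stand-in for a real build log.
cat > "$log" <<'EOF'
Running conf script: example-a
WARNING: directory already exists, skipping
Running conf script: example-b
WARNING: patch already applied, skipping
Build finished
EOF

# grep -c prints 0 and exits non-zero when nothing matches, so "|| true"
# keeps set -e from aborting on a clean log.
warnings=$(grep -c '^WARNING:' "$log" || true)

echo "=== build report ==="
echo "warnings: $warnings"
# List each warning with its line number in the log.
grep -n '^WARNING:' "$log" || true
```

A report like this would only help if it is printed last, where someone returning to a finished build sees it first.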

Regardless, IIRC there isn't an easy way to rerun common conf scripts on an existing root.patched. Obviously you can use fab to explicitly (re)run specific scripts but I'm not sure how often that would be happening... So TBH I'm not even 100% sure of the problem this is solving?

@marcos-mendez I assume that there is some sort of friction you hit while developing that led you to make these changes? Can you please share some more context on your use case/workflow that makes things better for you with these changes applied?

marcos-mendez · 2026-04-13T14:18:41Z

@JedMeister @OnGle — fair questions. Here's the concrete context.

The friction

While building the first v19 appliances (ejabberd, Moodle, Mastodon), I was iterating heavily on common — fixing plans, overlays, and conf scripts as Trixie-related issues surfaced one by one. The cycle looked like this:

1. make → build fails at some conf script (e.g. fail2ban-fixes patch doesn't apply on Trixie's new sshd.conf)
2. Fix the script in common
3. make again → the deck-based root.patched already has state from the previous partial run
4. A different script fails on something trivial: mkdir on an existing directory, ln -s on an existing symlink, patch already applied

Each of these required a make clean + full rebuild from scratch just to get past a mkdir that should have been mkdir -p. On our hardware that's ~15-20 minutes per cycle. Over a few days of active development, it adds up significantly.

What the changes actually are

They're minimal — the kind of thing you'd see in any shell script best practice guide:

- mkdir → mkdir -p
- ln -s → ln -sf
- patch with a guard to check if already applied
- conf scripts that check if a package is installed before trying to configure it
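The four kinds of change listed above could look roughly like the sketch below. It is not the code from the PR: every path, the patch content, and the helper names apply_patch_once and pkg_installed are made up for illustration, and the reverse dry-run check is one common way to detect an already-applied patch with GNU patch.

```shell
#!/bin/sh
# Sketch only: every path, the patch and the helper names are illustrative.
set -eu

root=$(mktemp -d)                 # stand-in for the build root

# mkdir -p succeeds whether or not the directory already exists.
mkdir -p "$root/etc/example.d"
mkdir -p "$root/etc/example.d"

# ln -sf replaces an existing link instead of failing on it.
ln -sf ../example.conf "$root/etc/example.d/active.conf"
ln -sf ../example.conf "$root/etc/example.d/active.conf"

# Apply a patch only if it is not already applied: if the reverse patch
# applies cleanly in a dry run, the forward patch must already be in place.
apply_patch_once() {
    dir=$1 patchfile=$2
    if patch -d "$dir" -p1 -R -s -f --dry-run < "$patchfile" >/dev/null 2>&1; then
        echo "WARNING: $patchfile already applied, skipping" >&2
    else
        patch -d "$dir" -p1 -s < "$patchfile"
    fi
}

printf '[sshd]\nenabled = false\n' > "$root/etc/example.conf"
cat > "$root/fix.patch" <<'EOF'
--- a/etc/example.conf
+++ b/etc/example.conf
@@ -1,2 +1,2 @@
 [sshd]
-enabled = false
+enabled = true
EOF

# Only exercise the guard where the patch tool is available.
if command -v patch >/dev/null 2>&1; then
    apply_patch_once "$root" "$root/fix.patch"   # applies the patch
    apply_patch_once "$root" "$root/fix.patch"   # warns and skips
fi

# Configure a package only when dpkg reports it as installed.
pkg_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}
if pkg_installed fail2ban; then
    echo "fail2ban is installed: configure it here"
else
    echo "WARNING: fail2ban not installed, skipping" >&2
fi
```

Every step can be run twice in a row without failing, which is the property under discussion in this thread.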

These aren't masking real errors. A missing directory being created is correct state; mkdir -p just doesn't fail if it's already there. A symlink pointing to the right target is correct state; ln -sf just replaces it atomically.

On Jed's concern about silent failures

I agree that builds should fail loudly on unexpected state. But these aren't unexpected — they're the expected result of a previous successful run of the same script. The distinction is:

- Unexpected state (should fail): a config file has wrong content, a service can't start, a dependency is missing → these still fail loudly with these changes
- Already-done state (should be idempotent): directory exists, symlink exists, patch already applied → these are safe to skip
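To make the distinction concrete, here is a sketch of a helper that skips an already-done symlink but still fails on an unexpected one. The name ensure_symlink and the paths are hypothetical and not part of the PR. Note that a bare ln -sf would overwrite a link with the wrong target without complaint, which is why the check is spelled out here.

```shell
#!/bin/sh
# Sketch only: ensure_symlink is a hypothetical helper, paths are examples.
set -eu

# ensure_symlink TARGET LINK
#   link missing           -> create it
#   link points at TARGET  -> already done, warn and skip
#   anything else          -> unexpected state, fail loudly
ensure_symlink() {
    target=$1 link=$2
    if [ -L "$link" ]; then
        current=$(readlink "$link")
        if [ "$current" = "$target" ]; then
            echo "WARNING: $link already points at $target, skipping" >&2
            return 0
        fi
        echo "FATAL: $link points at $current, expected $target" >&2
        return 1
    elif [ -e "$link" ]; then
        echo "FATAL: $link exists and is not a symlink" >&2
        return 1
    fi
    ln -s "$target" "$link"
}

tmp=$(mktemp -d)
ensure_symlink /etc/example.conf "$tmp/link"   # creates the link
ensure_symlink /etc/example.conf "$tmp/link"   # warns and skips
if ensure_symlink /etc/other.conf "$tmp/link"; then
    status=unexpected-success
else
    status=failed-loudly
fi
echo "third call: $status"
```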

The warnings are still printed so they show up in build logs — nothing is silently swallowed.

On @OnGle's question

No, common doesn't formally assert idempotency. But I'd argue it's still good practice — it costs nothing (the changes are trivially small), it doesn't hide real errors, and it makes the development workflow significantly smoother for anyone actively iterating on appliance builds.

That said, if you'd prefer a different approach (like a FAB_IDEMPOTENT=1 env var or a --force flag), I'm open to that too. The important thing to me is not having to make clean because of a mkdir without -p.
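A sketch of how the opt-in variant might work: strict by default, tolerant only when asked. FAB_IDEMPOTENT is the variable name floated above; it is not an existing fab setting, and ensure_dir is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: FAB_IDEMPOTENT is a proposed name, not an existing fab setting.
set -eu

# ensure_dir DIR (hypothetical helper)
# Strict by default: an existing directory is treated as unexpected state.
# With FAB_IDEMPOTENT=1 it is tolerated with a warning.
ensure_dir() {
    if [ -d "$1" ]; then
        if [ "${FAB_IDEMPOTENT:-0}" = 1 ]; then
            echo "WARNING: $1 already exists, continuing" >&2
            return 0
        fi
        echo "FATAL: $1 already exists (set FAB_IDEMPOTENT=1 to allow)" >&2
        return 1
    fi
    mkdir "$1"
}

tmp=$(mktemp -d)
ensure_dir "$tmp/conf.d"          # first run creates the directory

FAB_IDEMPOTENT=0                  # default behaviour: a rerun fails
if ensure_dir "$tmp/conf.d"; then strict=ok; else strict=failed; fi

FAB_IDEMPOTENT=1                  # opt in: a rerun warns and continues
if ensure_dir "$tmp/conf.d"; then relaxed=ok; else relaxed=failed; fi

echo "strict rerun: $strict, relaxed rerun: $relaxed"
```

Keeping the strict path as the default preserves the current fail-loudly behaviour for normal builds.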

OnGle · 2026-04-14T05:44:31Z

> Each of these required a make clean + full rebuild from scratch just to get past a mkdir that should have been mkdir -p. On our hardware that's ~15-20 minutes per cycle. Over a few days of active development, it adds up significantly.

The fab/build system only caches the last finished target. If something fails, you can just re-run make; you don't need to make clean unless the script you're working on finishes successfully. If that's the case, just add an exit 1 so the build never gets past what you want to check.

> I agree that builds should fail loudly on unexpected state. But these aren't unexpected — they're the expected result of a previous successful run of the same script.
> ...
> No, common doesn't formally assert idempotency. But I'd argue it's still good practice — it costs nothing (the changes are trivially small), it doesn't hide real errors, and it makes the development workflow significantly smoother for anyone actively iterating on appliance builds.

I see the confusion here. As far as I can tell, this is a workflow issue. Instead of doing full rebuilds, do this:

1. Add exit 1 just before whatever it is you want to test (or, if it's already erroring where you want to test, don't bother).
2. Look at the build/ directory in your appliance and see which one is the latest.
3. Deck it to a new deck, for example: deck build/root.patched build/root.testing
4. Chroot into the deck with fab-chroot build/root.testing, then manually apply the changes you're testing or run the script.
5. If you encounter issues and want to reset, exit the deck and run deck -D build/root.testing, then repeat from step 3 until the script works as you want it to.
6. Remount the deck that the build originally worked on with deck -m build/root.patched, remove your exit 1, and continue your build with make or rebuild from clean.
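The deck commands from steps 3 to 6 can be collected into one sketch. The commands and paths are the ones quoted in the steps; the run wrapper and the DRY_RUN switch are additions for illustration so the sequence can be read (and printed) without touching a real build.

```shell
#!/bin/sh
# Sketch only. Prints the commands by default; set DRY_RUN=0 inside an
# appliance directory on a TurnKey build host to actually run them.
set -eu

DRY_RUN=${DRY_RUN:-1}
base=build/root.patched        # deck the build last worked on
test_deck=build/root.testing   # scratch deck stacked on top of it

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run deck "$base" "$test_deck"   # step 3: new deck on top of root.patched
run fab-chroot "$test_deck"     # step 4: interactive shell, test changes here
run deck -D "$test_deck"        # step 5: throw the scratch deck away
run deck -m "$base"             # step 6: remount the original deck
```

In practice steps 3 to 5 are repeated by hand until the script behaves, so this reads better as a checklist than as something to run unattended.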

> I agree that builds should fail loudly on unexpected state. But these aren't unexpected — they're the expected result of a previous successful run of the same script.

A previous successful run of the same script is an unexpected state. The build system does not expect scripts to ever be re-run.

marcos-mendez · 2026-04-14T06:34:48Z

Ok thanks for explaining

JedMeister · 2026-04-14T22:05:19Z

Thanks @OnGle

I don't want to muddy the waters but at step 4, I recommend using fab-investigate instead of fab-chroot. The differences:

- fab-chroot is a fairly simple wrapper around chroot with a few extra "batteries".
- fab-investigate is a wrapper around fab-chroot which sets up a more TurnKey "dev friendly" chroot, including:
  - adjusts the prompt so (when possible) it is clear at a glance that you are in a specific appliance chroot.
  - disables confconsole auto start (which will otherwise trigger when you manually fab-chroot in).
  - includes our fake systemctl & service scripts
  - winds all that back when you exit

Also have a look at fab-rewind.

I've also opened a somewhat related issue: turnkeylinux/tracker#2106

navigator and others added 5 commits April 10, 2026 10:57

WIP: idempotency fixes and postgresql plan for 19.x-dev

f8f8192

fix: guard fail2ban-fixes script when fail2ban not installed

c6bf2b7

fix: add locales to base plan (no longer pulled as dependency in Trixie)

7bee7f0

fix: add mawk to base plan (resolves virtual package awk for Trixie)

1bbe9d6

JedMeister requested a review from OnGle April 12, 2026 23:57

JedMeister mentioned this pull request Apr 13, 2026

TKL v19: dual-stack networking, Redis/Ruby verticals, and misc fixes #339

Closed

JedMeister wants to merge 5 commits into turnkeylinux:19.x-dev from JedMeister:marcos-idempotency
