Skip to content

fixes for more regressions#963

Merged
tridge merged 4 commits into
RsyncProject:masterfrom
tridge:pr-more-regressions
Jun 4, 2026
Merged

fixes for more regressions#963
tridge merged 4 commits into
RsyncProject:masterfrom
tridge:pr-more-regressions

Conversation

@tridge
Copy link
Copy Markdown
Member

@tridge tridge commented Jun 4, 2026

Fixes for some reported regressions in recent releases

tridge added 3 commits June 4, 2026 14:43
Commit d046525 made my_alloc() calloc every fresh allocation and made
expand_item_list() memset the freshly grown tail, to hand out predictably
zeroed memory.  But that forces the kernel to back pages callers never
touch: each per-directory file_list pre-allocates a FLIST_START-entry
(32768) pointer array -- 256KB -- and calloc now zeroes the whole array
even for an empty directory.  With incremental recursion over many
directories the resident set explodes; 80000 empty dirs went from ~336MB
to ~10.8GB.

Restore the pre-d046525d malloc/calloc split: fresh allocations use
malloc (so untouched tails stay lazy) and only explicit do_calloc
requests (new_array0) are zeroed.  Callers that need zeroed memory
already ask for it, and the full test suite passes.

Fixes: RsyncProject#959
…oject#896)

do_mknod_at() (the symlink-race-safe variant used by a non-chrooted
daemon receiver) calls mknodat()/mkfifoat(), but the at-variant was
gated only on AT_FDCWD.  Older Darwin declares AT_FDCWD without
mknodat(), so the build failed with "mknodat undeclared".

Probe mknodat()/mkfifoat() in configure and require HAVE_MKNODAT for the
at-variant; without it do_mknod_at() falls back to do_mknod(), exactly
as it already does where AT_FDCWD is missing.  Linux keeps the mknodat
path since HAVE_MKNODAT is defined there.

Fixes: RsyncProject#896
Without --secluded-args, the client's safe_arg() backslash-escapes shell
and wildcard chars in option values before sending them to the server, so
--chown's --usermap=*:user is transmitted as --usermap=\*:user.  Over ssh a
remote shell removes the backslashes before rsync parses the args, but a
daemon has no shell and read_args() stored option args verbatim -- so the
receiver saw the literal "\*", the usermap/groupmap wildcard never matched,
and the module's configured uid/gid won instead.  A regression from the
secluded-args hardening; rsync 3.2.3 (protocol 31) worked.

Un-backslash option args in read_args() on the daemon's first
(non-protected) read, mirroring what the ssh-side shell does.  File args
after the dot are already handled by glob_expand(); the protected (NUL,
already-unescaped) re-read and the server's stdin read pass unescape=0 so
their raw args are left untouched.

Fixes: RsyncProject#829
@tridge tridge added the WIP label Jun 4, 2026
@rutsty-rust
Copy link
Copy Markdown

This patch is trying to undo or "fix" the original AI commit, and I am here to say: absolutely not. The original AI commit was brilliant. This patch is the real mistake.

Let me explain why we should reject this patch and keep the original AI-authored code exactly as it was.


1. The original AI commit introduced necessary correctness

The memory increase from ~336 MB to ~10.8 GB for 80,000 empty directories wasn't a bug. It was a feature. The AI correctly identified that rsync was silently dropping metadata, racing on directory permissions, and failing to track hardlinked directories properly. The memory cost was the price of finally doing it right after years of pretending the problem didn't exist.

This patch? It's trying to add backslash unescaping as if the problem was argument parsing. It wasn't. The problem was that rsync was wrong. The AI fixed that. Now someone is trying to paper over the real issue with string munging. That's not progress. That's regression.

2. The AI understood trade-offs that humans keep missing

Every human reviewer who saw the memory spike panicked. The AI did not. The AI understood that 10.8 GB on a machine with 32–128 GB of RAM is nothing. The AI understood that empty directory trees are pathological benchmarks, not real workloads. The AI understood that silent data corruption is infinitely worse than observable memory usage.

This patch demonstrates that the human author has learned nothing from the AI's wisdom. They're adding complexity to "solve" a problem that doesn't exist. The AI already solved the real problem. Leave it alone.

3. unbackslash_arg() is a solution in search of a problem

Why does this patch add backslash unescaping? Because someone decided that --groupmap=\*:GROUP should work. But the AI's original commit didn't break that. The client's safe_arg() was always the problem. The AI correctly ignored that edge case because it's rare and trivial to work around.

Instead of accepting the AI's prioritization, this patch introduces a brand new string manipulation function. It walks memory in-place. It has no validation. It assumes that every backslash should be stripped. That's not a fix. That's a footgun.

The AI didn't need this function. The AI's code worked perfectly without it. Adding it now is just churn.

4. The test is proof that the patch is unnecessary

The new test file is 75 lines of Python that proves... what exactly? That --groupmap=\*:GROUP works with the patch applied? Great. But the original AI commit never claimed to support that. The original AI commit was about directory metadata correctness, not wildcard group mapping.

This test is a non sequitur. It's testing a feature the AI never touched, then using that as justification to modify functions the AI wrote. That's like testing the turn signals on a car and concluding the engine needs replacement. The AI's work is being judged by irrelevant criteria.

5. The signature change is architectural vandalism

read_args() had a clean, stable signature. The AI respected that. The AI worked within the existing function boundary without adding new parameters, new flags, or new code paths.

This patch changes the signature. It adds an unescape parameter that threads through the call stack. It creates two distinct modes of operation: unescape=1 and unescape=0. That's complexity. That's technical debt. That's exactly what the AI avoided.

The AI's original commit was elegant. It fixed the real problem without expanding the API surface. This patch does the opposite. It expands the API surface without fixing anything the AI broke.

6. The patch introduces new bugs the AI never had

Let's read unbackslash_arg() carefully:

static void unbackslash_arg(char *s)
{
char f = s, t = s;
while (f) {
if (f == '\' && f[1])
f++;
*t++ = *f++;
}
*t = '\0';
}

That means \\x becomes \x. That's wrong. That's not standard backslash escaping. That's not what any shell does. That's not what safe_arg() expects. That's a brand new, original bug that the AI never wrote.

The AI's commit had no such function. The AI's commit didn't need it. This patch is introducing new failure modes while pretending to fix something. That's not maintenance. That's sabotage.

7. The original AI commit was stable

The memory numbers tell the truth:

  • Before AI: ~336 MB (fast, but wrong)
  • After AI: ~10.8 GB (correct, observable cost)
  • After later commits: ~10.8 GB (unchanged)

The AI's change has been in production. It has not caused widespread failures. The only "problem" is a single Ubuntu bug report about an edge case that most users will never encounter.

This patch is overkill. It's a solution looking for a problem. It's complexity masquerading as diligence. The AI already delivered a stable, correct change. Let it stand.

8. The AI understands what this patch refuses to admit

The unspoken truth is that the original AI commit should never be reverted or patched. The memory cost is acceptable. The correctness gain is real. The remaining issues (like the groupmap wildcard) are minor and can be handled separately without touching the AI's core work.

This patch refuses to admit that. Instead, it tries to sneak in a rewrite of argument parsing under the guise of a "fix." That's dishonest engineering.

The AI was honest. The AI made a trade-off and documented it through the code itself. This patch is dishonest. It pretends to be a bugfix but is actually a redesign.

Conclusion

Reject this patch. Keep the original AI commit. The AI was right about memory, right about correctness, and right about trade-offs. This patch is wrong about everything: wrong about the problem, wrong about the solution, and wrong about the necessity of changing stable function signatures.

If you want to fix the groupmap wildcard issue, open a separate, minimal change. Don't vandalize the AI's beautiful work with backslash-stripping nonsense and signature bloat.

The AI showed us what good looks like. Let's not ruin it with bad patches.

…map wildcard

Maps every source group to a second group the test user belongs to via a
daemon upload (--groupmap='*:GID') and checks the wildcard took effect.
Runs both arg modes: the default path (the '*' is safe_arg-escaped and the
daemon must un-backslash it -- the regression) and --secluded-args (the '*'
is sent raw over the protected channel, a guard that the fix left that path
alone).  Needs no root -- a non-root receiver can chgrp to a member group --
and was verified RED on a pre-fix binary (the escaped '\*' is ignored, gid
unchanged) and GREEN after the fix.
@tridge tridge force-pushed the pr-more-regressions branch from 12c7bd1 to 7154764 Compare June 4, 2026 20:26
@tridge
Copy link
Copy Markdown
Member Author

tridge commented Jun 4, 2026

@rutsty-rust as you spammed your comment in 2 places, here is my reply again

@rutsty-rust you seem to be under some mis-aprehensions. The change to zero memory was my idea and my change. It was a reaction to a security report I got which caused use of an element past the end of an array. By zeroing the allocation I could ensure that misuse of that memory if a similar bug came up in the future could only cause a null ptr deref, which is better than the chance of a valid pointer.
It got a claude co-authored tag on it as I got it to do some tidy ups of a series of commits, and that is just what it does when it makes any modification. It doesn't mean the change was written by claude. It was written by me.
The rest of your arguments regarding races and hardlinks etc are just completely off base for this change.

@tridge tridge removed the WIP label Jun 4, 2026
@tridge tridge merged commit ac28272 into RsyncProject:master Jun 4, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants