From 35a2164e3772b12847ffdb5df12c9f677f80d522 Mon Sep 17 00:00:00 2001 From: pterror Date: Fri, 5 Jun 2026 17:24:05 +1000 Subject: [PATCH 1/3] receiver: fix NULL deref on the delta discard path receive_data() crashed a receiver that was merely DISCARDING a file's delta stream. discard_receive_data() calls receive_data() with fname == NULL and fd == -1, so size_r == 0 and mapbuf == NULL. A normal block-MATCH token (against a block the basis and source share) then reaches the !mapbuf branch added in 31fbb17d ("receiver: fix absolute --partial-dir delta resume"), which calls full_fname(fname). full_fname() dereferences its argument unconditionally (util1.c: `if (*fn == '/')`), so fname == NULL faults there -> receiver SIGSEGV. This is a normal-operation crash with a stock cooperating sender, not an adversarial one. The generator hands the sender real block sums whenever the basis is readable and we're in delta mode; the receiver only decides to discard afterwards, when its output cannot be produced -- e.g. the destination directory is not writable (mkstemp fails), the basis turns out to be a directory, or a --partial-dir resume is skipped. A MATCH token arriving during that discard hit the NULL deref. The 31fbb17d branch is correct only for a REAL output transfer (fd != -1, fname valid): there, a block match with no mapped basis is a genuine protocol inconsistency (the generator promised a basis the receiver could not open), and honoring it would silently omit those bytes from the verification checksum or leave a hole, so hard-erroring -- and full_fname(fname) -- is right. It conflated that with the discard path. The discriminator is fd, not mapbuf: on the discard path fd == -1 always; on the real-output inconsistency fd != -1. Scope the "no basis file" protocol error to fd != -1 (where fname is non-NULL and full_fname is safe) and, on the discard path (fd == -1), absorb the matched bytes benignly (offset += len; continue) -- symmetric with the literal-token handling just above, and restoring the pre-31fbb17d behavior. The real-transfer inconsistency check is preserved unchanged. Co-Authored-By: Claude Opus 4.8 --- receiver.c | 34 +++++++++++++++++++++++++--------- 1 file changed, 25 insertions(+), 9 deletions(-) diff --git a/receiver.c b/receiver.c index 84bb151aa..2b263cea0 100644 --- a/receiver.c +++ b/receiver.c @@ -423,16 +423,32 @@ static int receive_data(int f_in, char *fname_r, int fd_r, OFF_T size_r, stats.matched_data += len; - /* A block match can only be honored if we actually mapped the - * basis. If we didn't (basis open failed), the sender should - * never have been told a basis existed -- treat it as a protocol - * inconsistency rather than silently omitting these bytes from - * the verification checksum (which yields a spurious failure) or - * leaving a hole in the output. */ + /* A block match with no mapped basis is a protocol inconsistency + * ONLY when we are actually producing output (fd != -1): the + * generator told the sender a basis existed but the receiver could + * not open it, so honoring the match would silently omit these + * bytes from the verification checksum (a spurious failure) or + * leave a hole in the output. Fail cleanly in that case. + * + * On the DISCARD path (fd == -1, fname == NULL) there is no output + * and no verification: discard_receive_data() deliberately drains a + * delta the receiver never intends to write (basis fstat failed, + * basis is a directory, output open failed, batch skip, ...). The + * sender does not know the data is being discarded and streams an + * ordinary delta, so a match token here is NORMAL protocol, not + * malformed. Absorb it benignly (advance the offset and continue), + * as the pre-existing "if (mapbuf)" guards did before this check was + * added in 31fbb17d -- erroring would wrongly break legitimate + * transfers, and full_fname(fname) with fname==NULL would + * dereference NULL (a receiver crash on a normal transfer). */ if (!mapbuf) { - rprintf(FERROR, "got a block match with no basis file for %s [%s]\n", - full_fname(fname), who_am_i()); - exit_cleanup(RERR_PROTOCOL); + if (fd != -1) { + rprintf(FERROR, "got a block match with no basis file for %s [%s]\n", + full_fname(fname), who_am_i()); + exit_cleanup(RERR_PROTOCOL); + } + offset += len; + continue; } if (DEBUG_GTE(DELTASUM, 3)) { From 976b0f0cbff98dac7fe79886d48bf18b936c3f0a Mon Sep 17 00:00:00 2001 From: pterror Date: Fri, 5 Jun 2026 17:24:13 +1000 Subject: [PATCH 2/3] testsuite: regression for the receiver discard-path NULL deref Drives a real sender<->receiver pair (client sender -> daemon receiver, both the binary under test in the default pipe transport) so the receiver actually takes the recv_files discard path -- a local `rsync a b` does not. The basis and source share a leading block so the generator emits real sums and the receiver gets a block MATCH; the destination directory is made unwritable so the receiver's output mkstemp() fails and it discards the delta. Pre-fix the receiver SIGSEGVs in full_fname(NULL), which the client sees as a protocol-data-stream error (code 12); post-fix it drains the delta and reports a benign code 23 (or 0). Skips (exit 77) when run as root, since root bypasses DAC and the unwritable destination would not make mkstemp() fail -- so the discard path, and the bug, would never be reached. Verified red-on-buggy / green-on-fixed against the 0d0399bb receiver. Co-Authored-By: Claude Opus 4.8 --- testsuite/recv-discard-nullderef_test.py | 123 +++++++++++++++++++++++ 1 file changed, 123 insertions(+) create mode 100755 testsuite/recv-discard-nullderef_test.py diff --git a/testsuite/recv-discard-nullderef_test.py b/testsuite/recv-discard-nullderef_test.py new file mode 100755 index 000000000..a09195783 --- /dev/null +++ b/testsuite/recv-discard-nullderef_test.py @@ -0,0 +1,123 @@ +#!/usr/bin/env python3 +# Regression test for a receiver NULL-deref on the delta DISCARD path. +# +# In receiver.c receive_data(), a block-MATCH token that arrives while the +# receiver is DISCARDING a file (discard_receive_data() -> receive_data() with +# fname==NULL, fd==-1, hence mapbuf==NULL) reached +# rprintf(FERROR, "...%s...", full_fname(fname), ...) +# with fname==NULL. full_fname() dereferences its argument unconditionally +# (util1.c: `if (*fn == '/')`), so the receiver SIGSEGVs. The faulty error +# branch was added in 31fbb17d ("receiver: fix absolute --partial-dir delta +# resume"); the fix discriminates on fd (not mapbuf) and, on the discard path +# (fd==-1), absorbs the matched bytes benignly instead of erroring. +# +# This is a NORMAL-operation crash, not adversarial: a stock cooperating sender +# triggers it. The generator sends real block sums (basis readable, delta mode); +# the receiver then has to discard because its output mkstemp() fails -- here +# because the destination directory is not writable. A block MATCH against the +# shared leading block reaches the discard path and crashes the pre-fix binary. +# +# We drive a real sender<->receiver pair (client sender -> daemon receiver) so +# the receiver actually takes the recv_files discard path; a local `rsync a b` +# does not. In the default (pipe) daemon transport both ends are the binary +# under test. +# +# Skipped (exit 77) when running as root (root bypasses DAC), or when the +# directory mode is not enforced (e.g. a non-root process holding +# CAP_DAC_OVERRIDE in an unprivileged container): in both cases the receiver's +# mkstemp() would succeed despite chmod 0555, the discard path would not be +# taken, and the test would silently pass against a buggy binary. The +# post-chmod writability probe converts that silent false-pass into an honest +# skip and subsumes the root check. + +import os +import shlex +import subprocess +import tempfile + +from rsyncfns import ( + SCRATCHDIR, RSYNC, TMPDIR, + get_testuid, get_rootuid, makepath, start_test_daemon, write_daemon_conf, + test_fail, test_skipped, +) + +DAEMON_PORT = 12895 + +if get_testuid() == get_rootuid(): + test_skipped("root bypasses DAC: the unwritable dest dir wouldn't make " + "the receiver's mkstemp fail, so the discard path (and the " + "bug) is never reached") + +os.chdir(TMPDIR) + +MODDIR = SCRATCHDIR / 'recvdiscard-mod' # daemon module root (writable) +BASISDIR = MODDIR / 'd' # made read-only -> mkstemp fails +SRCDIR_ = SCRATCHDIR / 'recvdiscard-src' # client source tree +makepath(MODDIR, BASISDIR, SRCDIR_) + +# Basis and source share a leading block (2000 'A's) so the generator emits +# real sums and the receiver gets a block MATCH; the tails differ and the +# source is larger so a delta (not a no-op) is sent. +basis = BASISDIR / 'f' +basis.write_bytes(b'A' * 2000 + b'C' * 1000) +src = SRCDIR_ / 'f' +src.write_bytes(b'A' * 2000 + b'B' * 3000) + +# A read/write daemon module rooted at MODDIR. +conf = write_daemon_conf([('recvdiscard', {'path': str(MODDIR), + 'read only': 'no'})]) +url = start_test_daemon(conf, DAEMON_PORT, rsync_cmd=RSYNC) + +# Make the destination directory unwritable so the receiver's output mkstemp() +# fails and it falls back to discarding the delta stream. Restore in finally so +# the per-test scratch tree can be cleaned up. +os.chmod(BASISDIR, 0o555) + +# Probe that the chmod actually denies writes for *this* process. A non-root +# user holding CAP_DAC_OVERRIDE bypasses the directory write bit, so mkstemp +# would succeed in the daemon receiver too, the discard path would never be +# taken, and the test would silently pass on a buggy binary. Better to skip +# explicitly. (Root takes this path too: its probe succeeds → skip, which +# subsumes the uid==0 check.) +try: + _fd, _probe = tempfile.mkstemp(dir=BASISDIR) + os.close(_fd) + os.unlink(_probe) + os.chmod(BASISDIR, 0o755) + test_skipped("destination dir is writable despite chmod 0555 " + "(CAP_DAC_OVERRIDE?); cannot force the receiver discard path") +except OSError: + pass # EACCES -- good, the precondition is enforced + +try: + argv = shlex.split(RSYNC) + [ + '--no-whole-file', '-a', + str(src), f'{url}recvdiscard/d/f', + ] + print('Running:', ' '.join(argv)) + proc = subprocess.run(argv, stdout=subprocess.PIPE, + stderr=subprocess.STDOUT, text=True) + print(proc.stdout, end='') +finally: + os.chmod(BASISDIR, 0o755) + +rc = proc.returncode + +# A receiver SIGSEGV manifests to the client as a protocol error (the daemon's +# receiver child crashes mid-stream and the connection drops). Pre-fix this is +# code 12 (error in rsync protocol data stream); post-fix the receiver drains +# the delta and reports a benign "could not transfer" (code 23), or succeeds. +# +# rsync's own exit codes are all < 128, so we can't read the receiver's signal +# directly from the client. The discriminator is the PROTOCOL error: only a +# crashed (or otherwise vanished) receiver produces code 12 here. A clean +# discard yields 23 (file not transferred) or 0. +if rc == 12: + test_fail(f"receiver crashed on the discard path (rsync exited {rc}: " + "error in rsync protocol data stream -- the receiver child " + "SIGSEGV'd in full_fname(NULL))") +if rc not in (0, 23): + test_fail(f"unexpected rsync exit {rc} (expected 0 or 23, a benign " + "discard; 12 would be the crash)") + +print(f"OK: receiver discarded the delta without crashing (rsync exit {rc})") From fef4dea71c7e0bdb21541efd7b59861a718167c6 Mon Sep 17 00:00:00 2001 From: Andrew Tridgell Date: Sat, 6 Jun 2026 18:43:06 +1000 Subject: [PATCH 3/3] testsuite,ci: mark recv-discard-nullderef CI skip and tighten its check The regression test honestly skips when it cannot force the receiver's output mkstemp() to fail -- as root (root bypasses DAC) and on Cygwin (chmod 0555 does not deny the owner a write). The ubuntu, ubuntu-22.04, almalinux and macOS jobs run `make check` as root, and Cygwin can't enforce the unwritable directory, so the test skips on all of them. runtests.py fails a run on any skip-set mismatch, so add the test to those jobs' RSYNC_EXPECT_SKIPPED lists; the BSD/Solaris jobs run as root too but enforce no expected-skip set, so they need no change. Also tighten the pass condition. The post-chmod writability probe already guarantees the receiver discards (mkstemp must fail), so an exit 0 would mean the file actually transferred and the discard path was never exercised -- a silent false-pass. Require exactly exit 23 (the forced discard leaves the file untransferred); 12 remains the pre-fix crash. --- .github/workflows/almalinux-8-build.yml | 2 +- .github/workflows/cygwin-build.yml | 2 +- .github/workflows/macos-build.yml | 2 +- .github/workflows/ubuntu-22.04-build.yml | 6 +++--- .github/workflows/ubuntu-build.yml | 6 +++--- testsuite/recv-discard-nullderef_test.py | 23 +++++++++++++---------- 6 files changed, 22 insertions(+), 19 deletions(-) diff --git a/.github/workflows/almalinux-8-build.yml b/.github/workflows/almalinux-8-build.yml index 1a0b0e157..f3ba969f4 100644 --- a/.github/workflows/almalinux-8-build.yml +++ b/.github/workflows/almalinux-8-build.yml @@ -62,7 +62,7 @@ jobs: # crtimes-not-supported skip matches the other Linux jobs; # daemon-chroot-acl and proxy-response-line-too-long skip because # the default (secure) transport opens no listening socket. - run: RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long make check + run: RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long,recv-discard-nullderef make check - name: check (TCP daemon transport) # Second run exercising the real loopback-TCP daemon path. run: ./runtests.py --rsync-bin="$PWD/rsync" --use-tcp -j 8 diff --git a/.github/workflows/cygwin-build.yml b/.github/workflows/cygwin-build.yml index a63a3f26b..a14e9d069 100644 --- a/.github/workflows/cygwin-build.yml +++ b/.github/workflows/cygwin-build.yml @@ -46,7 +46,7 @@ jobs: # RESOLVE_BENEATH symlink-race tests. symlink-dirlink-basis also now # RUNS (the #915 non-daemon basis open uses a plain do_open, restoring # following an in-tree dir-symlink basis without RESOLVE_BENEATH). - run: bash -c 'RSYNC_EXPECT_SKIPPED=acls-default,acls-depth,acls,bare-do-open-symlink-race,chdir-symlink-race,chown,daemon-access-ip,daemon-chroot-acl,devices,dir-sgid,open-noatime,protected-regular,proxy-response-line-too-long,sender-flist-symlink-leak,simd-checksum make check' + run: bash -c 'RSYNC_EXPECT_SKIPPED=acls-default,acls-depth,acls,bare-do-open-symlink-race,chdir-symlink-race,chown,daemon-access-ip,daemon-chroot-acl,devices,dir-sgid,open-noatime,protected-regular,proxy-response-line-too-long,recv-discard-nullderef,sender-flist-symlink-leak,simd-checksum make check' - name: check (TCP daemon transport) # Second run with daemon tests over a real loopback rsyncd; the default # 'make check' above uses the secure stdio-pipe transport. diff --git a/.github/workflows/macos-build.yml b/.github/workflows/macos-build.yml index 067671ce0..971389fb6 100644 --- a/.github/workflows/macos-build.yml +++ b/.github/workflows/macos-build.yml @@ -44,7 +44,7 @@ jobs: # chown-fake / devices-fake / xattrs / xattrs-hlink now RUN on macOS # (rsyncfns.py drives xattrs via the `xattr` command), verified on a # real macOS host, so they're no longer in the skip set. - run: sudo RSYNC_EXPECT_SKIPPED=acls-default,acls-depth,chmod-temp-dir,daemon-access-ip,daemon-chroot-acl,dir-sgid,open-noatime,preallocate,protected-regular,proxy-response-line-too-long,simd-checksum,sparse make check + run: sudo RSYNC_EXPECT_SKIPPED=acls-default,acls-depth,chmod-temp-dir,daemon-access-ip,daemon-chroot-acl,dir-sgid,open-noatime,preallocate,protected-regular,proxy-response-line-too-long,recv-discard-nullderef,simd-checksum,sparse make check - name: check (TCP daemon transport) # Second run with daemon tests over a real loopback rsyncd; the default # 'make check' above uses the secure stdio-pipe transport. diff --git a/.github/workflows/ubuntu-22.04-build.yml b/.github/workflows/ubuntu-22.04-build.yml index 392b5e738..af14ce4bb 100644 --- a/.github/workflows/ubuntu-22.04-build.yml +++ b/.github/workflows/ubuntu-22.04-build.yml @@ -39,11 +39,11 @@ jobs: - name: info run: rsync --version - name: check - run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long make check + run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long,recv-discard-nullderef make check - name: check30 - run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long make check30 + run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long,recv-discard-nullderef make check30 - name: check29 - run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long make check29 + run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long,recv-discard-nullderef make check29 - name: check (TCP daemon transport) # Second run with daemon tests over a real loopback rsyncd; the default # 'make check' above uses the secure stdio-pipe transport. diff --git a/.github/workflows/ubuntu-build.yml b/.github/workflows/ubuntu-build.yml index c142d693f..72f19e3fd 100644 --- a/.github/workflows/ubuntu-build.yml +++ b/.github/workflows/ubuntu-build.yml @@ -63,11 +63,11 @@ jobs: - name: info run: rsync --version - name: check - run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long make check + run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long,recv-discard-nullderef make check - name: check30 - run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long make check30 + run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long,recv-discard-nullderef make check30 - name: check29 - run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long make check29 + run: sudo RSYNC_EXPECT_SKIPPED=crtimes,daemon-access-ip,daemon-chroot-acl,proxy-response-line-too-long,recv-discard-nullderef make check29 - name: check (TCP daemon transport) # Second run with daemon tests over a real loopback rsyncd. The default # 'make check' above uses the secure stdio-pipe transport (no listening diff --git a/testsuite/recv-discard-nullderef_test.py b/testsuite/recv-discard-nullderef_test.py index a09195783..e130203f8 100755 --- a/testsuite/recv-discard-nullderef_test.py +++ b/testsuite/recv-discard-nullderef_test.py @@ -104,20 +104,23 @@ rc = proc.returncode # A receiver SIGSEGV manifests to the client as a protocol error (the daemon's -# receiver child crashes mid-stream and the connection drops). Pre-fix this is -# code 12 (error in rsync protocol data stream); post-fix the receiver drains -# the delta and reports a benign "could not transfer" (code 23), or succeeds. +# receiver child crashes mid-stream and the connection drops): exit code 12. +# With the fix the receiver drains the delta and, because the forced-unwritable +# destination leaves the file untransferred, the run reports the benign "some +# files were not transferred" -- exit code 23. # -# rsync's own exit codes are all < 128, so we can't read the receiver's signal -# directly from the client. The discriminator is the PROTOCOL error: only a -# crashed (or otherwise vanished) receiver produces code 12 here. A clean -# discard yields 23 (file not transferred) or 0. +# 23 is the ONLY non-crash outcome here: the writability probe above guarantees +# the receiver's mkstemp() fails, so the file is always discarded. An exit 0 +# would mean the file actually transferred -- the discard path was NOT exercised +# and the run proves nothing -- so require exactly 23 (and call out 12 as the +# pre-fix crash). if rc == 12: test_fail(f"receiver crashed on the discard path (rsync exited {rc}: " "error in rsync protocol data stream -- the receiver child " "SIGSEGV'd in full_fname(NULL))") -if rc not in (0, 23): - test_fail(f"unexpected rsync exit {rc} (expected 0 or 23, a benign " - "discard; 12 would be the crash)") +if rc != 23: + test_fail(f"expected rsync exit 23 (the forced discard leaves the file " + f"untransferred); got {rc} -- the discard path was not exercised, " + "so this run validates nothing (12 would be the pre-fix crash)") print(f"OK: receiver discarded the delta without crashing (rsync exit {rc})")