Commit ae7e901
fix(sandbox): strip " (deleted)" suffix from unlinked /proc/<pid>/exe paths (#844)
* fix(sandbox): strip " (deleted)" suffix from unlinked /proc/<pid>/exe paths
When a running binary is unlinked from its filesystem path — the common
case is a `docker cp` hot-swap of `/opt/openshell/bin/openshell-sandbox`
during the `cluster-deploy-fast` dev upgrade workflow — the Linux kernel
appends the literal string ` (deleted)` to the `/proc/<pid>/exe` readlink
target. The tainted `PathBuf` then flows into `collect_ancestor_binaries`
and on into `BinaryIdentityCache::verify_or_cache`, which tries to
`std::fs::metadata` the path. `stat()` fails with `ENOENT` because the
suffixed string is not a real filesystem path, and the CONNECT proxy denies
every outbound request with:
    ancestor integrity check failed for \
      /opt/openshell/bin/openshell-sandbox (deleted): \
      Failed to stat ...: No such file or directory (os error 2)
Reproduced in production 2026-04-15: a cluster-deploy-fast-style hot-swap
of the supervisor binary caused every pod whose PID 1 held the now-deleted
inode to deny ALL outbound CONNECTs (slack.com, registry.npmjs.org,
169.254.169.254, etc.), breaking Slack REST delivery, npm installs, and
IMDS probes simultaneously. Existing pre-hot-swap TCP tunnels (e.g. Slack
Socket Mode WSS) kept working because they never re-evaluate the proxy.
Strip the suffix in `binary_path()` so downstream callers see the clean,
grep-friendly path. This aligns the cache key and log messages with the
original on-disk location.
Note: stripping the suffix does NOT by itself make the identity cache
tolerant of a legitimate binary replacement — `verify_or_cache` will now
`stat` and hash whatever currently lives at the stripped path, which is
the NEW binary, and surface a clearer `Binary integrity violation` error.
Fully unblocking the cluster-deploy-fast hot-swap workflow needs a
follow-up that either (a) reads running-binary content from
`/proc/<pid>/exe` directly via `File::open` (procfs resolves this to the
running executable's inode even after its original path has been unlinked),
or (b) keys the identity cache by exec dev+inode instead of path. Happy to
send that as a separate PR once the approach is decided — filing this
narrow fix first because it stands on its own: it fixes a concrete
misleading error and unblocks the obvious next step.
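Option (b) above could look roughly like the sketch below. It is not part of this PR: `ExecKey` and `exec_key` are hypothetical names, and the sketch assumes a Linux procfs where opening `/proc/<pid>/exe` yields the inode the process is actually running, whether or not its path still exists.

```rust
use std::fs::File;
use std::io;
use std::os::unix::fs::MetadataExt;

/// Hypothetical cache key: device + inode of the executable a process runs.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct ExecKey {
    dev: u64,
    ino: u64,
}

/// Open /proc/<pid>/exe and read dev+inode from the open handle, so the key
/// follows the running image rather than whatever now lives at the old path.
fn exec_key(pid: u32) -> io::Result<ExecKey> {
    let exe = File::open(format!("/proc/{pid}/exe"))?;
    let meta = exe.metadata()?;
    Ok(ExecKey {
        dev: meta.dev(),
        ino: meta.ino(),
    })
}
```

The same open handle could also feed the hasher for option (a), so the key and the hashed content would come from one file description.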
Added `binary_path_strips_deleted_suffix` test that copies `/bin/sleep`
to a temp path, spawns a child from it, unlinks the temp binary, verifies
the raw readlink contains the ` (deleted)` suffix, then asserts the public
API returns the stripped path.
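The reproduction steps that test relies on can be sketched as follows. This is an illustration rather than the test from the PR: `raw_exe_link_after_unlink` is a hypothetical name, and the sketch assumes Linux, a standalone `/bin/sleep`, and a temp directory that allows exec. On multi-call binaries such as busybox the child may exit early and the function returns an error instead.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::process::{Command, Stdio};

/// Copy /bin/sleep, run the copy, unlink it, and return the raw
/// /proc/<pid>/exe readlink target for the still-running child.
fn raw_exe_link_after_unlink() -> io::Result<PathBuf> {
    let copy = std::env::temp_dir().join(format!("sleep-copy-{}", std::process::id()));
    fs::copy("/bin/sleep", &copy)?;

    let mut child = match Command::new(&copy)
        .arg("30")
        .stdout(Stdio::null())
        .spawn()
    {
        Ok(child) => child,
        Err(e) => {
            // Clean up the copy if it could not be executed.
            let _ = fs::remove_file(&copy);
            return Err(e);
        }
    };

    // Unlink the binary while the child still runs it, then read the link.
    let result = fs::remove_file(&copy)
        .and_then(|_| fs::read_link(format!("/proc/{}/exe", child.id())));

    // Always reap the child so a failure does not leak a sleeping process.
    let _ = child.kill();
    let _ = child.wait();
    result
}
```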
Signed-off-by: mjamiv <michael.commack@gmail.com>
* fix(sandbox): narrow the " (deleted)" suffix strip and exercise it end-to-end
Address review feedback from @johntmyers on the initial version of this PR:
procfs::binary_path
- Only strip the kernel's " (deleted)" suffix when stat() on the raw
readlink target reports NotFound. A live executable whose basename
literally ends with " (deleted)" is now returned unchanged instead of
being silently truncated, which matters because identity.rs hashes
whatever this function returns as a trust anchor.
- Operate on raw bytes via OsStrExt, so filenames that are not valid
UTF-8 still get exactly one suffix stripped. The previous
strip_suffix-on-&str path skipped non-UTF-8 entirely and fell through
to returning the tainted path.
- Expand the doc comment to describe both guardrails.
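A minimal sketch of the two guardrails described above, assuming a free function that receives the raw readlink target. `sanitize_exe_path` is a hypothetical name, and the real `binary_path` may be structured differently.

```rust
use std::ffi::OsStr;
use std::io::ErrorKind;
use std::os::unix::ffi::OsStrExt;
use std::path::PathBuf;

const DELETED_SUFFIX: &[u8] = b" (deleted)";

/// Hypothetical helper: clean a raw /proc/<pid>/exe readlink target.
fn sanitize_exe_path(raw: PathBuf) -> PathBuf {
    // Work on raw bytes so non-UTF-8 filenames are handled too; strip_suffix
    // removes at most one occurrence of the marker.
    let stripped: Option<PathBuf> = raw
        .as_os_str()
        .as_bytes()
        .strip_suffix(DELETED_SUFFIX)
        .map(|bytes| PathBuf::from(OsStr::from_bytes(bytes)));

    // Only trust the suffix as a kernel marker when stat() on the raw target
    // reports NotFound; a live file literally named "... (deleted)" is kept.
    match (stripped, std::fs::metadata(&raw)) {
        (Some(clean), Err(e)) if e.kind() == ErrorKind::NotFound => clean,
        _ => raw,
    }
}
```

Passing `/opt/openshell/bin/openshell-sandbox (deleted)` when no file with that literal name exists would return `/opt/openshell/bin/openshell-sandbox`; any other `stat()` outcome, including permission errors, leaves the path untouched.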
procfs tests
- binary_path_preserves_live_deleted_basename: copy /bin/sleep to a
live file literally named "sleepy (deleted)", spawn it, and assert
that the returned path still ends with " (deleted)".
- binary_path_strips_suffix_for_non_utf8_filename: exec a binary whose
basename contains a 0xFF byte, unlink it, and assert that
binary_path returns the stripped non-UTF-8 path. Writes the bytes
with OpenOptions + sync_all + explicit drop so the write fd is fully
released before exec() to avoid ETXTBSY under concurrent tests.
proxy: extract resolve_process_identity helper
- Pull the peer-resolution + TOFU verify + ancestor walk + cmdline
collection block out of evaluate_opa_tcp into resolve_process_identity.
- Introduce IdentityError which carries the deny reason along with
whatever partial identity data was resolved before the failure so
evaluate_opa_tcp can thread that into ConnectDecision unchanged.
- evaluate_opa_tcp now calls the helper and proceeds straight to the
OPA evaluate step; the surface visible to OPA and OCSF is unchanged.
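The shape of that error type might resemble the sketch below. Every field name here, along with `PartialIdentity`, `deny_from`, and the simplified `ConnectDecision`, is an assumption made for illustration. Only the idea of a deny reason travelling together with partially resolved identity data comes from the description above.

```rust
use std::fmt;
use std::path::PathBuf;

/// Hypothetical: whatever identity data was resolved before the failure.
#[derive(Debug, Clone, Default, PartialEq)]
struct PartialIdentity {
    binary: Option<PathBuf>,
    binary_pid: Option<u32>,
    ancestors: Vec<PathBuf>,
    cmdline_paths: Vec<PathBuf>,
}

/// Hypothetical layout: the deny reason plus the partial identity.
#[derive(Debug)]
struct IdentityError {
    reason: String,
    partial: PartialIdentity,
}

impl fmt::Display for IdentityError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.reason)
    }
}

impl std::error::Error for IdentityError {}

/// Simplified stand-in for the proxy's decision type.
#[derive(Debug)]
struct ConnectDecision {
    allowed: bool,
    reason: String,
    identity: PartialIdentity,
}

/// Thread the error's contents into a deny decision without dropping fields.
fn deny_from(err: IdentityError) -> ConnectDecision {
    ConnectDecision {
        allowed: false,
        reason: err.reason,
        identity: err.partial,
    }
}
```

Keeping the partial data on the error means a denied request can still be logged with the binary path that was resolved before verification failed.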
proxy: end-to-end regression test for the hot-swap contract
- resolve_process_identity_surfaces_binary_integrity_violation_on_hot_swap
stands up a real TcpListener, copies /bin/bash to a temp path,
primes BinaryIdentityCache with that binary, spawns bash with a
/dev/tcp one-liner that opens a real connection to the listener,
and captures the peer's ephemeral port from accept().
- Simulates `docker cp` correctly: unlink the running binary (which
persists via the child's exec mapping) and create a fresh file
with different bytes at the same path. Writing in place is
rejected by the kernel with ETXTBSY, so the old single-inode
approach did not actually model the production failure mode.
- Asserts the error returned by resolve_process_identity contains
"Binary integrity violation" (from BinaryIdentityCache) and does
NOT contain "Failed to stat" or "(deleted)" — the pre-PR-#844
failure mode. The binary field on the error is populated and is
free of the tainted suffix.
- Skips cleanly if /bin/bash is not installed. Child process is
always reaped before the assertion block so a failure does not
leak a sleeping process.
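The listener half of that test, capturing the peer's ephemeral port from `accept()`, can be sketched in isolation. This sketch swaps the bash `/dev/tcp` child for an in-process `TcpStream` so it stays self-contained; `accept_and_capture_peer_port` is a hypothetical name.

```rust
use std::io;
use std::net::{TcpListener, TcpStream};

/// Bind to an OS-assigned port, connect once, and return the peer port seen
/// by accept() together with the port the client reports for itself.
fn accept_and_capture_peer_port() -> io::Result<(u16, u16)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    // Stand-in for the spawned bash child that opens /dev/tcp/<host>/<port>.
    let client = TcpStream::connect(listener.local_addr()?)?;
    let (_conn, peer) = listener.accept()?;
    Ok((peer.port(), client.local_addr()?.port()))
}
```

The two ports match, which is what lets the test map the accepted connection back to the process that owns the socket.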
---------
Signed-off-by: mjamiv <michael.commack@gmail.com>
1 parent 5c3015a commit ae7e901
2 files changed: +479 -47 lines changed