feat: package-manager-agnostic PyPI post-install patch hook (.pth)#100

Open
Mikola Lysenko (mikolalysenko) wants to merge 6 commits into main from feat/pypi-pth-support

@mikolalysenko
Collaborator

Summary

Adds an automatic, package-manager-agnostic post-install patch hook for Python. npm-family ecosystems already re-apply patches after install via a package.json postinstall script; Python's installers (pip/poetry/uv/pdm/hatch) have no universal post-install step, so patches silently revert after any pip install / --force-reinstall. This closes that gap using Python's interpreter-startup .pth mechanism (the same trick coverage.py uses), so it works the same regardless of which installer is used.

It is experimental and gated behind the existing non-blocking setup-e2e matrix; nothing in the default build path changes behavior.

The problem & the committed-state requirement

The activating state must live in the repo so it works in CI: a developer runs socket-patch setup, commits, and on every CI deploy the install re-applies patches — with no changes to the CI script. Writing a .pth into a local venv would be wrong (the venv isn't committed). So:

  1. socket-patch setup commits a bare socket-patch-hook dependency to the project's manifest.
  2. In CI, pip install / uv sync / poetry install installs that wheel, which ships a RECORD-tracked .pth (so pip uninstall removes it cleanly).
  3. At interpreter startup the hook does a microsecond-cheap "did site-packages change?" check; only on a change does it run socket-patch apply --offline, re-healing the patch.
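
The startup check in step 3 can be sketched roughly as follows. This is a minimal sketch, not the hook's actual code: it assumes the "did site-packages change?" signal is the site-packages directory's own mtime, and the names `fingerprint`, `heal_if_changed`, and the `stamp` file are hypothetical.

```python
import os
from pathlib import Path
from typing import Callable


def fingerprint(site_packages: Path) -> str:
    """One stat call. ASSUMPTION: the directory mtime is the change signal;
    it moves when a *.dist-info entry is added or removed by an install."""
    return str(os.stat(site_packages).st_mtime_ns)


def heal_if_changed(site_packages: Path, stamp: Path,
                    apply: Callable[[], bool]) -> bool:
    """Invoke `apply` only when the fingerprint differs from the stamp.

    Returns True if `apply` was invoked, False on the no-change fast path.
    """
    current = fingerprint(site_packages)
    try:
        if stamp.read_text() == current:
            return False  # fast path: one stat plus one small read
    except OSError:
        pass  # missing or unreadable stamp counts as "changed"
    if apply():
        # Stamp only on success, so a failed apply retries next startup.
        stamp.write_text(current)
    return True
```

Here `apply` stands in for running `socket-patch apply --offline`; keeping the stamp outside site-packages avoids the stamp write itself changing the fingerprint.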

This shape is also designed so a GitHub bot/Action can open a PR with the same effect as setup (run socket-patch setup + socket-patch get, commit the diff). setup is one-time; apply/scan run many times.

How it works

  • New pure-python wheel socket-patch-hook (pypi/socket-patch-hook/): ships socket_patch_hook.pth + a fail-open run().
    • Version-agnostic: it has no dependency on socket-patch. At runtime it invokes whatever socket-patch is on PATH (or a pip-installed socket_patch), and no-ops if none is found. The committed socket-patch-hook token never needs a version bump; fresh installs auto-pull the latest hook logic.
    • Fail-open & cheap: run() can never raise into site.py (a raise there would hit every interpreter start). The no-change path is a couple of stats + one read. On change it runs apply --offline --silent --ecosystems pypi --cwd <root> --lock-timeout 0 synchronously (offline = no startup network hang; reuses the hardened apply — no Python-side patching). The change stamp is written only on apply success, so a lock-contended/failed apply retries.
    • Disable at runtime with SOCKET_PATCH_HOOK=off or SOCKET_NO_HOOK=1 (checked before any hook code runs).
  • setup Python branch (crates/socket-patch-core/src/pth_hook/ + commands/setup.rs): detects the PM and edits the right manifest (PEP 621 [project].dependencies, classic Poetry [tool.poetry.dependencies] as its own socket-patch-hook = "*" key, or requirements.txt), then refreshes the lockfile (uv lock / poetry lock / pdm lock) best-effort. setup --remove reverses it.
  • Single source of truth: the committed dependency line is the signal that the hook is active. There is no separate marker/audit file — git history is the audit trail, so nothing can drift out of sync with the manifest. (An earlier .socket/hook.json was removed for exactly this reason.)
  • Reuse, don't reimplement: the hook only triggers the existing hardened socket-patch apply binary. All hash-verify / atomic-write / lock / offline / sidecar logic stays in Rust.
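
To make the fail-open behavior concrete, here is a hedged sketch of what a `run()`-style entry point could look like. The CLI flags and the two environment switches are taken from the description above; the function names, signatures, and the exact string matching are assumptions. The description also says the disable switches are checked before any hook code runs, so placing that check inside the function is a simplification.

```python
import os
import subprocess
from typing import List, Mapping, Optional


def hook_disabled(env: Mapping[str, str]) -> bool:
    """True when either documented disable switch is set."""
    return env.get("SOCKET_PATCH_HOOK") == "off" or env.get("SOCKET_NO_HOOK") == "1"


def build_apply_command(binary: str, root: str) -> List[str]:
    """The offline, non-blocking apply invocation described above."""
    return [binary, "apply", "--offline", "--silent",
            "--ecosystems", "pypi", "--cwd", root, "--lock-timeout", "0"]


def run(binary: Optional[str], root: str,
        env: Mapping[str, str] = os.environ) -> bool:
    """Fail-open: True only if apply ran and exited 0; errors never propagate."""
    try:
        if binary is None or hook_disabled(env):
            return False  # no CLI found, or hook switched off: no-op
        result = subprocess.run(
            build_apply_command(binary, root),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=120,  # backstop so a hung apply cannot stall startup forever
            check=False,
        )
        return result.returncode == 0
    except Exception:
        # A missing binary, timeout, or OS error must not reach site.py.
        return False
```

The boolean return value is what a caller would use to decide whether to write the change stamp.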

Security posture

An auto-executing .pth is mechanically the pattern AV/EDR/supply-chain scanners flag. Mitigations: activation is an explicit, committed, reviewable dependency (never a transitive surprise); the main socket-patch CLI wheel stays .pth-free (the hook is a separate, opt-in wheel); and a convenience socket-patch[hook] extra exists for an all-in-one install.

Scope & status (verified in Docker)

| Package manager | Status |
| --- | --- |
| pip | ✅ pass (6/6 scenarios) |
| uv | ✅ pass (6/6) |
| hatch | ✅ pass (6/6) |
| poetry | ⚠️ documented gap |
| pdm | ⚠️ documented gap |
| nested workspaces (pip/uv) | ⚠️ documented gap |

Why poetry/pdm are gaps: they're resolver-based — add/install/run re-resolve the whole manifest (now incl. the committed socket-patch-hook) against a package index. The hermetic test has no local index and the hook wheel isn't published, so resolution fails (CandidateNotFound). This is a test-harness limitation, not a mechanism limitation — the .pth hook is package-manager-agnostic (proven by pip/uv/hatch), and in production (published hook) poetry/pdm resolve it like any dependency. Making them testable would require standing up a local PEP 503 index; left as a follow-up.

Reconciliation with main

This branch merges current main, including #99 (setup --check/--remove), which independently rewrote setup.rs. setup.rs was reconciled so run() dispatches check/remove/setup (main's structure, with npm remove_package_json) and each path also handles the Python hook. npm-only projects get byte-identical envelopes to main, so main's tests are unaffected; Python entries appear only when a Python project is present.

Testing

  • Full Rust suite: 1674 passed / 0 failed (--features cargo), clippy-clean on changed files.
  • New tests: pth_hook core units (detect / format-preserving manifest edit / idempotency / dry-run), setup_pth_invariants CLI integration (pip/uv/poetry manifest editing, --remove, polyglot), and socket-patch-hook Python unit tests (fail-open, disable switches, re-entrancy, stamp-on-success).
  • npm-family setup matrix: 7/7 (host), confirming that #99's check→setup→install→check→remove→check round-trip still works through the reconciled setup.rs.
  • pypi setup matrix: pip/uv/hatch pass in Docker (image rebuilt from this branch).
  • Manual end-to-end on a real venv: install → hook applies the committed patch offline → reinstall reverts → next interpreter self-heals it.

Integrity note

The matrix's behavior contract was not weakened to make pypi pass. Diffs vs main confirm: the assertion gate (run_cases / actual_applied == expect_applied) is unchanged (only wheel-provisioning plumbing added); scenario expect_applied values are unchanged (incl. all negative controls); and the result-computing logic is unchanged. Flipping baseline_supported only changes a failure label, never whether a case passes. Negative controls (no_setup_control, empty, wrong_target, patch_missing) genuinely do not apply.

Bugs found & fixed while validating in Docker

  1. The harness mounted the hook wheel under a non-PEP-427 filename → pip/uv/pdm rejected it. Now mounted preserving its real …-py3-none-any.whl name.
  2. Verification ran the patched module via <pm> run python, which re-resolves the project (now incl. the unpublished hook dep) → false "not applied". Now runs the file with the in-project venv interpreter directly — faithful, without the unrelated resolve step.
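
For context on bug 1: the wheel filename convention is `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`, so installers parse metadata out of the name itself. The sketch below is a rough structural check only, not what pip/uv/pdm actually run, and the version `0.0.0` in the usage note is a placeholder.

```python
def is_valid_wheel_filename(filename: str) -> bool:
    """Rough structural check of a wheel filename.

    A wheel name has 5 dash-separated fields, or 6 with the optional
    build tag; dashes inside a field are escaped to underscores.
    """
    if not filename.endswith(".whl"):
        return False
    parts = filename[: -len(".whl")].split("-")
    return len(parts) in (5, 6) and all(parts)
```

Under this check, `socket_patch_hook.whl` (the name the harness originally mounted) fails, while a name shaped like `socket_patch_hook-0.0.0-py3-none-any.whl` passes.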

Files

  • New: pypi/socket-patch-hook/*, crates/socket-patch-core/src/pth_hook/{mod,detect,edit}.rs, crates/socket-patch-cli/tests/setup_pth_invariants.rs.
  • Changed: pypi/socket-patch/{pyproject.toml, socket_patch/__init__.py} (extract _resolve_binary, [hook] extra), commands/setup.rs (Python branch + reconcile with #99), crawlers/python_crawler.rs (is_python_project → pub), scripts/build-pypi-wheels.py (build the hook wheel), and the setup_matrix driver/harness/spec.
  • Dependency: toml_edit (format-preserving pyproject edits).

Follow-ups

  • A local package index so poetry/pdm can be exercised hermetically.
  • Wire the nested-workspace pypi layouts.
  • Publish socket-patch-hook to PyPI + lockstep release tooling.

🤖 Generated with Claude Code

…itted .pth wheel

Adds a Python post-install patch hook that works under pip/uv/poetry/pdm/hatch
alike, since it rides Python's interpreter-startup .pth mechanism rather than any
one installer's hook.

- New pure-python `socket-patch-hook` wheel (pypi/socket-patch-hook/): ships a
  RECORD-tracked .pth + a fail-open run() that, on a cheap dist-info change,
  re-applies offline via whatever `socket-patch` CLI is on PATH. Version-agnostic
  (no dependency on the CLI).
- `socket-patch setup` Python branch (core/src/pth_hook/ + commands/setup.rs):
  commits a bare `socket-patch-hook` dependency (PEP 621 / Poetry / requirements,
  + lockfile refresh). The committed dependency is the single source of truth;
  no separate marker/audit file. `--remove` reverses it.
- Flip setup matrix pip+uv to the `pth` hook family and wire run-case.sh +
  the harness (build/pass the hook wheel, trigger an interpreter).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# Conflicts:
#	crates/socket-patch-cli/src/commands/setup.rs
#	crates/socket-patch-cli/tests/cli_parse_setup.rs
Validated the pypi setup matrix in Docker (rebuilt image from this branch) and
fixed two real bugs that broke the Docker path:

- Hook wheel was mounted into the container under a non-PEP-427 filename
  (/tmp/socket_patch_hook.whl), which pip/uv/pdm reject ("not a valid wheel
  filename"). Mount it preserving its real {name}-{ver}-{tags}.whl filename.
- verify ran the patched module via `<pm> run python`, which re-resolves the
  project — now including the committed (and, in the hermetic test, unpublished)
  socket-patch-hook dependency → resolve failure unrelated to whether the file
  is patched. Run the file with the in-project venv interpreter directly
  instead (faithful: still executes the patched code + checks the marker).

Result: pip, uv and hatch pass in Docker (flip hatch to baseline_supported).
poetry/pdm stay documented gaps: their add/install/run re-resolve the manifest
against an index the hermetic test can't provide; the .pth mechanism itself is
package-manager-agnostic (proven by pip/uv/hatch).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@socket-security

socket-security Bot commented Jun 2, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | cargo/toml_edit@0.25.12+spec-1.1.0 | 100 | 100 | 93 | 100 | 100 |

View full report

@socket-security-staging

socket-security-staging Bot commented Jun 2, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | cargo/toml_edit@0.25.12+spec-1.1.0 | 100 | 100 | 93 | 100 | 100 |

View full report

…ket-patch-hook to PyPI

setup now commits the `socket-patch[hook]` extra (one line that pulls both the
CLI and the socket-patch-hook .pth wheel) instead of a bare socket-patch-hook
dep. PEP 621 / requirements.txt get the literal `socket-patch[hook]`; classic
Poetry can't express an extra as a bare key, so edit.rs writes the equivalent
`socket-patch = { extras = ["hook"] }`, merged into any existing socket-patch
dep with its version/source preserved. The separate socket-patch-hook wheel
remains the irreducible .pth carrier behind the extra (an extra can only pull a
dependency, not ship a file); users never reference it directly.

Release: publish socket-patch and socket-patch-hook as separate PyPI projects,
each from its own dist dir so trusted publishing mints a correctly-scoped OIDC
token per project. (socket-patch-hook needs its own pending trusted publisher
registered on PyPI before the first release.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mikolalysenko
Collaborator Author

Update: socket-patch[hook] extra + PyPI publish flow for socket-patch-hook

Committed dependency is now the socket-patch[hook] extra. setup writes one familiar line that pulls both the CLI and the hook:

  • requirements.txt / PEP 621 [project].dependencies → the literal socket-patch[hook].
  • classic Poetry can't express an extra as a bare key, so it writes the equivalent socket-patch = { extras = ["hook"] }, merged into any existing socket-patch dep with its version/source preserved.
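
The Poetry merge rule above can be illustrated on plain dictionaries. This is only a sketch of the merge logic: the real edit lives in edit.rs and uses toml_edit to preserve formatting, which this does not attempt, and the function name `add_hook_extra` is hypothetical.

```python
from typing import Any, Dict


def add_hook_extra(deps: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a [tool.poetry.dependencies]-style mapping in which
    socket-patch carries the "hook" extra, keeping any version/source keys."""
    out = dict(deps)
    existing = out.get("socket-patch")
    if existing is None:
        entry: Dict[str, Any] = {}            # no prior dep: extras-only table
    elif isinstance(existing, str):
        entry = {"version": existing}         # bare version string becomes a table
    else:
        entry = dict(existing)                # keep version/source/etc. as-is
    extras = list(entry.get("extras", []))
    if "hook" not in extras:                  # idempotent on repeated setup runs
        extras.append("hook")
    entry["extras"] = extras
    out["socket-patch"] = entry
    return out
```

For example, `add_hook_extra({"socket-patch": "^3.3.0"})` produces `{"socket-patch": {"version": "^3.3.0", "extras": ["hook"]}}`, and applying it a second time leaves the mapping unchanged.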

Why the separate socket-patch-hook wheel still exists: a Python extra can only pull a dependency, never ship a file like the .pth. So socket-patch[hook] is necessarily backed by a tiny wheel that carries the .pth. The extra is the front door; the wheel is invisible plumbing in this same repo/release — users never reference socket-patch-hook directly.

Publish flow added (release.yml pypi-publish job): the build already produces both the platform socket-patch wheels and the pure-python socket-patch-hook wheel; they're now published as two separate PyPI projects, each from its own dir, so trusted publishing mints a correctly-scoped OIDC token per project (a single upload spanning both can be rejected).

One manual prerequisite: register a pending trusted publisher for the socket-patch-hook project on PyPI (repo + release.yml + the publish environment) before the first release, exactly as socket-patch is set up.

Verification: full Rust suite 1674/0; pip/uv/hatch pass in Docker with the new committed form; Poetry merge preserves the existing version ({ version = "^3.3.0", extras = ["hook"] }). poetry/pdm remain documented gaps in the hermetic matrix (resolver needs a package index).

…LI" section

It contradicted the recommended flow: setup commits `socket-patch[hook]`, which
pulls the socket-patch package, so "no dependency / provision the CLI yourself"
was misleading. The README now just covers how it works, activating it via
`socket-patch setup`, and the disable switch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… the model

Acted on an adversarial security review of the feature. Fixes:

- CRITICAL (apply engine): reject manifest file keys that escape the package
  directory (absolute paths or `..`). A committed/poisoned .socket/manifest.json
  could otherwise make `apply` write outside site-packages (arbitrary-file write
  -> code execution) via pkg_path.join(key). New is_safe_relative_subpath()
  guards apply_package_patch (hard abort, not --force-skippable), apply_file_patch,
  and verify_file_patch. This hardens all callers (apply/scan/rollback), and the
  auto-running hook made it reachable from a committed manifest.
- HIGH (hook): anchor project discovery to the virtualenv (sys.prefix) instead of
  cwd, so a `python` started from an unrelated dir can't pull in a foreign
  .socket/manifest.json (cross-project contamination). Falls back to cwd only for
  non-venv (system/container) interpreters.
- HIGH (hook): resolve the socket-patch binary from the installed socket_patch
  package first, then PATH — avoids running a malicious `socket-patch` placed
  earlier on PATH at every interpreter startup.

Also: add an explanatory comment block to the .pth (purpose, disable, remove,
link; site.py ignores `#` lines), and document the hook's safety model + opt-out
/ disable in the root README `setup` section and the socket-patch-hook README.

Low-severity findings (stamp-poisoning needs cache write access; the release
pending-publisher is a one-time manual step; the 120s apply timeout is an
intentional fail-open backstop) are accepted/documented, not code-changed.
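
A rough sketch of the venv-anchored discovery from the first HIGH (hook) item. Comparing `sys.prefix` with `sys.base_prefix` is the standard way to detect a virtualenv; everything else here (the function name, walking up parent directories, taking the prefixes as parameters) is an assumption about how the anchoring might look, not the hook's code.

```python
from pathlib import Path
from typing import Optional


def find_project_root(prefix: str, base_prefix: str, cwd: str) -> Optional[Path]:
    """Locate the directory holding .socket/manifest.json.

    Inside a virtualenv (prefix != base_prefix) the search starts at the
    venv directory, so the process's cwd cannot pull in a foreign project.
    Only non-venv interpreters fall back to cwd.
    """
    in_venv = prefix != base_prefix
    start = Path(prefix) if in_venv else Path(cwd)
    for candidate in (start, *start.parents):
        if (candidate / ".socket" / "manifest.json").is_file():
            return candidate
    return None
```

In real use the arguments would be `sys.prefix`, `sys.base_prefix`, and `os.getcwd()`; passing them in keeps the function testable.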

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mikolalysenko
Collaborator Author

Security review + hardening of the .pth hook

Ran an adversarial multi-angle review (6 surfaces × per-finding verification, 52 agents): 8 confirmed, 5 plausible, 33 refuted. Fixes landed in 98c4ee6.

Fixed

| Sev | Surface | Issue → fix |
| --- | --- | --- |
| Critical | apply engine | Path traversal via manifest file keys. pkg_path.join(key) with an absolute or .. key (from a committed/poisoned .socket/manifest.json) could write outside site-packages → arbitrary-file write/RCE, and the auto-running hook made it reachable without an explicit command. Added is_safe_relative_subpath(); apply_package_patch now hard-aborts on an unsafe key (not --force-skippable), with defense-in-depth in apply_file_patch/verify_file_patch. Hardens all callers (apply/scan/rollback), not just the hook. |
| High | hook discovery | Cross-project contamination. _find_project_root walked up from cwd, so a python started elsewhere could pull in a foreign/parent .socket/. Now anchored to the virtualenv (sys.prefix) the hook is installed in; cwd is used only for non-venv (system/container) interpreters. |
| High→Med | hook binary resolution | PATH-injection. Resolving socket-patch via PATH first let a malicious binary earlier on PATH run at every interpreter start. Now resolves the installed socket_patch package binary first, PATH only as fallback. |
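
The idea behind the path-traversal guard can be shown in a few lines. The actual is_safe_relative_subpath() is Rust code in the apply engine and is not reproduced here; this is an illustrative Python rendition of the same rule (reject absolute paths and any `..` component), and the extra Windows-style checks are an assumption.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_safe_relative_subpath(key: str) -> bool:
    """True if `key` stays inside the directory it will be joined onto."""
    if not key or "\0" in key:
        return False
    # Check under both path flavours so "C:\\x" and "\\x" are also rejected.
    for flavour in (PurePosixPath, PureWindowsPath):
        path = flavour(key)
        if path.is_absolute() or path.anchor or ".." in path.parts:
            return False
    return True
```

A caller would run this check before any `pkg_path / key` join and abort the whole package patch on failure, which matches the hard-abort behavior described in the table.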

Your explicit asks

  • .pth purpose comment — added a 12-line # header (purpose, disable, remove, link). Verified site.py ignores # lines (no sys.path pollution) and execs only the single import line.
  • Opt-out / disable / skip — three layers, now documented: it's opt-in (no hook unless you setup); per-interpreter SOCKET_PATCH_HOOK=off / SOCKET_NO_HOOK=1 (checked before any hook code runs); project removal via setup --remove + pip uninstall socket-patch-hook.
  • Careful docs — rewrote the root README setup section (npm + Python, --check/--remove, disabling, and a "what the hook does + safety model" subsection) and added a Safety section to the socket-patch-hook README.
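
The `.pth` comment behavior noted in the first item follows from how site.py reads these files: lines starting with `#` and blank lines are skipped, lines starting with `import` followed by a space or tab are executed, and anything else is treated as a directory to add to sys.path. The sketch below models only that classification; the sample import line is hypothetical, not the hook's real `.pth` content.

```python
from typing import List


def classify_pth_line(line: str) -> str:
    """Simplified model of how site.py treats one line of a .pth file."""
    if line.startswith("#"):
        return "comment"  # skipped entirely
    if line.strip() == "":
        return "blank"    # skipped entirely
    if line.startswith(("import ", "import\t")):
        return "exec"     # executed at interpreter startup
    return "path"         # candidate sys.path entry


def classify_pth(text: str) -> List[str]:
    return [classify_pth_line(line) for line in text.splitlines(keepends=True)]


# HYPOTHETICAL .pth content: a comment header plus a single import line.
sample = (
    "# socket-patch hook: re-applies patches after installs\n"
    "# Disable with SOCKET_PATCH_HOOK=off\n"
    "import socket_patch_hook; socket_patch_hook.run()\n"
)
kinds = classify_pth(sample)
```

Because no line in such a file classifies as `path`, the header cannot add entries to sys.path.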

Accepted / documented (low)

  • Stamp poisoning — requires write access to the user's cache dir; impact is "patches skipped" (availability), and apply is idempotent + hash-verified, so integrity isn't compromised. The stamp is only a fast-path optimization. Not worth an HMAC.
  • Release pending-publisher — one-time manual PyPI step; documented inline in release.yml.
  • 120s apply timeout — intentional fail-open backstop; not reduced (avoids false timeouts; logging would violate the silent/never-break-startup property).

Notably refuted (confirmed strengths)

Fail-open triple-guard (can't break interpreter startup), exec() pattern robust across 3.8–3.13, negligible fast-path cost, .pth comment handling safe, re-entrancy guard sound, manifest edits inject nothing (constant dependency string), and apply runs offline and verifies beforeHash (a hostile manifest can't overwrite a file whose exact pre-bytes it doesn't already know).

Verification

Full Rust suite 1676/0 (+2 traversal-guard tests), 17 Python hook tests (incl. 3 new venv-anchoring tests), clippy-clean, and the Docker pypi matrix (pip/uv/hatch) still 3/3 with the hardened hook.
