Add verify-deps + precheck for supply-chain freshness checks (npm + pnpm + Python)#89
Draft
Ibrahimrahhal wants to merge 4 commits into
Introduces `corgea verify-deps`, a new top-level command that scans a project's locked dependencies, looks each one up against the public registry (npm or PyPI), and flags any whose installed version was published within a configurable recency window. This is a fast, hermetic supply-chain tripwire useful right before a build or in CI.
Capabilities:
* Ecosystems: npm and Python (selectable via --ecosystem).
* npm sources: package-lock.json (v1, v2, v3), npm-shrinkwrap.json, yarn.lock (classic). Non-registry deps (git/file/link/workspace) are skipped because they can't be looked up by version.
* Python sources: poetry.lock, Pipfile.lock, uv.lock, and requirements.txt (==-pinned lines only).
* Threshold: human-friendly durations such as 2d (default), 48h, 30m, 1w; bare numbers are read as days. Rejects negative / unknown / non-finite values.
* --fail flag for CI: exits 1 when something recent is found.
* --json for machine-readable output (results, summary, sources, scanned_at, threshold_seconds).
* --include-dev to opt into dev dependencies; production-only by default to keep the signal tight.
* Honors CORGEA_NPM_REGISTRY / CORGEA_PYPI_REGISTRY env overrides (intended for tests / mirror users).
Implementation notes:
* PyPI lookup uses the per-version JSON endpoint (/pypi/<name>/<version>/json) and takes the earliest upload_time across the version's artifacts. Names are URL-encoded so PyPI's case- and separator-insensitive matching does the right thing.
* npm lookup hits the package metadata endpoint and reads time[<version>]; scoped names like @types/node are encoded as @types%2fnode in the URL. The abbreviated metadata format is intentionally avoided because it omits time.
* Python distribution names are normalised per PEP 503 before output.
* The registry HTTP client is separate from the rest of the CLI so the user's Corgea auth header is never sent to a third party.
* Dependencies are de-duplicated by (ecosystem, name, version) before registry lookups to avoid hammering the registry on transitive collisions.
Tests:
* 23 hermetic unit tests covering threshold parsing, duration formatting, ecosystem parsing, name normalization, and lockfile parsers (npm v1, npm v3, yarn classic, requirements.txt, poetry, Pipfile, uv).
* 5 #[ignore]'d live integration tests against npmjs.org and pypi.org (left-pad, requests, Flask, plus error paths) for end-to-end verification. Skipped by default to keep CI offline.
Docs: skills/corgea/SKILL.md updated with command reference and a CI workflow snippet.
Co-authored-by: Ibrahim Rahhal <ibrahim.rahhal3636@gmail.com>
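The threshold grammar listed under Capabilities can be illustrated with a small parser. This is a minimal sketch, not the code in this PR: the name `parse_threshold`, the choice of seconds as the return unit, and the reading of `m` as minutes are all assumptions made for illustration.

```rust
/// Parse a duration such as "2d", "48h", "30m" or "1w" into seconds.
/// A bare number is read as days. Hypothetical helper, not the PR's code.
fn parse_threshold(input: &str) -> Result<u64, String> {
    let s = input.trim();
    if s.is_empty() {
        return Err("empty threshold".to_string());
    }
    // Split the numeric part from an optional single-letter unit suffix.
    let (number, unit_secs): (&str, f64) = match s.chars().last() {
        Some(c) if c.is_ascii_alphabetic() => {
            let secs = match c.to_ascii_lowercase() {
                'm' => 60.0, // assumed to mean minutes
                'h' => 3_600.0,
                'd' => 86_400.0,
                'w' => 604_800.0,
                other => return Err(format!("unknown unit '{other}'")),
            };
            (&s[..s.len() - c.len_utf8()], secs)
        }
        // No suffix: treat the whole string as a number of days.
        _ => (s, 86_400.0),
    };
    let value: f64 = number
        .trim()
        .parse()
        .map_err(|_| format!("invalid number '{number}'"))?;
    // Reject negative and non-finite values (f64 parsing accepts "inf" and "nan").
    if !value.is_finite() || value < 0.0 {
        return Err(format!("threshold must be a finite, non-negative number: '{s}'"));
    }
    Ok((value * unit_secs).round() as u64)
}
```

Under these assumptions `parse_threshold("2d")` and `parse_threshold("48h")` both yield 172800 seconds, while inputs such as `-1d` or `5y` return an error.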
Adds pnpm-lock.yaml as a third npm-ecosystem source, alongside the
existing package-lock.json/npm-shrinkwrap.json and yarn.lock parsers.
Discovery prefers package-lock first, then pnpm-lock.yaml, then
yarn.lock.
Lockfile shapes handled in a single line-based parser:
* v5/v6 `packages:` keys with leading slash + slash separator:
      /lodash/4.17.21:
      /@types/node/20.10.5:
* v6+ keys with at-sign separator:
      /lodash@4.17.21:
      /@types/node@20.10.5:
* v9 keys with no leading slash and quoted scoped names:
      lodash@4.17.21:
      '@types/node@20.10.5':
* Peer-dep suffixes are stripped from the version before lookup —
both v6 underscore form (`1.0.0_react@18.0.0`) and v9 paren form
(`1.0.0(react@18.0.0)`). The bare semver is what the registry
knows.
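To make the key conventions above concrete, here is a rough sketch of how a single `packages:` key could be split into a name and a bare version. The function name `parse_pnpm_key` and its exact rules are assumptions for illustration; the parser in this PR may differ.

```rust
/// Split one pnpm `packages:` key into (name, bare version).
/// Hypothetical helper covering the three key shapes listed above.
fn parse_pnpm_key(raw: &str) -> Option<(String, String)> {
    // Drop the trailing YAML colon, surrounding quotes and the leading slash.
    let key = raw.trim().trim_end_matches(':');
    let key = key.trim_matches(|c: char| c == '\'' || c == '"');
    let key = key.strip_prefix('/').unwrap_or(key);
    // Drop a paren-style peer suffix first, because it contains '@'.
    let key = key.split('(').next().unwrap_or(key);

    // For "@scope/name" the separator search starts after the scope slash.
    let search_from = if key.starts_with('@') {
        key.find('/')? + 1
    } else {
        0
    };
    // The first '/' or '@' after the name separates name and version.
    let sep = search_from + key[search_from..].find(|c: char| c == '/' || c == '@')?;
    let (name, version) = (&key[..sep], &key[sep + 1..]);
    // Drop an underscore-style peer suffix such as "_react@18.0.0".
    let version = version.split('_').next().unwrap_or(version);

    // Reject garbage: a version is expected to start with a digit.
    let looks_like_version = version.starts_with(|c: char| c.is_ascii_digit());
    if name.is_empty() || !looks_like_version {
        return None;
    }
    Some((name.to_string(), version.to_string()))
}
```

With this sketch, `/@types/node/20.10.5:` and `'@types/node@20.10.5':` both resolve to `("@types/node", "20.10.5")`, and `foo@1.0.0(react@18.0.0):` resolves to `("foo", "1.0.0")`.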
Dev/prod classification:
* v6 lockfiles carry a per-package `dev:` field — used directly.
* v9 lockfiles don't. We parse `importers:` (and the v5 flat
layout) to get top-level dependencies vs devDependencies, and
treat a (name, version) appearing only in devDependencies of all
importers as dev. Unclassified transitive packages stay treated
as prod, which is the safer default for a supply-chain tripwire.
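The importer-based rule above can be sketched as a set difference. The `Importer` struct, the `Dep` alias, and the helper names below are hypothetical and only illustrate the rule; they are not the types used in this PR.

```rust
use std::collections::HashSet;

/// (name, version) pair as it would appear after key parsing.
type Dep = (String, String);

/// Hypothetical, simplified view of one `importers:` entry.
struct Importer {
    dependencies: Vec<Dep>,
    dev_dependencies: Vec<Dep>,
}

fn dep(name: &str, version: &str) -> Dep {
    (name.to_string(), version.to_string())
}

/// Collect the packages that appear in devDependencies of some importer
/// and in the dependencies of none of them.
fn dev_only(importers: &[Importer]) -> HashSet<Dep> {
    let prod: HashSet<&Dep> = importers
        .iter()
        .flat_map(|imp| imp.dependencies.iter())
        .collect();
    importers
        .iter()
        .flat_map(|imp| imp.dev_dependencies.iter())
        .filter(|d| !prod.contains(*d))
        .cloned()
        .collect()
}

/// Anything not in the dev-only set, including unclassified transitive
/// packages, is treated as prod.
fn is_dev(candidate: &Dep, dev_only: &HashSet<Dep>) -> bool {
    dev_only.contains(candidate)
}
```

Defaulting to prod means an unknown package is always scanned, which matches the tripwire intent described above.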
Tests:
* 7 new unit tests covering all three key conventions, peer suffix
stripping in both forms, garbage rejection, v9/v6/v5 lockfile
parsing, and dev/prod classification.
* Verified end-to-end against a real pnpm-lock.yaml generated by
`pnpm install --lockfile-only` for express@4.18.2 +
@types/node@20.10.5 + typescript@5.4.5(dev): 70 transitive deps
correctly resolved, typescript correctly excluded from prod
scans, and live registry lookups flagged 2 actually-recent
transitive deps (hasown, side-channel-list) within a 60d window.
Docs: `skills/corgea/SKILL.md` updated to advertise pnpm-lock.yaml
(v5/v6/v9) in the supported lockfile list, and the verify-deps
section that was lost during the previous commit's edits is restored.
Co-authored-by: Ibrahim Rahhal <ibrahim.rahhal3636@gmail.com>
Title changed from "verify-deps command for supply-chain freshness checks (npm + Python)" to "verify-deps command for supply-chain freshness checks (npm + pnpm + Python)".
Adds a new `--fail-unpinned` flag to `corgea verify-deps` so users
can fail the build when any declared dependency can't be verified
against a registry because it isn't pinned to an exact version.
Independent of the existing `--fail` (which gates on registry
freshness): the two flags compose, so a CI step like
    corgea verify-deps --threshold 2d --fail --fail-unpinned
now enforces both 'no recently published deps' AND 'no unfrozen
deps' in one shot.
What counts as 'unpinned':
* `package.json` declares dependencies but no
`package-lock.json` / `pnpm-lock.yaml` / `yarn.lock` /
`npm-shrinkwrap.json` is present.
* `pyproject.toml` declares dependencies (PEP 621
`[project].dependencies` / `optional-dependencies`,
`[tool.poetry.dependencies]`, or
`[tool.poetry.group.*.dependencies]`) but no `poetry.lock` /
`uv.lock` / `Pipfile.lock` is present.
* `Pipfile` is present without a sibling `Pipfile.lock`.
* `requirements.in` is present without a compiled
`requirements.txt`.
* Any `requirements.txt` line that isn't `==`-pinned (range
specifiers, bare names, etc.). VCS / URL specifiers are
explicit escape hatches and are not flagged.
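The `requirements.txt` rule in the last bullet can be illustrated with a per-line classifier. This is a simplified sketch under stated assumptions: the `ReqLine` enum and `classify_requirement` are hypothetical names, and details such as treating option lines (`-r`, `-e`, ...) as skipped are guesses rather than the behaviour of `parse_requirements_with_warnings`.

```rust
#[derive(Debug, PartialEq)]
enum ReqLine {
    /// Exactly pinned with `==`, so the registry can be queried by version.
    Pinned { name: String, version: String },
    /// Range specifier or bare name: reported as an unpinned warning.
    Unpinned(String),
    /// Blank, comment, option line, or VCS / URL escape hatch: not flagged.
    Skipped,
}

/// Classify one requirements.txt line (hypothetical helper).
fn classify_requirement(raw: &str) -> ReqLine {
    // Drop trailing comments; pip treats " #" as the start of one.
    let line = raw.split(" #").next().unwrap_or("").trim();
    if line.is_empty() || line.starts_with('#') || line.starts_with('-') {
        return ReqLine::Skipped;
    }
    // VCS / URL specifiers ("git+https://...", "pkg @ https://...").
    if line.contains("://") || line.contains(" @ ") {
        return ReqLine::Skipped;
    }
    // Ignore environment markers after ';'.
    let spec = line.split(';').next().unwrap_or(line).trim();
    if let Some((name, rest)) = spec.split_once("==") {
        // Strip extras such as "[security]" from the name.
        let name = name.split('[').next().unwrap_or(name).trim();
        let version = rest.split_whitespace().next().unwrap_or("");
        let name_ok = !name.is_empty()
            && name
                .chars()
                .all(|c| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.'));
        // "===", wildcards and compound specifiers are not exact pins.
        let version_ok = !version.is_empty()
            && !version.starts_with('=')
            && !version.contains(|c: char| c == '*' || c == ',');
        if name_ok && version_ok {
            return ReqLine::Pinned {
                name: name.to_string(),
                version: version.to_string(),
            };
        }
    }
    ReqLine::Unpinned(spec.to_string())
}
```

In this sketch `flask>=2.0` and a bare `django` come back as `Unpinned`, while `git+https://...` lines come back as `Skipped` and therefore never trip `--fail-unpinned`.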
Behaviour:
* Warnings are surfaced in the report by default — no exit-code
change unless the user opts in. This keeps the existing
contract for callers that just want freshness gating.
* `--fail-unpinned` upgrades them to a non-zero exit. Existing
`--fail` still controls only freshness, so the two are
composable.
* JSON output now includes a top-level `unpinned` array and an
`unpinned` count in `summary`, mirroring the shape of the
`recent` and `errors` fields.
Implementation:
* `DiscoverResult` now carries a `warnings: Vec<UnpinnedWarning>`
alongside its `deps`. Both `npm::discover` and
`python::discover` populate it. When discovery would have
returned the old 'no lockfile found' error AND a manifest
explains why, the discovery now returns successfully with an
empty deps list and a warning instead — the caller's
ecosystem-skip path stays compatible because we keep the error
when there's *nothing* to report.
* `parse_requirements` was refactored into
`parse_requirements_with_warnings` which returns
`(pinned, unpinned_lines)`; the old function is retained as a
thin wrapper for tests.
* Added `pyproject_has_deps` (TOML parsing of PEP 621 + Poetry
tables) and `package_json_has_deps` to avoid false positives
on placeholder manifests with no declared deps.
* `VerifyOptions` gains `fail_unpinned: bool`; `VerifyReport`
gains `unpinned_warnings` plus a `has_unpinned()` helper.
`main.rs` exits with status 1 when `fail_unpinned` is set
and any warning was emitted.
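How the two flags compose into one exit status can be sketched as follows. The struct names mirror the ones mentioned above, but their fields here (`fail`, `recent`, plain `String` entries) and the `has_recent` / `exit_code` helpers are simplifications assumed for illustration, not the real definitions.

```rust
/// Simplified stand-in for the options named above.
struct VerifyOptions {
    fail: bool,
    fail_unpinned: bool,
}

/// Simplified stand-in for the report; real entries carry more detail.
struct VerifyReport {
    recent: Vec<String>,
    unpinned_warnings: Vec<String>,
}

impl VerifyReport {
    fn has_recent(&self) -> bool {
        !self.recent.is_empty()
    }

    fn has_unpinned(&self) -> bool {
        !self.unpinned_warnings.is_empty()
    }
}

/// Each flag gates only its own finding type; either one can fail the run.
fn exit_code(opts: &VerifyOptions, report: &VerifyReport) -> i32 {
    let freshness_failed = opts.fail && report.has_recent();
    let unpinned_failed = opts.fail_unpinned && report.has_unpinned();
    if freshness_failed || unpinned_failed {
        1
    } else {
        0
    }
}
```

Keeping the two conditions separate is what preserves the existing contract: without `--fail-unpinned`, warnings are reported but never change the exit code.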
Tests:
* 9 new unit tests covering: `requirements.txt` line
classification with the new VCS / URL escape-hatch handling;
discover-level warnings for `package.json` without a
lockfile, `package.json` with a lockfile (no warning),
`pyproject.toml` declaring deps without a lockfile,
`pyproject.toml` with no declared deps (still bubbles the
'no lockfile' error), `Pipfile` without `Pipfile.lock`,
`requirements.in` paired with `pyproject.toml`, and
`requirements.txt` line-level unpinned warnings emitted
through the public `discover` API. (`tempfile` is already a
workspace dep so no new crates are needed.)
* Verified end-to-end against a fixture project with all four
failure modes (package.json, pyproject.toml, Pipfile, and
unpinned requirements.txt lines): default run prints warnings
with exit 0; `--fail-unpinned` exits 1; adding a real
`pnpm-lock.yaml` removes the npm warning correctly.
Docs: `skills/corgea/SKILL.md` updated with the flag, a CI
combination example, and the `--fail-unpinned` row in the flag
table.
Co-authored-by: Ibrahim Rahhal <ibrahim.rahhal3636@gmail.com>
`corgea precheck <pkg-mgr> <subcommand> [args...]` is a thin registry-aware wrapper around the package manager's install commands. It resolves what the package manager would install (against registry.npmjs.org or pypi.org) and refuses to run the install when a resolved version was published within --threshold (default 2d).
Use it as a drop-in for the bare command in CI scripts or interactive shells:
    corgea precheck npm install axios@^1.0.0 --save-dev
    corgea precheck pnpm add @types/node@latest
    corgea precheck pip install requests==2.31.0
    corgea precheck pip install -r requirements.txt
    corgea precheck npm install    (bare - verifies the lockfile)
Capabilities
- Supported package managers: npm, yarn, pnpm, pip (alias pip3).
- Spec resolution against the registry:
  - npm: bare name, @latest, any dist-tag (@next, @beta, ...), exact versions, and full semver ranges (^1.0.0, ~1.2.0, ">=1.0.0 <2.0.0"). Both Rust-style comma-separated and npm-style space-separated ranges parse via a new parse_npm_range helper.
  - PyPI: bare name, ==X, and PEP 440 specifiers >=, <=, >, <, !=, ~= with comma-separated AND. Exact pins are honoured precisely; other specifiers fall back to "highest matching stable" using semver for ordering after a small PyPI->semver normalisation step.
- Spec parsing handles common edge cases: scoped npm names (@types/node@1.0.0), npm aliases (npm:other@1.0.0), workspace specs, git / URL / file / path specs, pip extras (requests[security]==2.31.0), env markers (requests==2.31.0; python_version >= "3.7"), and pip flag-with-value pairs (-r FILE, -c FILE, -e PATH, --requirement=FILE, --editable=PATH). Tokens that cannot be classified are reported as "skipped" and never block the install.
- Subcommands other than install/add/i are forwarded transparently to the package manager.
- Bare npm install / pip install (no positional specs) verify the existing lockfile via the existing verify-deps machinery, then exec.
- pip install -r FILE reads the file and runs the same registry verification that verify-deps would run on a project's requirements.txt. Works with arbitrary file names (e.g. -r dev-reqs.txt) via a new verify_arbitrary_requirements path.
Behaviour
- Default: a recent finding makes precheck exit 1 without running the install. Tripwire intent.
- --no-fail: demote the block to a warning; install still runs.
- --check-only: never exec, regardless of result.
- --fail-unpinned: also fail on unverifiable specs (URL / git / file / editable) and on unpinned lines pulled in by -r.
- --json: machine-readable output mirroring the verify-deps schema (results, summary, threshold_seconds).
Implementation notes
- New src/precheck/{mod.rs, parse.rs} for command logic and argument parsing. Exec uses which (already a workspace dep) so the same code path resolves npm.cmd shims on Windows.
- Registry layer extended with two new public APIs in verify_deps/registry.rs:
  - npm_resolve(name, NpmSpec, registry): fetches full package metadata once and resolves Latest / Tag / Exact / Range using semver::VersionReq. Pre-releases are excluded from range matches unless the range itself names one (matches npm).
  - pypi_resolve(name, PypiSpec, registry): uses the per-package /pypi/<name>/json endpoint, filters out yanked / empty releases, and applies PEP 440 specifiers via best-effort semver ordering.
- New crate dep: semver = "1" (Rust's standard semver, also used by Cargo).
- Exec preserves the package manager's exit code, including signal-based termination on Unix (128+sig).
Tests
- 17 new unit tests (under precheck::parse::tests and precheck::tests) covering: package-manager parsing, install-subcommand recognition, npm flag stripping with the -- boundary, scoped / unscoped npm spec classification across Latest / Tag / Exact / Range, npm "unverifiable" specs (git / URL / file / path / npm: / workspace:), pip exact / specifier / extras / env-marker parsing, and pip -r / -e extraction.
- 8 new #[ignore]-gated live integration tests against npmjs.org and pypi.org covering Latest, Exact, Range (both comma- and space-style), unknown-tag failure, PyPI Latest / Exact / Specifier.
- Verified end-to-end against real registries: scoped names with ranges, dist-tag resolution catching today's @types/node@25.9.1 (~1d 20h old) within the default 2d window, exec passthrough, JSON output, mixed valid+skipped specs.
Docs: skills/corgea/SKILL.md updated with a Precheck section, flag table, spec-resolution rules, and a CI workflow snippet.
Open follow-ups left out on purpose (happy to add on request):
- Wrappers for poetry add / pipenv install / uv add / npx.
- Honouring per-command --registry flags.
- Support for npm || OR ranges (not natively supported by the Rust semver crate).
Co-authored-by: Ibrahim Rahhal <ibrahim.rahhal3636@gmail.com>
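The space-to-comma range normalisation mentioned for `parse_npm_range` can be sketched with the standard library alone. The function below, `normalize_npm_range`, is a hypothetical stand-in: it only rewrites the string into the comma-separated form that `semver::VersionReq` expects, and it does not handle `||` alternatives or hyphen ranges such as `1.0.0 - 2.0.0`.

```rust
/// Rewrite an npm-style range (">=1.0.0 <2.0.0") into comma-separated form
/// (">=1.0.0, <2.0.0"). Hypothetical helper, not the PR's parse_npm_range.
fn normalize_npm_range(range: &str) -> String {
    let mut comparators: Vec<String> = Vec::new();
    // Holds an operator that was separated from its version by a space.
    let mut pending_op = String::new();

    let tokens = range
        .split(|c: char| c.is_whitespace() || c == ',')
        .filter(|t| !t.is_empty());
    for token in tokens {
        let is_bare_operator = token
            .chars()
            .all(|c| matches!(c, '<' | '>' | '=' | '^' | '~'));
        if is_bare_operator {
            // ">= 1.0.0": remember the operator and glue it to the next token.
            pending_op.push_str(token);
        } else {
            comparators.push(format!("{pending_op}{token}"));
            pending_op.clear();
        }
    }
    comparators.join(", ")
}
```

Input that is already comma-separated passes through unchanged, so the same helper could sit in front of both range styles before the string is handed to the `semver` crate.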
Title changed from "verify-deps command for supply-chain freshness checks (npm + pnpm + Python)" to "verify-deps + precheck for supply-chain freshness checks (npm + pnpm + Python)".
Summary
This PR adds two complementary commands for supply-chain freshness checking:
* `corgea verify-deps`: point at a project, scan its lockfiles, flag installed-version freshness against the registry.
* `corgea precheck <pkg-mgr> ...`: wrap an install command (`npm install`, `pnpm add`, `pip install`, ...), resolve what would actually be installed, and refuse to run when a resolved version was published within the threshold.
`verify-deps` capabilities
* Ecosystems: npm and Python (selectable via `--ecosystem npm|python|all`).
* npm sources: `package-lock.json` (v1, v2, v3), `npm-shrinkwrap.json`, `pnpm-lock.yaml` (v5, v6, v9), and `yarn.lock` (classic). Non-registry deps (git / file / link / workspace) are skipped.
* Python sources: `poetry.lock`, `Pipfile.lock`, `uv.lock`, and `requirements.txt` (`==`-pinned lines only).
* Threshold: `2d` (default), `48h`, `30m`, `1w`. Bare numbers are interpreted as days.
* `--fail`: exit 1 if any recent dep is found.
* `--fail-unpinned`: exit 1 if any dep can't be verified because it isn't pinned, i.e. `package.json` / `pyproject.toml` / `Pipfile` / `requirements.in` without a matching lockfile, or unpinned `requirements.txt` lines.
* `--json`: machine-readable output with `results`, `unpinned`, `summary`, `sources`, `scanned_at`, `threshold_seconds`.
* Honors `CORGEA_NPM_REGISTRY` / `CORGEA_PYPI_REGISTRY` overrides.
`precheck` capabilities
* Supported package managers: `npm`, `yarn`, `pnpm`, `pip` (alias `pip3`). Subcommands other than `install` / `add` / `i` are forwarded straight through (so `corgea precheck npm view ...` still works).
* npm: bare name, `@latest`, any dist-tag (`@next`, `@beta`, ...), exact versions, full semver ranges (`^1.0.0`, `~1.2.0`, `>=1.0.0 <2.0.0`). Both Rust-style comma-separated and npm-style space-separated ranges are accepted.
* PyPI: bare name, `==X`, and PEP 440 specifiers `>=`, `<=`, `>`, `<`, `!=`, `~=` with comma-separated AND. Exact pins are honoured precisely; other specifiers fall back to "highest matching stable" using `semver` ordering after a PyPI->semver normalisation step.
* Spec parsing edge cases: scoped npm names (`@types/node@1.0.0`), npm aliases (`npm:other@1.0.0`), workspace specs, git/URL/file/path specs, pip extras (`requests[security]==2.31.0`), env markers (`requests==2.31.0; python_version >= "3.7"`), and pip flag-with-value pairs (`-r FILE`, `-c FILE`, `-e PATH`, `--requirement=FILE`, `--editable=PATH`). Unverifiable specs (git / URL / file / editable) are reported as "skipped" and never block.
* Bare `npm install` / `pip install`: verify the existing lockfile via the same `verify-deps` machinery, then exec.
* `pip install -r FILE`: reads the file (any name, not just `requirements.txt`) and runs registry verification on each line; unpinned lines flow through `--fail-unpinned`.
* `--no-fail`: demote to a printed warning; install still runs.
* `--check-only`: never exec, regardless of result.
* `--fail-unpinned`: also fail on unverifiable specs and unpinned `-r` lines.
* `--json`: machine-readable output.
* Exec uses `which` to resolve the manager binary (handles `.cmd` shims on Windows) and preserves the manager's exit code, including signal-based termination on Unix (128+sig).
Implementation Notes
* PyPI: per-version JSON endpoint (`/pypi/<name>/<version>/json`) for verify-deps; full `/pypi/<name>/json` for precheck range resolution. Names URL-encoded; results normalised per PEP 503 before display.
* npm: package metadata endpoint with `time[<version>]` lookup. Scoped names encoded as `@types%2fnode`. Abbreviated metadata format avoided because it omits `time`.
* pnpm: key conventions v5/v6 slash (`/lodash/4.17.21`), v6+ at-sign (`/lodash@4.17.21`), and v9 (`lodash@4.17.21`, `'@types/node@20.10.5'`). Peer-dep suffixes are stripped from the version (both `1.0.0_react@18.0.0` v6 and `1.0.0(react@18.0.0)` v9). Dev/prod classification uses the per-package `dev:` field on v6, and falls back to walking `importers.dependencies` / `importers.devDependencies` on v9.
* Unpinned detection: discovery results carry `warnings: Vec<UnpinnedWarning>` alongside `deps`. Lightweight TOML / JSON parsers confirm a manifest actually declares dependencies (placeholder files don't trigger warnings).
* `npm_resolve` and `pypi_resolve` in the registry module use `semver = "1"` for the Rust-side ordering / matching. `parse_npm_range` normalises npm-style space ranges to comma-separated for the Rust crate.
Tests
* Unit tests covering the parsers and helpers above, including the `--` boundary in flag stripping.
* `#[ignore]`-gated live integration tests against `registry.npmjs.org` and `pypi.org` covering: publish-time lookup (`left-pad@1.3.0`, `requests@2.31.0`, `Flask@2.3.2`, error paths) plus dist-tag / exact / range / specifier resolution (npm Latest/Exact/Range/space-Range/unknown-tag, PyPI Latest/Exact/Specifier).
Walkthroughs
* npm + Python freshness: verify_deps_demo.log
* pnpm-lock.yaml (v9, real `pnpm install --lockfile-only`): verify_deps_pnpm_demo.log
* `--fail-unpinned` (mixed npm + Python project, no lockfiles): verify_deps_fail_unpinned_demo.log
* `precheck` (live registry resolution, real recent publish caught): precheck_demo.log
Files
* `src/verify_deps/mod.rs`: `verify-deps` command core, options, threshold parsing, run loop, report aggregation, `UnpinnedWarning` type.
* `src/verify_deps/npm.rs`: npm/yarn/pnpm lockfile parsers, `package.json`-without-lockfile detection.
* `src/verify_deps/python.rs`: poetry/Pipfile/uv/requirements parsers + PEP 503 name normalisation, `pyproject.toml` / `Pipfile` / `requirements.in` warnings, unpinned-line detection.
* `src/verify_deps/registry.rs`: npmjs.org and PyPI lookups + new `npm_resolve` / `pypi_resolve` for `precheck`.
* `src/verify_deps/report.rs`: text and JSON renderers.
* `src/precheck/mod.rs`: `precheck` command core, dispatch, exec passthrough.
* `src/precheck/parse.rs`: install-command argument parser for npm/yarn/pnpm/pip with all spec edge cases.
* `src/main.rs`: clap subcommand wiring (`verify-deps`, `precheck`).
* `Cargo.toml`: adds `semver = "1"`.
* `skills/corgea/SKILL.md`: command reference and CI workflow snippets.
Open follow-ups (left out of this PR on purpose)
* Wrappers for `poetry add`, `pipenv install`, `uv add`, `npx`: same machinery, just dispatch + arg parsing.
* Honouring per-command `--registry` flags.
* Support for npm `||` OR ranges (not natively supported by the Rust `semver` crate).