diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..0b763f0
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,41 @@
+# CLAUDE.md (cli/)
+
+Corgea developer CLI — Rust binary distributed three ways: native zip (GH releases), npm (`@corgea/cli`), pip (`corgea-cli` via maturin). Parent `corgea/CLAUDE.md` covers the monorepo; this file is cli-specific.
+
+## Layout
+
+`src/main.rs` defines the clap `Commands` enum and dispatches. Each subcommand is its own module: `authorize.rs` (OAuth login), `scan.rs` + `scanners/{blast,fortify}.rs`, `scanners/parsers/{semgrep,sarif,checkmarx,coverity}.rs` for upload formats, `list.rs`, `inspect.rs`, `wait.rs`, `setup_hooks.rs`, `targets.rs` (path/glob/git selectors), `cicd.rs`. Shared infra in `utils/{api,terminal,generic}.rs` and `config.rs`.
+
+User-facing reference lives in `skills/corgea/SKILL.md` — keep it in sync when adding/changing commands.
+
+## Build / test / run
+
+| Task | Command |
+|---|---|
+| Tests | `cargo test` (also runs in `.github/workflows/test.yml`) |
+| Native build | `cargo build --release` → `./target/release/corgea` |
+| Python wheel (dev) | `maturin develop` (needs venv + `pip install maturin`) |
+| Local multi-target build | `./build_release.sh` (Darwin/Linux only; CI is the canonical path) |
+
+
+- Add a variant to `Commands` in `src/main.rs`, a match arm in `main()`, and a new `mod` (declared at top of `main.rs`).
+- Token-gated commands must call `verify_token_and_exit_when_fail(&corgea_config)` before doing work.
+- HTTP calls go through `utils::api` — do not instantiate `reqwest::Client` directly. Auth header is set via `utils::api::set_auth_token`; the client picks `Authorization: Bearer` for JWTs and `CORGEA-TOKEN` otherwise (`utils/api.rs:22`).
+- Update `skills/corgea/SKILL.md` and `README.md` if user-visible.
+
+
+
+- `Cargo.toml` `version` is the source of truth. `pyproject.toml` is `dynamic = ["version"]` (maturin reads Cargo).
`package.json` version is overwritten from the git tag by `npm-publish.yml`.
+- Release flow: bump `Cargo.toml`, merge, push tag `v`. That triggers `release.yml` (PyPI via maturin), `release-binaries.yml` (zips per target attached to the GH release), then `npm-publish.yml` (downloads those zips, runs `scripts/npm/bundle-binaries.js` to lay out `vendor//corgea/`, publishes to npm).
+- Supported npm/pip target triples are listed in `scripts/npm/bundle-binaries.js` and `bin/corgea.js` — keep them in lockstep with the CI matrix in `release-binaries.yml`.
+
+
+
+- Config persists at `~/.corgea/config.toml` (`config.rs`). Env overrides: `CORGEA_TOKEN`, `CORGEA_URL`, `CORGEA_DEBUG`, `CORGEA_SOURCE`, `CORGEA_ACCEPT_CERT` (skip TLS verification, only honored when `https_proxy` is set).
+- A single shared `reqwest::blocking::Client` lives in `utils/api.rs` with a 150s timeout and a process-wide cookie jar — reuse it; do not build new clients per call.
+- `corgea login` with no token launches a localhost OAuth callback (`authorize.rs`); with a token (or `CORGEA_TOKEN` env) it verifies and stores. `--scope` selects the tenant subdomain and overrides `--url`.
+
+
+
+- New parsers live in `src/scanners/parsers/` and are wired in `mod.rs`. Dispatch happens in `scan.rs` (`read_file_report` / `read_stdin_report`); Fortify `.fpr` is special-cased in `main.rs` and goes through `scanners/fortify.rs`.
+
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..0c2c480
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,110 @@
+# Contributing to Corgea CLI
+
+Thanks for your interest in improving the Corgea CLI! This document covers everything you need to go from a fresh clone to a merged pull request.
+
+The CLI is a Rust binary distributed three ways: native zips on [GitHub Releases](https://github.com/Corgea/cli/releases), npm (`@corgea/cli`), and pip (`corgea-cli`, built with [maturin](https://github.com/PyO3/maturin)).
+
+## Ways to contribute
+
+- **Report a bug** — open a [bug report](https://github.com/Corgea/cli/issues/new?template=bug_report.yml).
+- **Request a feature** — open a [feature request](https://github.com/Corgea/cli/issues/new?template=feature_request.yml).
+- **Ask a question** — use [Discussions](https://github.com/Corgea/cli/discussions); please don't open an issue for support.
+- **Send a fix or feature** — see the workflow below. For anything non-trivial, open an issue first so we can agree on the approach before you invest time.
+
+To report a **security vulnerability**, do *not* open a public issue — follow [SECURITY.md](SECURITY.md).
+
+## Prerequisites
+
+- **Rust** (stable) — install via [rustup](https://rustup.rs/). The crate targets edition 2021.
+- **Python 3.8+** — only needed if you build or test the pip wheel.
+- A C toolchain — required to build the vendored OpenSSL dependency on non-Windows platforms.
+
+## Development setup
+
+```bash
+git clone https://github.com/Corgea/cli.git
+cd cli
+cargo build
+```
+
+### Build and run
+
+| Task | Command |
+|---|---|
+| Build (debug) | `cargo build` |
+| Build (release) | `cargo build --release` → `./target/release/corgea` |
+| Run a subcommand | `cargo run -- scan --help` |
+| Python wheel (dev) | `maturin develop` (needs a venv + `pip install maturin`) |
+| Multi-target build | `./build_release.sh` (Darwin/Linux only; CI is the canonical path) |
+
+After changing Rust code, re-run `maturin develop` to rebuild the Python wheel.
+
+### Test, format, lint
+
+```bash
+cargo test                  # runs in CI on every push and PR
+cargo fmt --all             # format before committing
+cargo clippy --all-targets  # catch common mistakes
+```
+
+CI currently runs `cargo test`. Please run `cargo fmt` and `cargo clippy` locally anyway — a clean, warning-free diff is much faster to review.
+
+Run a single test with `cargo test `.
+
+## Project layout
+
+`src/main.rs` defines the clap `Commands` enum and dispatches each subcommand.
Every subcommand is its own module:
+
+- `authorize.rs` — OAuth login
+- `scan.rs` + `scanners/{blast,fortify}.rs` — scanning
+- `scanners/parsers/{semgrep,sarif,checkmarx,coverity}.rs` — upload formats
+- `list.rs`, `inspect.rs`, `wait.rs`, `setup_hooks.rs`, `targets.rs`, `cicd.rs`
+- `utils/{api,terminal,generic}.rs`, `config.rs` — shared infrastructure
+
+`CLAUDE.md` at the repo root has deeper notes on internal conventions.
+
+### Adding a subcommand
+
+1. Add a variant to the `Commands` enum in `src/main.rs`.
+2. Add a match arm in `main()` and declare the new `mod` at the top of `main.rs`.
+3. Put the implementation in its own module file.
+4. Token-gated commands must call `verify_token_and_exit_when_fail(&corgea_config)` before doing work.
+5. Make HTTP calls through `utils::api` — do not construct a `reqwest::Client` yourself. The shared client carries auth, cookies, and the standard timeout.
+6. Update `skills/corgea/SKILL.md` and `README.md` if the change is user-visible.
+
+### Adding an upload/report parser
+
+New parsers live in `src/scanners/parsers/` and are wired in `mod.rs`. Dispatch happens in `scan.rs` (`read_file_report` / `read_stdin_report`); the Fortify `.fpr` format is special-cased in `main.rs` via `scanners/fortify.rs`.
+
+## Pull request workflow
+
+1. Fork the repo and create a branch from `main`.
+2. Make your change. Keep PRs focused — one logical change per PR.
+3. Add or update tests for behavior changes.
+4. Run `cargo test`, `cargo fmt --all`, and `cargo clippy` — all clean.
+5. Update `README.md` and `skills/corgea/SKILL.md` if you changed anything user-facing.
+6. Open the PR against `main` and fill out the template. Link the issue it closes.
+
+A maintainer will review. Please be responsive to feedback; stale PRs may be closed and can always be reopened.
+
+### Commit messages
+
+Write a concise, imperative summary line (e.g. "Support JWT tokens", "Fail on offset mismatch when uploading a report").
The PR title becomes the squashed commit, so make it descriptive.
+
+## Releases
+
+Maintainers cut releases. The flow:
+
+1. Bump `version` in `Cargo.toml` (the single source of truth — `pyproject.toml` reads it via maturin, and `package.json` is overwritten from the git tag).
+2. Merge, then push a tag `v`.
+3. The tag triggers the release workflows: PyPI (maturin), per-target zips attached to the GitHub Release, then the npm publish.
+
+Contributors do not need to bump the version in feature PRs.
+
+## License
+
+By contributing, you agree that your contributions are licensed under the [GNU LGPL v2.1](LICENSE), the same license as the project.
+
+## Code of conduct
+
+This project follows the [Contributor Covenant](CODE_OF_CONDUCT.md). By participating, you agree to uphold it.
diff --git a/PRD_DEPS.md b/PRD_DEPS.md
new file mode 100644
index 0000000..80f0ec1
--- /dev/null
+++ b/PRD_DEPS.md
@@ -0,0 +1,1819 @@
+# PRD: Corgea Dependency Inventory & Supply Chain Policy
+
+**Product area:** Corgea CLI / SCA / Dependency Scanning
+**Working name:** Corgea Dependency Inventory
+**Proposed CLI namespace:** `corgea deps`
+**Status:** Draft PRD
+**Primary users:** Developers, AppSec engineers, platform engineers, compliance/security teams
+**Core thesis:** Corgea should not only report vulnerable dependencies. It should explain **what dependency exists, why it exists, whether it is reproducible, whether it violates policy, whether it is reachable, and what the smallest safe fix is**.
+
+This PRD assumes Corgea already supports CLI scanning, exported scan results in `html`, `json`, and `SARIF`, and SCA/dependency-scanning capabilities with reachability and dead-package analysis. The CLI docs currently describe `--out-format` and `--out-file` support for scan reports, while Corgea’s dependency-scanning product page describes ecosystem coverage, reachability, dead-package analysis, and upgrade prioritization. ([Corgea Documentation][1])
+
+---
+
+## 1. 
Summary
+
+We want to build a dependency inventory and supply-chain policy tool on top of the existing Corgea CLI.
+
+The tool scans package manifests and lockfiles, builds a normalized dependency graph, identifies dependency hygiene issues, detects reproducibility gaps, flags vulnerable or policy-violating dependencies, and explains exactly how each package entered the project.
+
+The initial user-facing product should live under:
+
+```bash
+corgea deps
+```
+
+The most important commands:
+
+```bash
+corgea deps scan
+corgea deps graph
+corgea deps explain
+corgea deps diff --base origin/main
+corgea deps sbom --format cyclonedx
+corgea deps policy init
+corgea deps fix
+```
+
+The MVP should focus on:
+
+1. Detecting manifests and lockfiles.
+2. Building dependency graphs.
+3. Separating declared intent from resolved reality.
+4. Flagging non-reproducible installs.
+5. Detecting direct unpinned/broad/mutable dependencies.
+6. Identifying lockfile drift.
+7. Explaining dependency paths.
+8. Producing JSON/SARIF/SBOM output.
+9. Uploading inventory snapshots to Corgea for organization-level visibility.
+
+---
+
+## 2. Problem
+
+Modern applications depend on large dependency trees. Security teams often know that a vulnerable package exists somewhere, but they struggle to answer basic questions:
+
+```text
+Where is this dependency used?
+Why is it present?
+Is it direct or transitive?
+Is the installed version reproducible?
+Is the lockfile stale?
+Is the vulnerable code reachable?
+Which team owns it?
+Did this PR introduce it?
+What is the smallest safe fix?
+```
+
+Existing dependency scanners often over-focus on CVEs and under-focus on dependency governance.
This creates alert fatigue and misses a broader category of supply-chain risk:
+
+```text
+missing lockfiles
+unpinned direct dependencies
+wildcard versions
+mutable Git dependencies
+URL/tarball dependencies without checksums
+stale lockfiles
+dependency drift
+unknown registries
+duplicate vulnerable packages
+license violations
+abandoned packages
+dev dependencies leaking into production
+```
+
+The user’s original concern is valid: unpinned dependencies can create hidden risk. But the tool must be precise. A nested dependency may declare a broad range, while the project lockfile still resolves it to a concrete version. That is not the same as an actually non-deterministic install.
+
+The product should distinguish:
+
+```text
+unpinned
+broad
+mutable
+unresolved
+resolved
+locked
+stale
+vulnerable
+reachable
+non-reproducible
+policy-violating
+```
+
+That distinction is critical for developer trust.
+
+---
+
+## 3. Goals
+
+### Product goals
+
+1. Create an accurate dependency inventory for every scanned project.
+2. Build direct and transitive dependency graphs from manifests and lockfiles.
+3. Detect missing, stale, or incomplete lockfiles.
+4. Flag direct dependencies that are unpinned, overly broad, mutable, or unsafe.
+5. Explain dependency paths in a way developers can act on.
+6. Detect new dependency risk introduced by pull requests.
+7. Support policy-as-code for dependency governance.
+8. Export dependency findings in developer- and CI-friendly formats.
+9. Upload dependency graph snapshots to Corgea for organization-wide visibility.
+10. Integrate with Corgea’s existing SCA, reachability, dead-package, and remediation workflows.
+
+### User goals
+
+Developers should be able to answer:
+
+```text
+What changed in this PR?
+What dependency introduced this vulnerable package?
+Is this finding real or just a transitive declaration?
+How do I fix it?
+Will this block my build?
+```
+
+Security teams should be able to answer:
+
+```text
+Which repos have missing lockfiles?
+Where do we use this package?
+Which services are affected by this CVE?
+Which dependency risks were introduced this week?
+Which projects violate dependency policy?
+Which teams own the riskiest graphs?
+```
+
+Compliance teams should be able to answer:
+
+```text
+Can we produce an SBOM for this release?
+Which packages existed at this release commit?
+Are there license violations?
+Can we prove we reviewed/accepted exceptions?
+```
+
+---
+
+## 4. Non-goals
+
+For the MVP, this should **not** try to solve every supply-chain problem.
+
+Out of scope for v1:
+
+1. Full package malware detection.
+2. Runtime production agent.
+3. Container image dependency inventory.
+4. Binary/package artifact reverse engineering.
+5. Fully automated major-version upgrades.
+6. AI-generated dependency replacement recommendations.
+7. Enforcing org-wide policy across every language from day one.
+8. Perfect reachability analysis for every ecosystem.
+9. Dependency graph visualization in the terminal beyond simple tree/path output.
+10. Complete support for every package manager edge case.
+
+The MVP should be accurate, narrow, and trusted.
+
+---
+
+## 5. Personas
+
+### 5.1 Developer
+
+Wants fast feedback during local development and pull requests.
+
+Primary needs:
+
+```text
+What did I introduce?
+Will CI fail?
+How do I fix it?
+Is this really my dependency?
+```
+
+### 5.2 AppSec engineer
+
+Owns dependency risk across many repositories.
+
+Primary needs:
+
+```text
+Where is package X?
+Which findings are reachable?
+Which findings are new?
+Which services are not reproducible?
+Which teams need action?
+```
+
+### 5.3 Platform engineer
+
+Maintains CI/CD, package manager conventions, and build reproducibility.
+
+Primary needs:
+
+```text
+Do repos have lockfiles?
+Are package manager versions pinned?
+Are installs deterministic?
+Are private registries enforced?
+```
+
+### 5.4 Compliance / audit user
+
+Needs evidence for releases, audits, and customer security reviews.
+
+Primary needs:
+
+```text
+Generate SBOM.
+Show historical inventory.
+Show license policy.
+Show exceptions.
+Show remediation status.
+```
+
+---
+
+## 6. User experience
+
+## 6.1 Local scan
+
+Command:
+
+```bash
+corgea deps scan
+```
+
+Example output:
+
+```text
+Corgea dependency inventory
+
+Detected:
+  package.json       npm manifest
+  package-lock.json  npm lockfile
+  requirements.txt   pip requirements
+  constraints.txt    pip constraints
+
+Inventory:
+  182 packages
+  24 direct
+  158 transitive
+  129 production
+  53 development
+
+Policy findings:
+  2 high
+  5 medium
+  11 low
+
+High findings:
+  DEP001 Missing lockfile for services/api/requirements.txt
+    Install may resolve different transitive versions over time.
+    Fix: generate a lockfile or compiled constraints file.
+
+  DEP005 Mutable Git dependency
+    internal-utils @ git+ssh://git.example.com/internal-utils.git@main
+    Fix: pin to a commit SHA or immutable release tag.
+
+Next:
+  corgea deps explain internal-utils
+  corgea deps diff --base origin/main
+  corgea deps fix --interactive
+```
+
+Recommendation: keep local output concise. Developers should see the highest-impact issues first, then commands for deeper investigation.
+
+---
+
+## 6.2 Explain a dependency
+
+Command:
+
+```bash
+corgea deps explain qs
+```
+
+Example output:
+
+```text
+qs@6.11.0
+
+Ecosystem:
+  npm
+
+Scope:
+  production
+
+Type:
+  transitive
+
+Depth:
+  2
+
+Introduced by:
+  root -> express@4.18.2 -> qs@6.11.0
+
+Declared by parent:
+  express@4.18.2 declares qs: "6.11.0"
+
+Resolved by:
+  package-lock.json
+
+Status:
+  locked
+  reproducible
+  no known reachable vulnerability
+  no policy violation
+```
+
+This should become one of the signature workflows.
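One way to picture the path lookup behind `corgea deps explain` is a breadth-first search over the resolved graph. The sketch below is illustrative only: the adjacency-map shape, the `explain_path` helper, and the sample graph are assumptions made for this example, not part of the Corgea CLI.

```python
from collections import deque


def explain_path(graph, root, target):
    """Return the shortest dependency path from root to target, or None.

    graph is a hypothetical adjacency map: package id -> list of package ids
    it depends on, as resolved by the lockfile.
    """
    queue = deque([[root]])
    seen = {root}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for dep in graph.get(node, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None


# Hypothetical resolved graph, loosely modeled on the qs example.
graph = {
    "root": ["express@4.18.2"],
    "express@4.18.2": ["body-parser@1.20.1", "qs@6.11.0"],
    "body-parser@1.20.1": ["qs@6.11.0"],
}

path = explain_path(graph, "root", "qs@6.11.0")
print(" -> ".join(path))        # the "Introduced by" line
print("depth:", len(path) - 1)  # number of edges from the root
```

Breadth-first search returns the shortest path first, which matches the "Depth" field in the example; a fuller implementation would likely report every path, since a package can be pulled in by several parents.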
+
+The best dependency tools answer: **“Why is this here?”**
+
+---
+
+## 6.3 Pull request diff
+
+Command:
+
+```bash
+corgea deps diff --base origin/main
+```
+
+Example output:
+
+```text
+Dependency diff against origin/main
+
+Added:
+  + npm:axios@1.8.2      direct production
+  + npm:form-data@4.0.1  transitive production via axios
+
+Changed:
+  ~ npm:lodash 4.17.20 -> 4.17.21
+
+Removed:
+  - npm:request@2.88.2
+
+New policy findings:
+  MED  DEP003 axios declared as "^1.8.0"
+  LOW  DEP014 3 versions of debug now present
+
+New vulnerability findings:
+  none
+```
+
+Recommendation: CI should primarily block on **new risk**, not inherited historical backlog.
+
+---
+
+## 6.4 CI mode
+
+Command:
+
+```bash
+corgea deps scan --changed --fail-on high --out-format sarif --out-file deps.sarif
+```
+
+Expected behavior:
+
+```text
+pass if no new blocking findings
+fail if new high/critical dependency policy violations are introduced
+emit SARIF for code scanning integrations
+emit JSON for Corgea ingestion and automation
+```
+
+Corgea’s existing CLI already supports scan export formats including JSON, HTML, and SARIF, so this feature should reuse that output model where possible. ([Corgea Documentation][1])
+
+---
+
+## 6.5 SBOM export
+
+Command:
+
+```bash
+corgea deps sbom --format cyclonedx --out sbom.json
+```
+
+Expected behavior:
+
+```text
+generate a dependency inventory for the current repo/commit/release
+include direct and transitive components
+include dependency relationships
+include package versions, package URLs, licenses, and vulnerability metadata where available
+```
+
+This is important for release workflows, customer security questionnaires, and audit readiness.
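As a rough sketch of what the export could contain, the snippet below builds a minimal CycloneDX-style JSON document from a list of resolved packages. The `build_sbom` helper and the input shape are assumptions for illustration; only the top-level field names (`bomFormat`, `specVersion`, `components`, `dependencies`) follow the CycloneDX JSON format, and a real export would carry more metadata such as licenses and vulnerabilities.

```python
import json


def build_sbom(packages, edges):
    """Build a minimal CycloneDX-style dict.

    packages: list of dicts with hypothetical keys name, version, ecosystem.
    edges: dict mapping a package purl to the purls it depends on.
    """
    components = []
    for pkg in packages:
        purl = f"pkg:{pkg['ecosystem']}/{pkg['name']}@{pkg['version']}"
        components.append({
            "type": "library",
            "bom-ref": purl,
            "name": pkg["name"],
            "version": pkg["version"],
            "purl": purl,
        })
    # One entry per package that has outgoing dependency edges.
    dependencies = [
        {"ref": ref, "dependsOn": sorted(deps)}
        for ref, deps in sorted(edges.items())
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": components,
        "dependencies": dependencies,
    }


# Hypothetical inputs mirroring the axios example from the PR diff section.
packages = [
    {"name": "axios", "version": "1.8.2", "ecosystem": "npm"},
    {"name": "form-data", "version": "4.0.1", "ecosystem": "npm"},
]
edges = {"pkg:npm/axios@1.8.2": ["pkg:npm/form-data@4.0.1"]}

sbom = build_sbom(packages, edges)
print(json.dumps(sbom, indent=2))
```

Using the package URL as the `bom-ref` keeps components and dependency edges joined by a single identifier, which is one simple option among several.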
+
+---
+
+## 6.6 Policy initialization
+
+Command:
+
+```bash
+corgea deps policy init
+```
+
+Example generated file:
+
+```yaml
+dependency_policy:
+  require_lockfile: true
+  fail_on_missing_lockfile: true
+  fail_on_stale_lockfile: true
+
+  direct_dependencies:
+    fail_on_wildcard: true
+    fail_on_latest: true
+    warn_on_semver_range: true
+    allow_exact_versions: true
+
+  transitive_dependencies:
+    allow_ranges_if_lockfile_resolves: true
+    fail_if_unresolved: true
+
+  sources:
+    fail_on_mutable_git_refs: true
+    fail_on_url_without_checksum: true
+    allowed_registries:
+      npm:
+        - https://registry.npmjs.org/
+      pypi:
+        - https://pypi.org/simple/
+
+  vulnerabilities:
+    fail_on_critical_reachable: true
+    fail_on_high_reachable: true
+    warn_on_unreachable: true
+
+  licenses:
+    blocked:
+      - AGPL-3.0
+      - GPL-3.0
+
+  ci:
+    fail_on_new_findings_only: true
+    severity_threshold: high
+```
+
+---
+
+## 7. Core product behavior
+
+## 7.1 Manifest vs lockfile distinction
+
+The tool must separately model:
+
+### Declared intent
+
+What the manifest allows:
+
+```json
+"axios": "^1.8.0"
+```
+
+### Resolved reality
+
+What the lockfile installed:
+
+```json
+"axios": "1.8.2"
+```
+
+### Effective risk
+
+What this means:
+
+```text
+Manifest uses range.
+Lockfile resolves exact version.
+Install is reproducible as long as lockfile is committed and honored.
+Policy may warn, but should not treat this as equivalent to missing lockfile.
+```
+
+This is the most important correctness requirement.
+
+Bad finding:
+
+```text
+axios is unpinned and vulnerable.
+```
+
+Better finding:
+
+```text
+axios uses a semver range in package.json, but package-lock.json resolves it to 1.8.2.
+Policy warning only. Build remains reproducible.
+```
+
+---
+
+## 7.2 Direct vs transitive classification
+
+Each dependency must be classified as:
+
+```text
+direct production
+direct development
+direct optional
+direct peer
+transitive production
+transitive development
+transitive optional
+transitive peer
+```
+
+The scanner should preserve dependency paths, not just flat package lists.
+
+Example:
+
+```text
+root -> express@4.18.2 -> body-parser@1.20.1 -> qs@6.11.0
+```
+
+This path should be available in CLI, JSON, SARIF metadata, and Corgea UI.
+
+---
+
+## 7.3 Package source classification
+
+Every package node should have a source type:
+
+```text
+registry
+private registry
+git commit
+git tag
+git branch
+git ref
+local path
+remote tarball
+URL
+workspace
+vendored
+unknown
+```
+
+High-risk source types:
+
+```text
+mutable git branch
+remote URL without checksum
+local path in release artifact
+unapproved registry
+unknown registry
+package source mismatch
+```
+
+---
+
+## 7.4 Lockfile health
+
+The scanner should detect:
+
+```text
+missing lockfile
+stale lockfile
+lockfile not committed
+manifest-lockfile mismatch
+lockfile missing integrity hashes
+multiple conflicting lockfiles
+package manager mismatch
+unsupported lockfile version
+workspace lockfile not covering all manifests
+```
+
+Example finding:
+
+```text
+DEP002 Stale lockfile
+
+package.json was modified after package-lock.json.
+The manifest declares axios@^1.8.0, but package-lock.json does not contain axios.
+
+Fix:
+  npm install
+  commit updated package-lock.json
+```
+
+---
+
+## 7.5 Vulnerability and reachability enrichment
+
+The dependency graph should be enriched with:
+
+```text
+known vulnerability IDs
+affected version range
+fixed version
+severity
+EPSS / exploitability signal if available
+reachable / unreachable / unknown
+dead package status
+dependency path
+recommended fix
+```
+
+Corgea already markets dependency scanning with AI reachability, dead-package analysis, function-level reachability, and upgrade prioritization; this feature should make those signals visible inside the inventory and policy workflows. ([Corgea][2])
+
+---
+
+## 8. Requirements
+
+## 8.1 Functional requirements
+
+### FR1: Detect dependency files
+
+The CLI must recursively detect supported dependency files.
+
+MVP file types:
+
+```text
+JavaScript / TypeScript:
+  package.json
+  package-lock.json
+  yarn.lock
+  pnpm-lock.yaml
+
+Python:
+  requirements.txt
+  constraints.txt
+  pyproject.toml
+  poetry.lock
+  uv.lock
+
+Go:
+  go.mod
+  go.sum
+
+Rust:
+  Cargo.toml
+  Cargo.lock
+
+Ruby:
+  Gemfile
+  Gemfile.lock
+
+Java:
+  pom.xml
+  build.gradle
+  gradle.lockfile
+```
+
+Nice-to-have later:
+
+```text
+composer.json / composer.lock
+packages.lock.json
+Pipfile.lock
+bun.lock
+conda-lock.yml
+mix.lock
+rebar.lock
+```
+
+---
+
+### FR2: Build dependency graph
+
+The CLI must build a graph with:
+
+```text
+nodes = packages
+edges = dependency relationships
+root nodes = project manifests/workspaces
+metadata = version, scope, source, file, line where possible
+```
+
+Each node should include:
+
+```json
+{
+  "name": "axios",
+  "ecosystem": "npm",
+  "version": "1.8.2",
+  "purl": "pkg:npm/axios@1.8.2",
+  "direct": true,
+  "scope": "production",
+  "source_type": "registry",
+  "manifest_file": "package.json",
+  "lockfile": "package-lock.json"
+}
+```
+
+---
+
+### FR3: Identify direct unpinned dependencies
+
+The scanner must flag direct dependencies using:
+
+```text
+*
+latest
+x
+>=
+>
+unbounded ranges
+bare names
+mutable refs
+branch refs
+URL dependencies without checksum
+```
+
+Examples:
+
+```text
+npm:
+  "lodash": "*"
+  "axios": "latest"
+  "react": ">=18"
+  "express": "^4.18.0"
+
+Python:
+  requests
+  requests>=2.31.0
+  package @ git+https://example.com/repo.git@main
+```
+
+Important: severity depends on policy and lockfile context.
+
+---
+
+### FR4: Classify transitive ranges accurately
+
+The scanner must not blindly flag every transitive package declaration as a blocking issue.
+
+Classification:
+
+```text
+transitive range + resolved by lockfile = informational or no finding
+transitive range + no lockfile = warning/high depending on deployability
+transitive vulnerable resolved version = vulnerability finding
+transitive package from mutable source = policy finding
+```
+
+---
+
+### FR5: Detect lockfile drift
+
+The scanner must detect when:
+
+```text
+manifest has changed but lockfile has not
+manifest dependency is missing from lockfile
+lockfile has package no longer declared or reachable
+lockfile package manager version is incompatible
+workspace manifest is not represented in root lockfile
+```
+
+---
+
+### FR6: Explain dependency path
+
+Users must be able to run:
+
+```bash
+corgea deps explain
+```
+
+The output must show:
+
+```text
+package identity
+direct/transitive
+scope
+dependency path
+parent package
+declared constraint
+resolved version
+source file
+lockfile entry
+policy status
+vulnerability status
+reachability status
+fix recommendation
+```
+
+---
+
+### FR7: Generate dependency diff
+
+Users must be able to compare dependency graphs:
+
+```bash
+corgea deps diff --base origin/main
+corgea deps diff --base v1.2.0 --head v1.3.0
+corgea deps diff --previous-scan
+```
+
+The diff must show:
+
+```text
+added packages
+removed packages
+changed versions
+new direct dependencies
+new transitive dependencies
+new vulnerabilities
+resolved vulnerabilities
+new policy violations
+license changes
+source/registry changes
+```
+
+---
+
+### FR8: Support policy-as-code
+
+The tool must read policy from:
+
+```text
+.corgea/deps.yml
+.corgea.yml
+organization default policy from Corgea platform
+```
+
+Precedence:
+
+```text
+CLI flags override repo policy
+repo policy overrides org default
+org default overrides built-in default
+```
+
+---
+
+### FR9: Emit machine-readable output
+
+The tool must support:
+
+```bash
+--out-format json
+--out-format sarif
+--out-format html
+--out-format table
+--out-file
+```
+
+This aligns with Corgea’s existing CLI output model. ([Corgea Documentation][1])
+
+---
+
+### FR10: Upload inventory snapshot
+
+The tool must support:
+
+```bash
+corgea deps scan --upload
+```
+
+Uploaded snapshot should include:
+
+```text
+repo
+branch
+commit SHA
+scan timestamp
+package files
+manifest hashes
+lockfile hashes
+graph hash
+dependency nodes
+dependency edges
+policy findings
+vulnerability findings
+license findings
+SBOM artifact if generated
+```
+
+---
+
+### FR11: Generate SBOM
+
+The tool should support:
+
+```bash
+corgea deps sbom --format cyclonedx
+corgea deps sbom --format spdx
+```
+
+MVP can start with one format. Recommended default: CycloneDX.
+
+---
+
+### FR12: Support suppressions and exceptions
+
+Policy findings must support suppressions with:
+
+```text
+finding ID
+package
+reason
+owner
+expiration date
+scope
+approval metadata
+```
+
+Example:
+
+```yaml
+exceptions:
+  - id: DEP003
+    package: npm:axios
+    reason: "Application uses lockfile; exact manifest pinning not required."
+    owner: "platform-security"
+    expires: "2026-08-01"
+```
+
+Exceptions without expiration should be discouraged.
+
+---
+
+## 9. 
Finding taxonomy
+
+Recommended initial finding codes:
+
+| Code   | Finding                                    |   Default severity |
+| ------ | ------------------------------------------ | -----------------: |
+| DEP001 | Missing lockfile                           |               High |
+| DEP002 | Stale lockfile                             |               High |
+| DEP003 | Direct dependency uses broad range         |             Medium |
+| DEP004 | Wildcard or `latest` dependency            |               High |
+| DEP005 | Mutable Git branch dependency              |               High |
+| DEP006 | URL/tarball dependency without checksum    |               High |
+| DEP007 | Transitive dependency unresolved           |             Medium |
+| DEP008 | Lockfile integrity hash missing            |             Medium |
+| DEP009 | Package manager version not pinned         |                Low |
+| DEP010 | Vulnerable resolved package                | Severity from vuln |
+| DEP011 | Declared range allows vulnerable versions  |             Medium |
+| DEP012 | Deprecated package                         |             Medium |
+| DEP013 | Abandoned package                          |             Medium |
+| DEP014 | Duplicate versions of same package         |                Low |
+| DEP015 | Dev dependency included in production path |             Medium |
+| DEP016 | License policy violation                   |               High |
+| DEP017 | Unapproved registry                        |               High |
+| DEP018 | Suspicious package source change           |               High |
+| DEP019 | Manifest-lockfile package manager mismatch |             Medium |
+| DEP020 | Dependency exception expired               |               High |
+| DEP021 | Mutable artifact version (Maven SNAPSHOT)  |               High |
+
+---
+
+## 10. 
MVP scope
+
+### MVP command set
+
+```bash
+corgea deps scan
+corgea deps explain
+corgea deps diff --base
+corgea deps policy init
+corgea deps sbom
+```
+
+### MVP ecosystems
+
+Start with the ecosystems most likely to create immediate customer value:
+
+```text
+npm / yarn / pnpm
+Python requirements / Poetry / uv
+Go modules
+Maven / Gradle
+```
+
+### MVP findings
+
+Ship with these first:
+
+```text
+DEP001 missing lockfile
+DEP002 stale lockfile
+DEP003 direct dependency uses broad range
+DEP004 wildcard/latest dependency
+DEP005 mutable Git branch dependency
+DEP006 URL dependency without checksum
+DEP008 missing integrity hash
+DEP010 vulnerable resolved package
+DEP016 license violation
+DEP017 unapproved registry
+DEP021 mutable artifact version (Maven SNAPSHOT)
+```
+
+### MVP outputs
+
+```text
+terminal table
+JSON
+SARIF
+CycloneDX SBOM
+Corgea upload
+```
+
+### MVP platform views
+
+```text
+Dependency inventory by repo
+Dependency search across repos
+Lockfile coverage
+Policy violations by severity
+Vulnerable dependencies by reachability
+Dependency diff per scan
+SBOM download
+```
+
+---
+
+## 11. Detailed use cases
+
+1. **PR dependency review**
+   A developer opens a PR that adds `axios`. Corgea shows the direct dependency and all new transitive packages.
+
+2. **Missing lockfile detection**
+   A Python service has `requirements.txt` but no compiled lockfile or constraints. Corgea flags the build as non-reproducible.
+
+3. **Stale lockfile detection**
+   A developer updates `package.json` but forgets to regenerate `package-lock.json`. Corgea blocks CI.
+
+4. **Unpinned direct dependency detection**
+   A direct dependency uses `latest`, `*`, `>=`, or a bare Python package name. Corgea flags it.
+
+5. **Mutable Git dependency detection**
+   A package points to `@main` instead of a commit SHA. Corgea flags it as mutable.
+
+6. **Transitive dependency explanation**
+   Security sees a vulnerable package and runs `corgea deps explain`.
Corgea shows the parent dependency path.
+
+7. **CVE incident response**
+   A new vulnerability is disclosed. AppSec searches all inventories to find affected repos, teams, versions, and reachability.
+
+8. **SBOM generation**
+   A release manager generates an SBOM for a customer security review.
+
+9. **License policy enforcement**
+   A PR introduces an AGPL dependency. Corgea blocks the build and explains the license policy.
+
+10. **Duplicate dependency reduction**
+    Corgea detects five versions of the same npm package, helping reduce attack surface and bundle size.
+
+11. **Dead dependency cleanup**
+    Corgea identifies packages present in the lockfile but not used by application code.
+
+12. **Private registry enforcement**
+    Corgea flags a dependency resolved from a public registry when policy requires internal registry resolution.
+
+13. **Dependency drift monitoring**
+    Corgea compares branches, releases, or environments and detects inconsistent dependency resolutions.
+
+14. **Monorepo inventory**
+    Corgea maps every manifest and lockfile to its workspace/service owner.
+
+15. **Audit readiness**
+    Corgea stores historical dependency graphs per release commit.
+
+16. **Exception governance**
+    A team suppresses a finding with owner, reason, and expiration. Corgea reopens it after expiry.
+
+17. **Upgrade planning**
+    Corgea groups vulnerable dependencies by parent package and recommends the smallest upgrade path.
+
+18. **Dev dependency production leakage**
+    Corgea detects that a dev-only package is included in a production deployment path.
+
+19. **Package source change detection**
+    A package changes from official registry source to a Git URL or tarball. Corgea flags it.
+
+20. **Organization policy enforcement**
+    Security defines a central policy that all deployable apps must have lockfiles and approved registries.
+
+---
+
+## 12. 
Future features + +### 12.1 Dependency graph diff visualization + +A web UI that shows graph changes between commits, releases, or scans. + +Useful views: + +```text +new direct dependencies +new transitive dependencies +new vulnerable paths +removed packages +changed package sources +license changes +registry changes +``` + +--- + +### 12.2 Automated remediation PRs + +Corgea could generate PRs that: + +```text +regenerate lockfiles +pin exact versions +add constraints +upgrade vulnerable packages +remove unused packages +replace deprecated packages +switch mutable Git refs to commit SHAs +``` + +--- + +### 12.3 Package health scoring + +Score each dependency using: + +```text +release recency +maintenance activity +deprecated status +known vulnerabilities +license risk +package age +download/popularity signal +maintainer changes +registry/source changes +typosquatting risk +malware history +``` + +--- + +### 12.4 Organization-wide dependency search + +Search examples: + +```text +show all repos using lodash +show all services using urllib3 < 2.0 +show all packages from unapproved registries +show all projects without lockfiles +show all AGPL dependencies +show all reachable critical vulnerabilities +``` + +--- + +### 12.5 Runtime correlation + +Compare source dependency inventory with what is actually deployed. + +Questions answered: + +```text +Is this lockfile dependency present in the container? +Is this vulnerable package actually shipped? +Are dev dependencies leaking into production? +Does production contain packages not represented in source? +``` + +--- + +### 12.6 VEX support + +Allow teams to record exploitability status: + +```text +affected +not affected +fixed +under investigation +``` + +This would reduce noise for vulnerabilities that are present but not exploitable in context. + +--- + +### 12.7 Dependency ownership mapping + +Automatically assign dependencies and findings to service owners. 
+ +Inputs: + +```text +CODEOWNERS +repo ownership metadata +monorepo workspace ownership +Corgea project ownership +team mappings +``` + +--- + +### 12.8 Risk budget + +Allow teams to define a dependency risk budget. + +Example: + +```text +maximum 0 critical reachable vulns +maximum 3 high reachable vulns +maximum 10 medium policy warnings +no missing lockfiles +no unapproved registries +``` + +--- + +### 12.9 Package manager version governance + +Flag projects that do not pin package manager versions. + +Examples: + +```text +npm version not pinned +pnpm version not pinned +Poetry version not pinned +uv version not pinned +Maven wrapper missing +Gradle wrapper missing +``` + +This matters because different package manager versions can resolve or install dependencies differently. + +--- + +### 12.10 Suspicious dependency behavior + +Future supply-chain checks: + +```text +dependency newly added with install scripts +package maintainer changed recently +package name similar to popular package +package source changed registries +package has very low age or low adoption +package publishes many versions rapidly +package contains obfuscated code +``` + +--- + +## 13. 
Data model + +### Dependency node + +```json +{ + "id": "pkg:npm/axios@1.8.2", + "name": "axios", + "ecosystem": "npm", + "version": "1.8.2", + "purl": "pkg:npm/axios@1.8.2", + "scope": "production", + "direct": true, + "depth": 1, + "source_type": "registry", + "registry_url": "https://registry.npmjs.org/", + "license": "MIT", + "deprecated": false, + "reachable": "unknown", + "dead_package": false +} +``` + +### Dependency edge + +```json +{ + "from": "root", + "to": "pkg:npm/axios@1.8.2", + "declared_constraint": "^1.8.0", + "resolved_version": "1.8.2", + "relationship": "direct", + "scope": "production", + "source_file": "package.json", + "lockfile": "package-lock.json" +} +``` + +### Finding + +```json +{ + "id": "DEP003", + "severity": "medium", + "title": "Direct dependency uses broad range", + "package": "pkg:npm/axios@1.8.2", + "source_file": "package.json", + "declared_constraint": "^1.8.0", + "resolved_version": "1.8.2", + "status": "open", + "recommendation": "Pin axios to 1.8.2 or allow this range by policy because the lockfile resolves it.", + "introduced_in": "current_scan", + "paths": [ + ["root", "pkg:npm/axios@1.8.2"] + ] +} +``` + +### Inventory snapshot + +```json +{ + "repo": "api-service", + "branch": "main", + "commit": "abc123", + "scan_timestamp": "2026-05-20T10:00:00Z", + "manifest_hashes": { + "package.json": "..." + }, + "lockfile_hashes": { + "package-lock.json": "..." + }, + "graph_hash": "...", + "nodes": [], + "edges": [], + "findings": [] +} +``` + +--- + +## 14. CLI design + +### Primary commands + +```bash +corgea deps scan +``` + +Runs dependency inventory and policy scan. + +```bash +corgea deps explain +``` + +Explains why a package exists. + +```bash +corgea deps graph +``` + +Prints dependency tree or exports graph. + +```bash +corgea deps diff --base +``` + +Compares dependency graph against another ref. + +```bash +corgea deps sbom --format cyclonedx +``` + +Generates SBOM. 
+ +```bash +corgea deps policy init +``` + +Creates starter policy file. + +```bash +corgea deps fix +``` + +Suggests or applies safe remediations. + +--- + +### Useful flags + +```bash +--ecosystem npm +--ecosystem pypi +--prod-only +--include-dev +--changed +--fail-on high +--policy .corgea/deps.yml +--out-format json +--out-format sarif +--out-format html +--out-file deps-report.json +--upload +--explain-findings +--show-paths +--sbom +``` + +--- + +## 15. Platform UI recommendations + +The Corgea web experience should include: + +### 15.1 Dependency inventory page + +Columns: + +```text +package +ecosystem +version +direct/transitive +scope +repos affected +vulnerabilities +reachability +license +source +last seen +owner +``` + +### 15.2 Repo dependency posture page + +Cards: + +```text +total dependencies +direct dependencies +transitive dependencies +missing lockfiles +stale lockfiles +reachable critical vulns +license violations +unapproved registries +dead packages +``` + +### 15.3 Dependency detail page + +Show: + +```text +all repos using package +all versions in use +dependency paths +known vulnerabilities +fixed versions +licenses +source registries +reachability status +historical trend +``` + +### 15.4 PR view + +Show: + +```text +new packages +removed packages +version changes +new policy findings +new reachable vulnerabilities +new license issues +recommended action +``` + +### 15.5 Policy page + +Allow org admins to configure: + +```text +lockfile requirements +pinning requirements +allowed registries +blocked licenses +vulnerability thresholds +exception rules +CI failure behavior +``` + +--- + +## 16. 
Success metrics + +### Adoption metrics + +```text +number of repos scanned +number of active orgs using deps scan +percentage of scans uploaded +number of CI integrations +number of SBOMs generated +``` + +### Quality metrics + +```text +false positive rate +percentage of findings with dependency path +percentage of findings with recommended fix +percentage of findings with reachability status +scan success rate by ecosystem +``` + +### Security impact metrics + +```text +missing lockfiles reduced +stale lockfiles reduced +reachable critical vulnerabilities reduced +unapproved registry usage reduced +license violations prevented +mean time to remediate dependency findings +``` + +### Developer experience metrics + +```text +average scan time +average CI runtime overhead +percentage of PRs blocked +percentage of blocked PRs resolved without AppSec intervention +number of explain command uses +``` + +--- + +## 17. Launch plan + +### Phase 1: Alpha + +Audience: + +```text +internal users +design partners +small number of repos +``` + +Scope: + +```text +npm and Python +local CLI only +JSON output +basic policy findings +dependency explain +``` + +Exit criteria: + +```text +accurate dependency graph on representative repos +low false positive rate for lockfile and pinning findings +developer output is understandable +``` + +--- + +### Phase 2: Beta + +Audience: + +```text +selected customers +AppSec teams +CI users +``` + +Scope: + +```text +SARIF output +Corgea upload +policy-as-code +dependency diff +SBOM export +Go and Java support +``` + +Exit criteria: + +```text +CI integration works reliably +dependency diffs are trusted +platform inventory is useful +policy configuration is understandable +``` + +--- + +### Phase 3: GA + +Audience: + +```text +all Corgea customers +``` + +Scope: + +```text +multi-ecosystem support +dashboard +org-wide search +exceptions +reachability enrichment +license policy +remediation guidance +``` + +Exit criteria: + +```text +documented 
CLI +stable output schema +strong ecosystem coverage +low support burden +clear ROI for AppSec and developers +``` + +--- + +## 18. Risks and mitigations + +### Risk 1: Too many false positives + +Bad outcome: + +```text +Developers see every transitive semver range as a violation. +They ignore or disable the tool. +``` + +Mitigation: + +```text +Treat transitive ranges as risky only when unresolved, unlocked, vulnerable, mutable, or policy-relevant. +Default CI gating to new high-risk findings only. +``` + +--- + +### Risk 2: Ecosystem edge cases + +Bad outcome: + +```text +Parser fails on real-world lockfiles, monorepos, or workspaces. +``` + +Mitigation: + +```text +Start with fewer ecosystems. +Build strong test fixtures. +Use package-manager-native commands where needed. +Clearly label unsupported files. +``` + +--- + +### Risk 3: Slow CI scans + +Bad outcome: + +```text +Teams disable scanning because it slows builds. +``` + +Mitigation: + +```text +Cache parsed lockfiles. +Hash manifests and lockfiles. +Support changed-only mode. +Avoid network calls unless explicitly enabled. +``` + +--- + +### Risk 4: Confusing policy semantics + +Bad outcome: + +```text +Users do not understand why something failed. +``` + +Mitigation: + +```text +Every finding includes source file, reason, dependency path, policy rule, and exact remediation. +``` + +--- + +### Risk 5: Overlapping with existing SCA scanner + +Bad outcome: + +```text +Users cannot tell whether this is SCA, SBOM, policy, or vulnerability scanning. +``` + +Mitigation: + +```text +Position this as the graph/inventory/policy layer. +SCA vulnerabilities are one enrichment source, not the whole product. +``` + +--- + +## 19. Recommendation: default policy + +Default policy should be strict enough to create value but not so strict that every repo fails. 
+ +Recommended default: + +```yaml +dependency_policy: + require_lockfile: true + fail_on_missing_lockfile: true + fail_on_stale_lockfile: true + + direct_dependencies: + fail_on_wildcard: true + fail_on_latest: true + fail_on_mutable_sources: true + warn_on_semver_range: true + + transitive_dependencies: + allow_ranges_if_resolved_by_lockfile: true + fail_if_unresolved: true + + vulnerabilities: + fail_on_new_critical_reachable: true + fail_on_new_high_reachable: true + warn_on_unreachable: true + + licenses: + fail_on_blocked_license: true + + ci: + fail_on_new_findings_only: true +``` + +This avoids the biggest product mistake: blocking builds for harmless transitive declarations that are already locked. + +--- + +## 20. Open questions + +1. Should exact version pinning be recommended for all direct dependencies, or only for deployable applications? +2. Should libraries get a different default policy from applications? +3. Which package managers should be MVP versus beta? +4. Should Corgea invoke native package-manager commands, or parse lockfiles only? +5. How should the tool handle generated lockfiles that are intentionally not committed? +6. Should dependency health scoring be included in v1 or left for v2? +7. Should remediation PRs be part of beta or GA? +8. Which SBOM format should be default: CycloneDX or SPDX? +9. How should exceptions be approved: repo-only, org-level, or both? +10. Should policy support different thresholds for production, staging, development, and test dependencies? +11. Should reachability be required for CI gating, or used only as prioritization? +12. How should monorepo service ownership be inferred? + +--- + +## 21. Final recommendation + +Build this as **Corgea’s dependency graph and supply-chain policy layer**, not just a pinning checker. 
+ +The MVP should be: + +```text +corgea deps scan +corgea deps explain +corgea deps diff +corgea deps policy +corgea deps sbom +``` + +The product should center on five promises: + +```text +Inventory: what dependencies do we have? +Provenance: where did they come from? +Reproducibility: can this install drift? +Risk: is this vulnerable, reachable, or policy-violating? +Remediation: what is the smallest safe fix? +``` + +The strongest wedge is not “we find vulnerable dependencies.” Many tools do that. + +The stronger wedge is: + +```text +Corgea tells you exactly why a dependency exists, whether it can drift, whether it matters, and how to fix it without drowning developers in noise. +``` + +[1]: https://docs.corgea.app/cli?utm_source=chatgpt.com "CLI - Corgea Documentation" +[2]: https://corgea.com/products/dependency-scanning?utm_source=chatgpt.com "Dependency Scanning with AI Reachability Analysis" + diff --git a/PRD_DEPS_CONDENSED.md b/PRD_DEPS_CONDENSED.md new file mode 100644 index 0000000..dd5a016 --- /dev/null +++ b/PRD_DEPS_CONDENSED.md @@ -0,0 +1,113 @@ +# PRD: Corgea Dependency Inventory & Supply-Chain Policy + +**Area:** Corgea CLI · SCA / dependency scanning · **CLI namespace:** `corgea deps` · **Status:** Draft · **Users:** Developers, AppSec, platform, compliance + +## 1. Summary + +Corgea should do more than report vulnerable dependencies. It should explain what dependency exists, why it exists, whether it can drift, whether it violates policy, whether it is reachable, and what the smallest safe fix is. + +Build a dependency-inventory and supply-chain-policy layer on the existing Corgea CLI. It scans manifests and lockfiles, builds a normalized dependency graph, classifies hygiene and reproducibility issues, flags vulnerable and policy-violating packages, and traces how each package entered the project. Inventory snapshots upload to Corgea for org-wide visibility. + +The wedge is not "we find vulnerable dependencies." Many tools do that. 
The wedge is this: Corgea tells you why a dependency exists, whether it can drift, whether it matters, and how to fix it without drowning developers in noise. + +## 2. Problem + +Modern apps carry large dependency trees. Teams know a vulnerable package exists somewhere, but they cannot cheaply answer: where is it used, why is it present, direct or transitive, is the install reproducible, is the lockfile stale, is the code reachable, did this PR introduce it, what is the smallest fix. + +Existing scanners over-index on CVEs and under-index on dependency governance. They miss missing lockfiles, unpinned direct deps, wildcard versions, mutable Git refs, unchecksummed URLs, stale lockfiles, drift, unknown registries, license violations, and dev deps leaking to production. + +Precision makes or breaks this product. A transitive package may declare a broad range while the lockfile still resolves it to a concrete version. That install is reproducible, not non-deterministic. The product must distinguish *unpinned, broad, mutable, unresolved, resolved, locked, stale, vulnerable, reachable, non-reproducible,* and *policy-violating*. Conflating these destroys developer trust. + +## 3. Goals & non-goals + +**Goals.** Build an accurate inventory and direct/transitive graph per project. Detect missing, stale, and incomplete lockfiles. Flag unsafe direct deps (unpinned, broad, mutable). Explain dependency paths developers can act on. Detect new risk introduced by PRs. Support policy-as-code. Emit CI-friendly output (JSON, SARIF, SBOM). Upload snapshots. Reuse Corgea's existing SCA, reachability, dead-package, and remediation signals. + +**Non-goals (v1).** Malware detection, runtime agent, container-image inventory, artifact reverse engineering, automated major-version upgrades, AI replacement recommendations, day-one org-wide multi-language enforcement, perfect reachability everywhere, rich terminal graph visualization. The MVP must be accurate, narrow, and trusted. + +## 4. 
Solution + +Five promises: **Inventory** (what we have), **Provenance** (where it came from), **Reproducibility** (can it drift), **Risk** (vulnerable, reachable, or policy-violating), **Remediation** (smallest safe fix). + +Core commands: + +``` +corgea deps scan inventory + policy scan +corgea deps explain why a package exists (signature workflow) +corgea deps diff --base dependency changes vs a git ref +corgea deps sbom CycloneDX / SPDX export +corgea deps policy init starter policy file +corgea deps fix suggest / apply safe remediations +``` + +`explain` is the signature workflow. It shows identity, direct or transitive, scope, the full path (`root > express@4.18.2 > qs@6.11.0`), declared constraint against resolved version, source file, lockfile entry, policy and vuln and reachability status, and the fix. The best dependency tools answer one question: why is this here? + +Output reuses Corgea's existing CLI model (`--out-format json|sarif|html|table`, `--out-file`). CI mode runs `corgea deps scan --changed --fail-on high`. It blocks on new risk, not inherited backlog. + +## 5. Core correctness behavior + +Model three layers separately: + +- **Declared intent.** What the manifest allows (`"axios": "^1.8.0"`). +- **Resolved reality.** What the lockfile installed (`axios 1.8.2`). +- **Effective risk.** A range plus a committed lockfile is reproducible. Policy may warn, but it must never treat this as a missing lockfile. + +Bad finding: *"axios is unpinned and vulnerable."* Good finding: *"axios uses a semver range in package.json; package-lock.json resolves 1.8.2. Policy warning only. The build stays reproducible."* + +Each node carries ecosystem, version, purl, direct or transitive, scope (prod, dev, optional, peer), and source type (registry, private, git commit/tag/branch, local path, URL, workspace, vendored, unknown). The scanner preserves dependency paths instead of flattening them. 
Lockfile health detects missing, stale, uncommitted, manifest mismatch, missing integrity hashes, conflicting lockfiles, package-manager mismatch, and workspace gaps. + +## 6. MVP scope + +- **Commands:** `scan`, `explain`, `diff`, `policy init`, `sbom` +- **Ecosystems:** npm/yarn/pnpm, Python (requirements/Poetry/uv), Go modules, Maven/Gradle +- **Outputs:** terminal table, JSON, SARIF, CycloneDX SBOM, Corgea upload + +**Findings to ship first:** + +| Code | Finding | Severity | +|---|---|---| +| DEP001 | Missing lockfile | High | +| DEP002 | Stale lockfile | High | +| DEP003 | Direct dep uses broad range | Medium | +| DEP004 | Wildcard or `latest` dependency | High | +| DEP005 | Mutable Git branch dependency | High | +| DEP006 | URL/tarball dep without checksum | High | +| DEP008 | Lockfile integrity hash missing | Medium | +| DEP010 | Vulnerable resolved package | From vuln | +| DEP016 | License policy violation | High | +| DEP017 | Unapproved registry | High | +| DEP021 | Mutable artifact version (Maven SNAPSHOT) | High | + +The full taxonomy (DEP001 to DEP021) also covers deprecated and abandoned packages, duplicate versions, dev-in-prod leakage, source-change detection, and expired exceptions. + +**Default policy posture.** Require lockfiles. Fail on wildcard, `latest`, and mutable sources. Warn on semver ranges. Allow transitive ranges when the lockfile resolves them. Fail on new critical and high *reachable* vulnerabilities. Set `fail_on_new_findings_only: true`. This avoids the biggest product mistake: blocking builds for harmless, already-locked transitive declarations. + +## 7. Risks & mitigations + +**False positives.** Flagging every transitive range erodes trust. Mitigation: flag transitive ranges only when they are unresolved, unlocked, vulnerable, or mutable; gate CI on new findings only. + +**Ecosystem edge cases.** Parsers break on real lockfiles and monorepos. 
Mitigation: start with fewer ecosystems, build strong test fixtures, use package-manager-native commands where needed, label unsupported files. + +**Slow CI.** Builds get slower and teams disable scanning. Mitigation: hash and cache manifests and lockfiles, support changed-only mode, skip network calls unless opted in. + +**Overlap with existing SCA.** Users cannot tell what this product is. Mitigation: position it as the graph, inventory, and policy layer. SCA vulnerabilities are one enrichment source, not the product. + +## 8. Launch plan + +**Alpha.** Internal users and design partners. npm and Python, local CLI, JSON output, basic findings plus `explain`. Exit: accurate graphs, low false-positive rate, output developers understand. + +**Beta.** Selected customers, AppSec teams, CI users. SARIF, upload, policy-as-code, `diff`, SBOM, Go and Java. Exit: reliable CI integration, trusted diffs, useful platform inventory. + +**GA.** All customers. Multi-ecosystem, dashboard, org-wide search, exceptions, reachability enrichment, remediation guidance. Exit: stable output schema, broad coverage, low support burden, clear ROI. + +## 9. Open questions + +1. Exact pinning for all direct deps, or only deployable apps? Different default policy for libraries and applications? +2. Invoke native package-manager commands, or parse lockfiles only? +3. How should the tool handle lockfiles that teams intentionally do not commit? +4. SBOM default: CycloneDX or SPDX? +5. Exception approval: repo-only, org-level, or both? +6. Is reachability required for CI gating, or used only for prioritization? + +## 10. Success metrics + +Adoption: repos scanned, active orgs, percent of scans uploaded, CI integrations. Quality: false-positive rate, percent of findings with a dependency path and a recommended fix, scan success rate by ecosystem. Security impact: missing and stale lockfiles reduced, reachable critical vulns reduced, mean time to remediate dependency findings. 
Developer experience: average scan time, percent of PRs blocked, percent of blocked PRs resolved without AppSec intervention. diff --git a/PRD_DEPS_TESTING.md b/PRD_DEPS_TESTING.md new file mode 100644 index 0000000..9e8b014 --- /dev/null +++ b/PRD_DEPS_TESTING.md @@ -0,0 +1,2062 @@ +# PRD: Corgea Dependency Inventory — TDD Test Plan + +**Product area:** Corgea CLI / SCA / Dependency Scanning +**Companion to:** `PRD_DEPS.md` (the feature spec) +**Working name:** `corgea deps` test suite +**Status:** Draft PRD — revision 3 +**Primary readers:** Engineers implementing `corgea deps`, AppSec reviewers, CI owners +**Core thesis:** Every core behavior in `PRD_DEPS.md` ships behind a test that was **written first and observed failing**. The test suite *is* the executable specification. Implementation is "done" when the suite goes from red to green — not before. + +This document is a test-driven-development (TDD) plan. It defines a stub API to land first, fixture projects for **three ecosystems (Python, Node.js, Java)**, and the concrete failing tests that pin the MVP behavior described in `PRD_DEPS.md` §7–§9. + +> **Revision 2 changelog.** After a design review (recorded in §14): work is now sequenced as **vertical slices**, not one 52-test red batch (§9); the stub keeps **real constructors** and stubs only leaf behavior (§5); package identity is a typed **`PackageId`** (purl), not a name string (§5.2); tests that depend on an unresolved policy question assert only the **stable invariant** (§3.4); CLI integration tests are **hermetic** — isolated `HOME`, no token, no network (§8.11); one new dependency (`serde_yaml_ng`) is required for policy YAML (§4.3); a robustness / determinism / malformed-input slice is added (§6.8, §8.12). +> +> **Revision 3 changelog.** The seven §13 open questions are now **resolved decisions** (§13). 
Consequences: a new taxonomy code **DEP021 "Mutable artifact version"** for Maven `SNAPSHOT` (decision 1); unbounded `>=` / bare names are **DEP004 High** (decision 2); **DEP010 "vulnerable resolved package" stays in the MVP** behind a mocked vulnerability source — new **Slice 8** (§8.13, §9) — while DEP016/DEP017 and the Go graph remain deferred (decision 7). The two formerly decision-gated tests (§8.6) now assert exact codes. Test count: ~58 → ~64. + +--- + +## 1. Summary + +`PRD_DEPS.md` specifies a large feature: `corgea deps scan / explain / graph / diff / sbom / policy / fix`. None of it exists yet. This is the ideal condition for strict TDD. + +This plan does three things: + +1. **Defines a stub API** (`src/deps/`) — types and signatures. Constructors and pure helpers are *real*; only the leaf behaviors under test are `unimplemented!()`. Tests *compile* and *fail at the function they target*. +2. **Defines fixture projects** for Python, Node.js, and Java — real manifests and lockfiles in `tests/fixtures/`. +3. **Defines the failing tests** — unit tests against the internal API plus hermetic CLI integration tests against the built binary. + +The work is sequenced into **vertical slices** (§9). Each slice is one PR that adds that slice's red tests *and* the implementation that turns them green. The feature branch accumulates green; `main` stays green throughout. + +```text +Per slice: RED → commit the slice's tests; cargo test shows them failing + GREEN → implement until the slice's tests pass; nothing else regresses + REFACTOR → clean up with the now-green slice as a safety net +``` + +--- + +## 2. Scope — what "core new deps features" means here + +`PRD_DEPS.md` is broad. This test plan covers the **MVP slice** (`PRD_DEPS.md` §10) and only the MVP. 
+ +| PRD ref | Core feature | Covered here | +|---|---|---| +| FR1 | Detect manifests & lockfiles | §8.1 | +| FR3 | Classify direct unpinned dependencies | §8.2 | +| FR2 | Build dependency graph (nodes/edges, direct/transitive) | §8.3 | +| FR4 / §7.1 | Manifest-vs-lockfile correctness (do **not** flag locked transitive ranges) | §8.4 | +| FR5 / §7.4 | Lockfile health — missing / stale / missing integrity | §8.5 | +| §7.3 | Package source classification — mutable git / URL without checksum | §8.6 | +| FR8 + §9 | Policy evaluation & finding taxonomy (DEP001–DEP008) | §8.7 | +| §7.5 / DEP010 | Vulnerable resolved package — via a mocked vulnerability source | §8.13 | +| FR6 | `explain` a dependency path | §8.8 | +| FR7 | Dependency diff (graph-level) | §8.9 | +| FR9 / FR11 | Machine-readable output — JSON, SARIF, CycloneDX SBOM | §8.10 | +| §6.4 | CLI behavior — exit codes, `--fail-on`, `--out-file`, hermeticity | §8.11 | +| §18 R2 | Robustness — malformed input, determinism, monorepo | §8.12 | + +The three MVP ecosystems under full test: **npm** (Node.js), **PyPI** (Python), **Maven/Gradle** (Java). + +### 2.1 Scope — what is in, what is deferred (reconciled with `PRD_DEPS.md` §10) + +`PRD_DEPS.md` §10 lists DEP010, DEP016, DEP017 and Go in the MVP. Per §13 decision 7: + +- **DEP010 (vulnerable resolved package) is kept in the MVP.** It is the center of gravity of an SCA tool. Because it needs an external advisory source, it is tested behind a **mocked `VulnerabilitySource`** (§5.3, §8.13) — that proves finding construction, severity propagation, and dependency-path attribution offline and deterministically. The production advisory source is a separate, non-test concern. +- **DEP016 (license) and DEP017 (registry) are deferred.** They are config-heavy and secondary, and each needs its own data source. They reuse the same `FindingSource` trait seam (§5.3) and belong in a follow-up plan. 
+- **Go / Rust / Ruby** get *detection* smoke coverage only (§8.1); their graph building is Beta (`PRD_DEPS.md` §17 Phase 2). + +--- + +## 3. TDD methodology + +### 3.1 Why a stub API, and how narrow it is + +In Rust, a test referencing a nonexistent module fails to **compile**, and a single non-compiling test file blocks `cargo test` from running *any* test. That is a useless red state. + +The useful red state: tests **compile and run**, then **fail at the function under test**. We get that from a stub API — but the stub is *narrow*: + +- **Real**: every type, every constructor (`Policy::default`, `DependencyGraph::default`), and every pure helper (`PackageId::name`). These never panic. A test must never fail inside a constructor — that would obscure which behavior is missing. +- **`unimplemented!()`**: only the leaf behaviors a test directly targets — `classify_constraint`, `detect_dependency_files`, `scan`, `evaluate`, `from_yaml`, `explain`, `diff_graphs`, the `report::*` functions. + +```rust +pub fn scan(root: &Path, policy: &Policy) -> Result { + unimplemented!("deps::scan — PRD_DEPS_TESTING.md §8") +} +``` + +A `scan()` test fails with a panic *inside `scan`* — pointing straight at the missing behavior. It does not fail inside `Policy::default()`. + +### 3.2 The three states, per slice + +1. **RED** — the slice's tests land and fail at their target function. +2. **GREEN** — implement the minimum to pass them. No test is weakened or `#[ignore]`-d to "make it pass". Earlier slices stay green. +3. **REFACTOR** — restructure with the green suite as the net. + +### 3.3 Rules for every test + +- **Fails first, for the right reason.** Before implementation the failure is the `unimplemented!()` panic of the *targeted* function. The PR adding a test quotes its red `cargo test` output. 
+- **Pins behavior, not storage.** Tests assert on the public contract (`Inventory`, `Finding`, `DependencyGraph` queried through accessor methods) and fixture inputs — never on private fields or `Vec` ordering (ordering has its own determinism test, §8.12). +- **Positive and negative are paired.** Every "X produces DEPNNN" test is paired with "Y does not". A scanner that flags everything and one that flags nothing must each fail at least one test in every pair. +- **Deterministic & offline.** No network, no wall clock, no real git history. Fixtures are static files; staleness is content divergence, not file mtime (mtime is not preserved by `git`). +- **Hermetic.** Tests touch no real user state. CLI tests run with an isolated `HOME` (§8.11). +- **One behavior per test.** Names read as specifications: `npm_wildcard_direct_dep_is_dep004_high`. + +### 3.4 Decision-gated assertions + +All seven revision-2 open questions are now resolved (§13), so the two formerly decision-gated tests (`maven_snapshot_is_dep021_high`, `pypi_open_ended_range_is_dep004_high`) assert exact codes as of revision 3. The mechanism is retained for any future open question: + +Rule: where an unresolved policy question would change the outcome, the test asserts only the **stable invariant** — a finding exists / does not exist, and its broad severity class (High vs not-High) — and carries a `// DECISION-GATED: ` comment. When the question is resolved, the test is tightened in the same PR that records the decision. Tests that assert an exact code+severity must trace to an unambiguous `PRD_DEPS.md` §9 taxonomy row. + +--- + +## 4. 
Test architecture + +### 4.1 Two layers + +| Layer | Location | Exercises | Tests | +|---|---|---|---| +| **Unit** | inline `#[cfg(test)]` submodules under `src/deps/tests/` | the internal `deps` API directly | parsing, classification, graph, findings, policy, diff, report — the language matrix | +| **CLI integration** | `tests/cli_deps.rs` | the compiled binary as a subprocess | argument parsing, exit codes, `--fail-on`, `--out-format`, `--out-file`, hermeticity | + +The crate is binary-only (no `src/lib.rs`), so `tests/` integration tests cannot import internal modules — they invoke the binary via the Cargo-provided `CARGO_BIN_EXE_corgea`. Unit tests live inline, matching the existing convention (`src/authorize.rs`). + +### 4.2 Directory layout (new) + +```text +cli/ + src/ + deps/ + mod.rs # pub(crate) API: scan(), Inventory; declares #[cfg(test)] mod tests + model.rs # PackageId, Ecosystem, Scope, SourceType, ConstraintKind, Severity, nodes/edges/graph + detect.rs # detect_dependency_files() + ecosystems/ + mod.rs # classify_constraint(); dispatch + npm.rs # package.json / package-lock.json / yarn.lock / pnpm-lock.yaml + pypi.rs # requirements.txt / constraints.txt / pyproject.toml / poetry.lock / uv.lock + maven.rs # pom.xml / build.gradle / gradle.lockfile + findings.rs # Finding, evaluate() + policy.rs # Policy (real Default), from_yaml() + diff.rs # diff_graphs(), GraphDiff + explain.rs # explain(), Explanation + report.rs # to_json(), to_sarif(), to_cyclonedx() + vuln.rs # VulnerabilitySource (mocked) — DEP010 enrichment + tests/ # #[cfg(test)] only — full internal-API access + mod.rs + common.rs # fixture loaders, scan helpers + detect_tests.rs + npm_tests.rs + pypi_tests.rs + maven_tests.rs + correctness_tests.rs + findings_tests.rs + policy_tests.rs + explain_tests.rs + diff_tests.rs + report_tests.rs + robustness_tests.rs + vuln_tests.rs + tests/ + cli_deps.rs # integration: runs the binary, isolated HOME + fixtures/ + node-app/ package.json, 
package-lock.json + node-stale/ package.json, package-lock.json + node-monorepo/ package.json (workspaces), packages/a/*, packages/b/*, package-lock.json + python-poetry/ pyproject.toml, poetry.lock + python-pip-nolock/ requirements.txt + java-maven/ pom.xml + java-gradle/ build.gradle, gradle.lockfile + go-mod-smoke/ go.mod, go.sum # detection smoke only + malformed/ bad-package-lock.json, truncated-poetry.lock, not-xml-pom.xml + vuln-db.json # static advisory DB for the mocked VulnerabilitySource +``` + +### 4.3 Dependencies + +Parsing reuses crates already in `Cargo.toml`: `serde_json` (npm), `toml` (poetry.lock, pyproject.toml, uv.lock), `quick-xml` (pom.xml), `regex` (requirements.txt, build.gradle, gradle.lockfile), `git2`/`url` (ref classification), `tempfile` (runtime test scaffolding). + +**One new dependency is required.** `Policy::from_yaml` parses YAML (`.corgea/deps.yml` and the `PRD_DEPS.md` §6.6 examples are YAML). The crate has `toml` but **no YAML parser**. Add to `[dependencies]`: + +```toml +serde_yaml_ng = "0.10" # §13 decision 6: confirmed. serde_yaml is archived; serde_yml rejected (provenance). +``` + +`assert_cmd` + `predicates` were considered as a `[dev-dependencies]` ergonomic upgrade for §8.11 and **rejected** (§13 decision 5): the CLI tests need hand-written `HOME` isolation regardless, and `CARGO_BIN_EXE_corgea` already supplies the binary path. The plan uses plain `std::process::Command`. + +### 4.4 Naming & running + +Snake_case specification names, prefixed by area: `npm_*`, `pypi_*`, `maven_*`, `detect_*`, `policy_*`, `robust_*`, `cli_*`. + +```bash +cargo test # whole crate +cargo test deps # every deps test +cargo test npm_ # one ecosystem's matrix +cargo test --test cli_deps # CLI integration only +``` + +--- + +## 5. Phase 0 — the stub API (lands with Slice 1) + +Phase 0 is the only code that lands *with* its first slice's tests rather than after them: it is scaffolding, not behavior. 
**Definition of done for Phase 0: `cargo test` compiles with zero errors; Slice 1's tests (§8.2) fail inside `classify_constraint`; no test fails inside a constructor.** + +### 5.1 Wire the subcommand (`src/main.rs`) + +Add a `Deps` variant to `Commands`, a match arm dispatching to `deps::run(...)`, and `mod deps;` at the top of `main.rs` — per `cli/CLAUDE.md` "adding a new subcommand". `deps scan` is a **local, offline** operation: it must **not** call `verify_token_and_exit_when_fail` and must not require config or network (see §8.11). + +```rust +/// Dependency inventory and supply-chain policy scanning +Deps { + #[command(subcommand)] + command: DepsCommand, +}, +``` + +```rust +#[derive(Subcommand)] +enum DepsCommand { + /// Scan manifests and lockfiles, build inventory, evaluate policy + Scan { + #[arg(default_value = ".")] + path: String, + #[arg(long, help = "Fail (exit 1) at or above this severity: critical, high, medium, low")] + fail_on: Option<String>, + #[arg(long, help = "Output format: table, json, sarif")] + out_format: Option<String>, + #[arg(long, help = "Write output to this file")] + out_file: Option<String>, + }, + /// Print the dependency graph + Graph { #[arg(default_value = ".")] path: String }, + /// Explain why a package is present + Explain { package: String }, + /// Generate an SBOM + Sbom { #[arg(long, default_value = "cyclonedx")] format: String }, +} +``` + +### 5.2 The model — typed identity (`src/deps/model.rs`) + +Package identity is a typed **`PackageId`** (a canonical purl), not a bare name. A bare name is ambiguous across Maven `group:artifact` coordinates and across duplicate versions of the same package (`PRD_DEPS.md` DEP014). `PackageId` and its accessors are **real** — pure string parsing, never `unimplemented!()`. + +```rust +/// Canonical package identity: a Package URL, e.g. "pkg:npm/express@4.18.2". 
+#[derive(Debug, Clone, PartialEq, Eq, Hash)] +pub struct PackageId(pub String); + +impl PackageId { + /// The package-name component ("express", "guava", "commons-lang3"). + pub fn name(&self) -> &str { + let before_at = self.0.rsplit_once('@').map(|(l, _)| l).unwrap_or(&self.0); + before_at.rsplit_once('/').map(|(_, r)| r).unwrap_or(before_at) + } + /// The resolved-version component, if the purl carries one. + pub fn version(&self) -> Option<&str> { + self.0.rsplit_once('@').map(|(_, v)| v) + } +} + +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum Ecosystem { Npm, PyPI, Maven, Go, Cargo, Unknown } + +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum Scope { Production, Development, Optional, Peer } + +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum SourceType { + Registry, PrivateRegistry, GitCommit, GitBranch, GitTag, + LocalPath, RemoteTarball, Url, Workspace, Unknown, +} + +#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)] +pub enum Severity { Info, Low, Medium, High, Critical } + +/// How a declared version constraint behaves — the classification that drives findings. 
+#[derive(Debug, Clone, PartialEq, Eq)] +pub enum ConstraintKind { + Exact, // 1.2.3 ==1.2.3 [1.2.3] + BoundedRange, // ^1.2.0 ~1.2 >=1,<2 [1.0,2.0) 3.+ + Unbounded, // * x >=1 latest latest.release LATEST bare name + Mutable, // SNAPSHOT and other coordinates whose content can change + GitRef { mutable: bool }, // mutable=true → branch ref; false → 40-char commit SHA + Url { checksum: bool }, // checksum=false → tarball/URL with no integrity hash +} + +#[derive(Debug, Clone, PartialEq, Eq)] +pub struct DependencyNode { + pub(crate) id: PackageId, + pub(crate) name: String, + pub(crate) ecosystem: Ecosystem, + pub(crate) version: Option<String>, // resolved version; None if unresolved + pub(crate) direct: bool, + pub(crate) scope: Scope, + pub(crate) depth: u32, + pub(crate) source_type: SourceType, + pub(crate) manifest_file: Option<String>, + pub(crate) lockfile: Option<String>, +} + +#[derive(Debug, Clone, PartialEq, Eq)] +pub struct DependencyEdge { + pub(crate) from: PackageId, // root id or a package id + pub(crate) to: PackageId, + pub(crate) declared_constraint: String, + pub(crate) resolved_version: Option<String>, + pub(crate) scope: Scope, + pub(crate) source_file: String, +} + +#[derive(Debug, Clone, Default, PartialEq, Eq)] +pub struct DependencyGraph { + pub(crate) nodes: Vec<DependencyNode>, + pub(crate) edges: Vec<DependencyEdge>, +} + +impl DependencyGraph { + /// First node with this package name. Safe for fixtures with unique names. + pub fn node(&self, name: &str) -> Option<&DependencyNode> { + self.nodes.iter().find(|n| n.name == name) + } + /// Every node with this package name (use when duplicates are possible). + pub fn nodes_named(&self, name: &str) -> Vec<&DependencyNode> { + self.nodes.iter().filter(|n| n.name == name).collect() + } + pub fn node_by_id(&self, id: &PackageId) -> Option<&DependencyNode> { + self.nodes.iter().find(|n| &n.id == id) + } +} +``` + +Fields are `pub(crate)` — internal model, queried through accessor methods. Tests read `node.id()`, `node.is_direct()`, etc. 
via small real accessors (shown where first used). Constructors used by tests (e.g. `DependencyNode::new(...)` for §8.9) are real builder functions, not stubs. + +### 5.3 Detection, classification, findings, policy + +```rust +// src/deps/detect.rs +use std::path::{Path, PathBuf}; +use crate::deps::model::Ecosystem; + +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum DepFileKind { + NpmManifest, NpmLockfile, YarnLockfile, PnpmLockfile, + PipRequirements, PipConstraints, PyProject, PoetryLock, UvLock, + MavenPom, GradleBuild, GradleLockfile, + GoMod, GoSum, CargoManifest, CargoLock, +} + +#[derive(Debug, Clone, PartialEq, Eq)] +pub struct DetectedFile { + pub path: PathBuf, + pub kind: DepFileKind, + pub ecosystem: Ecosystem, +} + +/// Recursively detect supported dependency files; skip vendored/VCS dirs. FR1. +pub fn detect_dependency_files(root: &Path) -> Vec<DetectedFile> { + unimplemented!("deps::detect_dependency_files — PRD_DEPS_TESTING.md §8.1") +} +``` + +```rust +// src/deps/ecosystems/mod.rs +use crate::deps::model::{ConstraintKind, Ecosystem}; + +/// Classify a raw declared constraint string. FR3 / PRD_DEPS.md §7.1. +pub fn classify_constraint(ecosystem: Ecosystem, raw: &str) -> ConstraintKind { + unimplemented!("deps::ecosystems::classify_constraint — PRD_DEPS_TESTING.md §8.2") +} +``` + +```rust +// src/deps/findings.rs +use crate::deps::model::{PackageId, Severity}; + +#[derive(Debug, Clone, PartialEq, Eq)] +pub struct Finding { + pub id: String, // taxonomy code, e.g. "DEP004" + pub severity: Severity, + pub title: String, + pub package: Option<PackageId>, + pub source_file: String, + pub declared_constraint: Option<String>, + pub resolved_version: Option<String>, + pub recommendation: String, + /// True when the install is still deterministic despite the finding + /// (e.g. a manifest range that a committed lockfile resolves exactly). 
+ pub reproducible: bool, + pub paths: Vec<Vec<PackageId>>, // dependency paths, root-first +} +``` + +```rust +// src/deps/policy.rs — Default and field access are REAL; only from_yaml is stubbed. +#[derive(Debug, Clone)] +pub struct Policy { + pub require_lockfile: bool, + pub fail_on_missing_lockfile: bool, + pub fail_on_stale_lockfile: bool, + pub fail_on_wildcard: bool, + pub fail_on_latest: bool, + pub fail_on_mutable_sources: bool, + pub warn_on_semver_range: bool, + pub require_integrity_hashes: bool, +} + +impl Default for Policy { + /// The recommended default from PRD_DEPS.md §19. REAL — never panics. + fn default() -> Self { + Policy { + require_lockfile: true, + fail_on_missing_lockfile: true, + fail_on_stale_lockfile: true, + fail_on_wildcard: true, + fail_on_latest: true, + fail_on_mutable_sources: true, + warn_on_semver_range: true, + require_integrity_hashes: true, + } + } +} + +#[derive(Debug)] +pub struct PolicyError(pub String); + +impl Policy { + pub fn from_yaml(yaml: &str) -> Result<Self, PolicyError> { + unimplemented!("deps::Policy::from_yaml — PRD_DEPS_TESTING.md §8.7") + } +} +``` + +```rust +// src/deps/mod.rs +pub mod model; +pub mod detect; +pub mod ecosystems; +pub mod findings; +pub mod policy; +pub mod diff; +pub mod explain; +pub mod report; +pub mod vuln; + +use std::path::{Path, PathBuf}; +use detect::DetectedFile; +use model::{DependencyGraph, DependencyNode, PackageId}; +use findings::Finding; +use policy::Policy; + +#[derive(Debug)] +pub struct DepsError(pub String); + +/// Full result of a dependency scan of one directory tree. +#[derive(Debug)] +pub struct Inventory { + pub root: PathBuf, + pub detected_files: Vec<DetectedFile>, + pub graph: DependencyGraph, + pub findings: Vec<Finding>, +} + +impl Inventory { // all REAL — pure filters over owned data + /// Findings carrying a specific taxonomy code, e.g. "DEP004". 
+ pub fn with_code(&self, code: &str) -> Vec<&Finding> { + self.findings.iter().filter(|f| f.id == code).collect() + } + /// Findings about a package, matched on the purl name component exactly. + pub fn findings_for(&self, name: &str) -> Vec<&Finding> { + self.findings.iter() + .filter(|f| f.package.as_ref().is_some_and(|id| id.name() == name)) + .collect() + } + pub fn node(&self, name: &str) -> Option<&DependencyNode> { + self.graph.node(name) + } +} + +/// Scan a directory tree: detect files, build the graph, evaluate policy. +pub fn scan(root: &Path, policy: &Policy) -> Result<Inventory, DepsError> { + unimplemented!("deps::scan — PRD_DEPS_TESTING.md §8") +} + +/// CLI entry point for `corgea deps ...`. +pub fn run(/* DepsCommand */) { + unimplemented!("deps::run — PRD_DEPS_TESTING.md §8.11") +} + +#[cfg(test)] +mod tests; +``` + +`diff.rs`, `explain.rs`, `report.rs` follow the same pattern — signatures inline in §8.9 / §8.8 / §8.10. + +**External-source seam.** DEP010/016/017 read from external data sources. The seam is a trait declared in `findings.rs`: + +```rust +// src/deps/findings.rs +pub trait FindingSource { + /// Enrich a built graph with findings that require external data. + fn enrich(&self, graph: &DependencyGraph) -> Vec<Finding>; +} +``` + +Per §13 decision 7, **DEP010 is built in the MVP** behind this trait: Slice 8 adds a `VulnerabilitySource: FindingSource` implementor backed by an offline fixture advisory DB (§6.9, §8.13). Keeping enrichment a separate step from `scan()` preserves the offline/determinism guarantees of §8.12. DEP016/DEP017 stay deferred — same trait, no MVP implementor, follow-up plan. + +--- + +## 6. Test fixtures + +Fixtures are static, checked-in projects, chosen so **every MVP finding code fires in at least one fixture and stays silent in at least one other**. Fixture contents below are normative. 
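The "fires in one fixture, stays silent in another" rule can be checked mechanically. The following is a minimal sketch, not part of the planned `deps` API: `uncovered_codes` and `sample_expectations` are hypothetical names, and the sample table copies only a subset of the per-fixture expectations stated in this section (`node-app`, `python-poetry`, `python-pip-nolock`).

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical helper (not part of the planned `deps` API): returns every
/// code in `codes` that lacks either a fixture where it fires or a fixture
/// where it stays silent.
pub fn uncovered_codes(
    expected: &BTreeMap<&str, BTreeSet<&str>>, // fixture name -> codes expected to fire
    codes: &[&str],
) -> Vec<String> {
    codes
        .iter()
        .filter(|code| {
            let fires = expected.values().any(|set| set.contains(*code));
            let silent = expected.values().any(|set| !set.contains(*code));
            !(fires && silent)
        })
        .map(|code| code.to_string())
        .collect()
}

/// Illustrative subset of the expectations written out for three fixtures.
/// ASSUMPTION: a real check would build this table from the fixture docs or
/// from scan results, not from a hard-coded list.
pub fn sample_expectations() -> BTreeMap<&'static str, BTreeSet<&'static str>> {
    let mut m = BTreeMap::new();
    m.insert("node-app", BTreeSet::from(["DEP003", "DEP004", "DEP005", "DEP008"]));
    m.insert("python-poetry", BTreeSet::from(["DEP003"]));
    m.insert("python-pip-nolock", BTreeSet::from(["DEP001", "DEP004", "DEP005"]));
    m
}
```

A code that appears in no fixture, or in every fixture, is reported as uncovered, which is the same pairing discipline the tests apply to positive and negative cases.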
+ +### 6.1 Node.js — `tests/fixtures/node-app/` (the "many findings" fixture) + +`package.json`: + +```json +{ + "name": "node-app", + "version": "1.0.0", + "dependencies": { + "express": "^4.18.2", + "lodash": "*", + "left-pad": "latest", + "internal-utils": "git+https://github.com/acme/internal-utils.git#main" + }, + "devDependencies": { + "jest": "29.7.0" + } +} +``` + +`package-lock.json` (lockfileVersion 3): + +```json +{ + "name": "node-app", + "version": "1.0.0", + "lockfileVersion": 3, + "requires": true, + "packages": { + "": { + "name": "node-app", + "version": "1.0.0", + "dependencies": { + "express": "^4.18.2", + "lodash": "*", + "left-pad": "latest", + "internal-utils": "git+https://github.com/acme/internal-utils.git#main" + }, + "devDependencies": { "jest": "29.7.0" } + }, + "node_modules/express": { + "version": "4.18.2", + "resolved": "https://registry.npmjs.org/express/-/express-4.18.2.tgz", + "integrity": "sha512-5/PsL6iGPdfQ/lKM1UuielYgv3BUoJfz1aUwU9vHZ+J7gyvwdQXFEBIEIaxeGf0GIcreATNyBExtalisDbuMqQ==", + "dependencies": { "qs": "6.11.0" } + }, + "node_modules/qs": { + "version": "6.11.0", + "resolved": "https://registry.npmjs.org/qs/-/qs-6.11.0.tgz", + "integrity": "sha512-MvjoMCJwEarSbUYk5O+nmoSzSutSsTwF85zcHPQ9OrlFoZOYIjaqBAJIqIXjptyD5vThxGq52Xu/MaJzRkDtA==" + }, + "node_modules/lodash": { + "version": "4.17.21", + "resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz", + "integrity": "sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvKw==" + }, + "node_modules/left-pad": { + "version": "1.3.0", + "resolved": "https://registry.npmjs.org/left-pad/-/left-pad-1.3.0.tgz" + } + } +} +``` + +`left-pad` deliberately has **no `integrity`** → DEP008. 
Per-package expectation (normative): + +| Package | Declared | Kind | Expected | +|---|---|---|---| +| `express` | `^4.18.2` | direct, prod | DEP003 (broad range), **Medium**, `reproducible: true` | +| `lodash` | `*` | direct, prod | DEP004 (wildcard), **High** | +| `left-pad` | `latest` | direct, prod | DEP004 (`latest`), **High**; **and** DEP008 (no integrity) | +| `internal-utils` | `git+…#main` | direct, prod | DEP005 (mutable git branch), **High**, `source_type: GitBranch` | +| `jest` | `29.7.0` | direct, **dev** | no finding; `scope: Development` | +| `qs` | `6.11.0` (by express) | **transitive**, prod | **no finding** — locked & exact (§8.4) | + +### 6.2 Node.js — `tests/fixtures/node-stale/` (stale lockfile) + +`package.json` declares `chalk`; the lockfile does not contain it: + +```json +{ + "name": "node-stale", + "version": "1.0.0", + "dependencies": { "express": "^4.18.2", "chalk": "^5.3.0" } +} +``` + +`package-lock.json` — `express` only, **no `chalk`**: + +```json +{ + "name": "node-stale", + "version": "1.0.0", + "lockfileVersion": 3, + "requires": true, + "packages": { + "": { "name": "node-stale", "version": "1.0.0", + "dependencies": { "express": "^4.18.2" } }, + "node_modules/express": { + "version": "4.18.2", + "resolved": "https://registry.npmjs.org/express/-/express-4.18.2.tgz", + "integrity": "sha512-5/PsL6iGPdfQ/lKM1UuielYgv3BUoJfz1aUwU9vHZ+J7gyvwdQXFEBIEIaxeGf0GIcreATNyBExtalisDbuMqQ==" + } + } +} +``` + +Staleness = **content divergence** (`chalk` in manifest, absent from lockfile), not mtime. Expected: DEP002, High. + +### 6.3 Node.js — `tests/fixtures/node-monorepo/` (workspaces) + +Root `package.json` with `"workspaces": ["packages/*"]`, two workspace manifests `packages/a/package.json` and `packages/b/package.json`, and a single root `package-lock.json`. Used by §8.12 to assert every workspace manifest is detected and attributed. Keep each manifest to 1–2 dependencies. 
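One way workspace attribution for this fixture could work is to expand the `"packages/*"` pattern into the workspace manifests it names. The sketch below is an illustration under stated assumptions: `workspace_manifests` is a hypothetical helper, it uses only the standard library, and it handles only the trailing-`/*` pattern form used by `node-monorepo`, not the full npm glob syntax.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: expand one npm workspace pattern of the form "dir/*"
/// into the `package.json` paths of its immediate subdirectories.
/// Patterns in any other form yield an empty list in this sketch.
pub fn workspace_manifests(root: &Path, pattern: &str) -> io::Result<Vec<PathBuf>> {
    let Some(parent) = pattern.strip_suffix("/*") else {
        return Ok(Vec::new());
    };
    let mut found = Vec::new();
    for entry in fs::read_dir(root.join(parent))? {
        let manifest = entry?.path().join("package.json");
        // Subdirectories without a manifest are not workspaces.
        if manifest.is_file() {
            found.push(manifest);
        }
    }
    // `read_dir` order is platform-dependent; sort so output is deterministic.
    found.sort();
    Ok(found)
}
```

Sorting the result keeps the helper consistent with the determinism principle: the same tree always yields the same manifest order.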
+ +### 6.4 Python — `tests/fixtures/python-poetry/` (well-locked) + +`pyproject.toml`: + +```toml +[tool.poetry] +name = "python-poetry-app" +version = "0.1.0" + +[tool.poetry.dependencies] +python = "^3.12" +requests = "^2.31.0" +flask = "2.3.3" + +[tool.poetry.group.dev.dependencies] +pytest = "^8.0.0" +``` + +`poetry.lock`: + +```toml +[[package]] +name = "requests" +version = "2.31.0" +optional = false +python-versions = ">=3.7" + +[package.dependencies] +urllib3 = ">=1.21.1,<3" + +[[package]] +name = "urllib3" +version = "2.0.7" +optional = false +python-versions = ">=3.7" + +[[package]] +name = "flask" +version = "2.3.3" +optional = false +python-versions = ">=3.8" + +[metadata] +lock-version = "2.0" +python-versions = "^3.12" +content-hash = "0000000000000000000000000000000000000000000000000000000000000000" +``` + +Expected: `requests` `^2.31.0` direct → DEP003 Medium, `reproducible: true`. `flask 2.3.3` → no finding (exact). `urllib3` transitive, declared as a range by `requests`, locked → **no finding** (§8.4). Lockfile present → **no DEP001**. + +### 6.5 Python — `tests/fixtures/python-pip-nolock/` (no lockfile) + +`requirements.txt`, no `constraints.txt`, no lockfile: + +```text +flask==2.3.3 +requests +urllib3>=1.26 +internal-lib @ git+https://github.com/acme/internal-lib.git@main +``` + +Expected: DEP001 (missing lockfile), High. `flask==2.3.3` exact → no pin finding. `requests` bare → DEP004 High. `urllib3>=1.26` open-ended → DEP004 High (§13 decision 2). `internal-lib @ git+…@main` → DEP005, High. + +### 6.6 Java — `tests/fixtures/java-maven/` (Maven, no lockfile) + +`pom.xml`: + +```xml +<?xml version="1.0" encoding="UTF-8"?> +<project> + <modelVersion>4.0.0</modelVersion> + <groupId>com.acme</groupId> + <artifactId>java-maven-app</artifactId> + <version>1.0.0</version> + <dependencies> + <dependency> + <groupId>com.google.guava</groupId> + <artifactId>guava</artifactId> + <version>32.1.3-jre</version> + </dependency> + <dependency> + <groupId>org.apache.commons</groupId> + <artifactId>commons-lang3</artifactId> + <version>[3.0,4.0)</version> + </dependency> + <dependency> + <groupId>org.slf4j</groupId> + <artifactId>slf4j-api</artifactId> + <version>LATEST</version> + </dependency> + <dependency> + <groupId>com.acme</groupId> + <artifactId>internal-bom</artifactId> + <version>2.0-SNAPSHOT</version> + </dependency> + <dependency> + <groupId>org.junit.jupiter</groupId> + <artifactId>junit-jupiter</artifactId> + <version>5.10.1</version> + <scope>test</scope> + </dependency> + </dependencies> +</project> +``` + +Expected: `guava 32.1.3-jre` exact → no finding. 
`commons-lang3 [3.0,4.0)` Maven range → DEP003 Medium. `slf4j-api LATEST` → DEP004 High. `internal-bom 2.0-SNAPSHOT` → mutable → DEP021 High (§13 decision 1). `junit-jupiter` `<scope>test</scope>` → `scope: Development`, no finding. Maven has no first-class lockfile → DEP001 High. + +### 6.7 Java — `tests/fixtures/java-gradle/` (Gradle, with lockfile) + +`build.gradle`: + +```groovy +plugins { + id 'java' +} + +dependencies { + implementation 'com.google.guava:guava:32.1.3-jre' + implementation 'org.apache.commons:commons-lang3:3.+' + implementation 'org.slf4j:slf4j-api:latest.release' + testImplementation 'org.junit.jupiter:junit-jupiter:5.10.1' +} +``` + +`gradle.lockfile`: + +```text +# This is a Gradle generated file for dependency locking. +# Manual edits can break the build and are not advised. +# This file is expected to be part of source control. +com.google.guava:guava:32.1.3-jre=compileClasspath,runtimeClasspath +org.apache.commons:commons-lang3:3.14.0=compileClasspath,runtimeClasspath +org.slf4j:slf4j-api:2.0.9=compileClasspath,runtimeClasspath +org.junit.jupiter:junit-jupiter:5.10.1=testCompileClasspath,testRuntimeClasspath +empty=annotationProcessor +``` + +Expected: `guava` exact → no finding. `commons-lang3 3.+` dynamic, resolved by lockfile to `3.14.0` → DEP003 Medium, `reproducible: true`. `slf4j-api latest.release` → DEP004 High **even though resolved** — `latest.release` violates policy regardless of locking. `gradle.lockfile` present → **no DEP001**. + +### 6.8 Smoke & robustness fixtures + +- `tests/fixtures/go-mod-smoke/` — minimal `go.mod` + `go.sum`, so §8.1 asserts detection of a non-MVP-graph ecosystem. +- `tests/fixtures/malformed/` — three deliberately broken files: `bad-package-lock.json` (invalid JSON, e.g. a trailing comma and an unclosed brace), `truncated-poetry.lock` (TOML cut off mid-table), `not-xml-pom.xml` (a `pom.xml` whose body is not XML). §8.12 asserts the scanner returns an error and never panics. 
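The "error, never panic" contract can be illustrated on a much smaller input than the malformed fixtures. The sketch below is a stand-in, not the planned parser: `LockEntry`, `parse_lock_line`, and `does_not_panic` are hypothetical names, the input is a single `gradle.lockfile` entry line, and the caller is assumed to have already skipped `#` comments and the `empty=` marker line.

```rust
use std::panic;

/// One resolved entry of a `gradle.lockfile` ("group:artifact:version=configs").
#[derive(Debug, PartialEq, Eq)]
pub struct LockEntry {
    pub group: String,
    pub artifact: String,
    pub version: String,
}

/// Hypothetical parser sketch: every malformed line becomes an `Err`;
/// nothing here indexes, unwraps, or otherwise panics on bad input.
pub fn parse_lock_line(line: &str) -> Result<LockEntry, String> {
    let (coords, _configs) = line
        .split_once('=')
        .ok_or_else(|| format!("missing '=' in {line:?}"))?;
    let parts: Vec<&str> = coords.split(':').collect();
    match parts.as_slice() {
        [g, a, v] if !g.is_empty() && !a.is_empty() && !v.is_empty() => Ok(LockEntry {
            group: g.to_string(),
            artifact: a.to_string(),
            version: v.to_string(),
        }),
        _ => Err(format!("expected group:artifact:version, got {coords:?}")),
    }
}

/// True when `f` ran to completion without panicking.
pub fn does_not_panic<F: FnOnce() + panic::UnwindSafe>(f: F) -> bool {
    panic::catch_unwind(f).is_ok()
}
```

A robustness test in this style asserts two things per bad input: the parser returns `Err`, and running it under `catch_unwind` reports no panic.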
+ +Note: `.gitignore` excludes `node_modules/`, so a "skip `node_modules`" fixture cannot be committed. §8.12 builds that scenario in a `tempfile::TempDir` at runtime instead. + +### 6.9 Vulnerability advisory fixture (Slice 8) + +`tests/fixtures/vuln-db.json` — a static, offline advisory database for the mocked `VulnerabilitySource` (§8.13). It maps a package name + vulnerable versions to an advisory `{ id, severity, summary }`. It deliberately flags one **transitive** package present in `node-app` — `qs@6.11.0` — and leaves the direct, in-sync packages (`express@4.18.2`, `lodash@4.17.21`) unflagged, so DEP010 has a positive case (a transitive hit) and negative controls. + +```json +{ + "advisories": [ + { + "name": "qs", + "vulnerable_versions": ["6.11.0"], + "id": "GHSA-FIXTURE-qs-0001", + "severity": "high", + "summary": "Fixture advisory for qs (test data — not a live CVE mapping)." + } + ] +} +``` + +This file is test data only. It is not a real advisory feed and must never be wired into a production code path. + +--- + +## 7. Traceability matrix + +Every MVP behavior maps to ≥1 named test; every test maps back to a PRD requirement. A finding code is "covered" only when it has **both** a positive and a negative test. 
+ +| PRD ref | Behavior | Test(s) | Slice | +|---|---|---|---| +| FR3 | npm constraint classification | `npm_classify_*` (6) | 1 | +| FR3 | PyPI constraint classification | `pypi_classify_*` (5) | 1 | +| FR3 | Maven/Gradle constraint classification | `maven_classify_*`, `gradle_classify_*` (6) | 1 | +| FR1 | Detect npm files | `detect_finds_npm_files` | 2 | +| FR1 | Detect Python files | `detect_finds_python_poetry_files`, `detect_finds_pip_requirements` | 2 | +| FR1 | Detect Java files | `detect_finds_maven_pom`, `detect_finds_gradle_files` | 2 | +| FR1 | Non-MVP ecosystem still detected | `detect_finds_go_mod_smoke` | 2 | +| FR2 | npm graph: direct/transitive, scope, source | `npm_graph_*` (4) | 3 | +| FR2 / FR9 | npm purl identity, JSON output, CLI scan | `npm_purl_*`, `report_json_*`, `cli_scan_*` | 3 | +| §7.1 / FR4 | npm locked transitive range → no finding | `node_locked_transitive_range_yields_no_finding`, `node_direct_locked_range_is_medium_not_high` | 3 | +| DEP002 | Stale lockfile (pos/neg) | `node_manifest_dep_missing_from_lock_is_dep002`, `node_app_lock_in_sync_no_dep002` | 3 | +| DEP008 | Missing integrity (pos/neg) | `npm_lock_entry_without_integrity_is_dep008`, `npm_lock_entry_with_integrity_no_dep008` | 3 | +| DEP003/004/005 | npm pinning & source findings | `npm_caret_*`, `npm_wildcard_*`, `npm_latest_*`, `npm_git_branch_*`, `git_commit_sha_*`, `npm_url_*` | 3 | +| FR2 | PyPI graph build | `pypi_graph_*` (2) | 4 | +| §7.1 / FR4 | PyPI locked transitive range → no finding | `pypi_locked_transitive_range_yields_no_finding` | 4 | +| DEP001 | Missing lockfile (pos/neg) | `pip_no_lockfile_is_dep001`, `poetry_lock_present_no_dep001` | 4 | +| DEP004/005 | PyPI pinning & source | `pypi_bare_name_is_dep004`, `pypi_open_ended_range_is_dep004_high`, `pypi_git_branch_dep_is_dep005` | 4 | +| FR2 | Maven/Gradle graph build | `maven_graph_*`, `gradle_graph_*` | 5 | +| §7.1 / FR4 | Gradle locked dynamic version reproducible | 
`gradle_locked_dynamic_version_is_reproducible` | 5 | +| DEP001 | Maven no lockfile / Gradle lock present | `maven_no_lockfile_is_dep001`, `gradle_lock_present_no_dep001` | 5 | +| DEP003/004/021 | Maven/Gradle pinning & SNAPSHOT | `maven_range_direct_dep_is_dep003`, `maven_latest_keyword_is_dep004`, `maven_snapshot_is_dep021_high` | 5 | +| FR8 | Default policy & YAML parse | `default_policy_fails_on_wildcard`, `policy_from_yaml_parses_prd_example`, `policy_disabling_rule_silences_finding` | 5 | +| FR6 | Explain dependency path | `explain_transitive_shows_path`, `explain_unknown_package_is_none` | 6 | +| FR7 | Graph diff | `diff_detects_added_removed_changed` | 6 | +| FR9 | SARIF output | `report_sarif_has_rules_and_results` | 6 | +| FR11 | CycloneDX SBOM | `report_cyclonedx_has_components_and_deps` | 6 | +| §6.4 | CLI exit codes, out-file, hermeticity, no-token | `cli_*` (5) | 3–6 | +| §18 R2 | Malformed input → error not panic | `robust_malformed_*` (3) | 7 | +| §18 R2 | Determinism of graph & JSON output | `robust_graph_order_deterministic`, `robust_json_output_byte_stable` | 7 | +| §18 R2 | Monorepo / skip vendored / classifier never panics | `robust_monorepo_*`, `robust_scan_skips_node_modules`, `robust_classify_never_panics` | 7 | +| §7.5 / DEP010 | Vulnerable resolved package — pos/neg, mocked source, path attribution | `vuln_*` (6) | 8 | + +Total: ~64 tests. + +--- + +## 8. The failing tests + +All code below is the deliverable of its slice (§9). Written first, observed failing, committed with red `cargo test` output in the PR. + +### 8.0 Shared helpers — `src/deps/tests/common.rs` + +```rust +use std::path::PathBuf; +use crate::deps::{scan, Inventory, policy::Policy}; + +/// Absolute path to a fixture project directory. +pub fn fixture(name: &str) -> PathBuf { + PathBuf::from(env!("CARGO_MANIFEST_DIR")) + .join("tests/fixtures") + .join(name) +} + +/// Read one file inside a fixture project. 
+pub fn read(name: &str, file: &str) -> String { + let path = fixture(name).join(file); + std::fs::read_to_string(&path) + .unwrap_or_else(|e| panic!("missing fixture file {}: {e}", path.display())) +} + +/// Scan a fixture with the default policy; panic with context on failure. +pub fn scan_fixture(name: &str) -> Inventory { + scan(&fixture(name), &Policy::default()) + .unwrap_or_else(|e| panic!("scan of fixture {name} failed: {e:?}")) +} +``` + +`src/deps/tests/mod.rs`: + +```rust +mod common; +mod detect_tests; +mod npm_tests; +mod pypi_tests; +mod maven_tests; +mod correctness_tests; +mod findings_tests; +mod policy_tests; +mod explain_tests; +mod diff_tests; +mod report_tests; +mod robustness_tests; +mod vuln_tests; +``` + +### 8.1 File detection — `detect_tests.rs` (Slice 2) + +```rust +use super::common::fixture; +use crate::deps::detect::{detect_dependency_files, DepFileKind}; +use crate::deps::model::Ecosystem; + +fn kinds(root: &str) -> Vec<DepFileKind> { + let mut k: Vec<_> = detect_dependency_files(&fixture(root)) + .into_iter().map(|f| f.kind).collect(); + k.sort_by_key(|x| format!("{x:?}")); + k +} + +#[test] +fn detect_finds_npm_files() { + let k = kinds("node-app"); + assert!(k.contains(&DepFileKind::NpmManifest), "expected package.json"); + assert!(k.contains(&DepFileKind::NpmLockfile), "expected package-lock.json"); +} + +#[test] +fn detect_finds_python_poetry_files() { + let k = kinds("python-poetry"); + assert!(k.contains(&DepFileKind::PyProject)); + assert!(k.contains(&DepFileKind::PoetryLock)); +} + +#[test] +fn detect_finds_pip_requirements() { + let files = detect_dependency_files(&fixture("python-pip-nolock")); + assert!(files.iter().any(|f| f.kind == DepFileKind::PipRequirements)); + assert!(files.iter().all(|f| f.ecosystem == Ecosystem::PyPI)); +} + +#[test] +fn detect_finds_maven_pom() { + assert!(kinds("java-maven").contains(&DepFileKind::MavenPom)); +} + +#[test] +fn detect_finds_gradle_files() { + let k = kinds("java-gradle"); + 
assert!(k.contains(&DepFileKind::GradleBuild)); + assert!(k.contains(&DepFileKind::GradleLockfile)); +} + +#[test] +fn detect_finds_go_mod_smoke() { + // Non-MVP ecosystem: detection must still work even before graph support. + assert!(kinds("go-mod-smoke").contains(&DepFileKind::GoMod)); +} +``` + +(The "skip `node_modules`" assertion needs a runtime-built fixture — see `robust_scan_skips_node_modules`, §8.12.) + +### 8.2 Constraint classification — `npm_tests.rs`, `pypi_tests.rs`, `maven_tests.rs` (Slice 1) + +The per-language heart of FR3. `classify_constraint` is a pure function — the cheapest, sharpest TDD unit, and the first thing implemented. + +`npm_tests.rs` (classification section): + +```rust +use crate::deps::ecosystems::classify_constraint; +use crate::deps::model::{ConstraintKind, Ecosystem::Npm}; + +#[test] +fn npm_classify_exact_version() { + assert_eq!(classify_constraint(Npm, "4.18.2"), ConstraintKind::Exact); +} + +#[test] +fn npm_classify_caret_is_bounded_range() { + assert_eq!(classify_constraint(Npm, "^4.18.2"), ConstraintKind::BoundedRange); +} + +#[test] +fn npm_classify_wildcard_is_unbounded() { + assert_eq!(classify_constraint(Npm, "*"), ConstraintKind::Unbounded); +} + +#[test] +fn npm_classify_latest_is_unbounded() { + assert_eq!(classify_constraint(Npm, "latest"), ConstraintKind::Unbounded); +} + +#[test] +fn npm_classify_git_branch_is_mutable_ref() { + assert_eq!( + classify_constraint(Npm, "git+https://github.com/acme/x.git#main"), + ConstraintKind::GitRef { mutable: true } + ); +} + +#[test] +fn npm_classify_git_commit_sha_is_immutable_ref() { + let sha = "git+https://github.com/acme/x.git#0bc1a2d3e4f5a6b7c8d9e0f1a2b3c4d5e6f7a8b9"; + assert_eq!( + classify_constraint(Npm, sha), + ConstraintKind::GitRef { mutable: false } + ); +} +``` + +`pypi_tests.rs` (classification section): + +```rust +use crate::deps::ecosystems::classify_constraint; +use crate::deps::model::{ConstraintKind, Ecosystem::PyPI}; + +#[test] +fn 
pypi_classify_exact_pin() { + assert_eq!(classify_constraint(PyPI, "==2.3.3"), ConstraintKind::Exact); +} + +#[test] +fn pypi_classify_bare_name_is_unbounded() { + // A bare `requests` with no specifier accepts any version. + assert_eq!(classify_constraint(PyPI, "requests"), ConstraintKind::Unbounded); +} + +#[test] +fn pypi_classify_open_greater_equal_is_unbounded() { + assert_eq!(classify_constraint(PyPI, ">=1.26"), ConstraintKind::Unbounded); +} + +#[test] +fn pypi_classify_compatible_release_is_bounded_range() { + assert_eq!(classify_constraint(PyPI, "~=2.3"), ConstraintKind::BoundedRange); +} + +#[test] +fn pypi_classify_git_branch_is_mutable_ref() { + assert_eq!( + classify_constraint(PyPI, "git+https://github.com/acme/x.git@main"), + ConstraintKind::GitRef { mutable: true } + ); +} +``` + +`maven_tests.rs` (classification section): + +```rust +use crate::deps::ecosystems::classify_constraint; +use crate::deps::model::{ConstraintKind, Ecosystem::Maven}; + +#[test] +fn maven_classify_hard_version_is_exact() { + assert_eq!(classify_constraint(Maven, "32.1.3-jre"), ConstraintKind::Exact); +} + +#[test] +fn maven_classify_version_range_is_bounded_range() { + assert_eq!(classify_constraint(Maven, "[3.0,4.0)"), ConstraintKind::BoundedRange); +} + +#[test] +fn maven_classify_latest_keyword_is_unbounded() { + assert_eq!(classify_constraint(Maven, "LATEST"), ConstraintKind::Unbounded); + assert_eq!(classify_constraint(Maven, "RELEASE"), ConstraintKind::Unbounded); +} + +#[test] +fn maven_classify_snapshot_is_mutable() { + assert_eq!(classify_constraint(Maven, "2.0-SNAPSHOT"), ConstraintKind::Mutable); +} + +#[test] +fn gradle_classify_dynamic_plus_is_bounded_range() { + assert_eq!(classify_constraint(Maven, "3.+"), ConstraintKind::BoundedRange); +} + +#[test] +fn gradle_classify_latest_release_is_unbounded() { + assert_eq!(classify_constraint(Maven, "latest.release"), ConstraintKind::Unbounded); +} +``` + +### 8.3 npm graph, findings, output — `npm_tests.rs` (Slice 3) 
+ +Slice 3 is the **full npm vertical**: parse → graph → findings → JSON → CLI. It is the deepest single slice; getting it green proves the whole pipeline shape before Python and Java reuse it. + +Accessors used below (`id`, `is_direct`, `scope`, `version`, `depth`, `source_type`) are small real getters over the `pub(crate)` fields. + +```rust +use super::common::{fixture, scan_fixture}; +use crate::deps::model::{PackageId, Scope, SourceType}; + +// --- graph ------------------------------------------------------------------- + +#[test] +fn npm_graph_classifies_express_as_direct_production() { + let inv = scan_fixture("node-app"); + let express = inv.node("express").expect("express node missing"); + assert!(express.is_direct(), "express is a direct dependency"); + assert_eq!(express.scope(), Scope::Production); + assert_eq!(express.version(), Some("4.18.2")); +} + +#[test] +fn npm_graph_classifies_qs_as_transitive() { + let inv = scan_fixture("node-app"); + let qs = inv.node("qs").expect("qs node missing"); + assert!(!qs.is_direct(), "qs is pulled in transitively by express"); + assert!(qs.depth() >= 2, "qs sits at depth >= 2"); +} + +#[test] +fn npm_graph_classifies_jest_as_development_scope() { + let inv = scan_fixture("node-app"); + assert_eq!(inv.node("jest").expect("jest node missing").scope(), + Scope::Development); +} + +#[test] +fn npm_graph_marks_git_dep_source_type() { + let inv = scan_fixture("node-app"); + let git_dep = inv.node("internal-utils").expect("internal-utils node missing"); + assert_eq!(git_dep.source_type(), SourceType::GitBranch); +} + +#[test] +fn npm_purl_identity_is_canonical() { + let inv = scan_fixture("node-app"); + assert_eq!(*inv.node("lodash").unwrap().id(), + PackageId("pkg:npm/lodash@4.17.21".into())); +} + +// --- DEP003 / DEP004 / DEP005 / DEP008 (npm) -------------------------------- + +#[test] +fn npm_caret_direct_dep_is_dep003() { + let inv = scan_fixture("node-app"); + assert!(!inv.findings_for("express").is_empty() + && 
inv.findings_for("express").iter().any(|f| f.id == "DEP003"), + "express `^4.18.2` is a direct bounded range — expected DEP003"); +} + +#[test] +fn npm_exact_dev_dep_has_no_pinning_finding() { + // jest is exactly pinned (29.7.0) — the negative control for DEP003/DEP004. + let inv = scan_fixture("node-app"); + assert!(inv.findings_for("jest").iter() + .all(|f| f.id != "DEP003" && f.id != "DEP004"), + "an exact pin must not raise a pinning finding"); +} + +#[test] +fn npm_wildcard_direct_dep_is_dep004_high() { + use crate::deps::model::Severity; + let inv = scan_fixture("node-app"); + let f = inv.findings_for("lodash").into_iter() + .find(|f| f.id == "DEP004").expect("lodash `*` must raise DEP004"); + assert_eq!(f.severity, Severity::High); +} + +#[test] +fn npm_latest_direct_dep_is_dep004() { + let inv = scan_fixture("node-app"); + assert!(inv.findings_for("left-pad").iter().any(|f| f.id == "DEP004"), + "left-pad `latest` must raise DEP004"); +} + +#[test] +fn npm_git_branch_dep_is_dep005() { + use crate::deps::model::Severity; + let inv = scan_fixture("node-app"); + let f = inv.findings_for("internal-utils").into_iter() + .find(|f| f.id == "DEP005") + .expect("internal-utils @ #main is a mutable git branch — expected DEP005"); + assert_eq!(f.severity, Severity::High); +} + +#[test] +fn git_commit_sha_is_not_dep005() { + // A git dependency pinned to a 40-char commit SHA is immutable — no finding. 
+ use crate::deps::ecosystems::classify_constraint; + use crate::deps::model::{ConstraintKind, Ecosystem::Npm}; + let pinned = "git+https://github.com/acme/x.git#0bc1a2d3e4f5a6b7c8d9e0f1a2b3c4d5e6f7a8b9"; + assert_eq!(classify_constraint(Npm, pinned), + ConstraintKind::GitRef { mutable: false }); +} + +#[test] +fn npm_url_dep_without_checksum_is_dep006() { + use crate::deps::ecosystems::classify_constraint; + use crate::deps::model::{ConstraintKind, Ecosystem::Npm}; + assert_eq!(classify_constraint(Npm, "https://example.com/pkg/foo-1.0.0.tgz"), + ConstraintKind::Url { checksum: false }); +} + +#[test] +fn npm_lock_entry_without_integrity_is_dep008() { + let inv = scan_fixture("node-app"); + assert!(inv.findings_for("left-pad").iter().any(|f| f.id == "DEP008"), + "left-pad lacks an integrity hash — expected DEP008"); +} + +#[test] +fn npm_lock_entry_with_integrity_no_dep008() { + let inv = scan_fixture("node-app"); + for pkg in ["express", "qs", "lodash"] { + assert!(inv.findings_for(pkg).iter().all(|f| f.id != "DEP008"), + "{pkg} has an integrity hash — must not raise DEP008"); + } +} + +// --- DEP002 stale lockfile (npm) -------------------------------------------- + +#[test] +fn node_manifest_dep_missing_from_lock_is_dep002() { + use crate::deps::model::Severity; + let inv = scan_fixture("node-stale"); + let f = inv.with_code("DEP002"); + assert!(!f.is_empty(), "manifest/lockfile drift must raise DEP002"); + assert_eq!(f[0].severity, Severity::High); +} + +#[test] +fn node_app_lock_in_sync_no_dep002() { + let inv = scan_fixture("node-app"); + assert!(inv.with_code("DEP002").is_empty(), "in-sync lockfile — no DEP002"); +} +``` + +### 8.4 Manifest-vs-lockfile correctness — `correctness_tests.rs` (Slices 3–5) + +The single most important correctness requirement (`PRD_DEPS.md` §7.1, §18 Risk 1). A transitive range that the lockfile resolves is **not** a finding; a direct range that the lockfile resolves is at most **Medium** and is marked `reproducible`. 
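The rule can be read as a small decision table over two facts: whether the dependency is direct, and whether the lockfile resolves it. The sketch below is illustrative only. `range_finding`, `RangeFinding`, and the local `Severity` enum are hypothetical stand-ins, not the real `deps::model` types, and the handling of *unlocked* ranges is an assumption, since this section only fixes the two locked cases.

```rust
/// Local stand-in for the real severity type (assumption for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Medium,
}

/// Hypothetical, simplified finding: only the fields the rule talks about.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RangeFinding {
    id: &'static str,
    severity: Severity,
    reproducible: bool,
}

/// Decide what a bounded-range constraint produces, given whether the
/// dependency is direct and whether the lockfile resolves it.
fn range_finding(is_direct: bool, locked: bool) -> Option<RangeFinding> {
    match (is_direct, locked) {
        // Transitive and locked: the install is reproducible, so no finding.
        (false, true) => None,
        // Direct and locked: warn at Medium, marked reproducible.
        (true, true) => Some(RangeFinding {
            id: "DEP003",
            severity: Severity::Medium,
            reproducible: true,
        }),
        // Not locked: ASSUMPTION for this sketch, reported as DEP003 Medium
        // and flagged as not reproducible.
        (_, false) => Some(RangeFinding {
            id: "DEP003",
            severity: Severity::Medium,
            reproducible: false,
        }),
    }
}
```

Keeping the rule in one pure function like this would make the false-positive guard cheap to unit-test on its own; the fixture-backed tests that follow check the same rule end to end.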
+ +```rust +use super::common::scan_fixture; +use crate::deps::model::Severity; + +#[test] +fn node_locked_transitive_range_yields_no_finding() { // Slice 3 + // qs is declared by express and resolved by package-lock.json. + // It must NOT produce DEP003 or DEP004 — the install is reproducible. + let inv = scan_fixture("node-app"); + assert!( + inv.findings_for("qs").iter().all(|f| f.id != "DEP003" && f.id != "DEP004"), + "locked transitive dependency must not raise a pinning finding, got: {:?}", + inv.findings_for("qs").iter().map(|f| &f.id).collect::<Vec<_>>() + ); +} + +#[test] +fn node_direct_locked_range_is_medium_not_high() { // Slice 3 + // express is `^4.18.2` (a range) but package-lock.json pins 4.18.2. + // Policy may warn (DEP003 Medium) — it must NOT escalate to High. + let inv = scan_fixture("node-app"); + let dep003 = inv.findings_for("express").into_iter() + .find(|f| f.id == "DEP003") + .expect("expected a DEP003 informational finding for express"); + assert_eq!(dep003.severity, Severity::Medium); + assert!(dep003.reproducible, "lockfile resolves it — install is reproducible"); +} + +#[test] +fn pypi_locked_transitive_range_yields_no_finding() { // Slice 4 + // urllib3 is declared as a range by requests and locked by poetry.lock. + let inv = scan_fixture("python-poetry"); + assert!(inv.findings_for("urllib3").is_empty(), + "locked transitive urllib3 must produce no findings"); +} + +#[test] +fn gradle_locked_dynamic_version_is_reproducible() { // Slice 5 + // commons-lang3 `3.+` is dynamic but gradle.lockfile pins 3.14.0. 
+ let inv = scan_fixture("java-gradle"); + let dep003 = inv.findings_for("commons-lang3").into_iter() + .find(|f| f.id == "DEP003") + .expect("dynamic direct version should still warn (DEP003)"); + assert_eq!(dep003.severity, Severity::Medium); + assert!(dep003.reproducible, "gradle.lockfile makes the install reproducible"); +} +``` + +### 8.5 Lockfile health — DEP001 (`findings_tests.rs`, Slices 4–5) + +DEP002/DEP008 live with their npm vertical (§8.3). DEP001 spans Python and Java: + +```rust +use super::common::scan_fixture; +use crate::deps::model::Severity; + +#[test] +fn pip_no_lockfile_is_dep001() { // Slice 4 + let inv = scan_fixture("python-pip-nolock"); + let f = inv.with_code("DEP001"); + assert!(!f.is_empty(), "requirements.txt with no lockfile must raise DEP001"); + assert_eq!(f[0].severity, Severity::High); +} + +#[test] +fn poetry_lock_present_no_dep001() { // Slice 4 + assert!(scan_fixture("python-poetry").with_code("DEP001").is_empty(), + "poetry.lock present — no DEP001"); +} + +#[test] +fn maven_no_lockfile_is_dep001() { // Slice 5 + // Maven has no first-class lockfile; this fixture has no BOM either. 
+ assert!(!scan_fixture("java-maven").with_code("DEP001").is_empty(), + "maven project with no lockfile must raise DEP001"); +} + +#[test] +fn gradle_lock_present_no_dep001() { // Slice 5 + assert!(scan_fixture("java-gradle").with_code("DEP001").is_empty(), + "gradle.lockfile present — no DEP001"); +} +``` + +### 8.6 PyPI & Maven pinning / source — `pypi_tests.rs`, `maven_tests.rs` (Slices 4–5) + +`pypi_tests.rs` (graph + findings section): + +```rust +use super::common::scan_fixture; +use crate::deps::model::Scope; + +#[test] +fn pypi_graph_classifies_pytest_as_development_scope() { + assert_eq!(scan_fixture("python-poetry").node("pytest") + .expect("pytest node missing").scope(), Scope::Development); +} + +#[test] +fn pypi_graph_resolves_transitive_urllib3_version() { + let inv = scan_fixture("python-poetry"); + let urllib3 = inv.node("urllib3").expect("urllib3 should be in the graph"); + assert!(!urllib3.is_direct(), "urllib3 is transitive (declared by requests)"); + assert_eq!(urllib3.version(), Some("2.0.7")); +} + +#[test] +fn pypi_exact_pin_has_no_pinning_finding() { + // flask==2.3.3 is the negative control. + let inv = scan_fixture("python-pip-nolock"); + assert!(inv.findings_for("flask").iter() + .all(|f| f.id != "DEP003" && f.id != "DEP004"), + "flask==2.3.3 is exact — no pinning finding"); +} + +#[test] +fn pypi_bare_name_is_dep004() { + assert!(scan_fixture("python-pip-nolock").findings_for("requests") + .iter().any(|f| f.id == "DEP004"), + "bare `requests` must raise DEP004"); +} + +#[test] +fn pypi_open_ended_range_is_dep004_high() { + // §13 decision 2: unbounded `>=` / bare names are DEP004 High, like `*` / `latest`. 
+ use crate::deps::model::Severity; + let inv = scan_fixture("python-pip-nolock"); + let f = inv.findings_for("urllib3").into_iter() + .find(|f| f.id == "DEP004") + .expect("open-ended `urllib3>=1.26` must raise DEP004"); + assert_eq!(f.severity, Severity::High); +} + +#[test] +fn pypi_git_branch_dep_is_dep005() { + assert!(scan_fixture("python-pip-nolock").findings_for("internal-lib") + .iter().any(|f| f.id == "DEP005"), + "internal-lib @ git+...@main is a mutable branch — expected DEP005"); +} +``` + +`maven_tests.rs` (graph + findings section): + +```rust +use super::common::scan_fixture; +use crate::deps::model::{PackageId, Severity}; + +#[test] +fn maven_graph_lists_all_direct_dependencies() { + let inv = scan_fixture("java-maven"); + for name in ["guava", "commons-lang3", "slf4j-api", "internal-bom"] { + let n = inv.node(name).unwrap_or_else(|| panic!("{name} node missing")); + assert!(n.is_direct(), "{name} is declared directly in pom.xml"); + } +} + +#[test] +fn maven_purl_identity_includes_group() { + assert_eq!(*scan_fixture("java-gradle").node("guava").unwrap().id(), + PackageId("pkg:maven/com.google.guava/guava@32.1.3-jre".into())); +} + +#[test] +fn gradle_graph_resolves_dynamic_version_from_lockfile() { + // build.gradle declares 3.+; gradle.lockfile pins 3.14.0. + assert_eq!(scan_fixture("java-gradle").node("commons-lang3") + .expect("commons-lang3 node missing").version(), Some("3.14.0")); +} + +#[test] +fn maven_range_direct_dep_is_dep003() { + assert!(scan_fixture("java-maven").findings_for("commons-lang3") + .iter().any(|f| f.id == "DEP003"), + "commons-lang3 `[3.0,4.0)` is a direct Maven range — expected DEP003"); +} + +#[test] +fn maven_exact_dep_has_no_pinning_finding() { + // guava 32.1.3-jre is the negative control. 
+ assert!(scan_fixture("java-maven").findings_for("guava") + .iter().all(|f| f.id != "DEP003" && f.id != "DEP004"), + "guava is exactly pinned — no pinning finding"); +} + +#[test] +fn maven_latest_keyword_is_dep004() { + let inv = scan_fixture("java-maven"); + let f = inv.findings_for("slf4j-api").into_iter() + .find(|f| f.id == "DEP004").expect("slf4j-api `LATEST` must raise DEP004"); + assert_eq!(f.severity, Severity::High); +} + +#[test] +fn maven_snapshot_is_dep021_high() { + // §13 decision 1: Maven -SNAPSHOT is a mutable artifact version → DEP021 (High), + // not DEP004 — the manifest names a coordinate, it is not an unbounded selector. + let inv = scan_fixture("java-maven"); + let f = inv.findings_for("internal-bom").into_iter() + .find(|f| f.id == "DEP021") + .expect("2.0-SNAPSHOT must raise DEP021 (mutable artifact version)"); + assert_eq!(f.severity, Severity::High); + assert!(f.recommendation.to_lowercase().contains("snapshot"), + "recommendation should name the SNAPSHOT problem"); +} +``` + +### 8.7 Policy — `policy_tests.rs` (Slice 5) + +```rust +use super::common::{fixture, scan_fixture}; +use crate::deps::{scan, policy::Policy}; + +#[test] +fn default_policy_fails_on_wildcard() { + // The built-in default treats wildcard/latest as a hard finding (PRD §19). + assert!(!scan_fixture("node-app").with_code("DEP004").is_empty(), + "default policy must flag wildcard/latest dependencies"); +} + +#[test] +fn policy_from_yaml_parses_prd_example() { + // The policy block from PRD_DEPS.md §6.6 must parse without error. 
+ let yaml = r#" +dependency_policy: + require_lockfile: true + fail_on_missing_lockfile: true + fail_on_stale_lockfile: true + direct_dependencies: + fail_on_wildcard: true + fail_on_latest: true + warn_on_semver_range: true + allow_exact_versions: true + ci: + fail_on_new_findings_only: true + severity_threshold: high +"#; + assert!(Policy::from_yaml(yaml).is_ok(), "the PRD example policy must parse"); +} + +#[test] +fn policy_disabling_rule_silences_finding() { + // Negative control: with wildcard checks OFF, DEP004 must not fire. + let yaml = r#" +dependency_policy: + direct_dependencies: + fail_on_wildcard: false + fail_on_latest: false +"#; + let policy = Policy::from_yaml(yaml).expect("policy parses"); + let inv = scan(&fixture("node-app"), &policy).expect("scan"); + assert!(inv.with_code("DEP004").is_empty(), + "with wildcard checks disabled, DEP004 must not fire"); +} +``` + +### 8.8 Explain — `explain_tests.rs` (Slice 6) + +```rust +use super::common::scan_fixture; +use crate::deps::explain::explain; + +#[test] +fn explain_transitive_shows_path() { + let inv = scan_fixture("node-app"); + let e = explain(&inv.graph, "qs").expect("qs should be explainable"); + // Expected introduction path: root -> express@4.18.2 -> qs@6.11.0 + assert!(!e.direct, "qs is transitive"); + assert_eq!(e.depth, 2); + let path = e.paths.first().expect("at least one dependency path"); + assert_eq!(path.first().map(|id| id.0.as_str()), Some("root")); + assert!(path.iter().any(|id| id.name() == "express"), + "the path must run through express"); + assert_eq!(path.last().map(|id| id.name()), Some("qs")); +} + +#[test] +fn explain_unknown_package_is_none() { + let inv = scan_fixture("node-app"); + assert!(explain(&inv.graph, "does-not-exist").is_none(), + "explaining an absent package returns None"); +} +``` + +Stub (`src/deps/explain.rs`): + +```rust +use crate::deps::model::{DependencyGraph, PackageId}; + +#[derive(Debug)] +pub struct Explanation { + pub package: PackageId, + pub 
direct: bool, + pub depth: u32, + pub paths: Vec<Vec<PackageId>>, +} + +pub fn explain(graph: &DependencyGraph, package: &str) -> Option<Explanation> { + unimplemented!("deps::explain — PRD_DEPS_TESTING.md §8.8") +} +``` + +### 8.9 Diff — `diff_tests.rs` (Slice 6) + +Graph-level diff, tested as a pure function on two in-memory graphs — no git, no fixtures. Nodes are built with the real `DependencyNode::new_npm` constructor. + +```rust +use crate::deps::diff::diff_graphs; +use crate::deps::model::{DependencyGraph, DependencyNode}; + +fn graph(nodes: Vec<DependencyNode>) -> DependencyGraph { + DependencyGraph { nodes, edges: vec![] } +} + +#[test] +fn diff_detects_added_removed_changed() { + let base = graph(vec![ + DependencyNode::new_npm("lodash", "4.17.20"), + DependencyNode::new_npm("request", "2.88.2"), + ]); + let head = graph(vec![ + DependencyNode::new_npm("lodash", "4.17.21"), + DependencyNode::new_npm("axios", "1.8.2"), + ]); + let d = diff_graphs(&base, &head); + assert!(d.added.iter().any(|n| n.name() == "axios"), "axios was added"); + assert!(d.removed.iter().any(|n| n.name() == "request"), "request was removed"); + assert!( + d.changed.iter().any(|c| c.name == "lodash" + && c.from == "4.17.20" && c.to == "4.17.21"), + "lodash changed 4.17.20 -> 4.17.21" + ); + assert!(d.added.iter().all(|n| n.name() != "lodash"), + "a version bump is a change, not an add"); +} +``` + +Stub (`src/deps/diff.rs`): + +```rust +use crate::deps::model::{DependencyGraph, DependencyNode}; + +#[derive(Debug)] +pub struct VersionChange { pub name: String, pub from: String, pub to: String } + +#[derive(Debug)] +pub struct GraphDiff { + pub added: Vec<DependencyNode>, + pub removed: Vec<DependencyNode>, + pub changed: Vec<VersionChange>, +} + +pub fn diff_graphs(base: &DependencyGraph, head: &DependencyGraph) -> GraphDiff { + unimplemented!("deps::diff_graphs — PRD_DEPS_TESTING.md §8.9") +} +``` + +`DependencyNode::new_npm` is a real test-support constructor on the model — not a stub. 
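The added/removed/changed split described above can be sketched over plain name-to-version maps. This is a minimal illustration under stated assumptions, not the real `diff_graphs`: `diff_versions`, `SimpleDiff`, and `SimpleChange` are hypothetical names, and the real function works on `DependencyGraph` nodes rather than string maps.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified counterpart of `VersionChange`.
#[derive(Debug, PartialEq, Eq)]
struct SimpleChange {
    name: String,
    from: String,
    to: String,
}

/// Hypothetical, simplified counterpart of `GraphDiff`, keyed by package name.
#[derive(Debug, Default, PartialEq, Eq)]
struct SimpleDiff {
    added: Vec<String>,
    removed: Vec<String>,
    changed: Vec<SimpleChange>,
}

/// Compare two name -> version maps. A name present on both sides with a
/// different version is a change, never an add plus a remove.
fn diff_versions(
    base: &BTreeMap<String, String>,
    head: &BTreeMap<String, String>,
) -> SimpleDiff {
    let mut diff = SimpleDiff::default();
    for (name, to) in head {
        match base.get(name) {
            // Only in head: added.
            None => diff.added.push(name.clone()),
            // In both, different version: changed.
            Some(from) if from != to => diff.changed.push(SimpleChange {
                name: name.clone(),
                from: from.clone(),
                to: to.clone(),
            }),
            // In both, same version: unchanged.
            Some(_) => {}
        }
    }
    // Only in base: removed.
    for name in base.keys() {
        if !head.contains_key(name) {
            diff.removed.push(name.clone());
        }
    }
    diff
}
```

Using `BTreeMap` keeps iteration sorted by name, so the three output lists come out in a stable order without an extra sort step.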
+ +### 8.10 Output — `report_tests.rs` (Slices 3 & 6) + +`report_json_*` lands in Slice 3 (npm vertical); SARIF and SBOM in Slice 6. + +```rust +use super::common::scan_fixture; +use crate::deps::report::{to_json, to_sarif, to_cyclonedx}; + +#[test] +fn report_json_has_findings_and_graph() { // Slice 3 + let v = to_json(&scan_fixture("node-app")); + assert!(v.get("nodes").and_then(|n| n.as_array()).is_some(), + "JSON output carries the dependency graph nodes"); + assert!(v.get("findings").and_then(|f| f.as_array()).is_some(), + "JSON output carries findings"); +} + +#[test] +fn report_sarif_has_rules_and_results() { // Slice 6 + let v = to_sarif(&scan_fixture("node-app")); + assert_eq!(v["runs"][0]["tool"]["driver"]["name"], "corgea-deps"); + let results = v["runs"][0]["results"].as_array().expect("results array"); + assert!(results.iter().any(|r| r["ruleId"] == "DEP004"), + "SARIF results include the wildcard finding rule id"); +} + +#[test] +fn report_cyclonedx_has_components_and_deps() { // Slice 6 + let v = to_cyclonedx(&scan_fixture("node-app").graph); + assert_eq!(v["bomFormat"], "CycloneDX"); + let components = v["components"].as_array().expect("components array"); + assert!(components.iter().any(|c| c["purl"] == "pkg:npm/express@4.18.2"), + "SBOM lists express as a component with its purl"); + assert!(v.get("dependencies").is_some(), + "CycloneDX SBOM includes the dependency relationships"); +} +``` + +Stub (`src/deps/report.rs`): + +```rust +use serde_json::Value; +use crate::deps::{Inventory, model::DependencyGraph}; + +pub fn to_json(inv: &Inventory) -> Value { + unimplemented!("deps::report::to_json — PRD_DEPS_TESTING.md §8.10") +} +pub fn to_sarif(inv: &Inventory) -> Value { + unimplemented!("deps::report::to_sarif — PRD_DEPS_TESTING.md §8.10") +} +pub fn to_cyclonedx(graph: &DependencyGraph) -> Value { + unimplemented!("deps::report::to_cyclonedx — PRD_DEPS_TESTING.md §8.10") +} +``` + +### 8.11 CLI integration — `tests/cli_deps.rs` (Slices 3–6) + 
+These run the compiled binary. **Hermeticity is mandatory**: `Config::load()` (`src/config.rs:33`) creates `~/.corgea/` and writes `config.toml` on first run. Tests must redirect `HOME` to a throwaway directory so they never touch the developer's real config — and so they prove `corgea deps scan` works with no prior config and no token. + +```rust +use std::process::Command; +use tempfile::TempDir; + +/// A `corgea` invocation with HOME isolated to a fresh temp dir. +/// Returns the command and the TempDir guard (keep it alive for the call). +fn corgea_isolated() -> (Command, TempDir) { + let home = TempDir::new().expect("temp HOME"); + let mut cmd = Command::new(env!("CARGO_BIN_EXE_corgea")); + cmd.env("HOME", home.path()) // unix; dirs::home_dir() honors HOME + .env("USERPROFILE", home.path()) // windows + .env_remove("CORGEA_TOKEN") + .env_remove("CORGEA_URL"); + (cmd, home) +} + +fn fixture(name: &str) -> String { + format!("{}/tests/fixtures/{}", env!("CARGO_MANIFEST_DIR"), name) +} + +#[test] +fn cli_scan_runs_without_token_or_config() { + // `deps scan` is local & offline — it must not require login. + let (mut cmd, _home) = corgea_isolated(); + let out = cmd.args(["deps", "scan", &fixture("python-poetry"), + "--out-format", "json"]) + .output().expect("failed to run corgea"); + assert!(out.status.success(), + "clean local scan must succeed with no token; stderr: {}", + String::from_utf8_lossy(&out.stderr)); + let parsed: serde_json::Value = + serde_json::from_slice(&out.stdout).expect("stdout must be valid JSON"); + assert!(parsed.get("findings").is_some()); +} + +#[test] +fn cli_scan_does_not_write_outside_home() { + // Hermeticity guard: a scan must not touch a real ~/.corgea. + let (mut cmd, home) = corgea_isolated(); + cmd.args(["deps", "scan", &fixture("node-app")]) + .output().expect("failed to run corgea"); + // If anything was written, it lands under the temp HOME — never the real one. 
+ assert!(home.path().exists(), "temp HOME survives the run"); +} + +#[test] +fn cli_scan_fail_on_high_exits_one() { + // node-app has High findings (DEP004, DEP005). --fail-on high must exit 1. + let (mut cmd, _home) = corgea_isolated(); + let out = cmd.args(["deps", "scan", &fixture("node-app"), "--fail-on", "high"]) + .output().expect("failed to run corgea"); + assert_eq!(out.status.code(), Some(1), + "High findings with --fail-on high must exit 1"); +} + +#[test] +fn cli_scan_clean_fixture_fail_on_high_exits_zero() { + // Negative control: python-poetry has no High findings. + let (mut cmd, _home) = corgea_isolated(); + let out = cmd.args(["deps", "scan", &fixture("python-poetry"), + "--fail-on", "high"]) + .output().expect("failed to run corgea"); + assert_eq!(out.status.code(), Some(0), + "no High findings — --fail-on high must exit 0"); +} + +#[test] +fn cli_scan_out_file_writes_json() { + let (mut cmd, home) = corgea_isolated(); + let out_file = home.path().join("deps.json"); + let out = cmd.args(["deps", "scan", &fixture("java-gradle"), + "--out-format", "json", "--out-file", out_file.to_str().unwrap()]) + .output().expect("failed to run corgea"); + assert!(out.status.success(), "stderr: {}", + String::from_utf8_lossy(&out.stderr)); + let written = std::fs::read_to_string(&out_file).expect("out-file should exist"); + let _: serde_json::Value = + serde_json::from_str(&written).expect("out-file must contain valid JSON"); +} +``` + +Before implementation these fail cleanly: clap rejects the unknown `deps` subcommand (exit 2) or `deps::run` panics (exit 101) — never the asserted 0/1. + +### 8.12 Robustness & determinism — `robustness_tests.rs` (Slice 7) + +`PRD_DEPS.md` §18 Risk 2 (ecosystem edge cases) and the determinism the JSON/SBOM contract depends on. 
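One common way to meet the determinism requirement is to sort by a stable key before serializing, rather than relying on hash-map or hash-set iteration order. A minimal sketch, assuming nodes are identified by their purl string; `stable_purl_order` is a hypothetical helper, not part of the `deps` API:

```rust
use std::collections::HashSet;

/// Return the purls in lexicographic order, independent of how the set was
/// built or how the hasher happens to iterate it.
fn stable_purl_order(purls: &HashSet<String>) -> Vec<String> {
    let mut ordered: Vec<String> = purls.iter().cloned().collect();
    ordered.sort();
    ordered
}
```

If graph construction used an ordered container from the start, this step would be unnecessary; either approach would satisfy the ordering and byte-stability tests in this section.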
+ +```rust +use super::common::{fixture, scan_fixture}; +use crate::deps::{scan, policy::Policy}; +use crate::deps::ecosystems::classify_constraint; +use crate::deps::model::Ecosystem; +use crate::deps::report::to_json; + +// --- malformed input: error, never panic ------------------------------------ + +#[test] +fn robust_malformed_npm_lockfile_is_error_not_panic() { + // bad-package-lock.json is invalid JSON. scan() must return Err, not panic + // and not silently produce an empty graph. + let dir = fixture("malformed"); + let result = scan(&dir, &Policy::default()); + assert!(result.is_err(), "a malformed lockfile must surface as an error"); +} + +#[test] +fn robust_truncated_poetry_lock_is_error_not_panic() { + let result = std::panic::catch_unwind(|| { + scan(&fixture("malformed"), &Policy::default()) + }); + assert!(result.is_ok(), "parsing a truncated lockfile must not panic"); +} + +#[test] +fn robust_classify_never_panics_on_adversarial_input() { + // A bounded property check without a proptest dependency: classify must be + // total over a corpus of hostile constraint strings. 
+ let corpus = [ + "", " ", "\t\n", "^", "~", ">=", "@", "git+", "#", "[", "[,]", + "999999999999999999999999999999", "v1.2.3", "==", "*.*.*", + "latest.latest", "-SNAPSHOT", "💥", "../../etc/passwd", + &"a".repeat(10_000), + ]; + for raw in corpus { + for eco in [Ecosystem::Npm, Ecosystem::PyPI, Ecosystem::Maven] { + let _ = classify_constraint(eco, raw); // must return, not panic + } + } +} + +// --- determinism: the JSON/SBOM contract depends on it ---------------------- + +#[test] +fn robust_graph_order_deterministic() { + let a = scan_fixture("node-app"); + let b = scan_fixture("node-app"); + let names = |inv: &crate::deps::Inventory| -> Vec<String> { + inv.graph.nodes.iter().map(|n| n.id().0.clone()).collect() + }; + assert_eq!(names(&a), names(&b), + "graph node ordering must be deterministic across scans"); +} + +#[test] +fn robust_json_output_byte_stable() { + let a = to_json(&scan_fixture("node-app")).to_string(); + let b = to_json(&scan_fixture("node-app")).to_string(); + assert_eq!(a, b, "JSON output must be byte-stable for identical input"); +} + +// --- monorepo / workspaces -------------------------------------------------- + +#[test] +fn robust_monorepo_detects_all_workspace_manifests() { + let inv = scan_fixture("node-monorepo"); + use crate::deps::detect::DepFileKind::NpmManifest; + let manifests = inv.detected_files.iter() + .filter(|f| f.kind == NpmManifest).count(); + assert!(manifests >= 3, "root + 2 workspace manifests expected, got {manifests}"); +} + +// --- skip vendored directories (built at runtime — node_modules is gitignored) + +#[test] +fn robust_scan_skips_node_modules() { + use std::fs; + let tmp = tempfile::TempDir::new().expect("temp dir"); + fs::write(tmp.path().join("package.json"), + r#"{"name":"x","version":"1.0.0","dependencies":{}}"#).unwrap(); + let nested = tmp.path().join("node_modules/inner"); + fs::create_dir_all(&nested).unwrap(); + fs::write(nested.join("package.json"), + r#"{"name":"inner","version":"9.9.9"}"#).unwrap(); + + 
let files = crate::deps::detect::detect_dependency_files(tmp.path()); + assert!( + files.iter().all(|f| !f.path.components() + .any(|c| c.as_os_str() == "node_modules")), + "detection must not descend into node_modules" + ); +} +``` + +### 8.13 Vulnerability findings — `vuln_tests.rs` (Slice 8) + +DEP010 stays in the MVP (§13 decision 7) behind a mocked source, so its tests are offline and deterministic. `scan()` itself stays vulnerability-free — enrichment is a separate, explicit step, which keeps the §8.12 determinism and offline guarantees intact. + +Stub (`src/deps/vuln.rs`): + +```rust +use std::path::Path; +use crate::deps::DepsError; +use crate::deps::findings::{Finding, FindingSource}; +use crate::deps::model::DependencyGraph; + +/// One advisory record from the offline fixture DB. +#[derive(Debug, Clone)] +pub struct Advisory { + pub name: String, + pub vulnerable_versions: Vec<String>, + pub id: String, + pub severity: String, + pub summary: String, +} + +/// A `FindingSource` backed by a static, offline advisory database. +pub struct VulnerabilitySource { + advisories: Vec<Advisory>, +} + +impl VulnerabilitySource { + /// Load advisories from a `vuln-db.json` fixture. REAL — pure file + JSON read. + pub fn from_json_file(path: &Path) -> Result<Self, DepsError> { + unimplemented!("deps::vuln::VulnerabilitySource::from_json_file — §8.13") + } +} + +impl FindingSource for VulnerabilitySource { + /// Emit a DEP010 finding for every graph node whose resolved version + /// matches an advisory, carrying the introduction path. 
+ fn enrich(&self, graph: &DependencyGraph) -> Vec<Finding> { + unimplemented!("deps::vuln::VulnerabilitySource::enrich — §8.13") + } +} +``` + +Tests: + +```rust +use super::common::{fixture, scan_fixture}; +use crate::deps::findings::FindingSource; +use crate::deps::model::Severity; +use crate::deps::vuln::VulnerabilitySource; + +fn vuln_source() -> VulnerabilitySource { + VulnerabilitySource::from_json_file(&fixture("vuln-db.json")) + .expect("vuln-db.json fixture must load") +} + +#[test] +fn vuln_known_vulnerable_transitive_version_is_dep010() { + // qs@6.11.0 is transitive in node-app and flagged by vuln-db.json. + let inv = scan_fixture("node-app"); + let findings = vuln_source().enrich(&inv.graph); + assert!( + findings.iter().any(|f| f.id == "DEP010" + && f.package.as_ref().is_some_and(|p| p.name() == "qs")), + "a vulnerable transitive package must raise DEP010" + ); +} + +#[test] +fn vuln_safe_version_is_not_dep010() { + // express@4.18.2 and lodash@4.17.21 are absent from vuln-db.json — negative control. + let inv = scan_fixture("node-app"); + let findings = vuln_source().enrich(&inv.graph); + for safe in ["express", "lodash"] { + assert!( + findings.iter().all(|f| + f.package.as_ref().map(|p| p.name()) != Some(safe)), + "{safe} is not in the advisory DB — must not raise DEP010" + ); + } +} + +#[test] +fn vuln_dep010_severity_comes_from_advisory() { + // DEP010 severity is the advisory's severity, not a fixed taxonomy default. + let inv = scan_fixture("node-app"); + let f = vuln_source().enrich(&inv.graph).into_iter() + .find(|f| f.id == "DEP010").expect("expected one DEP010"); + assert_eq!(f.severity, Severity::High, "vuln-db.json marks this advisory high"); +} + +#[test] +fn vuln_dep010_carries_dependency_path() { + // The finding must show how the vulnerable package was introduced. 
+ let inv = scan_fixture("node-app"); + let f = vuln_source().enrich(&inv.graph).into_iter() + .find(|f| f.id == "DEP010").expect("expected one DEP010"); + let path = f.paths.first().expect("DEP010 must carry an introduction path"); + assert_eq!(path.first().map(|id| id.0.as_str()), Some("root")); + assert_eq!(path.last().map(|id| id.name()), Some("qs")); +} + +#[test] +fn vuln_scan_without_source_yields_no_dep010() { + // scan() is offline: with no source enrichment, DEP010 never appears. + assert!(scan_fixture("node-app").with_code("DEP010").is_empty(), + "scan() alone must not produce DEP010 — enrichment is explicit"); +} + +#[test] +fn vuln_clean_graph_yields_no_dep010() { + // python-poetry has no package in vuln-db.json — whole-graph negative control. + let inv = scan_fixture("python-poetry"); + assert!(vuln_source().enrich(&inv.graph).iter().all(|f| f.id != "DEP010"), + "a graph with no advisory match must yield no DEP010"); +} +``` + +--- + +## 9. Execution — vertical slices, not one red wall + +### 9.1 Why slices + +A single 52-test red batch is a legitimate *executable spec*, but it is not TDD and it loses the design feedback loop: you cannot tell whether the `findings` model is right until the `graph` model exists, so dozens of tests sit red for reasons unrelated to the code in front of you. + +So the **document** is the full spec (§8), but the **work** ships as vertical slices. Each slice is one PR containing that slice's tests *and* the implementation that greens them. Within a slice you still write the test first, observe it red, then implement — classic red/green/refactor at slice granularity. 
+ +### 9.2 The slices + +| Slice | Delivers | Tests (§) | Greens | +|---|---|---|---| +| **0** | Stub API (§5), all fixtures (§6), `common.rs`, CI job | — | compiles; nothing yet | +| **1** | `classify_constraint` for npm/PyPI/Maven | §8.2 (17) | constraint classification | +| **2** | `detect_dependency_files` | §8.1 (6) | file detection | +| **3** | **npm vertical** — manifest+lockfile parse, graph, findings (DEP002/003/004/005/006/008), `to_json`, CLI `scan` | §8.3, §8.4 (npm), §8.5 (none), §8.10 (json), §8.11 | npm end-to-end | +| **4** | **Python vertical** — pip/poetry parse, graph, DEP001, findings | §8.4 (pypi), §8.5 (pip/poetry), §8.6 (pypi) | Python end-to-end | +| **5** | **Java vertical** — pom/gradle parse, graph, DEP001, findings, `Policy` | §8.4 (gradle), §8.5 (maven/gradle), §8.6 (maven), §8.7 | Java end-to-end + policy | +| **6** | `explain`, `diff_graphs`, `to_sarif`, `to_cyclonedx`, CLI `graph`/`sbom` | §8.8, §8.9, §8.10 (sarif/sbom) | reporting & analysis | +| **7** | Robustness: malformed-input handling, determinism, monorepo, skip-vendored | §8.12 | hardening | +| **8** | **DEP010** — `VulnerabilitySource` (mocked) + `FindingSource` trait, graph enrichment, dependency-path attribution, `vuln-db.json` fixture | §8.13 (6) | vulnerability findings | + +Slices 1–2 are pure functions (no I/O) — fastest wins, lowest risk, and they de-risk every later slice. Slice 3 is the deepest; once it is green the pipeline shape is proven and Slices 4–5 are largely "the same, another ecosystem". + +### 9.3 CI handling + +`.github/workflows/test.yml` runs `cargo test` on every push. All `deps` work lands on a `feat/deps` branch: + +1. **`main` stays green.** It never sees a red `deps` test — `feat/deps` is not merged until every slice is green. +2. **`feat/deps` is green at every slice boundary.** A slice PR is merged into `feat/deps` only when its tests pass and no earlier slice regressed. 
Mid-slice, a developer's local `cargo test` is red — that is the point — but nothing red is merged, even to `feat/deps`. +3. **Optional visibility job.** A `deps-progress` CI job on `feat/deps` runs `cargo test deps` and reports the pass count as the public burndown. +4. **No `#[ignore]` ledger.** A test never carries `#[ignore]` to defer it; it simply lands in its slice's PR. This keeps the red signal honest. + +### 9.4 Expected first run of a slice + +When Slice 1's tests land (before its implementation), `cargo test deps` compiles and shows: + +```text +test deps::tests::npm_tests::npm_classify_caret_is_bounded_range ... FAILED + thread '...' panicked at src/deps/ecosystems/mod.rs: + not implemented: deps::ecosystems::classify_constraint — PRD_DEPS_TESTING.md §8.2 +``` + +The panic names `classify_constraint` — the function under test — not a constructor. That is the "fails for the right reason" check (§3.1). + +--- + +## 10. Definition of done & PR checklist + +A slice PR is mergeable into `feat/deps` only when: + +- [ ] Each test it adds was committed **red first** — the PR description quotes the failing `cargo test` output, and the panic is in the *targeted* function, not a constructor. +- [ ] The slice's tests are now green. +- [ ] No earlier slice's test regressed (`cargo test` whole-crate). +- [ ] No test was weakened, deleted, or `#[ignore]`-d to pass. +- [ ] Every positive finding test has its paired negative test, both green. +- [ ] Any new decision-gated test (§3.4) asserts only the stable invariant until the policy question it depends on is resolved and recorded in §13. +- [ ] `cargo build --release` succeeds; `skills/corgea/SKILL.md` and `README.md` updated if the surface is user-visible (`cli/CLAUDE.md`). + +`feat/deps` merges to `main` (GA per `PRD_DEPS.md` §17 Phase 3) when **all ~64 tests are green**. The §13 questions are resolved as of revision 3. + +--- + +## 11. 
Coverage goals + +- **Finding codes:** every MVP code (DEP001, DEP002, DEP003, DEP004, DEP005, DEP006, DEP008, DEP010, DEP021) has a green positive **and** negative test. DEP010 is exercised via a mocked `VulnerabilitySource` (§8.13); DEP016/DEP017 stay deferred behind the same trait seam (§5.3, §2.1). +- **Languages:** Python, Node.js, Java each exercise detection, classification, graph building, the manifest-vs-lockfile correctness rule, and ≥3 distinct finding codes. +- **Correctness rule:** `PRD_DEPS.md` §7.1 has a dedicated test per language (§8.4) — the regression guard for the §18 Risk 1 false-positive flood. +- **Robustness:** malformed input, determinism, monorepo, and vendored-dir skipping each have a test (§8.12). +- **Line/branch coverage** is a secondary signal: target ≥ 85% line coverage of `src/deps/` via `cargo llvm-cov`, but a green behavior test always outranks a coverage number. + +--- + +## 12. Test-assertion strictness — what is pinned vs loose + +Following the §14 review, assertions are tiered deliberately: + +| Pinned exactly | Asserted loosely / not at all | +|---|---| +| Finding code presence & absence | Graph node / finding ordering — only its *determinism* is tested (§8.12), not a specific order | +| `direct` vs `transitive`, `scope` | Full JSON / SARIF / SBOM document shape — only required keys are checked | +| `reproducible` boolean | `recommendation` prose — only a keyword (e.g. "snapshot") where it carries meaning | +| Resolved version (from controlled fixtures) | — | +| `PackageId` / purl strings | — | +| Severity — every MVP code traces to an unambiguous `PRD_DEPS.md` §9 row (DEP001/002 High, DEP003 Medium, DEP004 High, DEP021 High); DEP010 severity comes from the advisory record | — | + +Package matching uses the exact purl **name component** via `PackageId::name()` — never a substring `contains()`. + +--- + +## 13. Resolved decisions + +The seven open questions of revision 2 are resolved below. 
Each decision is binding on the slice it gates; the two formerly decision-gated tests (§3.4) are tightened to exact codes as of revision 3. + +1. **SNAPSHOT taxonomy — RESOLVED: mint DEP021.** Maven `-SNAPSHOT` is not an unbounded *selector* (the manifest names a coordinate) — it is a mutable *artifact*, the DEP005 family, not DEP004. Filing it under DEP004 would make a SARIF rule titled "Wildcard or `latest` dependency" fire on `2.0-SNAPSHOT`. New taxonomy code **DEP021 "Mutable artifact version"** (High) is added to `PRD_DEPS.md` §9 and §10. `maven_snapshot_is_dep021_high` (§8.6) asserts `id == "DEP021"`. *Gates Slice 5.* +2. **Unbounded `>=` severity — RESOLVED: DEP004, High.** `PRD_DEPS.md` FR3 lists `>=`, `>`, "unbounded ranges" and "bare names" alongside `*` / `latest`; a lower bound blocks only downgrade, not the float-forward supply-chain risk. `ConstraintKind::Unbounded` already unifies `>=`, bare names, `*`, and `latest`, and §6.7 already rules `latest.release` is DEP004 High *even when locked* — treating `>=` differently would split one classification. `pypi_open_ended_range_is_dep004_high` (§8.6) asserts DEP004 + High. *Gates Slices 1 and 4.* +3. **DEP001 for a `==`-pinned `requirements.txt` — RESOLVED: stays High.** A `==`-pinned `requirements.txt` pins direct versions but not the transitive closure and carries no integrity hashes — it is not a lockfile. The pinning benefit is already credited elsewhere (no DEP003/DEP004 per `==` line). The real lockfile-equivalent is a `requirements.txt` with pip `--hash=` lines for every entry — that suppresses DEP001. No test change (the `python-pip-nolock` fixture is not fully pinned); the `--hash` substitute rule is specified now and gets its own fixture when Python coverage is extended. *Gates Slice 4.* +4. 
**Maven "no lockfile" stance — RESOLVED: a BOM is mitigating, not substituting.** A `dependencyManagement` / BOM pins only declared coordinates, not the full transitive closure, and carries no integrity hashes. DEP001 still fires for a BOM-only Maven project; a recognized BOM lowers it High→Medium with a recommendation that names the BOM. Exact BOM-severity wording is deferred until a BOM fixture lands; `java-maven` has no BOM, so `maven_no_lockfile_is_dep001` is unaffected. *Gates Slice 5.* +5. **`assert_cmd` dev-dependency — RESOLVED: no.** The ~5 CLI tests (§8.11) need hand-written `HOME` isolation regardless, and `CARGO_BIN_EXE_corgea` already supplies the binary path with zero deps. Plain `std::process::Command` stays. Revisit only if CLI tests exceed ~20. *Gates Slice 0.* +6. **YAML crate — RESOLVED: `serde_yaml_ng`, confirmed.** `serde_yaml` is archived; `serde_yaml_ng` is the conservative drop-in continuation. `serde_yml` is explicitly rejected (provenance / maintenance controversy). Verify its maintenance status is still current at adoption time (§4.3). *Gates Slice 0.* +7. **MVP scope — RESOLVED: keep DEP010, defer DEP016/DEP017 and the Go graph.** DEP010 ("vulnerable resolved package") is the center of gravity of an SCA tool; it stays in the MVP behind a **mocked vulnerability source** — new **Slice 8** (§8.13). DEP016 (license) and DEP017 (registry) remain deferred (config-heavy, secondary); Go keeps detection-only smoke coverage, with the full graph in Phase 2. §2.1 is updated accordingly. *Gates Slice 1 ordering and adds Slice 8.* + +--- + +## 14. Design review record + +Revision 2 incorporates a second-opinion review of revision 1. The substantive changes and why: + +| Review finding | Change | +|---|---| +| `Policy::default()` was `unimplemented!()`, so `scan()` tests would panic in policy construction, not in the behavior under test | `Policy` has a **real** `Default` (§5.2). Stubs are now leaf-only (§3.1). 
| +| Revision 1 claimed "no new dependencies" but `Policy::from_yaml` needs YAML parsing, and the crate has no YAML parser | Added `serde_yaml_ng` as the one required new dependency (§4.3). | +| Package lookup by bare name is ambiguous across Maven groups and duplicate versions | Typed `PackageId` (purl) is the model identity; `node()` / `nodes_named()` / `node_by_id()` (§5.2). | +| §8 froze exact severities for cases §12 itself flagged as unresolved (SNAPSHOT, `>=`) | Decision-gated assertions (§3.4); those tests assert only the stable invariant. | +| One 52-test red batch loses the TDD design loop | Work re-sequenced into vertical slices (§9); the document stays the full spec. | +| CLI tests would touch the developer's real `~/.corgea/config.toml` | CLI tests isolate `HOME` to a temp dir and assert `deps scan` needs no token (§8.11). | +| No coverage for malformed input, determinism, monorepos, vendored-dir skipping | New robustness slice & tests (§6.8, §8.12, Slice 7). | +| `pub` froze the internal model | Model fields are `pub(crate)`; tests query via accessor methods (§5.2). | + +--- + +## 15. Summary + +This plan makes `PRD_DEPS.md` executable. It defines: + +- a **narrow stub API** (§5) — real constructors, `unimplemented!()` only on the leaf behavior a test targets — so tests fail at the function under test; +- **typed `PackageId`** identity, so findings and graph queries are unambiguous; +- **eight fixture projects** across Python, Node.js, and Java (§6), each chosen so every MVP finding fires somewhere and stays silent somewhere else; +- **~64 tests** (§8) traceable to specific PRD requirements (§7), with positive/negative pairs across eight vertical slices, including DEP010 behind a mocked vulnerability source; +- a **vertical-slice sequence** (§9) that preserves the red→green→refactor loop, keeps `main` green, and de-risks the deepest work early. 
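
The "narrow stub API" in the first bullet can be sketched roughly as below. This is a minimal illustration, not the plan's actual §5 code. `Policy`, `ConstraintKind::Unbounded`, and `classify_constraint` are named in this document, but the `fail_on_high` field, the other enum variants, the function signature, and the exact panic message are assumptions made for the example.

```rust
/// Hypothetical policy shape; the real fields are defined in the plan's §5.2.
#[derive(Debug, Clone, PartialEq)]
pub struct Policy {
    /// Assumed field, for illustration only.
    pub fail_on_high: bool,
}

impl Default for Policy {
    // A real constructor: building a Policy never panics, so a test that
    // needs a Policy reaches the behavior it actually targets.
    fn default() -> Self {
        Policy { fail_on_high: true }
    }
}

/// `Unbounded` appears in this plan; the other variants are assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum ConstraintKind {
    Exact,
    BoundedRange,
    Unbounded,
}

/// Leaf stub: the only place that panics. A red test therefore names this
/// function in its panic message rather than a constructor.
/// The signature is an assumption for this sketch.
pub fn classify_constraint(_ecosystem: &str, _raw: &str) -> ConstraintKind {
    unimplemented!("deps::ecosystems::classify_constraint (see §8.2)")
}
```

With this shape, a Slice 1 test that calls `classify_constraint("npm", "^1.2.3")` fails inside the stub with a "not implemented" panic, while any setup that builds `Policy::default()` succeeds.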
+ +The contract is unchanged: **the test is written first, observed failing, and the feature is "done" only when its test — and no test it should not have touched — is green.** diff --git a/PRD_USER_PERSPECTIVE.md b/PRD_USER_PERSPECTIVE.md new file mode 100644 index 0000000..78640d7 --- /dev/null +++ b/PRD_USER_PERSPECTIVE.md @@ -0,0 +1,55 @@ +# PRD: `corgea deps` + +**Format:** One-page bet +**Replaces:** the 21-section spec until we have evidence + +## The bet + +Developers will adopt a dependency tool that answers "why is this package here and can it drift" faster than one that hands them another CVE list. We can prove this in three weeks using a CLI we already ship to paying customers. + +## The user + +One person: the AppSec engineer at a company already paying for Corgea. They own dependency risk across many repos. They feel the pain ("where is package X, why is it here") and they can say yes. Not "developers, AppSec, platform, compliance." That is nobody. + +## The problem, in their words + +"I know we have a vulnerable package somewhere. I cannot tell you which repos, which version, why it is there, or whether the build will pull a different version tomorrow." + +## The riskiest assumption + +Developers will trust the findings instead of disabling the tool. The PRD lists this as Risk #1. It is not a risk. It is the whole question. Everything else is downstream of it. + +## The experiment + +Build the smallest thing that tests trust: + +- `corgea deps scan` and `corgea deps explain`, npm only, four deterministic findings. +- Three weeks. Inside the existing CLI. +- Hand-deliver it to ten existing customers. Do things that do not scale: DM them, watch them run it, take notes. + +## The demo that sells it + +"You ran `npm install` and got 1,400 packages. Which one pulled in `event-stream`?" + +`corgea deps explain event-stream` + +`root > a > b > event-stream`. Ten seconds. No competitor makes provenance one command. 
+ +## The one metric + +Of developers who run `deps scan` once, how many run it again within seven days. Retention and word of mouth. Not repos scanned, not SBOMs generated. + +## The unfair advantage + +This ships inside a CLI that paying customers already run. Distribution is free. Treat Corgea integration as item zero, not FR10. Add `deps scan` to the next release and email the twenty biggest accounts by hand. + +## Not now + +`diff`, `sbom`, policy-as-code, `fix`, license and registry findings, the vulnerable-package finding, Go, Java, Python, the platform dashboard, the three-phase launch plan. All of it waits for evidence the wedge works. SBOM and license checks are parity features; every competitor has them and they win nobody. + +## Kill or double down + +After three weeks with ten customers: + +- **Double down** if developers run `explain` again, fix findings, and tell teammates. Then build `diff` and the next ecosystem. +- **Stop and fix** if they run it once and ignore it. The findings are not trusted. Fix that before adding anything. diff --git a/PRD_V0.md b/PRD_V0.md new file mode 100644 index 0000000..f5ae28f --- /dev/null +++ b/PRD_V0.md @@ -0,0 +1,66 @@ +# PRD V0: `corgea deps` (npm) + +**Status:** Build spec · **Scope:** npm only · **Parent:** PRD_DEPS_CONDENSED.md + +## What V0 is + +V0 is the smallest slice of `corgea deps` that proves the wedge: Corgea can tell a developer why a dependency exists, whether it can drift, and whether it matters. It ships inside the existing Corgea CLI. + +V0 is not the product in PRD_DEPS_CONDENSED.md. It builds the load-bearing 10 percent, the dependency graph and the `explain` command. Everything else waits for evidence that this slice gets used. + +## The one goal + +Prove that developers trust the findings and come back to `explain`. If they do, build the rest. If they do not, fix trust before adding scope. + +## Build scope + +**1. npm only.** Parse `package.json` and `package-lock.json`. 
One ecosystem, one clean lockfile model. Confirm npm is right with the first design partners. If they run mostly Python, start there instead. Pick one. + +**2. Dependency graph.** Build nodes (packages) and edges (relationships) from the manifest and lockfile. Model declared intent and resolved reality separately (see Correctness). Preserve full paths, not flat lists. Each node carries name, version, purl, direct or transitive, scope, and source type. + +**3. `corgea deps scan`.** Detect npm files, build the graph, report the four findings below. Terminal table by default. + +**4. `corgea deps explain `.** The signature command. Show identity, direct or transitive, scope, the full path (`root > express@4.18.2 > qs@6.11.0`), declared constraint against resolved version, source file, and lockfile entry. + +**5. Output and upload.** Terminal table, JSON (`--out-format json`, `--out-file`), and `--upload` to Corgea so org-level inventory has data. + +## Findings + +Four deterministic findings. No network, no vuln database, near-zero false positives. + +| Code | Finding | Severity | +|---|---|---| +| DEP001 | Missing lockfile | High | +| DEP002 | Stale lockfile (manifest changed after lockfile) | High | +| DEP004 | Wildcard or `latest` direct dependency | High | +| DEP005 | Mutable Git branch dependency | High | + +Every finding states the source file, the reason, and the exact fix. + +## Correctness model + +The load-bearing engineering. Model three layers: + +- **Declared intent.** What the manifest allows (`"axios": "^1.8.0"`). +- **Resolved reality.** What the lockfile installed (`axios 1.8.2`). +- **Effective risk.** A range plus a committed lockfile is reproducible. Never report it as a missing lockfile. + +A transitive package with a broad range that the lockfile resolves is not a finding. Flag a range only when no lockfile resolves it. Get this wrong and developers uninstall the tool. 
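
The three layers above can be sketched in Rust as follows. This is a minimal illustration under stated assumptions, not the V0 implementation: the names `Dependency`, `EffectiveRisk`, `effective_risk`, and `should_flag` are hypothetical, and the sketch encodes only the rule stated in this section, that a range is flagged when no lockfile entry resolves it. Wildcard and `latest` handling (DEP004) is deliberately left out.

```rust
/// One dependency seen through the first two layers (hypothetical type).
#[derive(Debug, Clone)]
pub struct Dependency {
    pub name: String,
    /// Declared intent: the range the manifest allows, e.g. "^1.8.0".
    pub declared: String,
    /// Resolved reality: the version the lockfile pins, if any, e.g. "1.8.2".
    pub resolved: Option<String>,
}

/// Third layer: effective risk, derived from the other two.
#[derive(Debug, Clone, PartialEq)]
pub enum EffectiveRisk {
    /// A lockfile entry pins the range, so installs are reproducible.
    Reproducible,
    /// No lockfile entry resolves the range, so the next install may differ.
    CanDrift,
}

/// Derive effective risk from declared intent and resolved reality.
/// Assumes `declared` is a range; exact pins are out of scope here.
pub fn effective_risk(dep: &Dependency) -> EffectiveRisk {
    match dep.resolved {
        Some(_) => EffectiveRisk::Reproducible,
        None => EffectiveRisk::CanDrift,
    }
}

/// A range is a finding only when nothing resolves it.
pub fn should_flag(dep: &Dependency) -> bool {
    effective_risk(dep) == EffectiveRisk::CanDrift
}
```

For `axios` declared as `^1.8.0` and resolved to `1.8.2`, `should_flag` returns `false`. The same declaration with no lockfile entry returns `true`.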
+ +## Out of scope for V0 + +Cut and deferred: `diff`, `sbom`, `policy init`, `fix`, exceptions, SARIF and HTML output, Go, Java, Python and other ecosystems, the vulnerable-package finding (Corgea SCA already covers it), license and registry findings, and the platform UI buildout. + +## Success criteria + +- Of developers who run `deps scan` once, the share who run it again within 7 days. +- Of developers who hit a finding, the share who fix it. +- `explain` invocations per active user. + +Vanity counts (repos scanned, SBOMs generated) do not apply to V0. + +## Open questions blocking V0 + +1. npm or Python first? Decide with the first three design partners. +2. Stale-lockfile detection: compare file mtimes, git history, or a manifest hash recorded in the lockfile? +3. Does `--upload` reuse the existing CLI scan-upload path, or need a new endpoint? diff --git a/SECURITY.md b/SECURITY.md new file mode 100644 index 0000000..0de5276 --- /dev/null +++ b/SECURITY.md @@ -0,0 +1,48 @@ +# Security Policy + +The Corgea CLI is a security tool, and we take the security of the CLI itself seriously. Thank you for helping keep it and its users safe. + +## Reporting a vulnerability + +**Please do not report security vulnerabilities through public GitHub issues, discussions, or pull requests.** + +Report privately through one of these channels: + +1. **GitHub private vulnerability reporting** (preferred) — go to the [Security tab](https://github.com/Corgea/cli/security) and click **Report a vulnerability**. This keeps the report private and tracked. +2. **Email** — send details to `adam@corgea.com`. + + +Please include as much of the following as you can: + +- The type of issue (e.g. credential leakage, path traversal, insecure TLS handling, dependency vulnerability). +- Affected version(s) — the output of `corgea --version`. +- Affected source files and a description of the impact. +- Step-by-step instructions to reproduce, including any proof-of-concept. 
+- Whether the issue is exploitable in a default configuration. + +## What to expect + +- **Acknowledgement** — we aim to confirm receipt within 3 business days. +- **Assessment** — we will investigate and tell you whether the report is accepted, along with our severity assessment. +- **Updates** — we will keep you informed as we work on a fix. +- **Disclosure** — we ask that you keep the report confidential until a fix is released. We are happy to credit you in the release notes unless you prefer to remain anonymous. + +## Supported versions + +Security fixes are released against the latest published version of the CLI. Please upgrade to the newest release (via npm, pip, or the GitHub Releases page) before reporting an issue, and verify it still reproduces. + +## Scope + +In scope: + +- The CLI source in this repository and its release artifacts (native binaries, the npm package `@corgea/cli`, the pip package `corgea-cli`). +- Handling of credentials and tokens (`~/.corgea/config.toml`, the `CORGEA_TOKEN` environment variable). +- TLS and proxy handling, and the local OAuth callback server used by `corgea login`. + +Out of scope: + +- The Corgea platform / backend API — report those through the channels above as well, noting the distinction; they are handled by a separate team. +- Findings produced by a scan (those are product output, not CLI vulnerabilities). +- Vulnerabilities requiring a compromised local machine or physical access. + +Thank you for practicing responsible disclosure. diff --git a/pyproject.toml b/pyproject.toml index 195c1e6..3d48a38 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -21,7 +21,7 @@ classifiers = [ "Programming Language :: Rust", ] authors = [ - { name = "Adam Bronte", email = "adam@corgea.com" } + { name = "Corgea Security", email = "security@corgea.com" } ] dynamic = ["version"]