feat(poetry): extract SHA-256 hashes from poetry.lock for SBOM generation #534

Open. a-oren wants to merge 1 commit into guacsec:main from a-oren:TC-4333

Conversation

a-oren (Contributor) commented May 20, 2026

Summary

  • Extract SHA-256 artifact hashes from poetry.lock files[].hash entries in the poetry provider
  • Extend _extractMarkerData() to build a hash map per package, preferring sdist (.tar.gz) hashes
  • Hashes flow through the existing base_pyproject pipeline into CycloneDX SBOM components
  • Update all 4 poetry golden SBOM fixture sets with hash data

Implements TC-4333

Test plan

  • Verify all poetry SBOM golden file tests pass (SBOM_CASES pattern)
  • Verify poetry marker filtering tests still pass
  • Verify no regression in other provider tests
  • Verify generated CycloneDX components include hashes array with SHA-256 entries
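
The last check in the test plan could be scripted roughly as follows. This is a minimal sketch, not code from the PR: `componentsMissingSha256` is a hypothetical helper, and it assumes the SBOM follows the CycloneDX JSON layout in which each component may carry a `hashes` array of `{ alg, content }` objects, with `content` holding the bare hex digest.

```javascript
// Hypothetical helper (not part of the PR): lists the names of components
// that do not carry a well-formed SHA-256 entry in their `hashes` array.
function componentsMissingSha256(sbom) {
  // A SHA-256 digest is 64 hexadecimal characters, with no "sha256:" prefix.
  const isSha256 = (h) =>
    Boolean(h) && h.alg === 'SHA-256' && /^[0-9a-f]{64}$/i.test(h.content ?? '');

  return (sbom.components ?? [])
    .filter((c) => !(Array.isArray(c.hashes) && c.hashes.some(isSha256)))
    .map((c) => c.name);
}
```

Since hashes are only attached when the lockfile provides one, a non-empty result is not necessarily a failure; it is a list of components to inspect by hand.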

🤖 Generated with Claude Code

Summary by Sourcery

Add support for extracting SHA-256 hashes from poetry.lock and propagating them into generated SBOM components.

New Features:

  • Capture SHA-256 hashes from poetry.lock file entries and attach them to dependency graph nodes for SBOM generation.

Enhancements:

  • Extend poetry marker extraction to also build a per-package hash map, preferring sdist hashes when available.

Tests:

  • Update poetry SBOM golden fixtures to include expected hash data in component and stack SBOMs.

…tion

Extend _extractMarkerData() to extract files[].hash entries per package
from poetry.lock and populate graph entries with SHA-256 hashes. Prefers
sdist (.tar.gz) hash over wheel hashes. The hashes flow through the
existing base_pyproject pipeline into CycloneDX SBOM components.

Implements TC-4333

Assisted-by: Claude Code

sourcery-ai Bot commented May 20, 2026

Reviewer's Guide

Extend the Python Poetry provider to extract SHA-256 hashes from poetry.lock file entries, attach them to the dependency graph, and propagate them into generated CycloneDX SBOM fixtures for poetry-based projects.

File-Level Changes

Change: Capture SHA-256 hashes from poetry.lock and attach them to the internal dependency graph entries produced by the Poetry provider.
Details:
  • Extend _extractMarkerData to build and return a hashMap keyed by canonical package coordinates alongside existing marker maps.
  • Introduce _extractSha256FromFiles helper to choose an sdist (.tar.gz) hash when available, otherwise the first file entry, and normalize sha256: prefixed hashes to bare hex strings.
  • Update _parsePoetryTree to thread the new hashMap through and populate an optional hashes array on each graph node when a hash is available.
Files:
  • src/providers/python_poetry.js

Change: Update Poetry SBOM golden fixtures to assert presence of SHA-256 hashes on CycloneDX components.
Details:
  • Regenerate expected_component_sbom.json fixtures for all Poetry scenarios to include hashes arrays on relevant components.
  • Regenerate expected_stack_sbom.json fixtures for all Poetry scenarios so that stack-level SBOMs reflect the new hash metadata.
Files:
  • test/providers/tst_manifests/pyproject/poetry_dev_deps/expected_component_sbom.json
  • test/providers/tst_manifests/pyproject/poetry_dev_deps/expected_stack_sbom.json
  • test/providers/tst_manifests/pyproject/poetry_legacy_dev_deps/expected_component_sbom.json
  • test/providers/tst_manifests/pyproject/poetry_legacy_dev_deps/expected_stack_sbom.json
  • test/providers/tst_manifests/pyproject/poetry_lock/expected_component_sbom.json
  • test/providers/tst_manifests/pyproject/poetry_lock/expected_stack_sbom.json
  • test/providers/tst_manifests/pyproject/poetry_only_deps/expected_component_sbom.json
  • test/providers/tst_manifests/pyproject/poetry_only_deps/expected_stack_sbom.json
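
For readers without the diff at hand, the selection rule described for _extractSha256FromFiles might look roughly like the sketch below. It is an illustration written from the description above, not the code in the PR. The function names, the `name@version` key format of `hashMap`, and the shape of the `hashes` array on the node are all assumptions.

```javascript
// Illustrative sketch of the rule described in the guide: prefer the sdist
// (.tar.gz) entry, otherwise take the first entry, then strip "sha256:".
function extractSha256FromFiles(files) {
  if (!Array.isArray(files) || files.length === 0) return undefined;
  const sdist = files.find((f) => f.file.endsWith('.tar.gz'));
  const chosen = sdist ?? files[0];
  return chosen.hash.replace(/^sha256:/, '');
}

// Hypothetical node builder: `hashes` is added only when a hash is known.
// The "name@version" key and the array contents are assumptions.
function buildNode(name, version, hashMap) {
  const node = { name, version };
  const sha256 = hashMap.get(`${name}@${version}`);
  if (sha256) node.hashes = [sha256];
  return node;
}
```

Keeping `hashes` optional means packages without a files entry in the lockfile produce the same node as before the change.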

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high-level feedback:

  • _extractMarkerData() is now responsible for both marker extraction and hash extraction, which makes the name and responsibility misleading; consider either renaming it or splitting the hash handling into a separate helper to keep concerns clear.
  • _extractSha256FromFiles() assumes each files[] entry has a truthy file and hash field; it may be safer to guard against malformed entries (e.g., missing file/hash) before accessing endsWith/hash to avoid runtime errors on unexpected lockfile formats.
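
One way to act on the second point is to filter out malformed entries before applying the sdist preference. The sketch below is hypothetical: `safeExtractSha256` is an illustrative name and the code is not taken from the PR. It assumes each usable `files[]` entry has string `file` and `hash` fields.

```javascript
// Hypothetical guarded variant: ignores entries that lack a string `file`
// or a non-empty string `hash`, so malformed lockfile data cannot throw.
function safeExtractSha256(files) {
  if (!Array.isArray(files)) return undefined;

  const usable = files.filter(
    (f) => f && typeof f.file === 'string' && typeof f.hash === 'string' && f.hash.length > 0
  );
  if (usable.length === 0) return undefined;

  // Prefer the sdist (.tar.gz) entry, otherwise the first usable entry.
  const chosen = usable.find((f) => f.file.endsWith('.tar.gz')) ?? usable[0];
  return chosen.hash.replace(/^sha256:/, '');
}
```

Returning `undefined` for unusable input lets the caller skip the `hashes` field instead of failing the whole SBOM run.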

a-oren (Contributor, Author) commented May 20, 2026

Verification Report for TC-4333 (commit 42d49e6)

| Check | Result | Details |
| --- | --- | --- |
| Review Feedback | PASS | 1 bot review (sourcery-ai) with general suggestions; no inline code change requests |
| Root-Cause Investigation | N/A | No sub-tasks created |
| Scope Containment | PASS | All 9 changed files match task scope (1 source + 8 golden fixtures) |
| Diff Size | PASS | 789+/508- across 9 files, proportional to 1 source file + 8 JSON fixture updates |
| Commit Traceability | PASS | Commit references "Implements TC-4333" |
| Sensitive Patterns | PASS | No matches found |
| CI Status | PASS | All 5 checks pass (lint/test Node 22, lint/test Node 24, Sourcery, PR title, commit messages) |
| Acceptance Criteria | PASS | 4/4 criteria met |
| Test Quality | N/A | No test source files in PR diff (only JSON fixture data) |
| Verification Commands | N/A | No verification commands in task |

Overall: PASS

All checks pass. Implementation correctly extracts SHA-256 hashes from poetry.lock files[].hash entries, attaches them to graph entries in _parsePoetryTree(), and updates all 8 golden SBOM fixtures with correct hash data. CI confirms all tests pass.


This comment was AI-generated by sdlc-workflow/verify-pr v0.5.11.

@a-oren a-oren requested a review from ruromero May 20, 2026 13:25