feat: add SHA-1 idempotency primitives for CoreFileObject #963

Open
minitriga wants to merge 16 commits into stable from feat/idempotent-file-ops

@minitriga minitriga commented Apr 22, 2026

Summary

Adds SHA-1 compare-and-skip primitives to InfrahubNode / InfrahubNodeSync so downstream libraries stop maintaining parallel copies of this logic. Four public additions:

  • sha1_of_source(source: bytes | Path | BinaryIO) -> str in infrahub_sdk.file_handler — streaming SHA-1 helper (64 KiB chunks; rewinds BinaryIO to original position). Canonical import path: from infrahub_sdk.file_handler import sha1_of_source.
  • matches_local_checksum(source) -> bool — atomic compare against the node's server-stored checksum attribute. No network, no mutation.
  • upload_if_changed(source, name=None) -> UploadResult — composes compare + upload_from_* + save(). Returns a frozen UploadResult(uploaded, checksum) dataclass. Skips the transfer when the local SHA-1 matches.
  • download_file(..., skip_if_unchanged=True) — short-circuits the download when dest exists on disk with a matching SHA-1. Returns 0 bytes written on skip.

The node methods are implemented on both InfrahubNode (async) and InfrahubNodeSync (sync), with full async↔sync symmetry.
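
For readers who have not seen the pattern, the streaming helper described in the first bullet could look roughly like the sketch below. This is not the SDK's implementation: the name `sha1_of_source_sketch` and the `CHUNK_SIZE` constant are illustrative, and hashing a stream from its current position (rather than from byte 0) is an assumption. Only the 64 KiB chunking and the rewind-to-original-position behaviour come from the description above.

```python
import hashlib
import io
from pathlib import Path
from typing import BinaryIO, Union

CHUNK_SIZE = 64 * 1024  # 64 KiB, the chunk size stated in the summary


def sha1_of_source_sketch(source: Union[bytes, Path, BinaryIO]) -> str:
    """Return the hex SHA-1 of bytes, a file path, or a seekable binary stream."""
    digest = hashlib.sha1(usedforsecurity=False)

    if isinstance(source, bytes):
        digest.update(source)
        return digest.hexdigest()

    if isinstance(source, Path):
        # Read the file in fixed-size chunks so large files never sit fully in memory.
        with source.open("rb") as handle:
            for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # Binary stream: hash from the current position (assumption), then
    # restore that position so the caller can still upload the same bytes.
    start = source.tell()
    try:
        for chunk in iter(lambda: source.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    finally:
        source.seek(start)
    return digest.hexdigest()


# Usage: the stream position is unchanged after hashing.
stream = io.BytesIO(b"hello world")
digest = sha1_of_source_sketch(stream)
position_after = stream.tell()
```

Restoring the position in a `finally` block means a read error part-way through does not leave the stream at an unexpected offset.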

Motivation

Both opsmill/nornir-infrahub#71 and opsmill/infrahub-ansible#317 shipped near-identical hashlib.sha1(...).hexdigest() + compare logic. Centralising it in the SDK gives one source of truth and — because nornir-infrahub already extended idempotency to the download side — brings that capability to the Ansible collection too.

Usage — nornir-infrahub example

Concretely, the nornir-infrahub tasks in nornir_infrahub/plugins/tasks/file_object.py collapse from ~30 lines of hand-rolled idempotency into a single method call per operation.

Upload

Before (today, on nornir-infrahub main):

import hashlib

def _sha1(data: bytes) -> str:
    return hashlib.sha1(data, usedforsecurity=False).hexdigest()

# ... inside upload_file_object ...
existing_obj = _lookup_existing_object(...)
if existing_obj:
    if _sha1(payload) == existing_obj.checksum.value:
        return Result(changed=False, result="already up to date (checksum match)")
    for attr_name, attr_value in data.items():
        if attr_name in existing_obj._schema.attribute_names:
            setattr(existing_obj, attr_name, attr_value)
    if path is not None:
        existing_obj.upload_from_path(path)
    else:
        existing_obj.upload_from_bytes(content=payload, name=upload_name)
    existing_obj.save()
    return Result(changed=True, result="updated (checksum changed)")

After (with this PR):

from infrahub_sdk.node import UploadResult

# ... inside upload_file_object ...
existing_obj = _lookup_existing_object(...)
if existing_obj:
    for attr_name, attr_value in data.items():
        if attr_name in existing_obj._schema.attribute_names:
            setattr(existing_obj, attr_name, attr_value)
    outcome: UploadResult = existing_obj.upload_if_changed(
        source=path if path is not None else payload,
        name=upload_name,
    )
    return Result(
        changed=outcome.uploaded,
        result=(
            "updated (checksum changed)" if outcome.uploaded
            else "already up to date (checksum match)"
        ),
    )

No _sha1 helper, no hashlib import, no two-branch upload_from_path / upload_from_bytes dispatch — the SDK handles the compare, the staging, and the save in one call. The returned UploadResult.uploaded maps directly to Nornir's Result(changed=...).
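
To make the compare, stage, and save sequence concrete, here is a minimal sketch of how such a method might compose those steps. It runs against a hypothetical in-memory stand-in (`FakeFileNode`), not the real InfrahubNode, and `UploadResultSketch` and `upload_if_changed_sketch` are illustrative names. The stand-in's `save()` simply records the SHA-1 of the staged bytes, mimicking a server that stores the checksum of what it received.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UploadResultSketch:
    uploaded: bool
    checksum: str


class FakeFileNode:
    """Hypothetical stand-in for a CoreFileObject node; not the SDK class."""

    def __init__(self, server_checksum: Optional[str] = None) -> None:
        self.server_checksum = server_checksum
        self.save_calls = 0
        self._staged: Optional[bytes] = None

    def upload_from_bytes(self, content: bytes, name: str) -> None:
        # Stage only; nothing is "sent" until save() is called.
        self._staged = content

    def save(self) -> None:
        # Mimic the server storing sha1(received_bytes).
        if self._staged is not None:
            self.server_checksum = hashlib.sha1(
                self._staged, usedforsecurity=False
            ).hexdigest()
            self._staged = None
        self.save_calls += 1


def upload_if_changed_sketch(
    node: FakeFileNode, payload: bytes, name: str
) -> UploadResultSketch:
    # Hash once and reuse the digest for both the compare and the result.
    local = hashlib.sha1(payload, usedforsecurity=False).hexdigest()
    if local == node.server_checksum:
        return UploadResultSketch(uploaded=False, checksum=local)
    node.upload_from_bytes(content=payload, name=name)
    node.save()
    return UploadResultSketch(uploaded=True, checksum=local)


node = FakeFileNode()
first = upload_if_changed_sketch(node, b"config v1", "device.cfg")
second = upload_if_changed_sketch(node, b"config v1", "device.cfg")
```

In this sketch the first call uploads and the second is skipped, so `save()` runs exactly once. A consequence worth noting for callers: on the skip path nothing is saved, so attribute changes made before the call would remain unsaved in this simplified model.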

Download

Before (today):

server_checksum = obj.checksum.value
changed = False
if resolved_save_to is not None and resolved_save_to.exists() and resolved_save_to.is_file():
    local_checksum = _sha1(resolved_save_to.read_bytes())
    if local_checksum == server_checksum:
        content: bytes = resolved_save_to.read_bytes()
    else:
        content = obj.download_file()  # type: ignore[assignment]
        resolved_save_to.parent.mkdir(parents=True, exist_ok=True)
        resolved_save_to.write_bytes(content)
        changed = True
else:
    content = obj.download_file()  # type: ignore[assignment]
    if resolved_save_to is not None:
        resolved_save_to.parent.mkdir(parents=True, exist_ok=True)
        resolved_save_to.write_bytes(content)
        changed = True

After:

if resolved_save_to is not None:
    resolved_save_to.parent.mkdir(parents=True, exist_ok=True)
    bytes_written = obj.download_file(dest=resolved_save_to, skip_if_unchanged=True)
    changed = bytes_written > 0  # 0 means the local file already matched
    content = resolved_save_to.read_bytes()
else:
    content = obj.download_file()
    changed = False

Two branches collapse into one, the manual SHA-1 compare and file-exists handling move into the SDK, and the changed signal falls out of the return value — no more tracking it imperatively across branches.

Notes

  • upload_if_changed computes sha1_of_source(source) once up front and reuses it for both the compare and the returned UploadResult.checksum. This is the semantically correct value (the server stores sha1(received_bytes), which equals the local digest), and it avoids double-hashing or reading a BinaryIO twice.
  • download_file(skip_if_unchanged=True) validates the saved-node precondition before the skip-check, so an unsaved node with a coincidentally-matching checksum.value still raises ValueError rather than silently returning 0. See the fix commit in the branch and the accompanying .fixed.md changelog fragment.
  • Three new *_FEATURE_NOT_SUPPORTED_MESSAGE constants were added alongside the existing FILE_DOWNLOAD_FEATURE_NOT_SUPPORTED_MESSAGE, following the same convention.
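
The ordering described in the second note (precondition check first, skip-check second) can be illustrated with a small standalone function. This is a sketch under stated assumptions, not the SDK code: `download_file_sketch`, its parameters, and the `fetch` callable are all hypothetical, the local file is hashed in one read rather than streamed, and creating the parent directory inside the function is a simplification.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, Optional


def download_file_sketch(
    node_id: Optional[str],
    server_checksum: str,
    dest: Path,
    fetch: Callable[[], bytes],
    skip_if_unchanged: bool = False,
) -> int:
    """Return the number of bytes written, or 0 when the download is skipped."""
    # Precondition first: an unsaved node must fail even if the checksum matches.
    if node_id is None:
        raise ValueError("node must be saved before its file can be downloaded")

    if skip_if_unchanged and dest.is_file():
        local = hashlib.sha1(dest.read_bytes(), usedforsecurity=False).hexdigest()
        if local == server_checksum:
            return 0  # local copy already matches; no fetch needed

    content = fetch()
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(content)
    return len(content)


payload = b"interface eth0\n"
checksum = hashlib.sha1(payload, usedforsecurity=False).hexdigest()
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "out" / "device.cfg"
    first = download_file_sketch("node-1", checksum, target, lambda: payload, True)
    second = download_file_sketch("node-1", checksum, target, lambda: payload, True)
    try:
        # Same matching file on disk, but the node is unsaved: must still raise.
        download_file_sketch(None, checksum, target, lambda: payload, True)
        unsaved_raised = False
    except ValueError:
        unsaved_raised = True
```

If the two checks were swapped, the third call would return 0 because the file on disk matches, hiding the fact that the node was never saved.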

Follow-ups (not in this PR)

Once this lands and a release ships with the new lower bound:

  • [opsmill/nornir-infrahub] bump infrahub-sdk floor and replace local SHA-1 code in nornir_infrahub/plugins/tasks/file_object.py with the new methods (per the usage example above).
  • [opsmill/infrahub-ansible] bump infrahub-sdk floor and replace local SHA-1 code in plugins/module_utils/node.py (get_file_object_local_checksum, _update_object_with_file) with the new methods. Also gains download-side idempotency for free by wiring skip_if_unchanged=True into the object_file_fetch action.

Test plan

  • Unit tests pass for both InfrahubNode and InfrahubNodeSync (85 file-related tests across test_file_handler.py and test_file_object.py, up from ~41 baseline)
  • ruff check clean
  • mypy clean
  • Towncrier Added + Fixed fragments validate
  • Integration test against a live Infrahub instance (separate branch)

Changelog fragments

  • changelog/+idempotent-file-ops.added.md — describes the four new primitives
  • changelog/+idempotent-file-ops-unsaved-node.fixed.md — describes the unsaved-node guard in download_file(skip_if_unchanged=True)

minitriga and others added 15 commits April 22, 2026 14:14
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…o start

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a no-network, no-mutation primitive that callers can use to
compare a local bytes/Path/BinaryIO source against the server-stored
SHA-1 checksum on a CoreFileObject node, without triggering a transfer.
Also adds MATCHES_LOCAL_CHECKSUM_FEATURE_NOT_SUPPORTED_MESSAGE constant
and re-exports it from infrahub_sdk/node/__init__.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds InfrahubNode.upload_if_changed() which composes sha1_of_source,
upload_from_path/upload_from_bytes, and save() to perform uploads only
when local content differs from the server-side checksum. Returns an
UploadResult with the locally-computed digest as the post-upload checksum.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mirror the async InfrahubNode.upload_if_changed on the sync class,
extending TestUploadIfChanged to parametrize over both client types
(standard + sync) for all 6 test scenarios.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a `skip_if_unchanged: bool = False` kwarg to `InfrahubNode.download_file`.
When True and dest is provided, SHA-1 of the local file is compared against the
node's server checksum; a match returns 0 immediately without a network request.
Includes @overload signatures and 5 new parametrized tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ircuit

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirror the async InfrahubNode.download_file skip-if-unchanged logic on
InfrahubNodeSync, including overloads. Extend TestDownloadSkipIfUnchanged
to parametrize over both client types (53 total tests pass).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ment

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@minitriga minitriga requested a review from a team as a code owner April 22, 2026 15:01
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Apr 22, 2026

Deploying infrahub-sdk-python with Cloudflare Pages

Latest commit: 65f7c71
Status: ✅  Deploy successful!
Preview URL: https://86dab61c.infrahub-sdk-python.pages.dev
Branch Preview URL: https://feat-idempotent-file-ops.infrahub-sdk-python.pages.dev

@codecov

codecov Bot commented Apr 22, 2026

Codecov Report

❌ Patch coverage is 96.07843% with 4 lines in your changes missing coverage. Please review.

Files with missing lines     Patch %   Lines
infrahub_sdk/node/node.py    94.93%    0 Missing and 4 partials ⚠️
@@            Coverage Diff             @@
##           stable     #963      +/-   ##
==========================================
+ Coverage   80.77%   80.86%   +0.09%     
==========================================
  Files         132      119      -13     
  Lines       10999    10433     -566     
  Branches     1681     1578     -103     
==========================================
- Hits         8884     8437     -447     
+ Misses       1566     1471      -95     
+ Partials      549      525      -24     
Flag                  Coverage Δ
integration-tests     41.33% <0.00%> (-1.44%) ⬇️
python-3.10           52.12% <73.52%> (-1.45%) ⬇️
python-3.11           52.10% <73.52%> (-1.45%) ⬇️
python-3.12           52.12% <73.52%> (-1.45%) ⬇️
python-3.13           52.10% <73.52%> (-1.45%) ⬇️
python-3.14           53.82% <79.41%> (-1.33%) ⬇️
python-filler-3.12    23.99% <22.54%> (+1.39%) ⬆️

Flags with carried forward coverage won't be shown.

Files with missing lines          Coverage Δ
infrahub_sdk/file_handler.py      83.95% <100.00%> (+2.26%) ⬆️
infrahub_sdk/node/__init__.py     100.00% <100.00%> (ø)
infrahub_sdk/node/constants.py    100.00% <100.00%> (ø)
infrahub_sdk/node/node.py         87.17% <94.93%> (+0.96%) ⬆️

... and 14 files with indirect coverage changes
