[release-1.10] perf(scripts): speed up RHDH operator install using install-rhdh-catalog-source.sh (~30 min → ~2 min)#2913

Open
openshift-cherrypick-robot wants to merge 6 commits into redhat-developer:release-1.10 from openshift-cherrypick-robot:cherry-pick-2870-to-release-1.10

Conversation

@openshift-cherrypick-robot

This is an automated cherry-pick of #2870

/assign rm3l

subhashkhileri and others added 6 commits May 28, 2026 10:29
- Skips slow `skopeo inspect` (~42s/bundle) — attempts the copy directly
  instead; failed copies (~3s) are faster than successful inspects
- Processes bundles in parallel up to MAX_PARALLEL (default 10), with a
  portable kill-0 throttle loop that prunes finished PIDs each iteration
- Collects per-worker sed files and applies them in one pass after all
  bundles complete, avoiding concurrent writes to render.yaml
- Runs `opm render` and cluster registry setup in parallel since they
  are independent; waits before the bundle-processing phase begins
- Replaces check-then-delete secret pattern with --ignore-not-found
- Deletes existing CatalogSource before recreating to force OLM re-index
  when the tag is unchanged but the digest has changed (rebuilt IIB)
- Fails loudly if any bundle fails to process (was: silent error log)

Assisted-by: Claude Code
Co-Authored-By: Claude Code <noreply@anthropic.com>
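
The throttle loop and the per-worker sed files described in this commit message can be sketched as follows. This is a minimal illustration and not the code from the PR: `process_bundle`, the bundle names, and `mirror.example` are hypothetical placeholders, and the real worker does the image copy and push instead of a `printf`.

```shell
#!/usr/bin/env bash
# Sketch only: process_bundle, the bundle names and mirror.example are hypothetical.
MAX_PARALLEL="${MAX_PARALLEL:-10}"
workdir="$(mktemp -d)"
printf 'image: bundle-a\nimage: bundle-b\nimage: bundle-c\n' > "$workdir/render.yaml"

process_bundle() {
  # Stand-in for the real per-bundle work; writes one sed rule to its own file.
  printf 's|%s|mirror.example/%s|\n' "$1" "$1" > "$workdir/$1.sed"
}

running=()   # PIDs believed to be still running
all=()       # every PID started, kept for the final exit-status check
for bundle in bundle-a bundle-b bundle-c; do
  while [ "${#running[@]}" -ge "$MAX_PARALLEL" ]; do
    alive=()
    for pid in "${running[@]}"; do
      # kill -0 sends no signal; it only tests whether the PID still exists.
      if kill -0 "$pid" 2>/dev/null; then alive+=("$pid"); fi
    done
    running=("${alive[@]}")
    if [ "${#running[@]}" -ge "$MAX_PARALLEL" ]; then sleep 1; fi
  done
  process_bundle "$bundle" &
  running+=("$!")
  all+=("$!")
done

# Fail loudly if any worker failed.
failed=0
for pid in "${all[@]}"; do
  wait "$pid" || failed=$((failed + 1))
done
if [ "$failed" -gt 0 ]; then
  echo "ERROR: $failed bundle(s) failed to process" >&2
  exit 1
fi

# Apply all collected rules in one pass, after every worker has finished.
cat "$workdir"/*.sed > "$workdir/all.sed"
sed -f "$workdir/all.sed" "$workdir/render.yaml" > "$workdir/render.new"
mv "$workdir/render.new" "$workdir/render.yaml"
```

Because each worker writes only its own `.sed` file, the shared `render.yaml` is modified once, by a single process.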
…opeo stderr

Background subshells don't inherit set -e from the parent, so
intermediate failures (umoci, skopeo push) went undetected and the
worker would write a .sed entry for a broken bundle. Also redirect
speculative copy stderr to a per-bundle file instead of /dev/null
so auth failures, timeouts, and disk errors are debuggable.

Assisted-by: Claude Code
Co-Authored-By: Claude Code <noreply@anthropic.com>
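
One way to get the behaviour this commit describes is to enable `set -e` explicitly inside each background worker and to send the speculative step's stderr to a per-bundle file. The sketch below assumes that approach; `worker`, `good_step`, `bad_step`, and the file names are hypothetical and stand in for the real copy and push commands.

```shell
#!/usr/bin/env bash
# Sketch only: worker, good_step, bad_step and the file names are hypothetical.
outdir="$(mktemp -d)"

worker() {
  local name="$1" step="$2"
  (
    set -e   # make failures fatal inside this subshell, whatever the parent does
    # Keep stderr in a per-bundle file instead of discarding it.
    "$step" 2> "$outdir/$name.err"
    # Reached only if the step above succeeded.
    echo "s|$name|ok|" > "$outdir/$name.sed"
  ) &
}

good_step() { true; }
bad_step()  { echo "unauthorized: authentication required" >&2; return 1; }

worker good good_step; good_pid=$!
worker bad  bad_step;  bad_pid=$!

good_rc=0; wait "$good_pid" || good_rc=$?
bad_rc=0;  wait "$bad_pid"  || bad_rc=$?
echo "good=$good_rc bad=$bad_rc"
```

The failed worker leaves no `.sed` entry behind, and its error text remains available in `bad.err` for debugging.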
Assisted-by: Claude Code
Co-Authored-By: Claude Code <noreply@anthropic.com>
Assisted-by: Claude Code
Co-Authored-By: Claude Code <noreply@anthropic.com>
- Validate MAX_PARALLEL is a positive integer, exit with clear error
  otherwise (prevents infinite hang with 0 or crash with non-numeric)
- Consolidate 3 separate trap EXIT calls into one — they were
  overwriting each other, so only the last one ran (pre-existing bug
  causing kubectl port-forward zombies and TMPDIR not being cleaned)
- Remove unused kanikoLogsPid variable

Assisted-by: Claude Code
Co-Authored-By: Claude Code <noreply@anthropic.com>
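
Both fixes in this commit can be illustrated with a short sketch. The function names `validate_max_parallel` and `cleanup` are hypothetical, and the `echo` lines stand in for the real cleanup actions (stopping the port-forward, removing the temporary directory).

```shell
#!/usr/bin/env bash
# Sketch only: validate_max_parallel and cleanup are hypothetical names.
validate_max_parallel() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
  esac
  [ "$1" -gt 0 ]               # digits only: reject zero
}

# A second `trap ... EXIT` replaces the first, so only the last handler runs.
overwritten="$(
  trap 'echo first' EXIT
  trap 'echo second' EXIT
)"

# One handler that performs every cleanup step.
consolidated="$(
  cleanup() {
    echo "stop port-forward"
    echo "remove tmpdir"
  }
  trap cleanup EXIT
)"

echo "overwritten traps ran:  $overwritten"
echo "consolidated trap ran:  $consolidated"
```

A script would typically call the validator once near the top, for example `validate_max_parallel "$MAX_PARALLEL" || { echo "MAX_PARALLEL must be a positive integer" >&2; exit 1; }`.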
…cess group

kill 0 sends SIGTERM to the entire process group including the parent
shell/CI harness, causing segfaults on normal exit. Use jobs -p to
target only this script's background jobs. Split INT/TERM from EXIT
to avoid re-entrant cleanup.

Assisted-by: Claude Code
Co-Authored-By: Claude Code <noreply@anthropic.com>
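
A minimal sketch of the pattern this commit describes, assuming a hypothetical `cleanup` function and a `sleep` job as the background work: `jobs -p` lists only the background jobs of the current shell, so the signal never reaches the parent shell or CI harness. The INT/TERM handlers here clear the EXIT trap before cleaning up, which is one way (not necessarily the PR's) to keep cleanup from running twice.

```shell
#!/usr/bin/env bash
# Sketch only: cleanup and the sleep job are hypothetical.
cleanup() {
  local pids
  pids="$(jobs -p)"   # PIDs of this shell's background jobs only
  if [ -n "$pids" ]; then
    # Unquoted on purpose: jobs -p prints one PID per line.
    kill $pids 2>/dev/null
  fi
  return 0
}

trap cleanup EXIT
# On a signal, drop the EXIT trap first so cleanup is not re-entered on exit.
trap 'trap - EXIT; cleanup; exit 130' INT
trap 'trap - EXIT; cleanup; exit 143' TERM

sleep 30 &
job_pid=$!

cleanup
job_rc=0
wait "$job_pid" 2>/dev/null || job_rc=$?
echo "background job exit status: $job_rc"
```

By contrast, `kill 0` targets every process in the caller's process group, which is what the commit message identifies as the problem.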
@rm3l (Member) left a comment

LGTM, but on hold until 1.10.0 is out.

/hold
