Add hubs deployment ibu lane#79893
Conversation
WalkthroughThis PR introduces a new OpenShift CI test workflow for In-Band Update (IBU) testing on OCP 4.20. It adds a CI configuration that triggers the ChangesIBU Test Workflow and Infrastructure Setup
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels
🚥 Pre-merge checks | ✅ 15✅ Passed checks (15 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: shaior The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/pj-rehearse periodic-ci-openshift-kni-eco-ci-cd-main-cnf-ran-ibu-4.20-cnf-ran-ztp-tests |
|
@shaior: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
[REHEARSALNOTIFIER]
Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals. Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
ci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-ref.yaml (1)
17-48: 💤 Low valueHardcoded
kni-qe-108credential mounts are coupled to theSEED_CLUSTER_NAMEdefault.
SEED_CLUSTER_NAMEis overridable, but the credentialname/mount_pathentries hardcodekni-qe-108. OverridingSEED_CLUSTER_NAMEwithout matching credentials would silently mount the wrong host variables. Worth a brief comment noting these must be updated together, or document that the cluster name is effectively fixed for this step.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-ref.yaml` around lines 17 - 48, The SEED_CLUSTER_NAME default is overridable but the credentials entries hardcode "kni-qe-108" (e.g., names like telcov10n-ansible-kni-qe-108-master0, telcov10n-ansible-kni-qe-108-bastion and mount_path entries under /var/host_variables/kni-qe-108/*), so changing SEED_CLUSTER_NAME will leave mounts mismatched; fix by parameterizing those credential name and mount_path values to use SEED_CLUSTER_NAME (or the template variable used by the pipeline) instead of the literal "kni-qe-108" (or if parameterization isn’t practical, add a clear top-of-file comment next to SEED_CLUSTER_NAME stating that all credential `name` and `mount_path` entries (telcov10n-ansible-kni-qe-108-*, /var/host_variables/kni-qe-108/*, /var/group_variables/kni-qe-108/*) must be updated when SEED_CLUSTER_NAME is changed).ci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-commands.sh (1)
86-94: 💤 Low valueRemove the temporary SSH private key after use.
The key is materialized at
/tmp/temp_ssh_keyand never deleted. Even in an ephemeral CI container, cleaning it up immediately after the SSH call is cheap hygiene and avoids it lingering for any later step sharing the filesystem.🧹 Suggested cleanup
CLUSTER_VERSION=$(ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \ -i /tmp/temp_ssh_key "${BASTION_USER}@${BASTION_IP}" \ "KUBECONFIG=${HUB_KUBECONFIG} oc get clusterversion version -ojsonpath='{.status.desired.version}'") +rm -f /tmp/temp_ssh_key🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-commands.sh` around lines 86 - 94, The temporary SSH key file /tmp/temp_ssh_key is created and used by the ssh command that sets CLUSTER_VERSION but is never removed; after the ssh call (the command that populates CLUSTER_VERSION via KUBECONFIG=${HUB_KUBECONFIG} oc get clusterversion ...) remove the temp file (e.g., delete /tmp/temp_ssh_key) and ensure removal happens even on failure (use a trap or cleanup step) so the key is not left on disk after the script runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-commands.sh`:
- Line 38: Replace the bare mkdir calls that can fail when directories already
exist with idempotent mkdir -p invocations; locate the mkdir
"/eco-ci-cd/inventories/ocp-deployment/group_vars" (and the similar mkdir at the
other occurrence) in
telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-commands.sh and update them to
use mkdir -p so the script does not abort under set -e when the directory
already exists.
In
`@ci-operator/step-registry/telcov10n/functional/cnf-ran/ibu/telcov10n-functional-cnf-ran-ibu-workflow.yaml`:
- Around line 3-13: The workflow documentation claims it runs the IBU upgrade,
post-upgrade eco-gotests, Slack notification and job chaining, but the steps
section only defines pre with refs
telcov10n-functional-cnf-ran-ibu-seed-hub-deploy and
telcov10n-functional-cnf-ran-ibu-target-hub-deploy; either update the
documentation to state this is currently only a deployment lane, or add the
missing test and post blocks (e.g., include refs for the upgrade/test steps such
as telcov10n-functional-cnf-ran-ibu-upgrade and
telcov10n-functional-cnf-ran-ibu-post-upgrade-notify or whatever the canonical
job refs are) so the YAML's steps match the described Test and Post phases.
---
Nitpick comments:
In
`@ci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-commands.sh`:
- Around line 86-94: The temporary SSH key file /tmp/temp_ssh_key is created and
used by the ssh command that sets CLUSTER_VERSION but is never removed; after
the ssh call (the command that populates CLUSTER_VERSION via
KUBECONFIG=${HUB_KUBECONFIG} oc get clusterversion ...) remove the temp file
(e.g., delete /tmp/temp_ssh_key) and ensure removal happens even on failure (use
a trap or cleanup step) so the key is not left on disk after the script runs.
In
`@ci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-ref.yaml`:
- Around line 17-48: The SEED_CLUSTER_NAME default is overridable but the
credentials entries hardcode "kni-qe-108" (e.g., names like
telcov10n-ansible-kni-qe-108-master0, telcov10n-ansible-kni-qe-108-bastion and
mount_path entries under /var/host_variables/kni-qe-108/*), so changing
SEED_CLUSTER_NAME will leave mounts mismatched; fix by parameterizing those
credential name and mount_path values to use SEED_CLUSTER_NAME (or the template
variable used by the pipeline) instead of the literal "kni-qe-108" (or if
parameterization isn’t practical, add a clear top-of-file comment next to
SEED_CLUSTER_NAME stating that all credential `name` and `mount_path` entries
(telcov10n-ansible-kni-qe-108-*, /var/host_variables/kni-qe-108/*,
/var/group_variables/kni-qe-108/*) must be updated when SEED_CLUSTER_NAME is
changed).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 051836f9-863f-4f43-a165-4bc586e8d1c3
⛔ Files ignored due to path filters (1)
ci-operator/jobs/openshift-kni/eco-ci-cd/openshift-kni-eco-ci-cd-main-periodics.yamlis excluded by!ci-operator/jobs/**
📒 Files selected for processing (12)
ci-operator/config/openshift-kni/eco-ci-cd/openshift-kni-eco-ci-cd-main__cnf-ran-ibu-4.20.yamlci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/OWNERSci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-commands.shci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-ref.metadata.jsonci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-ref.yamlci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-target-hub-deploy/OWNERSci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-target-hub-deploy/telcov10n-functional-cnf-ran-ibu-target-hub-deploy-commands.shci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-target-hub-deploy/telcov10n-functional-cnf-ran-ibu-target-hub-deploy-ref.metadata.jsonci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-target-hub-deploy/telcov10n-functional-cnf-ran-ibu-target-hub-deploy-ref.yamlci-operator/step-registry/telcov10n/functional/cnf-ran/ibu/OWNERSci-operator/step-registry/telcov10n/functional/cnf-ran/ibu/telcov10n-functional-cnf-ran-ibu-workflow.metadata.jsonci-operator/step-registry/telcov10n/functional/cnf-ran/ibu/telcov10n-functional-cnf-ran-ibu-workflow.yaml
| echo "SEED_HUB_VERSION=${SEED_HUB_VERSION}" | ||
|
|
||
| echo "Processing common group_vars" | ||
| mkdir /eco-ci-cd/inventories/ocp-deployment/group_vars |
There was a problem hiding this comment.
Use mkdir -p to avoid a hard failure when the directory already exists.
With set -e, a bare mkdir aborts the whole step if group_vars/host_vars already exist in the eco-ci-cd inventory tree (e.g. on a retry or if the repo ships these dirs). -p makes this idempotent.
🛠️ Proposed fix
-mkdir /eco-ci-cd/inventories/ocp-deployment/group_vars
+mkdir -p /eco-ci-cd/inventories/ocp-deployment/group_vars-mkdir /eco-ci-cd/inventories/ocp-deployment/host_vars
+mkdir -p /eco-ci-cd/inventories/ocp-deployment/host_varsAlso applies to: 52-52
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/telcov10n/functional/cnf-ran/ibu-seed-hub-deploy/telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-commands.sh`
at line 38, Replace the bare mkdir calls that can fail when directories already
exist with idempotent mkdir -p invocations; locate the mkdir
"/eco-ci-cd/inventories/ocp-deployment/group_vars" (and the similar mkdir at the
other occurrence) in
telcov10n-functional-cnf-ran-ibu-seed-hub-deploy-commands.sh and update them to
use mkdir -p so the script does not abort under set -e when the directory
already exists.
| documentation: |- | ||
| IBU (Image Based Upgrade) workflow for CNF RAN testing. | ||
| Pre: deploy seed hub (kni-qe-108) at SEED_HUB_VERSION and target hub (kni-qe-109) at TARGET_HUB_VERSION. | ||
| Test: run IBU upgrade and post-upgrade eco-gotests, report results. | ||
| Post: notify Slack, trigger next job in chain. | ||
| steps: | ||
| pre: | ||
| # Seed hub setup — kni-qe-108 deployed at SEED_HUB_VERSION (upgrade destination) | ||
| - ref: telcov10n-functional-cnf-ran-ibu-seed-hub-deploy | ||
| # Target hub setup — kni-qe-109 deployed at TARGET_HUB_VERSION (upgrade source), manages spokes | ||
| - ref: telcov10n-functional-cnf-ran-ibu-target-hub-deploy |
There was a problem hiding this comment.
Documentation describes Test/Post phases that the workflow does not implement.
The documentation states the workflow runs the IBU upgrade + post-upgrade eco-gotests and then notifies Slack / chains the next job, but steps only defines pre with the two hub-deploy refs — there is no test or post block. As written, this workflow only provisions the hubs and runs no test, so any job using it will pass without exercising IBU. If this "deployment lane" is intentional for now, trim the documentation to match; otherwise the test/post steps are missing.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/telcov10n/functional/cnf-ran/ibu/telcov10n-functional-cnf-ran-ibu-workflow.yaml`
around lines 3 - 13, The workflow documentation claims it runs the IBU upgrade,
post-upgrade eco-gotests, Slack notification and job chaining, but the steps
section only defines pre with refs
telcov10n-functional-cnf-ran-ibu-seed-hub-deploy and
telcov10n-functional-cnf-ran-ibu-target-hub-deploy; either update the
documentation to state this is currently only a deployment lane, or add the
missing test and post blocks (e.g., include refs for the upgrade/test steps such
as telcov10n-functional-cnf-ran-ibu-upgrade and
telcov10n-functional-cnf-ran-ibu-post-upgrade-notify or whatever the canonical
job refs are) so the YAML's steps match the described Test and Post phases.
|
/pj-rehearse periodic-ci-openshift-kni-eco-ci-cd-main-cnf-ran-ibu-4.20-cnf-ran-ztp-tests |
|
@shaior: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse periodic-ci-openshift-kni-eco-ci-cd-main-cnf-ran-ibu-4.20-cnf-ran-ztp-tests |
|
@shaior: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse periodic-ci-openshift-kni-eco-ci-cd-main-cnf-ran-ibu-4.20-cnf-ran-ztp-tests |
|
@shaior: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse periodic-ci-openshift-kni-eco-ci-cd-main-cnf-ran-ibu-4.20-cnf-ran-ztp-tests |
|
@shaior: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse periodic-ci-openshift-kni-eco-ci-cd-main-cnf-ran-ibu-4.20-cnf-ran-ztp-tests |
|
@shaior: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
/pj-rehearse periodic-ci-openshift-kni-eco-ci-cd-main-cnf-ran-ibu-4.20-cnf-ran-ztp-tests |
|
@shaior: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
|
@shaior: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary by CodeRabbit
This PR adds CI/CD infrastructure to the OpenShift release repository to support testing of In-Place Updates (IBU) for CNF RAN (Radio Access Network) clusters. The changes introduce a new test lane called
cnf-ran-ibu-4.20that orchestrates deployment of two hub clusters for IBU testing.What's being added:
Main workflow configuration: A new CI job configuration defines the
telcov10n-functional-cnf-ran-ibuworkflow that validates the IBU process by deploying test infrastructure in a specific sequence.Two-hub deployment setup: The workflow includes two prerequisite CI steps that deploy isolated OpenShift SNO (Single Node OpenShift) clusters:
ibu-seed-hub-deploy) representing the starting stateibu-target-hub-deploy) representing the desired end stateEach deployment step:
Infrastructure configuration: The workflow connects to the
eco-ci-cdbuild root (OCP nightly 4.20) and includes credential mounts for authenticating to lab infrastructure and managing cluster variables.Ownership: OWNERS files are added across the new step registry entries, designating
shaior,kononovn, andeifrachas approvers for this test infrastructure.This enables the OpenShift CI system to run IBU functional tests on CNF RAN hardware by provisioning the necessary hub clusters as part of the test pipeline.