[release-4.22] OCPBUGS-84322: Fix Gateway cleanup in parallel e2e test workers #31068
The Gateway API controller tests tracked Gateways in a shared in-memory gateways slice and deleted them during AfterEach cleanup. However, openshift-tests distributes tests across separate parallel worker processes. The annotation-based checkAllTestsDone coordination works correctly because annotations are stored on the cluster-scoped GatewayClass, but the gateways slice is not shared across processes. The process that runs the final AfterEach cleanup has an empty gateways slice, so it deletes the GatewayClass and istiod but never deletes the Gateways created by other processes. This leaves gateway deployments orphaned on the cluster.

As a secondary issue, even when gateways were deleted, the GatewayClass and istiod were removed without waiting for the gateway proxy deployments to be fully cleaned up by GC. Since the deployments have an owner reference to the Gateway (not a finalizer), the cascade deletion is asynchronous, creating a race where gateway pods lose their control plane and crash-loop.

Fix both issues by cleaning up gateways at the individual test level using defer deleteGateway, which deletes the Gateway and waits for its proxy deployment to be removed by GC. Add deleteGateway and waitForGatewayDeploymentDeletion helpers shared by both the controller tests and the upgrade test Teardown. Cleanup errors now hard fail to surface leftover resources immediately rather than causing confusing downstream test failures.

https://redhat.atlassian.net/browse/OCPBUGS-83281

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Grant Spence <gspence@redhat.com>
Co-Authored-By: Ishmam Amin <iamin@redhat.com>
@openshift-cherrypick-robot: Jira Issue OCPBUGS-83281 has been cloned as Jira Issue OCPBUGS-84322. Will retitle bug to link to clone.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@openshift-cherrypick-robot: This pull request references Jira Issue OCPBUGS-84322, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: openshift-cherrypick-robot
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
pre-submits don't run automatically in 4.22 (may or may not be a bug with the pipeline controller). Here are the required jobs:
/test all

/retest

/jira refresh
@gcs278: This pull request references Jira Issue OCPBUGS-84322, which is invalid:
/jira refresh
@rhamini3: This pull request references Jira Issue OCPBUGS-84322, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug
Requesting review from QA contact:
@openshift-cherrypick-robot: all tests passed! Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
This is an automated cherry-pick of #31023
/assign rhamini3