Skip to content

Improve batch reload: do reload only when there is non-endpointslice resources #7779

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

wd
Copy link
Contributor

@wd wd commented May 13, 2025

Proposed changes

Fix/Improve #7778

Checklist

Before creating a PR, run through this checklist and mark each as complete.

  • I have read the CONTRIBUTING doc
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that all unit tests pass after adding my changes
  • I have updated necessary documentation
  • I have rebased my branch onto main
  • I will ensure my PR is targeting the main branch and pulling from my branch from my own fork

@wd wd requested a review from a team as a code owner May 13, 2025 18:47
Copy link

github-actions bot commented May 13, 2025

✅ All required contributors have signed the F5 CLA for this PR. Thank you!
Posted by the CLA Assistant Lite bot.

@github-actions github-actions bot added the go Pull requests that update Go code label May 13, 2025
@wd
Copy link
Contributor Author

wd commented May 13, 2025

I have hereby read the F5 CLA and agree to its terms

@jjngx
Copy link
Contributor

jjngx commented May 14, 2025

@wd could you please include testing and findings after your change in a similar way you did in the bug ticket?

@wd
Copy link
Contributor Author

wd commented May 19, 2025

@jjngx I didn't find a way to write tests for lb.sync, do you have any suggestions?

@jjngx
Copy link
Contributor

jjngx commented May 20, 2025

@jjngx I didn't find a way to write tests for lb.sync, do you have any suggestions?

@wd I mean to test the improvement manually and include the results - the graph that shows the improvement (the same way it's presented in the original GitHub issue).

@github-actions github-actions bot added the dependencies Pull requests that update a dependency file label May 20, 2025
Signed-off-by: Dong Wang <wd@wdicc.com>
@github-actions github-actions bot removed the dependencies Pull requests that update a dependency file label May 20, 2025
@wd
Copy link
Contributor Author

wd commented May 20, 2025

I rethought this code, and I feel the main issue might be that when handling the endpoint, it didn't consider whether the current version is the plus version(no reload is needed for the plus version). So, currently, there's only one line of change in the code.

I have done some manual tests on my side. I'll paste the results along with some explanations. (I have removed a few irrelevant logs)
I have temporarily added a few lines of code to slow down the sync and print the content of the current queue. So, you might see some logs that don't exist in the original version.

I20250520 22:13:49.879398   15142 controller.go:909] Current queue: &{0x40009b78d8 map[{1 default/datadog-jm8vk}:{} {1 kiam/kiam-agent-9nv5w}:{} {1 opentelemetry/filelog-daemonset-collector-monitoring-fvpgg}:{} {1 staging-infra/otel-agent-bwjd2}:{}] map[{1 staging-infra/fortio-castle-xnrj8}:{}] 0x400048b980 false false 0x4000991f00 500000000 {}}

^^^ This is the content of the current queue; they are all endpoint slices. 

I20250520 22:13:49.880416   15142 task_queue.go:73] The queue has 4 element(s)
I20250520 22:13:54.894802   15142 controller.go:909] Current queue: &{0x40009b78d8 map[{1 default/datadog-jm8vk}:{} {1 kiam/kiam-agent-9nv5w}:{} {1 staging-infra/otel-agent-bwjd2}:{}] map[{1 opentelemetry/filelog-daemonset-collector-monitoring-fvpgg}:{}] 0x400048b980 false false 0x4000991f00 500000000 {}}
I20250520 22:13:54.895794   15142 task_queue.go:73] The queue has 3 element(s)
I20250520 22:13:54.896493   15142 controller.go:918] Syncing opentelemetry/filelog-daemonset-collector-monitoring-fvpgg
I20250520 22:13:54.897795   15142 task_queue.go:73] The queue has 3 element(s)
I20250520 22:13:54.898552   15142 task_queue.go:94] Syncing staging-infra/otel-agent-bwjd2
I20250520 22:13:59.899691   15142 controller.go:909] Current queue: &{0x40009b78d8 map[{1 default/datadog-jm8vk}:{} {1 kiam/kiam-agent-9nv5w}:{}] map[{1 staging-infra/otel-agent-bwjd2}:{}] 0x400048b980 false false 0x4000991f00 500000000 {}}
I20250520 22:13:59.900748   15142 task_queue.go:73] The queue has 2 element(s)
I20250520 22:13:59.901512   15142 controller.go:918] Syncing staging-infra/otel-agent-bwjd2
I20250520 22:13:59.902779   15142 task_queue.go:73] The queue has 2 element(s)
I20250520 22:13:59.903379   15142 task_queue.go:94] Syncing default/datadog-jm8vk
I20250520 22:14:04.074675   15142 handlers.go:93] Ignoring Secret user-queue-token-c5fg9 of unsupported type kubernetes.io/service-account-token
I20250520 22:14:04.907275   15142 controller.go:909] Current queue: &{0x40009b78d8 map[{1 kiam/kiam-agent-9nv5w}:{}] map[{1 default/datadog-jm8vk}:{}] 0x400048b980 false false 0x4000991f00 500000000 {}}
I20250520 22:14:04.908413   15142 task_queue.go:73] The queue has 1 element(s)
I20250520 22:14:04.909058   15142 controller.go:918] Syncing default/datadog-jm8vk
I20250520 22:14:04.909953   15142 task_queue.go:73] The queue has 1 element(s)
I20250520 22:14:04.910421   15142 task_queue.go:94] Syncing kiam/kiam-agent-9nv5w
I20250520 22:14:09.915391   15142 controller.go:909] Current queue: &{0x40009b78d8 map[{1 istio-system/istiod-4mrvt}:{}] map[{1 kiam/kiam-agent-9nv5w}:{}] 0x400048b980 false false 0x4000991f00 500000000 {}}
I20250520 22:14:09.916356   15142 task_queue.go:73] The queue has 1 element(s)
I20250520 22:14:09.917013   15142 controller.go:918] Syncing kiam/kiam-agent-9nv5w
I20250520 22:14:09.917868   15142 task_queue.go:73] The queue has 1 element(s)
I20250520 22:14:09.918413   15142 task_queue.go:94] Syncing istio-system/istiod-4mrvt
I20250520 22:14:14.922185   15142 controller.go:909] Current queue: &{0x40009b78d8 map[] map[{1 istio-system/istiod-4mrvt}:{}] 0x400048b980 false false 0x4000991f00 500000000 {}}
I20250520 22:14:14.923203   15142 task_queue.go:73] The queue has 0 element(s)
I20250520 22:14:14.923937   15142 controller.go:918] Syncing istio-system/istiod-4mrvt
I20250520 22:14:14.925051   15142 task_queue.go:73] The queue has 0 element(s)
I20250520 22:14:14.925573   15142 configurator.go:1404] Reloading NGINX for ReloadForBatchUpdates
I20250520 22:14:14.927866   15142 manager.go:320] Reloading nginx with configVersion: 2

^^^ The NIC will reload Nginx after processing all the items in the queue, and the config version increased to 2. 

I20250520 22:16:48.827565   15142 controller.go:909] Current queue: &{0x40009b78d8 map[{1 staging-infra/fortio-castle-fbb9b}:{} {1 staging-infra/fortio-castle-gffbz}:{} {1 staging-infra/fortio-castle-xnrj8}:{}] map[{1 istio-system/istiod-4mrvt}:{}] 0x400048b980 false false 0x4000991f00 500000000 {}}

^^ This is another example. All the items in the queue were endpoint slices. 

I20250520 22:16:48.828088   15142 task_queue.go:73] The queue has 3 element(s)
I20250520 22:16:48.828419   15142 controller.go:918] Syncing istio-system/istiod-4mrvt
I20250520 22:16:48.828873   15142 task_queue.go:73] The queue has 3 element(s)
I20250520 22:16:48.829153   15142 task_queue.go:94] Syncing staging-infra/fortio-castle-fbb9b
I20250520 22:16:53.829915   15142 controller.go:909] Current queue: &{0x40009b78d8 map[{1 staging-infra/fortio-castle-gffbz}:{} {1 staging-infra/fortio-castle-xnrj8}:{}] map[{1 staging-infra/fortio-castle-fbb9b}:{}] 0x400048b980 false false 0x4000991f00 500000000 {}}
I20250520 22:16:53.830999   15142 task_queue.go:73] The queue has 2 element(s)
I20250520 22:16:53.831819   15142 controller.go:918] Syncing staging-infra/fortio-castle-fbb9b
I20250520 22:16:53.842717   15142 task_queue.go:73] The queue has 2 element(s)
I20250520 22:16:53.842917   15142 task_queue.go:94] Syncing staging-infra/fortio-castle-gffbz
I20250520 22:16:55.961358   15142 task_queue.go:61] Adding an element with a key: staging-infra/fortio-castle-gffbz
I20250520 22:16:58.843687   15142 controller.go:909] Current queue: &{0x40009b78d8 map[{1 staging-infra/fortio-castle-gffbz}:{} {1 staging-infra/fortio-castle-xnrj8}:{}] map[{1 staging-infra/fortio-castle-gffbz}:{}] 0x400048b980 false false 0x4000991f00 500000000 {}}
I20250520 22:16:58.844767   15142 task_queue.go:73] The queue has 1 element(s)
I20250520 22:16:58.845509   15142 controller.go:918] Syncing staging-infra/fortio-castle-gffbz
I20250520 22:16:58.856123   15142 task_queue.go:73] The queue has 1 element(s)
I20250520 22:16:58.856358   15142 task_queue.go:94] Syncing staging-infra/fortio-castle-xnrj8
I20250520 22:17:03.859927   15142 controller.go:909] Current queue: &{0x40009b78d8 map[{1 staging-infra/fortio-castle-gffbz}:{}] map[{1 staging-infra/fortio-castle-xnrj8}:{}] 0x400048b980 false false 0x4000991f00 500000000 {}}
I20250520 22:17:03.860938   15142 task_queue.go:73] The queue has 1 element(s)
I20250520 22:17:03.861745   15142 controller.go:918] Syncing staging-infra/fortio-castle-xnrj8
I20250520 22:17:03.872217   15142 task_queue.go:73] The queue has 1 element(s)
I20250520 22:17:03.872492   15142 task_queue.go:94] Syncing staging-infra/fortio-castle-gffbz
I20250520 22:17:04.131966   15142 handlers.go:93] Ignoring Secret user-queue-token-c5fg9 of unsupported type kubernetes.io/service-account-token
I20250520 22:17:08.875746   15142 controller.go:909] Current queue: &{0x40009b78d8 map[] map[{1 staging-infra/fortio-castle-gffbz}:{}] 0x400048b980 false false 0x4000991f00 500000000 {}}
I20250520 22:17:08.876772   15142 task_queue.go:73] The queue has 0 element(s)
I20250520 22:17:08.877365   15142 controller.go:918] Syncing staging-infra/fortio-castle-gffbz
I20250520 22:17:08.889979   15142 task_queue.go:73] The queue has 0 element(s)
I20250520 22:17:08.890105   15142 configurator.go:1404] Reloading NGINX for ReloadForBatchUpdates
I20250520 22:17:08.890936   15142 manager.go:320] Reloading nginx with configVersion: 3

^^^ The NIC will reload Nginx after processing all the items in the queue, and the config version increased to 3. 

Below are the outputs from he fixed version.

I20250520 22:32:47.284570   15585 controller.go:909] Current queue: &{0x400000cac8 map[{1 staging-infra/fortio-castle-fbb9b}:{} {1 staging-infra/fortio-castle-gffbz}:{} {1 staging-infra/fortio-castle-xnrj8}:{}] map[{1 staging-easun/easun-computing-transcoding-direct-6pfx9}:{}] 0x400034d540 false false 0x4000c14380 500000000 {}}

^^^ This is the content of the current queue; they are all endpoint slices. 

I20250520 22:32:47.285302   15585 task_queue.go:73] The queue has 3 element(s)
I20250520 22:32:47.285848   15585 controller.go:918] Syncing staging-easun/easun-computing-transcoding-direct-6pfx9
I20250520 22:32:47.286786   15585 task_queue.go:73] The queue has 3 element(s)
I20250520 22:32:47.287296   15585 task_queue.go:94] Syncing staging-infra/fortio-castle-fbb9b
I20250520 22:32:50.961685   15585 handlers.go:93] Ignoring Secret user-queue-token-c5fg9 of unsupported type kubernetes.io/service-account-token
I20250520 22:32:52.291415   15585 controller.go:909] Current queue: &{0x400000cac8 map[{1 staging-infra/fortio-castle-gffbz}:{} {1 staging-infra/fortio-castle-xnrj8}:{}] map[{1 staging-infra/fortio-castle-fbb9b}:{}] 0x400034d540 false false 0x4000c14380 500000000 {}}
I20250520 22:32:52.292415   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:32:52.293047   15585 controller.go:918] Syncing staging-infra/fortio-castle-fbb9b
I20250520 22:32:52.297843   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:32:52.298014   15585 task_queue.go:94] Syncing staging-infra/fortio-castle-xnrj8
I20250520 22:32:57.298559   15585 controller.go:909] Current queue: &{0x400000cac8 map[{1 staging-infra/fortio-castle-gffbz}:{}] map[{1 staging-infra/fortio-castle-xnrj8}:{}] 0x400034d540 false false 0x4000c14380 500000000 {}}
I20250520 22:32:57.299501   15585 task_queue.go:73] The queue has 1 element(s)
I20250520 22:32:57.300014   15585 controller.go:918] Syncing staging-infra/fortio-castle-xnrj8
I20250520 22:32:57.305749   15585 task_queue.go:73] The queue has 1 element(s)
I20250520 22:32:57.306187   15585 task_queue.go:94] Syncing staging-infra/fortio-castle-gffbz
I20250520 22:33:01.646235   15585 task_queue.go:61] Adding an element with a key: staging-infra/fortio-castle-fbb9b
I20250520 22:33:02.308561   15585 controller.go:909] Current queue: &{0x400000cac8 map[{1 staging-infra/fortio-castle-fbb9b}:{} {1 staging-infra/fortio-castle-xnrj8}:{}] map[{1 staging-infra/fortio-castle-gffbz}:{}] 0x400034d540 false false 0x4000c14380 500000000 {}}
I20250520 22:33:02.308874   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:33:02.309079   15585 controller.go:918] Syncing staging-infra/fortio-castle-gffbz
I20250520 22:33:02.311550   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:33:02.311725   15585 task_queue.go:94] Syncing staging-infra/fortio-castle-xnrj8
I20250520 22:33:07.312100   15585 controller.go:909] Current queue: &{0x400000cac8 map[{1 staging-infra/fortio-castle-fbb9b}:{} {1 staging-infra/fortio-castle-gffbz}:{} {1 staging-infra/fortio-castle-xnrj8}:{}] map[{1 staging-infra/fortio-castle-xnrj8}:{}] 0x400034d540 false false 0x4000c14380 500000000 {}}
I20250520 22:33:07.312338   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:33:07.312467   15585 controller.go:918] Syncing staging-infra/fortio-castle-xnrj8
I20250520 22:33:07.314530   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:33:07.314645   15585 task_queue.go:94] Syncing staging-infra/fortio-castle-fbb9b
I20250520 22:33:12.221590   15585 task_queue.go:61] Adding an element with a key: staging-infra/fortio-castle-gffbz
I20250520 22:33:12.315596   15585 controller.go:909] Current queue: &{0x400000cac8 map[{1 staging-infra/fortio-castle-fbb9b}:{} {1 staging-infra/fortio-castle-gffbz}:{} {1 staging-infra/fortio-castle-xnrj8}:{}] map[{1 staging-infra/fortio-castle-fbb9b}:{}] 0x400034d540 false false 0x4000c14380 500000000 {}}
I20250520 22:33:12.315827   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:33:12.316001   15585 controller.go:918] Syncing staging-infra/fortio-castle-fbb9b
I20250520 22:33:12.318971   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:33:12.319124   15585 task_queue.go:94] Syncing staging-infra/fortio-castle-gffbz
I20250520 22:33:17.320409   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:33:17.320586   15585 controller.go:918] Syncing staging-infra/fortio-castle-gffbz
I20250520 22:33:17.323592   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:33:17.323751   15585 task_queue.go:94] Syncing staging-infra/fortio-castle-xnrj8
I20250520 22:33:20.954875   15585 handlers.go:93] Ignoring Secret user-queue-token-c5fg9 of unsupported type kubernetes.io/service-account-token
I20250520 22:33:22.326720   15585 controller.go:909] Current queue: &{0x400000cac8 map[{1 staging-infra/fortio-castle-fbb9b}:{} {1 staging-infra/fortio-castle-gffbz}:{}] map[{1 staging-infra/fortio-castle-xnrj8}:{}] 0x400034d540 false false 0x4000c14380 500000000 {}}
I20250520 22:33:22.326989   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:33:22.327111   15585 controller.go:918] Syncing staging-infra/fortio-castle-xnrj8
I20250520 22:33:22.329124   15585 task_queue.go:73] The queue has 2 element(s)
I20250520 22:33:22.329223   15585 task_queue.go:94] Syncing staging-infra/fortio-castle-fbb9b
I20250520 22:33:27.331945   15585 controller.go:909] Current queue: &{0x400000cac8 map[{1 staging-infra/fortio-castle-gffbz}:{}] map[{1 staging-infra/fortio-castle-fbb9b}:{}] 0x400034d540 false false 0x4000c14380 500000000 {}}
I20250520 22:33:27.332860   15585 task_queue.go:73] The queue has 1 element(s)
I20250520 22:33:27.333393   15585 controller.go:918] Syncing staging-infra/fortio-castle-fbb9b
I20250520 22:33:27.342004   15585 task_queue.go:73] The queue has 1 element(s)
I20250520 22:33:27.342603   15585 task_queue.go:94] Syncing staging-infra/fortio-castle-gffbz
I20250520 22:33:32.348044   15585 controller.go:909] Current queue: &{0x400000cac8 map[] map[{1 staging-infra/fortio-castle-gffbz}:{}] 0x400034d540 false false 0x4000c14380 500000000 {}}
I20250520 22:33:32.349145   15585 task_queue.go:73] The queue has 0 element(s)
I20250520 22:33:32.349883   15585 controller.go:918] Syncing staging-infra/fortio-castle-gffbz
I20250520 22:33:32.356348   15585 task_queue.go:73] The queue has 0 element(s)

^^ No reload will happen after processing all the items in the queue. 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go Pull requests that update Go code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants