fix(kubernetes_logs source): don't evict recreated pods with reused names #25558

Draft
thomasqueirozb wants to merge 1 commit into master from fix/k8s-logs-recreated-pod-eviction

@thomasqueirozb
Member

Summary

The kubernetes_logs source could intermittently stop collecting logs from a pod that had been deleted and recreated with the same name and namespace (e.g. StatefulSet rollouts or same-node restarts). The reflector store is keyed only by name and namespace, so when the old pod's deletion event was processed after its delay, it evicted the recreated pod from the store. Once the pod was evicted, Vector stopped watching its log files and emitted "Failed to annotate event with pod metadata" errors.

Pod incarnations are now distinguished by their UID (MetaDescribe), and a delayed deletion is only applied if the store still holds the same UID. This is the piece missing from the earlier attempt in #21303, which added the UID to the cache but did not guard the store deletion and therefore regressed the recreate case.
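
The guard described above can be illustrated with a minimal, self-contained sketch. This is not the code in this PR: `Pod`, `Store`, `apply`, and `delete_if_same_uid` are hypothetical names chosen for illustration, and a plain `HashMap` stands in for the reflector store. It only shows the idea of applying a delayed deletion when the stored UID still matches.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the store key: namespace and name only.
type PodKey = (String, String);

/// Hypothetical minimal pod record (real pod objects carry far more metadata).
#[derive(Debug, Clone, PartialEq)]
pub struct Pod {
    pub namespace: String,
    pub name: String,
    pub uid: String,
}

/// Illustrative store keyed by (namespace, name), so a recreated pod
/// overwrites the entry of the previous incarnation.
#[derive(Default)]
pub struct Store {
    pods: HashMap<PodKey, Pod>,
}

impl Store {
    /// Insert or overwrite the entry for this pod's namespace and name.
    pub fn apply(&mut self, pod: Pod) {
        self.pods
            .insert((pod.namespace.clone(), pod.name.clone()), pod);
    }

    /// Apply a (possibly delayed) deletion only if the store still holds
    /// the same UID. Returns true if an entry was removed.
    pub fn delete_if_same_uid(&mut self, deleted: &Pod) -> bool {
        let key = (deleted.namespace.clone(), deleted.name.clone());
        let same_uid = self
            .pods
            .get(&key)
            .map_or(false, |current| current.uid == deleted.uid);
        if same_uid {
            self.pods.remove(&key);
        }
        same_uid
    }

    /// Look up the pod currently stored under this namespace and name.
    pub fn get(&self, namespace: &str, name: &str) -> Option<&Pod> {
        self.pods.get(&(namespace.to_string(), name.to_string()))
    }
}
```

In this sketch, applying an old pod, then its recreated successor with a new UID, and only then the old pod's deletion leaves the new entry in place, because `delete_if_same_uid` sees a different UID and returns false. Without the UID comparison, the same sequence would remove the recreated pod's entry.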

Vector configuration

NA

How did you test this PR?

Added two regression tests in src/kubernetes/reflector.rs covering both event orderings (recreate within the delay window, and an out-of-order delete arriving after the recreate). Verified that neutralizing the new store guard makes both tests fail (reproducing #21303's behavior) and that they pass with the guard in place. Ran cargo test --no-default-features --features kubernetes --lib kubernetes:: and cargo clippy --no-default-features --features kubernetes --lib.

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References
