out_cloudwatch_logs: Plug SEGV on not found error #11983

Merged
edsiper merged 4 commits into master from cosmo0920-plug-segv-on-cloudwatch_logs on Jul 3, 2026

cosmo0920 (Contributor) commented Jun 23, 2026

In this PR, we treat the not-found exception raised in out_cloudwatch_logs as an unrecoverable error.

Closes #11959.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes

    • CloudWatch log delivery now treats “resource not found” failures as non-retriable, preventing repeated send attempts and improving recovery.
    • Flush now honors non-retriable delivery failures by returning the correct non-retry outcome.
  • Tests

    • Updated CloudWatch runtime scenarios for “already exists” and “not found” behaviors.
    • Enhanced CloudWatch mock call-count tracking to verify when log streams are created and when PutLogEvents is invoked, including “create after put” behavior.
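
The call-count tracking described in the Tests bullets can be pictured with a small sketch. Every name below (`sketch_mock_*`) is hypothetical and chosen for illustration; the actual helper names in cloudwatch_api.c are not shown on this page. The sketch assumes the mock bumps one counter per API call and counts a CreateLogStream as "after put" when at least one PutLogEvents has been seen since the last reset.

```c
/* Hypothetical mock counters; names are illustrative, not the plugin's. */
static int sketch_create_log_stream_calls = 0;
static int sketch_put_log_events_calls = 0;
static int sketch_create_after_put_calls = 0;

/* Reset all counters at the start of a test scenario. */
void sketch_mock_reset(void)
{
    sketch_create_log_stream_calls = 0;
    sketch_put_log_events_calls = 0;
    sketch_create_after_put_calls = 0;
}

/* Called by the mocked PutLogEvents path. */
void sketch_mock_on_put_log_events(void)
{
    sketch_put_log_events_calls++;
}

/* Called by the mocked CreateLogStream path. */
void sketch_mock_on_create_log_stream(void)
{
    sketch_create_log_stream_calls++;
    /* A create that follows at least one put suggests stream recreation. */
    if (sketch_put_log_events_calls > 0) {
        sketch_create_after_put_calls++;
    }
}

int sketch_mock_get_create_log_stream_calls(void)
{
    return sketch_create_log_stream_calls;
}

int sketch_mock_get_put_log_events_calls(void)
{
    return sketch_put_log_events_calls;
}

int sketch_mock_get_create_after_put_calls(void)
{
    return sketch_create_after_put_calls;
}
```

A test would reset the counters, run the scenario, and then assert on the getters, for example that exactly one CreateLogStream happened after a PutLogEvents.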

coderabbitai (Bot) commented Jun 23, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 036a103f-bdbd-4af0-a822-5bc4adfe9cbd

📥 Commits

Reviewing files that changed from the base of the PR and between 11c3ea0 and 610858b.

📒 Files selected for processing (5)
  • plugins/out_cloudwatch_logs/cloudwatch_api.c
  • plugins/out_cloudwatch_logs/cloudwatch_api.h
  • plugins/out_cloudwatch_logs/cloudwatch_logs.c
  • plugins/out_cloudwatch_logs/cloudwatch_logs.h
  • tests/runtime/out_cloudwatch.c
🚧 Files skipped from review as they are similar to previous changes (5)
  • plugins/out_cloudwatch_logs/cloudwatch_api.h
  • plugins/out_cloudwatch_logs/cloudwatch_logs.c
  • plugins/out_cloudwatch_logs/cloudwatch_logs.h
  • tests/runtime/out_cloudwatch.c
  • plugins/out_cloudwatch_logs/cloudwatch_api.c

Walkthrough

The PR adds non-retriable handling for ResourceNotFoundException in the CloudWatch Logs output path, propagates that state through flush processing, adds mock call-count helpers, and updates runtime tests to validate the new error path and stream-recreation behavior.
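
A minimal sketch of that error flow, under stated assumptions: all `sketch_*` and `SKETCH_*` names are hypothetical stand-ins (for `cw_flush`, the stream object, and the `FLB_OK`/`FLB_RETRY`/`FLB_ERROR` results), the not-found code is assumed to be the string `ResourceNotFoundException`, and every other error is assumed to stay retriable. It is not the plugin's code, only an illustration of how a flag set in the send path can change the flush result.

```c
#include <string.h>

/* Hypothetical stand-ins for the engine's flush results. */
enum sketch_result { SKETCH_OK, SKETCH_RETRY, SKETCH_ERROR };

/* Hypothetical stand-ins for the stream and flush buffer. */
struct sketch_stream { long expiration; };
struct sketch_flush  { int non_retriable_error; };

/*
 * Stands in for put_log_events(). error_code is the already-parsed AWS
 * error type, or NULL when the request succeeded.
 */
int sketch_put_log_events(struct sketch_flush *buf,
                          struct sketch_stream *stream,
                          const char *error_code)
{
    if (error_code == NULL) {
        return 0;
    }
    if (strcmp(error_code, "ResourceNotFoundException") == 0) {
        buf->non_retriable_error = 1; /* tell the flush not to retry */
        stream->expiration = 0;       /* mark the cached stream as stale */
    }
    return -1;
}

/* Stands in for the flush callback: maps the send outcome to a result. */
enum sketch_result sketch_flush_once(struct sketch_flush *buf,
                                     struct sketch_stream *stream,
                                     const char *error_code)
{
    buf->non_retriable_error = 0;
    if (sketch_put_log_events(buf, stream, error_code) == 0) {
        return SKETCH_OK;
    }
    /* Not-found is reported as a hard error; anything else is retried. */
    return buf->non_retriable_error ? SKETCH_ERROR : SKETCH_RETRY;
}
```

Keeping the decision in a flag on the flush buffer lets the low-level send function stay unaware of engine return codes while the callback makes the final retry choice.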

Changes

CloudWatch non-retriable error flow

| Layer | File(s) | Summary |
|---|---|---|
| Flush contract and mock declarations | plugins/out_cloudwatch_logs/cloudwatch_logs.h, plugins/out_cloudwatch_logs/cloudwatch_api.h | Adds non_retriable_error to cw_flush, updates the nearby comment block, and declares the mock call-count helper functions. |
| Mock call counter implementation | plugins/out_cloudwatch_logs/cloudwatch_api.c | Adds counters for CreateLogStream and PutLogEvents, exports reset/get helpers, and tracks CreateLogStream calls that occur after PutLogEvents. |
| ResourceNotFoundException handling in put_log_events() | plugins/out_cloudwatch_logs/cloudwatch_api.c | Parses the AWS error payload into a local variable and, on ERR_CODE_NOT_FOUND, marks the flush non-retriable, clears stream expiration, destroys the HTTP client, and returns -1. |
| Non-retriable propagation through send and flush | plugins/out_cloudwatch_logs/cloudwatch_api.c, plugins/out_cloudwatch_logs/cloudwatch_logs.c | process_and_send() exits early when non_retriable_error is set, and cb_cloudwatch_flush returns FLB_ERROR instead of FLB_RETRY for that path. |
| Expired stream cleanup ordering | plugins/out_cloudwatch_logs/cloudwatch_api.c | Moves expired-stream eviction ahead of stream/group matching in get_or_create_log_stream(). |
| Runtime test updates | tests/runtime/out_cloudwatch.c | Adds the shared already-exists error constant, Windows env wrappers, updates existing scenarios to use the shared constant, and extends the not-found test with mock counter resets and assertions. |
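
The "Expired stream cleanup ordering" row is easier to follow with a sketch. The one below is an illustration under assumptions, not the plugin's implementation: the fixed-size cache, the `sketch_*` names, and the `now`/`ttl` parameters are all hypothetical. It shows why evicting first matters: a stream whose expiration was cleared to 0 is dropped before the name/group match runs, so the lookup creates a fresh entry instead of returning the stale one.

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_STREAMS 8

/* Hypothetical cached stream entry. */
struct sketch_stream {
    char name[64];
    char group[64];
    long expiration;
    int in_use;
};

/* Hypothetical cache; created counts how many entries were made. */
struct sketch_cache {
    struct sketch_stream slots[SKETCH_MAX_STREAMS];
    int created;
};

struct sketch_stream *sketch_get_or_create(struct sketch_cache *c,
                                           const char *group,
                                           const char *name,
                                           long now, long ttl)
{
    int i;
    struct sketch_stream *s;
    struct sketch_stream *free_slot = NULL;

    /* Step 1: evict expired entries before any matching happens. */
    for (i = 0; i < SKETCH_MAX_STREAMS; i++) {
        s = &c->slots[i];
        if (s->in_use && s->expiration < now) {
            s->in_use = 0;
        }
    }

    /* Step 2: match by stream name and group among the survivors. */
    for (i = 0; i < SKETCH_MAX_STREAMS; i++) {
        s = &c->slots[i];
        if (!s->in_use) {
            if (free_slot == NULL) {
                free_slot = s;
            }
            continue;
        }
        if (strcmp(s->name, name) == 0 && strcmp(s->group, group) == 0) {
            return s;
        }
    }

    /* Step 3: nothing matched, so create a new entry if there is room. */
    if (free_slot == NULL) {
        return NULL;
    }
    snprintf(free_slot->name, sizeof(free_slot->name), "%s", name);
    snprintf(free_slot->group, sizeof(free_slot->group), "%s", group);
    free_slot->expiration = now + ttl;
    free_slot->in_use = 1;
    c->created++;
    return free_slot;
}
```

With the two loops swapped, the stale entry would match first and be handed back to the caller, which is the situation the reordering avoids.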

Estimated code review effort: 3 (Moderate) | ~25 minutes

Sequence Diagram(s)

sequenceDiagram
  participant put_log_events
  participant aws_error_parser
  participant http_client
  participant cw_flush

  put_log_events->>aws_error_parser: parse AWS error payload
  aws_error_parser-->>put_log_events: ERR_CODE_NOT_FOUND
  put_log_events->>cw_flush: set non_retriable_error = FLB_TRUE
  put_log_events->>cw_flush: set stream->expiration = 0
  put_log_events->>http_client: destroy client
  put_log_events-->>put_log_events: return -1

Possibly related PRs

Suggested labels: backport to v4.1.x

Suggested reviewers: edsiper, PettitWesley

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title is concise and directly matches the CloudWatch not-found crash fix. |
| Linked Issues check | ✅ Passed | The PR prevents the SIGSEGV by handling ResourceNotFoundException and adds tests for the non-crashing recovery path. |
| Out of Scope Changes check | ✅ Passed | The mock counters, Windows wrappers, and test adjustments support the same CloudWatch crash fix and are not unrelated scope. |

chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9b1fcbd710

Comment thread plugins/out_cloudwatch_logs/cloudwatch_api.c
edsiper (Member) commented Jul 2, 2026

@cosmo0920 this PR needs to address the code conflict

cosmo0920 added 4 commits July 3, 2026 16:35
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
Signed-off-by: Hiroshi Hatake <hiroshi@chronosphere.io>
cosmo0920 (Contributor, Author) commented

> @cosmo0920 this PR needs to address the code conflict

I resolved the conflicts.

edsiper merged commit 2c34d72 into master on Jul 3, 2026 (58 of 61 checks passed)
edsiper deleted the cosmo0920-plug-segv-on-cloudwatch_logs branch on July 3, 2026 18:40
Development

Successfully merging this pull request may close these issues.

[cloudwatch_logs] SIGSEGV when PutLogEvents returns ResourceNotFoundException on an existing stream (plugin crashes instead of recreating/retrying)
