fix(dogstatsd): normalize tags in event() and service_check()#953
Open
tomohiro86 wants to merge 2 commits into
Open
fix(dogstatsd): normalize tags in event() and service_check()#953tomohiro86 wants to merge 2 commits into
tomohiro86 wants to merge 2 commits into
Conversation
Tags passed to event() and service_check() were joined into the DogStatsD wire payload without calling normalize_tags(), leaving newline (\n) and pipe (|) characters unfiltered. These are the protocol delimiters used by the agent to split packets and fields, so unsanitized values could inject spurious metrics or spoof event fields (hostname, aggregation key, etc.). The same fix was applied to _serialize_metric() in 2020 (commit 11fe1d8, "Remove illegal characters from tags") but was not extended to the other two serialization paths. Fixes the inconsistency by applying normalize_tags() in both methods, matching the existing behavior in _serialize_metric(). Adds four regression tests covering pipe and newline injection for both methods. Relates to: DataDog#19
There was a problem hiding this comment.
Pull request overview
This PR fixes a DogStatsD wire-protocol injection gap by applying existing tag sanitization (normalize_tags()) to the event() and service_check() code paths, aligning them with the already-sanitized metric serialization path.
Changes:
- Sanitize tags for
DogStatsd.event()by normalizing tags before serializing into the event packet. - Sanitize tags for
DogStatsd.service_check()by normalizing tags before serializing into the service check packet. - Add regression tests ensuring
|and\ninside event/service_check tags are normalized and cannot inject protocol delimiters.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
datadog/dogstatsd/base.py |
Applies normalize_tags() when serializing tags for events and service checks to prevent delimiter injection. |
tests/unit/dogstatsd/test_statsd.py |
Adds regression tests covering ` |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Replace base `assert` statements which can be optimized away. Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
This comment has been minimized.
This comment has been minimized.
atanzu
approved these changes
Jun 8, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do?
Tags passed to
event()andservice_check()were serialized into the DogStatsD wire protocol without callingnormalize_tags(), leaving\n(newline) and|(pipe) characters unfiltered. Both are protocol delimiters used by the DogStatsD agent to split packets and fields, so a tag value containing either character can inject additional protocol tokens.The same issue was already fixed for
_serialize_metric()in 2020 (commit11fe1d8, "Remove illegal characters from tags", relates to #19). This PR extends that fix to the two code paths that were missed.There is no pre-existing GitHub issue for this specific gap; it was identified during a security review.
Description of the Change
normalize_tags()(already imported in the file) is now called on the user-supplied tag list before joining in bothevent()andservice_check(), exactly as_serialize_metric()already does.normalize_tagsreplaces any character outside[word chars, digits, _, -, :, /, .]with_, so\n→_and|→_.Alternate Designs
normalize_tagsreplaces with_to preserve tag structure; changing that would be a separate, larger decision.Possible Drawbacks
|or\nwill now appear with_in those positions inside Datadog. This is a correctness fix: the previous behaviour was silently malformed on the wire. Any dashboard/alert that matched on a tag containing a literal|or\nwould have been matching against accidental protocol injection rather than a real tag value.Verification Process
Ran the new regression tests and the full existing
event/service_checktest suite locally:Four new tests were added alongside the existing
test_pipe_in_tags(which covers_serialize_metric):test_pipe_in_event_tags|in event tag → replaced with_, not treated as field delimitertest_newline_in_event_tags\nin event tag → replaced with_, not treated as packet delimitertest_pipe_in_service_check_tags|in service_check tag → replaced with_test_newline_in_service_check_tags\nin service_check tag → replaced with_Additional Notes
The
hostname,aggregation_key,source_type_name,priority, andalert_typeparameters ofevent(), andcheck_name/hostnameofservice_check(), are also currently unsanitized. A newline in any of those would similarly inject content. Those fields are far less commonly populated from external input than tags, and fixing them is a slightly larger change; left for a follow-up if desired.Release Notes
Tags containing
|or\npassed tostatsd.event()andstatsd.service_check()are now sanitized (special characters replaced with_), consistent with the existing behaviour of all metric methods.Review checklist (to be filled by reviewers)
changelog/label attached. If applicable it should have thebackward-incompatiblelabel attached.do-not-merge/label attached.kind/andseverity/labels attached at least.