fix(dogstatsd): normalize tags in event() and service_check() #953

Open
tomohiro86 wants to merge 2 commits into DataDog:master from tomohiro86:fix/normalize-tags-in-event-and-service-check

tomohiro86 commented Jun 6, 2026

What does this PR do?

Tags passed to event() and service_check() were serialized into the DogStatsD wire protocol without calling normalize_tags(), leaving \n (newline) and | (pipe) characters unfiltered. Both are protocol delimiters used by the DogStatsD agent to split packets and fields, so a tag value containing either character can inject additional protocol tokens.
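
To make the injection concrete, here is a minimal sketch of the pre-fix behaviour. `serialize_event_unsafe` is a hypothetical stand-in written for this illustration, not the library's code; it only assumes the `_e{title_len,text_len}:title|text|#tags` event layout and the unfiltered `",".join(tags)` shown in the diff below.

```python
def serialize_event_unsafe(title, text, tags=None):
    """Hypothetical stand-in for the pre-fix tag handling in event()."""
    payload = "_e{%d,%d}:%s|%s" % (len(title), len(text), title, text)
    if tags:
        # Tags are joined without any filtering, as in the old code path.
        payload = "%s|#%s" % (payload, ",".join(tags))
    return payload


# A newline ends the current datagram, so the remainder reads as a second packet.
newline_payload = serialize_event_unsafe(
    "deploy", "done", tags=["env:prod\nfake.metric:1|c"]
)
packets = newline_payload.split("\n")
print(packets)

# A pipe ends the tag field, so the remainder reads as an extra event field.
pipe_payload = serialize_event_unsafe("deploy", "done", tags=["a|h:spoofed-host"])
fields = pipe_payload.split("|")
print(fields)
```

In the first case the receiver would see a counter packet that the caller never sent; in the second, a hostname field that came from inside a tag value.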

The same issue was already fixed for _serialize_metric() in 2020 (commit 11fe1d8, "Remove illegal characters from tags", relates to #19). This PR extends that fix to the two code paths that were missed.

There is no pre-existing GitHub issue for this specific gap; it was identified during a security review.

Description of the Change

normalize_tags() (already imported in the file) is now called on the user-supplied tag list before joining in both event() and service_check(), exactly as _serialize_metric() already does.

# event()
- string = "%s|#%s" % (string, ",".join(tags))
+ string = "%s|#%s" % (string, ",".join(normalize_tags(tags)))

# service_check()
- string = u"{0}|#{1}".format(string, ",".join(tags))
+ string = u"{0}|#{1}".format(string, ",".join(normalize_tags(tags)))

normalize_tags replaces any character outside the set [word chars, digits, _, -, :, /, .] with _, so \n → _ and | → _.
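
The replacement rule can be sketched as follows. This is an approximation built from the character class described above, not a copy of the library's implementation; the names `TAG_INVALID_CHARS_SKETCH` and `normalize_tags_sketch` are invented for this example.

```python
import re

# Assumption: anything outside word characters, digits, _ - : / . is invalid.
TAG_INVALID_CHARS_SKETCH = re.compile(r"[^\w\d_\-:/\.]", re.UNICODE)


def normalize_tags_sketch(tag_list):
    """Replace every invalid character in each tag with an underscore."""
    return [TAG_INVALID_CHARS_SKETCH.sub("_", tag) for tag in tag_list]


normalized = normalize_tags_sketch(["env:prod\nfake.metric:1|c", "team:a|b"])
joined = ",".join(normalized)
print(joined)
```

Because the characters are replaced rather than removed, the tag keeps its length and its `key:value` shape, and the joined string contains neither protocol delimiter.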

Alternate Designs

  • Raise an exception on invalid characters: rejected; it would be a breaking change for callers that currently pass tags containing spaces or other disallowed characters, which are silently normalised today.
  • Strip the offending characters entirely — the existing normalize_tags replaces with _ to preserve tag structure; changing that would be a separate, larger decision.

Possible Drawbacks

  • Tag values that previously contained | or \n will now appear with _ in those positions inside Datadog. This is a correctness fix: the previous behaviour was silently malformed on the wire. Any dashboard/alert that matched on a tag containing a literal | or \n would have been matching against accidental protocol injection rather than a real tag value.

Verification Process

Ran the new regression tests and the full existing event / service_check test suite locally:

pytest tests/unit/dogstatsd/test_statsd.py -k "event or service_check" -v
# 13 passed, 0 failed

Four new tests were added alongside the existing test_pipe_in_tags (which covers _serialize_metric):

Test                                 What it checks
test_pipe_in_event_tags              | in event tag → replaced with _, not treated as field delimiter
test_newline_in_event_tags           \n in event tag → replaced with _, not treated as packet delimiter
test_pipe_in_service_check_tags      | in service_check tag → replaced with _
test_newline_in_service_check_tags   \n in service_check tag → replaced with _
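
The tests listed above might look roughly like the sketch below. The actual test bodies are not shown in this PR description, so everything here is an assumption: the two serializers are simplified stand-ins for the patched code paths, and `normalize_tags_sketch` approximates the library helper.

```python
import re
import unittest

_INVALID = re.compile(r"[^\w\d_\-:/\.]", re.UNICODE)


def normalize_tags_sketch(tags):
    # Approximation of normalize_tags(): invalid characters become "_".
    return [_INVALID.sub("_", tag) for tag in tags]


def serialize_event(title, text, tags=None):
    # Simplified stand-in for the patched event() tag handling.
    payload = "_e{%d,%d}:%s|%s" % (len(title), len(text), title, text)
    if tags:
        payload = "%s|#%s" % (payload, ",".join(normalize_tags_sketch(tags)))
    return payload


def serialize_service_check(check_name, status, tags=None):
    # Simplified stand-in for the patched service_check() tag handling.
    payload = u"_sc|{0}|{1}".format(check_name, status)
    if tags:
        payload = u"{0}|#{1}".format(payload, ",".join(normalize_tags_sketch(tags)))
    return payload


class TagInjectionSketch(unittest.TestCase):
    def test_pipe_in_event_tags(self):
        payload = serialize_event("title", "text", tags=["key:va|lue"])
        self.assertEqual(payload, "_e{5,4}:title|text|#key:va_lue")

    def test_newline_in_service_check_tags(self):
        payload = serialize_service_check("my.check", 0, tags=["key:va\nlue"])
        self.assertEqual(payload, "_sc|my.check|0|#key:va_lue")
        self.assertNotIn("\n", payload)
```

Saved to a file, the sketch can be run with `python -m unittest <file>`; comparing the full payload, rather than only checking for the absence of a delimiter, also catches accidental changes to field order.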

Additional Notes

The hostname, aggregation_key, source_type_name, priority, and alert_type parameters of event(), and check_name / hostname of service_check(), are also currently unsanitized. A newline in any of those would similarly inject content. Those fields are far less commonly populated from external input than tags, and fixing them is a slightly larger change; left for a follow-up if desired.
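
As an illustration of the remaining gap, the sketch below appends a hostname without filtering. `serialize_event_with_host` is a hypothetical stand-in that assumes only the `|h:hostname` field layout; it is not the library's code and is not changed by this PR.

```python
def serialize_event_with_host(title, text, hostname=None):
    """Hypothetical stand-in: the hostname is appended as-is."""
    payload = "_e{%d,%d}:%s|%s" % (len(title), len(text), title, text)
    if hostname:
        payload = "%s|h:%s" % (payload, hostname)
    return payload


payload = serialize_event_with_host(
    "deploy", "done", hostname="web-1\nfake.metric:1|c"
)
datagrams = payload.split("\n")
print(datagrams)
```

The second element is the injected content, which is the same failure mode as for tags but reached through a different parameter.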

Release Notes

Tags containing | or \n passed to statsd.event() and statsd.service_check() are now sanitized (special characters replaced with _), consistent with the existing behaviour of all metric methods.

Review checklist (to be filled by reviewers)

  • Feature or bug fix MUST have appropriate tests (unit, integration, etc...)
  • PR title must be written as a CHANGELOG entry (see why)
  • File changes must correspond to the primary purpose of the PR as described in the title (small unrelated changes should have their own PR)
  • PR must have one changelog/ label attached. If applicable it should have the backward-incompatible label attached.
  • PR should not have do-not-merge/ label attached.
  • If applicable, the issue must have at least kind/ and severity/ labels attached.

Tags passed to event() and service_check() were joined into the
DogStatsD wire payload without calling normalize_tags(), leaving
newline (\n) and pipe (|) characters unfiltered. These are the
protocol delimiters used by the agent to split packets and fields,
so unsanitized values could inject spurious metrics or spoof event
fields (hostname, aggregation key, etc.).

The same fix was applied to _serialize_metric() in 2020 (commit
11fe1d8, "Remove illegal characters from tags") but was not extended
to the other two serialization paths.

Fixes the inconsistency by applying normalize_tags() in both methods,
matching the existing behavior in _serialize_metric(). Adds four
regression tests covering pipe and newline injection for both methods.

Relates to: DataDog#19
tomohiro86 requested review from a team as code owners June 6, 2026 15:31
atanzu requested a review from Copilot June 8, 2026 13:34

Copilot AI left a comment

Pull request overview

This PR fixes a DogStatsD wire-protocol injection gap by applying existing tag sanitization (normalize_tags()) to the event() and service_check() code paths, aligning them with the already-sanitized metric serialization path.

Changes:

  • Sanitize tags for DogStatsd.event() by normalizing tags before serializing into the event packet.
  • Sanitize tags for DogStatsd.service_check() by normalizing tags before serializing into the service check packet.
  • Add regression tests ensuring | and \n inside event/service_check tags are normalized and cannot inject protocol delimiters.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File                                  Description
datadog/dogstatsd/base.py             Applies normalize_tags() when serializing tags for events and service checks to prevent delimiter injection.
tests/unit/dogstatsd/test_statsd.py   Adds regression tests covering `

Comment thread tests/unit/dogstatsd/test_statsd.py Outdated (4 threads)
Replace bare `assert` statements, which can be optimized away.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
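
For context on why bare `assert` is fragile in tests, the sketch below compiles the same statement with and without optimization; `optimize=1` corresponds to running `python -O`. It assumes the replacement uses `unittest` assertion methods, which this page does not show.

```python
import unittest


def raises_assertion(code_obj):
    """Run compiled code and report whether it raised AssertionError."""
    try:
        exec(code_obj, {})
    except AssertionError:
        return True
    return False


source = "assert 1 == 2"
normal = compile(source, "<demo>", "exec", optimize=0)     # asserts kept
optimized = compile(source, "<demo>", "exec", optimize=1)  # asserts removed

bare_fails_normally = raises_assertion(normal)
bare_fails_optimized = raises_assertion(optimized)

# unittest assertion methods are ordinary calls, so they are never stripped.
try:
    unittest.TestCase().assertEqual(1, 2)
    method_fails = False
except AssertionError:
    method_fails = True

print(bare_fails_normally, bare_fails_optimized, method_fails)
```

Under optimization the bare `assert` silently passes, while the method call still reports the mismatch.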
atanzu added the changelog/Fixed label Jun 8, 2026
Labels

changelog/Fixed Fixed features results into a bug fix version bump