[improve][broker] getPartitionedStats should return 404 instead of empty stats when topic is not loaded #26041

Open
zjxxzjwang wants to merge 1 commit into apache:master from zjxxzjwang:master-etPartitionedStats-should-return-404

@zjxxzjwang

Contributor

Motivation

When the admin REST API getPartitionedStats is called for a partitioned topic whose partitions have not been loaded yet (i.e. none of the partitions has a managed-ledgers znode created in the metadata store), the current implementation returns HTTP 200 OK with an "empty" PartitionedTopicStatsImpl object: all counters are zero, and the partitions map is either empty or contains only a placeholder TopicStatsImpl.

This behavior is inconsistent with the non-partitioned counterpart getStats, which correctly returns 404 NOT_FOUND in the same situation. As a result:

  • Clients / monitoring systems cannot reliably distinguish "topic exists but has no traffic" from "topic has never been loaded / does not really exist on the broker side", because both cases return a successful empty stats response.
  • Users get a confusing experience: a topic that does not actually exist on the data plane still appears to be "queryable" via the partitioned stats endpoint.

The two endpoints should behave consistently: when the underlying topic data is not present, both should return 404.

Modifications

In PersistentTopicsBase#internalGetPartitionedStats:

  1. After FutureUtil.waitForAll(topicStatsFutureList) completes, iterate over the per-partition stats futures and count how many partitions actually returned stats (successCount). A partition whose managed-ledgers znode does not exist fails its getStatsAsync call (404 from the per-partition path), so its future completes exceptionally and is skipped. This is exactly the case we want to detect.
  2. If successCount == 0 (i.e. no partition produced stats), resume the response with RestException(NOT_FOUND, getPartitionedTopicNotFoundErrorMessage(topicName)), the same error path used by getStats. This aligns the 404 semantics and error message between the two APIs.
  3. Remove the now-unreachable if (perPartition && stats.partitions.isEmpty()) fallback branch, which previously called partitionedTopicExistsAsync and either inserted a placeholder empty TopicStatsImpl or returned "Internal topics have not been generated yet". With the successCount == 0 check placed earlier, this branch is dead code: whenever successCount > 0 and perPartition == true, stats.partitions is guaranteed to be non-empty, because the put happens in the same if block as successCount++.
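
For reviewers who want to see the aggregation rule in isolation, here is a minimal, self-contained sketch of the idea. It is not the code in this PR: AggregatedStats, NotFoundException, and aggregate are hypothetical stand-ins for PartitionedTopicStatsImpl, RestException(NOT_FOUND, ...), and the aggregation step of internalGetPartitionedStats, and each partition's stats are reduced to a single long counter. The error message text is also illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-ins for illustration; none of these are Pulsar classes.
final class PartitionedStatsSketch {

    /** Simplified stand-in for the aggregated partitioned stats object. */
    static final class AggregatedStats {
        long msgInCounter;
        final Map<String, Long> partitions = new LinkedHashMap<>();
    }

    /** Stand-in for a 404 REST error. */
    static final class NotFoundException extends RuntimeException {
        NotFoundException(String message) {
            super(message);
        }
    }

    static CompletableFuture<AggregatedStats> aggregate(
            String topic,
            List<CompletableFuture<Long>> perPartitionFutures,
            boolean perPartition) {
        // Wait until every per-partition future has settled. The combined
        // failure is swallowed so each future can be inspected individually.
        CompletableFuture<Void> allSettled = CompletableFuture
                .allOf(perPartitionFutures.toArray(new CompletableFuture[0]))
                .handle((ignored, error) -> null);

        return allSettled.thenApply(ignored -> {
            AggregatedStats stats = new AggregatedStats();
            int successCount = 0;
            for (int i = 0; i < perPartitionFutures.size(); i++) {
                CompletableFuture<Long> future = perPartitionFutures.get(i);
                if (future.isCompletedExceptionally()) {
                    continue; // partition not loaded: skip it
                }
                long value = future.join();
                stats.msgInCounter += value;
                if (perPartition) {
                    stats.partitions.put(topic + "-partition-" + i, value);
                }
                successCount++;
            }
            if (successCount == 0) {
                // No partition produced stats: surface "not found"
                // instead of returning an empty, successful result.
                throw new NotFoundException("Partitioned topic not found: " + topic);
            }
            return stats;
        });
    }
}
```

In this sketch, a list containing one successful and one failed future yields stats for the successful partition only, while a list in which every future failed completes the returned future exceptionally with NotFoundException as the cause. The put into partitions sits next to successCount++, which is why a non-zero successCount with perPartition == true implies a non-empty map.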

The change is minimal and only touches the result-aggregation step inside internalGetPartitionedStats; the per-partition fetching logic (isServiceUnitOwnedAsync → local asyncGetStats / remote admin call) is unchanged.

Verifying this change

  • Make sure that the change passes the CI checks.

This change is already covered by existing tests for getStats 404 behavior on partitioned topics; the new branch reuses the exact same error message (getPartitionedTopicNotFoundErrorMessage) and HTTP status, so existing assertions that rely on 404 for unloaded topics now also apply to the partitioned-stats endpoint.

If reviewers prefer additional explicit coverage, a unit / integration test can be added that:

  • Creates a partitioned topic metadata entry but does not produce / consume on it (so no managed-ledgers znode is created).
  • Calls admin.topics().getPartitionedStats(topic) and asserts a NotFoundException (HTTP 404) is thrown, matching the behavior of admin.topics().getStats(topic) in the same setup.

Does this pull request potentially affect one of the following parts:

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Highlight on REST endpoints

GET /admin/v2/persistent/{tenant}/{namespace}/{topic}/partitioned-stats now returns 404 NOT_FOUND with message "Partitioned Topic not found: <topic> has zero partitions" (the standard getPartitionedTopicNotFoundErrorMessage) when no partition of the topic has been loaded, instead of returning 200 OK with an empty stats object.

Consumers of this endpoint that previously relied on receiving an empty-but-successful response for unloaded topics will need to handle 404 — which is already the expected behavior of the non-partitioned getStats API.
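
As a rough illustration of what that handling could look like on the caller side, the sketch below maps the HTTP status of a partitioned-stats request to an outcome. Outcome, classify, and statsPath are hypothetical names introduced for this example, not part of any Pulsar client; the path layout follows the endpoint quoted above.

```java
// Hypothetical caller-side helper; not part of the Pulsar admin client.
final class PartitionedStatsClientSketch {

    enum Outcome { STATS_AVAILABLE, TOPIC_NOT_LOADED, OTHER_ERROR }

    /** Builds the request path for the partitioned-stats endpoint. */
    static String statsPath(String tenant, String namespace, String topic) {
        return "/admin/v2/persistent/" + tenant + "/" + namespace + "/" + topic
                + "/partitioned-stats";
    }

    /** Maps an HTTP status code to an outcome under the new semantics. */
    static Outcome classify(int httpStatus) {
        if (httpStatus == 200) {
            // Stats were returned, so at least one partition produced stats.
            return Outcome.STATS_AVAILABLE;
        }
        if (httpStatus == 404) {
            // With this change: no partition of the topic has been loaded.
            return Outcome.TOPIC_NOT_LOADED;
        }
        return Outcome.OTHER_ERROR;
    }
}
```

A monitoring job could feed the status code of its HTTP response into classify and treat TOPIC_NOT_LOADED as "no data yet" rather than as a zero-traffic topic.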
