
Fix dead condition, division-by-zero, and uninitialized members in LLM stats (#18819)

Open
kirklandsign wants to merge 1 commit into main from export-D99708774

Conversation

Contributor

@kirklandsign kirklandsign commented Apr 10, 2026

Summary:

Three issues fixed:

  1. text_llm_runner.cpp: The condition num_generated_tokens == max_new_tokens
    was always false because TextTokenGenerator::generate() receives
    max_new_tokens - 1. Fixed to compare against max_new_tokens - 1.

  2. stats.h print_report(): Division by zero when inference/prefill/decode
    time is zero (e.g., during very fast warmup runs). Added guards matching
    the pattern already used in stats_to_json_string().

  3. stats.h Stats: Added default initializers (= 0) to all timestamp and
    counter members to prevent undefined behavior from uninitialized reads.
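Fix 1 can be sketched as below. This is a minimal, hypothetical illustration, not the actual runner code: the real check lives in TextLLMRunner::generate(), and the helper name here is made up for the sketch.

```cpp
#include <cstdint>

// Hypothetical sketch of fix 1: the token generator is handed a budget of
// max_new_tokens - 1, so the "budget exhausted" check must compare against
// max_new_tokens - 1 as well.
bool reached_max_new_tokens(int64_t num_generated_tokens,
                            int64_t max_new_tokens) {
  // Before the fix this compared against max_new_tokens, a value the
  // counter could never reach, making the branch dead code.
  return num_generated_tokens == max_new_tokens - 1;
}
```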

Reviewed By: manuelcandales

Differential Revision: D99708774

Copilot AI review requested due to automatic review settings April 10, 2026 18:00
@pytorch-bot

pytorch-bot bot commented Apr 10, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18819

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 3 New Failures, 18 Cancelled Jobs, 5 Unrelated Failures

As of commit 7a0d529 with merge base 5e8a0df:

NEW FAILURES - The following jobs have failed:

CANCELLED JOBS - The following jobs were cancelled. Please retry:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 10, 2026
@meta-codesync
Contributor

meta-codesync bot commented Apr 10, 2026

@kirklandsign has exported this pull request. If you are a Meta employee, you can view the originating Diff in D99708774.

@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Contributor

Copilot AI left a comment


Pull request overview

Fixes correctness/robustness issues in the LLM runner stats/reporting path, improving runtime diagnostics and avoiding undefined behavior in edge cases (warmup/fast runs and uninitialized reads).

Changes:

  • Fix off-by-one “max new tokens reached” condition in TextLLMRunner::generate().
  • Guard token-rate computations in print_report() against division-by-zero.
  • Add default initializers to Stats timestamp/counter members to avoid uninitialized reads.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Files reviewed:

  • extension/llm/runner/text_llm_runner.cpp — Fixes a dead/incorrect condition by aligning the comparison with the max_new_tokens - 1 passed to the generator.
  • extension/llm/runner/stats.h — Initializes Stats members to safe defaults and prevents division-by-zero in printed token-rate metrics.


Comment on lines 192 to 194
"\t\tGenerated %" PRIu64
" tokens:\t%f (seconds)\t\t Rate: \t%f (tokens/second)",
stats.num_generated_tokens,

Copilot AI Apr 10, 2026


stats.num_generated_tokens is int64_t, but the format string uses PRIu64. This is a signed/unsigned format mismatch and can lead to undefined behavior or incorrect output on some platforms. Use the matching PRId64 (or cast the value to uint64_t if it’s guaranteed non-negative) to align the format specifier with the argument type.
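The suggested change can be sketched as below, assuming a small helper that formats the counter; the helper name and buffer size are illustrative, not part of the PR.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Sketch of the suggested fix: int64_t is signed, so the matching format
// macro is PRId64; PRIu64 would reinterpret the argument as unsigned.
std::string format_generated(int64_t num_generated_tokens) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "Generated %" PRId64 " tokens",
                num_generated_tokens);
  return std::string(buf);
}
```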

@meta-codesync meta-codesync bot changed the title Fix dead condition, division-by-zero, and uninitialized members in LLM stats Fix dead condition, division-by-zero, and uninitialized members in LLM stats (#18819) Apr 10, 2026
meta-codesync bot pushed a commit that referenced this pull request Apr 10, 2026
kirklandsign added a commit that referenced this pull request Apr 10, 2026
Copilot AI review requested due to automatic review settings April 10, 2026 18:15
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.



Comment on lines +168 to +176
stats.SCALING_FACTOR_UNITS_PER_SECOND);
inference_time_ms > 0? (stats.num_generated_tokens) / inference_time_ms *
stats.SCALING_FACTOR_UNITS_PER_SECOND
: 0.0);

Copilot AI Apr 10, 2026


Consider adding a small regression test that calls print_report() with Stats timestamps set such that inference_time_ms/prompt_eval_time/eval_time are 0 (or negative) to ensure the rate calculations don’t reintroduce division-by-zero/inf outputs. The LLM runner already has gtest coverage (e.g., extension/llm/runner/test/test_text_llm_runner.cpp), so this should be straightforward to exercise in CI.
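Such a check could be sketched with plain assertions rather than gtest; guarded_rate below is a hypothetical stand-in for the guarded computation in print_report(), not the actual function.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for the guarded rate computation in print_report():
// a zero (or negative) elapsed time yields 0.0 instead of inf/nan.
double guarded_rate(int64_t tokens, double elapsed_ms, double scale) {
  return elapsed_ms > 0 ? tokens / elapsed_ms * scale : 0.0;
}
```

A regression test would assert that the zero-duration case stays finite, and that the normal case still produces the expected rate.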

meta-codesync bot pushed a commit that referenced this pull request Apr 10, 2026
meta-codesync bot pushed a commit that referenced this pull request Apr 10, 2026
meta-codesync bot pushed a commit that referenced this pull request Apr 10, 2026
meta-codesync bot pushed a commit that referenced this pull request Apr 10, 2026
Copilot AI review requested due to automatic review settings April 10, 2026 18:48
@kirklandsign review requested due to automatic review settings April 10, 2026 18:48
meta-codesync bot pushed a commit that referenced this pull request Apr 10, 2026
meta-codesync bot pushed a commit that referenced this pull request Apr 10, 2026
kirklandsign added a commit that referenced this pull request Apr 10, 2026
meta-codesync bot pushed a commit that referenced this pull request Apr 10, 2026
meta-codesync bot pushed a commit that referenced this pull request Apr 10, 2026
Copilot AI review requested due to automatic review settings April 10, 2026 18:59
@kirklandsign review requested due to automatic review settings April 10, 2026 18:59
Copilot AI review requested due to automatic review settings April 10, 2026 19:13
@kirklandsign review requested due to automatic review settings April 10, 2026 19:13

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported meta-exported


2 participants