fix: truncate oversized output token strings by shihaobai · Pull Request #1359 · ModelTC/LightLLM

shihaobai · 2026-06-16T06:27:54Z

No description provided.

gemini-code-assist

Code Review

This pull request replaces an assertion check with truncation logic and warning logging when a token string exceeds the maximum allowed bytes in out_token_circlequeue.py. The reviewer recommended pre-truncating the string before encoding to UTF-8 to prevent potential memory issues and performance degradation if an extremely large string is processed.

gemini-code-assist · 2026-06-16T06:29:18Z

        str_bytes = token_str.encode("utf-8")
-        assert (
-            len(str_bytes) <= LIGHTLLM_TOKEN_MAX_BYTES
-        ), f"Token string {len(str_bytes)} exceeds maximum length of {LIGHTLLM_TOKEN_MAX_BYTES} bytes."
+        if len(str_bytes) > LIGHTLLM_TOKEN_MAX_BYTES:
+            old_len = len(str_bytes)
+            str_bytes = str_bytes[:LIGHTLLM_TOKEN_MAX_BYTES].decode("utf-8", errors="ignore").encode("utf-8")
+            logger.warning(
+                f"Token string {old_len} exceeds maximum length of {LIGHTLLM_TOKEN_MAX_BYTES} bytes, "
+                f"truncated to {len(str_bytes)} bytes."
+            )
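
To illustrate what the added branch does, here is a minimal standalone sketch of the same slice, decode, re-encode pattern. The name `truncate_utf8` and the small `MAX_BYTES` value are hypothetical stand-ins chosen for the example (the real limit is `LIGHTLLM_TOKEN_MAX_BYTES`, whose value is not shown in this PR), and the logging call is left out.

```python
MAX_BYTES = 8  # hypothetical small limit; stands in for LIGHTLLM_TOKEN_MAX_BYTES


def truncate_utf8(token_str: str, max_bytes: int = MAX_BYTES) -> bytes:
    """Encode token_str as UTF-8 and cut it to at most max_bytes bytes."""
    str_bytes = token_str.encode("utf-8")
    if len(str_bytes) > max_bytes:
        # A raw byte slice can end in the middle of a multi-byte character.
        # Decoding with errors="ignore" drops that incomplete tail, so the
        # re-encoded result is valid UTF-8 and never longer than max_bytes.
        str_bytes = str_bytes[:max_bytes].decode("utf-8", errors="ignore").encode("utf-8")
    return str_bytes


short = truncate_utf8("abc")      # under the limit, returned unchanged
cut = truncate_utf8("你好世界")    # 4 characters of 3 bytes each, 12 bytes total
print(short, cut.decode("utf-8"), len(cut))
```

With an 8-byte limit the slice keeps two whole characters plus two bytes of the third; the partial character is discarded, so the result can be shorter than the limit.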


If token_str is extremely large (e.g., due to an unexpected generation length or corrupted output), encoding the entire string to bytes first is wasteful: it allocates a byte buffer at least as long as the string, only to discard almost all of it. Since every character encodes to at least one byte in UTF-8, the first LIGHTLLM_TOKEN_MAX_BYTES characters always cover at least the first LIGHTLLM_TOKEN_MAX_BYTES bytes, so we can safely pre-truncate token_str to LIGHTLLM_TOKEN_MAX_BYTES characters before encoding. This avoids encoding a potentially massive string while still guaranteeing that the final byte length will be within the limit after the subsequent byte-level truncation.

Suggested change

-        str_bytes = token_str.encode("utf-8")
-        assert (
-            len(str_bytes) <= LIGHTLLM_TOKEN_MAX_BYTES
-        ), f"Token string {len(str_bytes)} exceeds maximum length of {LIGHTLLM_TOKEN_MAX_BYTES} bytes."
-        if len(str_bytes) > LIGHTLLM_TOKEN_MAX_BYTES:
-            old_len = len(str_bytes)
-            str_bytes = str_bytes[:LIGHTLLM_TOKEN_MAX_BYTES].decode("utf-8", errors="ignore").encode("utf-8")
-            logger.warning(
-                f"Token string {old_len} exceeds maximum length of {LIGHTLLM_TOKEN_MAX_BYTES} bytes, "
-                f"truncated to {len(str_bytes)} bytes."
-            )
+        if len(token_str) > LIGHTLLM_TOKEN_MAX_BYTES:
+            token_str = token_str[:LIGHTLLM_TOKEN_MAX_BYTES]
+        str_bytes = token_str.encode("utf-8")
+        if len(str_bytes) > LIGHTLLM_TOKEN_MAX_BYTES:
+            old_len = len(str_bytes)
+            str_bytes = str_bytes[:LIGHTLLM_TOKEN_MAX_BYTES].decode("utf-8", errors="ignore").encode("utf-8")
+            logger.warning(
+                f"Token string {old_len} exceeds maximum length of {LIGHTLLM_TOKEN_MAX_BYTES} bytes, "
+                f"truncated to {len(str_bytes)} bytes."
+            )
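
The claim that pre-truncating by characters does not change the final bytes can be checked with a small sketch. Both function names and the `MAX_BYTES` value below are hypothetical, chosen only for this comparison; logging is omitted.

```python
MAX_BYTES = 8  # hypothetical small limit; stands in for LIGHTLLM_TOKEN_MAX_BYTES


def truncate_after_encode(token_str: str, max_bytes: int = MAX_BYTES) -> bytes:
    """Encode the whole string first, then cut at the byte level."""
    str_bytes = token_str.encode("utf-8")
    if len(str_bytes) > max_bytes:
        str_bytes = str_bytes[:max_bytes].decode("utf-8", errors="ignore").encode("utf-8")
    return str_bytes


def truncate_with_pre_cut(token_str: str, max_bytes: int = MAX_BYTES) -> bytes:
    """Cut to max_bytes characters first, so at most 4 * max_bytes bytes are encoded."""
    if len(token_str) > max_bytes:
        # Each character is at least one byte, so nothing that could survive
        # the byte-level cut is lost here.
        token_str = token_str[:max_bytes]
    return truncate_after_encode(token_str, max_bytes)


samples = ["abc", "a" * 100, "你好世界", "a你" * 50, "😀" * 10]
results = [(truncate_after_encode(s), truncate_with_pre_cut(s)) for s in samples]
print(all(a == b for a, b in results))
```

One behavioral difference follows from the suggested code as written: after the character-level cut, an oversized ASCII-only string encodes to exactly the limit, so the `len(str_bytes) > LIGHTLLM_TOKEN_MAX_BYTES` check is false and no warning is logged, and for other inputs `old_len` reports the pre-truncated length rather than the original one.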

sufubao · 2026-06-16T06:42:38Z

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

fix: truncate oversized output token strings

ce61e5c

gemini-code-assist Bot reviewed Jun 16, 2026

shihaobai force-pushed the decode_fix branch from 79e005c to ce61e5c June 16, 2026 06:59

shihaobai merged commit b28eeac into main Jun 16, 2026
2 checks passed

shihaobai deleted the decode_fix branch June 16, 2026 07:21
