fix: truncate oversized output token strings #1359

Merged
shihaobai merged 1 commit into main from decode_fix on Jun 16, 2026

Conversation

@shihaobai

Collaborator

No description provided.

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request replaces an assertion check with truncation logic and warning logging when a token string exceeds the maximum allowed bytes in out_token_circlequeue.py. The reviewer recommended pre-truncating the string before encoding to UTF-8 to prevent potential memory issues and performance degradation if an extremely large string is processed.

Comment on lines 31 to +38
  str_bytes = token_str.encode("utf-8")
- assert (
-     len(str_bytes) <= LIGHTLLM_TOKEN_MAX_BYTES
- ), f"Token string {len(str_bytes)} exceeds maximum length of {LIGHTLLM_TOKEN_MAX_BYTES} bytes."
+ if len(str_bytes) > LIGHTLLM_TOKEN_MAX_BYTES:
+     old_len = len(str_bytes)
+     str_bytes = str_bytes[:LIGHTLLM_TOKEN_MAX_BYTES].decode("utf-8", errors="ignore").encode("utf-8")
+     logger.warning(
+         f"Token string {old_len} exceeds maximum length of {LIGHTLLM_TOKEN_MAX_BYTES} bytes, "
+         f"truncated to {len(str_bytes)} bytes."
+     )
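
To make the byte-level truncation concrete, here is a minimal standalone sketch of the same slice, decode, re-encode step. The function name `truncate_utf8` and the limit `MAX_BYTES = 8` are assumptions chosen for illustration; the real value of LIGHTLLM_TOKEN_MAX_BYTES is not shown in this PR.

```python
# Hypothetical limit for illustration only; not the project's real value.
MAX_BYTES = 8


def truncate_utf8(token_str: str, max_bytes: int = MAX_BYTES) -> bytes:
    """Encode token_str as UTF-8 and cut it to at most max_bytes bytes."""
    str_bytes = token_str.encode("utf-8")
    if len(str_bytes) > max_bytes:
        # A raw byte slice can end in the middle of a multi-byte character.
        # Decoding with errors="ignore" drops that incomplete tail, so the
        # re-encoded result is valid UTF-8 and never longer than max_bytes.
        str_bytes = str_bytes[:max_bytes].decode("utf-8", errors="ignore").encode("utf-8")
    return str_bytes


short = truncate_utf8("hello")      # 5 bytes, fits, returned unchanged
cut = truncate_utf8("你好世界")      # 12 bytes; the 8-byte slice splits the third character
print(len(short), len(cut))
```

Each character in the second string takes 3 bytes, so the 8-byte slice holds two whole characters plus a partial one, and the result shrinks to 6 bytes rather than exactly 8.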

medium

If token_str is extremely large (e.g., due to an unexpected generation length or corrupted output), encoding the entire string to bytes first is wasteful and can consume significant memory. Since every character occupies at least 1 byte in UTF-8, the first LIGHTLLM_TOKEN_MAX_BYTES characters already encode to at least LIGHTLLM_TOKEN_MAX_BYTES bytes, so we can safely pre-truncate token_str to that many characters before encoding. This avoids encoding a potentially massive string, and the subsequent byte-level truncation still guarantees that the final byte length is within the limit.

Suggested change
if len(token_str) > LIGHTLLM_TOKEN_MAX_BYTES:
    token_str = token_str[:LIGHTLLM_TOKEN_MAX_BYTES]
str_bytes = token_str.encode("utf-8")
if len(str_bytes) > LIGHTLLM_TOKEN_MAX_BYTES:
    old_len = len(str_bytes)
    str_bytes = str_bytes[:LIGHTLLM_TOKEN_MAX_BYTES].decode("utf-8", errors="ignore").encode("utf-8")
    logger.warning(
        f"Token string {old_len} exceeds maximum length of {LIGHTLLM_TOKEN_MAX_BYTES} bytes, "
        f"truncated to {len(str_bytes)} bytes."
    )

@sufubao

sufubao commented Jun 16, 2026

Collaborator

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

@shihaobai shihaobai merged commit b28eeac into main Jun 16, 2026
2 checks passed
@shihaobai shihaobai deleted the decode_fix branch June 16, 2026 07:21