fix: truncate oversized output token strings #1359
Code Review
This pull request replaces an assertion check with truncation logic and warning logging when a token string exceeds the maximum allowed bytes in out_token_circlequeue.py. The reviewer recommended pre-truncating the string before encoding to UTF-8 to prevent potential memory issues and performance degradation if an extremely large string is processed.
  str_bytes = token_str.encode("utf-8")
- assert (
-     len(str_bytes) <= LIGHTLLM_TOKEN_MAX_BYTES
- ), f"Token string {len(str_bytes)} exceeds maximum length of {LIGHTLLM_TOKEN_MAX_BYTES} bytes."
+ if len(str_bytes) > LIGHTLLM_TOKEN_MAX_BYTES:
+     old_len = len(str_bytes)
+     str_bytes = str_bytes[:LIGHTLLM_TOKEN_MAX_BYTES].decode("utf-8", errors="ignore").encode("utf-8")
+     logger.warning(
+         f"Token string {old_len} exceeds maximum length of {LIGHTLLM_TOKEN_MAX_BYTES} bytes, "
+         f"truncated to {len(str_bytes)} bytes."
+     )
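The decode/encode round trip with errors="ignore" is what keeps the truncated bytes valid UTF-8: a raw byte slice can end in the middle of a multi-byte character, and the round trip drops that dangling partial sequence. Below is a minimal sketch of that behavior, not the project's code. MAX_BYTES is a hypothetical 8-byte stand-in for LIGHTLLM_TOKEN_MAX_BYTES (whose real value is not shown in this PR), and truncate_utf8 is a hypothetical helper name.

```python
MAX_BYTES = 8  # hypothetical stand-in for LIGHTLLM_TOKEN_MAX_BYTES


def truncate_utf8(token_str: str, max_bytes: int = MAX_BYTES) -> bytes:
    """Return token_str encoded as UTF-8, cut to at most max_bytes valid bytes."""
    str_bytes = token_str.encode("utf-8")
    if len(str_bytes) > max_bytes:
        # A raw slice may end in the middle of a multi-byte character;
        # decoding with errors="ignore" drops that incomplete tail.
        str_bytes = str_bytes[:max_bytes].decode("utf-8", errors="ignore").encode("utf-8")
    return str_bytes


if __name__ == "__main__":
    # 3 ASCII bytes followed by four characters of 3 bytes each (15 bytes total).
    sample = "abc" + "你好世界"
    out = truncate_utf8(sample)
    print(len(out), out.decode("utf-8"))
```

With an 8-byte limit the slice ends two bytes into the second CJK character, so the result is shorter than the limit (6 bytes) rather than containing a broken character. This is why the warning reports len(str_bytes) after truncation instead of assuming it equals the limit.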
If token_str is extremely large (e.g., because of an unexpectedly long generation or corrupted output), encoding the whole string to bytes first wastes time and memory. Every character occupies at least one byte in UTF-8, so any characters beyond the first LIGHTLLM_TOKEN_MAX_BYTES can never fit within the limit, and token_str can safely be pre-truncated to LIGHTLLM_TOKEN_MAX_BYTES characters before encoding. This bounds the size of the encoded buffer, and the subsequent byte-level truncation still guarantees that the final byte length stays within the limit.
Suggested change:
  if len(token_str) > LIGHTLLM_TOKEN_MAX_BYTES:
      token_str = token_str[:LIGHTLLM_TOKEN_MAX_BYTES]
  str_bytes = token_str.encode("utf-8")
  if len(str_bytes) > LIGHTLLM_TOKEN_MAX_BYTES:
      old_len = len(str_bytes)
      str_bytes = str_bytes[:LIGHTLLM_TOKEN_MAX_BYTES].decode("utf-8", errors="ignore").encode("utf-8")
      logger.warning(
          f"Token string {old_len} exceeds maximum length of {LIGHTLLM_TOKEN_MAX_BYTES} bytes, "
          f"truncated to {len(str_bytes)} bytes."
      )
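One side effect of pre-truncating is worth noting: once token_str has been shortened, len(str_bytes) describes the shortened string, so old_len no longer reports the original size, and an oversized all-ASCII string lands exactly on the limit without triggering the warning at all. The following is a rough sketch of one way to keep the warning accurate, under stated assumptions and not a proposal taken from the PR. MAX_BYTES (8 bytes here) and encode_token are hypothetical names, and the logger is a plain standard-library logger rather than the project's own.

```python
import logging

logger = logging.getLogger(__name__)
MAX_BYTES = 8  # hypothetical stand-in for LIGHTLLM_TOKEN_MAX_BYTES


def encode_token(token_str: str, max_bytes: int = MAX_BYTES) -> bytes:
    """Encode token_str to at most max_bytes of valid UTF-8, warning on truncation."""
    old_chars = len(token_str)
    # Each character is at least one byte in UTF-8, so characters beyond
    # max_bytes can never fit; dropping them first bounds the encode cost.
    truncated = old_chars > max_bytes
    if truncated:
        token_str = token_str[:max_bytes]
    str_bytes = token_str.encode("utf-8")
    if len(str_bytes) > max_bytes:
        truncated = True
        # Drop any partial multi-byte character left by the raw byte slice.
        str_bytes = str_bytes[:max_bytes].decode("utf-8", errors="ignore").encode("utf-8")
    if truncated:
        # Report the original character count, since the original byte
        # length is deliberately never computed.
        logger.warning(
            "Token string of %d characters exceeds maximum length of %d bytes, "
            "truncated to %d bytes.",
            old_chars, max_bytes, len(str_bytes),
        )
    return str_bytes


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    print(encode_token("a" * 100))   # oversized ASCII: pre-truncation alone suffices
    print(encode_token("你" * 100))  # oversized multi-byte: both steps apply
    print(encode_token("ok"))        # within the limit: no warning
```

The trade-off is that the warning reports the original length in characters rather than bytes, because computing the original byte length would require the full encode that pre-truncation is meant to avoid.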
Code review
No issues found. Checked for bugs and CLAUDE.md compliance.
🤖 Generated with Claude Code
No description provided.