Is there an existing issue ? / 是否已有相关的 issue ?
Describe the bug / 描述这个 bug
调用本地部署的 minicpm5-1b(FP16 精度)OpenAI 兼容对话接口 /v1/chat/completions,输入简单算术问题 1+1=?,模型未输出有效回答,仅返回大量空白换行符。接口返回 finish_reason: length,输出 token 数完全达到 max_tokens 设定值,判定为因长度限制被强制截断。
To Reproduce / 如何复现
- 本地启动 MiniCPM5-1B 推理服务,监听地址
127.0.0.1:1234,模型以 f16 精度加载;
- 使用 curl 命令调用对话补全接口,请求参数指定模型、用户提问、采样参数及
max_tokens=64;
- 执行请求后查看响应结果,发现
message.content 全为空白字符,无有效文本。
curl http://127.0.0.1:1234/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "minicpm5-1b",
"messages": [{"role":"user","content":"1+1=?"}],
"temperature": 0.7, "top_p": 0.95, "max_tokens": 64
}'
{
"id": "chatcmpl-yao6eq5o61c5i3j8zbktn",
"object": "chat.completion",
"created": 1781158399,
"model": "minicpm5-1b@f16",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": " \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n",
"reasoning_content": "",
"tool_calls": []
},
"logprobs": null,
"finish_reason": "length"
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 64,
"total_tokens": 77,
"completion_tokens_details": {
"reasoning_tokens": 0
}
},
"stats": {},
"system_fingerprint": "minicpm5-1b@f16"
}
Expected behavior / 期望的结果
模型正常理解问题并输出有效答案 2,在内容生成完成后正常结束输出,finish_reason 应为 stop,而非 length。
Screenshots / 截图
No response
Environment / 环境
- OS: Windows 11
- CUDA: CUDA 12.8
- Device: RTX 5060 Ti
Additional context / 其他信息
No response
Is there an existing issue ? / 是否已有相关的 issue ?
Describe the bug / 描述这个 bug
调用本地部署的
minicpm5-1b(FP16 精度)OpenAI 兼容对话接口/v1/chat/completions,输入简单算术问题1+1=?,模型未输出有效回答,仅返回大量空白换行符。接口返回finish_reason: length,输出 token 数完全达到max_tokens设定值,判定为因长度限制被强制截断。To Reproduce / 如何复现
127.0.0.1:1234,模型以 f16 精度加载;max_tokens=64;message.content全为空白字符,无有效文本。curl http://127.0.0.1:1234/v1/chat/completions \ -H "Content-Type: application/json" \ -d '{ "model": "minicpm5-1b", "messages": [{"role":"user","content":"1+1=?"}], "temperature": 0.7, "top_p": 0.95, "max_tokens": 64 }' { "id": "chatcmpl-yao6eq5o61c5i3j8zbktn", "object": "chat.completion", "created": 1781158399, "model": "minicpm5-1b@f16", "choices": [ { "index": 0, "message": { "role": "assistant", "content": " \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n", "reasoning_content": "", "tool_calls": [] }, "logprobs": null, "finish_reason": "length" } ], "usage": { "prompt_tokens": 13, "completion_tokens": 64, "total_tokens": 77, "completion_tokens_details": { "reasoning_tokens": 0 } }, "stats": {}, "system_fingerprint": "minicpm5-1b@f16" }Expected behavior / 期望的结果
模型正常理解问题并输出有效答案
2,在内容生成完成后正常结束输出,finish_reason应为stop,而非length。Screenshots / 截图
No response
Environment / 环境
Additional context / 其他信息
No response