
fix: prevent crash in case of a mem alloc error and graceful exit#1566

Open
akleine wants to merge 1 commit into leejet:master from akleine:leejetMay25

@akleine commented May 26, 2026

Summary

Added a nullptr check in model.cpp to avoid a crash and core dump when memory allocation fails.

Related Issue / Discussion

The following test command
./sd-cli --diffusion-model ~/SD_models/flux/LongCat-Image-Q4_K_M.gguf --llm ~/SD_models/flux/Qwen2.5-VL-7B-Instruct.Q3_K_M.gguf -p 'a lovely cat' --mmap
ends with a segmentation fault (core dumped) because there is not enough VRAM.

Output from stable-diffusion.cpp version master-647-72e512a-1-ga397e03 (commit a397e03):

[INFO ] stable-diffusion.cpp:363  - Version: Longcat-Image                                                                                                                                    
[INFO ] stable-diffusion.cpp:391  - Weight type stat:                      f32: 387  |    q3_K: 113  |    q4_K: 221  |    q5_K: 23   |    bf16: 6                                             
[INFO ] stable-diffusion.cpp:392  - Conditioner weight type stat:          f32: 141  |    q3_K: 113  |    q4_K: 81   |    q5_K: 3                                                             
[INFO ] stable-diffusion.cpp:393  - Diffusion model weight type stat:      f32: 246  |    q4_K: 140  |    q5_K: 20   |    bf16: 6                                                             
[INFO ] stable-diffusion.cpp:394  - VAE weight type stat:                                                                                                                                     
[INFO ] flux.hpp:1291 - flux: depth = 10, depth_single_blocks = 20, guidance_embed = false, context_in_dim = 3584, hidden_size = 3072, num_heads = 24                                         
[INFO ] stable-diffusion.cpp:771  - using VAE for encoding / decoding                                                                                                                         
[ERROR] ggml_extend.hpp:69   - ggml_backend_cuda_buffer_type_alloc_buffer: allocating 3492.20 MiB on device 0: cudaMalloc failed: out of memory                                               
[ERROR] ggml_extend.hpp:69   - alloc_tensor_range: failed to allocate CUDA0 buffer of size 3661840640                                                                                         
[ERROR] ggml_extend.hpp:2699 - flux alloc params backend buffer failed, num_tensors = 412                                                                                                     
[INFO ] model.cpp:806  - NOT using mmap for '/home/xxx/SD_models/flux/LongCat-Image-Q4_K_M.gguf' (mmap disabled by caller)                                                                    
[INFO ] model.cpp:806  - NOT using mmap for '/home/xxx/SD_models/flux/Qwen2.5-VL-7B-Instruct.Q3_K_M.gguf' (mmap disabled by caller)                                                           
[INFO ] model.cpp:818  - model files processing completed in 0.00s                                                                                                                            
./0_test.sh: Zeile 1:  7259 Segmentation fault (core dumped) ./sd-cli --diffusion-model ~/SD_models/flux/LongCat-Image-Q4_K_M.gguf --llm ~/SD_models/flux/Qwen2.5-VL-7B-Ins

Additional Information

With this patch, the allocation failure is reported to the user and the program exits cleanly instead of crashing:

[ERROR] model.cpp:999  - memory allocation failed 'model.diffusion_model.double_blocks.0.img_attn.norm.key_norm.scale'
[ERROR] model.cpp:999  - memory allocation failed 'model.diffusion_model.double_blocks.0.img_attn.norm.query_norm.scale'
[ERROR] model.cpp:999  - memory allocation failed 'model.diffusion_model.double_blocks.0.img_attn.proj.bias'
[INFO ] model.cpp:1152 - loading tensors completed, taking 0.00s (read: 0.00s, memcpy: 0.00s, convert: 0.00s, copy_to_backend: 0.00s)
[ERROR] model.cpp:1209 - load tensors from file failed
[ERROR] stable-diffusion.cpp:988  - load tensors from model loader failed
[INFO ] main.cpp:761  - new_sd_ctx_t failed
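
For illustration, here is a minimal sketch of the kind of guard described above. It is not the actual diff: `Tensor` is a hypothetical stand-in for the ggml tensor type, and `load_tensors` here is a simplified loop, not the real function in model.cpp. The assumption is that a destination tensor whose backend buffer was never allocated has a null `data` pointer.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for a ggml tensor: only the fields the guard needs.
struct Tensor {
    std::string name;
    void* data;  // assumed to stay nullptr when the backend buffer was not allocated
};

// Simplified loader loop (an assumption, not the real model.cpp code).
// Returns false if any destination tensor has no backing memory.
bool load_tensors(const std::vector<Tensor>& tensors) {
    bool success = true;
    for (const Tensor& dst : tensors) {
        if (dst.data == nullptr) {
            // Report the problem instead of writing through a null pointer.
            std::fprintf(stderr, "memory allocation failed '%s'\n", dst.name.c_str());
            success = false;
            continue;
        }
        // ... the tensor data would be read from the file into dst.data here ...
    }
    return success;
}
```

A caller that receives `false` can then log its own error and return, which matches the chain of error messages in the log above.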


Comment thread src/model.cpp
continue;
}

if (dst_tensor->data == nullptr) {
Contributor

I'm not sure this is the best place to check for this, because the allocation can't fail for individual tensors; it's all-or-nothing.

Take a look at StableDiffusionGGML::init, in stable-diffusion.cpp, a bit before load_tensors: some components do report allocation failure, but not all. We should probably propagate the underlying alloc_params_buffer error in all cases.
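
A rough sketch of what propagating the error in all cases could look like, under stated assumptions: `Component` and `init_components` are hypothetical names, and the `alloc_params_buffer` member here is just a callable returning `bool`, not the real signature. The point is only that every component's allocation result is checked and initialization stops at the first failure, before any tensor loading starts.

```cpp
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Hypothetical component: a name plus a callable that tries to allocate
// its parameter buffer and reports success or failure.
struct Component {
    std::string name;
    std::function<bool()> alloc_params_buffer;
};

// Checks every component's allocation result and stops at the first failure,
// so the caller never reaches tensor loading with unallocated buffers.
bool init_components(const std::vector<Component>& components) {
    for (const Component& c : components) {
        if (!c.alloc_params_buffer()) {
            std::fprintf(stderr, "%s alloc params buffer failed\n", c.name.c_str());
            return false;
        }
    }
    return true;
}
```

With this shape, the out-of-memory case would surface as a single init error rather than a per-tensor message later on.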

Contributor Author

I saw that code in stable-diffusion.cpp and wondered about the reasoning behind it. I'm new to this codebase, and I assume there is a reason it was written this way, so I decided to leave it as is and add this small quick fix instead.
And you're right, the allocation problem is all-or-nothing, so the message refers to the first tensor and then the program exits.
