fix: prevent crash in case of a mem alloc error and graceful exit by akleine · Pull Request #1566 · leejet/stable-diffusion.cpp

akleine · 2026-05-26T14:45:54Z

Summary

Added a nullptr check in model.cpp to avoid a crash and core dump.

Related Issue / Discussion

This test script:
./sd-cli --diffusion-model ~/SD_models/flux/LongCat-Image-Q4_K_M.gguf --llm ~/SD_models/flux/Qwen2.5-VL-7B-Instruct.Q3_K_M.gguf -p 'a lovely cat' --mmap
ends with a segmentation fault (core dumped) because there is not enough VRAM.

Output from stable-diffusion.cpp version master-647-72e512a-1-ga397e03, commit a397e03:

[INFO ] stable-diffusion.cpp:363  - Version: Longcat-Image                                                                                                                                    
[INFO ] stable-diffusion.cpp:391  - Weight type stat:                      f32: 387  |    q3_K: 113  |    q4_K: 221  |    q5_K: 23   |    bf16: 6                                             
[INFO ] stable-diffusion.cpp:392  - Conditioner weight type stat:          f32: 141  |    q3_K: 113  |    q4_K: 81   |    q5_K: 3                                                             
[INFO ] stable-diffusion.cpp:393  - Diffusion model weight type stat:      f32: 246  |    q4_K: 140  |    q5_K: 20   |    bf16: 6                                                             
[INFO ] stable-diffusion.cpp:394  - VAE weight type stat:                                                                                                                                     
[INFO ] flux.hpp:1291 - flux: depth = 10, depth_single_blocks = 20, guidance_embed = false, context_in_dim = 3584, hidden_size = 3072, num_heads = 24                                         
[INFO ] stable-diffusion.cpp:771  - using VAE for encoding / decoding                                                                                                                         
[ERROR] ggml_extend.hpp:69   - ggml_backend_cuda_buffer_type_alloc_buffer: allocating 3492.20 MiB on device 0: cudaMalloc failed: out of memory                                               
[ERROR] ggml_extend.hpp:69   - alloc_tensor_range: failed to allocate CUDA0 buffer of size 3661840640                                                                                         
[ERROR] ggml_extend.hpp:2699 - flux alloc params backend buffer failed, num_tensors = 412                                                                                                     
[INFO ] model.cpp:806  - NOT using mmap for '/home/xxx/SD_models/flux/LongCat-Image-Q4_K_M.gguf' (mmap disabled by caller)                                                                    
[INFO ] model.cpp:806  - NOT using mmap for '/home/xxx/SD_models/flux/Qwen2.5-VL-7B-Instruct.Q3_K_M.gguf' (mmap disabled by caller)                                                           
[INFO ] model.cpp:818  - model files processing completed in 0.00s                                                                                                                            
./0_test.sh: line 1:  7259 Segmentation fault (core dumped) ./sd-cli --diffusion-model ~/SD_models/flux/LongCat-Image-Q4_K_M.gguf --llm ~/SD_models/flux/Qwen2.5-VL-7B-Ins

Additional Information

With this patch, the loader reports the failed allocation and the program exits cleanly:

[ERROR] model.cpp:999  - memory allocation failed 'model.diffusion_model.double_blocks.0.img_attn.norm.key_norm.scale'
[ERROR] model.cpp:999  - memory allocation failed 'model.diffusion_model.double_blocks.0.img_attn.norm.query_norm.scale'
[ERROR] model.cpp:999  - memory allocation failed 'model.diffusion_model.double_blocks.0.img_attn.proj.bias'
[INFO ] model.cpp:1152 - loading tensors completed, taking 0.00s (read: 0.00s, memcpy: 0.00s, convert: 0.00s, copy_to_backend: 0.00s)
[ERROR] model.cpp:1209 - load tensors from file failed
[ERROR] stable-diffusion.cpp:988  - load tensors from model loader failed
[INFO ] main.cpp:761  - new_sd_ctx_t failed
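
The guard behind the "memory allocation failed" messages above can be sketched roughly as follows. This is a minimal illustration, not the code from model.cpp: `TensorSlot` and `load_tensors_checked` are hypothetical names, and the real loader works on ggml tensors rather than this simplified struct.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for a destination tensor; only the two fields
// this sketch needs are modeled.
struct TensorSlot {
    std::string name;
    void* data;  // nullptr when the backend buffer was never allocated
};

// Returns false as soon as a tensor without backing memory is found,
// instead of writing through a null pointer.
bool load_tensors_checked(const std::vector<TensorSlot>& tensors) {
    for (const TensorSlot& t : tensors) {
        if (t.data == nullptr) {
            std::fprintf(stderr, "memory allocation failed '%s'\n", t.name.c_str());
            return false;
        }
        // ... the real loader would read and copy the tensor data here ...
    }
    return true;
}
```

A caller would treat a `false` return as a failed load and unwind normally, which matches the clean exit shown in the log.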

Checklist

I have read and confirmed this PR follows the contribution guidelines.

wbruna · 2026-05-26T15:25:35Z

                        continue;
                    }

+                    if (dst_tensor->data == nullptr) {


I'm not sure this is the best place to check for this, because the allocation can't fail for individual tensors; it's all-or-nothing.

Take a look at StableDiffusionGGML::init, in stable-diffusion.cpp, a bit before load_tensors: some components do report allocation failure, but not all. We should probably propagate the underlying alloc_params_buffer error in all cases.
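
One way to read this suggestion, as a sketch under assumptions: every component's buffer allocation is checked, and the first failure aborts initialization before any tensor is loaded. The `Component` struct and `alloc_all_params` are hypothetical, and the `bool` return of `alloc_params_buffer` is assumed here rather than taken from the project's actual signature.

```cpp
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Hypothetical description of one model component (e.g. conditioner,
// diffusion model, VAE) and its parameter-buffer allocation step.
struct Component {
    std::string name;
    std::function<bool()> alloc_params_buffer;  // assumed to return false on failure
};

// Allocates every component's buffer and stops at the first failure, so
// the caller never reaches tensor loading with unallocated buffers.
bool alloc_all_params(const std::vector<Component>& components) {
    for (const Component& c : components) {
        if (!c.alloc_params_buffer()) {
            std::fprintf(stderr, "%s alloc params buffer failed\n", c.name.c_str());
            return false;
        }
    }
    return true;
}
```

With a check like this in the init path, a per-tensor nullptr test becomes a second line of defense rather than the only one.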

akleine · 2026-05-26

I have seen that code in stable-diffusion.cpp and wondered about the reasoning behind it. I am new to this codebase, and I assume there is a reason the code is written this way.
So I decided to leave it as is and add this small quick fix instead.
And you’re right, the allocation problem is all-or-nothing. So the message refers to the first tensor, and then the program exits.

fix: prevent crash in case of a memory allocation error

32e4f86

...and ensure a graceful exit

wbruna reviewed May 26, 2026

akleine wants to merge 1 commit into leejet:master from akleine:leejetMay25
