Model MTP support + Backend Issue #941

Hi! I ran into an issue when using MTP with the vLLM backend in Docker Model Runner. While I'm here, I'd also like to ask about the vLLM Metal integration: is it enabled by default on Mac when using the vLLM backend?

Another thing I noticed: I installed the vLLM backend, but the model still appears to be loaded by the llama.cpp backend (see the error log below). Is that a known issue? I followed this guide: https://www.docker.com/blog/docker-model-runner-integrates-vllm/

Thanks!

Error log:

```
docker model run hf.co/unsloth/Qwen3.6-27B-MTP-GGUF:UD-Q4_K_XL
Unable to find model 'hf.co/unsloth/Qwen3.6-27B-MTP-GGUF:UD-Q4_K_XL' locally. Pulling from the server.
9c68785fa64f: Pull complete [==================================================>]  17.91GB/17.91GB
00f45cd696de: Pull complete [==================================================>]  931.1MB/931.1MB
b33563055168: Pull complete [==================================================>]  25.41kB/25.41kB
053533475129: Pull complete [==================================================>]  931.1MB/931.1MB
4085665ee36d: Pull complete [==================================================>]  17.91GB/17.91GB
Model pulled successfully
> background model preload failed: preload failed: status=500 body=unable to load runner: error waiting for runner to be ready: llama.cpp terminated unexpectedly: llama.cpp failed: failed to load model

Verbose output:
llama_model_load: error loading model: missing tensor 'blk.64.ssm_conv1d.weight'
llama_model_load_from_file_impl: failed to load model
common_init_from_params: failed to load model '/Users/edm/.docker/models/bundles/sha256/ae07dd2945afaaf7034f795ec286ec4cf79e6843e23b75b7c0696e31b6d40244/model/model.gguf'
srv    load_model: failed to load model, '/Users/edm/.docker/models/bundles/sha256/ae07dd2945afaaf7034f795ec286ec4cf79e6843e23b75b7c0696e31b6d40244/model/model.gguf'
srv    operator(): operator(): cleaning up before exit...
main: exiting due to model loading error
    = 248065 '<|file_sep|>'
print_info: EOG token             = 248044 '<|endoftext|>'
print_info: EOG token             = 248046 '<|im_end|>'
print_info: EOG token             = 248063 '<|fim_pad|>'
print_info: EOG token             = 248064 '<|repo_name|>'
print_info: EOG token             = 248065 '<|file_sep|>'
print_info: max token length      = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
```
