Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
35 commits
Select commit Hold shift + click to select a range
3ddabdc
WIP
atobiszei Mar 2, 2026
b5ed7d8
Inpainting/outpainting CPU
atobiszei Mar 9, 2026
c2ba9d8
Update dockerignore
atobiszei Mar 11, 2026
af51476
Demo
atobiszei Mar 13, 2026
5f89fc2
Change mask
atobiszei Mar 13, 2026
102f8af
Fix
atobiszei Mar 13, 2026
330374f
Fix: remove mask from accepted fields in text2image request options
atobiszei Mar 16, 2026
0a65e68
Merge remote-tracking branch 'origin/main' into atobisze_image_inpain…
atobiszei Mar 16, 2026
eac3932
Fix concurrent request inpainting issue, propagate quantization param…
atobiszei Mar 18, 2026
f4be12a
Merge remote-tracking branch 'origin/main' into atobisze_image_inpain…
atobiszei Mar 18, 2026
a2476ee
Add tests & review fix
atobiszei Mar 18, 2026
15dd08a
Address PR review: blocking inpainting guard, string_view optimizatio…
atobiszei Mar 18, 2026
7e392c6
Minor comment fix
atobiszei Mar 19, 2026
eb34c6e
Add LoRA adapter support for image generation
atobiszei Mar 16, 2026
9da0860
Add download commands for inpainting/outpainting demo images
atobiszei Mar 24, 2026
4d24c22
Add LoRA adapter support for image generation
atobiszei Mar 25, 2026
dec1959
Merge origin/main into atobisze_image_inpainting_lora
atobiszei Mar 25, 2026
f10b657
Fix LoRA download auth, style issues, and restore README curl commands
atobiszei Mar 25, 2026
4fe7c75
Fix Windows build: rename shadowed variable 'it' to 'member'
atobiszei Mar 25, 2026
0ef2ebe
Fix Windows path detection in LoRA CLI parser
atobiszei Mar 25, 2026
519ea61
Extract isLocalFilePath() into stringutils for cross-platform path de…
atobiszei Mar 25, 2026
c288736
Use generic_string() for cross-platform path separators in imagegen_init
atobiszei Mar 26, 2026
b1d2ca6
Merge remote-tracking branch 'origin/main' into atobisze_image_inpain…
atobiszei Apr 30, 2026
453f712
Fix build after merge: filesystem path, const cast, LoRA aliases via …
atobiszei Apr 30, 2026
f6d52f4
docs: fix LoRA documentation - model name routing, guidance_scale, pu…
atobiszei May 13, 2026
8e36a01
Merge remote-tracking branch 'origin/main' into atobisze_image_inpain…
atobiszei May 13, 2026
a241d7f
Review fixes: alias validation, collision check, revert poll defaults
atobiszei May 13, 2026
59eaad2
Add NPU LoRA MODE_STATIC support and LoRA test coverage
atobiszei May 14, 2026
33e7a85
Add LoRA alias conflict validation, NPU request rejection, and rename…
atobiszei May 14, 2026
d76da73
Merge remote-tracking branch 'origin/main' into atobisze_image_inpain…
atobiszei May 14, 2026
65e6bd2
feat(image_gen): Add LoRA load mode (FUSE/STATIC/DYNAMIC)
atobiszei May 15, 2026
77c56ed
ImageGen LoRA: rename FUSE→STATIC for NPU, fix misleading comments
atobiszei May 15, 2026
525be69
LoRA: alpha validation, mode docs, Windows drive letter fix
atobiszei May 18, 2026
a705a22
Self-review
atobiszei May 18, 2026
34ce570
self-review part 2
atobiszei May 18, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion .github/copilot-instructions.md
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,10 @@ When analyzing a Pull Request, follow this protocol:
- **Keep headers self-contained but minimal**: each header must compile on its own, but should not pull in transitive dependencies that callers don't need.
- **Prefer opaque types / Pimpl**: for complex implementation details, consider the Pimpl idiom to keep implementation-only types out of the public header entirely.
- **Never include a header solely for a typedef or enum**: forward-declare the enum (`enum class Foo;` in C++17) or relocate the typedef to a lightweight `fwd.hpp`-style header.
13. Be mindful when accepting `const T&` in constructors or functions that store the reference: verify that the referenced object's lifetime outlives the usage to avoid dangling references.
13. **No dangling references or temporaries bound to `const T&`**:
- Never use `const T&` parameters with default arguments that construct temporaries (e.g. `const std::string& param = ""`). This binds a reference to a temporary — use a function overload instead, or pass by value.
- When accepting `const T&` in constructors or functions that store the reference, verify that the referenced object's lifetime outlives the usage to avoid dangling references.
- Prefer overloads over default arguments for non-trivial types passed by reference.

## Build System

Expand Down
47 changes: 45 additions & 2 deletions demos/common/export_models/export_model.py
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,11 @@ def add_common_arguments(parser):
parser_image_generation.add_argument('--max_num_images_per_prompt', type=int, default=0, help='Max allowed number of images client is allowed to request for a given prompt', dest='max_num_images_per_prompt')
parser_image_generation.add_argument('--default_num_inference_steps', type=int, default=0, help='Default number of inference steps when not specified by client', dest='default_num_inference_steps')
parser_image_generation.add_argument('--max_num_inference_steps', type=int, default=0, help='Max allowed number of inference steps client is allowed to request for a given prompt', dest='max_num_inference_steps')
parser_image_generation.add_argument('--source_loras', default=None,
help='LoRA adapters to apply. Format: alias1=org1/repo1,alias2=org2/repo2@lora_file.safetensors '
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add export optin for composite lora.

'where @filename is optional and specifies which .safetensors file to use from the downloaded repo '
'(auto-detected when repo contains exactly one). Only for image_generation task.',
dest='source_loras')

parser_text2speech = subparsers.add_parser('text2speech', help='export model for text2speech endpoint')
add_common_arguments(parser_text2speech)
Expand Down Expand Up @@ -339,6 +344,9 @@ def add_common_arguments(parser):
default_num_inference_steps: {{default_num_inference_steps}},{% endif %}
{%- if max_num_inference_steps > 0 %}
max_num_inference_steps: {{max_num_inference_steps}},{% endif %}
{%- for lora in lora_adapters %}
lora_adapters { alias: "{{lora.alias}}" path: "{{lora.path}}" }
{%- endfor %}
}
}
}"""
Expand Down Expand Up @@ -616,7 +624,7 @@ def export_rerank_model(model_repository_path, source_model, model_name, precisi
add_servable_to_config(config_file_path, model_name, os.path.relpath(os.path.join(model_repository_path, model_name), os.path.dirname(config_file_path)))


def export_image_generation_model(model_repository_path, source_model, model_name, precision, task_parameters, config_file_path, num_streams):
def export_image_generation_model(model_repository_path, source_model, model_name, precision, task_parameters, config_file_path, num_streams, source_loras):
model_path = "./"
target_path = os.path.join(model_repository_path, model_name)
model_index_path = os.path.join(target_path, 'model_index.json')
Expand All @@ -629,6 +637,41 @@ def export_image_generation_model(model_repository_path, source_model, model_nam
if os.system(optimum_command):
raise ValueError("Failed to export image generation model", source_model)

# Download and resolve LoRA adapters
lora_adapters = []
if source_loras:
from huggingface_hub import snapshot_download
entries = source_loras.split(',')
for entry in entries:
entry = entry.strip()
if '=' in entry:
alias, repo_and_file = entry.split('=', 1)
else:
repo_and_file = entry
alias = entry.split('/')[-1] if '/' in entry else entry
safetensors_file = ''
if '@' in repo_and_file:
repo, safetensors_file = repo_and_file.rsplit('@', 1)
else:
repo = repo_and_file
lora_dir = os.path.join(target_path, 'loras', repo)
if not os.path.isdir(lora_dir):
print(f"Downloading LoRA adapter: {repo} to {lora_dir}")
snapshot_download(repo_id=repo, local_dir=lora_dir)
else:
print(f"LoRA adapter directory already exists: {lora_dir}")
if not safetensors_file:
st_files = [f for f in os.listdir(lora_dir) if f.endswith('.safetensors')]
if len(st_files) == 0:
raise ValueError(f"No .safetensors files found in LoRA adapter: {repo}")
if len(st_files) > 1:
raise ValueError(f"Multiple .safetensors files in LoRA adapter: {repo}. Use @filename to specify.")
safetensors_file = st_files[0]
lora_path = 'loras/' + repo + '/' + safetensors_file
lora_adapters.append({'alias': alias, 'path': lora_path})
print(f"LoRA adapter: {alias} -> {lora_path}")
task_parameters['lora_adapters'] = lora_adapters

plugin_config = {}
assert num_streams >= 0, "num_streams should be a non-negative integer"
if num_streams > 0:
Expand Down Expand Up @@ -711,4 +754,4 @@ def export_image_generation_model(model_repository_path, source_model, model_nam
'max_num_inference_steps',
'extra_quantization_params'
]}
export_image_generation_model(args['model_repository_path'], args['source_model'], args['model_name'], args['precision'], template_parameters, args['config_file_path'], args['num_streams'])
export_image_generation_model(args['model_repository_path'], args['source_model'], args['model_name'], args['precision'], template_parameters, args['config_file_path'], args['num_streams'], args['source_loras'])
198 changes: 197 additions & 1 deletion demos/image_generation/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -397,7 +397,7 @@ A single servable exposes the following endpoints:

> **Note:** Inpainting/outpainting requests are processed sequentially — concurrent requests will be queued.

> **Note:** For inpainting/outpainting, dedicated inpainting models (e.g. `stable-diffusion-v1-5/stable-diffusion-inpainting`) only support the `images/edits` endpoint. Check [supported models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/#image-generation-models).
> **Note:** Dedicated inpainting models (e.g. `stable-diffusion-v1-5/stable-diffusion-inpainting`) only support the `images/edits` endpoint — they cannot be used for text-to-image generation via `images/generations`. General-purpose models (e.g. SDXL) support both endpoints. Check [supported models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/#image-generation-models).

All requests are processed in unary format, with no streaming capabilities.

Expand Down Expand Up @@ -528,6 +528,12 @@ Output file (`edit_output.png`):

Inpainting replaces a masked region in an image based on the prompt. The `mask` is a black-and-white image where white pixels mark the area to repaint.

Download sample images:
```console
curl -O https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/image_generation/cat.png
curl -O https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/image_generation/cat_mask.png
```

![cat](./cat.png) ![cat_mask](./cat_mask.png)

::::{tab-set}
Expand Down Expand Up @@ -599,6 +605,12 @@ Outpainting extends an image beyond its original borders. Prepare two images:
- **outpaint_input.png** — the original image centered on a larger canvas (e.g. 768×768) with black borders
- **outpaint_mask.png** — white where the new content should be generated (the borders), black where the original image is

Download sample images:
```console
curl -O https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/image_generation/outpaint_input.png
curl -O https://raw.githubusercontent.com/openvinotoolkit/model_server/main/demos/image_generation/outpaint_mask.png
```

![outpaint_input](./outpaint_input.png) ![outpaint_mask](./outpaint_mask.png)

::::{tab-set}
Expand Down Expand Up @@ -718,6 +730,190 @@ ovms --rest_port 8000 ^

Please follow [OpenVINO notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/image-to-image-genai/image-to-image-genai.ipynb) to understand how other parameters affect editing.

## Multi-LoRA Image Generation

This section demonstrates how to serve multiple LoRA adapters with a single SDXL base model, enabling per-request style selection. This replicates the [Multi LoRA Image Generation notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/multilora-image-generation/multilora-image-generation.ipynb) but using OVMS for serving.

### Start Server with Multiple LoRA Adapters

The following command starts OVMS with Stable Diffusion XL and 5 LoRA adapters for different artistic styles:

::::{tab-set}
:::{tab-item} Docker (Linux)
:sync: docker
```bash
mkdir -p models

docker run -d --rm --user $(id -u):$(id -g) -p 8000:8000 -v $(pwd)/models:/models/:rw \
-e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy \
openvino/model_server:latest \
--rest_port 8000 \
--model_repository_path /models/ \
--task image_generation \
--source_model stabilityai/stable-diffusion-xl-base-1.0 \
--source_loras "xray=DoctorDiffusion/doctor-diffusion-s-xray-xl-lora@DD-xray-v1.safetensors,thepoint=alvdansen/the-point@araminta_k_the_point.safetensors,ukiyo=KappaNeuro/ukiyo-e-art@Ukiyo-e Art.safetensors,vector=DoctorDiffusion/doctor-diffusion-s-controllable-vector-art-xl-lora@DD-vector-v2.safetensors,chalk=Norod78/sdxl-chalkboarddrawing-lora@SDXL_ChalkBoardDrawing_LoRA_r8.safetensors"
```
:::

:::{tab-item} Bare metal (Windows)
:sync: bare-metal
```bat
mkdir models

ovms --rest_port 8000 ^
--model_repository_path ./models/ ^
--task image_generation ^
--source_model stabilityai/stable-diffusion-xl-base-1.0 ^
--source_loras "xray=DoctorDiffusion/doctor-diffusion-s-xray-xl-lora@DD-xray-v1.safetensors,thepoint=alvdansen/the-point@araminta_k_the_point.safetensors,ukiyo=KappaNeuro/ukiyo-e-art@Ukiyo-e Art.safetensors,vector=DoctorDiffusion/doctor-diffusion-s-controllable-vector-art-xl-lora@DD-vector-v2.safetensors,chalk=Norod78/sdxl-chalkboarddrawing-lora@SDXL_ChalkBoardDrawing_LoRA_r8.safetensors"
```
:::

::::

The registered adapters and their recommended use:

| Alias | Repository | Style | Recommended Weight | Prompt Template |
|-------|-----------|-------|-------------------|-----------------|
| `xray` | DoctorDiffusion/doctor-diffusion-s-xray-xl-lora | X-Ray style | 0.8 | `xray <subject>` |
| `thepoint` | alvdansen/the-point | Artistic illustration | 0.6 | `<subject>` |
| `ukiyo` | KappaNeuro/ukiyo-e-art | Ukiyo-e Japanese art | 0.8 | `an illustration of <subject> in Ukiyo-e Art style` |
| `vector` | DoctorDiffusion/doctor-diffusion-s-controllable-vector-art-xl-lora | Vector art | 0.8 | `vector <subject>` |
| `chalk` | Norod78/sdxl-chalkboarddrawing-lora | Chalkboard drawing | 0.45 | `A colorful chalkboard drawing of <subject>` |

### Generate Images with Different Styles

Use the adapter alias as the `model` field to select which adapter to apply per request. The adapter is activated via **model name routing** — when the `model` field matches a registered LoRA alias, that adapter is automatically applied.

**X-Ray style:**
```bash
curl http://localhost:8000/v3/images/generations \
-H "Content-Type: application/json" \
-d '{
"model": "xray",
"prompt": "xray a cute cat in sunglasses",
"num_inference_steps": 20,
"guidance_scale": 0.0,
"size": "1024x1024"
}' | jq -r '.data[0].b64_json' | base64 --decode > xray_cat.png
```

**Ukiyo-e Japanese art:**
```bash
curl http://localhost:8000/v3/images/generations \
-H "Content-Type: application/json" \
-d '{
"model": "ukiyo",
"prompt": "an illustration of a cute cat in sunglasses in Ukiyo-e Art style",
"num_inference_steps": 20,
"guidance_scale": 0.0,
"size": "1024x1024"
}' | jq -r '.data[0].b64_json' | base64 --decode > ukiyo_cat.png
```

**Chalkboard drawing:**
```bash
curl http://localhost:8000/v3/images/generations \
-H "Content-Type: application/json" \
-d '{
"model": "chalk",
"prompt": "A colorful chalkboard drawing of a cute cat in sunglasses",
"num_inference_steps": 20,
"guidance_scale": 0.0,
"size": "1024x1024"
}' | jq -r '.data[0].b64_json' | base64 --decode > chalk_cat.png
```

Optionally override the adapter alpha using `lora_alphas`:
```bash
curl http://localhost:8000/v3/images/generations \
-H "Content-Type: application/json" \
-d '{
"model": "xray",
"prompt": "xray a cute cat in sunglasses",
"lora_alphas": {"xray": 0.5},
"num_inference_steps": 20,
"guidance_scale": 0.0,
"size": "1024x1024"
}' | jq -r '.data[0].b64_json' | base64 --decode > xray_cat_half_weight.png
```
### Using OpenAI Python Client with LoRA

```python
from openai import OpenAI
import base64
from io import BytesIO
from PIL import Image

client = OpenAI(
base_url="http://localhost:8000/v3",
api_key="unused"
)

# Define LoRA styles — the adapter alias is used as the model name
styles = {
"xray": {"prompt": "xray {subject}"},
"thepoint": {"prompt": "{subject}"},
"ukiyo": {"prompt": "an illustration of {subject} in Ukiyo-e Art style"},
"vector": {"prompt": "vector {subject}"},
"chalk": {"prompt": "A colorful chalkboard drawing of {subject}"},
}

subject = "a cute cat in sunglasses"

for style_name, style_config in styles.items():
prompt = style_config["prompt"].format(subject=subject)
response = client.images.generate(
model=style_name, # adapter alias activates the LoRA
prompt=prompt,
extra_body={
"num_inference_steps": 20,
"guidance_scale": 0.0,
"size": "1024x1024",
}
)
image_data = base64.b64decode(response.data[0].b64_json)
image = Image.open(BytesIO(image_data))
image.save(f'{style_name}_cat.png')
print(f"Saved {style_name}_cat.png")
```

### Blending Multiple Adapters

To blend multiple adapters, define a **composite adapter** at startup using the `@alias:weight` syntax:

```bash
--source_loras="xray=...,ukiyo=...,blend=@xray:0.5+@ukiyo:0.4"
```

Then use the composite alias as the model name:
```python
response = client.images.generate(
model="blend", # activates both xray and ukiyo
prompt="a cute cat in sunglasses",
extra_body={
"num_inference_steps": 20,
"guidance_scale": 0.0,
"size": "1024x1024",
}
)
```

You can override individual component weights at request time:
```python
response = client.images.generate(
model="blend",
prompt="a cute cat in sunglasses",
extra_body={
"lora_alphas": {"xray": 0.8, "ukiyo": 0.2},
"num_inference_steps": 20,
"guidance_scale": 0.0,
"size": "1024x1024",
}
)
```

> **Note:** For more details on LoRA adapter configuration, see the [Image Generation reference documentation](../../docs/image_generation/reference.md#lora-adapters).

## References
- [Image Generation API](../../docs/model_server_rest_api_image_generation.md)
- [Image Edit API](../../docs/model_server_rest_api_image_edit.md)
Expand Down
Loading