feat: Qwen Image Edit 2511 — full pipeline with LoRA, GGUF, quantization, and UI (#131)
Conversation
@copilot When I attempt to run a generation with Qwen Image (Diffusers version), I get this stack trace:
The crash was caused by Fixed in |
@copilot Can you figure out where the WARNING in the fourth line of log output is coming from and remove it? |
The warning came from |
@copilot I'm getting the following error upon generating with the model: |
Fixed in |
@copilot I'm still getting tokenizer errors: |
... Fixed in |
@copilot |
Fixed in
- Processor loading: Replaced
- Reference images UI: Added
- txt2img without reference images: no warning or block when no reference images are set — the text encoder works with an empty list, so txt2img works fine.
@copilot New error: |
Force-pushed ba333f3 to e806505
…raph builder
chore(backend): add test for heuristic detection of Qwen Image Edit GGUF model variant
chore(frontend): add regression test for ref images not added to qwen image in generate mode
fix(frontend): graph build handling of Qwen Image when CFG <= 1
chore(frontend): add regression test for optimal dimension selection
Force-pushed a9252c3 to c2d7b8b
Force-pushed e2710c4 to b7de7d7
Replace the single-collector pattern (all images added as item edges to one collect node — order not guaranteed) with collector chaining (each image gets its own collect node, chained via collection → collection edge). This matches the FLUX.2 Klein kontext conditioning pattern and preserves insertion order of reference images.

Test: verify 2 reference images produce 2 chained collect nodes with correct item, chain, and final output edges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
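The collector-chaining shape this commit describes can be sketched roughly as follows. The node/edge layout here is illustrative only, not InvokeAI's actual graph schema: each reference image feeds its own collect node via an item edge, and successive collect nodes are linked by a collection → collection edge so earlier images always stay ahead of later ones.

```python
def build_chained_collectors(image_names):
    """Sketch: one collect node per image, chained to preserve order."""
    nodes = {}
    edges = []  # (source_node, source_field, dest_node, dest_field)
    prev_collect = None
    for i, image_name in enumerate(image_names):
        image_id = f"image_{i}"
        collect_id = f"collect_{i}"
        nodes[image_id] = {"type": "image", "image_name": image_name}
        nodes[collect_id] = {"type": "collect"}
        # item edge: this image feeds its own collect node
        edges.append((image_id, "image", collect_id, "item"))
        if prev_collect is not None:
            # chain edge: the previous collection flows into this collect
            # node, so insertion order of reference images is preserved
            edges.append((prev_collect, "collection", collect_id, "collection"))
        prev_collect = collect_id
    # the last collect node carries the full, ordered collection
    return nodes, edges, prev_collect
```

With two images this yields two collect nodes, two item edges, and one chain edge, matching the regression test the commit mentions.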
Test that sample_to_lowres_estimated_image() with the Qwen-specific latent RGB factors and bias produces correct preview images:
- Valid RGB PIL Image from synthetic 1x16x4x4 latent tensor
- Deterministic output for identical inputs
- Known-value test: all-ones tensor → hand-calculated RGB pixel
- Zero tensor produces uniform pixels matching bias values
- Qwen factors have correct shape (16x3 + 3 bias)
- 3D input (no batch dim) accepted

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
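The math under test can be sketched in plain Python: project the 16 latent channels to RGB through a 16×3 factor matrix plus a 3-element bias. This is a hedged sketch — the real implementation uses torch tensors and Qwen's actual factors, and the [-1, 1] → [0, 1] mapping is an assumption for illustration.

```python
def latents_to_preview(latents, factors, bias):
    """latents: 16 channel grids of shape H x W (nested lists);
    factors: 16 rows of 3 RGB weights; bias: 3 values.
    Returns an H x W grid of (r, g, b) integer pixels."""
    h, w = len(latents[0]), len(latents[0][0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            px = []
            for k in range(3):
                # weighted sum over the 16 latent channels, plus bias
                v = bias[k] + sum(latents[c][y][x] * factors[c][k] for c in range(16))
                # assumed value range [-1, 1], mapped to [0, 1] then 0..255
                v = min(max((v + 1.0) / 2.0, 0.0), 1.0)
                px.append(int(v * 255))
            row.append(tuple(px))
        out.append(row)
    return out
```

Under this mapping, a zero latent produces uniform pixels determined entirely by the bias — the behavior the commit's "zero tensor" test checks.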
…obing
When a starter model provides variant in override_fields, it was passed
both via **override_fields and as an explicit kwarg to cls(), causing
TypeError("got multiple values for keyword argument 'variant'").
Use .pop() instead of .get() so variant and repo_variant are removed
from override_fields before spreading.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…wargs

The double-kwarg bug (TypeError: got multiple values for keyword argument) could occur for ANY config class that reads a field from override_fields with .get() and then passes it both via **override_fields AND as an explicit kwarg to cls(). Changed all instances of .get() to .pop() for:
- variant (11 config classes)
- repo_variant (7 config classes)
- prediction_type (2 config classes)
- submodels (1 config class)

Also fixed the GGUF Qwen Image variant logic, which was popping the explicit variant but not preserving it when the filename heuristic didn't apply.

Regression tests (3 new):
- Diffusers config with variant in override_fields doesn't crash
- Explicit variant override takes precedence over auto-detection
- GGUF config with variant in override_fields doesn't crash

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
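A minimal reproduction of the double-kwarg bug and the .pop() fix. The config class and field names here are illustrative stand-ins, not InvokeAI's real config classes:

```python
class Config:
    def __init__(self, name, variant=None, **extra):
        self.name = name
        self.variant = variant
        self.extra = extra


def from_overrides_buggy(override_fields):
    # .get() leaves "variant" in the dict, so it is passed twice:
    # once explicitly and once via **override_fields -> TypeError
    variant = override_fields.get("variant")
    return Config("demo", variant=variant, **override_fields)


def from_overrides_fixed(override_fields):
    # .pop() removes "variant" before spreading, so it is passed only once
    variant = override_fields.pop("variant", None)
    return Config("demo", variant=variant, **override_fields)


try:
    from_overrides_buggy({"variant": "fp16"})
except TypeError as e:
    print(e)  # "... got multiple values for keyword argument 'variant'"

cfg = from_overrides_fixed({"variant": "fp16"})
print(cfg.variant)  # fp16
```

The same pattern applies to repo_variant, prediction_type, and submodels — any field read from the dict and then forwarded explicitly must be popped first.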
# Conflicts:
#	invokeai/app/api/dependencies.py
#	invokeai/app/invocations/fields.py
#	invokeai/app/invocations/metadata.py
#	invokeai/app/invocations/primitives.py
#	invokeai/backend/model_manager/configs/factory.py
#	invokeai/backend/model_manager/configs/lora.py
#	invokeai/backend/model_manager/configs/main.py
#	invokeai/backend/model_manager/load/model_loaders/lora.py
#	invokeai/backend/model_manager/starter_models.py
#	invokeai/backend/model_manager/taxonomy.py
#	invokeai/backend/stable_diffusion/diffusion/conditioning_data.py
#	invokeai/frontend/web/src/features/nodes/types/common.ts
#	invokeai/frontend/web/src/features/nodes/util/graph/generation/addImageToImage.ts
#	invokeai/frontend/web/src/features/nodes/util/graph/generation/addInpaint.ts
#	invokeai/frontend/web/src/features/nodes/util/graph/generation/addTextToImage.ts
#	invokeai/frontend/web/src/features/nodes/util/graph/graphBuilderUtils.ts
#	invokeai/frontend/web/src/features/nodes/util/graph/types.ts
#	invokeai/frontend/web/src/features/parameters/util/optimalDimension.ts
#	invokeai/frontend/web/src/features/settingsAccordions/components/AdvancedSettingsAccordion/AdvancedSettingsAccordion.tsx
#	invokeai/frontend/web/src/features/settingsAccordions/components/GenerationSettingsAccordion/GenerationSettingsAccordion.tsx
#	invokeai/frontend/web/src/services/api/schema.ts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Conflicts:
#	invokeai/frontend/web/src/features/metadata/parsing.tsx
Rewrite ParamQwenImageShift to use the same CompositeSlider + CompositeNumberInput + reset-to-auto pattern as ParamZImageShift, and move it from the Advanced Settings accordion to the Generation Settings accordion alongside the Z-Image shift control.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…fusers model

When switching to a qwen-image base model, automatically set the VAE/encoder component source to the first available Qwen Image diffusers model if none is currently selected. Follows the same pattern as Z-Image's auto-defaulting logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
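The auto-defaulting logic can be sketched as a small selection function. The model-record fields ("base", "format", "key") are hypothetical stand-ins for the real frontend model objects:

```python
def default_component_source(current, available_models):
    """Keep an explicit selection; otherwise default to the first
    available Qwen Image diffusers model, or None if there is none."""
    if current is not None:
        return current  # user already picked a source; don't override
    for m in available_models:
        if m.get("base") == "qwen-image" and m.get("format") == "diffusers":
            return m["key"]
    return None  # nothing suitable installed yet
```

The key design point from the commit: the default is applied only when no source is currently selected, so an explicit user choice is never clobbered on model switch.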
…witching qwen3 denoisers
… denoise model variants
Non-edit Qwen GGUFs now get an explicit "generate" variant instead of None, matching the new tensor-based variant detection semantics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
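The semantics can be sketched as a tiny classifier. The tensor-name heuristic below is invented for illustration — the real detection logic and tensor keys live in InvokeAI's model manager — but it shows the shape of the change: the fallback is an explicit "generate" rather than None.

```python
def detect_qwen_variant(tensor_names):
    """Hypothetical sketch: classify a Qwen Image GGUF by its tensors.
    Assumption (illustrative only): edit checkpoints carry extra
    reference-conditioning tensors that plain generation checkpoints lack."""
    if any("ref_image" in name for name in tensor_names):
        return "edit"
    # explicit default instead of None, per the new detection semantics
    return "generate"
```

Returning "generate" explicitly means downstream config code never has to special-case a missing variant for non-edit checkpoints.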
Summary
Complete implementation of the Qwen Image Edit 2511 pipeline for InvokeAI, including text-to-image generation, image editing with reference images, LoRA support (including Lightning distillation), GGUF quantized transformers, and BitsAndBytes encoder quantization.
Key Features
Backend Changes
- zero_cond_t modulation, LoRA application via LayerPatcher with sidecar patching for GGUF, shift override for Lightning
- ModelLoader: zero_cond_t=True, correct in_channels
- >= 4.56.0 (the video processor fallback imports already handle this)

Frontend Changes

- qwenImageEditComponentSource, qwenImageEditQuantization, qwenImageEditShift in params slice, with persistence and model-switch cleanup

Functional Testing Guide
1. Text-to-Image Generation (Basic)
2. GGUF Quantized Transformer
3. BitsAndBytes Encoder Quantization
4. LoRA Support
5. Image Editing with Reference Image
6. Multiple Reference Images
7. Model Switching Cleanup
🤖 Generated with Claude Code