fix(tensorrt): TRT 10.16.1.11 + modelopt install + run_pip quote-fix by forkni · Pull Request #1 · dotsimulate/StreamDiffusion-installer

forkni · 2026-04-26T20:26:10Z

Summary

Bumps the installer to align with TRT 10.16.1.11 (first Blackwell-Windows-production release; fixes the 78% FP8 perf regression in 10.12–10.13 on SM_120) and adds the missing FP8-quant install block.

- sd_installer/tensorrt.py: bump tensorrt_cu12 → 10.16.1.11, polygraphy → 0.49.26, onnx-graphsurgeon → 0.6.1. Add the FP8-quant block (nvidia-modelopt[onnx] cupy-cuda12x==13.6.0 numpy==1.26.4), which was previously missing, so the ImportError in fp8_quantize stayed silent until the first FP8 build. Re-pin onnxruntime-gpu==1.24.4 with --no-deps after modelopt's transitive downgrade. Drop shell-style quotes inside package specs (run_pip uses subprocess + .split(), so quotes become literal arg chars).
- sd_installer/installer.py: remove torchaudio from the cu128 config (not needed); minor ruff format cleanup.
- sd_installer/verifier.py: the float32_to_bfloat16 diagnostic now points to onnx-graphsurgeon==0.6.1 instead of suggesting an onnx downgrade.
- sd_installer/{cli.py, __init__.py, __main__.py}: ruff format cleanup (blank lines, unused import, raw docstring).
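
For reviewers who want to see the quote issue in isolation, here is a minimal sketch of why whitespace splitting keeps the quotes. It does not reproduce run_pip itself; the helper names `naive_args` and `shell_like_args` are hypothetical and exist only for this illustration.

```python
import shlex


def naive_args(command: str) -> list[str]:
    """Split on whitespace only, the way a plain str.split() does."""
    return command.split()


def shell_like_args(command: str) -> list[str]:
    """Split with shell quoting rules, so surrounding quotes are consumed."""
    return shlex.split(command)


# A package spec written with shell-style quotes (illustrative input).
quoted = 'install "numpy==1.26.4" "cupy-cuda12x==13.6.0"'

# With str.split() the quote characters stay inside each argument, so pip
# would be handed the literal string '"numpy==1.26.4"' as a requirement.
print(naive_args(quoted))

# shlex.split() strips the quotes, which is what a shell would have done.
print(shell_like_args(quoted))
```

Because no shell is involved when the argument list goes straight to subprocess, dropping the quotes from the specs (as this PR does) is the simplest fix.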

Companion PR

Pairs with dotsimulate/StreamDiffusion#12 — the main library work for TRT 10.16.1.11 + FP8 quantization. The installer fix here is a strict prerequisite: the StreamDiffusionTD COMP's Installtensorrt button installs from this repo's sd_installer/tensorrt.py, so without this PR merged the button continues to install TRT 10.12 even after the main PR lands.

Test Plan

Fresh-venv install: confirm pip list reports tensorrt_cu12==10.16.1.11, polygraphy==0.49.26, onnx-graphsurgeon==0.6.1, nvidia-modelopt>=0.19, onnxruntime-gpu==1.24.4 (--no-deps re-pin).
python -c "from streamdiffusion.acceleration.tensorrt.fp8_quantize import *; print('OK')" returns OK on a fresh install (pre-fix this would have ImportError'd on modelopt until the first FP8 build).
All 13 verifier checks pass.
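
The pin check in the first test-plan item could be scripted roughly as follows. This is a sketch, not part of the installer or its verifier: `EXPECTED_PINS`, `installed_versions`, and `pin_mismatches` are hypothetical names, and the distribution names are taken from the test plan above on the assumption that they match what pip reports.

```python
from importlib import metadata

# Exact pins from the test plan. nvidia-modelopt has only a lower bound
# there, so it is left out of this exact-match table.
EXPECTED_PINS = {
    "tensorrt_cu12": "10.16.1.11",
    "polygraphy": "0.49.26",
    "onnx-graphsurgeon": "0.6.1",
    "onnxruntime-gpu": "1.24.4",
}


def installed_versions(names):
    """Return {name: version or None} for each distribution name."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def pin_mismatches(expected, installed):
    """Return {name: (wanted, found)} for every pin that is not met."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in expected.items()
        if installed.get(name) != wanted
    }


if __name__ == "__main__":
    problems = pin_mismatches(EXPECTED_PINS, installed_versions(EXPECTED_PINS))
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found}")
```

Run inside the fresh venv, an empty result would correspond to the expected `pip list` output.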

🤖 Generated with Claude Code

- tensorrt.py: bump tensorrt_cu12 to 10.16.1.11, polygraphy 0.49.26, onnx-graphsurgeon 0.6.1; add FP8-quant block (modelopt + cupy-cuda12x + numpy re-lock); re-pin onnxruntime-gpu==1.24.4 with --no-deps after modelopt downgrade; drop shell-style quotes inside package specs (run_pip uses subprocess + .split(), quotes become literal arg chars).
- installer.py: remove torchaudio from cu128 config (not needed); minor ruff format cleanup.
- verifier.py: float32_to_bfloat16 diagnostic points to onnx-gs 0.6.1 instead of suggesting an onnx downgrade.
- __init__.py, __main__.py, cli.py: ruff format cleanup (blank lines, unused import, raw docstring).
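
The install order described in this commit (FP8 block first, then the onnxruntime-gpu re-pin with --no-deps) can be sketched as below. This assumes the FP8 block goes in as a single pip invocation, which the PR does not state; `pip_command` and `fp8_install_plan` are hypothetical helpers, not the installer's run_pip.

```python
import sys

# Package specs as listed in the PR summary.
FP8_BLOCK = ["nvidia-modelopt[onnx]", "cupy-cuda12x==13.6.0", "numpy==1.26.4"]
ORT_REPIN = "onnxruntime-gpu==1.24.4"


def pip_command(packages, *extra_flags):
    """Build an argv list for pip; the list form needs no shell quoting."""
    return [sys.executable, "-m", "pip", "install", *extra_flags, *packages]


def fp8_install_plan():
    """Return the two pip invocations in the order they should run."""
    return [
        # Step 1: modelopt and friends. Its dependency resolution may
        # downgrade onnxruntime-gpu as a side effect.
        pip_command(FP8_BLOCK),
        # Step 2: restore the pinned onnxruntime-gpu. --no-deps keeps pip
        # from touching the packages that step 1 just settled.
        pip_command([ORT_REPIN], "--no-deps"),
    ]


if __name__ == "__main__":
    for command in fp8_install_plan():
        print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` as-is. Building the argv as a list also avoids the quote problem entirely, since nothing is ever split from a string.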

Fixes 6 CVEs patched in deps audit 2026-05-23:
- idna >=3.16 (CVE-2026-45409: punycode resource exhaustion)
- Mako >=1.3.12 (CVE-2026-44307: Windows backslash path traversal)
- urllib3 >=2.7.0 (CVE-2026-44432/44431: over-decompression, cross-origin redirect)

Added to MANUAL_PINS and installed in phase7_numpy_lock so upgrade runs on both fresh and existing installs. Fresh pip resolves already satisfy these floors; this ensures the minimum on partial updates.

pip and onnxruntime-gpu CVEs are handled separately:
- pip: phase1_foundation already runs --upgrade pip (gets latest)
- onnx 1.19.1: 6 CVEs deferred; 1.21.0 breaks FP8 quantization

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
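
To illustrate what "ensures the minimum on partial updates" means in practice, here is a small sketch of a floor check. It is not the installer's MANUAL_PINS or phase7_numpy_lock code: `SECURITY_FLOORS`, `version_key`, `below_floor`, and `floor_specs` are hypothetical, and the version parsing only handles plain dotted numeric versions.

```python
# Minimum versions from the commit message above.
SECURITY_FLOORS = {"idna": "3.16", "Mako": "1.3.12", "urllib3": "2.7.0"}


def version_key(version: str) -> tuple[int, ...]:
    """Turn '2.7.0' into (2, 7, 0). Pre-release or local tags are not handled."""
    return tuple(int(part) for part in version.split("."))


def below_floor(installed: str, floor: str) -> bool:
    """Compare numerically, padding with zeros so '2.7' equals '2.7.0'."""
    a, b = version_key(installed), version_key(floor)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def floor_specs(installed: dict[str, str]) -> list[str]:
    """Return 'name>=floor' specs for packages that are missing or too old."""
    specs = []
    for name, floor in SECURITY_FLOORS.items():
        current = installed.get(name)
        if current is None or below_floor(current, floor):
            specs.append(f"{name}>={floor}")
    return specs


if __name__ == "__main__":
    # An existing environment with an old idna but current Mako and urllib3.
    print(floor_specs({"idna": "3.10", "Mako": "1.3.12", "urllib3": "2.7.0"}))
```

On a fresh resolve the list would normally come back empty, matching the note that fresh installs already satisfy the floors.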

INTER-NYC and others added 2 commits April 23, 2026 14:39

forkni wants to merge 2 commits into dotsimulate:main from forkni:main

2 participants
