fix 4326: bound exported dynamic shapes and normalize symbolic slice …#4341
fix 4326: bound exported dynamic shapes and normalize symbolic slice …#4341micwill755 wants to merge 3 commits into
Conversation
|
@micwill755 |
|
|
||
| with gm.graph.inserting_before(node): | ||
| dim_size = gm.graph.call_function( | ||
| torch.ops.aten.sym_size.int, args=(input_node, dim) |
There was a problem hiding this comment.
Nice yeah I think is exactly what we are looking for.
If you want something a bit cleaner to read, we have this utility:
https://github.com/pytorch/TensorRT/blob/main/py/torch_tensorrt/dynamo/lowering/_SubgraphBuilder.py
but its more style than functional
There was a problem hiding this comment.
Cool, let me review and edit accordingly.
| partitioned_module.recompile() | ||
|
|
||
|
|
||
| def _apply_dynamic_shape_bounds( |
There was a problem hiding this comment.
@apbose please review / comment on if this overlaps with your work
There was a problem hiding this comment.
This was adapted from a previous fix in Wenbing’s branch: _apply_dynamic_shape_bounds.
It fixes the case where ZoomASR is exported with broad or unbounded Dim.DYNAMIC ranges by applying the user-provided torch_tensorrt.Input min/max bounds during Torch-TensorRT compilation. Without this compiler-side fix, I had to work around the issue in the ZoomASR export code by changing the dynamic shape annotations from:
"features": {0: Dim.DYNAMIC, 1: Dim.DYNAMIC, 2: Dim.STATIC},
"feature_lengths": {0: Dim.DYNAMIC},
to explicitly bounded dimensions:
batch_dim = Dim("batch", min=1, max=max_batch_size)
feature_len_dim = Dim("feature_len", min=1, max=3000)
dynamic_shapes={
"features": {0: batch_dim, 1: feature_len_dim, 2: Dim.STATIC},
"feature_lengths": {0: batch_dim},
}
With _apply_dynamic_shape_bounds, the model export can keep using Dim.DYNAMIC, while Torch-TensorRT still constructs the intended bounded TensorRT optimization profile from the provided input specs.
There was a problem hiding this comment.
My only concern with this code and Wenbings prior version of this is im not sure it properly inserts the updated shapes. Like I would expect some form of shape prop done here as well after constraining the placeholders. If we can solve this in export with explicit ranges, it might be better to split this into its own PR that fully handles constraining an already exported exported program. It should still unblock Shane without and the explicit shape in export right?
There was a problem hiding this comment.
Agreed and yes this will unblock Shane.
There was a problem hiding this comment.
yeah PR 4213 addresses this, validating the input bounds against the exporter bounds.
- One issue against changing the var_to_range is that it tracks the exporter bounds, and if we change the Input bounds for a later compile, it conflates the Input bound of TRT profile with the export bound, and can lead to error. Also it does not do the shape prop. So something like this fails in above
ep = torch.export.export(model, (sample,),
dynamic_shapes={"x": {0: Dim.DYNAMIC}})
# var_to_range[s0] = [2, int_oo] -- model valid for any size
Now two separate compilations:
# User A: small deployment, max batch 100
trt_a = torchtrt.dynamo.compile(ep, inputs=[Input(max_shape=(100, 8))])
# _apply_dynamic_shape_bounds mutates var_to_range[s0] → [2, 100]
# User B: larger deployment, max batch 500
trt_b = torchtrt.dynamo.compile(ep, inputs=[Input(max_shape=(500, 8))])
# _apply_dynamic_shape_bounds now sees var_to_range[s0] = [2, 100]
# computes min(100, 500) = 100
# User B silently gets max=100 instead of 500
- Second type which the other PR addresses is — if the export used explicit Dim("batch", min=10, max=20) and the user passes Input(min_shape=2) this method will clamp input bound to 10 ignoring 2.
Description
This change fixes three dynamic-shape issues exposed by the ZoomASR repro. Together, these fixes make Torch-TensorRT respect user-provided input bounds, lower negative symbolic slicing into TensorRT-friendly shape math, and clean up symbolic expressions that otherwise survive into runtime or conversion.
A previous working branch had a helper called _apply_dynamic_shape_bounds in Torch-TensorRT _compiler.py. That function propagated user-provided torch_tensorrt.Input min/max bounds into the FX shape_env, which explains why the old path used the intended ZoomASR feature length bound of 3000 instead of the broad exported range [9, 16384].
PyTorch 2.12 can emit slices with negative symbolic bounds, such as x[-n:] or x[:-n]. This change adds a lowering pass that rewrites those bounds to dim_size - n for both slice start and stop, making the graph explicit and TensorRT-friendly.
The issue discussion mentioned removing torch.sym_min(x, INT64_MAX) as a possible needed patch. This change adds that as a separate lowering pass; it removes the no-op sym_min expression before runtime, preventing failures where torch.sym_min receives tensor-like inputs.
Fixes #4326
Type of change
Checklist: