fix: improve weight offloading to handle plain tensor attrs and use to_empty() #952

Merged
quic-rishinr merged 5 commits into
quic:release/v1.22.0_tmp from
quic-rishinr:mem_optim_v2
May 26, 2026

Conversation

@quic-rishinr
Contributor

fix: improve weight offloading to handle plain tensor attrs and use to_empty()

Replace manual storage resizing with to_empty(device="meta") for
parameters/buffers and explicitly handle plain tensor attributes (e.g.
stacked expert weights in MoE models) that are not registered as
parameters or buffers. This ensures all tensors are properly moved to
the meta device, reducing memory usage after ONNX export.

Add unit tests for plain tensor attribute clearing
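
For readers unfamiliar with the pattern, the two steps in the description above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: `ToyMoE`, `stacked_experts`, and `offload_weights` are hypothetical names, and the sketch assumes plain tensor attributes live directly in each module's `__dict__`.

```python
import torch
from torch import nn


class ToyMoE(nn.Module):
    """Hypothetical module with a parameter, a buffer, and a plain tensor attribute."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(8, 8)                    # registered parameters
        self.register_buffer("scale", torch.ones(8))   # registered buffer
        # Plain tensor attribute: neither a Parameter nor a buffer, so
        # Module.to_empty() does not see it.
        self.stacked_experts = torch.randn(4, 8, 8)


def offload_weights(model: nn.Module) -> None:
    """Hypothetical helper: move every tensor held by `model` to the meta device."""
    # Step 1: parameters and buffers are replaced by uninitialized meta tensors.
    model.to_empty(device="meta")
    # Step 2: plain tensor attributes are stored in each module's __dict__
    # and have to be replaced by hand.
    for module in model.modules():
        for name, value in list(vars(module).items()):
            if isinstance(value, torch.Tensor) and not value.is_meta:
                setattr(module, name, torch.empty_like(value, device="meta"))


model = ToyMoE()
offload_weights(model)
print(model.proj.weight.device, model.scale.device, model.stacked_experts.device)
```

Meta tensors keep their shape and dtype but hold no data, so the module structure stays inspectable while the weight memory can be released.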

@quic-rishinr quic-rishinr force-pushed the mem_optim_v2 branch 2 times, most recently from 356b186 to 7c9e0c4 Compare May 13, 2026 03:53
@quic-rishinr
Contributor Author

@ochougul @abhishek-singh591 please review the PR

…o_empty()

Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
Contributor

@abhishek-singh591 abhishek-singh591 left a comment

LGTM after the last commit. Validated with the Llama 3 8B model.

Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com>
@quic-rishinr quic-rishinr added the 1.22 Release 1.22 candidate label May 25, 2026
@quic-rishinr quic-rishinr changed the base branch from main to release/v1.22.0_tmp May 25, 2026 16:39
@quic-rishinr quic-rishinr merged commit 69bf9b9 into quic:release/v1.22.0_tmp May 26, 2026
4 of 5 checks passed
@quic-rishinr quic-rishinr deleted the mem_optim_v2 branch May 26, 2026 05:50
Labels

1.22 Release 1.22 candidate

3 participants