
[CUDNN] Update frontend to version 1.22 and add cuDNN 9.20 path for SM arch >100 #2838

Open
zmelumian972 wants to merge 2 commits into NVIDIA:main from zmelumian972:cudnn/support_version_1.22

Conversation

@zmelumian972

@zmelumian972 zmelumian972 commented Apr 5, 2026

Summary

  • Updates the cudnn-frontend submodule to version 1.22 (97f6cb3b)
  • Adds a new cuDNN 9.20 path in nvte_get_fused_attn_backend for Blackwell (SM arch >= 100) that supports any head dimension, both forward and backward passes, non-paged layouts, and any sequence length

Changes

  • 3rdparty/cudnn-frontend: Bump submodule from 7b9b711c to 97f6cb3b (cuDNN frontend v1.22)
  • transformer_engine/common/fused_attn/fused_attn.cpp: Add cuDNN 9.20 backend selection condition:
    • Enables FusedAttn_F16_Arbitrary_Seqlen backend for SM >= 100 + cuDNN >= 9.20 + non-paged KV layouts
    • Fixes the logical operator joining the 9.11 condition from && to || to correctly OR the two Blackwell conditions

Test plan

  • Verify FusedAttention with cuDNN 9.20+ on Blackwell (SM >= 100) hardware
  • Confirm existing Hopper (SM 90) paths are unaffected
  • Run fused attention unit tests for paged/non-paged layouts

🤖 Generated with Claude Code

@zmelumian972 zmelumian972 force-pushed the cudnn/support_version_1.22 branch 2 times, most recently from d72d8a2 to dcef948 on April 5, 2026 16:05
@greptile-apps
Contributor

greptile-apps bot commented Apr 5, 2026

Greptile Summary

Bumps cudnn-frontend to v1.22 and adds a cuDNN 9.20 head-dimension OR branch for Blackwell (SM≥100, non-paged, fprop+bprop, any sq) inside the existing flag_arb check in nvte_get_fused_attn_backend. The parenthesis accounting is correct: the trailing && on the 9.11 clause is correctly changed to || to insert the new sub-condition inside the head-dim OR group, and all outer gates (architecture, mask type, QKV format, sliding window, determinism) continue to apply to the 9.20 path unchanged.

Confidence Score: 5/5

Safe to merge — the logic change is structurally correct and all pre-existing runtime gates still apply to the new 9.20 path.

No new P0/P1 findings. The && → || restructuring is correct: it moves the closing ) of the 9.11 sub-condition one level inward and appends the 9.20 clause as a sibling OR branch before the outer group closes, keeping the bug-exclusion && in the right place. The 9.20 condition inherits all surrounding guards (the SM≥100 arch check, mask-type block, QKV-format block, and Blackwell determinism check), so no previously gated combinations are accidentally opened beyond what cuDNN 9.20 is expected to support. The one open question, sq=1 + causal on Blackwell/9.20, was already raised in a prior review thread.

No files require special attention.

Important Files Changed

Filename | Overview
transformer_engine/common/fused_attn/fused_attn.cpp | Adds cuDNN 9.20 head-dim OR branch for Blackwell (SM≥100, fprop+bprop, non-paged); changes the trailing && on the 9.11 clause to ||.
3rdparty/cudnn-frontend | Submodule bump from 7b9b711c to 97f6cb3b (cudnn-frontend v1.22); no source changes in this repo.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["F16/BF16 path entered"] --> B{"arch check\nSM≥100 requires\ncuDNN≥9.7"}
    B -->|pass| C{"head_dim % 8 == 0\nfor qk and v?"}
    B -->|fail| Z["flag_arb = false"]
    C -->|yes| D{"Which head_dim\ncondition matches?"}
    C -->|no| Z
    D --> D1["≤128 (any version)"]
    D --> D2["≤256 + Hopper\n(9.1 fprop / 9.5 bprop)"]
    D --> D3["any dim + Blackwell\n+ fprop + non-paged\n+ sq>1  (9.9)"]
    D --> D4["any dim + fprop\n+ any arch (9.10.2)"]
    D --> D5["dqk=192/dv=128\n+ Blackwell + bprop\n+ non-paged (9.11)"]
    D --> D6["any dim + Blackwell\n+ fprop/bprop\n+ non-paged (9.20 NEW)"]
    D1 & D2 & D3 & D4 & D5 & D6 --> E{"Hopper bprop\nbug exclusion\n(sm==90 only)"}
    E -->|not blocked| F{"mask type,\nformat, SWA,\ndeterminism\ngates"}
    E -->|blocked| Z
    F -->|all pass| G["flag_arb = true\n→ NVTE_F16_arbitrary_seqlen"]
    F -->|any fail| Z

Reviews (2): Last reviewed commit: "FusedAttention: Add cudnn 9.20 path for ..."

Comment on lines +343 to +345
// 9.20: any head_dim + Blackwell + fprop/bprop + non_paged + any sq
(sm_arch_ >= 100 && cudnn_runtime_version >= 92000 &&
layout_group != NVTE_QKV_Layout_Group::NVTE_Paged_KV_HD_HD_HD)) &&

P2 Verify sq=1 + causal/padding_causal fprop support in cuDNN 9.20

The 9.20 condition allows any max_seqlen_q (including sq = 1) with any mask type on non-paged Blackwell layouts. The preceding 9.10.2 fprop path explicitly excluded sq = 1 + causal and sq = 1 + padding_causal on non-paged layouts:

(max_seqlen_q == 1 && attn_mask_type != NVTE_Mask_Type::NVTE_CAUSAL_MASK &&
 attn_mask_type != NVTE_Mask_Type::NVTE_PADDING_CAUSAL_MASK)

With the 9.20 path (any sq, no mask-type restriction at the head-dim level), sq=1 + causal + non-paged + fprop on Blackwell/cuDNN≥9.20 will now pass this gate — where it was previously blocked. If cuDNN 9.20 lifts this restriction for SM≥100, this is correct. If not, passing this combination to the backend would produce a runtime error. Please confirm whether cuDNN 9.20 actually supports this combination on Blackwell.

@KshitijLakhani
Collaborator

/te-ci jax L0

@jberchtold-nvidia jberchtold-nvidia self-assigned this Apr 8, 2026
Signed-off-by: zmelumian972 <zmelumian@gmail.com>
Signed-off-by: zmelumian972 <zmelumian@gmail.com>
@zmelumian972 zmelumian972 force-pushed the cudnn/support_version_1.22 branch from dcef948 to d217bf9 on April 16, 2026 05:56
3 participants