Signed-off-by: Hao Wu <skyw@nvidia.com>
Greptile Summary
This PR adds a new
Confidence Score: 5/5
Safe to merge; all remaining findings are P2 style/documentation issues that do not affect runtime behavior.
The model implementation and weight-copy logic are correct for the default configuration (fuse_qkv_params=False). The two P2 findings are a misleading code comment and a dead code branch that is never reached with the current model setup. No P0 or P1 issues remain unaddressed.
examples/pytorch/qwen3_moe/test_vs_hf.py: the dead "qkv" weight-copy branch (lines 106-109) should be removed or corrected before fuse_qkv_params=True is ever used.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["input_ids (B, S)"] --> B["embed_tokens → hidden_states (B, S, H)"]
B --> C["RotaryPositionEmbedding → freqs"]
C --> D{"For each DecoderLayer"}
D --> E["residual = hidden_states"]
E --> F["te.MultiheadAttention\n(fused LN + QKV + QK-norm + RoPE + attn + O-proj)"]
F --> G["hidden_states = residual + attn_out"]
G --> H["residual = hidden_states"]
H --> I["te.RMSNorm (post_attention_layernorm)"]
I --> J["Qwen3MoeBlock"]
subgraph MoE ["Qwen3MoeBlock"]
J1["hidden_flat (T, H)"] --> J2["Qwen3MoeRouter\n(softmax + top-k)"]
J2 --> J3["merging_probs, routing_map,\ntokens_per_expert, router_logits"]
J3 --> J4["te.moe_permute_with_probs\n→ permuted_input (T*k, H)"]
J4 --> J5["te_ops.Sequential\nGroupedLinear → SwiGLU → GroupedLinear"]
J5 --> J6["te.moe_unpermute\n→ output (T, H)"]
end
J --> J1
J6 --> K["hidden_states = residual + moe_out"]
K --> D
D --> L["te.RMSNorm (final norm)"]
L --> M["te.Linear (lm_head) → logits (B, S, V)"]
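The Qwen3MoeBlock path in the flowchart (route, permute, run experts, unpermute) can be illustrated with a small pure-Python sketch. This is not the PR's code and does not reproduce the signatures of `te.moe_permute_with_probs` or `te.moe_unpermute`; the function names below (`route`, `permute`, `unpermute`, `toy_moe_block`) are hypothetical, the experts are plain callables standing in for the GroupedLinear → SwiGLU → GroupedLinear stack, and renormalizing the kept top-k probabilities is an assumption rather than something the PR states.

```python
import math


def route(logits, top_k):
    """Softmax over experts per token, then keep the top_k experts."""
    num_experts = len(logits[0])
    merging_probs, routing_map = [], []
    for row in logits:
        peak = max(row)
        exps = [math.exp(x - peak) for x in row]  # numerically stable softmax
        total = sum(exps)
        probs = [e / total for e in exps]
        chosen = sorted(range(num_experts), key=lambda e: probs[e], reverse=True)[:top_k]
        # Assumption: kept probabilities are renormalized to sum to 1.
        kept = sum(probs[e] for e in chosen)
        merging_probs.append(
            [probs[e] / kept if e in chosen else 0.0 for e in range(num_experts)]
        )
        routing_map.append([e in chosen for e in range(num_experts)])
    tokens_per_expert = [sum(r[e] for r in routing_map) for e in range(num_experts)]
    return merging_probs, routing_map, tokens_per_expert


def permute(hidden, routing_map):
    """Copy each token once per selected expert, grouped by expert: (T*k, H)."""
    rows, src = [], []
    for e in range(len(routing_map[0])):
        for t, selected in enumerate(routing_map):
            if selected[e]:
                rows.append(list(hidden[t]))
                src.append((t, e))  # remember where each copy came from
    return rows, src


def unpermute(expert_out, src, merging_probs, num_tokens):
    """Scatter expert outputs back to token order, weighted by merging_probs."""
    hidden_size = len(expert_out[0])
    out = [[0.0] * hidden_size for _ in range(num_tokens)]
    for row, (t, e) in zip(expert_out, src):
        for h in range(hidden_size):
            out[t][h] += merging_probs[t][e] * row[h]
    return out


def toy_moe_block(hidden, logits, top_k, experts):
    """Route -> permute -> per-expert compute -> unpermute."""
    merging_probs, routing_map, tokens_per_expert = route(logits, top_k)
    permuted, src = permute(hidden, routing_map)
    expert_out, start = [], 0
    for e, count in enumerate(tokens_per_expert):
        # Each expert sees one contiguous slice of the permuted input.
        expert_out.extend(experts[e](r) for r in permuted[start:start + count])
        start += count
    return unpermute(expert_out, src, merging_probs, len(hidden)), tokens_per_expert
```

Grouping the token copies by expert is what lets a grouped GEMM process each expert's slice contiguously; `tokens_per_expert` supplies the slice sizes. With identity experts and renormalized probabilities, the block returns its input unchanged, which makes a convenient sanity check.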
Reviews (4): Last reviewed commit: "Merge branch 'main' into vibe_qwen3"
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Hao Wu <skyw@users.noreply.github.com>
Description
An almost pure TE (Transformer Engine) module implementation of the Qwen3 MoE model.
Type of change
Changes
Please list the changes introduced in this PR:
Checklist: