Skip to content

fix: align thistogram HistByte support with PTO-ISA#697

Open
HecreReed wants to merge 3 commits into
hw-native-sys:mainfrom
HecreReed:codex/pr634-histbyte-fix
Open

fix: align thistogram HistByte support with PTO-ISA#697
HecreReed wants to merge 3 commits into
hw-native-sys:mainfrom
HecreReed:codex/pr634-histbyte-fix

Conversation

@HecreReed
Copy link
Copy Markdown
Collaborator

Summary

  • align pto.thistogram with current PTO-ISA HistByte semantics instead of the older bool-style lowering from fix: use HistByte template args for thistogram emitc #634
  • replace isMSB with byte : i32 in PTO IR and lower byte=0..3 to HistByte::BYTE_0..BYTE_3
  • keep verifier behavior consistent with upstream THistogram constraints for both ui16 and ui32 sources

Why

PTO-ISA now exposes four HistByte modes, but they are not uniformly valid for all source dtypes:

  • ui16: only BYTE_0 / BYTE_1
  • ui32: BYTE_0 / BYTE_1 / BYTE_2 / BYTE_3

PR #634 only converted the EmitC lowering from true/false to BYTE_1/BYTE_0, which still lagged the current upstream interface.

Changes

  • IR:
    • change pto.thistogram attr from isMSB: BoolAttr to byte: I32Attr with default 1
  • verifier:
    • accept src: ui16 | ui32
    • enforce byte in [0, 3]
    • enforce ui16 only allows byte=0/1
    • enforce ui16 vs ui32 specific idx layout / shape constraints
  • lowering:
    • map byte=0..3 to HistByte::BYTE_0..BYTE_3
  • docs/tests:
    • refresh PTO_IR_manual.md
    • add ui32 lowering coverage and invalid-byte verifier coverage

Validation

  • cmake -G Ninja -S /Users/laoda/pto/PTOAS/_pr634_histbyte_fix -B /tmp/ptoas-pr634-build ...
  • ninja -C /tmp/ptoas-pr634-build ptoas
  • ptoas --pto-arch=a5 test/lit/pto/thistogram_emitc.pto | FileCheck ...
  • ptoas --pto-arch=a5 test/lit/pto/thistogram_emitc_u32.pto | FileCheck ...
  • not ptoas --pto-arch=a5 test/lit/pto/thistogram_verify_invalid_byte_a5.pto | FileCheck ...
  • not ptoas --pto-arch=a5 test/lit/pto/thistogram_verify_invalid_ui16_byte_a5.pto | FileCheck ...

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the pto.thistogram operation by replacing the isMSB boolean attribute with a byte integer attribute and adding support for ui32 source types. These changes include updated documentation, IR definitions, and verification logic to handle specific filtering behaviors and layout requirements for different source data types. The review feedback suggests improving the descriptiveness of verification error messages by including expected row counts and removing a redundant variable initialization in the EmitC lowering logic.

Comment thread lib/PTO/IR/PTO.cpp
Comment on lines +6749 to +6756
int64_t expectedIdxRows = 1;
if (byte == 1)
expectedIdxRows = 2;
else if (byte == 0)
expectedIdxRows = 3;
if (!hasCompatibleKnownExtent(idxShape[0], expectedIdxRows) ||
!hasCompatibleKnownExtent(idxValid[0], expectedIdxRows))
return emitOpError("expects idx rows/valid rows to match the byte-selected filter depth when src element type is ui32");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The error message could be more descriptive by including the expected number of rows. This helps developers quickly identify and fix shape mismatches in their IR.

      int64_t expectedIdxRows = 1;
      if (byte == 1)
        expectedIdxRows = 2;
      else if (byte == 0)
        expectedIdxRows = 3;
      if (!hasCompatibleKnownExtent(idxShape[0], expectedIdxRows) ||
          !hasCompatibleKnownExtent(idxValid[0], expectedIdxRows))
        return emitOpError("expects idx rows/valid rows to be ")
               << expectedIdxRows
               << " to match the byte-selected filter depth when src element "
                  "type is ui32";


auto templateArgs = rewriter.getArrayAttr({emitc::OpaqueAttr::get(
ctx, op.getIsMSB() ? "HistByte::BYTE_1" : "HistByte::BYTE_0")});
StringRef histByte = "HistByte::BYTE_1";
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The initialization of histByte is redundant as it is unconditionally assigned in the following switch statement. Removing it improves code clarity and adheres to common C++ best practices regarding avoiding unnecessary initializations.

Suggested change
StringRef histByte = "HistByte::BYTE_1";
StringRef histByte;

@reedhecre
Copy link
Copy Markdown

reedhecre commented May 24, 2026

Codex Review

该评论由 review 机器人自动更新。

  • PR: fix: align thistogram HistByte support with PTO-ISA #697 fix: align thistogram HistByte support with PTO-ISA
  • Author: HecreReed
  • Base/Head: main / codex/pr634-histbyte-fix
  • Head SHA: e2c578fce2e0
  • Trigger: PR 有新提交
  • Generated At: 2026-05-25T07:56:37Z
  • Previous Head SHA: 1828eacbd5a3
  • Status: completed

Summary

The PR preserves old textual isMSB IR, but it still breaks the generated typed thistogram APIs and leaves one stale manual entry.

Findings

  1. P2 Removing `isMSB` from the ODS still breaks the generated typed THistogram APIs include/PTO/IR/PTOOps.td:2778

isMSB compatibility is implemented only by manually reading an unregistered attribute in the verifier/lowering, but this ODS change still renames the op’s declared attribute to byte. Because the Python bindings are regenerated from this TableGen source (python/CMakeLists.txt) and the typed C++ op API is generated from the same ODS, downstream code that previously built or inspected pto.thistogram via isMSB will stop compiling/working even though old textual IR still parses. That makes the compatibility story incomplete for the public typed APIs.

  1. P3 The manual’s quick-reference entry still describes only the old thistogram semantics docs/PTO_IR_manual.md:4134

The detailed pto.thistogram section was updated for ui32 plus byte, but the quick-reference table still documents only the old dst[i, idx[i,0]] = ... form. That leaves the manual internally inconsistent and advertises the wrong contract to readers who use the summary table.

@HecreReed HecreReed marked this pull request as ready for review May 25, 2026 07:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants