Skip to content

cdurrer/konark#5

Open
CycyXX wants to merge 29 commits intomainfrom
cdurrer/konark
Open

cdurrer/konark#5
CycyXX wants to merge 29 commits intomainfrom
cdurrer/konark

Conversation

@CycyXX
Copy link
Copy Markdown

@CycyXX CycyXX commented Nov 14, 2025

  • Transpose verification
  • Introduction of new features:
  • Data layout conversions for Konark (CIM)
  • Unfold and fold operations for MobileViT

@CycyXX CycyXX requested a review from sermazz November 14, 2025 13:15
@CycyXX CycyXX self-assigned this Nov 14, 2025
sermazz and others added 20 commits November 17, 2025 10:49
…: added d2_stride and set dim_enable_1h accordingly
@CycyXX CycyXX changed the title Cdurrer/konark cdurrer/konark Mar 25, 2026
@CycyXX CycyXX linked an issue Mar 25, 2026 that may be closed by this pull request
@CycyXX CycyXX requested a review from FrancescoConti April 28, 2026 12:32
@CycyXX CycyXX marked this pull request as ready for review April 28, 2026 12:34
Copilot AI review requested due to automatic review settings April 28, 2026 12:34
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR expands the datamover HWPE’s functionality and verification flow to support transpose verification and new data-layout features (CIM conversion + MobileViT unfold/fold), spanning RTL, SV testbench, and software/HAL tooling.

Changes:

  • Updates RTL register map/control path to support additional modes and dimensions (incl. packed register scheme).
  • Extends testbench configuration/stimuli generation and adds Python validation/models.
  • Reworks/introduces C HAL + SNRT-based software test infrastructure; removes legacy PULP/PMSIS test.

Reviewed changes

Copilot reviewed 27 out of 28 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
verif/tb/tb_package.sv Moves timing params and adjusts output checking verbosity.
verif/tb/tb_datamover_top_wrap.sv Updates TB configuration/register programming for new packed registers + mode params.
verif/python/validate_config.py Adds config validation script used by Make targets.
verif/python/generate_stimuli.py Major rewrite of stimuli generation to support new modes and header output.
verif/python/datamover_microarchitectural_model.py Adds a micro-architectural Python model for exploration/debug.
verif/python/datamover_golden_model_numpy.py Adds NumPy-based golden generation for unfold/fold.
verif/python/datamover_golden_model.py Adds CLI-driven golden model/header generator for all modes.
test/hal_hwpe.[ch] Introduces HWPE ctrl HAL.
test/hal_datamover.[ch] Introduces datamover HAL + blocking helpers for multiple modes.
test/datamover_test.c Adds SNRT-based datamover functional test using generated data.h.
rtl/datamover_package.sv Adds new mode enum + packed register definitions.
rtl/datamover_top.sv Binds new register map into streamer/engine control; updates defaults and IO reg count.
rtl/datamover_streamer.sv Extends HCI interface parameterization (AW) and related plumbing.
rtl/datamover_engine.sv Implements mode metadata + leftover handling; adds counters and assertions.
modelsim/Makefile Adjusts VSIM flag handling.
config.mk / config_presets.mk Replaces old stimulus config with derived preset-driven config system.
Makefile Adds validation + preset/grid test targets and wires Python tooling into sim.
README.md / CONFIG_USAGE.md Updates documentation and adds configuration guide.
.gitignore / Bender.yml Repo housekeeping updates.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread verif/python/generate_stimuli.py Outdated
Comment on lines +235 to +244
if (height % patch_sidelength) != 0 or (width % patch_sidelength) != 0:
raise ValueError("[GM] Tensor height and width must be multiples of the patch sidelength.")
num_patches = (height * width) // patch_size # number of patches
tensor_unfolded = [[[0 for _ in range(channels)] for _ in range(num_patches)] for _ in range(patch_size)]
for p in range(patch_size):
for n in range(num_patches):

tensor_unfolded[p][n] = [ tensor[c][h][w] for c in range(channels)]


Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfold() is incomplete and currently references undefined variables (h, w) and never returns tensor_unfolded. As-is, this will raise at runtime (and will also trigger a syntax/logic issue if unfold/fold stimuli generation is enabled). Complete the implementation (define the patch -> (h,w) mapping, fill the output tensor, and return it) or remove the dead stub until supported.

Suggested change
if (height % patch_sidelength) != 0 or (width % patch_sidelength) != 0:
raise ValueError("[GM] Tensor height and width must be multiples of the patch sidelength.")
num_patches = (height * width) // patch_size # number of patches
tensor_unfolded = [[[0 for _ in range(channels)] for _ in range(num_patches)] for _ in range(patch_size)]
for p in range(patch_size):
for n in range(num_patches):
tensor_unfolded[p][n] = [ tensor[c][h][w] for c in range(channels)]
if patch_sidelength * patch_sidelength != patch_size:
raise ValueError("[GM] Patch size must be a perfect square.")
if (height % patch_sidelength) != 0 or (width % patch_sidelength) != 0:
raise ValueError("[GM] Tensor height and width must be multiples of the patch sidelength.")
num_patches = (height * width) // patch_size # number of patches
patches_per_row = width // patch_sidelength
tensor_unfolded = [[[0 for _ in range(channels)] for _ in range(num_patches)] for _ in range(patch_size)]
for p in range(patch_size):
patch_offset_h = p // patch_sidelength
patch_offset_w = p % patch_sidelength
for n in range(num_patches):
patch_row = n // patches_per_row
patch_col = n % patches_per_row
h = patch_row * patch_sidelength + patch_offset_h
w = patch_col * patch_sidelength + patch_offset_w
tensor_unfolded[p][n] = [tensor[c][h][w] for c in range(channels)]
return tensor_unfolded

Copilot uses AI. Check for mistakes.

def main():
# Parse command-line arguments
# Parse command-line arguments
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

main() has inconsistent indentation: line 246 is indented more than the rest of the function body, which will cause an IndentationError and break the stimuli/sim make targets. Align the indentation so all statements inside main() are consistently indented one level.

Suggested change
# Parse command-line arguments
# Parse command-line arguments

Copilot uses AI. Check for mistakes.
Comment thread Makefile
Comment thread rtl/datamover_engine.sv
Comment thread verif/python/validate_config.py Outdated
f"({num_channels}x{size_m}x{size_n}) and element width ({elem_width})")

# Mode validation (based on config.mk)
if datamover_mode not in [0, 1, 2, 3]:
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

datamover_mode validation currently rejects modes 4 and 5 even though the error message (and config.mk) list unfold/fold as supported. This makes make validate-config fail for unfold/fold configurations. Expand the allowed set (or update the message) so the check matches the supported modes.

Suggested change
if datamover_mode not in [0, 1, 2, 3]:
if datamover_mode not in [0, 1, 2, 3, 4, 5]:

Copilot uses AI. Check for mistakes.
Comment thread Makefile
Comment thread config.mk Outdated
Comment thread modelsim/Makefile Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 27 out of 28 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread Makefile
Comment on lines 274 to +276
// Configure packed length registers (see datamover_package.sv)
len0_reg = {read_addr.d1_length[7:0], read_addr.d0_length[11:0], read_addr.tot_length[11:0]};
len1_reg = {4'b0, read_addr.d1_length[11:8], write_addr.d1_length[11:0], write_addr.d0_length[11:0]};
ctrl_engine_reg = {16'b0, write_dim_enable[3:0], read_dim_enable[3:0], 5'b0, transp_mode[2:0]};
tensor_dim_reg = {tensor_size_n[15:0], tensor_size_m[15:0]};
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctrl_engine_reg hard-codes the datamover_mode field to 0 (5'b0), so the RTL always sees COPY mode even when the config/preset selects transpose/CIM/unfold/fold. This affects mode-dependent behavior (e.g., strobe generation, completion counting) and will invalidate transpose/CIM verification.

Populate bits [7:3] of DATAMOVER_REG_CTRL_ENGINE from a stimulus/config macro (e.g., STIM_DATAMOVER_MODE derived from DATAMOVER_MODE) and pass that define through the top-level Makefile.

Copilot uses AI. Check for mistakes.
CycyXX added 7 commits April 29, 2026 16:17
…le generation containing stimuli and golden outputs

[gm] renamed golden model script

[gm] Extended golden model to write all relevant configuration values to data.h

[rtl] Added AW parameter propagation to solve assertion errors

[tb,gm] Started adding mode 3: in-layout transposition (not yet complete)
[rtl,tb] Refactored configuration registers to save init cycles and increase bitwidth. Added configurability of d3, d4 and dim_enable.

[rtl] Fixed error in matrix_dim assignment, switched transp_mode enum to make it more readable (TRANSP_NONE now corresponds to 2'b00)

[rtl,sw] small fixes and misalignment handling additions, added high-level (numpy) and microarchitectural golden models
…tation: matrix N-size needs to be word-aligned)
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 26 out of 28 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread rtl/datamover_engine.sv
Comment thread test/hal_datamover.h
Comment on lines +90 to 105
#if VERBOSE
/* Verbose read/write register */
#define __HAL_DATAMOVER_REG_WRITE(base, offset, value) do { \
*(volatile uint32_t *)(base + offset) = value; \
printf("__HAL_DATAMOVER_REG_WRITE: Addr 0x%08x <= 0x%08x\n", (uint32_t)(base + offset), (uint32_t)(value)); \
} while(0)
#define __HAL_DATAMOVER_REG_READ(base, offset) ({ \
uint32_t read_value = *(volatile uint32_t *)(base + offset); \
printf("__HAL_DATAMOVER_REG_READ: Addr 0x%08x => 0x%08x\n", (uint32_t)(base + offset), (uint32_t)(read_value)); \
read_value; \
})
#else
/* Normal read/write register */
#define __HAL_DATAMOVER_REG_WRITE(base, offset, value) *(volatile uint32_t *)(base + offset) = value
#define __HAL_DATAMOVER_REG_READ(base, offset) *(volatile uint32_t *)(base + offset)
#endif
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The verbose register-access macros are guarded by #if VERBOSE, but VERBOSE is not defined in this header (and in datamover_test.c it’s defined after including this header). This means verbose mode can’t be reliably enabled and, if a TU defines VERBOSE before including, the macros call printf without this header including <stdio.h>. Suggestion: mirror hal_hwpe.h by introducing a header-local switch (e.g., __HAL_DATAMOVER_VERBOSE defaulting to 0), include printf.h/stdio.h only when enabled, and avoid relying on an externally defined VERBOSE symbol.

Copilot uses AI. Check for mistakes.
Comment thread Makefile

stimuli: clean-stimuli
stimuli: clean-stimuli validate-config
python -m verif.python.generate_stimuli \
Copy link

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stimuli target invokes python -m ... while other targets use python3. On systems where python resolves to Python 2 (or is absent), this will fail. Use python3 -m verif.python.generate_stimuli (or a PYTHON ?= python3 variable used consistently) to match the rest of the Makefile.

Suggested change
python -m verif.python.generate_stimuli \
python3 -m verif.python.generate_stimuli \

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Transpose functionality not verified

3 participants