cdurrer/konark by CycyXX · Pull Request #5 · pulp-platform/datamover

CycyXX · 2025-11-14T13:15:21Z

Transpose verification
Introduction of new features:
Data layout conversions for Konark (CIM)
Unfold and fold operations for MobileViT

Copilot

Pull request overview

This PR expands the datamover HWPE’s functionality and verification flow to support transpose verification and new data-layout features (CIM conversion + MobileViT unfold/fold), spanning RTL, SV testbench, and software/HAL tooling.

Changes:

Updates RTL register map/control path to support additional modes and dimensions (incl. packed register scheme).
Extends testbench configuration/stimuli generation and adds Python validation/models.
Reworks/introduces C HAL + SNRT-based software test infrastructure; removes legacy PULP/PMSIS test.

Reviewed changes

Copilot reviewed 27 out of 28 changed files in this pull request and generated 8 comments.

File	Description
verif/tb/tb_package.sv	Moves timing params and adjusts output checking verbosity.
verif/tb/tb_datamover_top_wrap.sv	Updates TB configuration/register programming for new packed registers + mode params.
verif/python/validate_config.py	Adds config validation script used by Make targets.
verif/python/generate_stimuli.py	Major rewrite of stimuli generation to support new modes and header output.
verif/python/datamover_microarchitectural_model.py	Adds a micro-architectural Python model for exploration/debug.
verif/python/datamover_golden_model_numpy.py	Adds NumPy-based golden generation for unfold/fold.
verif/python/datamover_golden_model.py	Adds CLI-driven golden model/header generator for all modes.
test/hal_hwpe.[ch]	Introduces HWPE ctrl HAL.
test/hal_datamover.[ch]	Introduces datamover HAL + blocking helpers for multiple modes.
test/datamover_test.c	Adds SNRT-based datamover functional test using generated `data.h`.
rtl/datamover_package.sv	Adds new mode enum + packed register definitions.
rtl/datamover_top.sv	Binds new register map into streamer/engine control; updates defaults and IO reg count.
rtl/datamover_streamer.sv	Extends HCI interface parameterization (AW) and related plumbing.
rtl/datamover_engine.sv	Implements mode metadata + leftover handling; adds counters and assertions.
modelsim/Makefile	Adjusts VSIM flag handling.
config.mk / config_presets.mk	Replaces old stimulus config with derived preset-driven config system.
Makefile	Adds validation + preset/grid test targets and wires Python tooling into sim.
README.md / CONFIG_USAGE.md	Updates documentation and adds configuration guide.
.gitignore / Bender.yml	Repo housekeeping updates.

Copilot · 2026-04-28T12:39:50Z

+    if (height % patch_sidelength) != 0 or (width % patch_sidelength) != 0:
+        raise ValueError("[GM] Tensor height and width must be multiples of the patch sidelength.")
+    num_patches = (height * width) // patch_size  # number of patches
+    tensor_unfolded = [[[0 for _ in range(channels)] for _ in range(num_patches)] for _ in range(patch_size)]
+    for p in range(patch_size):
+        for n in range(num_patches):
+
+            tensor_unfolded[p][n] = [ tensor[c][h][w] for c in range(channels)]
+
+


unfold() is incomplete and currently references undefined variables (h, w) and never returns tensor_unfolded. As-is, this will raise at runtime (and will also trigger a syntax/logic issue if unfold/fold stimuli generation is enabled). Complete the implementation (define the patch -> (h,w) mapping, fill the output tensor, and return it) or remove the dead stub until supported.

Suggested change

-    if (height % patch_sidelength) != 0 or (width % patch_sidelength) != 0:
-        raise ValueError("[GM] Tensor height and width must be multiples of the patch sidelength.")
-    num_patches = (height * width) // patch_size  # number of patches
-    tensor_unfolded = [[[0 for _ in range(channels)] for _ in range(num_patches)] for _ in range(patch_size)]
-    for p in range(patch_size):
-        for n in range(num_patches):
-            tensor_unfolded[p][n] = [ tensor[c][h][w] for c in range(channels)]
+    if patch_sidelength * patch_sidelength != patch_size:
+        raise ValueError("[GM] Patch size must be a perfect square.")
+    if (height % patch_sidelength) != 0 or (width % patch_sidelength) != 0:
+        raise ValueError("[GM] Tensor height and width must be multiples of the patch sidelength.")
+    num_patches = (height * width) // patch_size  # number of patches
+    patches_per_row = width // patch_sidelength
+    tensor_unfolded = [[[0 for _ in range(channels)] for _ in range(num_patches)] for _ in range(patch_size)]
+    for p in range(patch_size):
+        patch_offset_h = p // patch_sidelength
+        patch_offset_w = p % patch_sidelength
+        for n in range(num_patches):
+            patch_row = n // patches_per_row
+            patch_col = n % patches_per_row
+            h = patch_row * patch_sidelength + patch_offset_h
+            w = patch_col * patch_sidelength + patch_offset_w
+            tensor_unfolded[p][n] = [tensor[c][h][w] for c in range(channels)]
+    return tensor_unfolded
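One way to sanity-check the patch-to-(h, w) mapping in the suggested change is a round trip through an inverse operation. The sketch below is a self-contained pure-Python variant that assumes a nested-list tensor indexed as `tensor[c][h][w]` and infers the shapes from it. The `fold` helper and the function signatures are assumptions added for illustration; they are not taken from the PR's golden model.

```python
def unfold(tensor, patch_sidelength):
    """Rearrange tensor[c][h][w] into unfolded[p][n][c].

    p indexes the position inside a patch, n indexes the patch (row-major).
    """
    channels = len(tensor)
    height = len(tensor[0])
    width = len(tensor[0][0])
    if height % patch_sidelength or width % patch_sidelength:
        raise ValueError("height and width must be multiples of the patch sidelength")
    patch_size = patch_sidelength * patch_sidelength
    patches_per_row = width // patch_sidelength
    num_patches = (height * width) // patch_size
    unfolded = [[None] * num_patches for _ in range(patch_size)]
    for p in range(patch_size):
        off_h, off_w = divmod(p, patch_sidelength)
        for n in range(num_patches):
            patch_row, patch_col = divmod(n, patches_per_row)
            h = patch_row * patch_sidelength + off_h
            w = patch_col * patch_sidelength + off_w
            unfolded[p][n] = [tensor[c][h][w] for c in range(channels)]
    return unfolded


def fold(unfolded, height, width, patch_sidelength):
    """Hypothetical inverse of unfold(): rebuild tensor[c][h][w]."""
    channels = len(unfolded[0][0])
    patches_per_row = width // patch_sidelength
    tensor = [[[0] * width for _ in range(height)] for _ in range(channels)]
    for p, per_patch in enumerate(unfolded):
        off_h, off_w = divmod(p, patch_sidelength)
        for n, pixel in enumerate(per_patch):
            patch_row, patch_col = divmod(n, patches_per_row)
            h = patch_row * patch_sidelength + off_h
            w = patch_col * patch_sidelength + off_w
            for c, value in enumerate(pixel):
                tensor[c][h][w] = value
    return tensor
```

If the index mapping is consistent, `fold(unfold(t, s), height, width, s)` should reproduce `t` for any tensor whose height and width are multiples of `s`.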

Copilot · 2026-04-28T12:39:51Z

+
 def main():
-    # Parse command-line arguments
+        # Parse command-line arguments


main() has inconsistent indentation: line 246 is indented more than the rest of the function body, which will cause an IndentationError and break the stimuli/sim make targets. Align the indentation so all statements inside main() are consistently indented one level.

Suggested change

-        # Parse command-line arguments
+    # Parse command-line arguments
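Whether an over-indented line actually breaks parsing can be checked directly with the built-in `compile()`. The sketch below is an illustration, not part of the PR: the two source snippets are made up, and the helper name `compiles` is hypothetical. It relies on the documented Python behavior that comment-only lines do not take part in indentation tracking, whereas an over-indented statement does raise `IndentationError`.

```python
def compiles(source: str) -> bool:
    """Return True if `source` compiles, False on a syntax or indentation error."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


# Hypothetical snippet: only the comment line is indented deeper than the body.
over_indented_comment = (
    "def main():\n"
    "        # Parse command-line arguments\n"
    "    args = None\n"
)

# Hypothetical snippet: a statement is indented deeper than the body.
over_indented_statement = (
    "def main():\n"
    "    args = None\n"
    "        more = None\n"
)

comment_ok = compiles(over_indented_comment)
statement_ok = compiles(over_indented_statement)
```

Running the same kind of check on the real file (for example with `python3 -m py_compile`) before wiring it into the make targets would show which of the two cases applies. Aligning the comment is still worthwhile for readability either way.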

Copilot · 2026-04-28T12:39:52Z

+                      f"({num_channels}x{size_m}x{size_n}) and element width ({elem_width})")
+
+    # Mode validation (based on config.mk)
+    if datamover_mode not in [0, 1, 2, 3]:


datamover_mode validation currently rejects modes 4 and 5 even though the error message (and config.mk) list unfold/fold as supported. This makes make validate-config fail for unfold/fold configurations. Expand the allowed set (or update the message) so the check matches the supported modes.

Suggested change

-    if datamover_mode not in [0, 1, 2, 3]:
+    if datamover_mode not in [0, 1, 2, 3, 4, 5]:
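A mismatch between the accepted set and the error message, as described above, can be avoided by deriving both from one table. The following is a minimal sketch of that idea, not the PR's `validate_config.py`: the names `SUPPORTED_MODES` and `validate_mode` are hypothetical, and only mode 0 (copy) and the unfold/fold pair at 4 and 5 are named in this review, so the labels for modes 1 to 3 and the unfold-before-fold order are placeholders.

```python
# Single source of truth for the mode check and for the error text.
# Labels for 1-3 are placeholders; the real names live in config.mk.
SUPPORTED_MODES = {
    0: "copy",
    1: "mode 1 (placeholder label)",
    2: "mode 2 (placeholder label)",
    3: "mode 3 (placeholder label)",
    4: "unfold (assumed order)",
    5: "fold (assumed order)",
}


def validate_mode(datamover_mode: int) -> int:
    """Return the mode unchanged if supported, otherwise raise ValueError."""
    if datamover_mode not in SUPPORTED_MODES:
        listing = ", ".join(
            f"{number}={label}" for number, label in sorted(SUPPORTED_MODES.items())
        )
        raise ValueError(
            f"Unsupported datamover_mode {datamover_mode}; supported modes: {listing}"
        )
    return datamover_mode
```

With this shape, adding a mode means adding one dictionary entry, and the message can no longer list a mode that the check rejects.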

Copilot

Pull request overview

Copilot reviewed 27 out of 28 changed files in this pull request and generated 2 comments.

Copilot · 2026-04-28T13:01:13Z

    // Configure packed length registers (see datamover_package.sv)
-    len0_reg = {read_addr.d1_length[7:0], read_addr.d0_length[11:0], read_addr.tot_length[11:0]};
-    len1_reg = {4'b0, read_addr.d1_length[11:8], write_addr.d1_length[11:0], write_addr.d0_length[11:0]};
+    ctrl_engine_reg = {16'b0, write_dim_enable[3:0], read_dim_enable[3:0], 5'b0, transp_mode[2:0]};
+    tensor_dim_reg  = {tensor_size_n[15:0], tensor_size_m[15:0]};


ctrl_engine_reg hard-codes the datamover_mode field to 0 (5'b0), so the RTL always sees COPY mode even when the config/preset selects transpose/CIM/unfold/fold. This affects mode-dependent behavior (e.g., strobe generation, completion counting) and will invalidate transpose/CIM verification.

Populate bits [7:3] of DATAMOVER_REG_CTRL_ENGINE from a stimulus/config macro (e.g., STIM_DATAMOVER_MODE derived from DATAMOVER_MODE) and pass that define through the top-level Makefile.
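The field layout implied by the concatenation in the hunk above can be modelled in a few lines, which is handy for cross-checking the value the testbench writes. This is a sketch under assumptions: the function name `pack_ctrl_engine` is hypothetical, and the bit positions are read off the `{16'b0, write_dim_enable[3:0], read_dim_enable[3:0], <5 bits>, transp_mode[2:0]}` expression, with the 5-bit slot at [7:3] carrying the mode as the comment proposes.

```python
def pack_ctrl_engine(write_dim_enable, read_dim_enable, datamover_mode, transp_mode):
    """Pack the control word: [15:12] write_dim_enable, [11:8] read_dim_enable,
    [7:3] datamover_mode, [2:0] transp_mode. Bits [31:16] stay zero."""
    fields = (
        ("transp_mode", transp_mode, 3, 0),
        ("datamover_mode", datamover_mode, 5, 3),
        ("read_dim_enable", read_dim_enable, 4, 8),
        ("write_dim_enable", write_dim_enable, 4, 12),
    )
    reg = 0
    for name, value, width, lsb in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        reg |= value << lsb
    return reg


def unpack_mode(reg):
    """Extract the 5-bit mode field from bits [7:3]."""
    return (reg >> 3) & 0x1F
```

For instance, `pack_ctrl_engine(0xF, 0x3, 4, 1)` places the mode value 4 in bits [7:3]; with the hard-coded `5'b0`, `unpack_mode` of the written word would always return 0.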

…le generation containing stimuli and golden outputs [gm] renamed golden model script [gm] Extended golden model to write all relevant configuration values to data.h [rtl] Added AW parameter propagation to solve assertion errors [tb,gm] Started adding mode 3: in-layout transposition (not yet complete)

[rtl,tb] Refactored configuration registers to save init cycles and increase bitwidth. Added configurability of d3, d4 and dim_enable. [rtl] Fixed error in matrix_dim assignment, switched transp_mode enum to make it more readable (TRANSP_NONE now corresponds to 2'b00) [rtl,sw] small fixes and misalignment handling additions, added high-level (numpy) and microarchitectural golden models

Copilot

Pull request overview

Copilot reviewed 26 out of 28 changed files in this pull request and generated 3 comments.

Copilot · 2026-04-29T14:36:02Z

+#if VERBOSE
+/* Verbose read/write register */
+#define __HAL_DATAMOVER_REG_WRITE(base, offset, value) do { \
+    *(volatile uint32_t *)(base + offset) = value; \
+    printf("__HAL_DATAMOVER_REG_WRITE: Addr 0x%08x <= 0x%08x\n", (uint32_t)(base + offset), (uint32_t)(value)); \
+  } while(0)
+#define __HAL_DATAMOVER_REG_READ(base, offset) ({ \
+    uint32_t read_value = *(volatile uint32_t *)(base + offset); \
+    printf("__HAL_DATAMOVER_REG_READ: Addr 0x%08x => 0x%08x\n", (uint32_t)(base + offset), (uint32_t)(read_value)); \
+    read_value; \
+  })
+#else
+/* Normal read/write register */
+#define __HAL_DATAMOVER_REG_WRITE(base, offset, value) *(volatile uint32_t *)(base + offset) = value
+#define __HAL_DATAMOVER_REG_READ(base, offset)         *(volatile uint32_t *)(base + offset)
 #endif


The verbose register-access macros are guarded by #if VERBOSE, but VERBOSE is not defined in this header (and in datamover_test.c it’s defined after including this header). This means verbose mode can’t be reliably enabled and, if a TU defines VERBOSE before including, the macros call printf without this header including <stdio.h>. Suggestion: mirror hal_hwpe.h by introducing a header-local switch (e.g., __HAL_DATAMOVER_VERBOSE defaulting to 0), include printf.h/stdio.h only when enabled, and avoid relying on an externally defined VERBOSE symbol.

Copilot · 2026-04-29T14:36:02Z


-stimuli: clean-stimuli
+stimuli: clean-stimuli validate-config
 	python -m verif.python.generate_stimuli \


The stimuli target invokes python -m ... while other targets use python3. On systems where python resolves to Python 2 (or is absent), this will fail. Use python3 -m verif.python.generate_stimuli (or a PYTHON ?= python3 variable used consistently) to match the rest of the Makefile.

Suggested change

-	python -m verif.python.generate_stimuli \
+	python3 -m verif.python.generate_stimuli \

CycyXX requested a review from sermazz November 14, 2025 13:15

CycyXX self-assigned this Nov 14, 2025

sermazz and others added 20 commits November 17, 2025 10:49

rtl,verif: Fix build flow and RTL compilation issues

240d090

Fixing RTL and simulation issues

df9fa61

[mk] Make vsim GUI optional

1c73e92

[tb] fixed packed register problem

85a0459

[mk,tb] added notes and debug outputs

d3e32af

[tb] extended golden model for Konark data layout conversion

9f664e4

[tb] separated 2d and 3d function in golden model

45be765

[tb] fixed issue introduced while rebasing to smazzola/multi-precision branch

6e74913

[tb] Worked on transpose mode and golden model

70506c1

bender: bump hwpe-ctrl version

c8ba826

Fix configuration definitions for matrix mode and updated golden model

835a8b4

[tb, rtl] Using dimension d2 for streaming out data in transpose mode: added d2_stride and set dim_enable_1h accordingly

63db310

[tb] Added support for TRANSP_MODE configurations 2 and 4

e753fa1

[tb] Fixed bug with d2_stride parameter

03caf11

[tb] Created a testing setup with presets, configuration validation, running multiple tests consecutively etc.

e1f3928

[tb] Improved testing flow; added parameter grid tests, improved error reporting

6b2a5ac

[rtl] Fixed issue which prevented WORD_WIDTH > 32 from working properly

6335287

[rtl,tb] Implemented data layout conversion mode for CIM and corresponding tests

0dac05f

[tb] Improved golden model to handle non-word-aligned matrix sizes

35fcddf

[tb] Introduced MISALIGNED_ACCESSES parameter, switched copy mode order (d0/d1)

bedb480

CycyXX force-pushed the cdurrer/konark branch from ce62142 to bedb480 Compare November 17, 2025 09:51

CycyXX added 2 commits November 20, 2025 09:44

[rtl,tb] Started implementing misaligned access for non-word-aligned matrices

2497da9

[mk,rtl] cleanup and deactivated strb for misaligned access (not yet working properly)

a587f1c

CycyXX changed the title ~~Cdurrer/konark~~ cdurrer/konark Mar 25, 2026

CycyXX linked an issue Mar 25, 2026 that may be closed by this pull request

Transpose functionality not verified #3

Closed

CycyXX requested a review from FrancescoConti April 28, 2026 12:32

CycyXX marked this pull request as ready for review April 28, 2026 12:34

Copilot AI review requested due to automatic review settings April 28, 2026 12:34

Copilot started reviewing on behalf of CycyXX April 28, 2026 12:35

Copilot AI reviewed Apr 28, 2026

CycyXX requested a review from Copilot April 28, 2026 12:54

Copilot started reviewing on behalf of CycyXX April 28, 2026 12:55

Copilot AI reviewed Apr 28, 2026

CycyXX added 7 commits April 29, 2026 16:17

[rtl, tb, gm] Implemented partial tile support for copy mode

45322e0

[rtl] Implemented partial tile handling for 1-element transpose (limitation: matrix N-size needs to be word-aligned)

13d42d5

[rtl, golden model] Introduced CIM layout conversion and reverse modes, including partial tile handling

6116f1b

[rtl, golden model] Implemented unfold and fold modes (MobileViT)

a58e8a0

Renamed matrix to tensor after introducing channel dimension, code cleanup

d58a985

CycyXX force-pushed the cdurrer/konark branch from 8078517 to d58a985 Compare April 29, 2026 14:26

CycyXX requested a review from Copilot April 29, 2026 14:29

Copilot started reviewing on behalf of CycyXX April 29, 2026 14:29

Copilot AI reviewed Apr 29, 2026

CycyXX wants to merge 29 commits into main from cdurrer/konark

3 participants
