Conversation
…: added d2_stride and set dim_enable_1h accordingly
…running multiple tests consecutively etc.
ce62142 to
bedb480
Compare
There was a problem hiding this comment.
Pull request overview
This PR expands the datamover HWPE’s functionality and verification flow to support transpose verification and new data-layout features (CIM conversion + MobileViT unfold/fold), spanning RTL, SV testbench, and software/HAL tooling.
Changes:
- Updates RTL register map/control path to support additional modes and dimensions (incl. packed register scheme).
- Extends testbench configuration/stimuli generation and adds Python validation/models.
- Reworks/introduces C HAL + SNRT-based software test infrastructure; removes legacy PULP/PMSIS test.
Reviewed changes
Copilot reviewed 27 out of 28 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| verif/tb/tb_package.sv | Moves timing params and adjusts output checking verbosity. |
| verif/tb/tb_datamover_top_wrap.sv | Updates TB configuration/register programming for new packed registers + mode params. |
| verif/python/validate_config.py | Adds config validation script used by Make targets. |
| verif/python/generate_stimuli.py | Major rewrite of stimuli generation to support new modes and header output. |
| verif/python/datamover_microarchitectural_model.py | Adds a micro-architectural Python model for exploration/debug. |
| verif/python/datamover_golden_model_numpy.py | Adds NumPy-based golden generation for unfold/fold. |
| verif/python/datamover_golden_model.py | Adds CLI-driven golden model/header generator for all modes. |
| test/hal_hwpe.[ch] | Introduces HWPE ctrl HAL. |
| test/hal_datamover.[ch] | Introduces datamover HAL + blocking helpers for multiple modes. |
| test/datamover_test.c | Adds SNRT-based datamover functional test using generated data.h. |
| rtl/datamover_package.sv | Adds new mode enum + packed register definitions. |
| rtl/datamover_top.sv | Binds new register map into streamer/engine control; updates defaults and IO reg count. |
| rtl/datamover_streamer.sv | Extends HCI interface parameterization (AW) and related plumbing. |
| rtl/datamover_engine.sv | Implements mode metadata + leftover handling; adds counters and assertions. |
| modelsim/Makefile | Adjusts VSIM flag handling. |
| config.mk / config_presets.mk | Replaces old stimulus config with derived preset-driven config system. |
| Makefile | Adds validation + preset/grid test targets and wires Python tooling into sim. |
| README.md / CONFIG_USAGE.md | Updates documentation and adds configuration guide. |
| .gitignore / Bender.yml | Repo housekeeping updates. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (height % patch_sidelength) != 0 or (width % patch_sidelength) != 0: | ||
| raise ValueError("[GM] Tensor height and width must be multiples of the patch sidelength.") | ||
| num_patches = (height * width) // patch_size # number of patches | ||
| tensor_unfolded = [[[0 for _ in range(channels)] for _ in range(num_patches)] for _ in range(patch_size)] | ||
| for p in range(patch_size): | ||
| for n in range(num_patches): | ||
|
|
||
| tensor_unfolded[p][n] = [ tensor[c][h][w] for c in range(channels)] | ||
|
|
||
|
|
There was a problem hiding this comment.
unfold() is incomplete and currently references undefined variables (h, w) and never returns tensor_unfolded. As-is, this will raise at runtime (and will also trigger a syntax/logic issue if unfold/fold stimuli generation is enabled). Complete the implementation (define the patch -> (h,w) mapping, fill the output tensor, and return it) or remove the dead stub until supported.
| if (height % patch_sidelength) != 0 or (width % patch_sidelength) != 0: | |
| raise ValueError("[GM] Tensor height and width must be multiples of the patch sidelength.") | |
| num_patches = (height * width) // patch_size # number of patches | |
| tensor_unfolded = [[[0 for _ in range(channels)] for _ in range(num_patches)] for _ in range(patch_size)] | |
| for p in range(patch_size): | |
| for n in range(num_patches): | |
| tensor_unfolded[p][n] = [ tensor[c][h][w] for c in range(channels)] | |
| if patch_sidelength * patch_sidelength != patch_size: | |
| raise ValueError("[GM] Patch size must be a perfect square.") | |
| if (height % patch_sidelength) != 0 or (width % patch_sidelength) != 0: | |
| raise ValueError("[GM] Tensor height and width must be multiples of the patch sidelength.") | |
| num_patches = (height * width) // patch_size # number of patches | |
| patches_per_row = width // patch_sidelength | |
| tensor_unfolded = [[[0 for _ in range(channels)] for _ in range(num_patches)] for _ in range(patch_size)] | |
| for p in range(patch_size): | |
| patch_offset_h = p // patch_sidelength | |
| patch_offset_w = p % patch_sidelength | |
| for n in range(num_patches): | |
| patch_row = n // patches_per_row | |
| patch_col = n % patches_per_row | |
| h = patch_row * patch_sidelength + patch_offset_h | |
| w = patch_col * patch_sidelength + patch_offset_w | |
| tensor_unfolded[p][n] = [tensor[c][h][w] for c in range(channels)] | |
| return tensor_unfolded |
|
|
||
| def main(): | ||
| # Parse command-line arguments | ||
| # Parse command-line arguments |
There was a problem hiding this comment.
main() has inconsistent indentation: line 246 is indented more than the rest of the function body, which will cause an IndentationError and break the stimuli/sim make targets. Align the indentation so all statements inside main() are consistently indented one level.
| # Parse command-line arguments | |
| # Parse command-line arguments |
| f"({num_channels}x{size_m}x{size_n}) and element width ({elem_width})") | ||
|
|
||
| # Mode validation (based on config.mk) | ||
| if datamover_mode not in [0, 1, 2, 3]: |
There was a problem hiding this comment.
datamover_mode validation currently rejects modes 4 and 5 even though the error message (and config.mk) list unfold/fold as supported. This makes make validate-config fail for unfold/fold configurations. Expand the allowed set (or update the message) so the check matches the supported modes.
| if datamover_mode not in [0, 1, 2, 3]: | |
| if datamover_mode not in [0, 1, 2, 3, 4, 5]: |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 27 out of 28 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Configure packed length registers (see datamover_package.sv) | ||
| len0_reg = {read_addr.d1_length[7:0], read_addr.d0_length[11:0], read_addr.tot_length[11:0]}; | ||
| len1_reg = {4'b0, read_addr.d1_length[11:8], write_addr.d1_length[11:0], write_addr.d0_length[11:0]}; | ||
| ctrl_engine_reg = {16'b0, write_dim_enable[3:0], read_dim_enable[3:0], 5'b0, transp_mode[2:0]}; | ||
| tensor_dim_reg = {tensor_size_n[15:0], tensor_size_m[15:0]}; |
There was a problem hiding this comment.
ctrl_engine_reg hard-codes the datamover_mode field to 0 (5'b0), so the RTL always sees COPY mode even when the config/preset selects transpose/CIM/unfold/fold. This affects mode-dependent behavior (e.g., strobe generation, completion counting) and will invalidate transpose/CIM verification.
Populate bits [7:3] of DATAMOVER_REG_CTRL_ENGINE from a stimulus/config macro (e.g., STIM_DATAMOVER_MODE derived from DATAMOVER_MODE) and pass that define through the top-level Makefile.
…le generation containing stimuli and golden outputs [gm] renamed golden model script [gm] Extended golden model to write all relevant configuration values to data.h [rtl] Added AW parameter propagation to solve assertion errors [tb,gm] Started adding mode 3: in-layout transposition (not yet complete)
[rtl,tb] Refactored configuration registers to save init cycles and increase bitwidth. Added configurability of d3, d4 and dim_enable. [rtl] Fixed error in matrix_dim assignment, switched transp_mode enum to make it more readable (TRANSP_NONE now corresponds to 2'b00) [rtl,sw] small fixes and misalignment handling additions, added high-level (numpy) and microarchitectural golden models
…tation: matrix N-size needs to be word-aligned)
…s, including partial tile handling
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 26 out of 28 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| #if VERBOSE | ||
| /* Verbose read/write register */ | ||
| #define __HAL_DATAMOVER_REG_WRITE(base, offset, value) do { \ | ||
| *(volatile uint32_t *)(base + offset) = value; \ | ||
| printf("__HAL_DATAMOVER_REG_WRITE: Addr 0x%08x <= 0x%08x\n", (uint32_t)(base + offset), (uint32_t)(value)); \ | ||
| } while(0) | ||
| #define __HAL_DATAMOVER_REG_READ(base, offset) ({ \ | ||
| uint32_t read_value = *(volatile uint32_t *)(base + offset); \ | ||
| printf("__HAL_DATAMOVER_REG_READ: Addr 0x%08x => 0x%08x\n", (uint32_t)(base + offset), (uint32_t)(read_value)); \ | ||
| read_value; \ | ||
| }) | ||
| #else | ||
| /* Normal read/write register */ | ||
| #define __HAL_DATAMOVER_REG_WRITE(base, offset, value) *(volatile uint32_t *)(base + offset) = value | ||
| #define __HAL_DATAMOVER_REG_READ(base, offset) *(volatile uint32_t *)(base + offset) | ||
| #endif |
There was a problem hiding this comment.
The verbose register-access macros are guarded by #if VERBOSE, but VERBOSE is not defined in this header (and in datamover_test.c it’s defined after including this header). This means verbose mode can’t be reliably enabled and, if a TU defines VERBOSE before including, the macros call printf without this header including <stdio.h>. Suggestion: mirror hal_hwpe.h by introducing a header-local switch (e.g., __HAL_DATAMOVER_VERBOSE defaulting to 0), include printf.h/stdio.h only when enabled, and avoid relying on an externally defined VERBOSE symbol.
|
|
||
| stimuli: clean-stimuli | ||
| stimuli: clean-stimuli validate-config | ||
| python -m verif.python.generate_stimuli \ |
There was a problem hiding this comment.
The stimuli target invokes python -m ... while other targets use python3. On systems where python resolves to Python 2 (or is absent), this will fail. Use python3 -m verif.python.generate_stimuli (or a PYTHON ?= python3 variable used consistently) to match the rest of the Makefile.
| python -m verif.python.generate_stimuli \ | |
| python3 -m verif.python.generate_stimuli \ |
Uh oh!
There was an error while loading. Please reload this page.