stm32/powerctrl: Add bounded waits, HSEM and CPU2 checks to STM32WB clock paths.#36

Open
andrewleech wants to merge 2 commits into master from fix/stm32-powerctrl-timeout

@andrewleech andrewleech commented Apr 12, 2026

Summary

I've been hitting intermittent WDT resets on a custom STM32WB55 board that changes SYSCLK at runtime via machine.freq(). The device runs at 4 MHz MSI in a low-power idle state and boosts to 8/16 MHz for data collection and file I/O. The lockup was always silent: the last log message showed a successful frequency change, then nothing until the watchdog fired.

Digging into it, the MSI clock path in powerctrl_set_sysclk() has six unbounded while loops waiting on hardware flags (regulator ready, VOS, flash latency, MSI ready, clock switch, PLL shutdown). SystemClock_Config() has another four (HSEM, HSE, PLL, clock switch). If any of these flags never asserts, the CPU spins forever.

The MSI path is also missing a couple of protections that the PLL path and the stop-mode path already have. It modifies RCC registers without acquiring CFG_HW_RCC_SEMID, so CPU2's wireless stack sequencer can collide on the same registers. And it doesn't check CPU2's power state before dropping SYSCLK below 32 MHz. The wireless stack needs HCLK2 >= 32 MHz when CPU2 is running (AN5289 Section 4.3), and no C2HPRE prescaler can multiply a sub-32 MHz SYSCLK up to that. If CPU2 hasn't entered deep sleep after BLE deactivation (the latency is unspecified per ST's docs), the HCLK2 violation can crash the wireless stack and leave HSEM semaphores held indefinitely, deadlocking CPU1.

I ran a stress test cycling the device between temperature states that trigger these frequency transitions. Before this fix there were 5 WDT resets in ~80 cycles over 10 minutes. After it, 154 cycles over 20 minutes produced zero clock-related resets.

This PR:

  • Adds a RCC_WAIT bounded countdown macro in powerctrl.h. Uses an iteration count rather than HAL_GetTick because SysTick rate is changing during the clock transition itself (same approach as ADC_WAIT in machine_adc.c). At 2 MHz MSI, 100000 iterations is ~200 ms.
  • Replaces all ten unbounded loops with RCC_WAIT, returning -MP_ETIMEDOUT on timeout instead of hanging.
  • Acquires CFG_HW_RCC_SEMID around the MSI path's RCC modifications on STM32WB, matching SystemClock_Config() and powerctrl_low_power_prep_wb55().
  • Checks C2DS/C2SB before SYSCLK < 32 MHz on STM32WB, returns -MP_EPERM if CPU2 is awake.
  • Changes SystemClock_Config() to return int so failures propagate. Split cleanup labels (fail_clk48, fail_rcc) release both CLK48 and RCC semaphores correctly.
  • mp_machine_set_freq() raises OSError for the new error codes instead of calling MICROPY_BOARD_FATAL_ERROR.
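The iteration-count wait described in the first two bullets can be sketched roughly as follows. This is a minimal illustration, not the macro from the PR: `RCC_WAIT_SKETCH`, `RCC_WAIT_ITERATIONS`, `fake_status` and `FAKE_READY` are hypothetical names, the status word is a plain variable standing in for a hardware register, and the standard `ETIMEDOUT` stands in for `MP_ETIMEDOUT`.

```c
#include <stdint.h>
#include <errno.h>

// Hypothetical iteration budget; the description above quotes 100000 iterations.
#define RCC_WAIT_ITERATIONS (100000u)

// Spin while `cond` is true, giving up once the iteration budget is used up.
#define RCC_WAIT_SKETCH(cond) \
    do { \
        uint32_t rcc_wait_n = RCC_WAIT_ITERATIONS; \
        while ((cond) && --rcc_wait_n) { \
        } \
    } while (0)

// Stand-in for a hardware status register (assumption, not a real RCC register).
static volatile uint32_t fake_status;
#define FAKE_READY (1u << 1)

// Returns 0 once the ready flag is seen, -ETIMEDOUT if the budget runs out first.
static int wait_for_ready(void) {
    RCC_WAIT_SKETCH((fake_status & FAKE_READY) == 0);
    if ((fake_status & FAKE_READY) == 0) {
        // The flag never asserted: report the failure instead of hanging.
        return -ETIMEDOUT;
    }
    return 0;
}
```

Counting iterations keeps the bound independent of any timer whose rate changes along with SYSCLK, at the cost of a wall-clock timeout that scales with the CPU frequency.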

References:

  • RM0434 Rev 10 Section 7.2.17: RCC register access through HSEM
  • RM0434 Rev 10 Section 5.4.2: Dynamic voltage scaling
  • AN5289 Rev 8 Section 4.3: CPU2 clock requirements (HCLK2 >= 32 MHz)
  • AN5289 Rev 8 Section 4.7: HSEM usage for clock configuration

Testing

Tested on a custom STM32WB55 board with asyncio app, BLE wireless stack v1.22, and hardware watchdog.

Before: stress test cycling between 4 MHz idle and 8-16 MHz active, 5 WDT resets in 10 minutes (~80 cycles). Every lockup was during machine.freq() itself, before any application code ran at the new frequency.

After: same test, 154 cycles over 20 minutes, zero clock-related WDT resets. Both successful data collection cycles completed the full 8 -> 16 -> 8 MHz sequence cleanly.

Also verified:

  • All MSI frequencies (4, 8, 16, 24, 32, 48 MHz) work with BLE inactive (CPU2 in deep sleep)
  • machine.freq(4000000) with BLE active raises OSError("can't change freq") as expected (CPU2 awake)
  • Boot-time SystemClock_Config still works normally

Trade-offs and Alternatives

The RCC_WAIT(cond); if (cond) pattern evaluates the condition twice at the boundary. For volatile register reads this is just a benign re-read. For LL_HSEM_1StepLock the second call on an already-held semaphore returns success (same-core re-lock per STM32WB HSEM behaviour), so it's safe but does one redundant MMIO access per successful acquire.
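A small host-side sketch makes the double evaluation visible. It assumes a mock lock function that, like the behaviour described above, returns 0 on success (including a same-core re-lock) and non-zero on failure; `mock_hsem_1step_lock`, `lock_attempts` and `acquire_with_recheck` are hypothetical names used only for this illustration.

```c
#include <stdint.h>

#define WAIT_ITERATIONS (100000u)

// Counts every call to the mock, i.e. every simulated MMIO access.
static unsigned lock_attempts;

// Mock lock attempt: always succeeds, and re-locking by the same caller also succeeds.
static uint32_t mock_hsem_1step_lock(void) {
    ++lock_attempts;
    return 0; // 0 = locked, non-zero = lock failed
}

// Bounded wait followed by a re-check of the same side-effecting condition.
static int acquire_with_recheck(void) {
    uint32_t n = WAIT_ITERATIONS;
    while (mock_hsem_1step_lock() && --n) {
    }
    if (mock_hsem_1step_lock()) {
        return -1; // still not locked after the budget ran out
    }
    return 0;
}
```

On a successful acquire the mock is called twice: once inside the wait loop and once in the re-check, which is the redundant access mentioned above.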

Non-WB SystemClock_Config implementations (F0, G0, H5, L0, L1, N6, WL) still have unbounded waits in their own oscillator/PLL loops. They're updated to return int for API consistency but don't use RCC_WAIT. The WL MSI path does get the bounded waits though since it shares the powerctrl_set_sysclk code with WB.

Generative AI

I used generative AI tools when creating this PR, but a human has checked the code and is responsible for the description above.

@andrewleech andrewleech changed the title from "stm32/powerctrl: Add bounded waits, HSEM and CPU2 checks to clock paths." to "stm32/powerctrl: Add bounded waits, HSEM and CPU2 checks to STM32WB clock paths." Apr 12, 2026
@andrewleech andrewleech force-pushed the fix/stm32-powerctrl-timeout branch from f88930b to dd6d4a1 Compare April 12, 2026 11:27
github-actions Bot commented Apr 12, 2026

Code size report:

Reference:  github/workflows: Bump codecov/codecov-action from 5 to 6. [8c6dfa5]
Comparison: stm32/powerctrlboot: Add bounded waits to WB55 SystemClock_Config. [merge of 148b397]
  mpy-cross:    +0 +0.000% 
   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:   +96 +0.024% PYBV10
      esp32:    +0 +0.000% ESP32_GENERIC
     mimxrt:    +0 +0.000% TEENSY40
        rp2:    +0 +0.000% RPI_PICO_W
       samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +0 +0.000% VIRT_RV32

@andrewleech andrewleech force-pushed the fix/stm32-powerctrl-timeout branch 7 times, most recently from eb38f70 to e14da59 Compare April 13, 2026 01:32
pi-anl added 2 commits April 14, 2026 10:43
The MSI clock path in powerctrl_set_sysclk() (used by STM32WB and
STM32WL) has six unbounded while loops that permanently hang CPU1 if
any hardware flag fails to assert. On STM32WB55, it also modifies RCC
registers without acquiring the RCC hardware semaphore, and does not
check CPU2's power state before reducing SYSCLK below 32 MHz.

Add a tick-based RCC_WAIT(cond, timeout_ms, result) macro using
mp_hal_ticks_ms and replace all unbounded while loops. Timeout values
match the STM32 HAL: 2ms for MSI/PLL/VOS, 100ms for HSE, 500ms for
HSEM, 5000ms for clock switch. On timeout, return -MP_ETIMEDOUT.

The macro evaluates cond at most once per iteration and breaks
immediately on success, avoiding re-evaluation of side-effecting
conditions like LL_HSEM_1StepLock (which performs a hardware lock
attempt on every RLR register read).
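
A tick-based wait with a result out-parameter, evaluating the condition once per iteration, might look like the sketch below. It is an illustration under assumptions, not the macro from this commit: `TICK_WAIT_SKETCH` is a hypothetical name, `fake_ticks_ms()` is a host-side stand-in for mp_hal_ticks_ms, the condition is assumed to mean "keep waiting while true", and the standard `ETIMEDOUT` stands in for `MP_ETIMEDOUT`.

```c
#include <stdint.h>
#include <errno.h>

// Host-side stand-in for a millisecond tick source: advances by 1 on every call.
static uint32_t fake_now_ms;
static uint32_t fake_ticks_ms(void) {
    return fake_now_ms++;
}

// Wait while `cond` is true. `cond` is evaluated once per iteration and the
// loop exits as soon as it reads false, so a side-effecting condition is
// never evaluated again after it has succeeded.
#define TICK_WAIT_SKETCH(cond, timeout_ms, result) \
    do { \
        uint32_t tick_wait_start = fake_ticks_ms(); \
        (result) = -ETIMEDOUT; \
        do { \
            if (!(cond)) { \
                (result) = 0; \
                break; \
            } \
        } while ((uint32_t)(fake_ticks_ms() - tick_wait_start) < (timeout_ms)); \
    } while (0)

// Side-effecting mock condition: reports "busy" `busy_for` times, then clears.
static unsigned cond_evals;
static int busy_for;
static int still_busy(void) {
    ++cond_evals;
    if (busy_for > 0) {
        --busy_for;
        return 1;
    }
    return 0;
}

static int wait_sketch(uint32_t timeout_ms) {
    int result;
    TICK_WAIT_SKETCH(still_busy(), timeout_ms, result);
    return result;
}
```

With a condition that is already clear, `still_busy()` runs exactly once; with one that never clears, the loop gives up once the tick difference reaches the timeout.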

On STM32WB, acquire CFG_HW_RCC_SEMID around RCC modifications,
matching SystemClock_Config() and powerctrl_low_power_prep_wb55().
Check C2DS/C2SB before SYSCLK < 32 MHz; wait up to 1 second for
CPU2 to enter deep sleep, returning -MP_EPERM if it doesn't. The
error is raised before any RCC modification, so SystemCoreClock and
SysTick remain correct and the caller can retry.
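
The ordering described here (check CPU2 first, touch RCC only afterwards) can be sketched on a host as follows. All names are hypothetical: `fake_c2ds` and `fake_c2sb` stand in for the C2DS/C2SB flags and are assumed to be set when CPU2 is in the corresponding low-power state, `fake_ticks_ms()` stands in for a millisecond tick source, `rcc_touched` marks where the real RCC writes would begin, and the standard `EPERM` stands in for `MP_EPERM`.

```c
#include <stdint.h>
#include <errno.h>

// Stand-ins for the CPU2 power-state flags (assumption: 1 = CPU2 is in that state).
static int fake_c2ds;
static int fake_c2sb;

// Stand-in tick source that advances by 1 ms on every call.
static uint32_t fake_now_ms;
static uint32_t fake_ticks_ms(void) {
    return fake_now_ms++;
}

// Set once the sketch reaches the point where RCC registers would be modified.
static int rcc_touched;

static int cpu2_is_asleep(void) {
    return fake_c2ds || fake_c2sb;
}

static int set_sysclk_sketch(uint32_t sysclk_hz) {
    if (sysclk_hz < 32000000u) {
        // Below 32 MHz, give CPU2 up to 1 second to reach a low-power state.
        uint32_t start = fake_ticks_ms();
        while (!cpu2_is_asleep()) {
            if ((uint32_t)(fake_ticks_ms() - start) >= 1000u) {
                // Nothing has been modified yet, so the caller can retry later.
                return -EPERM;
            }
        }
    }
    rcc_touched = 1; // the real code would start changing RCC state here
    return 0;
}
```

Because the check runs before any clock register is written, a refused request leaves the system exactly as it was.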

Also fix REGLPF wait polarity: the original code waited while the
flag was clear (main regulator ready), matching neither the HAL
implementation nor the RM0434 description. Now waits while set
(low-power regulator still active), consistent with the STM32 HAL
HAL_PWREx_DisableLowPowerRunMode().
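
The corrected polarity can be illustrated with a mock flag: keep waiting while the flag reads set, and stop once it reads clear. `fake_reglpf`, `reads_until_main` and `leave_low_power_run_sketch` are hypothetical names, the budget is a plain iteration count rather than the timeout used in the commit, and the standard `ETIMEDOUT` stands in for `MP_ETIMEDOUT`.

```c
#include <errno.h>

// Mock REGLPF flag: reads as 1 (low-power regulator still in use) for the
// next `reads_until_main` reads, then as 0 (main regulator in use).
static unsigned reads_until_main;
static int fake_reglpf(void) {
    if (reads_until_main > 0) {
        --reads_until_main;
        return 1;
    }
    return 0;
}

// Wait while the flag is set. `budget` must be at least 1.
static int leave_low_power_run_sketch(unsigned budget) {
    while (fake_reglpf()) {
        if (--budget == 0) {
            return -ETIMEDOUT;
        }
    }
    return 0;
}
```

With the inverted polarity the loop would exit immediately while the low-power regulator is still active, which is the case the fix addresses.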

Extend mp_machine_set_freq() to raise OSError for the new error codes.

References:
- RM0434 Rev 10 Section 7.2.17: RCC register access through HSEM
- AN5289 Rev 8 Section 4.3: CPU2 clock requirements (HCLK2 >= 32 MHz)

Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>
The STM32WB SystemClock_Config() (64 MHz PLL path) has four unbounded
while loops for HSEM acquire, HSE ready, PLL lock, and SYSCLK switch.

Apply RCC_WAIT with appropriate timeouts to all four loops. Use
separate cleanup labels (fail_clk48, fail_rcc) to release both
CFG_HW_CLK48_CONFIG and CFG_HW_RCC semaphores correctly on
post-CLK48-acquire failures.
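
The stacked cleanup labels can be sketched with mock semaphores. This is a host-side illustration only: `sem_take`, `sem_release`, `sem_held` and `clock_config_sketch` are hypothetical, failures are injected through `fail_at`, the standard `ETIMEDOUT` stands in for `MP_ETIMEDOUT`, and releasing both semaphores on the success path is a simplification of this sketch rather than something the commit message states.

```c
#include <errno.h>

enum { SEM_RCC, SEM_CLK48, SEM_COUNT };

// Mock semaphore state: 1 = held by this core.
static int sem_held[SEM_COUNT];
static void sem_take(int id) {
    sem_held[id] = 1;
}
static void sem_release(int id) {
    sem_held[id] = 0;
}

// fail_at: 0 = no failure,
//          1 = a wait times out while only the RCC semaphore is held,
//          2 = a wait times out after the CLK48 semaphore was also taken.
static int clock_config_sketch(int fail_at) {
    int ret = 0;

    sem_take(SEM_RCC);
    if (fail_at == 1) {
        ret = -ETIMEDOUT;
        goto fail_rcc; // CLK48 was never taken, so skip its release
    }

    sem_take(SEM_CLK48);
    if (fail_at == 2) {
        ret = -ETIMEDOUT;
        goto fail_clk48; // release CLK48 first, then fall through to RCC
    }

    // Simplification: the success path falls through and releases both.
fail_clk48:
    sem_release(SEM_CLK48);
fail_rcc:
    sem_release(SEM_RCC);
    return ret;
}
```

Ordering the labels in reverse acquisition order means each failure point jumps to the label matching what it currently holds, and everything taken earlier is released by fall-through.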

Change SystemClock_Config() from void to int across all STM32 targets
for API consistency. Non-WB implementations return 0 unconditionally.

Signed-off-by: Andrew Leech <andrew.leech@planetinnovation.com.au>
@andrewleech andrewleech force-pushed the fix/stm32-powerctrl-timeout branch from e14da59 to 148b397 Compare April 14, 2026 00:43
@andrewleech andrewleech force-pushed the master branch 7 times, most recently from ce2c0c9 to 9f396bb Compare May 1, 2026 23:21