Voice SDK: Updated ForceEndOfUtterance to support padding the timestamp by sam-s10s · Pull Request #117 · speechmatics/speechmatics-python-sdk

sam-s10s · 2026-06-26T08:43:53Z

When using the timestamp attribute with ForceEndOfUtterance messages, users have reported that some short utterances or single words are missing from the transcript.

The `finalize()` function of `VoiceAgentClient` now supports a `pad` argument (defaults to 0.2s), which pads the timestamp sent with the ForceEndOfUtterance (FEOU) message.
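As a rough illustration of the padding step, here is a minimal sketch. The helper name `padded_feou_timestamp` is hypothetical and not part of the SDK, and it assumes the pad is added to the timestamp so that the forced end of utterance lands slightly later; the actual direction and rounding in the SDK may differ.

```python
DEFAULT_PAD_SECONDS = 0.2  # default quoted in the PR description


def padded_feou_timestamp(timestamp: float, pad: float = DEFAULT_PAD_SECONDS) -> float:
    """Return the timestamp to send with a ForceEndOfUtterance message.

    Assumption (not confirmed by the PR text): the pad is added, moving the
    timestamp later so trailing short words fall inside the utterance.
    """
    if pad < 0:
        raise ValueError("pad must be non-negative")
    # Round to milliseconds to avoid float noise in logs and messages.
    return round(timestamp + pad, 3)


if __name__ == "__main__":
    print(padded_feou_timestamp(1.5))           # default pad
    print(padded_feou_timestamp(1.5, pad=0.0))  # padding disabled
```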

Tests to follow.

Refactor the EndOfTurnPenaltyItem logic to improve clarity and functionality. Group related penalty items with descriptive comments for better maintainability. Adjust penalties for situations with Smart Turn and VAD to improve detection accuracy, including new conditions for SMART_TURN_FALSE and ACTIVE combinations. This change is necessary to fine-tune the configuration for complex speech patterns and ensure better end-of-turn detection in the transcription process.

Introduce `test_no_feou_fix.py` to validate scenarios where Forced End Of Utterance (FEOU) is disabled. This test ensures correct behavior when the end-of-utterance mode is set to FIXED in the `VoiceAgentConfig`. Utilize additional vocabulary and message logging for enhanced debugging. Skipped in CI to avoid unnecessary API calls without a valid key.

Add `validate_config` method to `VoiceAgentConfig` to ensure cross-field validation post-merging. This enhances the robustness of configurations by checking for inconsistencies and errors, such as ensuring valid combinations of end-of-utterance modes and features like VAD, and sample rates. Enhance preset functionality by validating merged configurations. This ensures that custom configurations derived from presets are validated before use, preventing runtime errors due to invalid configurations. Drop use of `model_validator` for clearer validation flow and improve error reporting by raising specific exceptions for validation failures.
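The pattern described above (merge a preset with overrides, then run cross-field checks and raise a specific exception) might look roughly like the sketch below. `SketchConfig`, its fields, its allowed values, and the two rules are all hypothetical stand-ins chosen for illustration; the real `VoiceAgentConfig` schema and its validation rules are not shown in this PR text.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class SketchConfig:
    # Hypothetical fields; the real VoiceAgentConfig has its own schema.
    end_of_utterance_mode: str = "adaptive"
    vad_enabled: bool = True
    sample_rate: int = 16000

    def validate_config(self) -> None:
        """Collect every cross-field problem, then raise once."""
        errors: List[str] = []
        # Illustrative rule 1: only two sample rates are accepted here.
        if self.sample_rate not in (8000, 16000):
            errors.append(f"unsupported sample_rate: {self.sample_rate}")
        # Illustrative rule 2: one mode depends on another feature.
        if self.end_of_utterance_mode == "adaptive" and not self.vad_enabled:
            errors.append("adaptive end-of-utterance mode requires VAD")
        if errors:
            raise ValueError("; ".join(errors))


def from_preset(preset: SketchConfig, **overrides) -> SketchConfig:
    """Merge overrides into a preset and validate the merged result."""
    merged = replace(preset, **overrides)
    merged.validate_config()
    return merged
```

Validating after the merge, rather than per field, is what catches combinations that are individually valid but inconsistent together.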

Set `use_forced_eou` to True in EndOfTurnConfig to ensure correct behavior for utterance detection. Previously, `use_forced_eou` was set to False, which could lead to inaccurate turn-taking scenarios. Added validation in `validate_config` to prevent setting `use_forced_eou` to False, ensuring configurations remain consistent with intended usage and avoiding potential run-time errors.

Remove redundant flags and streamline end-of-utterance (EOU) and voice activity detection (VAD) handling in the VoiceAgentClient class. Changes include:
- Rename confusing boolean flags to improve clarity.
- Simplify logic for determining when to listen to EOU messages.
- Remove unused code paths and clean up comments for better readability.
- Combine similar conditional logic to avoid duplicated checks.

These changes are intended to make the codebase more maintainable, reduce potential for errors, and improve overall performance.

…n VoiceAgentConfig Remove the conditional validation logic for 'use_forced_eou' within the 'VoiceAgentConfig.validate_config' method. This logic was enforcing that 'EndOfTurnConfig.use_forced_eou' cannot be False, which is no longer required. This change streamlines the validation process, aligning it with updated requirements, and ensuring clarity around utterance handling configurations. Additionally, cleanup of imports of 'EndOfTurnConfig' in test files reflects this update.

Refactor turn management logic to ensure better handling of forced End of Utterance (EOU) configurations. While FEOU cannot be disabled in normal use, it can be disabled for testing by directly manipulating the config value: `config.end_of_turn_config.use_forced_eou = False`

Remove unnecessary boolean conversion for 'end_of_utterance_mode' check and update the conditional logic for '_listen_to_eou_messages'. This resolves a logical error that prevented proper handling of 'fixed' end of utterance mode, and ensures the client correctly listens or doesn't listen to EOU messages based on '_listen_to_eou_messages' state. These changes enhance the processing of end of utterance events, improving overall speech-to-text functionality.

Extract the configuration setup into a separate `config` variable to improve readability and maintainability. Add debug print statements for configuration details to aid in debugging. Move client disconnect logic to the end of the test to ensure the connection is properly closed, improving resource management.

Add FFT-based resampling in SmartTurnDetector for non-16kHz audio. Parameterise Silero VAD chunk/context sizes to handle both 8kHz and 16kHz natively. Refactor forced end-of-utterance control: replace the testing flag with a declarative `_use_forced_eou` derived from config. Defer audio format initialisation in AsyncClient until start_session is called, and return the FEOU timestamp for diagnostic logging. Rename `_vad_evaluation` to `_speaker_start_stop_evaluation` and remove unused `EndOfTurnConfig` from presets.
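To illustrate what parameterising the VAD chunk and context sizes by sample rate could look like, here is a small sketch. The sizes used (512-sample chunks with 64 samples of context at 16kHz, 256 and 32 at 8kHz) are an assumption based on commonly documented Silero VAD defaults, not values taken from this PR, and the function names are hypothetical.

```python
from typing import Iterator, List, Tuple

# Assumed (chunk, context) sizes in samples, keyed by sample rate.
_VAD_SIZES = {16000: (512, 64), 8000: (256, 32)}


def vad_sizes(sample_rate: int) -> Tuple[int, int]:
    """Look up chunk and context sizes for a supported sample rate."""
    try:
        return _VAD_SIZES[sample_rate]
    except KeyError:
        raise ValueError(f"unsupported sample rate: {sample_rate}") from None


def iter_vad_windows(samples: List[float], sample_rate: int) -> Iterator[List[float]]:
    """Yield model inputs: previous context followed by the current chunk.

    Trailing samples that do not fill a whole chunk are left for the caller
    to buffer until more audio arrives.
    """
    chunk, context = vad_sizes(sample_rate)
    previous = [0.0] * context  # silence as context for the first chunk
    for start in range(0, len(samples) - chunk + 1, chunk):
        frame = samples[start:start + chunk]
        yield previous + frame
        previous = frame[-context:]
```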

Allow callers to pass a `pad` value through finalize() to _await_forced_eou(), replacing the fixed 0.02s timestamp padding. Add forced_eou_padding config option (default 0.2s) to EndOfTurnConfig and include timing info in the diagnostic message.
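One plausible way for a per-call `pad` to interact with the `forced_eou_padding` config option is sketched below. The precedence shown (an explicit argument wins, otherwise the config default applies) is an assumption, as are the class `EndOfTurnConfigSketch`, the helper names, and the diagnostic message format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EndOfTurnConfigSketch:
    # Mirrors the option named in the commit message; default from the same text.
    forced_eou_padding: float = 0.2


def resolve_pad(config: EndOfTurnConfigSketch, pad: Optional[float] = None) -> float:
    """Prefer an explicit per-call pad; otherwise fall back to the config value.

    `None` means "not supplied", so an explicit 0.0 is honoured.
    """
    return config.forced_eou_padding if pad is None else pad


def diagnostic_message(timestamp: float, pad: float) -> str:
    """Build a log line carrying the timing information (format is illustrative)."""
    return f"forced EOU requested: timestamp={timestamp:.3f}s pad={pad:.3f}s"


if __name__ == "__main__":
    cfg = EndOfTurnConfigSketch()
    print(diagnostic_message(3.2, resolve_pad(cfg)))
    print(diagnostic_message(3.2, resolve_pad(cfg, pad=0.5)))
```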

sam-s10s and others added 30 commits February 11, 2026 13:23

feat: add optional headers support to WebSocket connection

bc8b401

Introduce an optional `ws_headers` parameter to the `connect` method in `VoiceAgentClient`. This allows users to pass custom headers when establishing the WebSocket connection to the Speechmatics API.

Add Penalty when Smart Turn hasn't been run (#86)

d9de589

* Add No Signal Penalty for Smart Turn
* Update Penalty to Extend TTL

Merge branch 'fix/smart-turn' of https://github.com/speechmatics/speechmatics-python-sdk into fix/smart-turn

3375c3d

# Conflicts:
# sdk/voice/speechmatics/voice/_models.py

refactor: remove forced end-of-utterance config from tests

0b28473

Remove the `use_forced_eou` setting from the `EndOfTurnConfig` in several test files to simplify configurations. Forced end-of-utterance must always be true (default), so removed.

test: add tests for STT client header handling

ce88321

Introduce two new tests to validate header handling in the STT client. `test_with_headers` checks successful connection using valid headers, while `test_with_corrupted_headers` ensures connection failure with invalid header format.

manually set FEOU to be disabled for the tests.

95dda05

remove ws_headers as part of a different PR

5ecc473

chore: pin speechmatics-rt dependency version for voice

7f03cc5

Change speechmatics-rt version specifier from a minimum version requirement to an exact version pin (==0.5.3).

fix: only predict end of turn when speech ended

0e56620

Disable smart turn cutoff skip that prevented re-evaluation. Improve multiple speakers test with accumulated error reporting and turn boundary tracking.

test: re-enable speaker focus test cases

4182979

test: use env var for RT URL and fix assertions

5583174

fix: remove unused turn extend delay and dead code

18b56f9

Merge branch 'fix/websocket-headers' into feat/va-rel

c27fcb1

Merge branch 'fix/feou' into feat/va-rel

58fa7d6

# Conflicts:
# sdk/voice/speechmatics/voice/_client.py

chore: add uv source for speechmatics-rt dependency

942d23c

chore: remove uv source override for speechmatics-rt

d8ccb41

test: switch EOU/FEOU endpoint to eu production

5c7ab13

Merge branch 'main' into fix/smart-turn

8720ed4

# Conflicts:
# sdk/rt/speechmatics/rt/_async_client.py
# tests/voice/test_17_eou_feou.py

Relax speechmatics-rt version pin to minimum

103cac4

sam-s10s added 6 commits April 10, 2026 13:10

Merge branch 'main' into feat/va-rel

d365f7b

Update speechmatics-rt dependency to version 1.0.0 or higher

d94b34d

Remove timestamp parameter from force_end_of_utterance call in smart turn detection

9d8e69c

pad the timestamp

04ab6ec

Merge branch 'main' into feat/va-rel

d52b42d

sam-s10s closed this Jun 26, 2026

sam-s10s wants to merge 36 commits into main from feat/va-rel

2 participants
