Sxian/clt 2919/integrate platform audio#669
Open
xianshijing-lk wants to merge 2 commits into
samples = []
for i in range(0, len(frames), 3):
    # 24-bit little-endian, take upper 16 bits
    sample = struct.unpack("<i", frames[i : i + 3] + b"\x00")[0] >> 8
🔴 24-bit WAV conversion crashes due to missing sign extension
The 24-bit to 16-bit audio conversion in load_wav_file zero-pads the most-significant byte (+ b"\x00") instead of sign-extending it. For any negative 24-bit sample (where frames[i+2] & 0x80 is set), this produces a large positive 32-bit value. After the >> 8 shift, the result exceeds the signed 16-bit range (-32768..32767), causing struct.pack('h', ...) to raise struct.error. Since roughly half of all samples in typical audio are negative, this will crash immediately on almost any real 24-bit WAV file.
Suggested change
- sample = struct.unpack("<i", frames[i : i + 3] + b"\x00")[0] >> 8
+ sample = struct.unpack("<i", frames[i : i + 3] + (b"\xff" if frames[i + 2] & 0x80 else b"\x00"))[0] >> 8
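To make the failure mode concrete, here is a minimal standalone sketch of the conversion. It is not the code from load_wav_file: the function names are hypothetical, and it uses int.from_bytes(..., signed=True) for the sign extension rather than the padding byte from the suggestion above. Both approaches should give the same result for well-formed input.

```python
import struct


def pcm24le_to_pcm16le(frames: bytes) -> bytes:
    """Convert packed 24-bit little-endian PCM to 16-bit PCM (upper 16 bits).

    Hypothetical helper for illustration; not part of the PR.
    """
    if len(frames) % 3 != 0:
        raise ValueError("24-bit PCM data must be a multiple of 3 bytes")
    samples = []
    for i in range(0, len(frames), 3):
        # signed=True sign-extends the 24-bit value, so negative samples stay negative.
        value = int.from_bytes(frames[i : i + 3], "little", signed=True)
        # Arithmetic right shift drops the low 8 bits and keeps the sign.
        samples.append(value >> 8)
    return struct.pack(f"<{len(samples)}h", *samples)


def zero_padded_sample(sample: bytes) -> int:
    """The original expression from the diff, kept only to show the failure."""
    return struct.unpack("<i", sample + b"\x00")[0] >> 8
```

For the 24-bit sample b"\xff\xff\xff" (the value -1), zero_padded_sample returns 65535, which struct.pack("<h", ...) rejects with struct.error, while the sign-extended path yields -1.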
Summary
Adds PlatformAudio support to the Python SDK, enabling microphone capture via WebRTC's Audio Device Module (ADM) with built-in voice processing.
Changes
New: livekit-rtc/livekit/rtc/platform_audio.py
Modified: livekit-rtc/livekit/rtc/track.py
Modified: livekit-rtc/livekit/rtc/__init__.py
Modified: livekit-rtc/livekit/rtc/room.py
Modified: examples/basic_room.py
New: examples/README.md
PlatformAudio vs Synthetic Mode
┌───────────────────────────────┬───────────────┬───────────────────────────────────────┐
│ Feature │ PlatformAudio │ Synthetic │
├───────────────────────────────┼───────────────┼───────────────────────────────────────┤
│ Voice processing (AEC/NS/AGC) │ Built-in │ Manual │
├───────────────────────────────┼───────────────┼───────────────────────────────────────┤
│ Raw frame access │ No │ Yes │
├───────────────────────────────┼───────────────┼───────────────────────────────────────┤
│ External audio libs needed │ No │ Yes │
├───────────────────────────────┼───────────────┼───────────────────────────────────────┤
│ Use case │ Voice calls │ Custom processing, TTS, file playback │
└───────────────────────────────┴───────────────┴───────────────────────────────────────┘
Both modes can run simultaneously (e.g., mic + background music).
Test Procedure
cd examples
python basic_room.py --list-devices
Expected: Lists available microphones and speakers with device IDs.
# Start LiveKit server
livekit-server --dev
# In another terminal
export LIVEKIT_URL=ws://localhost:7880
export LIVEKIT_API_KEY=devkey
export LIVEKIT_API_SECRET=secret
python basic_room.py --platform-audio --room test-room
Expected: Connects to room, publishes microphone track with voice processing.
python basic_room.py --platform-audio --mic-id "" --room test-room
Expected: Uses the specified microphone. Put a device ID from the --list-devices output between the quotes.
python basic_room.py --file test.wav --room test-room
Expected: Publishes audio from WAV file.
python basic_room.py --platform-audio --file test.wav --room test-room
Expected: Publishes two audio tracks: microphone and file.
Open https://meet.livekit.io and join the same room to verify audio is received.
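The --file steps above need a test.wav. Since the review comment concerns 24-bit input specifically, a 24-bit file is the useful test case. The following is a rough sketch that builds one with the standard-library wave module. The function name, the 48 kHz mono format, and the 440 Hz tone are all assumptions chosen for illustration, not requirements stated in the PR.

```python
import io
import math
import wave


def sine_wav_24bit(seconds: float = 1.0, sample_rate: int = 48000,
                   freq_hz: float = 440.0) -> bytes:
    """Return the bytes of a mono 24-bit PCM WAV file containing a sine tone.

    Hypothetical helper; format parameters are assumptions, not PR requirements.
    """
    n_frames = int(seconds * sample_rate)
    amplitude = 0.5 * (2**23 - 1)  # half of full scale for 24-bit audio
    pcm = bytearray()
    for k in range(n_frames):
        value = int(amplitude * math.sin(2 * math.pi * freq_hz * k / sample_rate))
        # Pack each sample as 3 bytes, little-endian, two's complement.
        pcm += value.to_bytes(3, "little", signed=True)

    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(3)  # 3 bytes per sample = 24-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(pcm))
    return buf.getvalue()
```

Writing the returned bytes to disk, for example with pathlib.Path("test.wav").write_bytes(sine_wav_24bit()), gives a file whose negative half-cycles exercise the sign-extension path discussed in the review comment.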