feat: add aliyun bailian qwen3-tts-vc voice clone tts provider #9060

Open
makuralymi wants to merge 6 commits into AstrBotDevs:master from makuralymi:qwen_tts_vc

Conversation

makuralymi commented Jun 28, 2026

Modifications

  • This is NOT a breaking change.

Screenshots or Test Results


Checklist

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.

  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Add a new DashScope-based Aliyun Bailian Qwen3 voice-clone TTS provider and wire it into the provider manager and configuration metadata.

New Features:

  • Introduce a DashScope Voice Clone TTS provider for Aliyun Bailian Qwen3-TTS-VC models that synthesize speech using pre-created cloned voices.
  • Expose configuration options for voice clone TTS (voice ID, language type, workspace, region, and custom base URL) in the default config and dashboard metadata.

Enhancements:

  • Ensure provider manager dynamically imports the new DashScope voice clone TTS source alongside existing TTS providers.
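
A rough illustration of the dynamic-import enhancement: the sketch below maps a provider type to a module path and imports the module only when it is requested. It is not AstrBot's actual provider manager code. The registry, the `load_provider_module` helper, and the `dashscope_voice_clone_tts` type key are assumptions made for illustration; only the module path comes from the file touched in this PR.

```python
import importlib

# Hypothetical registry: provider type -> dotted module path.
# The key name is an assumption; the path mirrors the file added in this PR.
PROVIDER_MODULES = {
    "dashscope_voice_clone_tts": (
        "astrbot.core.provider.sources.dashscope_voice_clone_tts"
    ),
}


def load_provider_module(provider_type, registry=PROVIDER_MODULES):
    """Import the module registered for provider_type and return it."""
    try:
        module_path = registry[provider_type]
    except KeyError:
        raise ValueError(f"unknown provider type: {provider_type!r}") from None
    # The import happens here, so unused providers never load their SDKs.
    return importlib.import_module(module_path)
```

Importing lazily like this means a missing optional dependency only surfaces when the corresponding provider is actually enabled.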

dosubot (bot) added the labels size:L (This PR changes 100-499 lines, ignoring generated files.) and area:provider (The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner.) on Jun 28, 2026

sourcery-ai (bot) left a comment

Hey - I've left some high-level feedback:

  • The new DashScope voice clone provider mutates global dashscope.api_key and dashscope.base_http_api_url on each call, which can cause subtle issues when multiple DashScope-based providers are used concurrently; consider scoping configuration to the call (e.g., passing api_key/base URL via arguments only) or otherwise avoiding global state.
  • Several hint fields in default.py rely on implicit string literal concatenation ("..." "..." without separators), which produces a single string without spaces or newlines between segments; consider adding explicit spacing/newlines or using a single literal to keep the UI text readable.
  • In get_audio, the synthesized audio is always written with a .wav extension regardless of the actual format returned by DashScope; if the API may return other formats (e.g., MP3), consider deriving the file extension from the response metadata or URL.
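
The second and third points can be illustrated in plain Python. This is a minimal sketch, not the PR's code: the hint texts, the `audio_extension` helper, and the content-type table are assumptions, and the formats DashScope actually returns are not confirmed here.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Adjacent string literals are joined with nothing in between,
# so the two sentences run together in the rendered hint.
hint_joined = (
    "Voice ID of a pre-created cloned voice."
    "Leave the base URL empty to use the default endpoint."
)

# An explicit trailing space (or a newline) keeps the UI text readable.
hint_spaced = (
    "Voice ID of a pre-created cloned voice. "
    "Leave the base URL empty to use the default endpoint."
)

# Assumed mapping from HTTP Content-Type to file extension.
CONTENT_TYPE_TO_EXT = {
    "audio/wav": ".wav",
    "audio/x-wav": ".wav",
    "audio/mpeg": ".mp3",
    "audio/ogg": ".ogg",
    "audio/flac": ".flac",
}
KNOWN_EXTS = set(CONTENT_TYPE_TO_EXT.values())


def audio_extension(
    url: str, content_type: Optional[str] = None, default: str = ".wav"
) -> str:
    """Pick a file extension from the URL path, then the Content-Type."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix in KNOWN_EXTS:
        return suffix
    if content_type:
        # Drop parameters such as "; charset=..." before the lookup.
        mime = content_type.split(";", 1)[0].strip().lower()
        return CONTENT_TYPE_TO_EXT.get(mime, default)
    return default
```

Falling back to `.wav` preserves the current behaviour whenever neither the URL nor the response headers identify the format.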

gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a new Text-to-Speech (TTS) provider for Alibaba Cloud Bailian Voice Cloning (Qwen3-TTS-VC), adding configuration metadata, localization support, and the core implementation. The review feedback highlights critical concurrency issues, specifically that modifying global DashScope configuration variables is not thread-safe and can cause race conditions with other providers. The reviewer recommends passing the API key and base URL dynamically via arguments instead. Additionally, the feedback suggests adding robust error handling by checking API response status codes and HTTP download status codes to prevent corrupted audio files.
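
The shared-state concern can be reproduced without the DashScope SDK. In the sketch below, `FakeSDK` is a hypothetical stand-in for any library configured through module-level attributes, and the two coroutines stand in for two providers running concurrently. It shows why a per-call argument is safer; it does not claim anything about which arguments the real SDK accepts.

```python
import asyncio


class FakeSDK:
    """Hypothetical stand-in for an SDK with module-level configuration."""

    api_key = None


async def call_with_global(key: str) -> str:
    FakeSDK.api_key = key       # mutate state shared by every caller
    await asyncio.sleep(0)      # yield control, as a network call would
    return FakeSDK.api_key      # may now hold another caller's key


async def call_with_argument(key: str) -> str:
    await asyncio.sleep(0)      # same yield point, but nothing is shared
    return key                  # the per-call value cannot be overwritten


async def main():
    global_results = await asyncio.gather(
        call_with_global("key-A"), call_with_global("key-B")
    )
    scoped_results = await asyncio.gather(
        call_with_argument("key-A"), call_with_argument("key-B")
    )
    return global_results, scoped_results


global_results, scoped_results = asyncio.run(main())
print(global_results)  # the first call ends up with the second caller's key
print(scoped_results)  # each call keeps the key it was given
```

The same interleaving can happen across threads, which is why the review recommends passing the API key and base URL with each request instead of assigning them globally.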

Comment thread astrbot/core/provider/sources/dashscope_voice_clone_tts.py Outdated
Comment thread astrbot/core/provider/sources/dashscope_voice_clone_tts.py Outdated
Comment thread astrbot/core/provider/sources/dashscope_voice_clone_tts.py
Comment thread astrbot/core/provider/sources/dashscope_voice_clone_tts.py
Comment thread astrbot/core/provider/sources/dashscope_voice_clone_tts.py
makuralymi and others added 5 commits June 28, 2026 13:56
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
makuralymi changed the title from "feat: add aliyun bailina qwen3-tts-vc voice clone tts provider" to "feat: add aliyun bailian qwen3-tts-vc voice clone tts provider" on Jun 28, 2026