feat: add video SEF/subject-ref, image seed/size, speech clone/design, music-2.5+ #66
Closed
raylanlin wants to merge 0 commits into MiniMax-AI:main from
Summary
This PR adds new features to the video, image, speech, and music generation commands, along with bug fixes and tests for the new behavior.
18 files changed, +1197/-143 lines, 11 commits, 109 tests pass
New Features
🎬 Video Generation — src/commands/video/generate.ts (+54 lines)

--last-frame <path-or-url> — SEF (Start-End Frame) Interpolation
Generates a video that smoothly transitions between a start frame (from prompt) and an end frame (provided image).

    mmx video generate \
      --prompt "A flower blooming in spring garden" \
      --last-frame ./end-frame.jpg

- hailuo-02 (required for SEF mode)
- --model hailuo-02 is still required if your plan doesn't include this model

--subject-image <path-or-url> — Subject-to-Video (S2V)
Keeps a character/object consistent throughout the generated video.

    mmx video generate \
      --prompt "walking through a neon-lit cyberpunk city" \
      --subject-image ./character.png

- s2v-01 (required for S2V mode)
- /v1/files/upload automatically
- --model s2v-01 is still required if your plan doesn't include this model

Model Override Priority
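A minimal sketch of the priority rule this section describes, assuming a hypothetical `VideoFlags` shape and a hypothetical `resolveVideoModel` helper (neither name comes from the PR). The model strings are the ones quoted in this PR; the default model is passed in because the PR does not state it, and the tie-break when both mode flags are present is an assumption.

```typescript
// Hypothetical option shape; the real flag parsing in
// src/commands/video/generate.ts may look different.
interface VideoFlags {
  model?: string;        // value of --model, if the user passed one
  lastFrame?: string;    // value of --last-frame (path or URL)
  subjectImage?: string; // value of --subject-image (path or URL)
}

// Picks the model to send to the API.
// defaultModel is a caller-supplied placeholder, not a value from the PR.
function resolveVideoModel(flags: VideoFlags, defaultModel: string): string {
  // 1. An explicit --model always wins, even if the API later rejects it.
  if (flags.model !== undefined) {
    return flags.model;
  }
  // 2. Otherwise auto-switch based on the mode flag.
  //    Assumption: --last-frame is checked before --subject-image.
  if (flags.lastFrame !== undefined) {
    return "hailuo-02";
  }
  if (flags.subjectImage !== undefined) {
    return "s2v-01";
  }
  // 3. No mode flag: fall back to the default.
  return defaultModel;
}
```

With this ordering, `--model hailuo-01 --last-frame ./end-frame.jpg` resolves to `hailuo-01`, so an unsupported combination fails at the API instead of being rewritten silently.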
Explicit --model flag now takes priority over automatic model switching. If you specify --model hailuo-01 with --last-frame, it will try to use hailuo-01 (and fail if the API doesn't support it), rather than silently switching.

🖼️ Image Generation — src/commands/image/generate.ts (+34 lines)

--seed <n> — Reproducible Generation

--width <px> / --height <px> — Custom Dimensions

    mmx image generate --prompt "Wide banner" --width 2048 --height 512

- --aspect-ratio when both are set
- image-01 model

--prompt-optimizer — AI Prompt Enhancement

--aigc-watermark — AI Content Watermark

🗣️ TTS — New Commands
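The two-step flow that `speech clone` follows in this section (upload the sample, then call the voice_clone API) can be sketched as below. This is an illustration under stated assumptions, not the code in src/commands/speech/clone.ts: `PostJson` is a hypothetical injected transport, the `/v1/voice_clone` path, the `purpose` field, and the `file_id` response field are assumptions, and a real upload would send multipart form data, not a JSON body.

```typescript
// Hypothetical transport: posts a body to an API path and resolves to parsed JSON.
type PostJson = (
  path: string,
  body: Record<string, unknown>
) => Promise<Record<string, unknown>>;

// Uploads an audio sample, then registers it as a cloned voice.
// Returns the voice ID that can later be passed to --voice.
async function cloneVoice(
  post: PostJson,
  audioPath: string,
  voiceId: string
): Promise<string> {
  // Step 1: upload the sample. /v1/files/upload is the path named in this PR;
  // the request fields are simplified placeholders.
  const upload = await post("/v1/files/upload", {
    file: audioPath,
    purpose: "voice_clone",
  });

  // Assumed response field; the real response may nest it differently.
  const fileId = upload["file_id"];
  if (typeof fileId !== "string") {
    throw new Error("upload response did not contain a file_id");
  }

  // Step 2: call the voice_clone API with the uploaded file (path is an assumption).
  await post("/v1/voice_clone", { file_id: fileId, voice_id: voiceId });
  return voiceId;
}
```

Injecting the transport keeps the ordering testable without network access: a fake `post` can record the paths it receives and confirm that the upload happens before the clone call.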
speech clone — Voice Cloning (src/commands/speech/clone.ts, 110 lines)
Clone a voice from an audio sample.

- /v1/files/upload first, then calls voice_clone API
- mmx speech synthesize --voice <voice_id>

speech design — Voice Design (src/commands/speech/design.ts, 70 lines)
Create a voice from a text description.

- --gender hint (male/female)
- mmx speech synthesize --voice <voice_id>

🎵 Music Generation — src/commands/music/generate.ts (+127 lines)

music-2.5+ Model with Native Instrumental Support

--lyrics-optimizer — AI-Generated Lyrics

--output-format url — Direct Download URL

Expanded Lyric Tags (14 Total)
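Because a tag outside the supported set (such as one with a description appended) ends up sung as lyrics, it can help to check lyrics before sending them. The helper below is a hypothetical sketch, not something this PR adds: `findUnsupportedTags` and `SUPPORTED_TAGS` are illustrative names, and exact, case-sensitive matching is an assumption.

```typescript
// The 14 structure tags listed in this section.
const SUPPORTED_TAGS: readonly string[] = [
  "Intro", "Verse", "Pre Chorus", "Chorus", "Interlude", "Bridge", "Outro",
  "Post Chorus", "Transition", "Break", "Hook", "Build Up", "Inst", "Solo",
];

// Returns every bracketed tag in the lyrics that is not one of the 14
// supported tags, in order of appearance (brackets included).
function findUnsupportedTags(lyrics: string): string[] {
  const unsupported: string[] = [];
  const pattern = /\[([^\[\]]+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(lyrics)) !== null) {
    const name = (match[1] ?? "").trim();
    if (SUPPORTED_TAGS.indexOf(name) === -1) {
      unsupported.push(match[0]);
    }
  }
  return unsupported;
}
```

For example, lyrics containing `[Verse]`, `[Verse: piano]`, and `[Chorus]` would report only `[Verse: piano]`.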
[Intro] [Verse] [Pre Chorus] [Chorus] [Interlude] [Bridge] [Outro] [Post Chorus] [Transition] [Break] [Hook] [Build Up] [Inst] [Solo]
[Verse: piano] will be sung as lyrics.

Bug Fixes
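One of the fixes in this section renames the `extra_info` fields read in src/output/audio.ts to the `music_*` names. A minimal sketch of reading the renamed fields, assuming a hypothetical `MusicExtraInfo` type and `describeExtraInfo` helper; the PR does not state the units, so the values are printed as-is.

```typescript
// Field names after the fix, as listed in this PR. All are treated as
// optional here because the full response schema is not shown.
interface MusicExtraInfo {
  music_duration?: number;    // previously read as audio_length
  music_size?: number;        // previously read as audio_size
  music_sample_rate?: number; // previously read as audio_sample_rate
}

// Builds human-readable lines for whichever fields are present.
function describeExtraInfo(info: MusicExtraInfo): string[] {
  const lines: string[] = [];
  if (info.music_duration !== undefined) {
    lines.push(`duration: ${info.music_duration}`);
  }
  if (info.music_size !== undefined) {
    lines.push(`size: ${info.music_size}`);
  }
  if (info.music_sample_rate !== undefined) {
    lines.push(`sample_rate: ${info.music_sample_rate}`);
  }
  return lines;
}
```

Reading the old `audio_*` names against such a response would find nothing, which is consistent with the mismatch the fix describes.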
src/client/endpoints.ts | /v1/files returned 404 | /v1/files/upload
src/output/audio.ts | extra_info field names didn't match API response | audio_length → music_duration, audio_size → music_size, audio_sample_rate → music_sample_rate
src/registry.ts | dist/minimax.mjs | dist/mmx.mjs
src/commands/music/generate.ts | --instrumental with "无歌词" ("no lyrics") still sent lyrics field | lyrics = undefined when using "无歌词"
src/commands/video/generate.ts | --model was overwritten by auto-switch
src/commands/music/generate.ts | console.log (stdout)
src/commands/music/generate.ts

Tests
Coverage: 109 pass / 0 fail across 25 test files
- test/commands/image/generate.test.ts
- test/commands/video/generate.test.ts
- test/commands/speech/clone.test.ts
- test/commands/speech/design.test.ts

Test Highlights
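The first highlight in this section, a warning when `--width` is combined with `--aspect-ratio`, could be produced by a check along these lines. This is a sketch only: `ImageSizeFlags` and `checkImageSizeFlags` are hypothetical names, the warning text is invented for illustration, and the sketch does not decide which flag takes effect.

```typescript
// Hypothetical parsed flags for `mmx image generate`.
interface ImageSizeFlags {
  width?: number;       // --width <px>
  height?: number;      // --height <px>
  aspectRatio?: string; // --aspect-ratio
}

// Returns warnings for flag combinations that conflict.
function checkImageSizeFlags(flags: ImageSizeFlags): string[] {
  const warnings: string[] = [];
  const hasCustomSize = flags.width !== undefined || flags.height !== undefined;
  if (hasCustomSize && flags.aspectRatio !== undefined) {
    warnings.push("--width/--height and --aspect-ratio were both set");
  }
  return warnings;
}
```

Returning warnings as data, instead of printing them directly, lets a test assert on the result without capturing stderr.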
- --width + --aspect-ratio → warning
- --last-frame → Hailuo-02, --subject-image → S2V-01
- --model hailuo-01 --last-frame → uses hailuo-01 (not auto-switched)

Documentation
- skill/SKILL.md (+160 lines)
- README.md (+18 lines)
- --help (all commands)

API Reference
All features verified against official MiniMax API documentation:
Commits
- 281c6d2
- 4a04cc4
- 3800064
- fb7ddd0
- 41d762b
- 7e7fb50
- 847c9b4 /v1/files → /v1/files/upload
- 1e2462c
- f928de2
- 233ff94
- e98eb11