feat: add multimedia endpoint support (image, TTS, transcription, video) by AlemTuzlak · Pull Request #101 · CopilotKit/aimock

AlemTuzlak · 2026-04-10T10:25:47Z

Summary

Add four new multimedia endpoint types: image generation (/v1/images/generations, /v1beta/models/{model}:predict), text-to-speech (/v1/audio/speech), audio transcription (/v1/audio/transcriptions), and video generation (/v1/videos, /v1/videos/{id})
Add match.endpoint field to FixtureMatch for isolating fixtures by endpoint type, preventing cross-matching (e.g., image fixtures won't match chat requests)
Add convenience methods (onImage, onSpeech, onTranscription, onVideo) on LLMock and backfill _endpointType on all existing handlers
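A minimal sketch of how `match.endpoint` and a convenience method might relate, assuming a convenience method is just `addFixture` with the endpoint pinned. `MiniMock`, the `find` helper, substring matching, the response shape, and every endpoint name other than `"chat"` and `"embedding"` are illustrative assumptions, not the actual aimock API.

```typescript
// Hypothetical, simplified shapes; the real FixtureMatch and LLMock types may differ.
type EndpointType = "chat" | "embedding" | "image" | "speech" | "transcription" | "video";

interface FixtureMatch {
  userMessage?: string;
  endpoint?: EndpointType;
}

interface Fixture {
  match: FixtureMatch;
  response: unknown;
}

class MiniMock {
  readonly fixtures: Fixture[] = [];

  addFixture(fixture: Fixture): this {
    this.fixtures.push(fixture);
    return this;
  }

  // Convenience method: same as addFixture, with the endpoint pinned to "image".
  onImage(prompt: string, response: unknown): this {
    return this.addFixture({ match: { userMessage: prompt, endpoint: "image" }, response });
  }

  // Return the first fixture whose endpoint (if set) and message both match.
  // Substring matching on userMessage is an assumption made for this sketch.
  find(endpoint: EndpointType, message: string): Fixture | undefined {
    return this.fixtures.find(
      (f) =>
        (f.match.endpoint === undefined || f.match.endpoint === endpoint) &&
        (f.match.userMessage === undefined || message.includes(f.match.userMessage)),
    );
  }
}

const mock = new MiniMock().onImage("guitar", { image: { url: "https://example.com/guitar.png" } });
const imageHit = mock.find("image", "a red guitar");
const chatHit = mock.find("chat", "a red guitar");
```

Here the image fixture is found for the image lookup and skipped for the chat lookup, which is the isolation the `match.endpoint` field is meant to provide.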

New Endpoints

| Route | Method | Format | Match field |
| --- | --- | --- | --- |
| `/v1/images/generations` | POST | OpenAI | `prompt` → `userMessage` |
| `/v1beta/models/{model}:predict` | POST | Gemini Imagen | `instances[0].prompt` → `userMessage` |
| `/v1/audio/speech` | POST | OpenAI | `input` → `userMessage` |
| `/v1/audio/transcriptions` | POST | OpenAI (multipart) | `match.endpoint` only |
| `/v1/videos` | POST | OpenAI | `prompt` → `userMessage` |
| `/v1/videos/{id}` | GET | OpenAI | Stored video ID |
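The mapping in the table can be read as a small extraction step per route. The sketch below is one way that step could look; `extractMatchText` is a hypothetical helper name, and the handlers in this PR may structure the logic differently.

```typescript
type Json = Record<string, unknown>;

// Returns the text that would be compared against `userMessage`, or undefined
// when the route matches on endpoint alone (transcription) or is not listed.
function extractMatchText(path: string, body: Json): string | undefined {
  if (path === "/v1/images/generations" || path === "/v1/videos") {
    return typeof body.prompt === "string" ? body.prompt : undefined;
  }
  if (path === "/v1/audio/speech") {
    return typeof body.input === "string" ? body.input : undefined;
  }
  // Gemini Imagen format: /v1beta/models/{model}:predict
  if (/^\/v1beta\/models\/[^/]+:predict$/.test(path)) {
    const instances = body.instances;
    if (Array.isArray(instances) && instances.length > 0) {
      const first = instances[0] as Json | null;
      return first && typeof first.prompt === "string" ? first.prompt : undefined;
    }
  }
  return undefined;
}
```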

Test plan

Image generation: single, multiple, base64, Gemini Imagen format
TTS: correct Content-Type for mp3/opus, default format fallback
Transcription: simple JSON and verbose_json with words/segments
Video: create + status check, processing state, 404 for unknown ID
X-Test-Id isolation for image endpoint
Endpoint cross-matching prevention (image vs chat)
Convenience methods (onImage, onSpeech, onTranscription, onVideo)
Backfill: endpoint: "chat" and endpoint: "embedding" fixtures match existing handlers
Full suite: 2216 tests pass, 0 failures
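For the TTS item in the test plan, the Content-Type selection with a default fallback might be sketched as below. The format list, the MIME types, and the choice of mp3 as the fallback are assumptions based on common audio conventions; `contentTypeFor` is a hypothetical name and the handler in this PR may differ.

```typescript
// Assumed mapping from a requested audio format to its MIME type.
const AUDIO_CONTENT_TYPES = new Map<string, string>([
  ["mp3", "audio/mpeg"],
  ["opus", "audio/opus"],
  ["aac", "audio/aac"],
  ["flac", "audio/flac"],
  ["wav", "audio/wav"],
]);

const DEFAULT_AUDIO_CONTENT_TYPE = "audio/mpeg";

// Falls back to the default when the format is missing or unrecognized.
function contentTypeFor(format?: string): string {
  return AUDIO_CONTENT_TYPES.get(format ?? "mp3") ?? DEFAULT_AUDIO_CONTENT_TYPE;
}
```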

pkg-pr-new · 2026-04-10T10:26:21Z

npm i https://pkg.pr.new/@copilotkit/aimock@101

commit: 1f2d451

jpr5

Code Review — Multimedia Endpoint Support

Well-structured PR. All 4 handlers follow consistent patterns, endpoint backfill is correct across all existing handlers, tests are strong (575 lines with specific assertions). One medium finding, two low.

Medium

Fixtures without endpoint match multimedia requests, then 500 at type guard (router.ts:44-48)

Endpoint filtering is one-directional: fixtures WITH endpoint are restricted, but fixtures WITHOUT endpoint match ANY request type. A user with a generic chat fixture:

mock.addFixture({ match: { userMessage: "guitar" }, response: { content: "Chat about guitars" } });

This fixture also matches image requests for "guitar". handleImages accepts the match, isImageResponse(response) then fails, and the request ends in a 500. The test covers only the reverse direction (an image fixture does not match chat).

Fix: when a request has _endpointType and the matched fixture has no endpoint, verify the response type is compatible with the endpoint before returning the match. Or make filtering bidirectional.
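A sketch of the bidirectional variant of the fix, under stated assumptions: the guard table, the response shapes (`image`, `audio`, `transcription`, `video` keys), and substring matching are all hypothetical, and this `matchFixture` is a simplified stand-in for the one in router.ts.

```typescript
type EndpointType = "chat" | "embedding" | "image" | "speech" | "transcription" | "video";
type FixtureResponse = Record<string, unknown>;

interface Fixture {
  match: { userMessage?: string; endpoint?: EndpointType };
  response: FixtureResponse;
}

// Hypothetical type guards keyed by endpoint; the response shapes are assumed.
const guards: Partial<Record<EndpointType, (r: FixtureResponse) => boolean>> = {
  image: (r) => "image" in r || "images" in r,
  speech: (r) => "audio" in r,
  transcription: (r) => "transcription" in r,
  video: (r) => "video" in r,
};

function matchFixture(fixtures: Fixture[], endpoint: EndpointType, message: string): Fixture | undefined {
  return fixtures.find((f) => {
    if (f.match.userMessage !== undefined && !message.includes(f.match.userMessage)) return false;
    // Direction 1: a fixture that declares an endpoint only matches that endpoint.
    if (f.match.endpoint !== undefined) return f.match.endpoint === endpoint;
    // Direction 2: a generic fixture is skipped when its response cannot serve this endpoint.
    const guard = guards[endpoint];
    return guard === undefined || guard(f.response);
  });
}

const fixtures: Fixture[] = [
  { match: { userMessage: "guitar" }, response: { content: "Chat about guitars" } },
  { match: { userMessage: "guitar" }, response: { image: { url: "https://example.com/guitar.png" } } },
];
const imageMatch = matchFixture(fixtures, "image", "a guitar on a stand");
const chatMatch = matchFixture(fixtures, "chat", "tell me about guitar strings");
```

With this check the generic chat fixture is passed over for the image request rather than matched and rejected later, so the handler never reaches the failing type guard.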

Low

extractFormField regex on binary multipart data (transcription.ts:15-22): readBody converts the binary body to a UTF-8 string. If the file part appears before the text fields, mangled bytes could theoretically match the regex. Extremely unlikely with real audio, but fragile. A boundary-delimited parser would be more robust.
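A boundary-delimited version might look like the sketch below. It assumes the raw body is available as a `Buffer` (rather than the UTF-8 string readBody currently produces) and that the boundary has already been read from the Content-Type header; the signature is hypothetical and reuses the `extractFormField` name only for continuity.

```typescript
import { Buffer } from "node:buffer";

// Extract a text field from a multipart/form-data body without decoding the
// whole payload. Only the part whose headers name the field is decoded.
function extractFormField(body: Buffer, boundary: string, field: string): string | undefined {
  const delimiter = Buffer.from(`--${boundary}`);
  const headerEnd = Buffer.from("\r\n\r\n");
  let start = body.indexOf(delimiter);
  while (start !== -1) {
    const partStart = start + delimiter.length;
    const next = body.indexOf(delimiter, partStart);
    if (next === -1) break; // closing delimiter reached or body truncated
    const part = body.subarray(partStart, next);
    const split = part.indexOf(headerEnd);
    if (split !== -1) {
      // Headers are matched on their own, so bytes in a file payload are never inspected.
      const headers = part.subarray(0, split).toString("latin1");
      const name = /content-disposition:[^\r\n]*\bname="([^"]*)"/i.exec(headers)?.[1];
      if (name === field) {
        // Drop the CRLF that precedes the next delimiter.
        return part.subarray(split + headerEnd.length, part.length - 2).toString("utf8");
      }
    }
    start = next;
  }
  return undefined;
}
```

Because each part is split at its own header terminator, a file payload that happens to contain text resembling a form field cannot be mistaken for one.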

_endpointType is not a declared field (types.ts): it is stored via the index signature, so there is no type safety. Adding _endpointType?: string to ChatCompletionRequest would catch typos.
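One possible shape for the declared field, as a sketch: the interface below is a reduced stand-in for the real ChatCompletionRequest, and narrowing the field to a union (rather than `string`, as suggested above) is an extra assumption, as are the endpoint names other than `"chat"` and `"embedding"`.

```typescript
type EndpointType = "chat" | "embedding" | "image" | "speech" | "transcription" | "video";

// Reduced, hypothetical version of the request type.
interface ChatCompletionRequest {
  model: string;
  // Declared explicitly so reads and writes of this field are type-checked.
  _endpointType?: EndpointType;
  // Index signature kept for other pass-through fields; note that it still
  // accepts misspelled keys, so only the declared field gains checking.
  [key: string]: unknown;
}

function tagEndpoint(req: ChatCompletionRequest, endpoint: EndpointType): ChatCompletionRequest {
  return { ...req, _endpointType: endpoint };
}

const tagged = tagEndpoint({ model: "example-model" }, "image");
```

With the union, passing a misspelled value such as `"imgae"` to `tagEndpoint` would be rejected at compile time.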

Clean

Image gen (OpenAI + Gemini Imagen), TTS, transcription, video create/poll all correct
matchFixture endpoint filtering works for the designed direction
Convenience methods (onImage, onSpeech, etc.) wire correctly
Video state map with X-Test-Id isolation is correct
Backfill of _endpointType on all existing handlers is consistent

🤖 Reviewed with Claude Code

…iltering

New response types (ImageResponse, AudioResponse, TranscriptionResponse, VideoResponse) with type guards. matchFixture now filters by endpoint bidirectionally: fixtures with endpoint only match that type, and multimedia requests skip generic fixtures with incompatible response types.

Image generation (OpenAI + Gemini Imagen), text-to-speech with format support, audio transcription with multipart parsing, video generation with async status polling via in-memory state map.
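The in-memory state map behind video create and poll could be sketched as follows, assuming jobs are scoped by the X-Test-Id value so that parallel tests do not see each other's videos. `VideoStore`, the job fields, and the `"completed"` status are hypothetical; only the processing state and the 404 for unknown IDs come from the test plan.

```typescript
type VideoStatus = "processing" | "completed";

interface VideoJob {
  id: string;
  status: VideoStatus;
  url?: string;
}

class VideoStore {
  // Outer key: X-Test-Id value (or a shared default); inner key: video ID.
  private readonly jobs = new Map<string, Map<string, VideoJob>>();

  // Called by the POST /v1/videos handler after a fixture matches.
  create(testId: string, job: VideoJob): VideoJob {
    let scope = this.jobs.get(testId);
    if (!scope) {
      scope = new Map();
      this.jobs.set(testId, scope);
    }
    scope.set(job.id, job);
    return job;
  }

  // Returns the HTTP status and body a GET /v1/videos/{id} handler might send.
  get(testId: string, id: string): { status: number; body: VideoJob | { error: string } } {
    const job = this.jobs.get(testId)?.get(id);
    return job
      ? { status: 200, body: job }
      : { status: 404, body: { error: `video ${id} not found` } };
  }
}

const store = new VideoStore();
store.create("test-a", { id: "vid_1", status: "processing" });
const found = store.get("test-a", "vid_1");
const isolated = store.get("test-b", "vid_1");
```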

Register all multimedia routes in server.ts. Add onImage/onSpeech/onTranscription/onVideo convenience methods on LLMock. Backfill _endpointType on all existing handlers (chat + embedding).

20 integration tests (image gen, TTS, transcription, video create/poll, X-Test-Id isolation, cross-matching prevention, convenience methods, endpoint backfill) + 12 unit tests for type guards and matchFixture endpoint filtering.

New doc pages for image generation, TTS, transcription, and video. Updated fixtures page, index feature list, sidebar nav, comparison table (all competitors lack multimedia), and competitive drift detection keywords.

jpr5 requested changes Apr 10, 2026

AlemTuzlak added 4 commits April 10, 2026 11:31

feat: add multimedia handlers (image, TTS, transcription, video)

85e6ecc

feat: wire multimedia routes, convenience methods, endpoint backfill

1cd1edb

jpr5 force-pushed the worktree-sharded-rolling-tide branch from 3d68797 to bacfac4 on April 10, 2026 18:32

AlemTuzlak wants to merge 5 commits into main from worktree-sharded-rolling-tide

2 participants
