fix(llm): auto-shape multimodal mediaPath messages in chat template#1089

Merged
msluszniak merged 1 commit into main from fix/generate-multimodal-content-shaping on Apr 22, 2026

Conversation


@msluszniak msluszniak commented Apr 22, 2026

Description

LLMController.generate() collected imagePaths from messages with a mediaPath set, but never transformed their content into the [{type:'image'}, {type:'text', text}] form that the chat template needs to emit the <image> placeholder. Calling generate() directly with a vision-capable model (e.g. LFM2-VL) thus threw "More images paths provided than '<image>' placeholders in prompt" from native, even though sendMessage() worked because it built its own historyForTemplate that did the transformation.

This PR moves the transformation into applyChatTemplate so both call sites (generate and sendMessage) get the correct behavior, and removes the now-redundant historyForTemplate block from sendMessage. The public Message.content type stays string — external callers always pass plain strings; the controller handles the structured array form internally.

The helper is idempotent: messages whose content is already an array (e.g. callers who pre-shaped it as a workaround) are passed through unchanged.
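The shaping and idempotence described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the helper name `messagesForChatTemplate` and the `[{type:'image'}, {type:'text', text}]` shape come from this description, but everything else (the exact `Message` fields, the pass-through conditions) is assumed:

```ts
// Hypothetical sketch of the messagesForChatTemplate helper; the exact
// implementation in the PR may differ.
interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
  mediaPath?: string;
}

// Internal return is any[] on purpose: the chat-template engine accepts a
// dynamic content shape that the public Message type does not expose.
function messagesForChatTemplate(messages: Message[]): any[] {
  return messages.map((msg) => {
    // Idempotent: content pre-shaped as an array (e.g. via a caller's
    // `as unknown as string` workaround) passes through unchanged.
    if (Array.isArray(msg.content as unknown)) return msg;
    // Plain text messages without media stay plain strings.
    if (msg.mediaPath === undefined) return msg;
    // Shape media-bearing messages so the template emits an <image>
    // placeholder followed by the user's text.
    return {
      ...msg,
      content: [{ type: 'image' }, { type: 'text', text: msg.content }],
    };
  });
}
```

Because the helper runs inside applyChatTemplate, both generate and sendMessage would pick it up without any change to their public signatures.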

Introduces a breaking change?

  • Yes
  • No

Public types are unchanged. sendMessage produces an identical rendered chat-template string (the transformation just happens one step later in the pipeline; token count and rendered output are byte-identical). generate only changes behavior in cases that previously threw — pure bug fix.

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

The original bug was reproduced on a vision-capable model (LFM2-VL-1.6B-quantized) on Android while building a downstream consumer app. Re-verification of the fix on a real device is recommended before merge — see Testing instructions below. I have not personally re-run the failing scenario after the fix.

Testing instructions

To reproduce the original bug (without this PR):

```ts
import { LLMModule, LFM2_VL_1_6B_QUANTIZED } from 'react-native-executorch';

const llm = await LLMModule.fromModelName(LFM2_VL_1_6B_QUANTIZED);
await llm.generate([
  { role: 'user', content: 'Describe this image.', mediaPath: 'file:///path/to/image.jpg' },
]);
// Throws: "More images paths provided than '<image>' placeholders in prompt"
```

With this PR applied, the same call should succeed and return the model's description.

Regression check: a vision-capable sendMessage(text, { imagePath }) flow should continue producing identical output.

Screenshots

N/A (controller change, no UI).

Related issues

Addresses items 1 and 2 of #1086. With item 1 fixed, item 2's `Message.content` type mismatch no longer surfaces in practice because external callers never need to construct the array form themselves (the `as unknown as string` workaround that motivated item 2 becomes unnecessary).

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

Additional notes

The messagesForChatTemplate helper lives at module scope rather than as a static class method because it doesn't depend on controller state. Internal any[] return is a deliberate concession to the dynamic shape the chat-template engine accepts; the public Message[] input/output contract stays well-typed.

@msluszniak msluszniak self-assigned this Apr 22, 2026
@msluszniak msluszniak added the bug fix PRs that are fixing bugs label Apr 22, 2026
@msluszniak msluszniak requested a review from barhanc April 22, 2026 13:08
msluszniak added a commit that referenced this pull request Apr 22, 2026
## Description

`LLMController.forward` passed `imagePaths` straight through to
`nativeModule.generateMultimodal` with no normalization. The native side
requires the `file://` prefix; without it, native throws `"Read image
error: invalid argument"` with no further context. Callers can plausibly
arrive with either form:

- `ResourceFetcher.fetch` returns raw paths *without* `file://` (per its
own docstring on the `fetch` method).
- Platform image-picker APIs (e.g. `expo-image-picker`) typically return
`file:///...` URIs.
- The same path string passed to a vision module's `forward(...)` works
either way; the asymmetry between vision modules and multimodal LLM is
undocumented.

This PR normalizes each image path inside `LLMController.forward` so
both forms work, and updates the JSDoc on `Message.mediaPath` and
`LLMModule.forward.imagePaths` to document the new contract.
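A minimal sketch of the normalization this commit describes. The helper name and the scheme-detection regex are assumptions; the PR only states that each image path is normalized inside `LLMController.forward` so both bare paths and `file://` URIs reach native in the required form:

```ts
// Hypothetical helper name; the commit only says paths are normalized
// inside LLMController.forward before nativeModule.generateMultimodal.
function normalizeImagePath(path: string): string {
  // URIs that already carry a scheme (file://, content://, ...) pass through.
  if (/^[a-z][a-z0-9+.-]*:\/\//i.test(path)) return path;
  // Bare absolute paths gain the file:// prefix the native side requires.
  return `file://${path}`;
}
```

Inside `forward`, this would apply as something like `imagePaths.map(normalizeImagePath)`.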

### Introduces a breaking change?

- [ ] Yes
- [x] No

Strictly additive: previously-working calls (paths with `file://`) keep
working unchanged. Previously-failing calls (paths without `file://`)
now succeed.

### Type of change

- [x] Bug fix (change which fixes an issue)
- [ ] New feature
- [ ] Documentation update
- [ ] Other

### Tested on

- [ ] iOS
- [ ] Android

The bare-path failure was reproduced on Android (Samsung Galaxy S24
Ultra) with LFM2-VL-1.6B while building a downstream consumer; both
forms tested manually post-fix on the same device. Re-verification of
both forms on iOS is recommended.

### Testing instructions

```ts
import { LLMModule, LFM2_VL_1_6B_QUANTIZED } from 'react-native-executorch';
const llm = await LLMModule.fromModelName(LFM2_VL_1_6B_QUANTIZED);

// Both should now work; previously only the first did.
await llm.generate([
  { role: 'user', content: 'Describe.', mediaPath: 'file:///absolute/path/to/img.jpg' },
]);
await llm.generate([
  { role: 'user', content: 'Describe.', mediaPath: '/absolute/path/to/img.jpg' },
]);
```

### Related issues

Addresses item 3 of #1086.

### Checklist

- [ ] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have updated the documentation accordingly
- [ ] My changes generate no new warnings

### Additional notes

The normalizer is module-scope (matching `messagesForChatTemplate` from
#1089) rather than a class method because it doesn't depend on
controller state.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LLMController.generate() collected imagePaths from messages with a
mediaPath but did not transform their content into the array form
([{type:'image'}, {type:'text', text}]) that the chat template needs
to emit the image placeholder. Calling generate() directly with a
vision-capable model thus threw "More images paths provided than
'<image>' placeholders in prompt" from native. sendMessage() worked
because it built its own historyForTemplate that did the transformation.

Move the transformation into applyChatTemplate so both call sites get
correct behavior, and remove the now-redundant historyForTemplate block
from sendMessage. Public Message.content type unchanged; external
callers always pass plain strings, the controller handles the array
form internally.

Refs #1086 (items 1 and 2 — with item 1 fixed, item 2's type mismatch
no longer surfaces because external callers never need to construct
the array form themselves).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@msluszniak msluszniak force-pushed the fix/generate-multimodal-content-shaping branch from 61874ea to d62a91e on April 22, 2026 15:41
@msluszniak msluszniak merged commit b271aa6 into main Apr 22, 2026
4 of 5 checks passed
@msluszniak msluszniak deleted the fix/generate-multimodal-content-shaping branch April 22, 2026 16:08