fix(dotAI): QA fixes - image model default, playground overflow, embeddings host resolution by ihoffmann-dot · Pull Request #35716 · dotCMS/core

ihoffmann-dot · 2026-05-14T22:00:37Z

Summary

Bug fixes found during QA testing of the dotAI integration.

Default image model: update the example providerConfig JSON in the dotAI config UI from dall-e-3 to gpt-image-1
Image Playground overflow: fix JSON.stringify(json, 2) (wrong signature — JSON was not pretty-printed) and add overflow-wrap, word-break, and <pre> wrapping so the response JSON no longer overflows horizontally; add max-height + overflow-y: auto on the <pre> to prevent vertical overflow
Embeddings host resolution: EmbeddingContentListener and BulkEmbeddingsRunner were calling getEmbeddingsAPI() without a host, causing EmbeddingsAPIImpl to always fall back to System Host config — indexing silently failed with "no provider config" when dotAI was configured on a specific site only
Search form validation: form submitted even when required fields were empty; added reportValidity() check before firing the request and required attributes on the index select and prompt textarea
Model dropdown empty: writeModelToDropdown crashed silently due to a wrong variable reference (indexName instead of modelName) and read availableModels which no longer exists in the config response; rewritten to parse providerConfig.chat.model and populate one option per configured model
Missing index error: /api/v1/ai/search returned { dotCMSResults: [] } when a non-existent indexName was passed, with no indication the index was missing; now returns 404 with a descriptive error listing known indexes

Changes

dot-ai-config-detail.component.ts: dall-e-3 → gpt-image-1 in EXAMPLE_CONFIG
dotai.js: fix JSON.stringify call, add overflow styles, wrap JSON in <pre> with max-height; add reportValidity() guard in doSearchChatJson; rewrite writeModelToDropdown to read from providerConfig.chat.model
render.jsp: add required to index select and prompt textarea; add value="" to select placeholder option
EmbeddingContentListener.java: extract resolveHost(hostId) helper, pass resolved Host to getEmbeddingsAPI(host) in both addToIndexesIfNeeded and deleteFromIndexes
BulkEmbeddingsRunner.java: resolve host per-contentlet and pass to getEmbeddingsAPI(host) (same bug as EmbeddingContentListener, different code path — triggered by the manual "Build Index" button)
SearchResource.java: validate index existence via countEmbeddingsByIndex() before running the query; return 404 if not found; pass host to getEmbeddingsAPI(host)

Related Issue

This PR fixes #35730

…embeddings host resolution

claude · 2026-05-14T22:01:34Z

Claude finished @ihoffmann-dot's task in 2m 49s —— View job

PR Review

Mostly solid bug-fixing PR. A few things worth a second look:

🔴 `SearchResource.searchByPost` — perf regression on every search

SearchResource.java:130-131 now calls countEmbeddingsByIndex() on every search to validate the index name. That SQL (EmbeddingsSQL.COUNT_EMBEDDINGS_BY_INDEX) is a full-table aggregation over dot_embeddings with COUNT(*), SUM, COUNT(distinct(inode)), STRING_AGG, AVG grouped by index_name. On any sizeable embeddings table this is expensive — and it runs before every legitimate search just to produce a 404 for the unlikely typo case. Consider a cheap existence probe instead (e.g. a SELECT 1 FROM dot_embeddings WHERE index_name = ? LIMIT 1) or a small cache; the heavy aggregation belongs in /info-style endpoints, not the hot search path.

Also: leaking all index names in the error body is fine for an admin-only endpoint, but worth confirming this resource isn't reachable by frontend users — the requiredFrontendUser(true) init suggests it is, in which case the message effectively enumerates available indexes to any logged-in frontend user.

🟡 `LangChain4jAIClient.toImageContent` — NPE risk

LangChain4jAIClient.java:411

final String url = part.optJSONObject("image_url").optString("url", "");

optJSONObject("image_url") returns null when the key is missing or is not a JSON object (e.g. a bare string "image_url": "https://…", which is a real-world shape some clients send). That NPEs the whole toMessages call. Guard with a null check and fall back to optString("image_url", "") for the string-shaped variant.

The tests test_toMultimodalUserMessage_* only cover the well-formed object shape, so this hole isn't caught.

🟡 `EmbeddingContentListener.deleteFromIndexes` — dead-looking call

EmbeddingContentListener.java:157

getConfigJson(contentlet.getHost());

Return value is discarded. The only effect is the implicit getAppConfig → isEnabled throw inside. Either drop it or call getAppConfig(contentlet.getHost()) directly — readers will spend time wondering what this line does. Same call lower down in addToIndexesIfNeeded resolves the host a second time (once via resolveHost, once via getConfigJson → getAppConfig → resolveHost); minor waste, but tidying it would localise the host lookup.

🟡 `render.jsp` size field — UX regression, doesn't match Kevin's ask

render.jsp:364 replaces a 3-option <select> with a readonly <input type="text" value="1024x1024">. Kevin's review comment suggested either an <input type="number"> (so users can pick sizes that fit the model) or just keeping a single square value. The chosen pattern — an input that looks editable but isn't — is the worst of both worlds. Either remove the field entirely (and have doImageJson hard-code 1024x1024) or make it actually editable.

🟡 `dotai.js writeModelToDropdown` — silent split semantics

dotai.js:131-148 parses providerConfig.chat.model and splits on ,. That matches the convention elsewhere, but:

the (default) label is now always the first split entry, regardless of what the server actually picks at request time — this can mislead users when fallback to a later model kicks in
dotAiState.config.providerConfig is assumed to be a string; if it's already an object somewhere (depends on how the config endpoint serialises it), JSON.parse throws and the dropdown silently stays empty (the console.error is the only signal, which the QA case earlier missed)

Worth confirming the shape of dotAiState.config.providerConfig across all config endpoints, and considering surfacing the empty-models case in the UI rather than just the console.

🟢 Pre-existing but worth noting (touched by this PR)

dotai.js:573-596 builds the result panel via innerHTML with raw ${rewrittenPrompt} and ${jsonString} interpolation. The provider response is trusted today, but this is an XSS sink waiting for the day someone wires up a less-vetted provider. Not introduced here — flagging because the PR is editing this exact template.

🟢 Clean

BulkEmbeddingsRunner host resolution + integration test exercising the site-only-config path: good fix, good regression coverage.
LangChain4jAIClient.toMessages multimodal handling is the right shape; tests confirm the array isn't stringified.
dot-ai-config-detail.component.ts dall-e-3 → gpt-image-1: trivially correct.

…response height

…esponse

…I config

wezell

nice.

KevinDavilaDotCMS · 2026-05-20T19:58:02Z

I think we can use input number instead of the dropdown with hardcoded values in

                        <td>
                            <select name="size" style="min-width:400px;">
                                <option value="1024x1024">1024x1024 (Square)</option>
                                <option value="1024x1792">1024x1792 (Vertical)</option>
                                <option value="1792x1024" selected>1792x1024 (Horizontal)</option>


                            </select>
                        </td>

Because some models cannot generate in the hardcoded values

OR, since this is a sandbox, just keep square here, maybe the most supported value from models

…ropdown logging

…s include actual image data

fix(dotAI): update default image model, fix playground overflow, fix …

cf8add9

…embeddings host resolution

github-project-automation Bot added this to dotCMS - Product Planning May 14, 2026

github-actions Bot added Area : Backend PR changes Java/Maven backend code Area : Frontend PR changes Angular/TypeScript frontend code labels May 14, 2026

ihoffmann-dot changed the title ~~fix(dotAI): QA fixes — image model default, playground overflow, embeddings host resolution~~ fix(dotAI): QA fixes - image model default, playground overflow, embeddings host resolution May 14, 2026

claude Bot added the AI: Safe To Rollback label May 14, 2026

fix embeddings host resolution in BulkEmbeddingsRunner and cap image …

88ba866

…response height

github-actions Bot removed the AI: Safe To Rollback label May 14, 2026

claude Bot added the AI: Safe To Rollback label May 14, 2026

nollymar reviewed May 15, 2026

View reviewed changes

Comment thread dotCMS/src/main/java/com/dotcms/ai/api/BulkEmbeddingsRunner.java

fix search form validation, model dropdown, and missing index error r…

546cc42

…esponse

github-actions Bot removed the AI: Safe To Rollback label May 18, 2026

ihoffmann-dot added 3 commits May 18, 2026 16:47

add integration tests for embeddings host resolution with site-only A…

903196d

…I config

Merge branch 'main' into dot-ai-langchain-qa-fixes

986ee28

Merge branch 'main' into dot-ai-langchain-qa-fixes

afac943

claude Bot added the AI: Safe To Rollback label May 19, 2026

wezell approved these changes May 19, 2026

View reviewed changes

KevinDavilaDotCMS reviewed May 20, 2026

View reviewed changes

Comment thread dotCMS/src/main/webapp/html/portlet/ext/dotai/dotai.js

KevinDavilaDotCMS reviewed May 20, 2026

View reviewed changes

Comment thread dotCMS/src/main/webapp/html/portlet/ext/dotai/dotai.js

fix(dotAI): address Kevin's review comments on image size and model d…

d45ba87

…ropdown logging

github-actions Bot removed the AI: Safe To Rollback label May 21, 2026

ihoffmann-dot enabled auto-merge May 21, 2026 00:50

claude Bot added the AI: Safe To Rollback label May 21, 2026

ihoffmann-dot disabled auto-merge May 21, 2026 01:40

fix: parse multimodal content arrays in toMessages() so vision prompt…

c598a2c

…s include actual image data

github-actions Bot removed the AI: Safe To Rollback label May 21, 2026

claude Bot added the AI: Safe To Rollback label May 21, 2026

refactor: extract toImageContent helper from toMultimodalUserMessage

28fbdd7

ihoffmann-dot enabled auto-merge May 21, 2026 02:13

github-actions Bot removed the AI: Safe To Rollback label May 21, 2026

claude Bot added the AI: Safe To Rollback label May 21, 2026

ihoffmann-dot added this pull request to the merge queue May 21, 2026

Any commits made after this event will not be merged.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(dotAI): QA fixes - image model default, playground overflow, embeddings host resolution#35716

fix(dotAI): QA fixes - image model default, playground overflow, embeddings host resolution#35716
ihoffmann-dot wants to merge 9 commits into
mainfrom
dot-ai-langchain-qa-fixes

ihoffmann-dot commented May 14, 2026 •

edited

Loading

Uh oh!

claude Bot commented May 14, 2026 •

edited

Loading

Uh oh!

Uh oh!

wezell left a comment

Uh oh!

Uh oh!

Uh oh!

KevinDavilaDotCMS commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

ihoffmann-dot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Related Issue

Uh oh!

claude Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review

🔴 SearchResource.searchByPost — perf regression on every search

🟡 LangChain4jAIClient.toImageContent — NPE risk

🟡 EmbeddingContentListener.deleteFromIndexes — dead-looking call

🟡 render.jsp size field — UX regression, doesn't match Kevin's ask

🟡 dotai.js writeModelToDropdown — silent split semantics

🟢 Pre-existing but worth noting (touched by this PR)

🟢 Clean

Uh oh!

Uh oh!

wezell left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

KevinDavilaDotCMS commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ihoffmann-dot commented May 14, 2026 •

edited

Loading

claude Bot commented May 14, 2026 •

edited

Loading

🔴 `SearchResource.searchByPost` — perf regression on every search

🟡 `LangChain4jAIClient.toImageContent` — NPE risk

🟡 `EmbeddingContentListener.deleteFromIndexes` — dead-looking call

🟡 `render.jsp` size field — UX regression, doesn't match Kevin's ask

🟡 `dotai.js writeModelToDropdown` — silent split semantics