Skip to content

[api][integrations] Support models' native structured output (foundation + OpenAI)#843

Open
weiqingy wants to merge 1 commit into
apache:mainfrom
weiqingy:280-pr1-foundation-openai
Open

[api][integrations] Support models' native structured output (foundation + OpenAI)#843
weiqingy wants to merge 1 commit into
apache:mainfrom
weiqingy:280-pr1-foundation-openai

Conversation

@weiqingy

@weiqingy weiqingy commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Linked issue: #280

Purpose of change

Today output_schema is honored only by prompt-engineering and parsing the response text; no integration uses a provider's native structured-output API. This PR adds the foundation for native structured output at the chat-model connection layer, plus the OpenAI implementation, in Java and Python. It's the first in a small stack under #280 (Azure/Ollama, Anthropic, DashScope follow; ReActAgent final-output wiring is a separate follow-up).

How it works:

  • The output schema is carried to the connection via a reserved key (__structured_output_schema__) in the existing modelParams/kwargs map, so the abstract chat() signature is unchanged.
  • Each connection declares a boolean capability (supportsNativeStructuredOutput() / supports_native_structured_output), default false.
  • A connection applies the native API only when a schema is present, no tools are bound, the schema is a POJO/BaseModel (not RowTypeInfo), and the setup is same-language. The key is always stripped before the SDK call so it can't leak into a request.
  • The prompt path is kept as the fallback. In the ReAct loop tools are always bound, so the native path stays dormant and existing behavior is unchanged.

OpenAI applies response_format json_schema strict. Other connections only strip the reserved key for now. The same-language guard avoids marshaling a schema object across the Pemja bridge, where native structured output can't work anyway (a Java Class is not a Python BaseModel).

Tests

Unit tests with the SDK mocked (no network): native applied with schema and no tools (Java + Python); not applied when tools are bound or for RowTypeInfo; the reserved key never reaches a provider SDK; the same-language threading guard; and existing ReActAgent prompt-path tests remain green.

API

Yes — additive only. BaseChatModelConnection gains a public reserved-key constant and a protected capability method (default false); no existing signatures change.

Documentation

  • doc-needed
  • doc-not-needed
  • doc-included

…ion + OpenAI)

Add the foundation for using a model provider's native structured-output
capability at the chat-model connection layer, plus the OpenAI implementation,
in both Java and Python. Previously output_schema was honored only by
prompt-engineering the request and parsing the response text.

The request's output schema is carried to the connection through a reserved
key in the existing modelParams/kwargs map, so the abstract chat() signature
is unchanged. Each connection declares a boolean native-structured-output
capability (default false). A connection applies the native API only when a
schema is present, no tools are bound on the call, the schema is a POJO
(Java) / BaseModel (Python) rather than a RowTypeInfo, and the setup is
same-language. The reserved key is always removed before the SDK call so it
cannot leak into a provider request. The prompt-engineered path is retained
as the fallback and is unaffected: in the ReAct loop tools are always bound,
so the native path stays dormant there.

OpenAI applies response_format json_schema with strict validation. Other
connections only strip the reserved key; their native paths and the ReActAgent
final-output wiring follow in later changes.
@github-actions github-actions Bot added doc-not-needed Your PR changes do not impact docs fixVersion/0.3.0 The feature or bug should be implemented/fixed in the 0.3.0 version. priority/major Default priority of the PR or issue. and removed doc-not-needed Your PR changes do not impact docs labels Jun 12, 2026
@wenjin272 wenjin272 added fixVersion/0.4.0 and removed fixVersion/0.3.0 The feature or bug should be implemented/fixed in the 0.3.0 version. labels Jun 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs fixVersion/0.4.0 priority/major Default priority of the PR or issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants