fix(mcp): keep read-tool responses under the size cap by paging items, not truncating content by Vishnuujain · Pull Request #29732 · open-metadata/OpenMetadata

Vishnuujain · 2026-07-03T11:43:51Z

Following this work: https://github.com/open-metadata/openmetadata-collate/issues/4825

Problem

When an MCP read tool's response was too large (over the ~100k dispatch cap), the whole payload was thrown away and replaced with an empty stub. The agent got nothing back. Some tools also cut descriptions / SQL / DDL at a fixed length, silently losing the exact content the call was for.

Follow-up to #29713 (same bug for get_entity_details / wide tables). Together they close #29707.

Fix

One shared helper, ResponseBudget, measures each item's real size and keeps as many whole items as fit. Size is now controlled by returning fewer items, never by trimming an item's content.
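The item-level fitting described above can be sketched as follows. This is a minimal illustration, not the PR's code: the class name `ResponseBudgetSketch`, the `Fit` record, and the `sizeOf` parameter are hypothetical, and only the names `fitWithin` and `usedChars` appear in the review below (their real signatures are not shown in this thread). It assumes Java 16+ for records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

/** Hypothetical sketch of an item-level response budget; not the PR's ResponseBudget. */
final class ResponseBudgetSketch {

  /** The items that fit, the characters they use, and whether anything was left out. */
  record Fit<T>(List<T> kept, int usedChars, boolean hasMore) {}

  /**
   * Keeps the longest prefix of whole items whose measured sizes fit in budgetChars.
   * The first item is always kept, even if it alone is over budget, so a paging
   * caller always makes forward progress.
   */
  static <T> Fit<T> fitWithin(List<T> items, int budgetChars, ToIntFunction<T> sizeOf) {
    List<T> kept = new ArrayList<>();
    int used = 0;
    for (T item : items) {
      int size = sizeOf.applyAsInt(item);
      if (!kept.isEmpty() && used + size > budgetChars) {
        break; // drop this item and everything after it; never cut an item's content
      }
      kept.add(item);
      used += size;
    }
    return new Fit<>(kept, used, kept.size() < items.size());
  }

  public static void main(String[] args) {
    List<String> results = List.of("a".repeat(40), "b".repeat(40), "c".repeat(40));
    Fit<String> fit = fitWithin(results, 100, String::length);
    System.out.println(fit.kept().size() + " kept, hasMore=" + fit.hasMore());
  }
}
```

With three 40-character items and a budget of 100, the first two items are kept whole and `hasMore` is true; the third item is left for the next page rather than being shortened.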

| Tool | Change |
| --- | --- |
| `search_metadata`, `semantic_search`, `search_company_context` | Fit result list to budget; drop only trailing items; set `hasMore` + next-page hint. Item content untouched. |
| `get_entity_lineage` | Full edge SQL kept. Oversized graph returns a real partial graph (edges split across up/down) with `Returned`/`Total` markers, not bare counts. |
| `root_cause_analysis` | Full descriptions + SQL kept. Fits the edge lists to budget instead of dropping to a hint. |
| `get_test_definitions` | Caps `limit`; if a page still overflows, shrinks and refetches. Native cursor stays consistent, so no row is skipped. |
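The "cap `limit`, then shrink and refetch" behavior listed for `get_test_definitions` might look roughly like the sketch below. Everything here is an assumption made for illustration: the `Page` record, the `MAX_LIMIT` value, the halving strategy, and the in-memory `fakeFetch` stand-in are hypothetical and do not come from the PR. The point it shows is that refetching from the same cursor with a smaller limit means the returned next-cursor always follows the last row actually returned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical sketch of "cap the limit, then shrink and refetch"; not the PR's code. */
final class ShrinkAndRefetchSketch {

  /** One page of rows plus the cursor to pass as `after` for the next page (null at the end). */
  record Page(List<String> rows, String nextCursor) {}

  static final int MAX_LIMIT = 50; // assumed cap, chosen only for illustration

  static Page fetchWithinBudget(
      BiFunction<String, Integer, Page> fetch, String after, int requestedLimit, int budgetChars) {
    int limit = Math.max(1, Math.min(requestedLimit, MAX_LIMIT));
    while (true) {
      // Always refetch from the same cursor, so the page's nextCursor stays consistent.
      Page page = fetch.apply(after, limit);
      int size = page.rows().stream().mapToInt(String::length).sum();
      if (size <= budgetChars || limit == 1) {
        return page; // fits, or cannot shrink further (a single row is returned whole)
      }
      limit = limit / 2; // limit is at least 2 here, so this never reaches 0
    }
  }

  /** In-memory stand-in for the real listing API: eight rows of 30 characters each. */
  static Page fakeFetch(String after, int limit) {
    List<String> all = new ArrayList<>();
    for (int i = 0; i < 8; i++) {
      all.add("row" + i + ":" + "x".repeat(25));
    }
    int start = after == null ? 0 : Integer.parseInt(after);
    int end = Math.min(start + limit, all.size());
    return new Page(all.subList(start, end), end < all.size() ? String.valueOf(end) : null);
  }

  public static void main(String[] args) {
    Page first = fetchWithinBudget(ShrinkAndRefetchSketch::fakeFetch, null, 8, 100);
    Page second = fetchWithinBudget(ShrinkAndRefetchSketch::fakeFetch, first.nextCursor(), 8, 100);
    System.out.println(first.rows().size() + " rows, then " + second.rows().get(0).substring(0, 4));
  }
}
```

In this toy setup a request for 8 rows against a 100-character budget shrinks from 8 to 4 to 2 rows, and the second call resumes at `row2`, so no row is skipped or repeated.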

get_company_context and single-entity get_entity_details are left full on purpose: no list to page, and their big fields are the point of the call.

Guarantee: no crucial metadata is truncated to save space.

gitar-bot · 2026-07-03T12:03:31Z

Code Review ✅ Approved 3 resolved / 3 findings

Replaces silent empty-stub nukes and field truncation with a shared ResponseBudget utility that trims result lists to fit the dispatch cap while keeping each returned item whole. The changes resolve the previous findings on inaccurate offsets, budget imbalance, and oversized single items.

✅ 3 resolved

✅ Bug: search_metadata trim message gives wrong 'from' offset for paged calls

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/SearchMetadataTool.java:393-398 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/SearchMetadataTool.java:129-141 📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/SearchMetadataTool.java:265-266
search_metadata supports offset paging via the from parameter (parsed at lines 129-141 and applied to the SearchRequest at 230/240). When a response is trimmed to fit the budget, the guidance message tells the caller to Fetch more with 'from'=%d using trimmed.size() — i.e. only the count returned in the current page. But from is never propagated into buildEnhancedSearchResponse/fitResultsToBudget, so for any request that already had from > 0 the suggested next offset is wrong.

Example: a caller requests from=20 and gets 5 results back after trimming (results 20 to 24). The message says from=5, which re-fetches results the caller has already seen instead of continuing at 25. An agent following this guidance duplicates rows and never advances, defeating the correct-paging goal this PR is built around. The hint is correct only for the default from=0 case.

Fix: thread the original from through and emit from + trimmed.size().
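The arithmetic behind this fix is small enough to show directly. The sketch below is illustrative only: the class and method names are hypothetical, and the message text just reuses the `'from'=%d` fragment quoted in the finding.

```java
/** Hypothetical sketch of an absolute next-page hint; not the PR's SearchMetadataTool code. */
final class NextOffsetHintSketch {

  /** The next offset is the offset of this page plus the number of items actually returned. */
  static int nextFrom(int from, int returnedCount) {
    return from + returnedCount;
  }

  static String trimMessage(int from, int returnedCount) {
    return String.format(
        "Results trimmed to fit the response budget. Fetch more with 'from'=%d",
        nextFrom(from, returnedCount));
  }

  public static void main(String[] args) {
    // Page requested at from=20, trimmed to 5 items: the caller should continue at 25.
    System.out.println(trimMessage(20, 5));
  }
}
```

For the default `from=0` case this gives the same value as the old `trimmed.size()` hint, which is why the bug only showed up on later pages.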

✅ Edge Case: RCA edge fitting never rebalances unused budget between directions

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/RootCauseAnalysisTool.java:365-374
fitEdgeLists splits the available budget 50/50: upstream is fitted within half, downstream within available - up.usedChars(). Unlike GetLineageTool.fitGraphToBudget (which re-fits upstream with available - down.usedChars() when downstream leaves room), RCA has no reverse rebalance. When one direction is empty or small and the other is large, the large direction is capped at half even though the full budget is available, needlessly withholding edges (and marking truncated) when everything could have fit in one direction. Not a correctness bug — the response stays valid and under cap — but it under-returns data in the common asymmetric RCA case (typically only upstream failing edges exist). Consider mirroring the lineage rebalance so a single-direction analysis can use the whole budget.
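A rough sketch of the rebalance this finding asks for, assuming edges are measured by string length and fitted as a simple prefix. The class, records, and `fitPrefix` helper are hypothetical and simplified (for instance, there is no forward-progress guarantee here); they are not the code in `RootCauseAnalysisTool` or `GetLineageTool`.

```java
import java.util.List;

/** Hypothetical sketch of a two-direction budget split with rebalance; not the PR's code. */
final class EdgeBudgetRebalanceSketch {

  record Fit(List<String> kept, int usedChars) {}

  record Split(Fit upstream, Fit downstream) {}

  /** Keeps the longest prefix of whole edges whose total length fits in budgetChars. */
  static Fit fitPrefix(List<String> edges, int budgetChars) {
    int used = 0;
    int count = 0;
    for (String edge : edges) {
      if (used + edge.length() > budgetChars) {
        break;
      }
      used += edge.length();
      count++;
    }
    return new Fit(edges.subList(0, count), used);
  }

  static Split fitBothDirections(List<String> up, List<String> down, int availableChars) {
    // First pass: upstream gets half, downstream gets whatever upstream left over.
    Fit upFit = fitPrefix(up, availableChars / 2);
    Fit downFit = fitPrefix(down, availableChars - upFit.usedChars());
    // Reverse rebalance: if upstream was cut short, let it use what downstream did not need.
    if (upFit.kept().size() < up.size()) {
      upFit = fitPrefix(up, availableChars - downFit.usedChars());
    }
    return new Split(upFit, downFit);
  }

  public static void main(String[] args) {
    List<String> upstream = List.of("u".repeat(20), "u".repeat(20), "u".repeat(20),
        "u".repeat(20), "u".repeat(20), "u".repeat(20));
    Split split = fitBothDirections(upstream, List.of(), 100);
    System.out.println(split.upstream().kept().size() + " upstream edges kept");
  }
}
```

With six 20-character upstream edges, no downstream edges, and a budget of 100, the first pass keeps only two upstream edges; the rebalance step raises that to five while the combined size stays within the budget.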

✅ Edge Case: Single oversized item still exceeds cap and re-triggers empty-stub nuke

📄 openmetadata-mcp/src/main/java/org/openmetadata/mcp/util/ResponseBudget.java:61-65
The forward-progress guarantee in fitWithin returns 1 item even when that single item's serialized size exceeds the whole budget (and potentially MAX_RESPONSE_CHARS). For search/semantic/company-context that keep one huge result, or a lineage/RCA edge carrying a very large SQL (the PR notes ~4KB SQL, but nothing bounds it), the assembled response can still land above the dispatch cap and get discarded to the exact empty-stub the PR is eliminating. This is an inherent trade-off of never truncating content, but it is worth documenting/handling: e.g. detect the single-item-over-cap case and fall back to a minimal stub with an explicit message rather than silently returning a payload that the dispatcher will nuke. At minimum add a test asserting behavior when one kept item alone exceeds MAX_RESPONSE_CHARS.
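One way the suggested fallback could look, as a sketch only. The class name, the `guard` method, the stub's JSON shape, and the concrete `MAX_RESPONSE_CHARS` value are all assumptions; the thread only says the cap is roughly 100k, and the follow-up commit title says the single-item residual was documented, which does not necessarily mean a guard like this was added.

```java
/** Hypothetical sketch of a single-item-over-cap guard; not the PR's ResponseBudget code. */
final class OversizedItemGuardSketch {

  static final int MAX_RESPONSE_CHARS = 100_000; // assumed value for the ~100k dispatch cap

  /**
   * Returns the assembled response unchanged when it fits. Otherwise returns a small stub
   * with an explicit message, instead of a payload the dispatcher would discard silently.
   */
  static String guard(String assembledResponse, int maxChars) {
    if (assembledResponse.length() <= maxChars) {
      return assembledResponse;
    }
    return "{\"truncated\":true,\"message\":\"Response is "
        + assembledResponse.length()
        + " chars, which exceeds the "
        + maxChars
        + "-char cap even with a single item. Narrow the request.\"}";
  }

  public static void main(String[] args) {
    String huge = "x".repeat(MAX_RESPONSE_CHARS + 1);
    System.out.println(guard(huge, MAX_RESPONSE_CHARS));
  }
}
```

The trade-off is the one the finding names: the item's content is still never cut, but the caller gets an explanation instead of an unexplained empty result.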

github-actions · 2026-07-03T14:34:01Z

🔴 Playwright Results — 3 failure(s), 26 flaky

✅ 4474 passed · ❌ 3 failed · 🟡 26 flaky · ⏭️ 38 skipped

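| Shard | Passed | Failed | Flaky | Skipped |
| --- | --- | --- | --- | --- |
| 🟡 Shard 1 | 441 | 0 | 2 | 16 |
| 🔴 Shard 2 | 800 | 2 | 8 | 8 |
| 🟡 Shard 3 | 805 | 0 | 4 | 7 |
| 🟡 Shard 4 | 807 | 0 | 4 | 5 |
| 🔴 Shard 5 | 854 | 1 | 2 | 0 |
| 🟡 Shard 6 | 767 | 0 | 6 | 2 |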

Genuine Failures (failed on all attempts)

❌ Features/BulkEditEntity.spec.ts › Database Schema (shard 2)

Error: Unable to fill the active grid description editor

❌ Features/ContextCenterPermission.spec.ts › user with deleteAll permission can see delete action but not restore action on an archived document, and can delete it (shard 2)

Error: Document 1426a105-b051-416d-8925-005a159ffff9 did not appear in archive API within 60000ms

❌ Pages/ExploreBrowse.spec.ts › service type drill-down disables unrelated roots and query-panel Clear resets it (shard 5)

Test timeout of 180000ms exceeded.

🟡 26 flaky test(s) (passed on retry)

Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: metric (shard 1, 1 retry)
Flow/SearchRBAC.spec.ts › a fully denied user sees neither asset type when browsing (shard 1, 1 retry)
Features/BulkEditEntity.spec.ts › Database (shard 2, 1 retry)
Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
Features/BulkEditEntity.spec.ts › Glossary Term (Nested) (shard 2, 1 retry)
Features/BulkEditOperationBadges.spec.ts › Glossary bulk edit search filters rows and clear restores them (shard 2, 1 retry)
Features/BulkImport.spec.ts › Database Schema (shard 2, 2 retries)
Features/ContextCenter.spec.ts › clicking a memory row opens the view-only modal (shard 2, 1 retry)
Features/DataQuality/TestCaseImportExportE2eFlow.spec.ts › Admin: Complete export-import-validate flow (shard 2, 2 retries)
Features/ExploreQuickFilters.spec.ts › explore tree sidebar selection is not cleared when a top dropdown filter is applied (shard 2, 1 retry)
Features/IncidentManager.spec.ts › Complete Incident lifecycle with table owner (shard 3, 2 retries)
Features/IncidentManager.spec.ts › Resolving incident & re-run pipeline (shard 3, 2 retries)
Features/KnowledgeCenter.spec.ts › Knowledge Center page (shard 3, 1 retry)
Features/SearchExport.spec.ts › Export queues a background job and downloads from the jobs tray (shard 3, 1 retry)
Features/Workflows/WorkflowOssRestrictions.spec.ts › batch-size-input is enabled in OSS (shard 4, 1 retry)
Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 4, 1 retry)
Pages/CustomProperties.spec.ts › Duration (shard 4, 1 retry)
Pages/DataProductODPS.spec.ts › edits data product metadata (type, visibility, priority) via the modal (shard 4, 1 retry)
Pages/ExplorePageRightPanel_KnowledgeCenter.spec.ts › Should remove user owner for knowledgeCenter (shard 5, 1 retry)
Pages/ExplorePageRightPanel.spec.ts › Should verify deleted user not visible in owner selection for pipeline (shard 5, 1 retry)
Pages/Glossary.spec.ts › Glossary creation with domain selection (shard 6, 1 retry)
Pages/GlossaryImportExport.spec.ts › Import partial success - some terms pass, some fail (shard 6, 2 retries)
Pages/Lineage/LineageFilters.spec.ts › Verify Impact Analysis service filter selection (shard 6, 1 retry)
Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab is NOT visible for pipelineService in platform lineage (shard 6, 1 retry)
Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab is NOT visible for apiService in platform lineage (shard 6, 1 retry)
Pages/Users.spec.ts › Create and Delete user (shard 6, 1 retry)

📦 Download artifacts

How to debug locally

# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Vishnuujain added 5 commits July 3, 2026 14:13

feat(mcp): add shared ResponseBudget and fix search_metadata empty-stub nuke by fitting results exactly to budget

98bb655 · Signed-off-by: Vishnu Jain <vishnujtimes@gmail.com>

feat(mcp): return full edge SQL in get_entity_lineage and fit oversized graphs to a partial graph instead of bare counts

46c4ce4 · Signed-off-by: Vishnu Jain <vishnujtimes@gmail.com>

feat(mcp): return full SQL/descriptions in root_cause_analysis and fit edges to budget instead of dropping to a hint

b2f2fb0 · Signed-off-by: Vishnu Jain <vishnujtimes@gmail.com>

feat(mcp): fit semantic_search results to budget so it never nukes to an empty stub

7d90f7d · Signed-off-by: Vishnu Jain <vishnujtimes@gmail.com>

feat(mcp): fit search_company_context results and clamp/shrink get_test_definitions so neither nukes to an empty stub

e5e5d4d

github-actions Bot added the `safe to test` label Jul 3, 2026

gitar-bot Bot reviewed Jul 3, 2026

Comment thread openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/SearchMetadataTool.java Outdated

gitar-bot Bot reviewed Jul 3, 2026

Comment thread openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/RootCauseAnalysisTool.java

gitar-bot Bot reviewed Jul 3, 2026

Comment thread openmetadata-mcp/src/main/java/org/openmetadata/mcp/util/ResponseBudget.java

greptile-apps Bot reviewed Jul 3, 2026

Comment thread openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/RootCauseAnalysisTool.java

Comment thread openmetadata-mcp/src/main/java/org/openmetadata/mcp/tools/SearchMetadataTool.java

Vishnuujain had a problem deploying to test July 3, 2026 12:01 — with GitHub Actions Error

gitarbot/greptile feedback: absolute from-offset hint, trim downstream RCA edge Map with budget rebalance, document single-item residual

5b1c2dc

Vishnuujain had a problem deploying to test July 3, 2026 12:17 — with GitHub Actions Failure

Vishnuujain temporarily deployed to test July 3, 2026 12:17 — with GitHub Actions Inactive

Vishnuujain wants to merge 6 commits into main from mcp-response-robustness