
⚡️ Add few-shot examples and a "SQON Cheat Sheet" to improve LLM SQON Generation #1078

Open
mistryrn wants to merge 1 commit into feat/admin_177-mcp-execute-query-tool from feat/admin_177-mcp-execute-query-supplemental


@mistryrn

Summary

Enhancements to PR #1077 to improve the likelihood that smaller LLMs such as gemma-4-e4b can successfully generate valid SQON for use with the execute-query tool. Adds few-shot examples of valid SQON, as well as clear instructions on how to build SQON from a user's query.

Issues

Description of Changes

Initial testing of the execute-query tool with a small local LLM (gemma-4-e4b) revealed that despite having access to the full SQON schema from the get-sqon-schema tool, small LLMs still failed to produce valid SQON. An analysis led by a Claude thinking model determined 5 potential root causes for the failures, summarized below:

  1. Prior training for the model (based on outdated/inaccurate Arranger docs) conflicts with the schema
  2. get-sqon-schema returns a validation artifact, not a generation guide
  3. The published JSON schema contains dangling $refs
  4. The sqon input parameter to execute-query is Zod.unknown(), which provides zero structure at the protocol level
  5. Error responses for invalid SQON provide little helpful detail for LLM self-correction
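Root cause 3 can be checked mechanically. The following is a minimal sketch, not code from this PR: `resolveLocalRef` and `findDanglingRefs` are hypothetical helpers, and the sketch assumes that only local `#/...` references matter, so any external reference is reported as unresolved.

```typescript
type JsonValue = null | boolean | number | string | JsonValue[] | { [key: string]: JsonValue };

// Resolve a local JSON Pointer reference such as "#/definitions/Foo" against the root schema.
// Returns undefined when the target does not exist or the ref is not local.
function resolveLocalRef(root: JsonValue, ref: string): JsonValue | undefined {
  if (!ref.startsWith("#")) return undefined; // assumption: external refs are out of scope here
  const segments = ref
    .slice(1)
    .split("/")
    .filter((segment) => segment.length > 0)
    .map((segment) => segment.replace(/~1/g, "/").replace(/~0/g, "~"));
  let node: JsonValue | undefined = root;
  for (const segment of segments) {
    if (node === null || typeof node !== "object") return undefined;
    node = Array.isArray(node) ? node[Number(segment)] : node[segment];
    if (node === undefined) return undefined;
  }
  return node;
}

// Walk the whole schema and collect every $ref string that cannot be resolved.
function findDanglingRefs(root: JsonValue): string[] {
  const dangling: string[] = [];
  const visit = (node: JsonValue): void => {
    if (node === null || typeof node !== "object") return;
    if (Array.isArray(node)) {
      node.forEach((item) => visit(item));
      return;
    }
    for (const [key, value] of Object.entries(node)) {
      if (key === "$ref" && typeof value === "string") {
        if (resolveLocalRef(root, value) === undefined) dangling.push(value);
      } else {
        visit(value);
      }
    }
  };
  visit(root);
  return dangling;
}
```

Running a check like this against the `structuredContent` returned by get-sqon-schema would list any references an LLM (or a validator) cannot follow.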

The docs have since been updated (thanks, @justincorrigible!), so this PR focuses on issues 1 and 2 as follows:

  1. Provide few-shot examples of valid SQON at every touchpoint the LLM has in the execute-query/SQON-generation flow, aiming to take precedence over any prior training.
  2. Update the text response of get-sqon-schema to be a SQON generation guide, while keeping the structuredContent as the full JSON schema.
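The second enhancement relies on an MCP tool result carrying both a `content` array of text blocks and a `structuredContent` object. A rough sketch of that split follows; `buildSchemaToolResult`, `SchemaToolResult`, and the placeholder strings are hypothetical names for illustration, not the PR's implementation.

```typescript
// Shape of the tool result used in this sketch: readable text for the model,
// plus machine-readable data for clients that consume structured output.
interface SchemaToolResult {
  content: { type: "text"; text: string }[];
  structuredContent: Record<string, unknown>;
}

// Hypothetical helper: the cheat sheet goes in the text block,
// the unmodified JSON schema goes in structuredContent.
function buildSchemaToolResult(
  cheatSheet: string,
  jsonSchema: Record<string, unknown>,
): SchemaToolResult {
  return {
    content: [{ type: "text", text: cheatSheet }],
    structuredContent: jsonSchema,
  };
}

// Placeholder inputs; the real cheat sheet and schema come from the server code.
const exampleSchemaResult = buildSchemaToolResult(
  "SQON cheat sheet (placeholder text for this sketch).",
  { type: "object" },
);
```

Keeping the schema untouched in `structuredContent` means clients that validate against it see no change, while the model reads the guide first.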

Local re-testing with gemma-4-e4b following these 2 enhancements has shown promising results so far, allowing the model to generate valid SQON for some basic example queries.

MCP Server

  • Added few-shot examples of valid SQON to the sqon parameter description in the execute-query tool
  • Added a "SQON Cheat Sheet" as the text response from the get-sqon-schema tool, while still returning the full schema response in structuredContent
    • The cheat sheet is structured to help LLMs understand how to structure SQON and build a query, with examples to reinforce best practices and override any erroneous/outdated training data
  • Updated the execute-query tool description to reinforce use of the "SQON cheat sheet"
  • Updated get-sqon-schema integration tests to reflect new response structure
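To illustrate the first bullet, few-shot examples can be rendered into a parameter description string. This is a sketch under assumptions: the field names and values are invented, `buildSqonParamDescription` is a hypothetical helper, and the `op`/`content`/`fieldName`/`value` node shape should be checked against `SqonSchema` before reuse.

```typescript
interface FewShotExample {
  question: string; // natural-language request
  sqon: unknown; // SQON the model is expected to produce for it
}

// Illustrative examples only; field names and values are made up for this sketch.
const fewShotExamples: FewShotExample[] = [
  {
    question: "Female donors",
    sqon: { op: "in", content: { fieldName: "gender", value: ["Female"] } },
  },
  {
    question: "Female donors with lung as the primary site",
    sqon: {
      op: "and",
      content: [
        { op: "in", content: { fieldName: "gender", value: ["Female"] } },
        { op: "in", content: { fieldName: "primary_site", value: ["Lung"] } },
      ],
    },
  },
];

// Render one line per example so the description stays compact and copyable.
function buildSqonParamDescription(examples: FewShotExample[]): string {
  const lines = examples.map(
    (example) => `- ${example.question}: ${JSON.stringify(example.sqon)}`,
  );
  return ["SQON filter object. Examples of valid SQON:", ...lines].join("\n");
}

const sqonParamDescription = buildSqonParamDescription(fewShotExamples);
```

Serializing the examples with `JSON.stringify` keeps them syntactically valid JSON, which matters more for small models than prose explanations of the shape.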

Special Instructions

There are no additional steps required to test these changes.

Readiness Checklist

  • Self Review
    • I have performed a self review of code
    • I have run the application locally and manually tested the feature
    • I have checked all updates to correct typos and misspellings
  • Formatting
    • Code follows the project style guide
    • Automated code formatters (e.g. Prettier) have been run
  • Local Testing
    • Successfully built all packages locally
    • Successfully ran all test suites, all unit and integration tests pass
  • Updated Tests
    • Unit and integration tests have been added that describe the bug that was fixed or the features that were added
  • Documentation
    • All new environment variables added to .env.schema file and documented in the README
    • All changes to server HTTP endpoints have open-api documentation
    • All new functions exported from their module have TSDoc comment documentation

… Generation

* Added few-shot examples of valid SQON to the `sqon` parameter description in the `execute-query` tool
* Added a "SQON Cheat Sheet" as the `text` response from the `get-sqon-schema` tool, while still returning the full schema response in `structuredContent`
  * The cheat sheet is structured to help LLMs understand how to structure SQON and build a query, with examples to reinforce best practices and override any erroneous/outdated training data
* Updated the `execute-query` tool description to reinforce utilizing the "SQON cheat sheet"
* Updated `get-sqon-schema` integration tests to reflect new response structure
Comment thread .dev/tech-debt.md
**Fix:** (a) Docs/schema comments pass: update README.md:13, docs/usage/02-arranger-components.md (title + line 29), configs.json.schema:28, and console strings in configs/index.ts. (b) Identifier rename pass (separate commit): buildCatalogsFromFolder -> buildCatalogsFromDirectory, folderName -> directoryName in apps/search-server/src/configs/. (c) Cross-references: add pointer to docs/concepts.md early in docs/usage/02-arranger-components.md; introduce "filter clause" for leaf nodes in docs/sqon/03-sqon-in-detail.md.
**Standalone:** yes; (a) is docs-only; (b) is a mechanical rename; (c) is a docs addition. All three independent.

### `docs/concepts.md` SQON examples use `field` instead of `fieldName`

@mistryrn (Contributor, Author) commented:

Removing this from tech debt, as the docs in question have been corrected as of 834ad94#diff-c282aa25f1dabaaba19197006c4aa6bab00e8d2b2100375c42b41b7cf4bc55cb

* SQON schema in `@overture-stack/sqon` (modules/sqon/src/schema): the operator names and node
* shapes below are derived from it. Every example here is checked against `SqonSchema`.
*/
const SQON_CHEAT_SHEET = `SQON cheat sheet (Serializable Query Object Notation).

@mistryrn (Contributor, Author) commented:

This is the result of a few revisions of a "SQON Cheat Sheet". It is intended to be returned as text alongside the full JSON schema response (which is returned as structuredContent) to guide the LLM in generating valid SQON. It can certainly be refined further, but so far the results are promising 🙏

@mistryrn marked this pull request as ready for review June 25, 2026 14:38