feat: Data Fabric native entity write tool (P1 LDO writes) by UIPath-Harshit · Pull Request #947 · UiPath/uipath-langchain-python

UIPath-Harshit · 2026-06-27T06:04:12Z

Summary

Adds a write_datafabric tool alongside the existing query_datafabric read tool, enabling agents to perform structured CRUD (insert/update/delete) against native Data Fabric entities. Writes use structured mutation intent delegated to EntitiesService native CRUD — no LLM-generated DML.

Includes the optional OWL ontology compiler layer from write RFC v2.

P1 LDO writes coded-agent POC. Status: blocked on customer discovery (DS-8541) — draft for implementation review while discovery completes.

Two layers

1. Metadata-driven writes (always on)

Everything structural is derived from entity metadata — no ontology required:

is_entity_writable — native-only (excludes federated/ChoiceSet/system)
derive_writable_fields — filters system/hidden/PK/attachment; surfaces ChoiceSet bindings
validate_mutation_intent — entity allowlist, required-field + field-allowlist checks, record_id rules
WriteExecutor — insert/update/delete via EntitiesService
build_write_tool_description — NL intermediate representation (replaces raw OWL injection)
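The metadata-only checks listed above could look roughly like the sketch below. It is not the PR's code: `FieldMeta` and `MutationIntent` are hypothetical stand-ins for the real metadata and input models, and the rule that update/delete need a `record_id` is an assumption about what "record_id rules" means.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class FieldMeta:
    # Hypothetical stand-in for per-field entity metadata.
    name: str
    required: bool = False
    is_system: bool = False
    is_hidden: bool = False
    is_primary_key: bool = False
    is_attachment: bool = False


@dataclass
class MutationIntent:
    # Hypothetical structured intent: filled in by the LLM instead of DML.
    entity: str
    operation: str  # "insert", "update" or "delete"
    record_id: Optional[str] = None
    fields: dict[str, Any] = field(default_factory=dict)


def derive_writable_fields(fields: list[FieldMeta]) -> list[FieldMeta]:
    """Drop system, hidden, primary-key and attachment fields."""
    return [
        f for f in fields
        if not (f.is_system or f.is_hidden or f.is_primary_key or f.is_attachment)
    ]


def validate_mutation_intent(
    intent: MutationIntent, schemas: dict[str, list[FieldMeta]]
) -> list[str]:
    """Return a list of human-readable errors; an empty list means valid."""
    # Entity allowlist: only entities with a write schema may be touched.
    if intent.entity not in schemas:
        return [f"entity '{intent.entity}' is not writable"]

    errors: list[str] = []
    if intent.operation not in ("insert", "update", "delete"):
        errors.append(f"unknown operation '{intent.operation}'")

    # Assumed record_id rule: update and delete must address a record.
    if intent.operation in ("update", "delete") and not intent.record_id:
        errors.append(f"{intent.operation} requires record_id")

    # Field allowlist: reject anything outside the derived writable set.
    writable = {f.name: f for f in derive_writable_fields(schemas[intent.entity])}
    unknown = sorted(set(intent.fields) - set(writable))
    if unknown:
        errors.append(f"fields not writable: {', '.join(unknown)}")

    # Required-field check applies to inserts only in this sketch.
    if intent.operation == "insert":
        missing = sorted(
            n for n, f in writable.items() if f.required and n not in intent.fields
        )
        if missing:
            errors.append(f"missing required fields: {', '.join(missing)}")
    return errors
```

Returning a list of errors rather than raising lets the tool report every problem to the agent in a single round trip.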

2. Ontology layer (optional, graceful fallback)

The OWL ontology is the authoring/storage format, compiled into a structured CompiledOntology — NOT injected as raw OWL (research shows LLMs reason poorly over raw OWL Turtle, 3-10% accuracy).

compiled_ontology.py — CompiledOntology (entity_access, measure_fields, state_fields, reference_fields, hitl_operations, entity_relationships)
ontology_compiler.py — compile_ontology(owl_turtle) via rdflib. Extracts what metadata can't: allowed operations per entity, field semantics (state/measure/reference), HITL-on-destructive markers, entity relationships. Supports both ontology dialects (.ttl subClassOf+actions, and RFC a df:WritableEntity+df:allowsOperation).
Wired into DataFabricWriteHandler via best-effort get_ontology_file_async. This platform method does not exist yet in the current platform package (it is only on a feature branch), so the handler degrades gracefully to metadata-only (compiled_ontology stays None) and the build does not break.
validate_mutation_intent gains optional compiled_ontology — rejects operations not in entity_access. State-transition validation deferred to v3.

Modified

| File | Change |
| --- | --- |
| `models.py` | `DataFabricWriteInput`, `WriteResult`, `WritableFieldInfo`, `EntityWriteSchema`, `EntityWriteOperation` |
| `datafabric_tool.py` | `DataFabricWriteHandler` (lazy resolution + ontology compile), `create_datafabric_tools()` |
| `datafabric_prompt_builder.py` | `build_write_context()` |
| `context_tool.py` / `tool_factory.py` | route to tool-list return; HITL propagation |
| `pyproject.toml` | `rdflib>=7.0.0` |

Testing — 109 tests passing

test_write_validation.py (35) — writability, field derivation, validation, ontology-constrained ops
test_write_integration.py (21) — tool creation, args schema, HITL, federated rejection
test_write_schema_builder.py (17) — NL description generation
test_ontology_compiler.py (23) — refund + order-management + RFC dialects, graceful/malformed paths
test_refund_agent_integ.py (12) — contact-center refund hero case
test_write_executor.py (6) — CRUD via EntitiesService mock

Also validated end-to-end on staging via CLI (df-agent-os/tests/integ_refund_agent.sh). The compiler was verified against the actual design ontology df-agent-os/roadmap/p1-owl-write-extension.ttl, and a Turtle syntax bug in that artifact (adjacent-string-literal concatenation) was fixed along the way.

Open questions (RFC §10)

Is ontologySet required for writes, or is metadata-only the permanent fallback?
Measure fields (additive semantics): runtime read-modify-write vs LLM responsibility via SOP?
ChoiceSet value validation at write time — pending live value resolution.

🤖 Generated with Claude Code

Adds a write_datafabric tool alongside the existing query_datafabric read tool for structured CRUD (insert/update/delete) against native Data Fabric entities. Writes use structured mutation intent delegated to EntitiesService native CRUD, with no LLM-generated DML.

Key components:

- DataFabricWriteInput / WriteResult / EntityWriteSchema models
- is_entity_writable: native-only (excludes federated, ChoiceSet, system)
- derive_writable_fields: filters system/hidden/PK/attachment fields, surfaces ChoiceSet bindings
- validate_mutation_intent: entity allowlist, required-field and field-allowlist checks, record_id requirements per operation
- WriteExecutor: insert/update/delete via EntitiesService
- build_write_tool_description: NL intermediate representation for the tool description (replaces raw OWL injection per write RFC v2)
- DataFabricWriteHandler: lazy entity resolution; writability enforced after async resolution since entity_type/external_fields are only on resolved Entity objects
- create_datafabric_tools: returns [read_tool, write_tool]
- HITL: require_conversational_confirmation propagated for conversational agents

87 tests including the contact-center refund hero case (read 4 entities, decide, write RefundRequest + update Order/CustomerRisk/Contact).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

sonarqubecloud · 2026-06-27T06:09:00Z

Quality Gate failed

Failed conditions
89.8% Coverage on New Code (required ≥ 90%)

See analysis details on SonarQube Cloud

Builds the ontology layer from write RFC v2. The OWL ontology is the authoring/storage format; it is compiled into a CompiledOntology intermediate representation (NOT injected as raw OWL; research shows LLMs reason poorly over raw OWL Turtle).

- compiled_ontology.py: CompiledOntology model (entity_access, measure_fields, state_fields, reference_fields, hitl_operations, entity_relationships) per RFC §5.2
- ontology_compiler.py: compile_ontology(owl_turtle) via rdflib. Supports both ontology dialects: the .ttl dialect (rdfs:subClassOf df:WritableEntity + action-derived ops + df:hasField) and the RFC dialect (a df:WritableEntity + df:allowsOperation). Resilient to partial annotations; raises OntologyCompileError only on malformed Turtle.
- write_validation.py: validate_mutation_intent gains optional compiled_ontology and rejects operations not in entity_access. State transition validation deferred to v3 (documented TODO).
- datafabric_tool.py: DataFabricWriteHandler best-effort fetches + compiles the ontology via get_ontology_file_async. The method is absent from the current platform package, so this degrades gracefully to the metadata-only path (compiled_ontology stays None) and the build does not break.
- rdflib>=7.0.0 added to dependencies.

23 new tests (refund + order-management dialects, RFC dialect, graceful paths). 109 datafabric_tool tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
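As a rough illustration of the optional `compiled_ontology` argument in this commit, the operation check might reduce to the sketch below. Only `entity_access` is modeled, the helper name `check_operation_allowed` is hypothetical, and treating an entity that is absent from `entity_access` as unconstrained is an assumption about this stage of the PR.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class CompiledOntology:
    # Only entity_access is modeled; the PR's model carries more fields.
    entity_access: dict[str, frozenset[str]] = field(default_factory=dict)


def check_operation_allowed(
    entity: str,
    operation: str,
    compiled_ontology: Optional[CompiledOntology] = None,
) -> Optional[str]:
    """Return an error message, or None when the operation may proceed."""
    if compiled_ontology is None:
        # Metadata-only path: no ontology, so nothing extra to enforce.
        return None
    allowed = compiled_ontology.entity_access.get(entity)
    if allowed is None:
        # Assumption: an entity the ontology does not list is left to metadata checks.
        return None
    if operation not in allowed:
        permitted = ", ".join(sorted(allowed)) or "none"
        return f"'{operation}' is not allowed on {entity} (allowed: {permitted})"
    return None
```

Because the ontology argument defaults to `None`, existing callers of the validator keep their metadata-only behaviour unchanged.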

scripts/run_agent_with_ontology.py bridges the gap until the platform ships ontology storage/fetch: it monkeypatches EntitiesService.get_ontology_file_async (currently absent) to return a user-supplied .ttl, which activates the real _maybe_compile_ontology path in DataFabricWriteHandler. The ontology is compiled and used in write validation + tool description exactly as it will be once the platform method lands.

Usage:

    python scripts/run_agent_with_ontology.py \
        --ontology PATH.ttl --entity-set PATH.json --prompt "..." \
        [--model NAME] [--system-prompt PATH.txt] [--dry-run]

--dry-run compiles the ontology and prints the extracted facts (entity_access, hitl_operations, state/reference/measure fields, relationships) WITHOUT network. The live run needs UiPath auth env vars + real tenant entity ids.

Companions:

- sample_refund_entity_set.json (hero-case entities, placeholder ids)
- sample_refund_sop.txt (refund SOP from RFC §4.3)
- README_run_agent_with_ontology.md (mechanism + offline/real run steps)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
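The monkeypatching idea can be sketched in a few lines. The `EntitiesService` class below is an empty local stand-in, not the platform class, and `install_ontology_patch` is a hypothetical helper name; the real script's internals are not shown in this PR text.

```python
import asyncio


class EntitiesService:
    """Local stand-in for the platform class, which lacks the ontology method."""


def install_ontology_patch(service_cls: type, ontology_text: str) -> None:
    """Attach a get_ontology_file_async that returns user-supplied Turtle."""

    async def get_ontology_file_async(self, *args, **kwargs) -> str:
        # Arguments are ignored: the real signature is unknown, so accept anything.
        return ontology_text

    # Patching the class makes every instance, existing or future, see the method.
    service_cls.get_ontology_file_async = get_ontology_file_async


if __name__ == "__main__":
    install_ontology_patch(EntitiesService, "# Turtle read from the --ontology file")
    print(asyncio.run(EntitiesService().get_ontology_file_async()))
```

Patching at class level means the handler's own lookup finds the method without any change to handler code, which is what lets the script exercise the same code path the platform method would.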

Two write-path bugs found while validating the ontology POC end-to-end against live staging, both fixed:

1. Read schema stripped the Id primary key (write_validation system-field filter), so the NL-to-SQL model (a) invented a non-existent 'rowid' column for ORDER BY on paginated reads (FQS 400) and (b) never returned Id, leaving the agent no record_id for updates/deletes. Fix: retain the primary key for WRITABLE entities in the read schema (SELECT it, ORDER BY it); keep other system fields hidden; read-only entities unchanged. P3 collision guard for user/CSV fields sharing a system field name. Harden is_entity_writable with getattr.
2. Write executor called the CRUD endpoint with the entity NAME, but .../EntityService/entity/{id}/insert requires the GUID id ("not valid" 400). Fix: handler maps entity name -> id before executing, restores the friendly name on the result.

Verified on staging (dataservicetest/DataFabricFQS): with both fixes the refund flow's insert + 3 updates all persist (read-back confirmed). The ontology compiles, activates, and correctly governs tool selection (RefundRequest insert-only; Order/Risk/Contact update; Customer read-only, never written).

POC harness (scripts/):

- poc_refund_setup.sh / poc_refund_teardown.sh: create+seed / delete the 5 refund entities, emit ontology + entity-set (referenceKey=GUID) + ids
- poc_refund_drive.py: drive the real write handler with the ontology active, verify by read-back (deterministic; no LLM)
- run_agent_with_ontology.py: full LLM-in-the-loop variant; gains --agenthub-config (LLM-gateway licensing OpCode; without it the gateway 403s) and recursion_limit
- POC_README.md: env setup + the three run levels + the known agent-loop gap (create_agent does not auto-execute the terminal write batch; that is runtime plumbing, not the ontology/write tool)

Tests: 740 passed. New: read-schema PK retention (writable vs read-only, other-system-fields-hidden, collision-not-duplicated, rowid-free ORDER BY) and name->id translation for CRUD.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
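The second fix (translate the entity name to its GUID before the CRUD call, then restore the friendly name on the result) might be sketched like this. `WriteResult` here is a simplified hypothetical model, `execute_with_id` is an illustrative helper, and the executor is passed in as a plain async callable.

```python
import asyncio
from dataclasses import dataclass, replace
from typing import Awaitable, Callable


@dataclass(frozen=True)
class WriteResult:
    # Simplified stand-in for the PR's WriteResult model.
    entity: str
    operation: str
    success: bool


async def execute_with_id(
    entity_name: str,
    operation: str,
    name_to_id: dict[str, str],
    execute: Callable[[str, str], Awaitable[WriteResult]],
) -> WriteResult:
    """Call the executor with the entity id, but report the friendly name."""
    entity_id = name_to_id.get(entity_name)
    if entity_id is None:
        # Unknown name: fail locally instead of sending a bad id to the endpoint.
        return WriteResult(entity_name, operation, False)
    result = await execute(entity_id, operation)
    # The executor only knows the id; restore the name the agent used.
    return replace(result, entity=entity_name)


async def _fake_execute(entity_id: str, operation: str) -> WriteResult:
    # Placeholder executor for the demo; a real one would call EntitiesService.
    return WriteResult(entity_id, operation, True)


if __name__ == "__main__":
    mapping = {"Order": "placeholder-guid"}
    print(asyncio.run(execute_with_id("Order", "update", mapping, _fake_execute)))
```

Keeping the translation in the handler rather than the executor means the agent and the tool description only ever deal in entity names.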

…t, debug, read-flow wiring

Extends the ontology layer per review feedback:

1. df:ReadableEntity is now first-class. CompiledOntology gains `known_entities` (every df:entityKey the ontology declares) plus is_known / is_writable / is_read_only helpers. Previously a read-only entity was indistinguishable from one the ontology never mentioned: both were merely absent from entity_access.
2. Read-only is enforced, not advisory. validate_mutation_intent rejects a write to an entity the ontology knows but grants no write ops; the write handler prunes such entities from write_schemas so they never appear in the write tool description. (Verified: Customer is excluded and a direct update is rejected.)
3. Debug output. CompiledOntology.to_human_readable() + module-level format_ontology_debug(owl, compiled) render the raw OWL Turtle and a human-readable IR (entities + access modes, measure/state/reference field semantics, relationships). Logged at DEBUG in the fetch/compile path and printed by both POC scripts during a run.
4. Ontology wired into the READ flow (reads still go through the existing NL-to-SQL path; the ontology enriches, does not restrict). Shared maybe_fetch_and_compile_ontology helper used by both handlers; the read handler threads CompiledOntology into DataFabricGraph.create -> datafabric_prompt_builder, which emits an "## Ontology Context" section (access modes, relationships, FK/reference targets, state-value sources) for schema linking (P5).

Also: poc_refund_drive.py verification read-back now addresses entities by GUID (get_record_async requires the id, not the name).

Validated live on staging (dataservicetest/addyTest): debug IR shows Customer READ-ONLY; Customer pruned from writes + write rejected; refund flow insert + 3 updates persist, 4/4 verified.

752 tests pass, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
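The distinction between "read-only" and "never mentioned" in points 1 and 2 can be illustrated with a reduced model. This sketch keeps only `known_entities` and `entity_access`; the helper bodies and `prune_write_schemas` are assumptions about how the described behaviour could be implemented, not the PR's code.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class CompiledOntology:
    # Every entity key the ontology declares, readable or writable.
    known_entities: frozenset[str] = frozenset()
    # Write operations granted per entity; read-only entities have no entry.
    entity_access: dict[str, frozenset[str]] = field(default_factory=dict)

    def is_known(self, entity: str) -> bool:
        return entity in self.known_entities

    def is_writable(self, entity: str) -> bool:
        return bool(self.entity_access.get(entity))

    def is_read_only(self, entity: str) -> bool:
        # Known to the ontology but granted no write operations.
        return self.is_known(entity) and not self.is_writable(entity)


def prune_write_schemas(
    write_schemas: dict[str, Any], ontology: CompiledOntology
) -> dict[str, Any]:
    """Drop read-only entities so they never reach the write tool description."""
    return {
        name: schema
        for name, schema in write_schemas.items()
        if not ontology.is_read_only(name)
    }
```

In this sketch an entity the ontology never mentions is neither known nor read-only, so it stays in the write schemas and is governed by metadata alone.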

Makes the full LLM-in-the-loop refund flow persist writes end-to-end (verified on staging dataservicetest/DataFabricFQS: insert RefundRequest + update Order/CustomerRisk/Contact all success, read-back confirmed).

Root cause of the prior "writes planned but not dispatched": the write tool hardcoded `require_conversational_confirmation: True`, whose tool-node gate calls request_approval -> @durable_interrupt, suspending the graph for human approval. In a non-conversational/coded agent (no human/checkpointer) the graph suspended at the first write and ainvoke returned without executing it.

- datafabric_tool.py: drop the unconditional `require_conversational_confirmation` from the write tool metadata. HITL confirmation is still applied per-resource for conversational agents by tool_factory; it is no longer forced on coded agents (where it can only deadlock). Deterministic guardrails remain: writability checks, ontology op-validation, field allowlist, read-only enforcement.
- run_agent_with_ontology.py: add --trace (DEBUG logging surfaces the inner NL->SQL generated SQL per read), --api-flavor (default chat-completions), and print tool RESPONSES (not just calls) so reads/writes are visible.
- test: assert the write tool no longer hardcodes the confirmation flag.

752 tests pass, ruff clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
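The change in confirmation behaviour amounts to setting the flag conditionally instead of always. A minimal sketch, assuming a hypothetical `build_write_tool_metadata` helper and a made-up `tool_type` key (only the flag name `require_conversational_confirmation` comes from the commit message):

```python
from typing import Any


def build_write_tool_metadata(is_conversational: bool) -> dict[str, Any]:
    """Build tool metadata, adding the HITL gate only where a human can answer."""
    # "tool_type" is an illustrative key, not taken from the PR.
    metadata: dict[str, Any] = {"tool_type": "datafabric_write"}
    if is_conversational:
        # A conversational agent has a user who can approve the write.
        metadata["require_conversational_confirmation"] = True
    # Coded agents get no flag: with nobody to approve, the gate would only suspend.
    return metadata
```

The deterministic checks (writability, ontology op-validation, field allowlist) run in both cases; only the human approval step is conditional.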

UIPath-Harshit and others added 5 commits June 27, 2026 11:46

UIPath-Harshit wants to merge 6 commits into main from worktree-agent-aaa2a776
